We all know not to poke at alien life forms on another planet, right? But what about metrics? Do you know how to pick them, measure them, and draw conclusions from them? In this talk we will cover various Site Reliability Engineering topics, such as SLIs and SLOs, while exploring real-life examples of defining and implementing metrics in a system. The examples use Prometheus, an open-source systems monitoring and alerting platform, to demonstrate the implementation. Let's get back to some real science.

Comments

Great talk about measuring, metrics, and what to do with them. I really like the concepts of SLIs, SLAs and SLOs. Will definitely try out Prometheus.

Simone Basso at 12:47 on 22 Sep 2018

Great talk and well delivered

As always, Rafael does a great job explaining complex subjects. Great speaker, great content, except the ad hoc made-up elephpant example :P

Ben Roob at 09:51 on 26 Sep 2018

Thanks for sharing your experiences. Great talk. I liked the way you sharpened our awareness of measuring, metrics, and their purposes. Btw, nice sense of humor!