What do you do when you must monitor the whole infrastructure of the biggest European hosting and cloud provider? How do you choose a tool when the most widely used ones fail to scale to your needs? How do you build a Metrics platform to unify, reconcile and replace years of fragmented, partial legacy solutions?

In this talk we will relate our experience building and maintaining OVH Metrics, the platform used to monitor all OVH infrastructure. We needed to go where most monitoring solutions hadn’t gone before: the platform had to operate at the scale of the biggest European hosting and cloud provider, with 27 data centers, more than 300k servers (bare metal!), and hundreds of products serving our mission to host 1.3 million customers.

You will hear about time series, about open source solutions pushed to the limit, about HBase clusters operated at the extreme, and about how a small team leveraged the power of a handful of open source solutions and lots of coding glue to build one of the most performant monitoring solutions ever.


Martin at 21:23 on 6 Jun 2019

This was the talk I was looking forward to the most. The technical information and the "story" behind it all were very interesting.
Unfortunately the speaker, with his French "English", was really, really hard to listen to. His pronunciation was absolutely terrible, and instead of enjoying the talk I found it took a big mental toll just to keep paying attention.

I enjoyed the talk, and the story was really interesting IMO