In the era of cloud-native distributed applications and microservices, ensuring the robustness and reliability of services is hard. Our applications run on ephemeral compute resources (e.g. containers and cloud infrastructure) and rely on network-based communication to coordinate and distribute information. In such a context, designing applications for robustness and reliability is a genuinely hard task. The culture and engineering approach promoted by tech giants such as Netflix and AWS to deal with such complex systems is called Chaos Engineering.
Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production (http://principlesofchaos.org/).
In this talk we will introduce the audience to Chaos Engineering, providing concrete examples of the motivations behind this engineering approach and of the tools that can be used to apply it in practice.