7 Tweets 1 reads Aug 06, 2024
How Netflix Uses Chaos Engineering to Build Resilient Systems:
I spent hours studying it, so you don't have to.
Here's a summary of what I learned:
• They run a microservices architecture
• They use chaos engineering to manage the chaos inherent in distributed computing
• They use it to find problems before they cause a production outage
• They proactively find potential issues and then automate the fix
• If a failure recurs, automation fixes it quickly
• They built a tool that randomly shuts down servers and named it Chaos Monkey
• Chaos Monkey gets information about servers from the continuous delivery platform
• They wrote Chaos Monkey in Go and open-sourced it
• They run chaos experiments against production traffic for accuracy
• They always keep a backup plan to limit the blast radius
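
To make the "randomly shut down servers" and "limit the blast radius" points concrete, here is a minimal Python sketch of the idea. It is not Chaos Monkey's actual code (which is written in Go). The names `Instance`, `pick_victims`, `run_experiment`, and the `terminate` callback are hypothetical, and the rule of leaving at least one instance per group running is an assumed way to cap the blast radius.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    group: str  # hypothetical: the service or cluster this server belongs to


def pick_victims(instances, rng, max_per_group=1, probability=1.0):
    """Pick at most `max_per_group` random instances per group to terminate."""
    by_group = {}
    for inst in instances:
        by_group.setdefault(inst.group, []).append(inst)

    victims = []
    for group in sorted(by_group):
        members = by_group[group]
        # Assumed blast-radius rule: always leave at least one instance running.
        limit = min(max_per_group, len(members) - 1)
        if limit <= 0 or rng.random() >= probability:
            continue
        victims.extend(rng.sample(members, limit))
    return victims


def run_experiment(instances, terminate, rng=None):
    """Terminate the chosen instances through the caller-supplied callback."""
    rng = rng or random.Random()
    victims = pick_victims(instances, rng)
    for victim in victims:
        terminate(victim)
    return victims


if __name__ == "__main__":
    fleet = [
        Instance("i-1", "api"),
        Instance("i-2", "api"),
        Instance("i-3", "api"),
        Instance("i-4", "billing"),
    ]
    terminated = []
    run_experiment(fleet, terminated.append)
    print([v.instance_id for v in terminated])
```

In this sketch the single-instance "billing" group is never touched, and only one "api" server is picked per run. A real setup would pull the instance list from the deployment platform and call its API in place of the `terminate` callback.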
If you liked this thread:
🔔 Follow: @systemdesign42
♻ Repost the tweet below to help others find it
🧑‍🦰 Tag a person learning system design
💾 Bookmark it for future reference
