How Netflix Uses Chaos Engineering to Create Resilient Systems:
I spent hours studying it, so you don't have to.
Here's a summary of what I learned:
• They run a microservices architecture
• They use chaos engineering to manage the inherent chaos of distributed computing
• They use chaos engineering to find problems before they cause a production outage
• They proactively find potential issues and then automate the fix
• If a failure recurs, automation fixes it quickly
• They built a tool that randomly shuts down servers and named it Chaos Monkey
• Chaos Monkey gets information about servers via the continuous delivery platform
• They wrote Chaos Monkey in Go and open-sourced it
• They run chaos experiments against production traffic for accuracy
• They always keep a backup plan to reduce the blast radius
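To make the "randomly shut down servers" and "reduce blast radius" points concrete, here's a minimal sketch in Python. It is not Netflix's code (the real Chaos Monkey is written in Go). `Instance`, `pick_victims`, `terminate`, and the `max_per_group` and `probability` parameters are all hypothetical names invented for illustration, and the termination step is a stand-in for a real cloud API call.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical model of a server; not a real Chaos Monkey type.
    instance_id: str
    group: str            # the service / server group it belongs to
    running: bool = True


def pick_victims(instances, max_per_group=1, probability=0.5, rng=None):
    """Randomly choose running instances to terminate.

    The blast radius is limited in two ways (both are assumptions of
    this sketch): at most `max_per_group` instances per group, and
    groups with fewer than two running instances are skipped.
    """
    rng = rng or random.Random()

    # Bucket the running instances by group.
    by_group = {}
    for inst in instances:
        if inst.running:
            by_group.setdefault(inst.group, []).append(inst)

    victims = []
    for members in by_group.values():
        if len(members) < 2:
            continue  # no redundancy, so leave this group alone
        if rng.random() < probability:
            # Always leave at least one instance of the group running.
            count = min(max_per_group, len(members) - 1)
            victims.extend(rng.sample(members, count))
    return victims


def terminate(instance):
    # Stand-in for a real "terminate instance" call to a cloud provider.
    instance.running = False


if __name__ == "__main__":
    fleet = [
        Instance("i-1", "api"),
        Instance("i-2", "api"),
        Instance("i-3", "api"),
        Instance("i-4", "db"),  # single instance: never selected
    ]
    for victim in pick_victims(fleet, probability=1.0):
        terminate(victim)
        print("terminated", victim.instance_id, "in group", victim.group)
```

The point of the cap and the single-instance check is that a chaos experiment should test redundancy without taking out a whole service by itself.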
What did I miss?
-
PS - I wrote an article with visuals of this case study for you:
newsletter.systemdesign.one
If you liked this thread:
Follow: @systemdesign42
♻ Repost the tweet below to help others find it
🧑‍🦰 Tag a person learning system design
💾 Bookmark it for future reference