Relying on Cloud Security and Redundancy

Compartilhar no facebook
Compartilhar no linkedin
Compartilhar no twitter
Compartilhar no whatsapp
Compartilhar no telegram

A specific cloud client has created a paradigm of maximum trust in the redundancy and resiliency of your network: Netflix.

In 2011, Netflix unveiled, through its technology blog (http://techblog.netflix.com), the Simian Army: a set of test and monitoring applications that the company uses to constantly evaluate its ability to continue the service during contingency situations.

The use of these tools demonstrates Netflix’s willingness and vision to create disaster hazards to refine and improve its service.

All through trust in his redundancy.

The Simian Army is not just a set of automated alert and response software, but includes Doctor Monkey who performs both functions after researching all of Netflix’s features to find any degradation in performance.

The Simian Army includes several programs that would confuse security professionals because of the knowledge they need to handle them: Chaos Monkey and Gorilla Chaos specifically.

These two programs are not responsive: they are aggressive.

They intentionally and randomly disconnect elements of the Netflix resource network. Recalling that Netflix runs largely on the public cloud of Amazon Web Services.

Chaos Monkey disables specific instance production, and Gorilla Chaos disables services across entire Amazon availability zones.

The intent is to ensure that all load balancing features built into the entire network can withstand the failure and continue to provide services transparently to customers.

It is an action that some security professionals might call reckless and management in many organizations would consider it insane.


Basically, the company is constantly DOS itself.

But it’s also brilliant, courageous and ultimately necessary: ​​it’s quite likely the only way to be absolutely sure that all the planning and design of redundant systems and automated response controls that manage them are fully functional in real time.

Now think: This constant scenario, is not that what happens several times? Do not lie!.

I would not recommend this approach to all organizations.

But organizations who want and need a full assurance that their cloud capabilities are totally fault tolerant may want to consider it.

And Netflix made this capability available to the world: not only did they announce the existence of the Simian Army on a public site, but in 2014, the company created Chaos Monkey as an open source free download:


It is a very courageous level to create such a consistent methodology in disaster scenarios, even more attacking your own resources.

Another is to announce to the world that you are using this methodology.

What if it goes wrong?

Would not failure be aggravated by the shame of being caught in his obvious arrogance?