Amazon Web Services Failed and We Still Streamed Netflix Perfectly -- Here's Why

Netflix learned from the Christmas Eve outage of 2012.

By Tim Beyers – Sep 28, 2015 at 9:00PM EST

Despite an outage, I was still able to stream Person of Interest flawlessly. Image source: CBS.

When Amazon.com's (AMZN) Web Services infrastructure suffered cascading problems on Sunday, Sept. 20, one notable media outlet, techtimes.com, reported it as a "monstrous" outage.

Um, no. The Internet did not fall to pieces.

Shoppers still browsed Amazon, Tinder users still found dates, Redditors... did something else, and my wife and I still managed to binge-stream episodes of Person of Interest via Netflix (NFLX), the highest-profile site affected by the outage.

How it happened
According to Amazon's account of the outage, AWS's DynamoDB experienced what the company called a "service event" due to problems with how the database handles metadata -- i.e., data that describes data -- and the storage servers used to capture information in tables.

In its postmortem, Amazon said:

On Sunday morning, a portion of the metadata service responses exceeded the retrieval and transmission time allowed by storage servers. As a result, some of the storage servers were unable to obtain their membership data, and removed themselves from taking requests.

Why you should care
Still confused? In the simplest terms, DynamoDB "timed out" over and over and over again, leading to a cascading flow of problems. Think of it as the digital equivalent of clogged pipes. Eventually, nothing gets through.
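For readers who like to see the mechanics, here is a toy sketch of how repeated timeouts can snowball. It is not Amazon's code: the function name, the linear latency model, and the rule that the number of servers asking doubles after each failed round are all assumptions made up for illustration.

```python
def simulate(total_servers, asking_at_start, timeout_ms,
             base_ms=10.0, ms_per_request=1.0, rounds=4):
    """Toy model of storage servers asking a metadata service for membership data.

    Assumptions (hypothetical, not from Amazon's postmortem):
    - response time grows linearly with the number of servers asking at once;
    - if the response time exceeds the timeout, every asking server takes
      itself out of service, and twice as many servers ask in the next round.
    Returns a list of (latency_ms, servers_out_of_service) per round.
    """
    asking = asking_at_start
    history = []
    for _ in range(rounds):
        latency_ms = base_ms + ms_per_request * asking
        if latency_ms > timeout_ms:
            # Every request this round timed out, so these servers stop
            # taking requests and the retry crowd grows.
            out_of_service = asking
            asking = min(total_servers, asking * 2)
        else:
            # Responses arrived in time; nobody needs to retry.
            out_of_service = 0
            asking = 0
        history.append((latency_ms, out_of_service))
    return history


if __name__ == "__main__":
    print(simulate(100, 30, timeout_ms=50))  # load stays under the timeout
    print(simulate(100, 45, timeout_ms=50))  # load tips over the timeout
```

In this made-up model, 30 servers asking at once stay under the 50 ms limit and the system settles down, while 45 servers push it over the limit and every server ends up out of service within a few rounds.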

Frankly, I think Amazon did about as well as it could have under the circumstances. Outages happen. Only the (small-f) foolish don't plan for them.

In Amazon's case, it took about six hours to recover to full service -- not quite the scale of the outage that plagued Netflix on Christmas Eve 2012. That one cut off viewers for 7.5 hours, making grinches out of millions who hoped to celebrate the holiday spirit watching Netflix.

Why you should love Netflix even more today
Netflix had far fewer problems this go-round. We didn't even notice. To be fair, that's at least partially because of where we live -- Colorado -- as most of the problem was centered in Amazon's East Coast data center operations. Yet that's not the only reason; Netflix has created a whole system to account for problems in delivering streams to its 60 million-plus customers around the world.

According to an insightful account at TechRepublic, the company has built automated troublemakers dubbed its "Simian Army." Their only job: poke, prod, and otherwise try to break the network and expose weaknesses. You know how the government hires hackers? The Simian Army is to Netflix what hackers are to the feds; they help surface and fix issues before they become problems.
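To make the idea concrete, here is a minimal sketch of one round of deliberate troublemaking. It is not Netflix's tooling: the function name, the instance labels, and the "enough survivors" check are hypothetical stand-ins for illustration.

```python
import random


def chaos_round(instances, min_healthy, rng):
    """Knock out one randomly chosen instance and report whether the
    service would still have enough capacity afterward.

    `instances` is a set of hypothetical instance names, and `min_healthy`
    is an assumed threshold for how many must survive.
    """
    victim = rng.choice(sorted(instances))  # sorted() makes the pick reproducible
    survivors = instances - {victim}
    return victim, len(survivors) >= min_healthy


if __name__ == "__main__":
    rng = random.Random(2015)
    victim, ok = chaos_round({"web-1", "web-2", "web-3"}, min_healthy=2, rng=rng)
    print(f"terminated {victim}; service still healthy: {ok}")
```

If a round like this ever reports an unhealthy service, that is a weakness found on the company's own schedule rather than during a real outage.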

It's because of this level of "chaos engineering," as Netflix calls it, that the company was able to easily weather the fallout of the DynamoDB outage. And it's why I still won't sell my shares, despite the clear premium at which they trade.

According to TechRepublic, Netflix was able to quickly redirect traffic to data centers in an unaffected area. "Netflix was able to do this because it practices what it refers to as multi-region, active-active replication - where all of the data needed for its services is replicated between different AWS regions in a way that allows rapid recovery from failures," TechRepublic reported.
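The routing half of that idea can be sketched in a few lines. This is an illustration only, not Netflix's implementation: the function name, the health map, and the fallback order are assumptions, and the region names are used purely as examples. It also assumes every region already holds a full copy of the data, which is the point of active-active replication.

```python
def pick_region(preferred, healthy, fallback_order):
    """Return the region that should serve a request.

    `healthy` maps a region name to True or False (hypothetical health
    checks). Because each region is assumed to hold a replica of the data,
    any healthy region can take over.
    """
    if healthy.get(preferred, False):
        return preferred
    for region in fallback_order:
        if healthy.get(region, False):
            return region
    raise RuntimeError("no healthy region available")


if __name__ == "__main__":
    health = {"us-east-1": False, "us-west-2": True, "eu-west-1": True}
    print(pick_region("us-east-1", health, ["us-west-2", "eu-west-1"]))
```

With the East Coast region marked unhealthy, the request lands in the first healthy fallback instead of failing.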

What do you think of Amazon stock at current prices? What about Netflix stock? Tell me on Twitter, or reach out on Google Plus. I may use your comment in a follow-up article.

Tim Beyers is a Senior Investment Analyst and Lead Advisor at The Motley Fool. Tim leads the signature Rule Breakers investing service and co-leads Motley Fool Supernova’s Odyssey mission, a real-money portfolio designed to help individual investors build and manage a portfolio of Rule Breaker stocks. He also serves as a portfolio lead for Cloud Disruptors and is a contributing stock analyst for the Trends team. Tim has over 20 years of professional investing experience, including 17 years as an analyst for the market-beating Rule Breakers service and three years as its lead advisor.
