The Dark Side of the Cloud

Cloud computing is often seen as a silver bullet for traditional IT problems. The flexible nature of this virtual outsourcing model lets your staff focus on writing great applications, making systems management somebody else's problem. And since pretty much everything can be managed in software, it should be easy to recover from even quite severe hardware outages.

Except it doesn't always work that way.

Amazon.com (Nasdaq: AMZN) suffered another long outage in its suite of cloud services yesterday. The event really highlights some best practices for using the cloud. We just caught a glimpse of which online businesses are managing their cloud systems the right way -- and who's cutting corners where it might hurt.

What happened?
This day-long outage was a very different beast from the widely covered Amazon Web Services downtime we saw last spring. That time, a massive thunderstorm took Amazon's east coast AWS data center off the grid, causing downtime for tons of major customers. I'd argue that the weather is a bit beyond Amazon's control.

This one's different -- but not unique.

Amazon's Elastic Block Store service started suffering performance problems around 1:30 p.m. EDT. The problem spread to a larger number of customers over the next few hours, and when these chunks of data storage became unusable, they took down whatever AWS services had been designed to use the bad storage.

That's a big problem, but not necessarily a show-stopper. Properly designed cloud services and applications could mirror their block storage to other data centers in Amazon's fold, or even to another company's storage service. Many programming tool kits make it trivially simple to back up your vital Amazon-hosted apps with alternatives such as Rackspace Hosting (NYSE: RAX). You do end up paying a bit more when storage is backed up to another location, but that's hardly news -- tape drives and racks full of backup hard drives aren't free, either.
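To make the mirroring idea concrete, here is a minimal Python sketch. It assumes a deliberately simplified model: the `InMemoryStore` and `MirroredStore` classes and the location names are hypothetical stand-ins made up for illustration, not any real Amazon or Rackspace API. Every write goes to all storage locations, and a read falls back to the next location when the first one is unavailable.

```python
class StorageUnavailable(Exception):
    """Raised when a storage location cannot serve requests."""


class InMemoryStore:
    """Hypothetical stand-in for one storage location (a data center or provider)."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self._blocks = {}

    def put(self, key, data):
        if not self.available:
            raise StorageUnavailable(self.name)
        self._blocks[key] = data

    def get(self, key):
        if not self.available:
            raise StorageUnavailable(self.name)
        return self._blocks[key]


class MirroredStore:
    """Writes every block to all locations and reads from the first healthy one."""

    def __init__(self, stores):
        self.stores = list(stores)

    def put(self, key, data):
        written = 0
        for store in self.stores:
            try:
                store.put(key, data)
                written += 1
            except StorageUnavailable:
                continue  # a real system would schedule a resync for this mirror
        if written == 0:
            raise StorageUnavailable("no storage location accepted the write")

    def get(self, key):
        for store in self.stores:
            try:
                return store.get(key)
            except (StorageUnavailable, KeyError):
                continue  # location is down or missed the write; try the next one
        raise StorageUnavailable("no storage location could serve the read")


primary = InMemoryStore("primary-region")    # assumed name, illustration only
backup = InMemoryStore("backup-provider")    # assumed name, illustration only
mirrored = MirroredStore([primary, backup])

mirrored.put("orders.db", b"vital data")
primary.available = False                    # simulate the bad storage
recovered = mirrored.get("orders.db")        # served by the backup location
```

The extra cost mentioned above shows up directly in the sketch: every write is stored twice, once per location.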

Now, Amazon's service agreement for AWS and related products promises 99.95% uptime, or no more than about 4.4 hours of downtime per year. The company could probably dodge that commitment here, since the uptime promise is technically broken only when the service is classified as "unavailable." That didn't happen this time, as Amazon slapped a "degraded performance" label on the event instead. But I still expect the company to send some service fee refunds this month.
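The downtime figure follows from simple arithmetic, which this short Python sketch spells out. The function name is made up for illustration, and the calculation assumes a plain 365-day year measured in hours; the actual service agreement may define its measurement period differently.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Return the hours of downtime permitted by an uptime promise."""
    return period_hours * (1 - uptime_percent / 100)


annual_budget = allowed_downtime_hours(99.95)
print(f"99.95% uptime allows {annual_budget:.2f} hours of downtime per year")
```

Under those assumptions the budget comes to roughly 4.4 hours a year, so a single day-long event would blow through it many times over if it counted as "unavailable."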

Why? Because this isn't the first time we've seen pretty much exactly this scenario. The same thing happened in 2011, and that four-hour outage yielded 10 days of service refunds for affected customers. That spot of trouble was caused by a bungled network equipment upgrade; I'm hoping the root cause is different this week. You'd expect Amazon to learn from its mistakes, after all.

Lessons learned
So here's the deal. Amazon promises flawless performance, but nobody's perfect. These things happen. And if your business depends on one tool in one place, with no workaround when things go bad, you kind of deserve to suffer. Some corners just can't be cut without suffering the consequences.

Who saw their sites go down in a blaze of cost-saving regret? Slaps on the wrist go out to Reddit, Foursquare, Pinterest, and Imgur, as well as the fantastically popular multiplayer game Minecraft. Boo, hiss, get a clue!

But you know the bright side of that ignominious list? You don't own shares in any of these companies.

That's right -- every service that went down (to the best of my knowledge) was a private company with no responsibility to public shareholders. They may have ticked off their private equity owners and perhaps lost a few loyal users, but the big names who really, really need their stuff to work had already designed their wares the right way.

Netflix (Nasdaq: NFLX) is perhaps Amazon's best-known customer, since the company offloaded most of its IT needs to the AWS platform. Pretty much everything except actual movie streams flows through Amazon's servers, and Netflix ticked along undisturbed on Monday. That's hardly surprising, given that Netflix plans for mishaps and even randomly breaks stuff on purpose -- there's just no substitute for hands-on experience with unlikely problems.
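The "break stuff on purpose" approach can be illustrated with a toy Python sketch. This is not Netflix's actual tooling; the `Replica` and `Service` classes, the `break_something` helper, and the request string are all hypothetical. The idea is simply to knock out a random replica and then check that the service still answers.

```python
import random


class Replica:
    """Hypothetical copy of a service running in one location."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """Routes each request to the first replica that responds."""

    def __init__(self, replicas):
        self.replicas = replicas

    def handle(self, request):
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ConnectionError:
                continue  # skip the dead replica and try the next one
        raise ConnectionError("no replica could serve the request")


def break_something(service, rng):
    """Deliberately take down one randomly chosen live replica."""
    victim = rng.choice([r for r in service.replicas if r.alive])
    victim.alive = False
    return victim


rng = random.Random(2012)  # fixed seed so the experiment is repeatable
service = Service([Replica(f"replica-{i}") for i in range(3)])

victim = break_something(service, rng)
response = service.handle("GET /catalog")  # should succeed despite the failure
```

Running an experiment like this regularly is what turns an unlikely problem into a rehearsed one: if the final request ever fails, the design has a single point of failure that an outage would have found anyway.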

When NASDAQ OMX Group (Nasdaq: NDAQ) built its Market Replay service, it needed to store "ten years of historical tick data down to the millisecond." Amazon's platform was a natural tool for the job, combining low cost with simple management -- and massive scale. The service took a licking and just kept on ticking. Nasdaq is now thinking about moving more services into the cloud, encouraged by Market Replay's success. And I don't think this week's events changed its mind about that.

The list of major customers goes on for miles. Spotify runs its music services right off Amazon's cloud storage. Washington Post (NYSE: WPO) uses AWS for data management. Pfizer (NYSE: PFE) runs large-scale research there. None of these customers complained about hiccups on Monday, and they'll continue to buy cloud services with confidence.

The end of cloud computing as we know it?
So if you were getting ready to sell or short Amazon and Rackspace based on the vulnerability you saw yesterday, I'd ask you to relax that trigger finger and back away slowly.

Cloud services are only scary if you don't know how to manage them properly. The big boys are already perfectly capable of handling temporary outages like this one, and the rest will learn from their early mistakes.


Fool contributor Anders Bylund owns shares of Netflix. He has also created a bull call spread on top of his Netflix shares. The Motley Fool owns shares of Amazon.com and Netflix. Motley Fool newsletter services recommend Amazon.com, Nasdaq Stock Market, Netflix, and Rackspace Hosting.



Comments from our Foolish Readers

  • On October 24, 2012, at 5:31 AM, TMFZahrim wrote:

    Netflix reported earnings last night, and I took the opportunity to ask this question on the conference call:

    Amazon's Web Services had another significant outage this week. Did this impact Netflix in any way? If not, was that due to Netflix's planning for random outages or due to not using affected services in the first place? Do these events affect your confidence in using services like Amazon's AWS for critical infrastructure and customer-facing functions?

    Reed Hastings

    Okay. AWS has done a great job for us as they have for other customers by having an architecture that isolates faults, and so it's built upon the notion that occasionally data centers will go down, and they give you the components to build around that. That would be true if they were our own data centers or if they were AWS. And in the recent outage that they had, our customers were not materially affected. The traffic switched over smoothly, as it's designed to, to other Amazon data centers. So we're extremely happy with the decision to expand within the Amazon Web Services footprint.

    ...so yeah, design it right and it just works.

    Anders
