Lots of companies want to manage Big Data. Few have shown the bona fides to handle cascading waves of bits and bytes without folding like a lawn chair. Count Apache Mesos among those occupying this rarefied air.
Mesos is an established, open-source technology for managing and deploying data-crunching applications in a large-scale environment. Or, in English, the cloud. Think of it as akin to an operating system for apps that's skilled at doling out what's needed, where, at precisely the right time.
If that sounds good, it's because large-scale apps are messy. Any software large enough to routinely crunch through terabytes or even petabytes of data needs the aid of not one server but several, a setup commonly known as a cluster. Yet every new server, or node, introduces a point of failure, which is why large-scale cloud apps sometimes go down at exactly the wrong time.
Twitter had that very problem a few years ago. Mesos helped solve it, and now a company called Mesosphere is developing tools to bring the technology to other businesses. A group of investors led by Andreessen Horowitz just poured $10.5 million into the business.
Don't be surprised if there's plenty of interest from both prospects and other investors. Consider these estimates from Cisco Systems' most recent forecast for Internet traffic:
By 2018, there will be nearly 4 billion global Internet users (about 52% of the world's population), up from 2.5 billion in 2013.
By 2018, there will be 21 billion networked devices and connections globally, up from 12 billion in 2013.
Globally, the average fixed broadband connection speed will increase 2.6-fold, from 16 Mbps in 2013 to 42 Mbps by 2018.
Globally, IP video will represent 79% of all traffic by 2018, up from 66% in 2013.
Not exactly encouraging, is it? Overcrowded networks will soon be bursting with data. If the pattern holds, we'll process a great deal of it via clustered applications that exist in the cloud. Amazon.com (NASDAQ:AMZN) would love that business. So would Rackspace Hosting, Inc. (NYSE:RAX), and more than a few others.
Hosting providers thrive on the idea of helping clients manage the deluge via what's known as a "hybrid cloud" environment, in which internal company servers share the data processing load with servers that exist in the cloud. It's a sort of digitized harmony that sounds an awful lot like a ringing cash register. Already, Amazon Web Services handles data for some of the world's largest businesses. Revenue zoomed to at least $3.7 billion last year, up from less than $1 billion in 2010. Rackspace booked $1.53 billion in sales last year, up from $780.6 million in 2010.
Amazing growth for each, to be sure. And yet there's a cost to playing host to so many 1s and 0s. Cloud computers aren't any different from the local kind in that they can only handle so much, even when networked together. For Amazon, Rackspace, and their ilk, there are really only two ways to guard against being overwhelmed. Either add servers and storage faster than data can pile up -- which, as you might imagine from the Cisco data, isn't likely to work out well -- or use tools for making existing infrastructure vastly more efficient. That's why innovative software and mechanisms for handling cascading waves of data will always be in vogue. Without them, servers fail and Fail Whales surface.
Venture capitalists know this, of course, which is why we're seeing them put money into Mesosphere. It's only a matter of time before the hosting industry catches on, too, kicking off a potential bidding war. This is a race, Rackspace. Don't let Amazon win.