While "big data" has clearly become a critical area of research for top technology firms, it is important to understand the fundamental nature of this tool before allocating investment capital. As IBM (NYSE:IBM) explains, "[E]very day, we create 2.5 quintillion bytes of data -- so much that 90% of the data in the world today has been created in the last two years alone." This proliferation of information has created a new race among firms including IBM, Intel (NASDAQ:INTC), EMC (NYSE:EMC), and Brocade Communications (NASDAQ:BRCD) to translate this information into a usable format.
What big data is
Big data is the term used to describe the process of analyzing impossibly large data sets and turning them into usable and actionable information. IBM explains that "[t]his data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few." As we all continue to move an increasing amount of our lives online, the amount of data we create that is easily collectable and available for analysis is nearly unlimited. Research firm IDC recently forecast that the amount of data generated in 2013 would reach 4 trillion gigabytes, a 50% increase from 2012.
The most straightforward application of big data technology is to give companies insight into consumer decision making: "[u]sing advanced analytics techniques such as predictive analytics, data mining, statistics, and natural language processing, businesses can study big data to understand the current state of the business and track evolving aspects such as customer behavior." While these applications will likely start with those companies able to invest in the technology, it is easy to imagine this technology playing an increasingly important role in our lives over time.
What big data is not
As the field continues to advance, an increasing number of experts have become vocal in pointing out the limitations of big data applications. Columbia historian Matthew Jones explains, "Data science depends utterly on algorithms but does not reduce to those algorithms. The use of those algorithms rests fundamentally on what sociologists of science call 'tacit knowledge' -- practical knowledge not easily reducible to articulated rules -- or perhaps impossible to reduce to rules." As Jones points out, the human element remains critical.
In a recent Wall Street Journal interview, Stephen Sorkin of Splunk, a company in the big data analytics field, echoed Mr. Jones: "If you take any sufficiently large data sets, you are going to find correlations. You need a human in the loop to work out which are important." As an industry insider, Sorkin is well positioned to speak on the limitations of the technology.
To bring the matter to an even more human level, Lindsay Elsner, a health care executive recruiter with Link Executive Search in Minneapolis and Milwaukee, explains: "While the human and relationship aspects of this business will always be central, once this technology filters down to this level, it will be a powerful tool in providing both clients and candidates with statistically meaningful criteria by which to judge each other." As Elsner notes, big data is a tool that has the potential to aid in the decision-making process, not replace the human element. The x factor that distinguishes one job applicant from the next may remain impossible to reduce to an algorithm.
Leading the way
While it may be too early to pick a winner in big data, IBM is making strides to remain at the forefront of the field. It recently announced that it is opening a center for big data in collaboration with Ohio State University. Perhaps the most critical advance represented by the move is that it has the potential to create a pipeline of well-trained employees. Gartner's head of global research Peter Sondergaard said, "Our public and private education systems are failing us. Data experts will be a scarce, valuable commodity." The firm predicted that the field would be responsible for the creation of 1.9 million jobs by 2015, tripling the number of positions created in all other fields combined.
By positioning itself to have a direct pipeline to the individuals who will best be able to advance the field, IBM will give itself an edge. This could be important as competitors make major pushes to become increasingly relevant in the big data space. EMC, for example, is looking to make inroads by streamlining its approach into three steps: "Infrastructure, Agile Analytics, and Actionable Insights." Intel's edge is different: it lies more in its expertise in the server segment, where it is a major competitor for IBM.
Ultimately, IBM's reach and focus at the enterprise level are the factors that make it one of the more interesting big data plays in the market. While it is likely still too early to treat big data as a fully actionable investment theme, the field should be squarely on your radar. Big Blue is well positioned in big data, and early investors should start here.