Companies are scrambling to board the Big Data train. Most of their effort seems to go into building their big data production capabilities. Whether by transforming their data management processes and tools, hiring “Data Scientists” or enlisting the services of “Big Data” consultants, companies want to squeeze every bit of insight out of their data.
However, from what I’ve seen, most companies are ignoring a big part of the Big Data equation: the data consumer. It doesn’t matter how many numbers are crunched if there isn’t anyone around who understands how to use them. In fact, a growing number of articles on Big Data point specifically to the risks of Big Data in the absence of Big Data understanding.
For example, in a February 8 opinion column in Wired magazine, Nassim Nicholas Taleb points out that more data points produce more spurious correlations:
“I am not saying here that there is no information in big data. There is plenty of information. The problem — the central issue — is that the needle comes in an increasingly larger haystack.”
In other words, if you toss enough data into a barrel and shake it around, you’ll eventually find some that sticks together purely by chance (e.g., Nate Silver’s illustration of spurious correlations between stock market performance and which teams won the Super Bowl).
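To make the barrel analogy concrete, here is a minimal sketch in Python. It is my own illustration, not something taken from Taleb or Silver: the function names, the 20 observations per variable, and the 0.5 correlation cutoff are all assumptions picked for the example. It fills a table with pure random noise and counts how many pairs of variables look strongly correlated anyway.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_spurious_pairs(n_variables, n_observations=20, threshold=0.5, seed=0):
    """Count pairs of unrelated random variables whose |r| exceeds threshold.

    Every column is independent Gaussian noise, so any pair that clears
    the threshold is a coincidence by construction. The defaults are
    illustrative assumptions, not recommended values.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0, 1) for _ in range(n_observations)]
        for _ in range(n_variables)
    ]
    return sum(
        1 for a, b in combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


if __name__ == "__main__":
    for n_variables in (10, 50, 200):
        n_pairs = n_variables * (n_variables - 1) // 2
        hits = count_spurious_pairs(n_variables)
        print(f"{n_variables:>4} variables, {n_pairs:>6} pairs, "
              f"{hits} with |r| > 0.5")
```

The number of pairs grows roughly with the square of the number of variables, so the count of chance "findings" grows with it even though nothing in the data is related to anything else. That is the larger haystack in the quote above.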
Other articles point out that Big Data is increasingly becoming a “black box” where end users don’t understand the underlying assumptions, models, or transformations being applied to the data. In these cases, interpreting the results becomes muddied and can lead to poor decisions or faulty conclusions. For instance, in the recent HBR article, Advertising Analytics 2.0, one CEO lamented:
“When I add up the ROIs from each of our silos, the company appears twice as big as it actually is.”
Being able to crunch huge datasets is not enough. Companies must focus the same or even more effort on helping their people use Big Data. Being a competent consumer of data requires rethinking three ways in which we work with data and information:
Our brains – understanding and attending to the limitations and cognitive challenges we face when trying to make sense of information
Our approach – shifting from an information-focused approach to a decision-driven approach to using data. This reduces the noise and clutter associated with big data.
Our reports – shifting from providing numbers to providing answers. There are too many numbers to plow through and our brains can’t process them anyway. Good reports should have fewer numbers.
In addition, the successful data consumer must have a strong understanding of the business or content area in which they are working. At the end of the day, data still do not tell you what to do. They tell you what is happening, what things are connected to other things, and sometimes why things are happening. Acting on that information depends on understanding the world in which it was produced.