Read any article on Big Data these days and you will inevitably run into the seemingly obligatory statement, “Remember, correlation does not equal causality.” Often, if two things move simultaneously, especially if one is an “input” type variable and the other is an “output” type variable, we tend to conclude that the first has an influence on the second. This can be a costly and even dangerous mistake to make.
Unfortunately, heeding the advice not to confuse the two can be difficult. It turns out that our brains are hardwired to find causality, even when none exists. It’s one of the many shortcuts that the brain takes to cut through all of the data that it receives and make sense of the world.
A typical reaction to this insight is to overcorrect and dismiss the value of correlation altogether. That’s a mistake. Correlation is a powerful tool for understanding your business and driving action.
There are two kinds of predictive power. The first is knowing that if you change one thing, something else will change in response. That’s causality. Understanding causality is important if you are trying to drive change. The second type of predictive power is knowing that when one thing occurs you are more (or less) likely to see something else occur. That’s correlation. Correlation is important if you need to know what to expect.
Knowing that two things tend to occur together (or that one rarely occurs with the other) is a very powerful piece of information. You don’t always have to know why something happens in order to capitalize on the fact that it happens.
Your credit score doesn’t cause you to be a good or bad driver, but it helps insurance companies predict which is more likely. They can then use that prediction in their underwriting process. That’s a huge advantage because actually assessing your driving risk is much harder and more time-consuming than looking up a credit score. The type of operating system you use when logging into a website doesn’t cause you to spend more money on that site, but it does let companies know that you are more likely to do so. This can help them make more effective decisions about how they will present their products to you.
As with any tool, correlation has its limits. Correlation tells you what to expect under certain conditions. It helps you play the odds. Your job is to determine what actions to take based on that. Correlation doesn’t tell you what you need to do to make something happen. If your goal is to find the right lever to pull to cause a change, correlation isn’t the right tool.
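For readers who like to see the difference in numbers, here is a minimal sketch of the idea in Python. Everything in it is made up for illustration: the hidden "carefulness" factor, the noise levels, and the variable names are assumptions, not real insurance data. It simulates two measures that are both driven by the same hidden factor, then shows that forcing one of them to change does nothing to the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical hidden factor (call it "carefulness") that we never observe.
carefulness = [rng.gauss(0, 1) for _ in range(n)]

# Both measures depend on the hidden factor plus noise, not on each other.
credit_score = [c + rng.gauss(0, 0.5) for c in carefulness]
safe_driving = [c + rng.gauss(0, 0.5) for c in carefulness]

# Observation: the two move together, so one helps predict the other.
observed_r = pearson(credit_score, safe_driving)

# Intervention: assign credit scores by fiat, ignoring carefulness.
# Driving behavior is unchanged because it never depended on the score.
forced_score = [rng.gauss(0, 1) for _ in range(n)]
intervened_r = pearson(forced_score, safe_driving)

print(f"observed correlation:       {observed_r:.2f}")
print(f"correlation after forcing:  {intervened_r:.2f}")
```

In this toy setup the observed correlation comes out strongly positive, while the correlation after forcing the score sits near zero. The first number is what lets you play the odds; the second is why the score is not a lever you can pull.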
The big data craze is forcing us to confront the world of statistics. Many of us are not prepared. Treat the new tools and techniques that you read about as you would any other set of tools or techniques. Understand how they work. Understand what problems they solve. Understand what problems they don’t solve. And think critically about their use and limitations. Don’t dismiss a tool just because it doesn’t solve every problem. Sometimes the best tools are the ones that are the most specialized.
Brad Kolar is an executive consultant, speaker, and author. He can be reached at brad.kolar@kolarassociates.com.