Correlation doesn’t imply causation (but some correlations can imply at least some causation)
If you’ve been reading science news, you’ve seen over the years that obesity, happiness, loneliness, and divorce are all contagious. Just a few weeks ago, I read that grades are also contagious. Certainly, it’s not surprising to find out that friends’ behaviors are correlated on all these fronts, but is that enough to say for sure that your friends cause you to become more obese/happy/lonely etc.?
Generally, the answer is no. Shalizi explains on his blog and in a paper with Andrew Thomas why this is the case. The basic idea is expressed in the picture below. It could be that Alice influences Bob directly, but it could also be (for example) that Alice has joined a Pie-Eating club and Bob has joined the same club. The Pie-Eating club explains why they both have a tendency to become obese and it explains how they became friends. The club, in this case, is a hidden variable that gives an alternate explanation for correlations in obesity.
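To make the Pie-Eating club story concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration (the function names, the 0.9/0.1 friendship probabilities, the 0.6/0.2 obesity rates); none of it comes from the paper. Obesity depends only on each person's own club membership, and nobody influences anybody, yet friends still end up correlated.

```python
import random


def simulate_friend_pairs(n_pairs=20000, seed=0):
    """Generate (alice_obese, bob_obese) pairs for people who became friends.

    Hypothetical model: club membership is the hidden variable. It drives
    both friendship formation and obesity. There is no direct influence.
    """
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n_pairs:
        alice_in_club = rng.random() < 0.5
        bob_in_club = rng.random() < 0.5
        # Homophily: people with the same club status are far more likely
        # to become friends (assumed probabilities).
        p_friend = 0.9 if alice_in_club == bob_in_club else 0.1
        if rng.random() >= p_friend:
            continue  # not friends, so this pair is never observed
        # Obesity depends only on one's own club membership.
        alice_obese = rng.random() < (0.6 if alice_in_club else 0.2)
        bob_obese = rng.random() < (0.6 if bob_in_club else 0.2)
        pairs.append((int(alice_obese), int(bob_obese)))
    return pairs


def correlation(pairs):
    """Pearson correlation between the two columns of a list of pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    r = correlation(simulate_friend_pairs())
    print(f"correlation between friends' obesity: {r:.3f}")
```

The correlation comes out clearly positive even though the model contains no contagion at all, which is exactly why a raw correlation between friends cannot settle the question.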
The problem is that it is easy to come up with more and more elaborate hidden variables that might explain correlations in, e.g., obesity, and we could never hope to account for them all. Is hope lost? Not quite.
It turns out that a similar problem arises in quantum physics. Einstein saw the “spooky action at a distance” implied by quantum physics and declared that it could not be: ultimately, there must be some hitherto unmeasured hidden variables that explain the correlations between distantly separated particles. Amazingly, Einstein was wrong, and John Bell demonstrated a simple test that would be satisfied by any hidden variable theory but is violated by quantum physics. What we do is extend this reasoning to correlations in social networks. We don’t want to account for every possible source of correlations (like Pie-Eating Club, yum); we want a test that tells us that no hidden variables explain the correlations and therefore there must be some influence between friends.
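For readers who want to see what a Bell-type test looks like in the physics setting, here is a small sketch of the standard textbook CHSH version. It is not the test constructed in the paper, and the function names and measurement angles are my own choices for illustration. Every deterministic hidden variable assignment keeps the CHSH quantity within 2 in absolute value (and mixtures of them cannot do better), while the quantum prediction for a singlet pair, E = -cos(angle difference), reaches 2√2.

```python
import itertools
import math


def chsh(E):
    """CHSH combination S = E(a,b) + E(a,b') + E(a',b) - E(a',b').

    E(x, y) is the correlation when Alice uses setting x and Bob uses
    setting y, with x and y each 0 or 1.
    """
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)


def max_hidden_variable_chsh():
    """Largest |S| over all deterministic local hidden variable assignments.

    A hidden variable fixes in advance the +1/-1 outcome for each of
    Alice's two settings and each of Bob's two settings.
    """
    best = 0
    for a0, a1, b0, b1 in itertools.product([-1, 1], repeat=4):
        alice = (a0, a1)
        bob = (b0, b1)
        s = chsh(lambda x, y: alice[x] * bob[y])
        best = max(best, abs(s))
    return best


def quantum_chsh():
    """S for a singlet pair at a standard choice of measurement angles."""
    alice_angles = (0.0, math.pi / 2)
    bob_angles = (math.pi / 4, -math.pi / 4)
    # Quantum prediction for the singlet state: E = -cos(angle difference).
    return chsh(lambda x, y: -math.cos(alice_angles[x] - bob_angles[y]))


if __name__ == "__main__":
    print("hidden variable bound on |S|:", max_hidden_variable_chsh())
    print("quantum |S|:", abs(quantum_chsh()))
```

The point of the comparison is that the bound holds no matter what the hidden variables are, so a violation rules out the whole class at once instead of one candidate explanation at a time.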
Everything else is just mathematical details. We show how to construct these types of tests in a general way. In the end, we were able to show that hidden variable theories do not explain the correlations in obesity and, therefore, some other causal effect is needed to explain what is going on. Of course, my language here is purposely cagey, as all causality researchers are:
You will have to see the paper for all the caveats, but I can make the bold statement that according to the best causal tests, obesity is contagious, probably.
I’m presenting this work at AISTATS. Here is the paper, which builds on work I previously discussed.
Posted by Greg Ver Steeg