Correlation doesn’t imply causation (but some correlations can imply at least some causation)


If you’ve been reading science news, you’ve seen over the years that obesity, happiness, loneliness, and divorce are all contagious. Just a few weeks ago, I read that grades are also contagious. Certainly, it’s not surprising to find out that friends’ behaviors are correlated on all these fronts, but is that enough to say for sure that your friends cause you to become more obese/happy/lonely etc.?

Generally, the answer is no. Shalizi explains on his blog and in a paper with Andrew Thomas why this is the case. The basic idea is expressed in the picture below. It could be that Alice influences Bob directly, but it could also be (for example) that Alice has joined a Pie-Eating club and Bob has joined the same club. The Pie-Eating club explains why they both have a tendency to become obese and it explains how they became friends. The club, in this case, is a hidden variable that gives an alternative explanation for correlations in obesity.

[Figure: influence or homophily]
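To make the Pie-Eating club story concrete, here is a toy simulation. It is only a sketch: the function names and every probability in it are made up for illustration and do not come from the paper. Alice and Bob never influence each other, yet their obesity is correlated because both depend on the same hidden club membership.

```python
import random


def simulate_pairs(n_pairs=20000, seed=0):
    """Simulate friend pairs whose obesity depends only on a shared hidden club.

    All probabilities are hypothetical, chosen just for illustration.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        in_club = rng.random() < 0.5        # hidden variable shared by both friends
        p_obese = 0.6 if in_club else 0.2   # the club raises each friend's risk
        alice = 1 if rng.random() < p_obese else 0
        bob = 1 if rng.random() < p_obese else 0  # Bob never looks at Alice
        pairs.append((in_club, alice, bob))
    return pairs


def correlation(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5


pairs = simulate_pairs()

# Correlation over all pairs: positive, even though there is no influence.
overall = correlation([a for _, a, _ in pairs], [b for _, _, b in pairs])

# Correlation among club members only: close to zero once the club is known.
members = [(a, b) for club, a, b in pairs if club]
within_club = correlation([a for a, _ in members], [b for _, b in members])

print(f"overall correlation:     {overall:.3f}")
print(f"within-club correlation: {within_club:.3f}")
```

With these made-up numbers the overall correlation is about 0.17 in expectation, while the correlation among club members alone is about zero. If we could observe the club, we could condition on it; the trouble is that in real data the club is hidden.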

The problem is that it is easy to come up with more and more elaborate hidden variables that might explain correlations in, e.g., obesity, and we could never hope to account for them all. Is hope lost? Not quite.

It turns out that a similar problem arises in quantum physics. Einstein saw the “spooky action at a distance” implied by quantum physics and declared that it could not be: ultimately, there must be some hitherto unmeasured hidden variables that explain the correlations between distantly separated particles. Amazingly, Einstein was wrong: John Bell demonstrated a simple test that would be satisfied by any hidden variable theory but is violated by quantum physics. What we do is extend this reasoning to correlations in social networks. We don’t want to account for every possible source of correlations (like Pie-Eating Club, yum). Instead, we want a test whose violation tells us that no hidden variables can explain the correlations, and therefore there must be some influence between friends.

[Figure: Bell inequality for networks]
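For readers who have not seen a Bell test, here is a small sketch of the textbook physics version, the CHSH inequality. To be clear, this is not the network test from the paper, and the function names are mine. It checks by brute force that every deterministic hidden variable assignment keeps the CHSH quantity within 2 in absolute value, and that the standard quantum prediction for a singlet pair, E(a, b) = -cos(a - b), exceeds that bound at suitably chosen angles.

```python
import itertools
import math


def chsh(e):
    """CHSH combination S for a correlation function e(x, y) over settings 0 and 1."""
    return e(0, 0) + e(0, 1) + e(1, 0) - e(1, 1)


def max_hidden_variable_chsh():
    """Largest |S| over all deterministic hidden variable strategies.

    A strategy fixes in advance an outcome of +1 or -1 for each of Alice's
    two settings and each of Bob's two settings.
    """
    best = 0
    for a0, a1, b0, b1 in itertools.product((-1, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        s = chsh(lambda x, y: alice[x] * bob[y])
        best = max(best, abs(s))
    return best


def singlet_chsh():
    """S for the singlet-state prediction E(a, b) = -cos(a - b)."""
    alice_angles = (0.0, math.pi / 2)
    bob_angles = (math.pi / 4, -math.pi / 4)
    return chsh(lambda x, y: -math.cos(alice_angles[x] - bob_angles[y]))


print("hidden variable bound on |S|:", max_hidden_variable_chsh())
print("quantum |S|:", abs(singlet_chsh()))
```

Randomized hidden variable theories are mixtures of these deterministic strategies, and S is linear in the correlations, so they obey the same bound of 2. The quantum value is 2√2, about 2.83. The point of the analogy is that one violated inequality rules out the whole family of hidden variable explanations at once, with no need to enumerate them.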

Everything else is just mathematical details. We show how to construct these types of tests in a general way. In the end, we were able to show that hidden variable theories do not explain the correlations in obesity and, therefore, some other causal effect is needed to explain what is going on. Of course, my language here is purposely cagey, as is the language of all causality researchers.

You will have to see the paper for all the caveats, but I can make the bold statement that according to the best causal tests, obesity is contagious, probably.

I’m presenting this work at AISTATS; here is the paper, which builds on work I previously discussed.

