Correlation vs. Causation
Everyday Einstein uncovers the truth (and lies) of the correlation/causation fallacy. Just because something seems to cause something else doesn’t necessarily mean it does.
A few weeks ago, our family took a ferry across the English Channel. On the way from England to France, the ferry was extremely crowded, and the crossing took about 75 minutes. On the way back from France to England, there was hardly anyone on the ferry, and the captain announced that the crossing would take 95 minutes.
My 5-year-old, wondering why the crossing was going to take longer on the way back, came up with this hypothesis:
“Maybe when there are less people, the boat goes slow; but when there are more people the boat goes faster.”
All the facts seemed to support her, and while this could be possible, I hope you’ll agree that it wasn’t the most likely explanation. My 5-year-old had fallen prey to a classic statistical fallacy, the one behind the warning that correlation is not causation.
This phrase is so well known that even people who don’t know anything about statistics often know it to be true. But the thing is, sometimes in science correlation is all you’ve got.
The official name for this type of logical fallacy is “cum hoc ergo propter hoc,” or “with this, therefore because of this.” According to my daughter’s reasoning, when fewer people got on the ferry, the trip took longer. Therefore the trip took longer because fewer people got on the ferry.
It’s easy to see the problem with that logic in these examples:
“After I washed my car, it rained. Therefore washing my car causes rain.”
“When I got in the bathtub, the phone rang. Therefore getting in the bath will lead to the phone ringing.”
“We won our baseball game when I was wearing these socks, so it must be the lucky socks that caused our win.”
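To see how two things can move together without either one causing the other, here is a minimal Python sketch. Everything in it is made up for illustration: the numbers are simulated, and `hidden`, `x`, and `y` are hypothetical stand-ins for an unseen common cause and the two things we actually measure. It assumes, purely for simplicity, that the hidden factor adds directly to both measurements.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (spread_x * spread_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical hidden factor that influences both measurements.
hidden = [rng.gauss(0, 1) for _ in range(n)]

# x and y each depend on the hidden factor plus their own noise.
# Neither one appears in the formula for the other.
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]

# Strong correlation, even though x does not cause y.
r_raw = pearson(x, y)

# Subtract the hidden factor's contribution (known here only because
# we built the data ourselves) and the correlation mostly vanishes.
x_rest = [a - h for a, h in zip(x, hidden)]
y_rest = [b - h for b, h in zip(y, hidden)]
r_controlled = pearson(x_rest, y_rest)

print(f"correlation of x and y:          {r_raw:.2f}")
print(f"after removing the hidden cause: {r_controlled:.2f}")
```

In this toy setup the first number should come out strongly positive and the second close to zero. In real research the hidden factor is usually not known in advance, which is exactly why a correlation on its own can't settle the question.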
So if this is such a well-known fallacy, why does it show up so often? It shows up most often in media headlines, which unfortunately are where most people get their science information and news. Imagine you’re looking to buy a magazine. Which headline best grabs your attention:
“One study on a limited population shows that when people do X, Y happens a certain percentage of the time.”
“Link found between doing X and Y happening!”
“New research shows that X causes Y.”
If you ever read a scientific paper, you’ll find that almost all scientists make statements like that first one. However, by the time this research hits the popular media, it’s often transformed to look a lot more like that last one.
Why We Care About Correlations
So if correlations are such rubbish, why do scientists spend so much time telling us about them? The thing is, while no scientist believes that correlation necessarily means causation, a correlation between two things can be like a signpost that helps guide scientists to the truth.
Imagine that you’re trying to figure out your boyfriend’s favorite kind of ice cream so you can buy him some for his birthday. You don’t want to give away the surprise by coming right out and asking him. However, you’ve noticed that whenever he goes out for ice cream with his friends, he always comes back with a chocolate stain on his shirt. In other words, chocolate shirt stains are correlated with going out for ice cream.
Now you might want to jump to a conclusion here and say, “The chocolate stain is caused by him eating chocolate ice cream!” But that would be succumbing to the correlation fallacy, and you’re smarter than that. Maybe one of his friends always manages to spill chocolate ice cream on your boyfriend’s shirt. Maybe they’re involved in some kind of male bonding that requires them to throw ice cream at each other. It might not even be plain chocolate; maybe it’s rocky road or fudge ripple. The possibilities are endless.
So you decide to use your stain observations to come up with a hypothesis that can be tested. You hypothesize that your boyfriend likes plain chocolate. To test your hypothesis, you buy some chocolate ice cream for yourself and offer him a bite. He turns his nose up at it and says that plain chocolate is too boring.
While you haven’t discovered the truth yet, you can use this new evidence to refine your hypothesis. You look for more correlations, noticing how much he seems to like marshmallows. A few days later you offer him a bite of rocky road. His eyes light up and a broad smile stretches across his face as he takes half of your ice cream in a single bite. Success!
So now you know all about correlation and causation. The 3 takeaway messages are:
Just because two things happen together doesn’t necessarily mean that one causes the other.
Looking for correlations is one of the most frequently used techniques in science because it provides us with hypotheses we can test to find the true cause of what we’re investigating.
Your boyfriend needs a lesson on how to eat ice cream properly.
If you liked today’s episode, you can become a fan of Everyday Einstein on Facebook or follow me on Twitter, where I’m @QDTeinstein. If you have a question that you’d like to see on a future episode, send me an email at firstname.lastname@example.org.