Big data shows: people who carry a lighter in their pocket are more than ten times as likely to develop lung cancer as those who do not. The p-value is significant.
If you feed this data to an AI model that only understands correlation, its health recommendation might be: “To reduce your cancer risk, please discard your lighter immediately.”
Absurd? Your intuition can refute it right away: lighters do not cause cancer — smoking does. Smokers carry lighters, and smokers are more likely to get lung cancer. “Smoking” simultaneously causes both “carrying a lighter” and “getting lung cancer.” It is the puppet master behind the curtain, making two otherwise unrelated variables appear strongly correlated.
The problem is: in the real world, most “lighters” are not this obvious. Every day, as we analyze data, build models, and make decisions, the hidden third party is often lurking where we cannot see it.
19. Confounding Factor: The Puppet Master Behind the Curtain
A confounding factor (confounder) is a variable that simultaneously influences both X and Y, making X and Y appear causally related when in reality they are both just being driven by it.
Does red wine lead to longer life? Numerous studies have shown that moderate drinkers live longer than non-drinkers. Wine companies promoted this aggressively — until more rigorous research revealed the truth: people who can leisurely sip red wine every day tend to have higher socioeconomic status, better healthcare, healthier diets, and less life stress. It is “being wealthy” that makes them live longer, and being wealthy also lets them afford red wine. Red wine is a correlate, not a cause.
In product development, “users who use advanced features have higher retention” is a common observation. The PM’s conclusion: push the advanced feature on everyone. Result: retention does not budge. Because “highly loyal users” (the confounder) are inherently more willing to explore advanced features — it is loyalty that leads to usage, not usage that creates loyalty.
The key question for spotting a confounder: “Is there a third variable that simultaneously influences both the X and Y I am studying?”
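The lighter example can be made concrete with a minimal simulation. All numbers here are invented for illustration: a confounder (smoking) drives both lighter-carrying and cancer, with no direct link between the latter two. The naive risk ratio is huge; stratifying on the confounder makes it vanish.

```python
# Hypothetical simulation: a confounder Z ("smoking") drives both
# X ("carries a lighter") and Y ("lung cancer"); there is NO direct X -> Y arrow.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

smoker = rng.random(n) < 0.3                        # Z: 30% smoke
lighter = np.where(smoker, rng.random(n) < 0.95,    # X: smokers usually carry one
                           rng.random(n) < 0.05)
cancer = np.where(smoker, rng.random(n) < 0.15,     # Y: risk depends only on Z
                          rng.random(n) < 0.01)

# Naive comparison: cancer rate among lighter carriers vs. non-carriers
risk_with = cancer[lighter].mean()
risk_without = cancer[~lighter].mean()
print(f"naive risk ratio: {risk_with / risk_without:.1f}")   # far above 1

# Stratifying on the confounder dissolves the apparent effect
for s in (True, False):
    grp = smoker == s
    ratio = cancer[grp & lighter].mean() / cancer[grp & ~lighter].mean()
    print(f"smoker={s}: risk ratio within stratum ~ {ratio:.2f}")  # close to 1
```

Within each smoking stratum, lighters and cancer are unrelated; the entire naive association was carried by the confounder.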
20. Reverse Causality: You Got the Direction Backwards
“Data shows that neighborhoods with higher police presence have higher crime rates. So to lower crime, we should reduce police presence?”
This is reverse causality. It is the high crime rate that leads to more police being deployed. A and B are associated, but the direction producing the association is the opposite of what you assumed.
In marketing analysis, this trap is extremely common: “Regions where we spend more on ads have higher sales.” But is it the ads driving sales, or is it that those regions already had strong demand and the marketing team decided to invest more budget there? If you get the direction wrong, you might burn your budget in a market with zero potential, expecting ads to “create” demand that was never there.
A basic tool for judging causal direction: temporal order. A must occur before B for A to possibly cause B. But temporal order is a necessary condition, not a sufficient one — A happening before B does not mean A caused B.
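A small sketch of the police example, with made-up numbers: crime drives police deployment (Y causes X), yet a naive regression of crime on police presence still produces a strong positive association. Nothing in that slope tells you what would happen to crime if you changed police levels by decree.

```python
# Hypothetical sketch: crime rates drive police deployment (Y -> X),
# yet police presence and crime remain strongly positively associated.
import numpy as np

rng = np.random.default_rng(1)
n = 500

crime = rng.gamma(shape=5, scale=10, size=n)           # underlying crime rate
police = 2 + 0.5 * crime + rng.normal(0, 3, size=n)    # deployed *because of* crime

# The correlation is real and strong...
r = np.corrcoef(police, crime)[0, 1]
print(f"corr(police, crime) = {r:.2f}")

# ...but the regression slope of crime on police only says "where crime is
# high, police are numerous" -- it does not predict the effect of an
# intervention that adds or removes police.
slope = np.polyfit(police, crime, 1)[0]
print(f"naive slope of crime on police: {slope:.2f}")
```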
21. Collider Bias: Filtering Your Sample Creates a False Correlation
Why is the stereotype “handsome guys are jerks” so widespread? Does beauty really corrode character?
No. This is the result of sample filtering. The people you choose to date or pay attention to typically need to satisfy at least one condition: either good looks or a nice personality. Someone who is both unattractive and has a terrible personality never enters your social circle. Among the people you can actually observe, those who are unattractive but still become your friends must have great personalities; those who are good-looking do not need a great personality to appear in your field of vision.
Result: in your sample, attractiveness and personality are negatively correlated — but in the real population, the two might have no relationship at all. “Entering your social circle” is a collider — it is caused by both attractiveness and personality. When you only observe within this filtered sample, you manufacture a false correlation.
In user satisfaction analysis, a similar structure is common: you only analyze users who “continue using the product,” and “continue using” is jointly filtered by “satisfaction” and “switching costs.” The relationships between various characteristics observed within this sample may not reflect the true user population.
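The collider mechanism is easy to demonstrate with simulated data (the variables and threshold here are invented): looks and personality are generated independently, but conditioning on the “in your social circle” filter (at least one of the two is high) induces a clear negative correlation in the observed sample.

```python
# Hypothetical sketch of collider bias: two independent traits become
# negatively correlated once we condition on a filter caused by both.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

looks = rng.normal(size=n)
personality = rng.normal(size=n)                    # independent of looks

# The collider: you only meet people with at least one trait above a bar
in_circle = (looks > 0.5) | (personality > 0.5)

r_population = np.corrcoef(looks, personality)[0, 1]
r_sample = np.corrcoef(looks[in_circle], personality[in_circle])[0, 1]
print(f"full population:  r = {r_population:+.2f}")   # ~ 0
print(f"filtered sample:  r = {r_sample:+.2f}")       # clearly negative
```

The negative correlation exists only inside the filtered sample; no edit to the data-generating process was needed to manufacture it.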
22. Spurious Correlation: Coincidence, Not Causation
Ice cream sales and drowning deaths are strongly positively correlated. If you did not know summer exists, you might conclude that ice cream causes drowning.
The difference between spurious correlation and confounding factors: a confounding factor is a meaningful third party (smoking genuinely causes both carrying a lighter and lung cancer), while a spurious correlation might just be a statistical coincidence, or the mechanism behind it is extremely indirect. The number of films Nicolas Cage appeared in each year correlated strongly with U.S. swimming pool drowning deaths for many consecutive years — with no plausible causal mechanism whatsoever.
In time series data, spurious correlations are especially easy to produce: any two metrics that both trend upward over time can show high correlation, yet they may have absolutely no causal relationship — they are just both growing. Mistaking “growing together” for “influencing each other” is one of the most common errors in time series analysis.
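The trending-series trap can be shown in a few lines (both series and their trends are fabricated): two independent metrics that each drift upward correlate strongly in levels, while their period-to-period changes, which strip out the shared trend, show essentially no relationship. First-differencing is one common, though not foolproof, sanity check.

```python
# Hypothetical sketch: two independent upward-trending series look highly
# correlated in levels; their first differences reveal no relationship.
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(200)

a = 0.5 * t + rng.normal(0, 5, size=t.size)   # metric A: trend + noise
b = 0.3 * t + rng.normal(0, 5, size=t.size)   # metric B: independent trend + noise

r_levels = np.corrcoef(a, b)[0, 1]
r_diffs = np.corrcoef(np.diff(a), np.diff(b))[0, 1]
print(f"correlation in levels:      {r_levels:.2f}")  # high -- both just grow
print(f"correlation in differences: {r_diffs:.2f}")   # near zero
```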
23. Mediation Fallacy: You Severed the Causal Path You Were Trying to Study
This is one of the most insidious traps in causal inference — technically it looks correct, but logically it is completely wrong.
Suppose you want to study whether an onboarding feature (X) improves 6-month long-term retention (Y). You know that Day-3 retention (M) is an important early indicator, so you control for M in your regression model to make the analysis more “precise.”
Result: after controlling for M, the onboarding feature’s effect on long-term retention nearly vanishes. Conclusion: “The onboarding feature has no effect on long-term retention.”
This conclusion is wrong. The onboarding feature improves long-term retention precisely because it first improves Day-3 retention, and Day-3 retention then influences long-term retention. The causal path is X -> M -> Y. When you control for M, you block the main pathway through which X affects Y — of course you cannot see the effect. What you measured is not “no effect” — you severed the causal chain with your own analytical method.
In epidemiology, this is called the “Table 2 Fallacy”: in a multivariable regression, mediators and confounders get controlled for indiscriminately, even though the two must be treated in opposite ways. Controlling for confounders makes your conclusions more accurate; controlling for mediators makes your conclusions meaningless. The problem is, without drawing a causal diagram first, you have no way of knowing which is which.
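The onboarding example can be reproduced numerically. The coefficients below are invented: X affects Y entirely through M (the path X -> M -> Y), so the regression of Y on X recovers the true total effect, while adding M as a control drives X's coefficient to zero.

```python
# Hypothetical sketch of the mediation trap: X (onboarding) improves Y
# (long-term retention) only via M (Day-3 retention). Controlling for M
# makes X's coefficient vanish even though X genuinely works.
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

x = rng.binomial(1, 0.5, size=n).astype(float)   # onboarding shown or not
m = 0.8 * x + rng.normal(0, 1, size=n)           # X -> M
y = 0.6 * m + rng.normal(0, 1, size=n)           # M -> Y (no direct X -> Y arrow)

def ols(y, *cols):
    """Least-squares coefficients for y ~ intercept + cols."""
    design = np.column_stack([np.ones_like(y), *cols])
    return np.linalg.lstsq(design, y, rcond=None)[0]

total = ols(y, x)[1]           # coefficient on x: total effect, ~ 0.8 * 0.6
controlled = ols(y, x, m)[1]   # coefficient on x after controlling for M, ~ 0
print(f"total effect of X:       {total:.2f}")
print(f"after controlling for M: {controlled:.2f}")
```

The second regression is not “more precise” -- it answers a different question (the direct effect of X holding M fixed), which here is genuinely zero.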
The distance from correlation to causation is far greater than most people imagine. Correlation can help you “predict” (people carrying lighters really do have higher lung cancer risk — useful for insurance pricing), but if you want to “change” an outcome, you must find the true causal path.
A recommended habit: every time you see “A and B are associated,” draw a simple causal diagram. Circles for variables, arrows for what you believe is the causal direction. Once you draw it, you will often find an un-circled third party quietly controlling the whole picture.
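The diagram-drawing habit can even be mechanized in miniature. This sketch (node names and edges are illustrative, not from any real study) writes the arrows down as a dict and asks the confounder question literally: does any third variable point at both X and Y?

```python
# Hypothetical sketch: a causal diagram as a dict of arrows, plus the
# confounder question asked mechanically.
edges = {                       # parent -> list of children
    "smoking": ["lighter", "lung_cancer"],
    "lighter": [],
}

def common_causes(edges, x, y):
    """Variables with an arrow into both x and y -- candidate confounders."""
    return [v for v, kids in edges.items() if x in kids and y in kids]

print(common_causes(edges, "lighter", "lung_cancer"))   # ['smoking']
```

This only checks direct parents; real causal-diagram tools also trace longer back-door paths, but even this crude version forces you to make your assumed arrows explicit.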