# What does lack of evidence of a causal relationship tell you?

post by James_Miller · 2011-06-08T19:03:45.283Z · score: 1 (2 votes) · LW · GW · Legacy · 10 comments

Imagine that you know there is a strong correlation between X and Y. Statistically competent scholars have extensively examined the causal relationship between X and Y; they have failed to find a significant causal relationship, but have also failed to rule out the possibility that one exists.

Would it be reasonable for you to claim that the causal relationship between X and Y probably isn't very strong, or it would have shown up clearly in statistical analysis? At the very least, should learning of the scholars' negative results cause you to decrease your estimate of the causal relationship between X and Y?

## 10 comments

Comments sorted by top scores.

Yes, see this for details on why this is so.

But in probability theory, absence of evidence is always evidence of absence. If E is a binary event and P(H|E) > P(H), "seeing E increases the probability of H"; then P(H|~E) < P(H), "failure to observe E decreases the probability of H". P(H) is a weighted mix of P(H|E) and P(H|~E), and necessarily lies between the two.
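The law of total probability behind that claim can be checked numerically. Here is a minimal sketch with made-up probabilities (the specific values are illustrative, not from the discussion above), showing that P(H) is a weighted mix of P(H|E) and P(H|~E) and therefore lies strictly between them whenever E is informative:

```python
# Numeric check of P(H) = P(H|E)·P(E) + P(H|~E)·P(~E).
# If P(H|E) > P(H), it follows that P(H|~E) < P(H): absence of
# evidence is evidence of absence.

p_e = 0.3              # P(E): prior probability of observing E (arbitrary)
p_h_given_e = 0.9      # P(H|E): seeing E supports H (arbitrary)
p_h_given_not_e = 0.4  # P(H|~E) (arbitrary)

# Law of total probability:
p_h = p_h_given_e * p_e + p_h_given_not_e * (1 - p_e)
print(p_h)  # 0.55

# P(H) lies strictly between the two conditional probabilities,
# so "E raises P(H)" and "~E lowers P(H)" must both hold.
assert p_h_given_not_e < p_h < p_h_given_e
```

Since P(H) is an average of the two conditionals weighted by P(E) and P(~E), the only way failure to observe E could leave P(H) unchanged is if E were entirely uninformative in the first place.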

Thanks! The second to last paragraph from your EY citation was exactly what I was looking for.

The causality looks like *something*. An obscure common cause is the most obvious (to me) source of the correlation if no one's put forth a plausible causal relationship yet. I'm not sure what you mean, though, by "causal relationship between X and Y"... do you mean specifically a relationship of the form "X -> ..Z.. -> Y" / "X <- ..Z.. <- Y", or do you mean "any causal structure connecting X and Y in any way"?

(Are you interested in some specific X,Y but phrasing it generally so we don't get distracted? I feel like seeing some examples of the failed tests run by statistically competent scholars would help me know what they haven't ruled out)

I'm mostly interested in whether X causes Y vs. whether some Z causes both X and Y.

I didn't find that clear from your article. A correlation between X and Y tells you no more than that causality is present *somewhere*. It tells you absolutely nothing about whether X causes Y, Y causes X, Z causes X and Y, how long the causal chains are, or whether it's a sampling artefact due to common effects of X and Y.

Those options aren't mutually exclusive...

Or exhaustive. Imperfect sampling can produce sample correlations among variables with no causal connection. (Toy example: X and Y are independent, Z is jointly caused by X and Y and is equal to X+Y, and everyone is unwittingly sampling from a subpopulation with a narrow range of values of Z. Sample X and Y will have a high negative correlation.)

Could you give a concrete example of such sampling bias?

A real one? Not offhand, not being a statistician, but sampling bias is a standard problem that has to be guarded against in statistical investigations. It can affect not just the sample means of variables, but correlations and indeed every statistic whatsoever.

To flesh out the toy example with an imaginary narrative, suppose X = intelligence, Y = effort, and Z = exam grade. Suppose Z is highly correlated with X+Y. If we divide the population up by exam grade, we may find that in every subpopulation, X and Y are negatively correlated, even while in the whole population, X and Y are uncorrelated.
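This narrative can be simulated directly. The sketch below uses made-up distributions (standard normals for X and Y; the grade band is arbitrary): X and Y are independent in the full population, but restricting the sample to a narrow range of Z = X + Y induces a strong negative correlation between them.

```python
# Simulation of the toy example: X (intelligence) and Y (effort) are
# independent, Z = X + Y (exam grade). Sampling only a narrow band of Z
# makes X and Y appear negatively correlated.
import random

random.seed(0)
population = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]

def correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    sx = (sum((x - mx) ** 2 for x, _ in pairs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for _, y in pairs) / n) ** 0.5
    return cov / (sx * sy)

# Whole population: X and Y are (nearly) uncorrelated.
print(round(correlation(population), 2))

# One "grade band": condition on a narrow range of Z = X + Y.
# Within the band, y is forced to be roughly (constant - x),
# so the sample correlation is strongly negative.
band = [(x, y) for x, y in population if 0.9 < x + y < 1.1]
print(round(correlation(band), 2))
```

Within any single grade band, a student who scored high on X must have scored low on Y to land in that band, which is exactly the conditioning-on-a-common-effect artefact described above.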

If there is no correlation between how many apples I drop from a height and how many hit the bottom, then the most likely hypothesis is either that there is some kind of barrier and the apples I notice hitting the bottom come from an unrelated source, or that some process is reacting to my dropping X apples by itself dropping k − X apples.