comment by IlyaShpitser · 2013-02-10T11:36:31.572Z
> > Dozens of fields are concerned with "identifying causal effects from data"; pretty much all the natural sciences and all their myriad subspecializations can be viewed through such a lens.
>
> That's the crux: *can* be viewed as such. Yet I doubt you'll find all that many medical studies, physical experiments, etc. invoking do-calculus. That does not void their results; there are ways of interpreting the results that do not rely on grasping, or even being aware of, the math behind the curtain.
>
> "That's just like, your opinion, man."
See, you don't get to say that. When people talk about causal effects from randomization (à la what Fisher talked about), effects of interventions are what they mean. That is the math behind what they want, just like complex-valued matrices are the math behind quantum mechanics, or the Peano axioms the math behind arithmetic. Not everyone uses the language of do(.); some use potential-outcome language, which is equivalent. But either their language is equivalent to do(.), or they are essentially producing garbage (and I assure you, there is a lot of garbage out there). In fields like epidemiology, what they often have is the data people (who know about HIV, say, or cancer) and the methods people (who know how not to get garbage from the data).
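To make the "do(.) is the math behind what they want" point concrete, here is a minimal sketch, with entirely made-up probability tables: for a single treatment confounded by one observed covariate, p(Y | do(T)) reduces to the backdoor adjustment formula, computable from the observed joint alone, while the naive conditional is biased.

```python
from itertools import product

# Hypothetical discrete model (all tables made up): C is a confounder of
# treatment T and outcome Y, i.e. C -> T, C -> Y, T -> Y.
p_c = {0: 0.7, 1: 0.3}                   # p(C=c)
p_t = {0: 0.2, 1: 0.8}                   # p(T=1 | C=c)
p_y = {(0, 0): 0.1, (1, 0): 0.3,         # p(Y=1 | T=t, C=c), keyed (t, c)
       (0, 1): 0.5, (1, 1): 0.7}

def bern(p, v):
    return p if v == 1 else 1.0 - p

# Observed joint p(c, t, y) implied by the model.
joint = {(c, t, y): p_c[c] * bern(p_t[c], t) * bern(p_y[(t, c)], y)
         for c, t, y in product([0, 1], repeat=3)}

# Naive conditional p(Y=1 | T=1): biased by the open path T <- C -> Y.
naive = (sum(joint[(c, 1, 1)] for c in [0, 1])
         / sum(joint[(c, 1, y)] for c in [0, 1] for y in [0, 1]))

# Backdoor adjustment, using only the observed joint:
#   p(Y=1 | do(T=1)) = sum_c p(c) * p(Y=1 | T=1, C=c)
def cond_y1(t, c):
    return joint[(c, t, 1)] / (joint[(c, t, 0)] + joint[(c, t, 1)])

marg_c = {c: sum(joint[(c, t, y)] for t in [0, 1] for y in [0, 1])
          for c in [0, 1]}
adjusted = sum(marg_c[c] * cond_y1(1, c) for c in [0, 1])

# Ground truth, obtained by intervening in the structural model itself.
truth = sum(p_c[c] * p_y[(1, c)] for c in [0, 1])

print(naive, adjusted, truth)  # adjusted equals truth (0.42); naive does not
```

A researcher who computes `adjusted` is using do(.) whether or not they ever write the symbol.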
The fact of the matter is, there are all sorts of gotchas in causal inference that carelessness and reliance on intuition leave you vulnerable to. I can give endless examples:
(a) People doing longitudinal causal inference basically failed at time-varying confounders until 1986, when the right method was developed. They would report garbage causal effects from longitudinal studies, because they thought they just needed to adjust for those confounders. No. Wrong. You have to use the equivalent of g-computation.
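A toy version of the trap in (a), with made-up numbers: L1 is both a consequence of the first treatment A0 and a confounder for the second treatment A1 (via an unobserved U that also drives Y). Treating L1 as an ordinary baseline covariate gives the wrong answer; the g-formula recovers the truth from observed data alone.

```python
from itertools import product

# Toy two-stage model (all tables made up): A0 -> L1 -> A1 -> Y, plus an
# unobserved U with U -> L1 and U -> Y. L1 is a confounder for A1 but a
# consequence of A0: the classic time-varying-confounding trap.
p_u  = {0: 0.5, 1: 0.5}
p_a0 = {0: 0.5, 1: 0.5}                                      # A0 randomized
p_l1 = {(0, 0): 0.2, (1, 0): 0.7, (0, 1): 0.4, (1, 1): 0.9}  # p(L1=1 | a0, u)
p_a1 = {0: 0.3, 1: 0.8}                                      # p(A1=1 | l1)
p_y  = {(0, 0): 0.2, (1, 0): 0.5, (0, 1): 0.6, (1, 1): 0.9}  # p(Y=1 | a1, u)

def bern(p, v):
    return p if v == 1 else 1.0 - p

# Full joint, including the unobserved U.
joint = {(u, a0, l1, a1, y):
         p_u[u] * p_a0[a0] * bern(p_l1[(a0, u)], l1)
         * bern(p_a1[l1], a1) * bern(p_y[(a1, u)], y)
         for u, a0, l1, a1, y in product([0, 1], repeat=5)}

def obs(a0, l1, a1, y):  # observed joint: U marginalized out
    return sum(joint[(u, a0, l1, a1, y)] for u in [0, 1])

def cond_y1(a0, l1, a1):
    return obs(a0, l1, a1, 1) / (obs(a0, l1, a1, 0) + obs(a0, l1, a1, 1))

def p_l1_given_a0(l1, a0):
    num = sum(obs(a0, l1, a1, y) for a1 in [0, 1] for y in [0, 1])
    den = sum(obs(a0, l, a1, y)
              for l in [0, 1] for a1 in [0, 1] for y in [0, 1])
    return num / den

# Ground truth: p(Y=1 | do(A0=1, A1=1)) from the structural model.
truth = sum(p_u[u] * p_y[(1, u)] for u in [0, 1])

# g-formula: sum_l1 p(l1 | a0=1) p(Y=1 | a0=1, l1, a1=1) -- observed data only.
g = sum(p_l1_given_a0(l1, 1) * cond_y1(1, l1, 1) for l1 in [0, 1])

# Naive "adjust for L1 as a baseline covariate": biased, since A0 affects L1.
p_l1_marg = {l1: sum(obs(a0, l1, a1, y)
                     for a0 in [0, 1] for a1 in [0, 1] for y in [0, 1])
             for l1 in [0, 1]}
naive = sum(p_l1_marg[l1] * cond_y1(1, l1, 1) for l1 in [0, 1])

print(truth, g, naive)  # g matches the truth (0.7); the naive adjustment does not
```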
(b) People try to use regression coefficients as mediated causal effects, even when this is not warranted (that is, when the coefficient doesn't correspond to anything causal). No. Wrong. This fails if you have discrete mediators. This fails with interaction terms. This fails under certain natural modeling choices. This fails if you have unobserved confounding. In general, a mediated effect is a complicated functional of the observed data, not a regression coefficient.
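A minimal illustration of the interaction-term failure in (b), with made-up coefficients: once the outcome model contains a treatment-mediator interaction, the familiar "product of coefficients" shortcut no longer equals the natural indirect effect.

```python
# Hypothetical linear structural model with a treatment-mediator interaction:
#   M = a0 + a1*T
#   Y = b0 + b1*T + b2*M + b3*T*M        (b3 is the interaction term)
a0, a1 = 0.0, 2.0
b0, b1, b2, b3 = 0.0, 1.0, 0.5, 0.3

def M(t):        # E[M | do(T=t)]
    return a0 + a1 * t

def Y(t, m):     # E[Y | do(T=t, M=m)]
    return b0 + b1 * t + b2 * m + b3 * t * m

# Natural indirect effect at t=1: shift the mediator from its T=0 level to
# its T=1 level while holding the treatment itself fixed at 1.
nie = Y(1, M(1)) - Y(1, M(0))

# Naive "product of coefficients" estimate (a1 * b2) from the two regressions.
naive = a1 * b2

print(nie, naive)  # 1.6 vs 1.0: the coefficient product misses the interaction
```

With `b3 = 0` the two numbers coincide, which is exactly why the shortcut looks safe until the model stops being that simple.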
(c) People try to test for the causal null even when their parametric model does not permit the null to hold (the "null paradox").
(d) Don Rubin (a famous Harvard statistician, one of the people who wrote down the EM algorithm, and one of the people behind potential outcomes) once said that you should adjust for all covariates. He was just trying to be a good Bayesian (you have to use all the data, right?). No. Wrong. You only adjust for what you need to block all non-causal paths, while not opening any new non-causal paths.
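A sketch of how adjusting for everything can hurt, using the classic M-structure with made-up linear-Gaussian coefficients: the true effect of T on Y is zero, and the unadjusted regression says so, but adjusting for the pretreatment covariate M opens the path T &lt;- U1 -&gt; M &lt;- U2 -&gt; Y and manufactures a spurious effect.

```python
# M-structure: U1 -> T, U1 -> M <- U2, U2 -> Y, and NO edge T -> Y, so the
# true causal effect of T on Y is zero. U1, U2 and the noise terms are
# independent unit-variance variables; all path coefficients are 1 (made up).
# Implied second moments:
var_t, var_m = 2.0, 3.0
cov_tm, cov_my, cov_ty = 1.0, 1.0, 0.0

# Regressing Y on T alone: coefficient = cov(T,Y)/var(T) = 0 -- unbiased.
simple = cov_ty / var_t

# Regressing Y on T and M ("adjust for everything"): solve the 2x2 normal
# equations [[var_t, cov_tm], [cov_tm, var_m]] @ beta = [cov_ty, cov_my].
det = var_t * var_m - cov_tm ** 2
beta_t = (var_m * cov_ty - cov_tm * cov_my) / det   # coefficient on T
beta_m = (var_t * cov_my - cov_tm * cov_ty) / det   # coefficient on M

print(simple, beta_t)  # 0.0 vs -0.2: adjusting for M manufactures an "effect"
```

Here M is a perfectly respectable pretreatment covariate, which is exactly why "adjust for everything observed" is not a safe rule.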
(e) An example from something written at Less Wrong: "a Bayesian network is a causal model." No. Wrong. A Bayesian network is a statistical model (a set of densities) defined by conditional independences. In order to have a causal model, you need to say how interventions relate to observations (essentially, you need to formally assert that parents are direct causes).
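To see (e) concretely: the two DAGs A -&gt; B and B -&gt; A are the same statistical model (either can represent any joint over two binary variables), yet they make different predictions under intervention. Made-up numbers:

```python
# One joint distribution p(A, B), two causal stories. Both DAGs A -> B and
# B -> A represent this joint exactly (same statistical model), but they
# disagree about p(B=1 | do(A=1)). Numbers are made up for illustration.
p = {(0, 0): 0.4, (0, 1): 0.1,    # p(A=a, B=b)
     (1, 0): 0.1, (1, 1): 0.4}

p_b1 = p[(0, 1)] + p[(1, 1)]                          # p(B=1) = 0.5
p_b1_given_a1 = p[(1, 1)] / (p[(1, 0)] + p[(1, 1)])   # p(B=1 | A=1) = 0.8

# If A -> B: intervening on A leaves the mechanism for B given A intact.
do_if_a_causes_b = p_b1_given_a1      # 0.8

# If B -> A: do(A=1) does not touch B's mechanism at all.
do_if_b_causes_a = p_b1               # 0.5

print(do_if_a_causes_b, do_if_b_causes_a)
```

The joint distribution alone cannot tell these two stories apart; the causal arrow is an extra assumption, not something the statistical model carries for free.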
Actually, the list is so long that I am trying to put it into paper form.
This stuff is not simple, and even very smart people can be confused! So if you want to do causal inference, you know, read up on it. I am surprised this is a controversial point. To quote Miguel Hernan: the g-formula (expressing do(.) in terms of observed data) is not *a* causal method, it is *the* causal method.
If you don't want to read Pearl, you can read Robins, or Dawid, or the potential outcomes people who learned from Rubin. The formalism is the same.