Do we have a term for the issue with quantifying policy effect Scott Alexander stumbled on multiple times?

post by yhoiseth · 2021-07-29T08:45:29.636Z · LW · GW · 6 comments

This is a question post.

In Things I Learned Writing The Lockdown Post, Scott Alexander describes a really tricky issue that comes up when trying to quantify the effects of some policies:

This question was too multi-dimensional. As in, you could calculate everything right according to some model, and then someone else could say "but actually none of that matters, the real issue is X", and you would have a hard time proving it wasn't.

A long time ago, I remember being asked whether banning marijuana was good or bad. I spent a long time figuring out the side effects of marijuana, how addictive it was, how many people got pain relief from it, how many people were harmed by the War on Drugs, etc - and it turned out all of this was completely overwhelmed by the effects of deaths from intoxicated driving. If even a few people drove on marijuana and crashed and killed people, that erased all its gains; if even a few people used marijuana instead of alcohol and then drove while less intoxicated than they would have been otherwise, that erased all its losses. This was - "annoying" is exactly the right word - because what I (and everyone else) wanted was a story about how dangerous and addictive marijuana was vs. how many people were helped by its medical benefits, and none of that turned out to matter at all compared to some question about stoned driving vs. substituting-for-drunk-driving, which nobody started out caring about.

It might actually be even worse than that, because there was some hard-to-quantify chance that marijuana decreased IQ, and you could make an argument that if there was a 5% chance it decreased IQ by let's say 2 points across the 50% of the population who smoked pot the most, and you took studies about IQ vs. job success, criminality, etc, really seriously, then lowering the national IQ 1 point might have been more important than anything else. But this would be super-annoying, because the studies showing that it decreased IQ were weak (and you would have to rely on a sort of Pascal-type reasoning) and people reading something on the costs/benefits of marijuana definitely don't want to read something mildly politically incorrect trying to convince them that IQ is super important. And if there are twenty things like this, then all the actually interesting stuff people care about is less important than figuring out which of the twenty 5%-chance-it-matters things actually matters, and it's really tempting to just write it off or put it in a "Future Research Needed" section, but that could be the difference between your analysis being right vs. completely wrong and harmful.

The same was true here. How do we quantify the effect of Long COVID? Who knows? Given the giant pile of bodies, maybe we just round COVID off to the number of deaths it causes, and ignore this mysterious syndrome where we've only barely begun the work of proving it exists? But under certain assumptions, the total suffering caused by Long COVID is worse than the suffering caused by the acute disease, including all the deaths!

There is more, but this covers the phenomenon I’m curious about. Let me try to describe the problem in general terms:

Important policies have so many effects that it is near impossible to keep track of them all. In addition, some effects tend to dwarf all others, so it is critical to catch every last one. (Perhaps they follow a Paretian distribution?) It follows that any quantitative analysis of policy effects tends to be seriously flawed.
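The Paretian intuition can be checked with a quick simulation. This is only a sketch with made-up parameters (the tail index `alpha`, the number of effects `n`): draw effect sizes from a Pareto distribution and see how much of the total the single largest effect accounts for.

```python
import random

random.seed(0)

def max_share(alpha: float, n: int = 20, trials: int = 10_000) -> float:
    """Average share of the total effect contributed by the single
    largest of n effect sizes drawn from a Pareto(alpha) distribution."""
    total_share = 0.0
    for _ in range(trials):
        effects = [random.paretovariate(alpha) for _ in range(n)]
        total_share += max(effects) / sum(effects)
    return total_share / trials

# Heavy tail (alpha near 1): one effect tends to dominate the total.
# Lighter tail (larger alpha): effects are spread more evenly.
heavy = max_share(alpha=1.1)
light = max_share(alpha=3.0)
print(f"alpha=1.1: largest effect accounts for ~{heavy:.0%} of the total")
print(f"alpha=3.0: largest effect accounts for ~{light:.0%} of the total")
```

If effect sizes really are heavy-tailed like this, then missing one effect out of twenty is not a small error: it can be most of the answer.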

Do we already have a term for this problem? It reminds me of moral cluelessness as well as known and unknown unknowns, but none of those seems to fit the bill exactly.


answer by johnswentworth · 2021-07-29T12:10:31.068Z · LW(p) · GW(p)

Important policies have so many effects that it is near impossible to keep track of them all. In addition, some effects tend to dwarf all others, so it is critical to catch every last one. (Perhaps they follow a Paretian distribution?) It follows that any quantitative analysis of policy effects tends to be seriously flawed.

I don't think this is the right way to frame the problem.

It is true that even unimportant policies have so many effects that it is de-facto impossible to calculate them all. And it is true that one or a few effects tend to dwarf all others. But that does not mean that it's critical to catch every last one. The effects which dwarf all others will typically be easier to notice, in some sense, precisely because they are big, dramatic, important effects. But "big/important effect" is not necessarily the same as "salient effect", so in order for this to work in practice, we have to go in looking for the big/important effects with open eyes rather than just asking the already-salient questions.

For instance, in the pot/IQ example, we can come at the problem from either "end":

  • What things tend to be really important to humans, in the aggregate, and how does pot potentially impact those? Things like IQ, long-term health, monetary policy, technological development, countries coming out of poverty, etc, are "big things" in terms of what humans care about, so we should ask if pot potentially has predictable nontrivial effects on any of them.
  • On what things does pot have very large impact, and how much do we care? Pot probably has a big impact on things like recreational activity or how often people are sober. So, how do those things impact the things we care about most?

If people think about the problem in a principled way like this, then I expect they'll come up with hypotheses like the pot-IQ thing. There just aren't that many things which are highly important to humans in the aggregate, or that many things on which any given variable has a large expected effect. (Note the use of "expected effect" - variables may have lots of large effects via the butterfly effect, but that's relevant to decision-making only insofar as we can predict the effects.)

The trick is that we have to think about the problem in a principled way from the start, not just get caught up in whatever questions other people have already brought to our attention.

comment by Daniel V · 2021-07-29T15:40:24.496Z · LW(p) · GW(p)

johnswentworth makes the great point that "some effects tend to dwarf all others, so it is critical to catch every last one" assumes that we can't identify the big effects early. If people are looking around with open eyes, they're not so unable to pick up the relevant stuff first. 

What yhoiseth's framing gets right is that big effects are sometimes not salient, even for people with open eyes. This is especially true when effects are indirect in nature (like substitution effects) and therefore hard to observe or estimate with certainty: not only are they low on salience for affective reasons, they're low on salience because they can't benefit from a "big is relevant" heuristic (or "open eyes") when the effect size is unknown. That is, because effects are often not known with certainty, effect size and salience can be negatively rather than positively correlated, even among people with open eyes, necessitating getting to the bottom of the salience barrel to identify the big ones. I am unsure, though, how the relative frequencies of "open eyes will see big effects" vs. "open eyes can still struggle to see big effects" compare.

For example, Scott mentions side effects, addictiveness, pain relief, the War on Drugs, high driving, and less drunk driving as effects. The idea is that the first four are rather small effects and the last two are rather large effects, but the error bars are small on the first four and huge on the last two. The first four are also highly salient (obvious, personalized, affectively charged) despite being small, and the last two are not as salient despite being large (they would be salient if we knew their effect sizes; effect size matters for salience, but because we lack precision, we're left with how salient they are on the basis of being non-obvious, de-personalized, and coldly econometric). If you were to plot these with effect size on X and salience on Y, you'd get a negative correlation if you were omniscient and able to include the last two effects in your dataset (per yhoiseth). But for Scott and typical discussants, the last two effects are missing data, so you're left with the weak positive correlation among the first four effects. At least until someone annoyingly but helpfully tells you it is time to go looking. But again, johnswentworth is also right that the actual correlation is frequently positive, so this isn't always a problem.

Aside: it also, technically, depends on what counts as "open eyes." I figure Scott and friends are pretty dialed in, so I take their "missing" the big effects as evidence for "open eyes can still struggle to see big effects." But I suppose an economist who's spent their career studying substitution effects might think Scott et al. were approaching the problem with blindfolds on: duh, substitution is definitely the biggie here.

answer by Richard_Kennaway · 2021-07-29T17:41:13.339Z · LW(p) · GW(p)

It seems similar to what Andrew Gelman has called the piranha problem. Also related is Gelman's kangaroo.

answer by Kenny · 2021-08-12T23:03:07.003Z · LW(p) · GW(p)

Arnold Kling has written a bit about 'causal density', which seems pretty relevant.

answer by Tao Lin · 2021-07-29T22:44:18.388Z · LW(p) · GW(p)

We could dub this "Long Tail Externalities": the idea that most of the impact comes from a few indirect effects, and that sometimes the more indirect the effect, the bigger it is. For instance, most policies might impact the future mainly through AI safety.


Comments sorted by top scores.

comment by romeostevensit · 2021-07-29T22:46:27.321Z · LW(p) · GW(p)

Related to the curse of dimensionality.