Causality and its harms

post by George3d6 · 2020-07-04T14:42:56.418Z · LW · GW · 19 comments

This is a link post for https://blog.cerebralab.com/Causality_and_its_harms

Contents

  1. Hypothetical number one
  2. Causality and effect size
  3. Causality order, time and replication
  4. (Partially real) Hypothetical number two
  5. Why I find causality harmful
19 comments

I'll assume that you do not hold fast to a rigorous system of metaphysics, in which case I think you can humor me and accept that, if I cared to make the effort, I could reduce the concept of causality to one (or a chain of) probabilistic relationships between events.

Here's a naive definition of causality: P(E) ~= 1 | C (where "|" stands for "given"). If I can say this, I can most certainly say that C causes E, at least in a system where a thing such as "time" exists and where C happens before E.

It should be noted this doesn't imply P(C) ~= 1 | E.
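To make that asymmetry concrete, here is a minimal sketch using made-up counts. The numbers and the names `counts` and `conditional` are illustrative assumptions, not data from anywhere. It estimates both conditionals from a joint tally of C and E.

```python
# Hypothetical joint counts of (C happened, E happened); illustrative only.
counts = {
    (True, True): 99,      # C and E
    (True, False): 1,      # C without E
    (False, True): 900,    # E without C
    (False, False): 9000,  # neither
}

def conditional(counts, target, given):
    """Estimate P(target | given), where target and given are 'C' or 'E'."""
    idx = {"C": 0, "E": 1}
    t, g = idx[target], idx[given]
    given_total = sum(n for key, n in counts.items() if key[g])
    both = sum(n for key, n in counts.items() if key[g] and key[t])
    return both / given_total

p_e_given_c = conditional(counts, "E", "C")  # 99 / 100
p_c_given_e = conditional(counts, "C", "E")  # 99 / 999
```

With these invented counts E almost always follows C, yet most occurrences of E happen without C, so the second conditional is small.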

Obviously, this definition doesn't cover all or even most of the things we call causal and can be taken into an extreme context where it holds in spite of no causal relationship. I'm just starting with it because it's a point from which we can build up to a better one. Let's roll with it and look at an example.

1. Hypothetical number one

Human height can be viewed as a function of haplogroup alleles + shoe size; that is to say, we can predict height fairly well given those 2 parameters.

Height can also be viewed as a function of mTORC1 expression + HGH blood levels. Let's say some assay that tells us the levels of mTORC1 and HGH predicts height equally well to haplogroup alleles + shoe size.

But I think most scientists would agree the latter are to some extent "causal" for height while the former aren't. Why?

Well, consider 2 hypothetical experiments:

  1. Take a 30yo human of average height and use a saline solution pump to inflate their feet to 1.5x their size. Then use {magical precision nucleotide addition and deletion vector} to remove all their haplogroup alleles and insert those most associated with increased height.
  2. Take a 30yo human of average height and then use {magical precision nucleotide addition and deletion vector} to overexpress the heck out of mTORC1 and HGH.

In which case do we expect to see an increase in height? That might indicate causality.

Trick question: in neither, of course.

We'd expect to see increased height in the second case if the human was, say, 13 instead of 30.

The causality here is P(E) given C under some specific conditions, where C happened before E, even if C happened 10 years ago and C happening again now would not affect E.

Also, see physics for situations where that doesn't quite cut it either because the temporal relationship "seems" inverted [citation needed].

Causality is what we call P(E) ~= 1 | C happened in the past in a world where we have "intuitive" temporal relationships AND we can be pretty certain about the environment.

Hit glass with a hammer and it breaks, except for the fact that this only happens in a fairly specific environment with some types of glass and some types of hammer. Move the environment 300 meters underwater, make the glass bulletproof, make the hammer out of aluminum or let an infant wield it and we lose "causality", even though the event described is still the same.

But even that doesn't cut it in terms of how weird causality is.

2. Causality and effect size

Now let's move onto the idea of "cause" that doesn't fall into the whole P(E) ~= 1 | C. This is not so easy because there are two different ways by which this could happen.

The easy one involves E as a continuous value rather than a binary event. In the previous height example, E could have been an increase as counted by a % of the subject's initial height or in centimeters.

But this is easy because we can just abstract away E as something like "an increase in height by between 10 and 25%": basically, strap some confidence ranges on a numerical increase and it's back to being a binary cause.
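As a rough sketch of that abstraction, the snippet below turns a continuous height change into the binary event "increased by between 10 and 25%" and then estimates P(E) given C the same way as before. The measurements and the names `increases_given_c` and `to_event` are hypothetical, chosen only for illustration.

```python
# Hypothetical relative height changes (fraction of initial height) for
# subjects who received the intervention C; the values are made up.
increases_given_c = [0.12, 0.18, 0.22, 0.11, 0.15, 0.30, 0.14, 0.19, 0.13, 0.16]

def to_event(increase, low=0.10, high=0.25):
    """E is the binary event 'height increased by between 10% and 25%'."""
    return low <= increase <= high

# Each continuous measurement becomes a yes/no occurrence of E.
events = [to_event(x) for x in increases_given_c]
p_e_given_c = sum(events) / len(events)
```

The choice of range does all the work here: widen or narrow it and the same data gives a different P(E) given C.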

The hard one involves events that happen due to a given cause only very seldom.

For example, if we are traveling through a mountain pass after a heavy snowstorm and our friends start yodeling very loudly we might say something like:

Shut up, you might cause an avalanche

But this seems like the exact kind of situation where we'd be better suited saying:

Shut up, there's a very spurious correlation between loud singing and avalanches and I'd rather not take my chances.

Well, maybe we say "cause" only for the sake of brevity? Doubtful.

I think the reason we say "cause" here rather than "correlate with" is that we have some underlying intuition about the laws of the physical world. That intuition lets us see a mechanism by which yodeling might set in motion a series of events (each causal in the "naive" strong sense) that ends in an avalanche (e.g. something to do with very strong echoes + a very unstable snow cover on a steep slope).

Conversely, if we saw a black cat jump out of the snow and just realized today is Friday 13th we might start being a bit afraid of an avalanche happening, maybe even more so than if our friends start yodeling. But I think even the most superstitious person would shy away from calling black cats in random places and certain dates "causal" of avalanches.

But then again, if yodeling can cause an avalanche by this definition, so can the butterfly flapping its wings in China the action of which snowballed into a slight direction of current in your mountain pass which (coupled with the yodeling) could cause the avalanche.

Heck, maybe the slopes were avalanche-secure for some obscure reason, but then someone moved some medium-sized rocks a few weeks ago and accidentally harmed the avalanche-related structural stability.

3. Causality order, time and replication

Ok, this section could be expanded in an article on its own, but I don't find it as interesting as the last, so I will try to keep it brief. To patch up our previous model we need to introduce the idea of order, time, and replication.

Why is the wind causal of an avalanche but not the butterfly flapping its wings?

Well, because of the order those events happened in. Given P(E) = x | C1 and P(E) = x | C2 but P(E) = x | C1 & C2 the "cause" of E will be whichever of the two causes happened "first".

Sometimes we might change this definition in order to better direct our actions. If you push someone onto a subway track and he is subsequently unable to climb back up in time and gets hit by a subway, you could hardly say to a judge:

Well, your honor, based on precedence, I think it's fair to say that it was his failure to get off the tracks that caused him to be hit. Yes, my pushing might have caused him to fail at said task... But if we go down that slippery slope you could also place blame on my boss, for making me angry this morning and thus causing the pushy behavior.

Similarly with the yodeling causing the avalanche, rather than the yodeling causing some intermediary phenomenon chain which ends up with one of them causing the avalanche.

We say yodeling causes the avalanche because "yodeling" is an actionable thing, the reverberation of sound through a valley once it leaves the lips, not so much.

A cause is defined based on how easy it is to replicate (or, in the case of the track-pushing, how easy it is to prevent it from ever being replicated again).

Barring ease of replication, some spatiotemporal ordering of the events seems to be preferred.

We usually want the cause to be the "simplest" (easy to state and replicate) necessary and sufficient condition to get the effect (Occam's Razor + Falsifiability).

That is to say, crossing the US border is what "causes" one to be in the USA.

Taking a plane to NYC also causes one to be in the USA, but it explains fewer examples and is a much more complex action. So I think we prefer to say that the border crossing is the "cause" here.

Introducing spatial-temporal order via appealing to the scientific method doesn't make a lot of sense, but it's quite an amazing heuristic once you think about it.

If two causes seem linked and equally easy to replicate, what is a good heuristic by which we can get the least amount of experimental error if our replication assumption is wrong?

Well, replicating the one that's closest in space and time to the event observed (harder if one is close in space and the other in time, but this is so seldom the case I'd call it an edge case, and heuristics aren't made for that).

Or, what if we can't decide on how easy they are to replicate? Or think they are linked but can't be sure?

Well, again, the spatial-temporal heuristic cleaves through the uncertainty and tells us we are most likely to observe the desired effect again (or stop it) by acting upon the cause closest to it in space and time.
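The selection rule described in this section can be sketched roughly as follows. Everything here is a hypothetical illustration: the `Candidate` fields, the cost scale, the tolerance, and the choice to compare time before space are assumptions made for the sketch, not something the argument fixes.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    replication_cost: float  # lower = easier to replicate (assumed scale)
    distance_m: float        # spatial distance from the observed effect
    seconds_before: float    # how long before the effect it happened

def pick_cause(candidates, cost_tolerance=0.1):
    """Prefer the easiest-to-replicate candidate; break near-ties by proximity."""
    cheapest = min(c.replication_cost for c in candidates)
    tied = [c for c in candidates
            if c.replication_cost - cheapest <= cost_tolerance]
    # Among roughly equal candidates, take the closest in time, then in space.
    return min(tied, key=lambda c: (c.seconds_before, c.distance_m))

# Made-up numbers for the avalanche example.
candidates = [
    Candidate("butterfly flapping its wings", 1.0, 8_000_000.0, 1_000_000.0),
    Candidate("gust of wind", 1.0, 50.0, 5.0),
    Candidate("rocks moved weeks ago", 5.0, 100.0, 2_000_000.0),
]
chosen = pick_cause(candidates)
```

With these invented values the wind and the butterfly tie on ease of replication, so proximity decides and the wind is picked. The edge case where one candidate is closer in space and the other closer in time is resolved here arbitrarily in favor of time.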

Interesting... and getting more complex.

But at this point causality still sounds kinda reasonable.

Granted, we haven't gotten into the idea of ongoing cause-effect relationships. I've kind of assumed very complex cause-effect relationships can be split into hundreds of little "naive" causations and that somehow hundreds of naive causations can add up to a single bigger cause.

But those things aside, I think there's one final (potentially the most important) point I want to consider:

4. (Partially real) Hypothetical number two

Assume we have 5 camps that argue about what the causes of human violence are, from people attacking their spouses to sadomasochism, to mass shootings, to drunken fistfights, to gang wars.

In the first 4 cases, we see an example of what we recognize as causality.

The blankslateist seems to correctly figure out some strong causes, but he's much too idealist in hoping one can design the cultural context and education systems that would rid us of violence. After all, we've been at it for a long while and no matter how much money one throws at education it doesn't seem to stick.

The economist has found some causes, but they are high-level causes he uses for everything and his solution is too vague to be applicable.

The genetic determinist seems to have cause and effect backward. He doesn't understand the fact that humans self-segregate into communities/tribes based on phenotype, and some communities are forced into situations that promote violence. His solution seems to us morally abhorrent and likely not to work unless you literally engineer a population of identical humans. Even then, they'd likely find ways to make tribes and some of those tribes would be forced or randomly stumble into a corrupt equilibrium that promotes violence.

The Freudian's explanation is outright silly to modern ears, but again, he seems to be getting at something like a cause, even though it's so abstract he might as well have pointed to "God" as the cause. And since his cause is so vague, so is his solution.

But the statistician seems to not even understand causality. He's confusing a correlation for causation.

Lead is not a cause of violence, maybe it's a proxy at best, an environmental hazard that encourages certain behavior patterns, but a cause, nah, it's...

  1. Lead level (even if we only track measures in the air) is correlated with aggravated assault more so than antibiotics are with bacterial infection survival. [link]
  2. Strongly correlated (both low p-value and large effect size) with violent crime as far back as the 20th century, and the lowering of crime rates as the centuries progress matches its decrease. [link]
  3. Lifetime exposure is strongly correlated (both low p-value and large effect size) with violent criminal behavior. [link]
  4. Strongly correlated in a fairly homogeneous population with small variations in lead exposure (same city) with gun violence, homicide, and rape. [link]

Huh, I wonder if the other 4 can claim anything similar. And this is just me searching arbitrary primary sources on Google Scholar.

You can find hundreds of studies looking at lead levels in the environment and body and their correlation with crime, including the fact that decreasing lead levels seem to decrease violence in the same demographic and area where violence proliferated when lead levels were high.

The blood lead level in toddlers tracks violent crime so well it's almost unbelievable. Most drug companies or experimental psychologists can't hack their way into something that looks 1/3rd as convincing as this graph.



Did I mention the interventions that remove lead by replacing pipes or banning leaded gasoline and see a sharp drop in crime rate only a few years afterward?

To my knowledge, what little correlation education has with violence vanishes when controlling for socioeconomic status.

Poverty is surprisingly uncoupled from violence when looked at in the abstract (e.g. see rates of violence in poor Asian countries vs poor European countries and poor vs rich cities). Where it can be considered a proxy for violence, the lead-violence correlation would eat it up as just a confounder.

Psychoanalytic therapy doesn't seem related to violence whatsoever, though due to the kind of people that usually get it, it's hard to deconfound past a point.

One could argue genes are related to violence from a snapshot at a single point in time, but looking at violence dropping in the same population over just a single generation this doesn't seem so good.
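For readers unfamiliar with what it means for a correlation to "vanish when controlling for" something, here is a small sketch on purely synthetic data. The variables `x`, `y`, and `z` are stand-ins (think education, violence, and socioeconomic status), generated under the assumption that `z` drives both; nothing here comes from a real study. It uses the standard first-order partial correlation formula.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def partial_corr(xs, ys, zs):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)                      # fixed seed for repeatability
z = [rng.gauss(0, 1) for _ in range(5000)]  # synthetic confounder
x = [zi + rng.gauss(0, 1) for zi in z]      # driven by z plus independent noise
y = [zi + rng.gauss(0, 1) for zi in z]      # driven by z plus independent noise

raw = pearson(x, y)                # clearly positive (0.5 in expectation)
controlled = partial_corr(x, y, z) # should sit near zero
```

Because `x` and `y` are related only through `z` in this construction, the raw correlation is sizeable while the partial correlation is close to zero. Real data would of course need a real model of the confounders.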

So, if we could cut violent crime by 50% in a population by reducing serum lead levels to ~0 (a reasonable hypothesis, potentially even understated)... then why can't most people declare, with a straight face and proud voice, that lead is the single most important cause of violence? Why would anyone disagree with such a blatantly obvious C => P(E) ~= 1 statement? (Where E is something like "reduction in violent crime by between 30 and 80%".)

What if I make my hypothesis stronger by adding some nutritional advice to the mix? Something like: reduce lead blood level to ~0, reduce boron blood level to as little as possible, increase iodine and omega-3 intake to 2x RDA in every single member of a population.

If this intervention reduced violence in all populations by ~90%, would I be able to claim:

Hey guys, I figured out the cause of human violence, apparently, it has to do with too much residual lead and boron in the body coupled with lack of iodine and omega-3. Good news, with a 5-year intervention that costs less than 1% of the yearly US budget we can likely end almost all crime and war.

I'd wager the answer is no, and I think it's no mainly for misguided reasons. It has to do with the aesthetics we associate with a cause. It's the same reason why the butterfly effect sounds silly.

Violence seems like such a fundamental human problem to us that it seems silly beyond belief that the cause was just some residual heavy metal all along, or at least for the last 200 years or so.

Yet... I see no reason not to back up this claim. It seems a much stronger cause than anything else people have come up with based on the scientific evidence. It respects our previous definition of causality, it gets everything right. Or, at least, much more so than any other hypothesis one can test.

So really, P(E) ~= 1 | C is not enough even if we use the scientific method to find the simplest C possible. Instead, it has to be something like P(E) ~= 1 | C where C respects {specific human intuition that reasons about the kind of things that are metaphysically valid to be causes for other things}.

This is where we get into issues because "{specific human intuition that reasons about the kind of things that are metaphysically valid to be causes for other things}" varies a lot between people for basically no reason.

It varies in that a physicist, chemist and biologist might think differently about what a valid cause is. It also varies in that a person that grew up disadvantaged their whole life might have a fundamentally different understanding of "what a human can cause" than someone that grew up as the son of a powerful politician.

It varies based on complex taxonomies of the world, the kind that classify things into levels of "importance" and tell us that a cause which is too many levels of importance below an effect cannot be a "real cause".

If e.g. love, violence, and death are "intuitive importance level 100", then education, economics, and social status might be "intuitive importance level 98". On the other hand, lead blood levels, what we eat for breakfast, or our labrador's ownership status are closer to "intuitive importance level 10".

To say that something that's "intuitive importance level 98" can cause something that's "intuitive importance level 100" sounds plausible to us. To say that something that's "intuitive importance level 10" can cause something that's "intuitive importance level 100" is blasphemy.

5. Why I find causality harmful

I admit that I can't quite paint a complete picture of causality in ~3000 words, but the more edge cases I'd cover, the leakier a concept causality would seem to become.

Causality seems like a sprawling mess that can only be defined using very broad statistical concepts, together with a specific person's or group's intuition about how to investigate the world. And all of that is protected by a vague pseudo-religious veil that dictates taboos about what kind of things are "pure enough" or "important enough" to serve as causes to other things on the same spectrum of "importance" or "purity".

I certainly think that causality is a good layman's term that we should keep using in day-to-day interactions. If my friend wants to ingest a large quantity of cyanide I want to be able to tell them "Hey, you shouldn't do that, cyanide causes decoupling of mitochondrial electron transport chains, which in turn causes you to die in horrible agony".

But if a scientist is studying "cyanide's effects upon certain mitochondrial respiratory complexes" I feel like this kind of research is rigorous enough to do away with the concept of causality.

On the other hand, replacing causality with very strict mathematical formulas that are tightly linked to the type of data we are looking at doesn't seem like a solution either. It might be a solution in certain cases, but it would make a lot of literature pointlessly difficult to read.

However, there might be some middle ground where we replace the ideas of "cause" and "causality" with a few subspecies of such. Subspecies that could also stretch the definition to include things like lead causing violence or butterflies flapping their wings causing thunderstorms.

Maybe I am wrong here, I certainly know it would be hard for me to stop using causal language. But I will at least attempt to reduce my usage of such and/or be more rigorous when I do end up using it.

19 comments

Comments sorted by top scores.

comment by shminux · 2020-07-05T02:16:45.673Z · LW(p) · GW(p)

TL;DR: Causality is an abstraction, a feature of our models of the world, not of the world itself, and sometimes it is useful, but other times not so much. Notice when it's not useful and use other models.

Replies from: gworley
comment by Gordon Seidoh Worley (gworley) · 2020-07-05T09:36:28.489Z · LW(p) · GW(p)

Agreed. Assigning causality requires having made a choice about how to carve up the world into categories so one part of the world can affect another. Without having made this choice we lose our normal notion of causality because there are no things to cause other things, hence causality as normally formulated only makes sense within an ontology.

And yet, there is some underlying physical process which drives our ability to model the world with the idea that things cause other things and we might reasonably point to it and say it is the real causality, i.e. the aspect of existence that we perceive as change.

Replies from: shminux, TAG
comment by shminux · 2020-07-06T01:58:00.328Z · LW(p) · GW(p)
And yet, there is some underlying physical process which drives our ability to model the world with the idea that things cause other things and we might reasonably point to it and say it is the real causality, i.e. the aspect of existence that we perceive as change.

Hmm. Imagine the world as fully deterministic. Then there is no "real causality" to speak of, everything is set in stone, and there is no difference between cause and effect. The "underlying physical process which drives our ability to model the world with the idea that things cause other things" is essential to being an embedded agent, since agency equals a perceived world optimization, which requires, in turn, predictability (from inside the world), but I don't think anyone has a good handle on what "predictability from inside the world" may look like. Offhand, it means that there is a subset of the world that runs a coarse-grained simulation of the world, but how do you recognize such a simulation without already knowing what you are looking for? Anyway, this is a bit of a tangent.

Replies from: Richard_Kennaway, TAG
comment by Richard_Kennaway · 2020-07-06T22:45:18.884Z · LW(p) · GW(p)
Imagine the world as fully deterministic. Then there is no "real causality" to speak of, everything is set in stone, and there is no difference between cause and effect.

If causation is understood in terms of counterfactuals — X would have happened if Y had happened — then there is still a difference between cause and effect. A model of a world implies models of hypothetical, counterfactual worlds.

Replies from: shminux
comment by shminux · 2020-07-07T02:33:43.166Z · LW(p) · GW(p)
If causation is understood in terms of counterfactuals — X would have happened if Y had happened — then there is still a difference between cause and effect. A model of a world implies models of hypothetical, counterfactual worlds.

Yes, indeed, in terms of counterfactuals there is. But counterfactuals are in the map (well, to be fair a map is a tiny part of the territory in the agent's brain). Which was my original point: causality is in the map.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2020-07-07T10:32:23.301Z · LW(p) · GW(p)

The map and the territory are not separate magisteria. A good map, or model, fits the territory: it allows one to make accurate and reliable predictions. That is what it is, for a map to be a good one. The things in the map have their counterparts in the world. The goodness of fit of a map to the world is a fact about the world. Causation is there also, just as much as pianos, and gravitation, and quarks.

Replies from: TAG
comment by TAG · 2020-07-21T15:10:18.400Z · LW(p) · GW(p)

This claim..

A good map, or model, fits the territory: it allows one to make accurate and reliable predictions. That is what it is, for a map to be a good one.

is not obviously equivalent to this claim:-

The things in the map have their counterparts in the world.

Causation is there also, just as much as pianos, and gravitation, and quarks.

If you accept usefulness in the map as the sole criterion for existence in the territory, then causation is there, along with much else, including much that you do not believe in, and much that is mutually contradictory.

comment by TAG · 2020-07-07T12:57:11.103Z · LW(p) · GW(p)

Hmm. Imagine the world as fully deterministic. Then there is no “real causality” to speak of, everything is set in stone, and there is no difference between cause and effect

There's a difference between strict causal determinism and block universe theory. Under causal determinism, future events have not happened yet, and need to be caused, even though there is only one way they can turn out. Whereas under the block universe theory, the future is already "there" -- ontologically fixed as well as epistemologically fixed.

comment by TAG · 2020-07-07T16:21:32.757Z · LW(p) · GW(p)

Which is the correct theory -- the first paragraph or the second?

There is plenty of evidence that human notions of causality are influenced by human concerns, but it doesn't add up to the conclusion that there is no causality in the territory. The comparison with ontology is apt: just because tables and chairs are human level ontology, doesn't mean that there's no quark level ontology to the universe.

Replies from: gworley
comment by Gordon Seidoh Worley (gworley) · 2020-07-08T16:56:21.736Z · LW(p) · GW(p)

What would it even mean to say a theory of causality is "correct" here? We're talking about what makes sense to apply the term causality to, and there's no matter of correctness at that level, only of usefulness to some purpose. It's only after we have some systematized way of framing a question that we can ask if something is correct within that system.

Replies from: TAG
comment by TAG · 2020-07-21T10:42:55.836Z · LW(p) · GW(p)

Correctness as opposed to usefulness would be correspondence to reality.

There's a general problem of how to establish correspondence, a problem which applies to many things other than causality. You can't infer that something corresponds just because it is useful, but you also can't infer that something does not correspond just because it is useful -- "in the map" does not imply "not in the territory".

comment by romeostevensit · 2020-07-04T16:31:33.979Z · LW(p) · GW(p)

+1 on the sprawling mess. What I have personally found useful is figuring out what is going on in terms of mental heuristics when some causal explanations seem 'better' than others. Which involves type errors and degrees of freedom.

Replies from: rsphinx8
comment by rsphinx8 · 2020-07-04T22:50:17.879Z · LW(p) · GW(p)

Can you please elaborate on type errors and degrees of freedom in terms of mental heuristics? I am not sure I followed.

Replies from: romeostevensit
comment by romeostevensit · 2020-07-06T17:04:56.309Z · LW(p) · GW(p)

Type error: consider Aristotle's 4 causes. If I ask you a why question about one kind of cause and you give me an explanation about another kind of cause there has been a type error.

Degrees of freedom: if there are more degrees of freedom in your explanation than in the thing you are attempting to explain then you can always get the answer you want. Consider astrology. A good explanation has fewer degrees of freedom than the thing it is explaining and thus creates compression and prediction power, i.e. it eliminates more possible worlds whereas bad explanations leave you with the same number of possible worlds as you started with.

Replies from: TAG
comment by TAG · 2020-07-06T19:02:43.467Z · LW(p) · GW(p)

If I ask you a why question about one kind of cause and you give me an explanation about another kind of cause there has been a type error

"At Milliways, you can go as many times as you like without meeting yourself, because of the embarrassment that would cause".

comment by siclabomines · 2020-07-06T23:50:18.535Z · LW(p) · GW(p)

I have the opposite impression. Science should embrace causality more and do it better. And as a layman's term it should be refined so that we stop talking about the causes of any event as a cake where each slice has a name and only one name.


I find it hard to summarize why, at least right now, but my view is sorta similar to Pearl's (though I don't totally like how he puts it). Hopefully later I'll re-read this more attentively and comment something more productive (if no one has done a strictly better job already).

Replies from: George3d6
comment by George3d6 · 2020-07-09T22:50:15.184Z · LW(p) · GW(p)

I believe the thing we differ on might just be semantic, at least as far as redefinition goes. My final conclusion is that the term is bad because it's ill-defined, but with a stronger definition (or ideally multiple definitions for different cases) it would be useful. It would also, however, be very foreign to a lot of people.

comment by noggin-scratcher · 2020-07-05T00:50:49.389Z · LW(p) · GW(p)

P(E) ~= 1 | C (where "|" stands for "given"). If I can say this, I can most certainly say that C causes E

Well... unless P(E) also ~= 1 | !C because P(E) ~= 1 and C is irrelevant

Replies from: George3d6
comment by George3d6 · 2020-07-06T18:38:15.256Z · LW(p) · GW(p)

Corrected the wording to be a bit "weaker" on that claim, but also, it's just a starting point and the final definition I dispute against doesn't rest on it.