Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.
post by Andrew_Critch · 2024-09-11T04:41:24.872Z · LW · GW · 7 comments
People often attack frontier AI labs for "hypocrisy" when the labs admit publicly that AI is an extinction threat to humanity. Often these attacks ignore the difference between various kinds of hypocrisy, some of which are good, including what I'll call "reformative hypocrisy". Attacking good kinds of hypocrisy can be actively harmful for humanity's ability to survive, and as far as I can tell we (humans) usually shouldn't do that when our survival is on the line. Arguably, reformative hypocrisy shouldn't even be called hypocrisy, due to the negative connotations of "hypocrisy". That said, bad forms of hypocrisy can be disguised as the reformative kind for long periods, so it's important to pay enough attention to hypocrisy to actually figure out what kind it is.
Here's what I mean, by way of examples:
***
0. No Hypocrisy:
Lab: "Building AGI without regulation shouldn't be allowed. Since there's no AGI regulation, I'm not going to build AGI."
Meanwhile, the lab doesn't build AGI. This is a case of honest behavior, and what many would consider very high integrity. However, it's not obviously better, and arguably sometimes worse, than...
1. Reformative Hypocrisy:
Lab: "Absent adequate regulation for it, building AGI shouldn't be allowed at all, and right now there is no adequate regulation for it. Anyway, I'm building AGI, and calling for regulation, and making lots of money as I go, which helps me prove the point that AGI is powerful and needs to be regulated."
Meanwhile, the lab builds AGI and calls for regulation. So, this is a case of honest hypocrisy. I think this is straightforwardly better than...
2. Erosive Hypocrisy:
Lab: "Building AGI without regulation shouldn't be allowed, but it is, so I'm going to build it anyway and see how that goes; the regulatory approach to safety is hopeless."
Meanwhile, the lab builds AGI and doesn't otherwise put efforts into supporting regulation. This could also be a case of honest hypocrisy, but it erodes the norm that AGI should be regulated rather than supporting it.
Some even worse forms of hypocrisy include...
3. Dishonest Hypocrisy, which comes in at least two importantly distinct flavors:
a) feigning abstinence:
Lab: "AGI shouldn't be allowed."
Meanwhile, the lab secretly builds AGI, contrary to what one might otherwise guess according to their stance that building AGI is maybe a bad thing, from a should-it-be-allowed perspective.
b) feigning opposition:
Lab: "AGI should be regulated."
Meanwhile, the lab overtly builds AGI, while covertly trying to confuse and subvert regulatory efforts wherever possible.
***
It's important to remain aware that reformative hypocrisy can be on net a better thing to do for the world than avoiding hypocrisy completely. It allows you to divert resources from the thing you think should be stopped, and to use those resources to help stop the thing. For mathy people, I'd say this is a way of diagonalizing against a potentially harmful thing, by turning the thing against itself, or against the harmful aspects of itself. For life sciencey people, I'd say this is how homeostasis is preserved, through negative feedback loops whereby bad stuff feeds mechanisms that reduce the bad stuff.
Of course, a strategy of feigning opposition (3b) can disguise itself as reformative hypocrisy, so it can be hard to distinguish the two. For example, if a lab says for a long time that it is going to admit its hypocritical stance, and then never actually does, then it turns out to be dishonest hypocrisy. On the other hand, if the dishonesty ever does finally end in a way that honestly calls for reform, it's good to reward the honest and reformative aspects of their behavior. Note also that, if it's not reformative, even honest hypocrisy can erode positive norms as in (2), by overtly denigrating the idea of even establishing norms. So the key distinction is not just to avoid supporting dishonesty, but to specifically reward honesty that takes action in support of broader reform.
In summary, what I'm suggesting is to pay close attention to the three different kinds of hypocrisy above, and close enough attention to actually distinguish between them and treat them separately, without being fooled as to which one is which. This can be a lot of work, but it's important work that is necessary to create the right incentives when you are in the habit of criticizing people for hypocrisy. The key is to make sure that all hypocrisy is sufficiently actively reformative. Otherwise, it's not part of a homeostatic loop, and hence not a positive contribution to a working survival strategy when the stakes are existential.
That's all for now. Happy Tuesday :)
7 comments
Comments sorted by top scores.
comment by Mateusz Bagiński (mateusz-baginski) · 2024-09-11T06:21:23.700Z · LW(p) · GW(p)
Do you think [playing in a rat race because it's the most locally optimal thing for an individual to do while at the same time advocating for abolishing the rat race] is an example of reformative hypocrisy?
Or even more broadly, defecting in a prisoner's dilemma while exposing an interface that would allow cooperation with other like-minded players?
I've had this concept for many years and it hasn't occurred to me to give it a name (How Stupid Not To Have Thought Of That) but if I tried to give it a name, I definitely wouldn't call it a kind of hypocrisy.
Replies from: DusanDNesic↑ comment by DusanDNesic · 2024-09-11T20:00:39.987Z · LW(p) · GW(p)
A "Short-term Honesty Sacrifice", "Hypocrisy Gambit", something like that?
Replies from: mateusz-baginski↑ comment by Mateusz Bagiński (mateusz-baginski) · 2024-09-20T10:58:07.178Z · LW(p) · GW(p)
It's better but still not quite. When you play on two levels, sometimes the best strategy involves a pair of (level 1 and 2) substrategies that are seemingly opposites of each other. I don't think there's anything hypocritical about that.
Similarly, hedging is not hypocrisy.
comment by Richard_Kennaway · 2024-09-11T06:32:43.267Z · LW(p) · GW(p)
In case 1, if I don't know how to make a safe AGI while preventing an unsafe AGI, and no-one else does (i.e. the current state of the art), what regulations would I be calling for?
comment by gb (ghb) · 2024-09-11T14:29:07.721Z · LW(p) · GW(p)
I agree with the overall message you're trying to convey, but I think you need a new name for the concept. None of the things you're pointing to are hypocrisies at all (and in fact the one thing you call "no hypocrisy" is actually a non sequitur). To give an analogue, the fact that someone advocates for higher taxes and at the same time does not donate money to the government does not make them a hypocrite (much less a "dishonest hypocrite").
comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-12T03:14:33.293Z · LW(p) · GW(p)
I disagree with your taxonomy and ranking (I find holier-than-thou sanctimony more loathsome than straightforward deception), but agree it's a shame that the same word is used both for one trying to better an imperfect world while living in it as well as for one criticizing others for actions that mutatis mutandis one engages in oneself.