Comments

Comment by grasshopper100 (frank-underwood) on Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development · 2025-02-04T12:49:40.834Z

In the strategy-stealing assumption, Paul makes an argument about people with short-term preferences that imo could also be applied to people who are unwilling to listen to AI advice:

People care about lots of stuff other than their influence over the long-term future. If 1% of the world is unaligned AI and 99% of the world is humans, but the AI spends all of its resources on influencing the future while the humans only spend one tenth, it wouldn’t be too surprising if the AI ended up with 10% of the influence rather than 1%. This can matter in lots of ways other than literal spending and saving: someone who only cared about the future might make different tradeoffs, might be willing to defend themselves at the cost of short-term value (see sections 4 and 5 above), might pursue more ruthless strategies for expansion, and so on.

I think the simplest approximation is to restrict attention to the part of our preferences that is about the long-term (I discussed this a bit in [Why might the future be good?]). To the extent that someone cares about the long-term less than the average actor, they will represent a smaller fraction of this long-term preference mixture. This may give unaligned AI systems a one-time advantage for influencing the long-term future (if they care more about it) but doesn’t change the basic dynamics of strategy-stealing. Even this advantage might be clawed back by a majority (e.g. by taxing savers).
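To make the arithmetic in the quoted example concrete, here is a minimal sketch (my own toy model, not Paul's) that treats each actor's long-term influence as proportional to the resources it actually devotes to the long-term future:

```python
# Toy model: long-term influence is proportional to
# (share of total resources) x (fraction of those resources spent on the long-term future).

def influence_shares(actors):
    """actors: dict mapping name -> (resource_share, fraction_spent_on_long_term)."""
    effort = {name: share * spend for name, (share, spend) in actors.items()}
    total = sum(effort.values())
    return {name: e / total for name, e in effort.items()}

# Paul's example: 1% unaligned AI spending everything on the long-term future,
# 99% humans spending only a tenth of their resources on it.
shares = influence_shares({
    "unaligned AI": (0.01, 1.0),
    "humans": (0.99, 0.1),
})
print(shares)  # AI gets ~9% of long-term influence (roughly the 10% in the quote), not 1%
```

Under these assumptions the AI's edge is a fixed multiplier on its starting share, which is why it reads as a one-time advantage rather than a runaway loss of control.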

Maybe we can apply this same argument to people who don't want to listen to AI advice: yes, this will lead those people to have less control over the future, but some people will be willing to listen to AI advice, and their preferences will retain influence over the future. This reduces human control over the future, but it's a one-time loss that isn't catastrophic (that is, it doesn't cause total loss of control). Paul calls this a one-time disadvantage rather than total disempowerment because the rest of humankind can still replicate whatever strategy the unaligned AI might have exploited.

Possible counter: "The group of people who properly listen to AI advice will be too small to matter." Yeah, I think this could lead to e.g. a 100x reduction in control over the future (if only 1% of humans properly listen); different people will be more or less upset about that. One glimmer of hope is that the humans who do listen to their AI advisors can cooperate with people who don't, and help them get better at listening, thereby further empowering humanity.
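A crude sketch of where the 100x figure comes from, under the (pessimistic, assumed) model that humans who don't properly listen lose their long-term influence entirely to AI systems rather than to the listening humans:

```python
# Assumption: humanity's control over the long-term future scales with the fraction
# of humans who properly listen to AI advice; non-listeners' influence is captured
# by AI systems, not redistributed to other humans.
listening_fraction = 0.01              # only 1% of humans properly listen
baseline_human_control = 1.0           # idealised world where everyone listens
human_control = baseline_human_control * listening_fraction
print(baseline_human_control / human_control)  # -> 100.0, i.e. a 100x reduction
```

The cooperation point above is one way the real number could end up better than this: listeners helping non-listeners would raise the effective listening fraction over time.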