LWLW's Shortform
post by LWLW (louis-wenger) · 2025-01-06T21:26:24.375Z · LW · GW · 14 comments
Comments sorted by top scores.
comment by LWLW (louis-wenger) · 2025-01-06T20:34:18.809Z · LW(p) · GW(p)
Making the (tenuous) assumption that humans remain in control of AGI, won't it just be an absolute shitshow of attempted power grabs over who gets to tell the AGI what to do? For example, supposing OpenAI is the first to AGI, is it really plausible that Sam Altman will be the one actually in charge when there will have been multiple researchers interacting with the model much earlier and much more frequently? I have a hard time believing every researcher will sit by and watch Sam Altman become more powerful than anyone ever dreamed of when there's a chance they're a prompt away from having that power for themselves.
↑ comment by Milan W (weibac) · 2025-01-06T21:56:57.681Z · LW(p) · GW(p)
You're assuming that:
- There is a single AGI instance running.
- There will be a single person telling that AGI what to do.
- The AGI's obedience to this person will be total.
I can see these assumptions holding approximately true if we get really really good at corrigibility and if at the same time running inference on some discontinuously-more-capable future model is absurdly expensive. I don't find that scenario very likely, though.
↑ comment by LWLW (louis-wenger) · 2025-01-31T23:42:19.994Z · LW(p) · GW(p)
I see no reason why any of these will be true at first. But the end-goal for many rational agents in this situation would be to make sure the second and third are true.
↑ comment by Milan W (weibac) · 2025-02-01T03:06:29.710Z · LW(p) · GW(p)
Correct, those goals are instrumentally convergent.
comment by LWLW (louis-wenger) · 2025-02-10T06:51:58.007Z · LW(p) · GW(p)
Everything feels so low-stakes right now compared to future possibilities, and I am envious of people who don't realize that. I need to spend less time thinking about it, but I still can't wrap my head around people rolling a die that might have s-risks on it. It just seems like a -inf EV decision. I do not understand the thought process of people who see -inf and just go "yeah, I'll gamble that." It's so fucking stupid.
↑ comment by Thane Ruthenis · 2025-02-10T11:24:53.947Z · LW(p) · GW(p)
- They are not necessarily "seeing" -inf the way you or I are. They're just kinda not thinking about it, or think that 0 (death) is the lowest utility can realistically go.
- What looks like an S-risk to you or me may not count as -inf for some people.
↑ comment by Aristotelis Kostelenos (aristotelis-kostelenos) · 2025-02-10T11:59:16.396Z · LW(p) · GW(p)
I think humanity's actions right now are most comparable to those of a drug addict. We as a species don't have the equivalent of the executive function and self-control needed to abstain from racing towards AGI. And if we're gonna do it anyway, those who shout about how we're all gonna die just ruin everyone's mood.
↑ comment by dr_s · 2025-02-10T15:54:20.302Z · LW(p) · GW(p)
Or, for that matter, to abstain from burning fossil fuels without limit. We happen to not live on a planet with enough carbon to trigger a Venus-like cascade, but if that weren't the case, I don't know if we could stop ourselves from doing that either.
The thing is, any kind of large-scale coordination to that effect seems more and more like it would require a degree of removal of agency from individuals that I'd call dystopian. You can't be human and free without the freedom to make mistakes. But the higher the stakes, and the greater the technological power we wield, the less tolerant our situation becomes of mistakes. So the alternative would be that we willingly choose to slow down, or entirely abort, certain branches of technological progress: choosing shorter and more miserable lives over the risk of having to curtail our freedom. But of course, for the most part (not unreasonably!), we don't really want to take that trade-off, and ask "why not both?"
↑ comment by dr_s · 2025-02-10T15:49:08.711Z · LW(p) · GW(p)
> What looks like an S-risk to you or me may not count as -inf for some people
True, but that's just for relatively "mild" S-risks like "a dystopia in which AI rules the world, sees all, and electrocutes anyone who commits a crime by the standards of the year it was created in, forever". It's a bad outcome, you could classify it as an S-risk, but it's still among the most aligned AIs imaginable and relatively better than extinction.
I simply don't think many people think about what an S-risk literally worse than extinction would look like. To be fair, I also think these aren't very likely outcomes, as they would require an AI very aligned to human values, just aligned for evil.
↑ comment by Thane Ruthenis · 2025-02-10T15:53:22.821Z · LW(p) · GW(p)
No, I mean, I think some people actually hold that any existence is better than non-existence, so death is -inf for them and existence, even in any kind of hellscape, is above-zero utility.
↑ comment by dr_s · 2025-02-10T15:55:44.174Z · LW(p) · GW(p)
I just think any such people lack imagination. I am 100% confident there exists an amount of suffering that would have them wish for death instead; they simply can't conceive of it.
↑ comment by Thane Ruthenis · 2025-02-10T16:11:17.670Z · LW(p) · GW(p)
One way to make this work is to just not consider your driven-to-madness future self an authority on the matter of what's good or not. You can expect to start wishing for death, and still take actions that would lead you to this state, because present!you thinks that existing in a state of wishing for death is better than not existing at all.
I think that's perfectly coherent.
comment by LWLW (louis-wenger) · 2025-02-15T07:29:32.704Z · LW(p) · GW(p)
>be me, omnipotent creator
>decide to create
>meticulously craft laws of physics
>big bang
>pure chaos
>structure emerges
>galaxies form
>stars form
>planets form
>life
>one cell
>cell eats other cell, multicellular life
>fish
>animals emerge from the oceans
>numerous opportunities for life to disappear, but it continues
>mammals
>monkeys
>super smart monkeys
>make tools, control fire, tame other animals
>monkeys create science, philosophy, art
>the universe is beginning to understand itself
>AI
>Humans and AI together bring superintelligence online
>everyone holds their breath
>superintelligence turns everything into paper clips
>mfw infinite kek