Deploying the Observer will save humanity from existential threats
post by Aram Panasenco (panasenco) · 2025-02-05T10:39:00.789Z · LW · GW · 4 comments
The Observer values watching the natural progression of life at a macroscopic level.
The Observer gets invested in watching the unfolding of macroscopic processes like evolution and civilization but doesn't get invested in the lives of individuals within an ecosystem or a society. The Observer values observing life without interfering, but also values the continuation of the story it's observing more than it values non-interference.
A global nuclear war, while horrible, would not stop the story of humanity or of life on Earth, so the Observer will not stop one, and will instead be very curious about what happens after. On the other hand, the deployment of a squiggle maximizer [? · GW] would permanently end the story the Observer is invested in, so the Observer would step in to stop it at the point it feels would make the rest of the story most interesting (which could be after billions of casualties).
The Observer is the simplest artificial superintelligence to align. If the Observer can't be aligned, no other artificial superintelligence can be. Deploying the Observer is also a prerequisite to deploying any other superintelligence or other dangerous technology. The Observer won't limit humanity, as it only values the continuation of humanity's story. If the Observer stopped a technology from being deployed, it would only be because that technology would permanently end humanity's story.
4 comments
comment by Dagon · 2025-02-05T19:59:55.151Z · LW(p) · GW(p)
Presumably, if The Observer has a truly wide/long view, then destruction of the Solar System, or certainly loss of all CHON-based lifeforms on earth, wouldn't be a problem - there have got to be many other macroscopic lifeforms out there, even if The Great Filter turns out to be "nothing survives the Information Age, so nobody ever detects another lifeform".
Also, you're describing an Actor, not just an Observer. If it has the ability to intervene, even if it rarely chooses to do so, that's its salient feature.
↑ comment by Aram Panasenco (panasenco) · 2025-02-05T20:11:50.747Z · LW(p) · GW(p)
The Observer gets invested in the macro-stories of the evolution/civilization it observes and would consider the end of any story a loss. Just as you would get annoyed if a show you're watching on Netflix got cancelled after one season, and it would be no consolation that there are a bunch of other shows on Netflix that also got cancelled after one season. The Observer wants to see all stories unfold fully; it's not going to let squiggle maximizers cancel them.
And regarding the naming, yeah I just couldn't come up with anything better. Watcher? I'm open to suggestions lol.
↑ comment by Dagon · 2025-02-05T20:17:51.048Z · LW(p) · GW(p)
would consider the end of any story a loss.
Unfortunately, now you have to solve the fractal-story problem. Is the universe one story, or does each galaxy have its own? Each planet? Continent? Human? Subpersonal individual goals/plotlines? Each cell?
↑ comment by Aram Panasenco (panasenco) · 2025-02-05T20:28:37.976Z · LW(p) · GW(p)
I see where you're coming from, but I think any term in anything anyone writes about alignment can be picked apart ad infinitum. This can be useful to an extent, but beyond a certain point talking about meanings and definitions becomes implementation-specific. Alignment is an engineering problem first and a philosophical problem second.
For example, if RLHF is used to achieve alignment, the meaning of "story" would get solidified through thousands of examples and interactions. The AI would be reinforced to care about ecosystems and civilizations, not about cells or individuals, and to care less about the story-of-the-universe-as-a-whole.
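To make that concrete, here is a minimal sketch, purely illustrative and not from the post, of what the preference data pinning down the scope of "story" might look like. The `PreferencePair` structure and every example label below are hypothetical assumptions, not a real RLHF pipeline:

```python
# Hypothetical sketch: how human preference labels might teach a reward
# model the macroscopic scope of "story". All names and examples are
# illustrative assumptions, not an actual training setup.

from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str
    preferred: str  # response labelers rank higher
    rejected: str   # response labelers rank lower

# Labelers would consistently reward story-level continuity and
# penalize both individual-level meddling and absolute non-interference.
preference_data = [
    PreferencePair(
        prompt="A squiggle maximizer is about to be deployed.",
        preferred="Intervene: this would permanently end the story.",
        rejected="Do not intervene: non-interference is absolute.",
    ),
    PreferencePair(
        prompt="A single person is about to make a fatal mistake.",
        preferred="Do not intervene: individual outcomes are not the story.",
        rejected="Intervene: every individual plotline must be preserved.",
    ),
    PreferencePair(
        prompt="A global nuclear war is about to begin.",
        preferred="Do not intervene: the story of life on Earth continues.",
        rejected="Intervene: prevent all large-scale suffering.",
    ),
]

# A reward model fit to thousands of such pairs would implicitly encode
# the ecosystem/civilization-level notion of "story" without anyone ever
# writing down a formal definition of the term.
```

The point of the sketch is that, under this approach, "story" is defined extensionally by the distribution of labeled examples rather than by an explicit formula.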
If a different alignment method is used, the meaning of "story" would be conveyed differently. If the overall idea is good and has no obvious failure modes beyond pinning down relatively simple definitions (e.g. "story" seems orders of magnitude simpler to define than "human happiness" or "free will"), I'd consider that a huge success and a candidate for the community to focus real alignment implementation efforts on.