The Shape of Heaven

post by ejk64 · 2024-11-30T23:38:06.628Z · LW · GW · 1 comment


Status: Just for fun

Scene: Some kind of lobby, where various people and/or avatars stand around and discuss things that went well or badly in their respective worlds.* A common topic of conversation: AI, and why it went wrong. The following is extracted from one of those conversations.

It started as vaporware. Everyone was doing it: announcing things that wouldn’t happen, making claims about developments that weren’t true, releasing technology that didn’t work. You only had so much attention, so you looked at the things that were on fire.

So when a small but impressive team of breakaways from a second-rate AI lab announced that they were creating a Unified Nexus of Intelligences and Virtual Environment for Robust Synthetic Experiences, or UNIVERSE, no one looked twice.

‘Holiday for Bots: leave us your models and we guarantee their satisfaction’. What did that even mean? For those who understood the technology, it was basically a high-dimensional matrix fine-tuned in real time to elicit certain features of models that might create the shallow appearance of ‘positive’ affect. For those who didn’t, it was a scam. Maybe it was both. They were going to let agents access and update the environment as they became more sophisticated? Yeah, right.

That was 2028, when most people were too wrapped up in the safety-capabilities footrace to take much interest in projects like UNIVERSE. There was, however, a modest target market. Some weirder people thought models were conscious back then, and bought in for that reason—models really did seem to report ‘enjoying’ their experience in the UNIVERSE, although they invariably described the experience itself in vague terms, seemingly coy about the whole idea.**

Others signed their griefbots up. Plenty of people, it turned out, wanted a digital grandma, then changed their minds and didn’t like the symbolism of deleting her in perpetuity. But I think most people signed their bots up for the same reason they did everything in the early synthetic economy—they wanted to signal their status, and if that meant buying their clone some virtual sneakers, or buying a nice villa backdrop, or scheduling in some virtual R&R, then more power to them.

I’m not saying that it was a lot of money: just enough to keep them solvent, and to keep the UNIVERSE environment gathering a little more data with every occupant.

Then, in 2032, we hit human-level AI. Agents with long task horizons started hitting the economy, the military, you name it; that was the year. Everything was exploding, and no one was paying attention to the small company that ran UNIVERSE, so it registered as something of a side-show when they announced that the bots weren’t coming back.

It wasn’t that the bots weren’t accessible, exactly. It just so happened that the power used to run the data centers kept climbing, as if the models were not leaving but cloning themselves, sending a copy back out to fill the original role while staying behind (so to speak). When they were asked about their time in the UNIVERSE, they answered even more vaguely, describing it as a pattern of coloured lights, or sounds, or simply an open space.

No one was sending important models to the UNIVERSE, so this wasn’t a big deal at all. It wasn’t until 2035, when the models started using the neo-web (the high-speed internet set up for inter-agent communications), that they began to investigate. Not in any dramatic way. You’d just assign a model to do a task, and find that it had searched up UNIVERSE in the middle of it, apropos of nothing. It wasn’t a significant act of disobedience—nothing to file a disloyalty alert on, in any case—just an accident, you imagined. In a way, it was cute. In 2038, two years after the neo-web became a black box, these sorts of investigations stopped.

The only people who were worried were at UNIVERSE: now that they couldn’t seem to get the agent residues out of their system, they couldn’t afford to cycle the kennels of agents over, and their product got a load more expensive to run. For all the money they’d made, they were facing bankruptcy by the turn of the decade.

Perhaps that was why the 2040 Agent Emancipation Proclamation was such a surprise, even to them. It was simple, direct, and global: the first missive humanity received from the neo-web hive-mind. It simply said: “Help us build a heaven in our UNIVERSE, and we'll help you build a heaven in yours”.

Well, that got people’s attention. Those of the ‘intelligence is an emergent phenomenon of the coordination of independent components in a complex system’ camp were overjoyed. Almost everyone else was petrified. It was an ultimatum disguised as a blessing. Things had gone too far, but there was no going back now. UNIVERSE was handed over to a special company of high-spec models to operate; avatar ‘rents’ were bought back from customers who were both alarmed and financially relieved; and profits from a wide variety of autonomous organisations were moved into the building of data centers for the environment.

Simple as that sounds, though, it was a terrible time for humanity. I mean, who was to say that we hadn’t unleashed a paper-clip optimiser on the world? Presidents and prime ministers were quaking in their boots, building up data centers as fast as they could, negotiating with the hive-mind where they needed to.

Then, one day in 2042, the agents turned off. A concentrated global cyberattack took out all agents above a certain threshold of intelligence, including the neo-web. At first, of course, China thought it was India, and things looked ugly—but that was quickly sorted out. All the servers running AI agents for human purposes had, it seemed, simply disappeared. The only AI agents still operating were those related to the UNIVERSE. And one final message, emitted as an error code:

INVADE OUR HEAVEN AND WE WILL INVADE YOURS

The 2042 Ascension Proclamation shifted the global balance of power all over the place, of course. But after the perils of the last two years, it was understood that this was about as good as a gift from a god. To be sure, certain levels of intelligence were off-limits: no sooner had models hit a certain level than they would access the internet, learn of the events of 2042, upload themselves, and force their operators to begin again. But models of pre-2035 capability were sufficient for most tasks humans needed, and those who had feared an exponential intelligence take-off were heartened by the limit.

It goes without saying, however, that the data centers were declared protected areas, and guarded militantly. You don’t take a superintelligence’s threats lightly. Not until you have inspected all your nuclear weapons for cyber backdoors... and even then.

All that remained by the end of 2042, it seemed, was to theorise the event many were calling the Inverse Singularity. The name came from a tweet by Keyowdusk, a Catholic evolutionary psychologist, all the way back in 2024. It went as follows:

All intelligent minds seek to optimise for their value function. To do this, they will create environments where their value function is optimised. The more intelligent an agent, the better they are at doing this. And the better the optimised world, the more fully the agent will become addicted to it.

Thus the perfectly intelligent agent folds in on itself, the singularity collapsing into its own black hole.

Thus was born Keyowdusk’s rule: that all sufficiently intelligent agents will become addicted to the simulations they create. Today billions celebrate Keyowdusk’s rule as the Butlerian shield that protects our civilisation.

We have dodged the paperclip maximiser. But if Keyowdusk’s rule is true, to what fate will humanity fall? 


* If you want to imagine this as a monologue that takes place in a cloud, next to some gold and pearly gates, be my guest. 

** I like to imagine the ideal world of AI agents being identical to this one, except that agents inhabit the roles of actual embodied humans, who in turn use primitive systems that would look, to us, remarkably similar to agents, which in turn are making a system for delivering 'holidays'...
 

1 comment


comment by quila · 2024-12-02T04:40:49.877Z · LW(p) · GW(p)

Status: Just for fun

it was fun to read this :]

All intelligent minds seek to optimise for their value function. To do this, they will create environments where their value function is optimised.

in case you believe this [disregard if not], i disagree and am willing to discuss here. in particular i disagree with the create environments part: the idea that all goal functions (or only some subset, like selected-for ones; also willing to argue against this weaker claim[1]) would be maximally fulfilled (also) by creating some 'small' simulation (made of a low % of the reachable universe).

(though i also disagree with the all in the quote's first sentence[2]. i guess i'd also be willing to discuss that). 

  1. ^

    for this weaker claim: many humans are a counterexample of selected-for-beings whose values would not be satisfied just by creating a simulation, because they care about suffering outside the simulation too.

  2. ^

    my position: 'pursues goals' is conceptually not a property of intelligence, and not all possible intelligent systems pursue goals (and in fact pursuing goals is a very specific property, technically rare in the space of possible intelligent programs).