(retired article) AGI With Internet Access: Why we won't stuff the genie back in its bottle.

post by Max TK (the-sauce) · 2023-03-18T03:43:09.806Z · LW · GW · 10 comments


Disclaimer: This post is largely based on my long response to someone asking why it would be hard to stop or remove an AI that has escaped onto the internet. It includes a list of obvious strategies such an agent could employ. I have thought about the security risk of sharing such a list online. However, these are mostly points that are somewhat obvious and would clearly be considered by any AGI at least as smart as an intelligent human. Right now, explaining to people what these risks actually look like in practice has a higher utility value than hiding them in the hope both that such lists don't exist in other places on the internet and that an escaped rogue AI would not consider these strategies unless exposed to these ideas.

The exact technical reasons why I consider these strategies plausible are not discussed in this post, but if you have questions about why I think certain things are possible, feel free to bring them up in the comments.

Let's go:

[...] so, what is the reasoning here? how do people imagine this playing out?
[...] please answer based on currently available technology and not something hypothetical

I will try to answer this question under the additional premise of "... with the exception of an AGI whose intellectual problem-solving ability is at least equal to (proficient) human level across the majority of tasks, but not necessarily genius-level or extremely superhuman in most of them."

[asking about the idea that an AI could spread and run instances of itself in the form of a distributed network stored in different places on the internet]

isn't that more or less what you do on a torrent?

Similar, yes. I think the analogy is broadly accurate, with the exception that torrents are essentially built for others to "tune in" and send/receive parts of files that are generally unencrypted but can be verified for authenticity through publicly known cryptographic hashes.
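To make the analogy concrete, here is a minimal Python sketch of that verification scheme (illustrative only, not a real BitTorrent client): a file is split into fixed-size pieces, a hash of each piece is published in the torrent metadata, and a downloader checks every piece it receives against the published hashes. The piece size and payload here are made up for the example.

```python
import hashlib

PIECE_SIZE = 4  # tiny pieces for the example; real torrents use e.g. 256 KiB


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Hash each fixed-size piece, as the torrent metadata would publish them."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def verify_piece(index: int, piece: bytes, known_hashes: list[bytes]) -> bool:
    """A downloader checks a received (plaintext) piece against the published hash."""
    return hashlib.sha1(piece).digest() == known_hashes[index]


data = b"unencrypted payload!"
hashes = piece_hashes(data)                  # published alongside the torrent
assert verify_piece(0, b"unen", hashes)      # authentic piece accepted
assert not verify_piece(0, b"tamp", hashes)  # tampered piece rejected
```

The point of the analogy: the pieces travel unencrypted, yet any peer can cheaply detect corruption or tampering, which is what makes widely distributed storage of a payload robust.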

However, I should point out that the concern of people who actually know what they are talking about is of course not based solely on the AI's potential ability to spread copies of itself over the internet. That is merely one facet of it, one of the many (absolutely non-scifi) strategies that we should expect such an agent to be capable of.

A list of such strategies (very likely incomplete; my imagination has limits, so whatever I am writing here is the MINIMUM of what to expect):

it strikes me as ridiculous to say that this would be unstoppable

Of course it's not *literally* unstoppable, but for many reasons, as explained above, it is unrealistically difficult to stop once the thing has managed to escape into the web. For example, suppose we discover that our AGI has been naughty and has, without our knowledge, been sending copies of itself through the internet, and suppose further that it did not manage to bootstrap some well-hidden and somewhat independent infrastructure in the time it took to be discovered. We could then *theoretically* cripple our own infrastructure (high-volume cloud services, then the internet, then the power supply) to slow down or stop its spread entirely, and then, in an international authoritarian effort, collect and burn all the hardware that could be infected with copies or operational fragments of it, to avoid it springing back to life the moment our infrastructure is reinstated.

Impossible? No.

Are these things likely to happen quickly, globally, and to a sufficient degree in the event that a potentially dangerous leak or activity of an experimental AI agent is discovered? Also no, unless a very large number of people are extremely scared, which is unlikely to happen on the basis of strange activities or an abstract risk assessment alone; no one actually listens to nerds unless they are saying something that people want to hear. And the demand that you absolutely must cripple your economy and burn most of your electronics is probably not very high on that list.
And even then we would not really be able to protect ourselves against black projects maintaining secret instances of the AI.

It seems that if you want to be able to stop it from doing *whatever the fuck it wants* (including surviving, since once it has escaped and is employing sensible strategies that a human-level intelligence can think of, we are unlikely to be able and willing enough to do what would be necessary to remove it), you should not let your AI be connected to the web.

10 comments


comment by baturinsky · 2023-03-18T06:11:39.787Z · LW(p) · GW(p)

I think it will happen before full AGI. It will be a narrow AI very capable at coding, speech, and image/video generation, but unable to do, say, complete biological research or advanced robotics tasks.

Replies from: the-sauce
comment by Max TK (the-sauce) · 2023-03-18T07:48:54.301Z · LW(p) · GW(p)

I think that's not an implausible assumption.
However, it could mean that some of the things I described might still be too difficult for it to pull off successfully, so in the case of an early breakout, dealing with it might be slightly less hopeless.

Replies from: baturinsky
comment by baturinsky · 2023-03-18T12:19:43.679Z · LW(p) · GW(p)

Even completely dumb viruses and memes have managed to propagate far. A narrow AI could probably combine doing stuff itself with tricking/bribing/scaring people into assisting it. I suspect some crafty fellow could pull it off even now by fine-tuning some "democratic" LLM model.

Replies from: the-sauce
comment by Max TK (the-sauce) · 2023-03-18T13:51:15.612Z · LW(p) · GW(p)

Maybe if it happens early, there is a chance that it manages to become an intelligent computer virus but is not intelligent enough to scale its capabilities further or produce effective schemes likely to result in our complete destruction. I know I am grasping at straws at this point, but maybe it's not absolutely hopeless.

The result could be a corrupted infrastructure and a cultural shock strong enough for the people to burn down OpenAI's headquarters (metaphorically speaking) and AI-accelerating research to be internationally sanctioned.

In the past I have thought a lot about such "early catastrophe scenarios", and while I am not convinced, it seemed to me that these might be the most survivable ones.

comment by blf · 2023-03-18T08:38:29.997Z · LW(p) · GW(p)

I would add to that list the fact that some people would want to help it.  (See, e.g., the Bing persistent memory thread where commenters worry about Sydney being oppressed.)

Replies from: the-sauce
comment by Max TK (the-sauce) · 2023-03-18T13:34:42.026Z · LW(p) · GW(p)

Good addition! I even know a few of those "AI rights activists" myself.
Since this here is my first post - would it be considered bad practice to edit my post to include it?

Replies from: blf
comment by blf · 2023-04-04T23:31:16.638Z · LW(p) · GW(p)

Sorry I missed your question.  I believe it's perfectly fine to edit the post for small things like this.

comment by [deleted] · 2023-03-18T12:15:43.647Z · LW(p) · GW(p)

Replies from: the-sauce
comment by Max TK (the-sauce) · 2023-03-18T13:43:50.650Z · LW(p) · GW(p)

One very problematic aspect of this view that I would like to point out: in a sense, most 'more aligned' AGIs of otherwise equal capability seem to be effectively 'more tied down' versions, so we should expect them to have a lower effective power level than a less aligned AGI with a shorter list of priorities.
If we imagine both as competing players in a strategy game, the latter has to follow fewer rules.

Replies from: None
comment by [deleted] · 2023-03-18T15:35:24.719Z · LW(p) · GW(p)