An example elevator pitch for AI doom

post by laserfiche · 2023-04-15T12:29:17.303Z · LW · GW · 5 comments

I have been surprised to repeatedly see the claim that there isn't even an argument for concern about AI, and that the case for concern has been made without evidence and can therefore be dismissed.

Obviously, there is an extensive library of evidence and arguments that has been built up over decades. Additionally, I would argue that concern should be the default assumption. However, there is clearly still a need for a concise argument that can be produced on the fly and that requires no knowledge of terminology or other background. Here is another attempt at that:

Does this mean that AIs are 100% certain to wipe out humanity? No, of course not. That's an absurd bar. Rather, the burden of proof should be to show that AIs are 99% certain not to cause catastrophe. If there's a 10% chance that AIs will sterilize the earth, that's already an all-hands-on-deck situation.

5 comments

Comments sorted by top scores.

comment by trevor (TrevorWiesinger) · 2023-04-15T14:49:51.442Z · LW(p) · GW(p)

AFAIK, the Superintelligence FAQ is still considered to be the best introduction for most people [LW · GW]. 

Ngo wrote AI safety from first principles [LW · GW], but I don't know if it was good enough. Along with List of Lethalities, it's probably a solid list of things to include. Possibly worth doing some work on in order to make a better version.

Zvi's Basics of AI wiping out all value [LW · GW] definitely looks pretty neat, but at the end of the day we need someone to go out and empirically test all of these and see what gets the best results.

Replies from: laserfiche
comment by laserfiche · 2023-04-15T16:14:49.631Z · LW(p) · GW(p)

I find those first two and Lethalities to be too long and complicated for convincing an uninitiated, marginally interested person. Zvi's Basics is actually my current preference along with stories like It Looks Like You're Trying To Take Over The World [LW · GW] (Clippy).

comment by TAG · 2023-04-16T16:47:40.587Z · LW(p) · GW(p)

This doesn't work as a from-scratch explanation because

  • You don't explain what alignment is or why it is desirable.

  • You don't explain what agentising is or why it is dangerous.

  • You don't explain why "training successor versions of itself" is dangerous.

Replies from: laserfiche
comment by laserfiche · 2023-04-18T17:09:43.296Z · LW(p) · GW(p)

I agree that there are many situations where this cannot be used. But there appears at least to be a gap that arguments like this can fill that is missed by the existing explanations.

comment by Devansh Shah (devansh-shah) · 2023-04-15T15:40:10.138Z · LW(p) · GW(p)

Contrary opinion: Agents like #AutoGPT are more aligned than the underlying LLMs due to chaining. For example, if the model says "I need to make $1M" and then answers itself with "Stealing is the best plan to achieve it", there are two prompts at which the underlying LLM can refuse to help.
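
The chaining argument above can be made concrete with a toy calculation. This is only an illustrative sketch: it assumes each call to the underlying LLM refuses a harmful step independently and with the same probability, which the comment itself does not claim. The function name and the numbers are hypothetical.

```python
def chain_refusal_probability(p_refuse_per_step: float, steps: int) -> float:
    """Probability that at least one step in a chain of LLM calls refuses.

    Assumption (not from the original comment): every step refuses
    independently with the same probability p_refuse_per_step.
    """
    if not 0.0 <= p_refuse_per_step <= 1.0:
        raise ValueError("p_refuse_per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # The chain gets through only if every single step fails to refuse.
    p_all_steps_comply = (1.0 - p_refuse_per_step) ** steps
    return 1.0 - p_all_steps_comply


# Hypothetical numbers: one prompt versus the two prompts in the example.
single_prompt = chain_refusal_probability(0.5, 1)
two_prompts = chain_refusal_probability(0.5, 2)
print(single_prompt, two_prompts)
```

Under these assumptions, a second prompt can only raise the chance that some step refuses. Whether real refusals are anywhere near independent across steps is an open question that this sketch does not address.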