[LINK] AI-boxing Is News, Somehow

post by MugaSofer · 2012-10-19T12:34:27.117Z · score: 6 (10 votes) · LW · GW · Legacy · 11 comments

Tech News Daily have published an article advocating AI-boxing, which namechecks Eliezer's AI-Box Experiment. It seems to claim AI-Boxes are a revolutionary new idea; is covered in pictures of fictional, evil AIs; and worries that a superintelligent AI might develop psychic powers.

Seems like a good reminder of the state of reporting on such matters.



comment by VincentYu · 2012-10-19T15:12:45.543Z · score: 10 (10 votes) · LW(p) · GW(p)

The paper (Yampolskiy 2012). The "psychic powers" are in section 2.1.4, "Pseudoscientific attacks".

comment by shminux · 2012-10-19T17:24:51.515Z · score: 5 (5 votes) · LW(p) · GW(p)

The original article is somewhat cautious:

While it is unlikely that long-term and secure confinement of AI is possible, we are hopeful that the proposed protocol will give researchers a little more time to find a permanent and satisfactory solution for addressing existential risks associated with appearance of superintelligent machines

but not nearly cautious enough:

All supervision, maintenance and interactions with the AI need to be done only via unintelligent programs, which have no option to release the AI.

as if an unintelligent program is not gameable.

Their approach to AI safety:

Asking the AI to answer open ended questions opens up the possibilities for social engineering attacks as well as numerous other dangerous consequences. Our proposed protocol is based on the idea of asking what we will call ‘safe questions’ with restricted answers. We define a safe question as one for which a human being could find an answer without any help from superintelligence, all on his own merit. Why is this useful? Suppose that a scientist working on a cure for cancer is looking at two potential cures. Each one could be developed and tested without assistance from the AI, but would take about 3 years of hard work each. Which one to try first if the scientist believes that they have about an equal chance of working? If a wrong option is chosen to be tested first, humanity will still obtain a cure for cancer but it will happen 3 years later. What if we could ask the AI to suggest which option to try first?

This question is safe for a number of reasons. First, each potential answer has an equal probability of being correct. Second, a human being could answer it without assistance from the AI and so asking AI for help will only speed up progress but in no way change the outcome. It is essentially equal to being very lucky multiple choice guesser. Finally, the answer to this question could be encoded in a single bit making it hard to hide additional messages in the provided answer.

To make sure that a potential question is safe, a group of experts could be used to review it.

We believe our communication protocol to be verifiably secure against information leaking even if the AI had full access to the description of the security protocol. If the protocol is correctly implemented, humanity will essentially have access to a superintelligent oracle, which might be extremely beneficial and safer than an unconstrained AI.

I fail to see how a question satisfying "only safe questions with two possible answers of even likelihood which are independently computable by people should be submitted to the AI" can even be constructed for a non-trivial case, or how this rule can stop a determined superintelligence from making people want to relax the constraints bit by bit.
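For concreteness, the quoted "safe question" channel can be sketched roughly as below. This is only an illustration under stated assumptions, not the paper's implementation: the names `SafeQuestion`, `ask_safe_question`, and `approved_by_reviewers` are hypothetical, the expert review is reduced to a boolean flag, and the sketch makes no attempt to check that the two options really are equally likely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SafeQuestion:
    """A two-option question that humans could settle without the AI."""
    text: str
    option_a: str
    option_b: str
    approved_by_reviewers: bool  # hypothetical stand-in for the expert review


def ask_safe_question(
    question: SafeQuestion,
    oracle: Callable[[SafeQuestion], object],
) -> str:
    """Return the chosen option, letting at most one bit out of the box."""
    if not question.approved_by_reviewers:
        raise PermissionError("question was not approved by the reviewers")

    raw = oracle(question)

    # Accept only the plain integers 0 or 1. Anything else (text, floats,
    # booleans, longer structures) is rejected rather than shown to a human.
    if type(raw) is not int or raw not in (0, 1):
        raise ValueError("oracle reply rejected: not a single bit")

    return question.option_a if raw == 0 else question.option_b


if __name__ == "__main__":
    q = SafeQuestion("Which cure should be tested first?", "cure A", "cure B", True)
    print(ask_safe_question(q, lambda _q: 1))
```

Even in this toy form the filter only constrains the content of a single reply. It says nothing about who writes the questions, how many are asked, or whether the people running it stay willing to keep the constraints in place.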

comment by ewang · 2012-10-20T18:04:54.195Z · score: 3 (5 votes) · LW(p) · GW(p)

As a Dwarf Fortress player, I'd prefer using "&" to warn about AI hazards rather than "@".

comment by wedrifid · 2012-10-21T05:09:30.312Z · score: 1 (1 votes) · LW(p) · GW(p)

As a Dwarf Fortress player, I'd prefer using "&" to warn about AI hazards rather than "@".

Definitely. Whether in Dwarf Fortress or a rogue-like, "@" means "player character or another demi-human". Humanoid Terminators aren't the risk here!

comment by Dallas · 2012-10-19T16:18:29.582Z · score: 2 (2 votes) · LW(p) · GW(p)

I somehow really thought this article was going to be about upscaled Rock 'Em Sock 'Em Robots. I'm not sure if this is better or worse.

comment by DavidPlumpton · 2012-10-20T08:09:04.571Z · score: 1 (1 votes) · LW(p) · GW(p)

Whatever happened with that (Russian?) movie based on the idea?

comment by Thomas · 2012-10-19T12:43:48.027Z · score: -1 (3 votes) · LW(p) · GW(p)

No magic powers were mentioned on this site. Maybe powers virtually indistinguishable from magic powers. But that is something different.

comment by Lapsed_Lurker · 2012-10-19T14:01:09.841Z · score: 3 (3 votes) · LW(p) · GW(p)

…powers such as precognition (knowledge of the future), telepathy or psychokinesis…

Sounds like a description of magic to me. They could have written it differently if they'd wanted to evoke the impression of super-advanced technologies.

comment by MugaSofer · 2012-10-19T14:09:43.096Z · score: 1 (1 votes) · LW(p) · GW(p)

I somehow doubt that meant super-advanced technology - remember, this AI is trapped in a box.

I changed it to "psychic powers", since that seems more accurate - high intelligence leading to "psychic powers" is a well-established sci-fi trope.

comment by Thomas · 2012-10-19T14:46:02.668Z · score: -2 (2 votes) · LW(p) · GW(p)

Yes, I wasn't clear enough. By "this site" I mean Lesswrong, not the NWT.

On LW (and on any somehow related site), NWT could not get the information that an AI might become magical. Only very advanced or very very very advanced.

For an outside observer it may be hard to tell where the difference is, but it always is and it always will be a fundamental difference.

The message "there is no magic" is the loudest message here on Lesswrong, as I see it.

The second one, "AI may LOOK LIKE magic", is ... well, subordinate to the first one.

And it is the NWT who doesn't understand this hierarchy.

comment by MugaSofer · 2012-10-19T12:55:12.586Z · score: 1 (3 votes) · LW(p) · GW(p)

Changed to "psychic powers". It's clearer if nothing else.