Building Safer AGI by introducing Artificial Stupidity

post by Michaël Trazzi (mtrazzi) · 2018-08-14T15:54:33.832Z · score: 8 (4 votes) · LW · GW · 7 comments

This is a link post for https://arxiv.org/abs/1808.03644

Authors: Michaël Trazzi, Roman V. Yampolskiy

Abstract: Artificial Intelligence (AI) achieved super-human performance in a broad variety of domains. We say that an AI is made Artificially Stupid on a task when some limitations are deliberately introduced to match a human's ability to do the task. An Artificial General Intelligence (AGI) can be made safer by limiting its computing power and memory, or by introducing Artificial Stupidity on certain tasks. We survey human intellectual limits and give recommendations for which limits to implement in order to build a safe AGI.

7 comments

comment by Charlie Steiner · 2018-08-14T18:51:22.662Z · score: 5 (4 votes) · LW · GW

This feels like one of those solutions that only works if adopted by everyone, even when everyone has a selfish incentive to make their AI just a little smarter than the competition's.

comment by Michaël Trazzi (mtrazzi) · 2018-08-14T19:51:10.578Z · score: 4 (2 votes) · LW · GW

If I understand you correctly, every AGI lab would need to agree not to push the hardware limits too far, even though they would still be incentivized to do so to win some kind of economic competition.

I see it as a containment method for AI Safety testing (cf. last paragraph on the treacherous turn). If there is some kind of strong incentive to have access to a "powerful" safe-AGI very quickly, and labs decide to skip the Safety-testing part, then that is another problem.

comment by Charlie Steiner · 2018-08-14T19:19:43.571Z · score: 2 (2 votes) · LW · GW

Perhaps one might try to use a slightly sub-human AI to help build a safe self-improving AI? I'm not too clear on what kind of problems it helps us solve, though I haven't put any thought into it.

comment by Michaël Trazzi (mtrazzi) · 2018-08-14T20:07:29.322Z · score: 1 (1 votes) · LW · GW

The points we tried to make in this article were the following:

  • To pass the Turing Test, build chatbots, etc., AI designers make the AI artificially stupid so that it feels human-like. This tendency will only get worse as we interact more with AIs. The problem is that being truly "human-like" requires Superintelligence, not AGI.
  • However, we can use this concept of "Artificial Stupidity" to limit the AI in different ways and make it human-compatible (hardware, software, cognitive biases, etc.). We can use several of those sub-human AGIs to design safer AGIs (as you said), or test them in some kind of sandbox environment.

comment by Pattern · 2018-08-14T20:34:43.587Z · score: 0 (0 votes) · LW · GW

The last time I saw the phrase 'artificial stupidity', it referred to people designing chatbots to have an advantage on 'Turing Tests' by making common spelling mistakes consistent with QWERTY keyboards, and the judges, when judging between a bot and someone who had good spelling, all else being equal, figured the bot was human. On the other hand, I could also see this being something a 'smart' AI might do.

"limitations are deliberately introduced"

Today this is part of better computer chess - coming up with algorithms that can compete with each other while using fewer and fewer resources. (Last I heard, they can beat the best human players while running on phones.)

I've also seen a lot of criticism directed at Google for running tests or 'competitions' with very questionable and arbitrary limitations in order to give an advantage to the program they want to look good, such as AlphaZero.

comment by Michaël Trazzi (mtrazzi) · 2018-08-14T20:52:30.232Z · score: 1 (1 votes) · LW · GW

Yes, typing mistakes in the Turing Test are an example. It's "artificially stupid" in the sense that you go from perfect typing to imperfect, human-like typing. I guess what you mean by "smart" is an AGI that would creatively make those typing mistakes to deceive humans into believing it is human, instead of some hardcoded feature in a Turing contest.
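
To make the "hardcoded feature" case concrete, here is a minimal sketch of how a chatbot could inject QWERTY-style typos into otherwise perfect output. It is not taken from the paper or from any actual Turing contest entry: the function name `add_typos`, the `error_rate` parameter, and the partial neighbour table are all assumptions made for illustration.

```python
import random

# Hypothetical, deliberately partial table of QWERTY neighbours.
# A real table would cover every key on the keyboard.
QWERTY_NEIGHBOURS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "n": "bhjm", "r": "edft", "s": "awedxz", "t": "rfgy",
}

def add_typos(text, error_rate=0.05, rng=None):
    """Replace some characters with an adjacent QWERTY key.

    error_rate is the assumed per-character probability of a slip;
    characters missing from the table are always left unchanged.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for ch in text:
        neighbours = QWERTY_NEIGHBOURS.get(ch.lower())
        if neighbours and rng.random() < error_rate:
            typo = rng.choice(neighbours)
            # Preserve the case of the original character.
            out.append(typo.upper() if ch.isupper() else typo)
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    print(add_typos("This is a perfectly typed sentence.", error_rate=0.1))
```

Setting `error_rate` to 0 gives back the perfect typing, so the "stupidity" here is a single explicit knob rather than something the system chooses for itself.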

comment by jmh · 2018-08-15T14:34:56.809Z · score: 0 (2 votes) · LW · GW

I suppose it depends on what we mean by safe AIs, but in the back of my mind I think we'll be safe from an AI deciding to take over the world and humankind (or simply kill us all off) if we manage to build in humor. That won't be sufficient, and perhaps not necessary either, but I think having it might make the goal of safe AIs easier to accomplish.