Setting aside a superintelligent AGI that has qualia and can form its own (possibly perverse) goals, why couldn’t we stop any paperclip maximizer whatsoever simply by adding the stipulation “and do so without killing, harming, or even jeopardizing a single living human being on Earth” to its specified goal? Wouldn’t this stipulation trivially prevent the paperclip maximizer from turning humans into paperclips, whether directly or indirectly? By definition, there is no goal or subgoal (or is there?) that remains dangerous to humans once that stipulation is attached.

If we create a paperclip maximizer, then, the only thing we need to do to keep it aligned is to always add that stipulation, or a similar one, to its specified goals. Of course, this would require self-control. But it would be in every researcher’s interest to always include the stipulation, since their very lives would depend on it; and this holds even if (unbeknownst to us) only 1 out of, say, every 100 requested goals would cause the paperclip maximizer to turn everyone into paperclips.