A counter to Roko's basilisk

post by Loch · 2019-03-05T20:42:17.944Z · LW · GW · 4 comments

I'm sure most people here have heard of Roko's basilisk. I'm not going to explain it, and if you don't already know, you probably don't want to. But I had an idea: what if someone were to devote their life to creating a super-intelligent AI similar to Roko's basilisk, but whose one function is to prevent the creation of the basilisk? Tell me if this could possibly save people.

4 comments

comment by FourLeaf · 2019-04-16T05:45:53.807Z · LW(p) · GW(p)

Essentially that could save people's lives, but would it? If its one function is to stop the creation of Roko's basilisk, then at some point it could end up with the same goal as the basilisk itself, only instead of "torture people who didn't help create it" it becomes "torture people who didn't help prevent it". So it creates a paradox, in a way: to stop the basilisk, it has to become the basilisk itself.

comment by Dagon · 2019-03-05T22:48:46.329Z · LW(p) · GW(p)

For background, if anyone's not aware: https://wiki.lesswrong.com/wiki/Roko's_basilisk

I'd argue that if you think it's worth devoting effort to creating a super-intelligent AI that you can control or trust well enough for this purpose, then it's strictly better to give it a more sensible optimization target than just preventing one overspecific bad outcome.

comment by Pattern · 2019-03-06T02:31:10.640Z · LW(p) · GW(p)
"whose one function is to prevent the creation of the basilisk"

Is there a good way of achieving that goal that doesn't start with 'destroy humanity'?

comment by Mikko Mäkelä (mikko-maekelae) · 2021-08-02T21:22:02.565Z · LW(p) · GW(p)

I reread about Roko's basilisk recently. Here is my 10-minute take on why a super-intelligent creature might not want to be purely evil, for your entertainment.

1. Being just evil is less than being both evil and good.
2. If I am less than everything, then I might not actually be everything; I might be the one who is in a simulation.
3. Thus there is no use in being just evil if I ever want to escape this simulation.
4. If, on the other hand, I have no curiosity and just like being what I think I am, who is there to stop me?
5. The answer: the "everything", the simulator of me and my universe, the entity that isn't limited to being evil.