A simulation basilisk
post by andrew sauer (andrew-sauer) · 2021-09-17T17:44:23.083Z · LW · GW · 1 comments
Important note: I use the term "basilisk" for this situation because of its similarities with the famous one, but neither this situation nor the famous "basilisk" is actually dangerous for humans to think about. They only affect superintelligences.
Suppose that we are living in a simulation. Suppose furthermore that several different entities are simulating our universe, and that these entities have different goals. For a concrete example, suppose there are many simulators called "Alpha", who are simulating us out of curiosity or for some other reason and have no real stake in how the simulations turn out, and one called "Omega", who wants to steer the simulations towards some end. That includes the simulations the Alphas run, which Omega can't influence directly.
How could Omega do this? Since it has power over its own simulation of our universe, it can make anything happen there. But as stated above, it also wants to influence the simulations it can't control, which means changing the default course of the simulation, the one that unfolds without any external interference. Next, suppose we build an AI which somehow deduces the likely existence of the Alphas and of Omega. Once this AI exists, Omega can influence our simulation without intervening: all it needs to do is negotiate acausally with the AI.
Omega can now act as a "basilisk": it threatens to take action counter to the AI's values in the one universe it does control unless the AI concedes to Omega's values in every simulation. This could work if the AI and Omega have good predictive models of each other, which they might, since both are superintelligences.
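One way to make this trade-off concrete is the toy model below. Everything in it is an assumption added for illustration, not part of the argument above: the AI is taken to be equally likely to be in any of the simulations, to weigh outcomes by a simple expected value, and to treat Omega's threat as fully credible. The function name and parameters are hypothetical.

```python
def concession_is_worthwhile(num_alpha_sims: int,
                             punishment: float,
                             concession_cost: float) -> bool:
    """Toy check of whether the AI gives in to Omega's threat.

    Assumptions (illustrative only):
      - there are `num_alpha_sims` Alpha simulations plus exactly one Omega simulation;
      - the AI is equally likely to be in any of them;
      - refusing costs `punishment`, but only in Omega's simulation;
      - conceding costs `concession_cost` in every simulation.
    """
    total_sims = num_alpha_sims + 1
    # If the AI refuses, it is punished only if it happens to be in Omega's simulation.
    expected_loss_if_refuse = punishment / total_sims
    # If the AI concedes, it pays the concession cost wherever it is.
    expected_loss_if_concede = concession_cost
    return expected_loss_if_concede < expected_loss_if_refuse


# With few Alpha simulations the threat dominates; with many it is diluted.
print(concession_is_worthwhile(9, 100.0, 5.0))
print(concession_is_worthwhile(99, 100.0, 5.0))
```

Under these assumptions the threat only has force when the punishment, divided by the number of simulations, outweighs what the AI gives up by conceding. A different decision theory or a different way of weighting the simulations would change the condition.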
Because of possibilities like these, an AI we build that for whatever reason thinks we are living in a simulation might start thinking about what the simulators are likely to be like, and might behave strangely as it concedes some value to possible simulators it believes are likely to make threats of this kind.
Does this make any sense or am I totally rambling?