CCS: Counterfactual Civilization Simulation

post by Pi Rogers (pi-rogers) · 2024-05-02T22:54:29.773Z · LW



I don't think it is very likely to work, but one possible path to alignment is formal goal alignment, which is basically the following two-step plan:

  1. Define a formal goal that robustly leads to good outcomes under heavy optimization pressure
  2. Build something that robustly pursues the formal goal you give it

I think the best current proposal for step 1 is QACI. In this post, I propose an alternative that is probably worse but definitely not Pareto-worse.

High-Level Overview

Step 1.1: Build a large facility ("The Vessel"). Populate The Vessel with very smart, very sane people (e.g. Eliezer Yudkowsky, Tamsin Leake, Gene Smith) and labs and equipment that would be useful for starting a new civilization.

Step 1.2: Mark The Vessel with something that is easy to identify within the Tegmark IV multiverse ("The Vessel Flag").

Step 1.3: Leave the people and equipment in The Vessel for a little while, and then destroy The Vessel Flag and dismantle The Vessel.

Step 2: Define CCS as the result of the following:

Step 2.1: Grab The Vessel out of a Universal Turing Machine, identifying it by The Vessel Flag (this is the very, very hard part).

Step 2.2: Locate the solar system that contains The Vessel, and run it back 2 billion years. (this is another very hard part)

Step 2.3: Put The Vessel on the Earth in this solar system, and simulate the solar system until either a success condition or a failure condition is met. The idea here is that the Vessel's inhabitants repopulate the Earth with a civilization much smarter and saner than ours that will have a much easier time solving alignment. More importantly, this civilization will have effectively unlimited time to solve alignment.

Step 2.4: The success condition is the creation of The Output Flag. Accompanying the Output Flag is some data. Interpret that data as a mathematical expression.

Step 2.5: Evaluate this expression and interpret the result as a utility function.

Step 3: Build a singleton AI that maximizes E[CCS(world)].
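
To make the shape of Steps 2.3 through 3 concrete, here is a toy sketch in Python. It is not a proposal for how any of this would actually be implemented: the hard parts (Steps 2.1 and 2.2) are skipped entirely, and every name in it (`run_ccs`, `toy_step`, `expression_to_utility`, `best_action`, the `flourishing` and `suffering` features) is a hypothetical stand-in. It assumes a world can be summarized as a dictionary of named numbers and that the Output Flag's data is a plain arithmetic expression over those names.

```python
import ast
import operator
from typing import Callable, Dict, List, Optional, Tuple

World = Dict[str, float]            # assumption: a world is a bag of named numbers
Utility = Callable[[World], float]

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def run_ccs(step: Callable, state, max_steps: int) -> Optional[str]:
    """Step 2.3 (toy): advance the simulation until success or failure.

    `step(state)` must return (new_state, status, data), where status is
    "running", "success", or "failure". Returns the Output Flag data on
    success and None otherwise (including when max_steps runs out).
    """
    for _ in range(max_steps):
        state, status, data = step(state)
        if status == "success":
            return data
        if status == "failure":
            return None
    return None


def expression_to_utility(expr: str) -> Utility:
    """Steps 2.4-2.5 (toy): read the data as an arithmetic expression over
    world features and return it as a utility function."""
    tree = ast.parse(expr, mode="eval").body

    def ev(node: ast.AST, world: World) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.Name):
            return world[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left, world), ev(node.right, world))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand, world)
        raise ValueError("unsupported expression")

    return lambda world: ev(tree, world)


def expected_utility(utility: Utility, outcomes: List[Tuple[float, World]]) -> float:
    """E[CCS(world)] for one action: a probability-weighted sum over worlds."""
    return sum(p * utility(w) for p, w in outcomes)


def best_action(utility: Utility, actions: Dict[str, List[Tuple[float, World]]]) -> str:
    """Step 3 (toy): pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(utility, actions[a]))


def toy_step(year: int):
    # Hypothetical stand-in for the simulated civilization: it raises the
    # Output Flag once the simulation reaches year 3.
    if year >= 3:
        return year, "success", "2 * flourishing - suffering"
    return year + 1, "running", None
```

Calling `run_ccs(toy_step, 0, max_steps=10)` returns the expression string, `expression_to_utility` turns it into a function on worlds, and `best_action` compares actions by their expected utility. The only point of the sketch is the data flow: simulation output, then expression, then utility function, then expectation maximization.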

The Details

TODO: I will soon either update this post or make more posts with more details as I come up with them.


