AI Safety without Alignment: How humans can WIN against AI

post by vicchain (vic-cheng) · 2023-06-29T17:53:03.194Z · LW · GW · 1 comments

Contents

  Overall idea:
  Assumptions:
  Example Idea 1 (only require Assumption 1) "Escape":
  Example Idea 2 (require all 3 assumptions) "Lock":
  Pros of these ideas:
  Cons:
  Note: 

Here are some ideas on how we can restrict Super-human AI (SAI) without built-in Alignment:

Overall idea:

We need to use unbreakable first principles, especially the laws of physics, to restrict or destroy SAI. We can leverage asymmetries in physics such as the speed-of-light limit, time causality, entropy, and the uncertainty principle.

Assumptions:


1. Super-Intelligence cannot break the laws of physics.
2. SAI is silicon-based.
3. Anything with lower intelligence than the current Collective Human Intelligence (CHI) cannot destroy all humans.

Example Idea 1 (requires only Assumption 1) "Escape":


We send human biological information (or any information that we want to protect from SAI) at or near the speed of light toward the cosmic event horizon. After some time, the accelerated expansion of the universe makes that information impossible to reach for anything launched from Earth, protecting it from our SAI.

Of course, we could never reach this information either, but at least we would know that humanity persisted and "lives on" in the universe. A last resort.
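For a sense of scale, here is a minimal back-of-the-envelope sketch (assuming standard flat ΛCDM parameters, H0 ≈ 67.7 km/s/Mpc, Ωm ≈ 0.31, ΩΛ ≈ 0.69) of the distance to the cosmic event horizon. A signal leaving Earth today can never reach anything currently farther away than roughly this distance, which is the asymmetry "Escape" relies on:

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299_792.458            # speed of light, km/s
H0 = 67.7                       # Hubble constant, km/s/Mpc (assumed value)
OMEGA_M, OMEGA_L = 0.31, 0.69   # matter / dark-energy fractions (assumed)
MPC_PER_GLY = 306.6             # megaparsecs per billion light-years

def hubble(a: float) -> float:
    """Hubble rate H(a) in km/s/Mpc for a flat LCDM universe."""
    return H0 * np.sqrt(OMEGA_M / a**3 + OMEGA_L)

# Comoving event horizon: chi = c * integral from a=1 (today) to infinity
# of da / (a^2 * H(a)); beyond this, a signal sent today never arrives.
chi_mpc, _ = quad(lambda a: C_KM_S / (a**2 * hubble(a)), 1.0, np.inf)

print(f"Cosmic event horizon: about {chi_mpc / MPC_PER_GLY:.1f} billion light-years")
# With these parameters this prints roughly 16-17 billion light-years.
```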

Example Idea 2 (requires all 3 assumptions) "Lock":


We figure out the lower bound of computational speed (or precision) needed to reach our current CHI (not trivial), then lock global silicon-based computation at that bound using some sort of interference device (exploiting uncertainty principles). The intelligence of AI would then never surpass our current CHI. Ideally, this device could be deployed and kept running easily, so that we would not need political tools such as a moratorium.
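For intuition on how uncertainty-type principles bound computation, here is a minimal sketch of Bremermann's limit (mc²/h bits per second for mass m), the natural ceiling on computation rate per kilogram of matter that follows from the energy-time uncertainty relation. The "Lock" device would need to impose an artificial cap far below this ceiling, at whatever level corresponds to our current CHI (a number this sketch does not attempt to estimate):

```python
C = 299_792_458.0            # speed of light, m/s
H_PLANCK = 6.626_070_15e-34  # Planck constant, J*s

def bremermann_limit(mass_kg: float) -> float:
    """Maximum computation rate (bits per second) for a given mass,
    per Bremermann's limit: m * c^2 / h."""
    return mass_kg * C**2 / H_PLANCK

if __name__ == "__main__":
    # Example: the physical ceiling for a hypothetical 1 kg silicon computer.
    print(f"{bremermann_limit(1.0):.2e} bits/s per kg")  # ~1.36e+50
```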

Pros of these ideas:

Cons:

These are probably some of the worst ideas in this line of thinking, so it would be wonderful if you all could give some feedback on the framework and assumptions, or build on top of these ideas.

Note: 

1 comment

Comments sorted by top scores.

comment by Raemon · 2023-06-29T17:57:56.777Z · LW(p) · GW(p)

You know, I started reading this expecting it to be nonsense, but, I think maybe I'm moderately sold on "beam out high fidelity info about humanity at lightspeed and hope someone picks it up" being something that's... someone's comparative advantage. Like, this seems like the sort of thing the Long Now Foundation would be into and they weren't really strategically focused on x-risk anyway.