Optimization and Adequacy in Five Bullets

post by james.lucassen · 2022-06-06T05:48:03.852Z · 2 comments

This is a link post for https://jlucassen.com/optimization-adequacy-five-bullets/

Contents

  Five Bullet Points
  Main Implications
  Musings

Context: Quite recently, a lot of ideas have sort of snapped together into a coherent mindset for me. Ideas I was familiar with, but whose importance I didn't intuitively understand. I'm going to try to document that mindset real quick, in a way I hope will be useful to others.

Five Bullet Points

  1. By default, shit doesn't work. The number of ways that shit can fail to work absolutely stomps the number of ways that shit can work.
  2. This means that we should expect shit to not work, unless something forces it into the narrow set of states that actually work and do something.
  3. The shit that does work generally still doesn't work for humans. Our goals are pretty specific and complicated, so the non-human goals massively outnumber the human goals.
  4. This means that even when shit works, we should expect it to not be in our best interests unless something forces it into the narrow range of goal-space that we like.
  5. Processes that force the world into narrow, unlikely outcome ranges are called optimization processes - they are rare, and important, and not magic.
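A toy numerical sketch of the counting intuition in bullets 1 and 2 (the bit-string setup and the "all bits set" success condition are invented purely for illustration): random states essentially never land in a narrow target region, while even a very dumb optimization process gets there every time.

```python
import random

# Toy model: a "state" is 30 bits, and it only "works" if every bit is 1.
# That is one working state out of 2^30 (~10^9) possible states.
N = 30

def works(state):
    return all(state)

def random_state():
    return [random.randint(0, 1) for _ in range(N)]

# By default, shit doesn't work: random states essentially never work.
samples = 100_000
hits = sum(works(random_state()) for _ in range(samples))
print(f"random states that work: {hits} / {samples}")  # almost certainly 0

# A crude optimization process (greedy hill climbing on the number of 1-bits)
# forces an arbitrary starting state into the narrow working region every time.
def hill_climb(state):
    for i in range(N):
        if state[i] == 0:
            state[i] = 1  # each flip strictly increases the objective
    return state

print("optimized state works:", works(hill_climb(random_state())))  # True
```

The hill climbing itself isn't the point; the point is that without some process pushing toward the target, the prior probability of landing in it is negligible.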

Main Implications

The biggest takeaway is: look for optimization processes. If you want to use a piece of the world (as a tool, as an ally, as evidence, as an authority to defer to, etc.), it is important to understand which functions it actually has. In general, the functions a thing is "supposed to have" can come wildly apart from what it is actually optimized to do. If you can't find a mechanism that forces a particular thing to have a particular useful property, it probably doesn't. Examples:

The obvious first step when looking for optimization processes: learn to recognize optimization processes. This is the key to what Yudkowsky calls an adequacy argument, which is what I've been broadly calling "hey does this thing work the way I want it to?"

Musings

2 comments


comment by Alex_Altair · 2022-06-06T19:24:02.771Z

Nice post! My main takeaway is "incentives are optimization pressures". I may have had that thought before but this tied it nicely in a bow.

Some editing suggestions/nitpicks:

The bullet point that starts with "As evidence for #3" ends with a hanging "How".

Quite recently, a lot of ideas have sort of snapped together into a coherent mindset.

I would put "for me" at the end of this. It does kind of read to me like you're about to describe for us how a scientific field has recently had a breakthrough.

I don't think I'm following what "Skin in the game" refers to. I know the idiom, as in "they don't have any skin in the game" but the rest of that bullet point didn't click into place for me.

We definitely optimize for something, otherwise evolution wouldn't let us be here

I think this might be confusing "an optimizer" with "is optimized". We're definitely optimized, otherwise evolution wouldn't let us be here, but it's entirely possible for an evolutionary process to produce non-optimizers! (This feels related to the content of Risks from Learned Optimization.)
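A minimal sketch of that distinction, with a made-up fitness target and toy parameters: the outer evolutionary loop below is an optimizer, but the thing it produces is just a fixed lookup table that does no search at all - optimized, not an optimizer.

```python
import random

# Toy "is optimized" vs "is an optimizer" distinction.
TARGET = [1, 0, 1, 1, 0, 1, 0, 0]  # arbitrary "correct" response pattern

def fitness(table):
    return sum(a == b for a, b in zip(table, TARGET))

# Outer optimization process: random mutation plus selection.
table = [random.randint(0, 1) for _ in TARGET]
for _ in range(500):
    mutant = list(table)
    i = random.randrange(len(mutant))
    mutant[i] ^= 1  # flip one bit
    if fitness(mutant) >= fitness(table):
        table = mutant

# The evolved artifact just replays its table: no internal search,
# no explicitly represented objective.
def evolved_agent(situation_index):
    return table[situation_index]

print("fitness:", fitness(table), "out of", len(TARGET))
print("response to situation 3:", evolved_agent(3))
```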

capabilities/alignment

Might be worth explicitly saying "AI capabilities/AI alignment" for readers who aren't super following the jargon of the field of AI alignment.

Optimization processes are themselves "things that work", which means they have to be created by other optimization processes.

If you're thinking about all the optimization processes on earth, then this is basically true, but I don't think it's a fundamental fact about optimization processes. As you point out, natural selection got started from that one lucky replicator. But any place with a source of negentropy can turn into an optimization process.

comment by james.lucassen · 2022-06-11T18:43:53.238Z

Thanks! Edits made accordingly. Two notes on the stuff you mentioned that isn't just my embarrassing lack of proofreading:

  • The definition of optimization used in Risks From Learned Optimization is actually quite different from the definition I'm using here. They say: 

    "a system is an optimizer if it is internally searching through a search space (consisting of possible outputs, policies, plans, strategies, or similar) looking for those elements that score high according to some objective function that is explicitly represented within the system."

    I personally don't really like this definition, since it leans quite hard on reifying certain kinds of algorithms - when is there "really" explicit search going on? Where is the search space? When does a configuration of atoms constitute an objective function? Using this definition strictly, humans aren't *really* optimizers, since we don't have an explicit objective function written down anywhere. Balls rolling down hills aren't optimizers either.

    But by the definition of optimization I've been using here, I think pretty much all evolved organisms have to be at least weak optimizers, because survival is hard. You have to manage constraints from food and water and temperature and predation etc... the window of action-sequences that lead to successful reproduction is really quite narrow compared to the whole space. Maintaining homeostasis requires ongoing optimization pressure.
  • Agree that not all optimization processes fundamentally have to be produced by other optimization processes, and that they can crop up anywhere you have the necessary negentropy reservoir. I think my claim is that optimization processes are by default rare (maybe this is exactly because they require negentropy?). But since optimizers beget other optimizers at a rate much higher than background, we should expect the majority of optimization to arise from other optimization. Existing hereditary trees of optimizers grow deeper much faster than new roots spawn, so we should expect roots to occupy a negligible fraction of the nodes as time goes on.
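A quick toy model of that last claim, with rates invented just to show the shape of the argument: if existing optimizers spawn new optimizers at some per-capita rate while fresh roots appear at a roughly constant background rate, descendants compound exponentially and roots only accumulate linearly, so the root fraction shrinks toward zero.

```python
# Toy model: optimizers beget optimizers at per-capita rate r per step,
# while new "roots" appear spontaneously at a constant rate s per step.
# Both rates are made-up illustrative numbers.
r, s = 0.1, 1.0
optimizers = 1.0  # total optimizers (roots plus descendants)
roots = 1.0       # optimizers that arose spontaneously

for step in range(1, 201):
    optimizers += r * optimizers + s  # descendants compound, roots add linearly
    roots += s
    if step % 50 == 0:
        print(f"step {step:3d}: root fraction = {roots / optimizers:.5f}")
```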