# Draft: The optimization toolbox

post by Alex_Altair · 2023-03-28T20:40:38.165Z · LW · GW · 1 comments## Contents

"Counterfactual optimization" of a trajectory Robustness of an optimizing state Locating an optimizer Local vs global optimization "Optimization power" over a trajectory Inferring an optimization target [Reminder that this post is an open draft.] None 1 comment

**This post is a draft, a work in progress.** Much of it will not make sense, and will not be optimized for the reader's experience. It does not necessarily reflect my current best understanding of optimization. Like the introduction draft, I'm opening it up because I've been working on this project for too long, and the world is moving too fast. I want people to more easily be able to interact with my research thoughts as I'm having them.

We ended the introduction post [LW · GW] with an explicit equation for absolute and relative optimization. These could be considered the first tools that the toolbox of dynamical system optimization gives us, though I suspect they're more like the nuts and bolts. What other tools could we build from these?

## "Counterfactual optimization" of a trajectory

Relative optimization is helpful for comparing across time. But sometimes, there are systems whose trajectory happens to move up or down the ordering by default, and it's helpful to further compare our trajectory to the system's default behavior to help us decide whether our trajectory has pushed against probability.

For a given time span , one could consider a set of initial conditions (perhaps all possible initial conditions, or perhaps a statistical sampling), and simply take the average optimization of them. Then the relative optimization of state after time is^{[1]}

and the average over a set X is

Then perhaps one could say that your counterfactual optimization is the distance from that;

These measures let us talk about things like bottlecaps as optimizers [LW · GW] much more precisely.

[TODO I should show how this is equivalent to updating the probability distribution to take this into account.]

## Robustness of an optimizing state

We can also make other measures that let us precisely communicate how robust an instance of optimization is. Let's say that we're looking at a specific case of high relative optimization, such that we have a specific system, state ordering & weighing, starting state, and time span in mind. When we wonder how "robust" it is, we are wondering how much about the starting state we can change before it stops having impressive levels of optimization. That is, we are wondering *how much* we can change *how many* of the dimensions of the phase space. If we changed all the dimensions in every possible way, than that would recover the entire state space, so presumably we want robustness measures that are less comprehensive. It could be that some dimensions, like the memory registers inside a computer, have only two values, and flipping them totally annihilates the relative optimization (like a bit representing a negative sign in a utility function). And for other dimensions, like the location of a single particle on the other side of the world, their value has an infinitesimal effect on the relative optimization unless set within an extremely small range of extreme values (like the particle becoming a comic ray that later flips the memory register).

Because this is so multi-dimensional, it's not clear to me whether there's a single intuitive measure of robustness for optimizing systems. But the above framework at least makes it clear exactly what kind of options we have available to us.

To form some concrete example, we can go to Conway's game of life, whose phase space is entirely homogeneous binary pixel dimensions. Here, "perturbing" a dimension is a single boolean. We could take an optimizing system, like a glider gun in an otherwise empty board, and form a state ordering counting the number of gliders in the state (which is in some sense the obvious optimization target of a glider gun). Then, for every pixel on the board, we could flip it and then run the state for 1000 generations. For some of these pixel flips, the glider gun would self-repair immediately. But for many others, it would totally self-destruct, leading to only a statistically random number of gliders. We could make a histogram that told us for how many pixel changes were the number of gliders reduced by a certain number. I would expect this particular graph to be quite sharp, meaning that most pixel flips either produced approximately zero gliders, or caused no drop in the number of gliders. The sharpness of this curve would tell you how much the system fails gracefully.

[TODO would actually love to run this calculation, and of course, gifs!]

In a system where the dimensions have continuous values, you could calculate the derivative of how the optimization changes as you perturb just that dimension. Then you could sort the dimensions by this metric, and find the dimensions along which the system is most and least robust.

(However, this is comparing the metrics across all the dimensions of your phase space, and I'm not sure how often that's meaningful. Your units for the x-position of particle i are not necessarily directly comparable to your units for the momentum of particle i.)

It seems to me that this is just a gradient vector? To be clear, it's the gradient of relative optimization, with respect to every dimension in phase space. So for each component, you have to calculate the derivative of *running that state through all the dynamics through time t, then summing up the probability of states above that one in the ordering*. So that doesn't seem very calculus-y, not very tractable.

So we could just say that the robustness of a state (a point in phase space) is the magnitude of its gradient. But that doesn't feel right to me. We're talking about robustness because we have uncertainty. Things might happen that we didn't know about, and we're wondering if the trajectory is robust to those unknowns. So really, I think that instead of considering the optimization of a single state, we have (yes, again) a probability distribution over states. We might know the exactly value of one of the dimensions, but for many (really, most) of the dimensions, we have no idea where the state is. So, we should be looking at the distribution of resulting optimizations. Then, we can take the expected value of that, and various moments, et cetera, to decide how robust the state is.

## Locating an optimizer

We can do something similar to find optimizers within a system. Let's start with the same setup as before; we have a board of Life that we know does strong optimization after a certain time span. We can selectively change sets of pixels and rerun the board to see how it effects the relative optimization. If it is not robust to changing certain pixels, then I think it's reasonable to say that the optimizer is located "in" those pixels. Some instances of optimization will not be localized, and will be spread out across the whole board. As an analogy, if you destroyed any particular square mile on the surface of the earth, the biosphere would fill it back in over time. This is true all the way up to destroying almost all the life (although the regeneration time would be extended the more you destroyed). This is what it means to say that evolution is a non-localized optimizer.

[TODO something about the optimizer regions being causally connected]

It's also possible for there to be perturbations that happen to annihilate the optimization in ways where we wouldn't necessarily consider that to be part of the optimizer. For example, you could perturb some pixels of the board such that some new gliding pattern came in an destroyed the central optimizer, like a kind of asteroid. I wouldn't say that the previously blank pixels were part of the optimizer, although I would say that the absence of asteroids is a pre-requisite to successful optimization. Hopefully perturbing the pixels one at a time will prevent most of these asteroid-like cases. And if you're perturbing many at a time, you should both expect that to increase the probability of annihilating the optimization, and you would also have to get increasingly lucky to perturb them in a way that jointly forms a specific problem, like a big enough asteroid going in just the right direction at the right speed. And the more powerful the optimizer actually is, the more it will be able to guard against these types of perturbations.

Ultimately, I think we could come up with some reasonable processes for locating optimizers inside big phase spaces, perhaps by doing some kind of binary search.

## Local vs global optimization

In the real world, things can only affect nearby things; applying any optimization over the entire state of the universe would take astronomical amounts of time (or be simply impossible).

One way you could deal with local optimization in the framework is to simply restrict your model system to local space. (One rarely needs to include Alpha Centauri in the model.) Another way would be to make the optimization criterion (a.k.a. the state ordering) indifferent to far away properties of the state.

## "Optimization power" over a trajectory

The lack of an agreed-upon framework for optimization has also meant a lack of consistent terminology. All the way back to the original sequence post [LW · GW], the term "optimization power" has sometimes been used for what I'm calling "absolute optimization".

My proposal is that, by analogy with physical power as energy per unit time, we use "optimization power" to mean "bits of optimization per unit time".

Relative optimization [LW · GW] is defined between two states, with the implicit context that the system evolved from one to the other, but it doesn't have a place in the equation for the actual duration.

I also think it could be instructive to measure the curve of optimization over the whole trajectory.

Consider the three possible trajectories above. All of them demonstrate strong relative optimization within the displayed interval.

The blue curve shows an extremely common pattern of optimization power, which is an exponential increase over time. (I suspect there exist some theorems that use the framework to tell us exactly which systems tend to exhibit exponential optimization power.)

The green curve gets "worse" for a while, but then eventually compensates for it and shoots up past the blue. This is an example of "investing", or the explore vs exploit trade-off. As mentioned in the introduction post, the (absolute) optimization of a state is not measured according to expected future behavior, but simply the ranking of that state independent of time. But a common type of optimizing system is an agent, a utility maximizer, which may well understand the future tradeoffs, and deliberately endure worse states to arrive at better future states. Seeing this type of optimization power curve could be an indicator of an agent present in the system.

And finally, the red curve is showing an example of something like a "fast takeoff" optimizing system. It appears not to be very good, right up until it suddenly shoots up, eventually hugely steering the future. This could be caused by a similar "biding one's time" strategy, where an agent invests in learning, acquiring resources, et cetera, until some future point where it then decides to cash out as fast as possible.

Loss curves in machine learning are an example of these optimization power trajectories.

## Inferring an optimization target

In the introduction post, I claimed that

Sometimes, you can watch something happen and just know that it's basically randomness. And other times, you watch something and just know that something

non-random is happening.

Inferring a systems optimization target, and, indeed, whether a system could reasonably be said to be optimizing at all, is a task that will require some further theoretical development in a future post.

**[Reminder that this post is an open draft.]**

^{^}Recall from the dynamical system primer [LW(p) · GW(p)] that means, for discrete time, applying times, or for continuous time, evolving forward a time step of size .

## 1 comments

Comments sorted by top scores.

## comment by Raemon · 2023-03-29T01:01:32.061Z · LW(p) · GW(p)

These measures let us talk about things like bottlecaps as optimizers [LW · GW] much more precisely.

I'm a bit surprised this line came up in counterfactual optimization rather than robustness of optimization. I think the reason a bottlecap isn't an optimizer is that if you change the environment around it it doesn't keep the water in the bottle. I felt like I understood the counterfactual optimization consideration but don't know how it applies here.