Is Fisherian Runaway Gradient Hacking?

post by Ryan Kidd (ryankidd44) · 2022-04-10T13:47:16.454Z · LW · GW · 6 comments


TL;DR: No; there is no directed agency that enforces sexual selection through an exploitable proxy. However, Fisherian runaway is an insightful example of the path-dependence of local search, where an easily acquired and apparently useful proxy goal can be so strongly favored that disadvantageous traits emerge as side effects.

Why are peacocks so heavily ornamented that they face a greatly increased risk of predation? How could natural selection favor such energetically expensive plumage that offers no discernible survival advantage? The answer is “sex”, or more poetically, “demons in imperfect search”.

Fisherian runaway is a natural process in which an easy-to-measure proxy for a “desired” trait is “hacked” by the optimisation pressure of evolution, leading to “undesired” traits. In the peacock example, a more ornamented tail could serve as a highly visible proxy for male fitness: peacocks that survive with larger tails are more likely to be agile and good at acquiring resources for energy. Alternatively, perhaps a preference for larger tail size is randomly acquired. In any case, once sexual selection by peahens has zeroed in on “plumage size” as a desirable feature, males with more plumage will likely have more offspring, reinforcing the trait in the population. Consequently, females are further driven to mate with large-tailed males, as their male offspring will have larger tails and thus be more favored by mates. This selection process may then “run away” and produce peacocks with ever larger tails via positive feedback, until the fitness detriment of this trait exceeds the benefit of selecting for fitter birds.
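The feedback loop described above can be sketched with a deliberately minimal, deterministic recursion for the mean male trait and the mean female preference. This is a toy model loosely in the spirit of quantitative-genetic treatments of sexual selection, not a fitted or canonical one: the update rule, the function name `simulate_runaway`, and every parameter value are assumptions chosen for illustration. In this sketch the trait runs away only when `pref_strength * covariance` exceeds `cost * trait_var`; otherwise the net selection gradient decays and the trait settles.

```python
def simulate_runaway(generations, trait_var, covariance,
                     pref_strength=1.0, cost=0.5,
                     trait=0.1, preference=0.1):
    """Iterate the mean male trait and mean female preference.

    Toy model: all names and default values are illustrative assumptions.
    Returns a list of (trait, preference) pairs, one per generation,
    including the starting point.
    """
    history = [(trait, preference)]
    for _ in range(generations):
        # Net selection on the trait: mate choice pushes it up,
        # the survival cost pulls it back toward zero.
        gradient = pref_strength * preference - cost * trait
        # The trait responds directly to selection on it.
        trait += 0.5 * trait_var * gradient
        # The preference is not selected directly; it changes only through
        # its (assumed) genetic covariance with the trait.
        preference += 0.5 * covariance * gradient
        history.append((trait, preference))
    return history


# Weak trait-preference covariance: the gradient decays and the trait settles.
stable = simulate_runaway(50, trait_var=1.0, covariance=0.2)
# Strong covariance: the gradient grows every generation (runaway).
runaway = simulate_runaway(50, trait_var=1.0, covariance=0.8)
print("stable final trait: ", round(stable[-1][0], 3))
print("runaway final trait:", round(runaway[-1][0], 3))
```

The sketch omits the eventual halt of the runaway, which the text attributes to fitness costs overtaking the benefit; capturing that would require a cost that grows faster than linearly with the trait, which is left out here to keep the example short.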

In outsourcing to sexual selection, natural selection has found an optimization demon. The overall decrease in peacock fitness is possible because the sexual selection pressure of the peahen locally exceeds the selection pressure imposed by predation and food availability. Peacocks have reached an evolutionary “dead-end”, where a maladaptive trait is dominant and persistent. If peacocks were moved “off distribution” to an environment where predation was harsher or food more scarce, they would fare significantly worse than their less ornamented, “unsexy” ancestors.

Gradient hacking is a process by which an internally acquired “mesa-optimizer” might compromise the optimization process of stochastic gradient descent (SGD) in a machine learning system. A mesa-optimizer might accomplish this by:

  1. Introducing a countervailing, “artificial” performance penalty that “masks” the performance benefits of ML modifications that do well on the SGD objective, but not on the mesa-objective;
  2. “Spoofing” performance benefits of certain ML modifications that are desirable to the mesa-objective by withholding performance gains until their implementation; or
  3. In a reinforcement learning context, selectively sampling environmental states that will either leave the mesa-objective unchanged or "steer" the ML model in a way that favors the mesa-objective.
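As a rough caricature of the first mechanism in the list above, consider a single parameter trained by gradient descent on a quadratic task loss, with and without an extra penalty that anchors the parameter elsewhere. Real gradient hacking would require the model itself to produce such a penalty through its own computation; here the penalty is simply written into the loss by hand, so this shows only the "masking" effect on the optimizer, not how a mesa-optimizer would achieve it. The names `descend`, `task_grad`, `masked_grad`, `anchor`, and `strength` are hypothetical.

```python
def descend(grad_fn, theta=0.0, lr=0.01, steps=2000):
    """Plain gradient descent on one scalar parameter."""
    for _ in range(steps):
        theta -= lr * grad_fn(theta)
    return theta


def task_grad(theta):
    # Gradient of the outer (task) loss (theta - 1)^2, whose optimum is theta = 1.
    return 2.0 * (theta - 1.0)


def masked_grad(theta, anchor=0.0, strength=10.0):
    # Task gradient plus the gradient of an assumed penalty
    # strength * (theta - anchor)^2 that resists moving away from `anchor`.
    return task_grad(theta) + 2.0 * strength * (theta - anchor)


honest = descend(task_grad)    # converges to the task optimum
masked = descend(masked_grad)  # stalls near the anchored value
print("without penalty:", round(honest, 4))
print("with penalty:   ", round(masked, 4))
```

With these assumed values the penalized run settles where the two gradients cancel, at theta = 1/11, far from the task optimum at 1: from the optimizer's point of view, moving toward the task optimum no longer looks beneficial.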

Mesa-optimization might be an “easily acquired policy” for good performance on a sufficiently complex ML task. Many mesa-objectives that allow for good performance in training may point to a proxy that, when optimized for in deployment, leads to undesirable behavior. Worse still is the case where a mesa-optimizer is instrumentally motivated to “deceive” the SGD objective because it has acquired both a mesa-objective that is misaligned with the outer objective, and the capability to retain or achieve the mesa-objective via gradient hacking.

Fisherian runaway seems similar to the first gradient hacking mechanism in that:

Fisherian runaway seems unlike gradient hacking in that:

Fisherian runaway offers the following insights for AI alignment:

Fisherian runaway in peacock plumage is a surprisingly useful "intuition pump" for exploring gradient hacking. I suspect there are many further examples of possible Fisherian runaway processes in nature that could be mined for useful insight, such as that discussed here. Ecological models that favor Fisherian runaway might be adapted into useful mathematical approximations of gradient hacking and allow this phenomenon to be instantiated and studied in minimal ML models.

6 comments


comment by Thomas Sepulchre · 2022-04-11T09:17:08.313Z · LW(p) · GW(p)

TL;DR: Tailed peacocks make better female chicks

Let's, for a moment, pretend to be a peahen choosing a sexual mate. We have a few options, with different degrees of impressive tails. As stated in the post, it is difficult to tell whether the tail is a good proxy for fitness. Indeed one could either argue that having a big tail is a handicap for the peacock, limiting agility for example, or that it is a strong hint that the peacock is otherwise very fit, despite the big tail. I would argue that, given the information we have, i.e. all the potential male mates survived so far, we shouldn't assume a higher/lower fitness between them.

But, why do we care anyway? We are not interested in the fitness of our future mate, but rather in the fitness of our future chicks. And here, I think, the tail is relevant.

If we have male chicks, the choice of a mate will influence both the size of their tail and other characteristics like the ability to find food, agility and so on. As before, it doesn't seem that the tail is a reasonable proxy on how to produce better male chicks.

If we have female chicks, the story is very different. A female chick will partially inherit the agility and general ability to survive from the mate we will choose, but will not inherit the handicap of a big tail. Therefore, we should choose the mate with the biggest tail.

comment by Oliver Sourbut · 2022-04-10T22:08:58.140Z · LW(p) · GW(p)

Interesting! I came to a similar conclusion (with less detail) in a post about real-life gradient hacking, which contains some other possible examples you might also be interested in (very un-elaborated).

comment by AlexMennen · 2022-04-10T14:47:49.088Z · LW(p) · GW(p)

Fisherian runaway doesn't make any sense to me.

Suppose that each individual in a species of a given sex has some real-valued variable , which is observable by the other sex. Suppose that, absent considerations about sexual selection by potential mates for the next generation, the evolutionarily optimal value for  is 0. How could we end up with a positive feedback loop involving sexual selection for positive values of , creating a new evolutionary equilibrium with an optimal value  when taking into account sexual selection? First the other sex ends up with some smaller degree of selection for positive values of  (say selecting most strongly for ). If sexual selection by the next generation of potential mates were the only thing that mattered, then the optimal value of  to select for is , since that's what everyone else is selecting for. That's stability, not positive feedback. But sexual selection by the next generation of potential mates isn't the only thing that matters; by stipulation, different values of  have effects on evolutionary fitness other than through sexual selection, with values closer to  being better. So, when choosing a mate, one must balance the considerations of sexual selection by the next generation (for which  is optimal) and other considerations (for which  is optimal), leading to selection for mates with  being evolutionarily optimal. That's negative feedback. How do you get positive feedback?

Replies from: ryankidd44
comment by Ryan Kidd (ryankidd44) · 2022-04-10T15:48:32.909Z · LW(p) · GW(p)

In the context of your model, I see two potential ways that Fisherian runaway might occur:

  1. Within each generation, males that survive with a higher trait value are consistently fitter on average than males that survive with a lower one, because the fitness required to survive increases monotonically with the trait value. Therefore, in every generation, choosing males with a higher trait value is a good proxy for local improvements in fitness. However, the performance detriments of a high trait value "off-distribution" are never signalled. In an ML context, this is basically distributional shift via proxy misalignment.
  2. Positive feedback that negatively impacts fitness "on-distribution" might occur temporarily if selection for a higher trait value is so strong that it has "acquired momentum", ensuring that females keep selecting higher-valued males for several generations past the point at which the trait becomes net costly for fitness. This is possible if the negative effects of the trait take longer to manifest as selection pressure than the time window during which sexual selection boosts the trait via preferential mating. This mechanism is only temporary, but I can see search processes halting prematurely in an ML context.
Replies from: AlexMennen
comment by AlexMennen · 2022-04-10T18:26:11.085Z · LW(p) · GW(p)
  1. By "optimal", I mean in an evidential, rather than causal, sense. That is, the optimal value is that which signals greatest fitness to a mate, rather than the value that is most practically useful otherwise. I took Fisherian runaway to mean that there would be overcorrection, with selection for even more extreme traits than what signals greatest fitness, because of sexual selection by the next generation. So, in my model, the value of  that causally leads to greatest chance of survival could be , but high values for  are evidence for other traits that are causally associated with survivability, so  offers best evidence of survivability to potential mates, and Fisherian runaway leads to selection for . Perhaps I'm misinterpreting Fisherian runaway, and it's just saying that there will be selection for  in this case, instead of over-correcting and selecting for ? But then what's all this talk about later-generation sexual selection, if this doesn't change the equilibrium?
  2. Ah, so if we start out with an average , standard deviation , and optimal , then selecting for larger  has the same effect as selecting for  closer to , and that could end up being what potential mates do, driving  up over the generations, until it is common for individuals to have positive , but potential mates have learned to select for higher ? Sure, I guess that could happen, but there would then be selection pressure on potential mates to stop selecting for higher  at this point. This would also require a rapid environmental change that shifts the optimal value of ; if environmental changes affecting optimal phenotype aren't much faster than evolution, then optimal phenotypes shouldn't be so wildly off the distribution of actual phenotypes.
Replies from: ryankidd44
comment by Ryan Kidd (ryankidd44) · 2022-04-11T17:59:11.800Z · LW(p) · GW(p)

I think it's important to distinguish between "fitness as evaluated on the training distribution" (i.e. the set of environments ancestral peacocks roamed) and "fitness as evaluated on a hypothetical deployment distribution" (i.e. the set of possible predation and resource scarcity environments peacocks might suddenly face). Also important is the concept of "path-dependent search" when fitness is a convex function on which biases local search towards , but has global minimum at .

  1. In this case, I'm imagining that Fisherian runaway boosts as long as it still indicates good fitness on-distribution. However, it could be that is the "local optimum for fitness" and in reality is the global optimum for fitness. In this case, the search process has chosen an initial -direction that biases sexual selection towards . This is equivalent to gradient descent finding a local minimum.
  2. I think I agree with your thoughts here. I do wonder if sexual selection in humans has reached a point where we are deliberately immune to natural selection pressure due to such a distributional shift and acquired capabilities.
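The path dependence discussed in this thread, where local search settles in a local optimum because of its starting direction, can be illustrated with gradient descent on a one-dimensional double-well function. The function, starting points, and step size below are assumptions picked for illustration; they are not taken from the model under discussion.

```python
def f(x):
    # Assumed double-well "loss": a shallow minimum at positive x
    # and a deeper (global) minimum at negative x.
    return x**4 - 2.0 * x**2 + 0.5 * x


def f_grad(x):
    # Derivative of f.
    return 4.0 * x**3 - 4.0 * x + 0.5


def gradient_descent(x, lr=0.01, steps=2000):
    """Follow the local downhill direction from the starting point x."""
    for _ in range(steps):
        x -= lr * f_grad(x)
    return x


from_right = gradient_descent(0.5)   # starts on the slope of the shallow well
from_left = gradient_descent(-0.5)   # starts on the slope of the deep well
print("start  0.5 -> x =", round(from_right, 3), " f =", round(f(from_right), 3))
print("start -0.5 -> x =", round(from_left, 3), " f =", round(f(from_left), 3))
```

Both runs end at points where the gradient vanishes, but only the run that starts on the left reaches the deeper well; the other stays in the shallow one because nothing in the local gradient signals that a better minimum exists elsewhere.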