Ivan Vendrov's Shortform

post by Ivan Vendrov (ivan-vendrov) · 2025-04-08T14:16:20.419Z · LW · GW · 9 comments

comment by Ivan Vendrov (ivan-vendrov) · 2025-04-08T14:16:20.419Z · LW(p) · GW(p)

Are instrumental convergence & Omohundro drives just plain false? If Lehman and Stanley are right in "Novelty Search and the Problem with Objectives" (https://www.cs.swarthmore.edu/~meeden/DevelopmentalRobotics/lehmanNoveltySearch11.pdf), later popularized in their book "Why Greatness Cannot Be Planned", then VNM-coherent agents that pursue goal stability will reliably be outcompeted by incoherent search processes that pursue novelty.
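
(For concreteness, the core of the novelty search they describe is roughly the following. This is a minimal sketch in my own words, not their code; it assumes a 1-D behavior descriptor, and all function names are made up. The point is that selection is driven by behavioral novelty relative to an archive of past behaviors, with no objective in the loop.)

```python
import random

def novelty(behavior, archive, k=15):
    """Novelty = mean distance to the k nearest behaviors seen so far."""
    if not archive:
        return float("inf")
    dists = sorted(abs(behavior - b) for b in archive)
    return sum(dists[:k]) / min(k, len(dists))

def novelty_search(mutate, evaluate_behavior, init, generations=100, pop_size=50):
    """Select for behavioral novelty instead of progress on an objective."""
    population = [init() for _ in range(pop_size)]
    archive = []
    for _ in range(generations):
        behaviors = [evaluate_behavior(x) for x in population]
        ranked = sorted(zip(population, behaviors),
                        key=lambda xb: novelty(xb[1], archive), reverse=True)
        # archive the most novel behaviors, then breed from the most novel individuals
        archive.extend(b for _, b in ranked[: pop_size // 10])
        parents = [x for x, _ in ranked[: pop_size // 2]]
        population = [mutate(random.choice(parents)) for _ in range(pop_size)]
    return archive
```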

Replies from: abramdemski, kh, Amyr
comment by abramdemski · 2025-04-08T15:14:22.362Z · LW(p) · GW(p)

Pursuit of novelty is not VNM-incoherent. Furthermore, it is an instrumentally convergent drive; power-seeking agents will seek novelty as well, because learning increases power in expectation (see: value of information).
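
(A toy value-of-information calculation, with made-up numbers, just to make the "learning increases power in expectation" point concrete:)

```python
# Two actions, two equally likely world states; payoffs are made up.
payoff = {"A": {"act1": 10, "act2": 0}, "B": {"act1": 0, "act2": 8}}
p = {"A": 0.5, "B": 0.5}

# Acting blind: commit to the action with the best expected payoff.
ev_blind = max(sum(p[s] * payoff[s][a] for s in p) for a in ("act1", "act2"))

# Acting after observing the state: take the best action in each state.
ev_informed = sum(p[s] * max(payoff[s].values()) for s in p)

print(ev_blind, ev_informed, ev_informed - ev_blind)  # 5.0 9.0 4.0
```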

The argument made in "Novelty Search and the Problem with Objectives" is based on search processes which inherently cannot do long-term planning (they are myopically trying to increase their score on the objective). These search processes don't do as well as explicit pursuit of novelty because they aren't planning to search effectively, so there's no room in their cognitive architecture for the instrumental convergence towards novelty-seeking to take place. (I'm basing this conclusion on the abstract.) This architectural limitation of most AI optimization methods is mitigated by Bayesian optimization methods (which explicitly combine information-seeking with the normal loss-avoidance).
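
(For example, a standard Bayesian-optimization acquisition rule like UCB scores candidates by predicted value plus an uncertainty bonus. A rough sketch of the idea, not any particular library's API:)

```python
def ucb(mean, std, beta=2.0):
    """Upper confidence bound: predicted value plus an exploration bonus."""
    return mean + beta * std

def propose_next(candidates, posterior, beta=2.0):
    """Pick the point that best trades off loss-avoidance and information.

    `posterior(x)` returns (predicted_mean, predicted_std) under a surrogate
    model, e.g. a Gaussian process fit to the points evaluated so far.
    """
    return max(candidates, key=lambda x: ucb(*posterior(x), beta=beta))

# Toy usage: predictions are flat, but uncertainty grows away from x=0,
# so the proposal is driven purely by information-seeking.
print(propose_next([0.0, 1.0, 2.0], lambda x: (1.0, abs(x))))  # 2.0
```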

Replies from: gwern, ivan-vendrov
comment by gwern · 2025-04-09T17:04:55.200Z · LW(p) · GW(p)

> Pursuit of novelty is not VNM-incoherent. Furthermore, it is an instrumentally convergent drive; power-seeking agents will seek novelty as well, because learning increases power in expectation (see: value of information).

Or to put it another way: any argument which convincingly proves that 'incoherent search processes ultimately outcompete coherent search processes' is also an argument which convinces a VNM agent to harness the superior incoherent search processes instead of the inferior coherent ones.

Replies from: ivan-vendrov
comment by Ivan Vendrov (ivan-vendrov) · 2025-04-10T03:42:23.688Z · LW(p) · GW(p)

"harness" is doing a lot of work there. If incoherent search processes are actually superior then VNM agents are not the type of pattern that is evolutionary stable, so no "harnessing" is possible in the long term, more like a "dissolving into".

Unless you're using "VNM agent" to mean something like "the definitionally best agent", in which case, sure. But a VNM agent is a pretty precise type of algorithm, defined by axioms that are equivalent to saying it is perfectly resistant to being Dutch booked.

Resistance to Dutch booking is cool and seems valuable, but not something I'd spend limited compute resources on getting to six nines of reliability. Seems like evolution agrees, so far: the successful organisms we observe in nature, from bacteria to humans, are not VNM agents and are in fact easily Dutch booked. The question is whether this changes as evolution progresses and intelligence increases.
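
(For concreteness, the exploit that the VNM axioms rule out is a money pump like this toy one:)

```python
# Toy money pump against cyclic preferences A > B > C > A.
# The agent happily pays a small fee for each "upgrade" and ends up
# back where it started, poorer by 3 * fee per lap.
prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # cyclic, so not VNM-coherent
fee = 0.01

holding, wealth = "B", 0.0
for offered in ["A", "C", "B", "A", "C", "B"]:
    if (offered, holding) in prefers:  # agent strictly prefers the offer
        holding, wealth = offered, wealth - fee
print(holding, round(wealth, 2))  # back to "B", down 0.06
```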

comment by Ivan Vendrov (ivan-vendrov) · 2025-04-08T16:07:23.199Z · LW(p) · GW(p)

I agree Bayesian optimization should win out given infinite compute, but what makes you confident that evolutionary search under computational resource scarcity selects for anything like an explicit Bayesian optimizer or long-term planner? (I say "explicit" because the Bayesian formalism has enough free parameters that you can post-hoc recast ~any successful algorithm as an approximation to a Bayesian ideal.)

Replies from: abramdemski
comment by abramdemski · 2025-04-08T18:57:25.966Z · LW(p) · GW(p)

Given infinite compute, Bayesian optimization like this doesn't make sense (at least for well-defined objective functions), because you can just select the single best point in the search space.

> what makes you confident that evolutionary search under computational resource scarcity selects for anything like an explicit Bayesian optimizer or long-term planner? (I say "explicit" because the Bayesian formalism has enough free parameters that you can post-hoc recast ~any successful algorithm as an approximation to a Bayesian ideal.)

  1. I would not argue for "explicit". If I had to argue for "explicit" I would say: because biological organisms do in fact have differentiated organs which serve somewhat comprehensible purposes, and even the brain has somewhat distinct regions serving specific purposes. However, I think the argument for explicit-or-implicit is much stronger.
  2. Even so, I would not argue that evolutionary search under computational resource scarcity selects for a long-term planner, be it explicit or implicit. This would seem to depend on the objective function used. For example, I would not expect something trained on an image-recognition objective to exhibit long-term planning.
  3. I'm curious why you specify evolutionary search rather than some more general category that includes gradient descent and other common techniques which are not Bayesian optimization. Do you expect it to be different in this regard?

I'm not sure why you asked the question, but it seems probable that you thought a "confident belief that [...]" followed from my view expressed in the previous comment? I'm curious about your reasoning there. To me, it seems unrelated.

These issues are tricky to discuss, in part because the term "optimization" is used in several different ways, which have rich interrelationships. I conceptually make a firm distinction between search-style optimization (gradient descent, genetic algorithms, natural selection, etc.) and agent-style optimization (control theory, reinforcement learning, brains, etc.). I say more about that here [LW · GW].

The proposal of Bayesian Optimization, as I understand it, is to use the second (agentic optimization) in the inner loop of the first (search). This seems like a sane approach in principle, but of course it is handicapped by the fact that Bayesian ideas don't represent the resource-boundedness of intelligence particularly well, which is extremely critical for this specific application (you want your inner loop to be fast). I suspect this is the problem you're trying to comment on?

I think the right way to handle that in principle is to keep the Bayesian ideal as the objective function (in a search sense, not an agency sense) and search for a good search policy (accounting for speed as well as quality of decision-making), which you then use for many specific searches going forward.
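
(Schematically, with all the names below made up, the two-level structure I have in mind is something like:)

```python
def score_policy(policy, problems, compute_budget, speed_penalty=0.1):
    """Stand-in for the 'Bayesian ideal' objective on search policies:
    average quality of the solutions found, minus a penalty for the compute
    spent finding them (the exact trade-off is a free parameter here)."""
    total = 0.0
    for problem in problems:
        solution, steps_used = policy(problem, compute_budget)
        total += problem.quality(solution) - speed_penalty * steps_used
    return total / len(problems)

def choose_search_policy(candidate_policies, benchmark_problems, compute_budget):
    """Outer search over search policies; the winner gets reused for many
    specific object-level searches going forward."""
    return max(candidate_policies,
               key=lambda p: score_policy(p, benchmark_problems, compute_budget))
```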

Replies from: ivan-vendrov
comment by Ivan Vendrov (ivan-vendrov) · 2025-04-10T04:07:31.400Z · LW(p) · GW(p)

Minor points just to get them out of the way:

  1. I think Bayesian optimization still makes sense with infinite compute if you have limited data (infinite compute doesn't imply perfect knowledge; you still have to run experiments in the world outside of your computer).
  2. I specified evolutionary search because that's the claim I see Lehman & Stanley as making: that algorithms pursuing simple objectives tend not to be robust in an evolutionary sense. I'm less confident making claims about broader classes of optimization, but I'm not intentionally excluding them.

Meta point: it feels like we're bouncing between incompatible and partly specified formalisms before we even know what the high-level worldview diff is.

To that end, I'm curious what you think the implications of the Lehman & Stanley hypothesis would be - supposing it were shown even for architectures that allow planning to search, which I agree their paper does not do. So yes, you can trivially exhibit a "goal-oriented search over good search policies" that does better than their naive novelty search, but what if it turns out that a "novelty-oriented search over novelty-oriented search policies" does better still? Would this be a crux for you, or is this not even a coherent hypothetical in your ontology of optimization?
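
(Roughly, the hypothetical I have in mind, with all names and details made up, is something like:)

```python
def behavioral_novelty(desc, archive, k=5):
    """Mean distance to the k nearest previously seen behavior descriptors."""
    if not archive:
        return float("inf")
    dists = sorted(abs(desc - d) for d in archive)
    return sum(dists[:k]) / min(k, len(dists))

def meta_novelty_search(mutate_policy, describe_search, init_policy,
                        generations=50, pop_size=20):
    """Score each candidate search policy by how unlike previously seen
    policies its search behavior is, rather than by any objective."""
    policies = [init_policy() for _ in range(pop_size)]
    archive = []  # descriptors of past policies' search trajectories
    for _ in range(generations):
        descs = [describe_search(p) for p in policies]
        ranked = sorted(zip(policies, descs),
                        key=lambda pd: behavioral_novelty(pd[1], archive),
                        reverse=True)
        archive.extend(d for _, d in ranked[: pop_size // 4])
        survivors = [p for p, _ in ranked[: pop_size // 2]]
        policies = survivors + [mutate_policy(p) for p in survivors]
    return policies
```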

comment by Kaarel (kh) · 2025-04-08T14:59:55.744Z · LW(p) · GW(p)

it feels to me like you are talking of two non-equivalent types of things as if they were the same. like, imo, the following are very common in competent entities: resisting attempts on one's life, trying to become smarter, wanting to have resources (in particular, in our present context, being interested in eating the Sun), etc. but then whether some sort of vnm-coherence arises seems like a very different question. and indeed, even though i think these drives are legit, i think it's plausible that such coherence just doesn't arise, or that any picture of valuing on which a tendency toward "vnm-coherence" or "goal stability" even makes sense as an option is pretty bad/confused[1].

(of course these two positions i've briefly stated on these two questions deserve a bunch of elaboration and justification that i have not provided here, but hopefully it is clear even without that that there are two pretty different questions here that are (at least a priori) not equivalent)


  1. briefly and vaguely, i think this could involve mistakenly imagining a growing mind meeting a fixed world, when really we will have a growing mind meeting a growing world — indeed, a world which is approximately equal to the mind itself. slightly more concretely, i think things could be more like: eg humanity has many profound projects now, and we would have many profound but currently basically unimaginable projects later, with like the effective space of options just continuing to become larger, plausibly with no meaningful sense in which there is a uniform direction in which we're going throughout or whatever ↩︎

comment by Cole Wyeth (Amyr) · 2025-04-08T16:38:09.920Z · LW(p) · GW(p)

I didn't really "get it", but this paper may be interesting to you: https://arxiv.org/pdf/2502.15820