On expected utility, part 3: VNM, separability, and more


post by Joe Carlsmith (joekc) · 2022-03-22T03:05:21.073Z

  I. God’s lotteries
  II. The four vNM axioms
  III. The vNM theorem
  IV. Separability implies additivity
  V. From “separability implies additivity” to EUM
  VI. Peterson’s “direct argument”

(Cross-posted from Hands and Cities)

Previously in sequence: Skyscrapers and madmen; Why it can be OK to predictably lose

This is the third essay in a four-part series on expected utility maximization (EUM). This part examines three theorems/arguments that take probability assignments for granted, and derive a conclusion of the form: “If your choices satisfy XYZ conditions, then you act like an EUM-er”: the von Neumann-Morgenstern theorem; an argument based on a very general connection between “separability” and “additivity”; and a related “direct” axiomatization of EUM in Peterson (2017). In each case, I aim to go beyond listing axioms, and to give some intuitive (if still basic and informal) flavor for how the underlying reasoning works.

I. God’s lotteries

OK, let’s look at some theorems. And let’s start with von Neumann-Morgenstern (vNM), one of the simplest and best-known.

A key disadvantage of vNM is that it takes the probability part of EUM for granted. Indeed, this is a disadvantage of all the theorems/arguments I discuss in this essay. However, as I discuss in the next essay, we can argue for the probability part on independent grounds. And sometimes, working with sources of randomness that everyone wants to be a probabilist about – i.e., coin flips, urns, etc – is enough to get a very EUM-ish game going.

Here’s how I tend to imagine the vNM set-up. Suppose that you’re hanging out in heaven with God, who is deciding what sort of world to create. And suppose, per impossibile, that you and God aren’t, in any sense, “part of the world.” God’s creation of the world isn’t adding something to a pre-world history that included you and God hanging out; rather, the world is everything, you and God are deciding what kind of “everything” there will be, and once you decide, neither of you will ever have existed.

(This specification sounds silly, but I think it helps avoid various types of confusion – in particular, confusions involved in imagining that the process of ranking or choosing between worlds is in some sense part of the world itself (more below). On the set-up I’m imagining, it isn’t.)

God’s got a big menu of worlds he’s considering. And let’s say, for simplicity, that this menu is finite. Also, God has an arbitrarily fine-grained source of randomness, like one of those spinning wheels where you can mark some fraction as “A,” and some fraction as “B.” And God is going to ask you to choose between different lotteries over worlds, where a lottery XpY is a lottery with a p chance of world X, and a (1-p) chance of world Y (a certainty of X — i.e., XpX, or X(1)Y — counts as a lottery, too). If you say that you’re indifferent between the two lotteries, then God flips a coin between them. And if you “refuse to choose,” then God tosses the lotteries to his dog Fido, and then picks the one that Fido drools on more.

Before making your choice, though, you’re allowed to send out a ghost version of yourself to inspect the worlds in question at arbitrary degrees of depth; you’re allowed to think as long as you want; to build ghost civilizations to help you make your choice, and so on.

And let’s use “≻” to mean “better than” (or, “preferred to,” or “chosen over,” or whatever), “≽” to mean “at least as good as,” and “~” to mean “indifferent” (the curvy-ness of these symbols differentiates them from e.g. “>”).

II. The four vNM axioms

Now suppose you want your choices to meet the following four constraints, for all lotteries A, B, etc.

1. Completeness: A ≻ B, or A ≺ B, or A ~ B.

Completeness says that either you choose A over B, or you choose B over A, or you say that you’re indifferent. Some people don’t like this, but I do. One reason I like it is: if your preferences are incomplete in a way that makes you OK with swapping any two lotteries you view as “incomparable,” then you’re OK being “money-pumped.” E.g., if A+ ≻ A, but both are incomparable to B, then you’re OK swapping A+ for B, then B for A, then paying a dollar to trade back to A+.

Maybe you say: “I’ll be the type of ‘can’t compare them’ person who doesn’t trade incomparable things. Thus, if I start with A+, and am offered B, I’ll just stick with A+, and the same if I start with A” (see here for some discussion). But now, absent more complicated constraints about foresight and pre-commitment, you’ll pass up free opportunities to trade from A to A+, via first trading for B.

Beyond this, though, refusing to compare stuff just looks, to me, unnecessarily “passive.” It looks like it’s giving up an opportunity for agency, for free. Recall that if you refuse to choose between two lotteries, God tosses them to his dog, Fido, and chooses the one that Fido drools on more. If you’re indifferent between the two, then OK, fine, let Fido choose. Incomparability, though, is not supposed to be indifference. But then: why are you letting Fido make the call? Why so passive? Why not choose for yourself?

There’s a lot more to say, here, and I don’t expect fans of incompleteness to be convinced. But I’ll leave it for now, and turn to the second vNM constraint:

2. Transitivity: If A ≻ B, and B ≻ C, then A ≻ C.

Transitivity requires that you don’t prefer things in a circle. Again, some people don’t like it: but I do. And again, one reason I like it is money-pumps: if you prefer A to B to C to A, then you’ll pay to trade A for C for B for A, which looks pretty silly.
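To make the money-pump concrete, here is a minimal Python sketch. The cyclic preference table, the $1 fee, and the function name `run_money_pump` are all hypothetical choices made for illustration; the agent simply pays the fee whenever it is offered something it strictly prefers to what it currently holds.

```python
# Hypothetical cyclic strict preferences: A ≻ B, B ≻ C, C ≻ A.
# Each pair (x, y) means "x is strictly preferred to y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def run_money_pump(start, offers, fee=1.0):
    """Offer each item in turn; the agent pays `fee` to swap whenever it
    strictly prefers the offered item to the one it currently holds."""
    holding, paid = start, 0.0
    for offer in offers:
        if (offer, holding) in PREFERS:
            holding, paid = offer, paid + fee
    return holding, paid

# Start with A, then get offered C, then B, then A again.
final_item, total_paid = run_money_pump("A", ["C", "B", "A"])
print(final_item, total_paid)
```

The agent ends up holding exactly what it started with, having paid the fee three times along the way.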

Some people try to block money-pumps with objections like: “I’ll foresee all my future options, make a plan for the overall pattern of choices to make, and stick with it.” But we can create more complicated money-pumps in response. I haven’t tried to get to the bottom of this dialectic, partly because I feel pretty uninterested in justifying transitivity violations, but see this book-length treatment by Gustafsson if you want to dig in on the topic.

Another argument against intransitivity is: suppose I offer you a choice between {A, B, C}, all at once. Which do you want? If you prefer A to B to C to A, then possibly, you’re just stuck. Throw the choice to Fido? Or maybe you just pick something to choose when larger option sets are available. But even in that case, no matter what you choose, there was an alternative you liked better in a two-option comparison. Isn’t that silly? (See Gustafsson (2013).)

Maybe you say: “what I like best depends on the choice set it’s a part of. If you ask me: ‘chocolate or vanilla?’, I might say: ‘vanilla.’ But if you add: ‘oh, strawberry is also available,’ I might say: ‘ah, in that case I’ll have chocolate instead’” (see the literature on “expansion” and “contraction” consistency, which I’m borrowing this example from).

Sure sounds like a strange way to order ice cream, but we can point to more intuitive examples. Suppose that you prefer mountaineering to staying home, because if you stay home instead of mountaineering, you’re a coward. But you prefer Rome to mountaineering, because that’s “cultured” rather than cowardly. But you prefer staying home to Rome, because you actually don’t like vacations at all and just want to watch TV. And let’s say that faced with all three, you choose Rome, because not going on vacation at all, if mountains are on the menu, is cowardly (see Broome (1995), and Dreier (1996) for discussion). Isn’t this an understandable set of preferences?

It’s partly because of examples like these that I’ve made entire worlds the “outcomes” chosen, and made the process of choice take place “outside the world.” Suppose, for example, that God offers you a choice between (A) a world where a travel agent offers you home vs. mountains, and you choose home, vs. (B) a world where a travel agent offers you home vs. Rome, and you choose home. These are different worlds, and it’s fine to treat the choice of “home,” in each of them, differently, and to view A as involving a type of cowardice that B does not. Your interaction with God, though, isn’t like your interaction with the travel agent. It’s not happening “inside the world.” Indeed, it’s not happening at all. So it doesn’t make you a coward, or whatever you’re worried about seeming like in the eyes of the universe.

What’s more, if your choices between worlds start being sensitive to the option set, then I’m left with some sense that you’re caring about something other than the world itself (thanks to Katja Grace for discussion). That is, somehow the value of the world ceases to be intrinsic to the world, and becomes dependent on other factors. Maybe some people don’t mind this, but to me, it looks unattractive.

And as with incompleteness, option-set sensitivity also seems to me somehow compromising of your agency. Apparently, the type of force you want to be in the world depends a lot on how much of the menu God uncovers at a given time, and on the order of the list. Thus, if he drafts the menu, or shows it to you, based on something about Fido’s drool, the type of influence you want to exert is in Fido’s hands. Why give Fido such power?

Maybe you say: “But if talking about whole worlds is allowed, then can’t I excuse myself for that time I paid to trade apples for oranges at t1, oranges for bananas at t2, and bananas for apples at t3, on the grounds that ‘apples at t1’ worlds are different from ‘apples at t3’ worlds?” Sure, you can do that if you’d like. As I said in the first essay: if your goal is to make up a set of preferences that rationalize some concrete episode of real-world behavior, you can. But are those actually your preferences? Do you actually like apples at t3 better than apples at t1? If not, you’re not doing yourself any favors by trying so hard to avoid that dreaded word, “mistake.” Some other person might’ve rationally made those trades. But not you.

Still, maybe you don’t like the whole “choosing outside of the world” thing. Maybe you observe, for example, that all actual choice-making and preference-having occurs in the world. And fair enough. Indeed, there are lots of other arguments and examples in the literature on intransitivity, which I’m not engaging with. And at a certain point, if you insist on having intransitive preferences, I’m just going to bow out and say: “not my bag.”

I’ll add, though, one final vibe: namely, intransitive preferences leave me with some feeling of: “what are you even trying to do in this world?” That is, steering the world around a circular ranking feels like it isn’t, as it were, going anywhere. It’s not a force moving things in a direction. Rather, it’s caught in a loop. It’s eating its own tail.

Again: there’s much more to say, and I don’t expect fans of intransitivity to be convinced. But let’s move on to our third condition:

3. Independence: A ≻ B if and only if ApC ≻ BpC.

This is the type of principle that the “small probabilities are really larger conditional probabilities” argument I gave in part 2 relied on. And it does a lot of the work in the vNM proof itself.

To illustrate it: suppose that you prefer a world of puppies to a world of mud, and you have some kind of feeling (it doesn’t matter what) about a world of flowers. Then, on Independence, you also prefer {p chance of puppies, 1-p chance of flowers} to {p chance of mud, 1-p chance of flowers}. The intuitive idea here is that you’ve got the same probability of flowers either way – so regardless of how you feel about flowers, the “flowers” bit of the lottery isn’t relevant to evaluating the choice. Rather, the thing that matters here is how you feel about puppies vs. mud.

We can also think about this as a consistency constraint across choices, akin to the one I discussed in my last essay. Suppose, for example, that God tells you that he’s going to flip a coin. If it comes up heads, he’s going to create a world of flowers. If it comes up tails, he’s going to offer you a choice between puppies and mud. However, he’s also offering you a chance to choose now, before the coin is flipped, whether he’ll create puppies vs. mud if the coin comes up tails. Should you make a different choice now, about what God should do in the “tails” outcome, than you would make if you actually end up in the tails outcome? No.

And as I discussed in my last essay: if you say yes, you can get money-pumped (one notices a theme) – at least with some probability. I.e., God starts you with a puppies(.5)flowers lottery, you pay to trade for mud(.5)flowers, then if the coin comes up tails and you’re about to get your mud, God offers you puppies instead, and you pay to trade back.

People often violate Independence in practice (see e.g. the Allais Paradox), but I don’t find this very worrying from a normative perspective, especially once you specify that you’re choosing “beyond the world,” and that any feelings of fear/nervousness/regret need to be included in the prizes. To me, with this in mind, Independence looks pretty attractive.
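Here is a small Python sketch of what checking Independence could look like. Everything in it is an assumption for illustration: the utility numbers, the grid of p values, the helper names, and especially the “worst-case” rule, which is just one made-up example of a non-EUM way of valuing lotteries. It only tests one triple of worlds, so a pass is not a proof.

```python
from fractions import Fraction as F

# Hypothetical utilities for three sure worlds (illustrative numbers only).
U = {"puppies": F(2), "flowers": F(1), "mud": F(0)}

def mix(x, p, y):
    """The lottery XpY as a dict: chance p of world x, chance 1-p of world y."""
    lottery = {x: F(p)}
    lottery[y] = lottery.get(y, F(0)) + (1 - F(p))
    return lottery

def expected_utility(lottery):
    return sum(p * U[w] for w, p in lottery.items())

def worst_case(lottery):
    """A hypothetical non-EUM rule: judge a lottery by its worst possible world."""
    return min(U[w] for w, p in lottery.items() if p > 0)

def sign(x):
    return (x > 0) - (x < 0)

def respects_independence(value, a, b, c, probs):
    """On a grid of p values, check that ApC vs BpC is ranked the same way as A vs B."""
    base = sign(value(mix(a, 1, a)) - value(mix(b, 1, b)))
    return all(sign(value(mix(a, p, c)) - value(mix(b, p, c))) == base for p in probs)

probs = [F(1, 10), F(1, 2), F(9, 10)]
eu_ok = respects_independence(expected_utility, "puppies", "flowers", "mud", probs)
maximin_ok = respects_independence(worst_case, "puppies", "flowers", "mud", probs)
print(eu_ok, maximin_ok)
```

The worst-case rule fails here because mixing in any chance of mud makes puppies(p)mud and flowers(p)mud look equally bad to it, even though it prefers puppies to flowers outright.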

4. Continuity: If A ≻ B ≻ C, there must exist some p and q (not equal to 0 or 1) such that ApC ≻ B ≻ AqC.

This is a bit more of a technical condition, required for the proof to go through. I’m told that if you weaken it, weaker versions of the proof can still work, but I haven’t followed up on this.

To illustrate the idea, though: suppose that A is puppies, B is flowers, and C is mud, such that puppies ≻ flowers ≻ mud. Continuity says that if you start with puppies, there’s some sufficiently small probability (1-p, in the notation above) of getting mud instead, such that you prefer taking that chance to switching to flowers with certainty. And similarly, if you start with mud, there’s some sufficiently small probability (q, in the notation above) of getting puppies instead such that you’ll take a guarantee of flowers over that probability of trading for puppies.

This can feel a bit counterintuitive: is there really some sufficiently small risk of dying, such that I’d take that risk in order to switch from a mediocre restaurant to a good one? But I think the right answer is yes. Probabilities, recall, are just real numbers – and real numbers can get arbitrarily small. A one-in-a-Graham’s-number chance of dying is ludicrously lower than the chance of dying involved in walking a few blocks to the better restaurant, or picking up a teddy bear with padded gloves, or hiding in a fortified bunker, or whatever. By any sort of everyday standard, it’s really not something to worry about. (Once we start talking about infinities, the discussion gets more complicated. In particular, if you value e.g. heaven infinitely, but cookies and mud only finitely, and heaven ≻ cookies ≻ mud, then you can end up violating continuity by saying that any lottery heaven(p)mud is better than cookies (thanks to Petra Kosonen for discussion). But “things get more complicated if we include infinities” is a general caveat (read: understatement) about everything in these essays – and in ethics more broadly.)

We can make money-pump-ish arguments for Continuity, too – though they’re not quite as good (see Gustafsson’s book manuscript, p. 77). Suppose, for example, that our outcomes are: Gourmet ≻ McDonalds ≻ Torture, but for all values of p less than 1, you prefer McDonalds to Gourmet(p)Torture. And let’s assume that you’d pay more than $5 for Gourmet over McDonalds. Thus, Gourmet ≻ Gourmet-$5 ≻ McDonalds ≻ Gourmet(p)Torture, for any p less than 1. So if we start you off with Gourmet(p)Torture, you’ll be willing to pay at least $5 to switch to Gourmet instead, even if we make p arbitrarily close to 1, and thus Gourmet(p)Torture arbitrarily similar to Gourmet. This isn’t quite paying $5 to switch to A from A. But it’s paying $5 to switch to A from something arbitrarily similar to A – which is (arbitrarily) close.

Maybe you say: “yeah, I told you, I really don’t like torture.” And fair enough: that’s why I don’t think this argument is as strong as the others. But if you’re that averse to torture, then it starts to look like torture, for you, is infinitely bad, in which case you may end up paying arbitrarily large finite costs to avoid arbitrarily small probabilities of it (this is a point from Gustafsson). Maybe that’s where you’re at (though: are you focusing your life solely on minimizing your risk of torture?); but it’s an extreme lifestyle.

III. The vNM theorem

Maybe you like these axioms; or maybe you’re not convinced. For now, though, let’s move on to the theorem:

vNM Theorem: Your choices between lotteries satisfy Completeness, Transitivity, Independence, and Continuity if and only if there exists some function u that assigns real numbers between 0 and 1 to lotteries, such that:
I. A ≻ B iff u(A) > u(B)

That is: the utility function mirrors the preference relation, such that you always choose the lottery with the higher utility.

II. u(ApB) = p*u(A) + (1-p)*u(B)

That is: the utility of a lottery is its expected utility.

III. For every other function u’ satisfying (I) and (II), there are numbers c>0 and d such that u’(A) = c * u(A) + d.

That is, the utility function u is unique up to multiplying it by a positive factor and adding any constant (a transformation that preserves the ratios between the differences between outcomes).
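As a rough illustration of properties (II) and (III), here is a minimal Python sketch. The worlds, their utilities, the two lotteries, and the constants c = 3 and d = 5 are all hypothetical numbers picked for the example; the point is just that rescaling u by a positive factor and adding a constant changes the expected utilities but not which lottery comes out on top.

```python
from fractions import Fraction as F

# Hypothetical utilities on a 0-to-1 scale (illustrative numbers only).
u = {"puppies": F(1), "frogs": F(8, 10), "flowers": F(1, 10), "mud": F(0)}

def expected_utility(lottery, util):
    """Property (II): probability-weighted sum of the outcomes' utilities."""
    return sum(p * util[world] for world, p in lottery.items())

def rescale(util, c, d):
    """Property (III): the transformation u' = c*u + d, with c > 0."""
    return {world: c * value + d for world, value in util.items()}

lottery_a = {"frogs": F(1, 2), "mud": F(1, 2)}          # frogs(.5)mud
lottery_b = {"puppies": F(3, 10), "flowers": F(7, 10)}  # puppies(.3)flowers

u_prime = rescale(u, c=3, d=5)
eu_a, eu_b = expected_utility(lottery_a, u), expected_utility(lottery_b, u)
eu_a2, eu_b2 = expected_utility(lottery_a, u_prime), expected_utility(lottery_b, u_prime)

print(eu_a, eu_b)                   # expected utilities under u
print(eu_a > eu_b, eu_a2 > eu_b2)   # the ranking agrees under u and u'
```

Because the probabilities in each lottery sum to 1, the expected utility under u’ is just c times the expected utility under u, plus d, which is why the ranking survives.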

How does the proof work? Here I’ll focus on showing that conditional on the axioms, and assuming a finite number of worlds, we can get to a u that satisfies properties (I) and (II). (Here I’m going off of the proof presented in Peterson (2009, appendix B), which may differ from the original in ways I’m not tracking. For the original discussion, see here.)

The basic thought is this. Because we’ve got a finite number of worlds, we can order them from best to worst, and we can think of each world as a lottery that returns that world with certainty. Now pick a world at the very top of your ordering – i.e., a world such that you don’t like any other worlds better. Call this a “best world,” O, and give it (and all worlds A such that A ~ O) utility 1. Also, pick a world at the very bottom of your list, call it a “worst world,” W, and give it (and all worlds B such that B ~ W) utility 0.

The central vNM schtick is to construct the utility function by finding, for each world A, a lottery OpW such that A ~ OpW — and then we’re going to think of p as the utility for world A. That is, your utility for A is just the probability p such that you’re indifferent between A, and a lottery with p chance of a best world, and 1-p chance of a worst world.

Thus, suppose we’ve got puppies ≻ frogs ≻ flowers ≻ mud. Puppies is best here, and mud is worst, so let’s give puppies utility 1, and mud utility 0. Now: what’s the utility of frogs? Well, for what probability p are you indifferent between frogs, on the one hand, and {p chance of puppies, 1-p chance of mud} on the other? Let’s say that this probability is .8 (we can show that there must be one unique probability, here, but I’m going to skip this bit of the proof). And let’s say your probability for flowers is .1. Thus, u(frogs) = .8, and u(flowers) = .1. Here it is in (two dimensional) skyscrapers:

From an “amount of housing” perspective (see part 1), this makes total sense. The best city has packed the entire 1×1 space with housing. The worst city is empty. Any other lotteries will have some amount of housing in between. And if we slide our probability slider such that it makes some fraction of the best city empty, we can get to any amount of housing between 1 and 0.
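The calibration step described above (finding the p such that A ~ OpW) can be sketched as a simple search. This is only a sketch under stated assumptions: `compare` is a hypothetical oracle standing in for the agent’s answers, `make_oracle` fakes those answers from a hidden number, and the bisection assumes that a higher chance of the best world is always more attractive.

```python
def elicit_utility(compare, tolerance=1e-6):
    """Bisect on p to approximate the p at which the agent is indifferent
    between a given world and the lottery OpW.

    `compare(p)` is a hypothetical oracle: +1 if the agent prefers OpW to
    the world, -1 if it prefers the world, 0 if it is indifferent.
    """
    low, high = 0.0, 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        verdict = compare(mid)
        if verdict == 0:
            return mid
        if verdict > 0:
            high = mid   # lottery too attractive: try a lower p
        else:
            low = mid    # lottery not attractive enough: try a higher p
    return (low + high) / 2

def make_oracle(hidden_utility):
    """Stand-in for the agent's answers, driven by a hidden indifference point."""
    return lambda p: (p > hidden_utility) - (p < hidden_utility)

u_frogs = elicit_utility(make_oracle(0.8))
u_flowers = elicit_utility(make_oracle(0.1))
print(round(u_frogs, 3), round(u_flowers, 3))
```

In the real set-up there is no hidden number to peek at; the agent’s choices between the world and OpW are the only input, and the utility is whatever p those choices pin down.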

This sort of utility function, combined with some basic calculations to do with compound lotteries, gets us to (I) and (II) above. Let’s walk through this. (This bit is going to be a bit dense, so feel free to skip to section IV if you’re persuaded, or if you start glazing over. The key move — namely, defining the utility of A as the p such that A ~ OpW — has already been made, and in my opinion, it’s the main thing to remember about vNM.)

Note, first, that because of the Independence axiom, faced with OpW-style lotteries, you always want the higher probability of the best outcome O. After all, if the probability of O is different between the two, there will be some bit of probability space where you get O in one case, and W otherwise (some bit of the city where it’s empty in one case, and full in the other) – and the rest of probability space will be equivalent to the same lottery in both cases. So because you prefer O to W, you’ve got to prefer the lottery with the higher probability of O.

For example, suppose that A is {90% puppies, 10% mud} and B is {50% puppies, 50% mud}. We can line up the outcomes here so they look like:

Lottery	10%	40%	50%
A	Mud	Puppies	Puppies
B	Mud	Mud	Puppies

The 10% and 50% sections are the same, so on Independence, this is really just a comparison between the outcomes at stake in the 40% section – and there, we’ve specified that you have a clear preference. So, you’ve got to choose A.

With this in mind, let’s look at property (I): for any lotteries A and B, A ≻ B iff u(A) > u(B).

Let’s start with the proof from “A ≻ B” to u(A) > u(B). Suppose that A ≻ B. Can we show that u(A) > u(B)? Well, u(A) and u(B) are just the probabilities of some lotteries OpW and OqW, such that A ~ OpW, B ~ OqW, and therefore, by our definition of utility, u(A) = p, and u(B) = q. But because A ≻ B, it can’t be that p < q – otherwise you’d be preferring a lower chance of the best outcome to a higher one. And it can’t be that they’re equal, either – otherwise you’d be preferring the same lottery over itself. So p > q, and hence u(A) > u(B).

Now let’s go the other direction: suppose that u(A) > u(B). Can we show that A ≻ B? Well, if u(A) > u(B), then p > q, so OpW is a higher chance of the best world, compared with a lower chance in OqW. So OpW ≻ OqW. So because A ~ OpW, and B ~ OqW, it must be that A ≻ B.

Thus, property (I). Let’s turn to property (II): namely, u(ApB) = p*u(A) + (1-p)*u(B). Here we basically just appeal to the following fact about compound lotteries. Suppose that A ~ OqW, and B ~ OrW. This means that you’ll be indifferent between ApB and the compound lottery (OqW)p(OrW). Overall, though, this compound lottery gives you a p*q+(1-p)*r chance of O, and W otherwise. So if we define s such that p*q + (1-p)*r = s, ApB ~ OsW. So by our definition of utility, u(ApB) = s. And u(A) was q, u(B) was r, and s was just: p*q + (1-p)*r. So u(ApB) = p*u(A) + (1-p)*u(B). Thus, property (II).

An example might help. Suppose you have the following lottery: {.3 chance of frogs, .7 chance of flowers}. What’s the utility of this lottery? Well, we know from above that frogs ~ {.8 chance of puppies, .2 chance of mud} (call this lottery Y), such that u(frogs) = .8. And flowers ~ {.1 chance of puppies, .9 chance of mud} (call this lottery Z), such that u(flowers) = .1. So the compound lottery YpZ, where p=.3, is just a .3 chance of {.8 chance of puppies, .2 chance of mud}, and a .7 chance of {.1 chance of puppies, .9 chance of mud}. But this just amounts to a (.3*.8 + .7*.1) = .31 chance of puppies, and mud otherwise. Thus, since frogs ~ Y and flowers ~ Z, u(frogs(.3)flowers) = .3*u(frogs) + (1-.3)*u(flowers) = .31.
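The arithmetic in this example can be checked with a few lines of Python. The function name `reduce_compound` is made up for illustration; exact fractions are used so that no rounding sneaks in.

```python
from fractions import Fraction as F

def reduce_compound(p, q, r):
    """Chance of the best world O in the compound lottery (OqW)p(OrW)."""
    return p * q + (1 - p) * r

p = F(3, 10)          # chance of frogs in frogs(.3)flowers
u_frogs = F(8, 10)    # frogs ~ O(.8)W
u_flowers = F(1, 10)  # flowers ~ O(.1)W

s = reduce_compound(p, u_frogs, u_flowers)
print(s, float(s))
```

So s, the overall chance of puppies, is 31/100, and that is also the utility of the frogs/flowers lottery.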

This has been an informal and incomplete sketch, but hopefully it suffices to give some non-black-box flavor for how properties (I) and (II) fall out of the axioms above. I’m going to skip property (III), as well as the proof from properties (I)-(III) to the axioms – but see Peterson (2009) for more.

IV. Separability implies additivity

Ok, so that’s a bit about vNM. Let’s turn to another, very general theorem, which focuses on the notion of “separability” – a notion that crops up a lot in ethics, and which, if accepted, often leads quickly to very EUM-ish and utilitarian-ish results.

Separability basically just says that your ranking of what happens at any combination of “locations” is independent of what’s going on in the other locations – where “locations” here can be worlds, times, places, people, person-moments, or whatever. If your preferences reflect this type of independence, and they are otherwise transitive, complete, and “reflexive” (e.g., for all A, A ≽ A), we can prove that they will also be additive: that is, your overall utility can be represented as the sum of the utilities at the individual locations.

To see why this happens, let’s introduce a bit of terminology (here I’m following the presentation in Broome (1995, Chapter 4) – a book I recommend more generally). Let’s say you’ve got a set of alternatives A, and a (transitive, complete, and reflexive) preference ordering over these alternatives ≽. Further, let’s suppose that we’ve got an overall utility function U that assigns real numbers to the alternatives, such that U(A) ≥ U(B) iff A ≽ B. Finally, let’s suppose that the alternatives, here, are vectors (i.e., lists) of “occurrences” at some set of “locations.” Thus, the vector (x₁, x₂, … x_n) has n locations, and the “occurrence” at location 1 is x₁. We’ll focus on cases where all the alternatives have the same locations (a significant constraint).

Thus, for example, let’s say you have three planets: 1, 2, and 3. Planet 1 can have mud, flowers, or puppies. Planet 2 can have rocks, corn, or cats. Planet 3 can have sand, trees, or ponies. So, ordering the locations by planet number, example alternatives would include: (mud, corn, ponies), (puppies, cats, sand), and so on.

Now suppose we fix on some subset of locations (a “subvector”), and we hold the occurrences at the other locations constant. We can then say that subvector X ranks higher than subvector Y, in a conditional ordering ≽’, iff the alternative that X is extracted from ranks higher, in the original ordering ≽, than the alternative Y is extracted from. That is, a conditional ordering is just the original ordering, applied to a subset of alternatives where we hold the occurrences at some set of locations fixed. And relative to a given conditional ordering, we can talk about how subvectors rank relative to each other.

Thus, for example, suppose we specify that planet 3 is going to have trees, and we focus on the conditional ordering over alternatives where this is true. Then we can say that the subvector (planet 1 with puppies, planet 2 with cats) ranks higher than (planet 1 with mud, planet 2 with rocks), relative to this conditional ordering, iff_def (puppies, cats, trees) ≻ (mud, rocks, trees).

We can then say that a subset of locations S is “separable,” under an ordering ≽, iff_def the ranking of subvectors at those locations is always the same relative to all conditional orderings that hold the occurrences at the other locations fixed. Thus, for example, suppose that some random stuff happens at planet 2 and planet 3 – it doesn’t matter what. If planet 1 is separable from these other planets, then if (puppies, whatever₂, doesn’t-matter₃) ≻ (mud, whatever₂, doesn’t-matter₃) for some values of “whatever” and “doesn’t-matter,” then that’s also true for all values of “whatever” and “doesn’t-matter.” That is, if holding planet 2 and 3 fixed, switching from mud on planet 1 to puppies on planet 1 is ever an improvement to the overall situation, then it is always an improvement to the overall situation.
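For a finite example like the three planets, separability of a single location can be checked by brute force. The sketch below is a simplification under stated assumptions: it only tests one location at a time (not arbitrary subsets), and the scores and the “puppies plus cats” penalty are invented purely to give one ordering that passes and one that fails.

```python
from itertools import product

PLANETS = [("mud", "flowers", "puppies"),
           ("rocks", "corn", "cats"),
           ("sand", "trees", "ponies")]

# Hypothetical scores for each occurrence (illustrative numbers only).
SCORE = {"mud": 0, "flowers": 1, "puppies": 3, "rocks": 0, "corn": 1,
         "cats": 2, "sand": 0, "trees": 2, "ponies": 4}

def sign(x):
    return (x > 0) - (x < 0)

def is_separable(U, index, options=PLANETS):
    """True if the ranking of occurrences at location `index` is the same
    whatever is held fixed at the other locations."""
    others = [opts for i, opts in enumerate(options) if i != index]

    def world(x, rest):
        rest = list(rest)
        return tuple(rest[:index] + [x] + rest[index:])

    for x, y in product(options[index], repeat=2):
        verdicts = {sign(U(world(x, rest)) - U(world(y, rest)))
                    for rest in product(*others)}
        if len(verdicts) > 1:
            return False
    return True

def additive(w):
    return sum(SCORE[x] for x in w)

def with_interaction(w):
    # Made-up interaction: puppies on planet 1 are bad news if planet 2 has cats.
    penalty = 10 if (w[0] == "puppies" and w[1] == "cats") else 0
    return additive(w) - penalty

print(is_separable(additive, 0), is_separable(with_interaction, 0))
```

Under the second ordering, whether puppies beat mud on planet 1 depends on what planet 2 holds, so planet 1 is not separable, even though planet 3 still is.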

According to Broome, this means that a separable subset of locations can be assigned its own “sub-utility” function, which can be evaluated independent of what’s going on at the other locations. (Broome seems to think that this is obvious, but I’m a little bit hazy on it. I think the idea is that the ordinal ranking of occurrences at that subset of locations is always the same, so if the sub-utility function reflects this ranking, that’s all the information you need about those locations. But the ordinal ranking can stay the same while the size of the “gaps” between outcomes changes, and I feel unclear about whether this can matter. I’ll take Broome’s word for it on this issue for now, though.) That is, if planet 1 is separable from the others, then U(x₁, x₂, x₃) can be replaced by some other function V(u(x₁), x₂, x₃), where u(x₁) is the “subutility” function for what’s going on at planet 1.

Now let’s say that an ordering over alternatives is “strongly separable” iff_def every subset of locations is separable. And let’s say that an ordering is “additively separable” iff_def it can be represented as a utility function that is just the sum of the subutilities at each location: i.e., U(x₁, x₂ … x_n) = u₁(x₁) + u₂(x₂) + … + u_n(x_n). We can then prove:

An ordering is strongly separable iff it is additively separable.

(The proof here, according to Broome, is due to Gérard Debreu).

It’s clear that additively separable implies strongly separable (for any subset of locations, consider the sum of their utilities, and call it x: for any fixed y, representing the sum of the utilities at the other locations, x+y will yield the same ordering as you increase or decrease x), so let’s focus on moving from “strongly separable” to “additively separable.” And for simplicity, let’s focus on the case where there are exactly three locations, and on integer utility values in particular, to illustrate how the procedure works.

Because the locations are strongly separable, we know that each location can be given its own sub-utility function, such that: U(x₁, x₂, x₃) = V(u₁(x₁), u₂(x₂), u₃(x₃)). So what we’ll do is define subutility functions in each location, such that the overall utility function comes out additive. And we’ll do this, basically, by treating an arbitrary subutility gap in the first location as “1,” and then defining the subutilities at the other locations (and for everything else in location 1) by reference to what it takes to compensate for that gap in the overall ordering (Broome notes that we need an additional continuity condition to ensure that such a definition is possible, but following his lead, I’m going to skip over this). This gives us a constant unit across locations, which makes the whole thing additive.

In more detail: pick an arbitrary outcome at each location, and assign this outcome 0 utility, on the sub-utility function at that location. Now we’ve got (0, 0, 0). And because the ordering is strongly separable, we can hold what’s going on at location 3 fixed, and just make comparisons between sub-vectors at locations 1 and 2. So assign outcomes “1” in locations 1 and 2 such that: (1, 0, 0) ~ (0, 1, 0). Assign “2” in the second location to the outcome such that (1, 1, 0) ~ (0, 2, 0), “3” in the second location such that (1, 2, 0) ~ (0, 3, 0), “-1” such that (0, 0, 0) ~ (1, -1, 0), and so on. And note that because of separability, putting 0 in location 3 doesn’t matter, here – we could’ve chosen anything. Thus, we’ve defined the subutilities at location 2 such that:

A1: (1, b, c) ~ (0, b+1, c) for all b and c.

Now do the same at the third location, using the gap between 0 and 1 at the first location as your measuring stick. I.e., define “1” in the third location such that (1, 0, 0) ~ (0, 0, 1), “2” such that (1, 0, 1) ~ (0, 0, 2), and so on. Again, because of strong separability, what’s going on at location 2 doesn’t actually matter, so:

A2: (1, b, c) ~ (0, b, c+1) for all b and c.

And now, finally, do the same procedure at the first location, except using the gap between 0 and 1 in the second location as your basic unit. E.g., (2, 0, 0) ~ (1, 1, 0), (3, 0, 0) ~ (2, 1, 0), and so on. Thus:

A3: (a+1, 0, c) ~ (a, 1, c) for all a and c.

But now we’ve defined a unit of utility that counts the same across locations (see footnote for more detailed reasoning).^[1] And this means that the overall ordering has to be additive. This is intuitive, but one way of pointing at it is to note that we can now reduce comparisons between any two alternatives to equivalent comparisons between alternatives whose subutilities only differ at one location. Thus, for example: suppose you’re comparing (3, 4, 2) vs (1, 5, 9), and you want to reduce it to a comparison between (3, 4, 2) and (x, 4, 2). Well, just shuffle the utilities in (1, 5, 9) around until it looks like (x, 4, 2). E.g., (1, 5, 9) ~ (1, 5-1, 9+1) ~ (1, 4, 10). And (1, 4, 10) ~ (1+8, 4, 10-8) ~ (9, 4, 2). So (1, 5, 9) ~ (9, 4, 2). But comparing (9, 4, 2) to (3, 4, 2) is easy: after all, 9 is bigger than 3, and all the locations are separable. So because (9, 4, 2) ~ (1, 5, 9), (3, 4, 2) ≺ (1, 5, 9). But this sort of procedure always makes the alternative with the bigger total the winner (Broome goes through a more abstract proof of this, which I’m going to skip).
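The shuffling procedure in this example can be written out as a short Python sketch. The function names are hypothetical, and the code simply takes for granted what A1-A3 are meant to secure: that moving one unit of subutility from one location to another leaves you indifferent.

```python
def shuffle_to_match(vector, target):
    """Move whole units of subutility into (or out of) location 1 until
    every other location matches `target`. Each one-unit move between
    locations is assumed to preserve indifference, per A1-A3."""
    vector = list(vector)
    for i in range(1, len(vector)):
        surplus = vector[i] - target[i]
        vector[i] -= surplus
        vector[0] += surplus
    return tuple(vector)

def compare(a, b):
    """Return '≻', '≺', or '~' for a versus b, by reducing the question
    to a comparison at location 1 only."""
    b_equiv = shuffle_to_match(b, a)
    if a[0] > b_equiv[0]:
        return "≻"
    if a[0] < b_equiv[0]:
        return "≺"
    return "~"

print(shuffle_to_match((1, 5, 9), (3, 4, 2)))
print(compare((3, 4, 2), (1, 5, 9)))
```

Since the shuffling never changes a vector’s total, the comparison at location 1 always comes out the same way as a comparison of the two sums.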

How do we move from here to non-integer values? Well (again, assuming some sort of continuity condition), we can run this procedure for arbitrarily small initial value gaps – e.g., .1, .01, .001, and so on. So successive approximations with finer and finer levels of precision will converge on the value of a given outcome.

Admittedly, the continuity condition here seems pretty strong, and it straightforwardly doesn’t hold for finite sets of occurrences like the mud/flowers/puppies I’ve been using as examples. But I think the proof here illustrates an important basic dynamic regardless. And the overall result is importantly general across whatever type of “location” you like – generality that helps explain why addition shows up so frequently in different ethical contexts.

V. From “separability implies additivity” to EUM

Does this result get us to expected utility maximization? It at least gets us close, if we’re up for strong separability across probability space – a condition sometimes called the “Sure-Thing Principle.” Here I’ll gesture at the type of route I have in mind (see also Savage for a more in-depth representation theorem that relies on the sure-thing principle).

First, let’s assume that we can transform any lotteries ApB and CqD (and indeed, any finite set of lotteries) into more fine-grained lotteries with the same number of equiprobable outcomes – all without changing how much you like them (strictly, we need something about “can be approximated in the limit” here, but I’ll skip this). This allows us to put any finite set of lotteries into the “same number of locations” format we assumed for our “strong separability iff additive separability” proof above.

Thus, for example, suppose that either God will flip a coin for puppies vs. corn (lottery A), or he’ll give you a 20% probability of flowers, vs. an 80% probability of frogs (lottery B). We can transform this choice into:

States	S1	S2	S3	S4	S5	S6	S7	S8	S9	S10
Probability	10%	10%	10%	10%	10%	10%	10%	10%	10%	10%
Lottery A	Puppies	Puppies	Puppies	Puppies	Puppies	Corn	Corn	Corn	Corn	Corn
Lottery B	Flowers	Flowers	Frogs	Frogs	Frogs	Frogs	Frogs	Frogs	Frogs	Frogs
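Here is a rough Python sketch of that transformation, assuming all probabilities are rational (anything else needs the “approximated in the limit” move I skipped). The function names `common_state_count` and `equiprobable_states` are made up for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def common_state_count(lotteries):
    """Smallest number of equiprobable states that fits every lottery."""
    denominators = [p.denominator for lot in lotteries for p in lot.values()]
    return reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)


def equiprobable_states(lottery, n_states):
    """Expand {outcome: probability} into n_states equally likely outcomes."""
    states = []
    for outcome, prob in lottery.items():
        count = prob * n_states
        if count.denominator != 1:
            raise ValueError("n_states must be a common multiple of the denominators")
        states.extend([outcome] * count.numerator)
    return states


lottery_a = {"Puppies": Fraction(1, 2), "Corn": Fraction(1, 2)}
lottery_b = {"Flowers": Fraction(1, 5), "Frogs": Fraction(4, 5)}

n = common_state_count([lottery_a, lottery_b])
print(n)
print(equiprobable_states(lottery_a, n))
print(equiprobable_states(lottery_b, n))
```

Using exact fractions rather than floats keeps the state counts whole, so the two lotteries end up with the same number of locations.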

Now suppose your ranking over lotteries is representable via an overall utility function U(l), and it’s strongly separable across states. This means that we can give each state i a sub-utility function u_i. And because strong separability implies additive separability, we know that there is some U(x_1, x_2, …, x_n) = u_1(x_1) + u_2(x_2) + … + u_n(x_n).

The next claim is just that, given that all the states here are equally probable, we should be able to permute the outcomes of a lottery across states, without affecting the overall utility of the lottery. The intuition here is something like: “regardless of how much you like A or B, if I offer you a coin flip for A vs. B, it shouldn’t matter which one gets paired with heads, and which with tails” (this is the type of intuition I appealed to in the second argument in part 2).

Now, this principle won’t hold in many real-life choices. Suppose, for example, that you just ate a sandwich that you think has a 50% chance of containing deadly poison. Now Bob offers you two lotteries:

Lottery X: $10M conditional on poisoned sandwich, and nothing otherwise.
Lottery Y: nothing conditional on poisoned sandwich, and $10M otherwise.

Are you indifferent between X and Y? No. You definitely want lottery Y, here – even if you think the two possible states are equally likely – because the money is a lot more useful to you if you’re still alive. That is, the “states” and the “prizes” here interact, such that you value prizes differently conditional on the different states.

This type of problem crops up a lot for certain arguments for EUM (for example, Savage’s). But I’ve set up my “hanging out with God” scenario in order to avoid it. That is, in my scenario, the “prizes” are entire worlds, such that Lottery X vs. Y is actually switching from “a world where I have $10M but am dead” to “a world where I have $10M and am alive” (and also: “nothing and alive” vs. “nothing and dead”), which are different prizes. And the “states,” with God, are just flips of God’s coin, spins of God’s wheel, etc – states that do not, intuitively, interact with the value of the “worlds” they are paired with.

To the extent you buy into this “hanging out with God” set-up, then, the argument can proceed: in this type of set-up, you shouldn’t care which worlds go with “heads” vs. “tails,” and similarly for other equiprobable states. Thus, we can permute the “worlds” paired with the different, equiprobable states in a given “lottery-that’s-been-transformed-into-one-with-equiprobable-states,” without changing its utility. And this implies that our subutility functions for the different equiprobable states have to all be the same (otherwise, e.g., if one subutility function gave a different value to “frogs” vs. “flowers” than another, then the overall total could differ when you swap frogs and flowers around).
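A toy Python check of the parenthetical, with made-up numbers: if two equiprobable states value the gap between frogs and flowers differently, swapping the worlds changes the total. The names `total` and `permutation_invariant`, and the utilities themselves, are assumptions for illustration only.

```python
from itertools import permutations


def total(worlds, subutilities):
    """Additive total: state i's subutility applied to the world in state i."""
    return sum(u[w] for u, w in zip(subutilities, worlds))


def permutation_invariant(worlds, subutilities):
    """True if every reassignment of worlds to states gives the same total."""
    totals = {total(p, subutilities) for p in permutations(worlds)}
    return len(totals) == 1


same = {"frogs": 1, "flowers": 3}
skewed = {"frogs": 1, "flowers": 5}  # hypothetical: values flowers more

worlds = ("frogs", "flowers")
print(permutation_invariant(worlds, [same, same]))    # True
print(permutation_invariant(worlds, [same, skewed]))  # False
```

Strictly, two subutility functions that differ only by an added constant would also pass this check, but a constant offset makes no difference to the ordering.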

So now we know that there is a single sub-utility function u such that every lottery L has a “transformed” version L’, consisting entirely of equiprobable, mutually exclusive states (s1, s2, etc), with U(L’) = u(x_1) + u(x_2) + … + u(x_n). But now it’s easy to see that ranking by U(L’) is equivalent to ranking by expected utility, because the number of states that give rise to a given world will be proportionate to the overall probability of that world (though again, technically we need something about limiting approximations here). E.g., for Lottery B above, you’ve got two states with flowers and eight with frogs, so U(Lottery B) = 2*u(flowers) + 8*u(frogs). But dividing by the total number of states, such that we get U’(Lottery B) = .2*u(flowers) + .8*u(frogs), will yield the same overall ordering, provided every lottery under comparison has been split into the same number of states. Thus, your preferences over lotteries are representable as maximizing expected utility.
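To make the last step concrete, here is a small Python sketch using the ten-state versions of lotteries A and B. The utility numbers are invented for the example; nothing in the argument depends on them.

```python
from fractions import Fraction

# Hypothetical utilities, chosen only for illustration.
u = {"Puppies": 9, "Corn": 2, "Flowers": 6, "Frogs": 4}

states_a = ["Puppies"] * 5 + ["Corn"] * 5
states_b = ["Flowers"] * 2 + ["Frogs"] * 8

lottery_a = {"Puppies": Fraction(1, 2), "Corn": Fraction(1, 2)}
lottery_b = {"Flowers": Fraction(1, 5), "Frogs": Fraction(4, 5)}


def total_utility(states):
    """U(L'): sum of the shared sub-utility over equiprobable states."""
    return sum(u[x] for x in states)


def expected_utility(lottery):
    """Probability-weighted sum of utilities."""
    return sum(p * u[x] for x, p in lottery.items())


total_a, total_b = total_utility(states_a), total_utility(states_b)
eu_a, eu_b = expected_utility(lottery_a), expected_utility(lottery_b)

# Dividing the totals by the number of states recovers expected utility.
print(Fraction(total_a, 10) == eu_a, Fraction(total_b, 10) == eu_b)
# So the two scales rank the lotteries the same way.
print((total_a > total_b) == (eu_a > eu_b))
```

Dividing by a positive constant never flips a comparison, which is all the “same overall ordering” claim needs.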

Again: not an airtight formal proof. But I find the basic thought useful anyway. (I think it’s possible that something like this is the basic dynamic underlying at least part of Savage’s proof, once he’s constructed your probability assignment, but I haven’t checked).

VI. Peterson’s “direct argument”

Let’s look at one further argument for EUM, from Peterson (2017), which requires taking for granted both a probability assignment and a utility function, and which tries to show that given this, you should maximize expected utility.

Peterson is motivated, here, by a dissatisfaction with arguments of the form: “If your choices about lotteries meet XYZ conditions, then you are representable as an EU-maximizer” (see also Easwaran (2014)). He wants an argument that reflects the way in which probabilities and utilities seem prior to choices between lotteries, and which can therefore guide such choices.

He appeals to four axioms:

P1. If all the outcomes of an act have utility u, then the utility of that act is u.

That is, a certainty of getting utility u is worth u. Sounds fine.

P2. If Lottery A leads to better outcomes than Lottery B under all states, and the probability of a given state in A is always the same as the probability of that state in B, then A is better.

Again, sounds good: it’s basically just “if A is better no matter what, it’s better.” This leads to questions in Newcomb-like scenarios, but let’s set that aside for now: the “hanging out with God” set up above isn’t like that.

P3. Every decision problem can be transformed into one with equally probable states, in a way that preserves the utilities of the actions.

This is the principle we appealed to last section (as before: strictly, you need something about limiting approximations, here, but whatever).

P4. For any tiny amount of utility t, there is some sufficiently large amount of utility b, such that if you make one of any two equiprobable outcomes worse by t, you can always compensate for this by making the other outcome better by b.

For example, suppose that, in Lottery X, God will give you puppies if heads, and frogs if tails. But now he proposes a tweaked lottery Y, which will involve making one of the puppies in heads feel a moment of slightly-painful self-doubt. P4 says that there must be some way of improving the frogs situation, in tails (for example, by adding some amount of puppy happiness), such that you’re indifferent between X and Y. And further, it says that for whatever utility amount the puppy self-doubt subtracted (t), and whatever utility the puppy happiness added (b), you’re always indifferent between an original coin-flip lottery, and a modified one where the heads outcome is worse by t, and the tails outcome is better by b.

As I see it, the main problem for Peterson is that this axiom is incompatible with various utility functions. Suppose, for example, that my utility function caps out at 1. And suppose that I’ve got a lottery that gives me 1 if heads, and 1 if tails. Here, there’s nothing I can do to tails that will compensate for some hit to my utility in heads; I’ve already max-ed tails out, so any loss on heads is a strict loss. And we’ll get similar dynamics with bounded utility functions more generally. And even for unbounded utility functions, P4 isn’t obvious: it implies that you’re always happy to subtract t from some outcome, and to add b to another equiprobable outcome, no matter the original utilities at stake.

Suppose, though, that we grant P4 (and thereby restrict ourselves to unbounded utility functions – which, notably, cause their own serious problems). Then, we can show that t and b have to be equal – otherwise, we can make you indifferent between two lotteries where one yields better outcomes than the other no matter what. Thus, consider a case where b > t (the same argument will work if b < t):

State	H	T
Lottery X	u(Puppies)	u(Frogs)

Now we transform X into the equally-valuable X’:

Lottery X’	u(Puppies) – t	u(Frogs) + b

And now we transform X’ into the equally-valuable X’’:

Lottery X’’	u(Puppies) – t + b	u(Frogs) + b – t

But now, because b > t, we’ve added a positive amount of utility (b – t) to both outcomes. So by P2, we’re required to prefer X’’ over X, even though each transformation was supposed to leave us indifferent. (If b were less than t, we’d have subtracted something from both outcomes, and you’d be required to prefer X over X’’.)
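The two-step trade is easy to check numerically. A minimal Python sketch, with hypothetical values u(Puppies) = 9, u(Frogs) = 4, t = 1 and b = 2, and a made-up helper `trade`:

```python
def trade(lottery, worse, better, t, b):
    """Apply one P4 trade: the `worse` state loses t, the `better` state gains b."""
    out = dict(lottery)
    out[worse] -= t
    out[better] += b
    return out


x = {"H": 9, "T": 4}  # hypothetical u(Puppies), u(Frogs)
t, b = 1, 2           # hypothetical trade sizes, with b > t

x1 = trade(x, worse="H", better="T", t=t, b=b)   # X'
x2 = trade(x1, worse="T", better="H", t=t, b=b)  # X''

print(x1)  # {'H': 8, 'T': 6}
print(x2)  # {'H': 10, 'T': 5}

# Both states end up better than in X by exactly b - t.
print(all(x2[s] - x[s] == b - t for s in x))
```

With b = t the final check still holds, but the common shift is zero, so the clash with P2 disappears.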

(We can also think of this as an additional argument for intuitions along the lines of “you should be indifferent about saving -1 life on heads, and +1 on tails.” I.e., we can ask whether there’s at least some number n of lives we could save on tails, such that you always become indifferent to saving -1 on heads and +n on tails. But then, if n isn’t equal to 1, and you apply the same reasoning to -1 on tails, +n on heads, then we can make you indifferent to strict improvements/losses across both heads and tails.)

Equipped with b=t, together with P3, we can transform any lottery into a lottery with equiprobable states, where all of them yield the same utility. Thus, suppose u(Puppies) = 9, and u(Frogs) = 4, and we’re trying to assess the utility of {20% you get puppies, 80% you get frogs} (call this lottery R). Well, splitting it into equiprobable cases, and then successively adding/subtracting some constant unit of utility (e.g., our b and t – let’s use 1) across states, we get:

State	S1	S2	S3	S4	S5
Probability	20%	20%	20%	20%	20%
Original R	9	4	4	4	4
R’	8	5	4	4	4
R’’	7	5	5	4	4
R’’’	6	5	5	5	4
R’’’’	5	5	5	5	5

By P1, u(R’’’’) = 5. So because R’’’’ ~ R, u(R) = 5, too. And this will always be the average value across states: the sum of utilities, which is held constant by the shuffling, divided by the number of states. For a lottery with equally probable states, that average is just the expected utility: with n equally probable states, the probability of each state is 1/n, so U(R) = u(S1)/n + u(S2)/n + ….
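The leveling procedure in the table can be sketched as a short loop. This is an illustrative Python version, assuming integer utilities and a unit of 1; the function name `level` is hypothetical, and if the average isn’t a whole number of units you would need a finer unit to finish the job.

```python
def level(utilities, unit=1):
    """Move `unit` from the best state to the worst until no further
    whole-unit transfer would narrow the gap. Returns every intermediate row."""
    current = list(utilities)
    steps = [list(current)]
    while max(current) - min(current) >= 2 * unit:
        hi = current.index(max(current))  # first best-off state
        lo = current.index(min(current))  # first worst-off state
        current[hi] -= unit
        current[lo] += unit
        steps.append(list(current))
    return steps


r = [9, 4, 4, 4, 4]  # puppies in one 20% state, frogs in the other four
for row in level(r):
    print(row)
```

Each transfer keeps the sum at 25, so the loop can only end at the average, which is 5 here.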

Peterson’s argument is similar in various ways to the “separability” argument I gave in the previous section. To me, though, it feels weaker. In particular, it requires a very strong assumption about your utility function. Still, to people attracted to positing, up front, some “constant units” that they always value the same (for example, if they think that adding a sufficiently sad puppy always makes an outcome worse by the same amount, and adding a sufficiently happy puppy always makes an outcome better by that same amount), Peterson’s argument helps elucidate why expected utility maximization follows naturally.

OK: that’s a look at three different theorems/arguments that try to move from “XYZ constraints” to “EUM” (or, “representable as EUM”). Presumably I lost a ton of readers along the way – but hopefully there are a few out there who wanted, and were up for, this level of depth.

In each of these cases, though, we took the probability part of EUM for granted. In the next (and final) essay in this series, I’ll look at ways of justifying this part, too.

^[1]
By linking A1 and A2 via their mutual reference to (1, b, c), we know that:

A4. (0, b+1, c) ~ (0, b, c+1), for all b and c.

But because locations 2 and 3 are separable, we can put in any other number at location 1 in A4, such that:

A5. (a, b+1, c) ~ (a, b, c+1), for all a, b, and c.

But now set b = 0 in A5. This implies:

A6: (a, 1, c) ~ (a, 0, c+1) for all a and c.

But now, by linking A6 with A3, via their mutual reference to (a, 1, c), we get:

A7: (a+1, 0, c) ~ (a, 0, c+1), for all a and c.

And because locations 1 and 3 are separable from location 2, we substitute in “b” for “0”, and get:

A8: (a+1, b, c) ~ (a, b, c+1), for all a, b, and c

But now, equipped with A5 and A8, we’re really cooking with additivity gas. Notably, we now know that:

A9: (a+1, b, c) ~ (a, b+1, c) ~ (a, b, c+1), for all a, b, and c.

That is, we have a constant unit across locations.

4 comments


comment by tailcalled · 2022-03-22T07:35:23.141Z · LW(p) · GW(p)

Beyond this, though, refusing to compare stuff just looks, to me, unnecessarily “passive.” It looks like it’s giving up an opportunity for agency, for free. Recall that if you refuse to choose between two lotteries, God tosses them to his dog, Fido, and chooses the one that Fido drools on more. If you’re indifferent between the two, then OK, fine, let Fido choose. Incomparability, though, is not supposed to be indifference. But then: why are you letting Fido make the call? Why so passive? Why not choose for yourself?

Maybe you want to be aligned to/corrigible to Fido. In such a case, passivity seems exactly like what we want, so that you allow Fido to choose instead of imposing some unwanted will.

comment by TLW · 2022-03-27T04:33:05.574Z · LW(p) · GW(p)

3. Independence: A ≻ B if and only if ApC ≻ BpC.

Can we split this please?

3A) I find ApC ≻ BpC implies A ≻ B to be simple and obvious.

3B) I find A ≻ B implies ApC ≽ BpC to be simple and obvious.

3C) I find A ≻ B implies ApC ≩^[1] BpC to be complex and non-obvious.

(In particular, consider an agent that has a non-zero cost of calculating the relation between A and B. Then the optimum for said agent may be to return A ~ B even if A and B aren't precisely the same value, because the cost of calculating if A or B is truly better is higher than the utility loss by just saying A ~ B. As a concrete made-up example - if I know that , and that my cost of calculating which way around is 1, it's worth it to me to calculate and return A ≻ B (or B ≻ A, or A ~ B, as appropriate), but if, say, p=0.1, it's not worth it to me to do the calculation, and so I should return ApC ~ BpC.)

There's a similar split with axiom 4.

^[1]
I'm using this for clarity, though honestly this may have made this less clear. read: ≻ and very explicitly not ~.

comment by EOC (Equilibrate) · 2022-03-22T16:31:57.618Z · LW(p) · GW(p)

I think there's a typo in the last paragraph of Section I?

And let’s use “≻” to mean “better than” (or, “preferred to,” or “chosen over,” or whatever), “≺” to mean “at least as good as,”

"≺" should be "≽"


↑ comment by Joe Carlsmith (joekc) · 2022-03-22T20:22:53.123Z · LW(p) · GW(p)

Thanks! Fixed.

comment by Pattern · 2022-03-23T02:53:50.333Z · LW(p) · GW(p)

On expected utility, part 3: VNM, separability, and more

Contents

I. God’s lotteries

II. The four vNM axioms

III. The vNM theorem

IV. Separability implies additivity

V. From “separability implies additivity” to EUM

VI. Peterson’s “direct argument”

4 comments