On expected utility, part 1: Skyscrapers and madmen

post by Joe Carlsmith (joekc) · 2022-03-16T21:58:39.257Z · LW · GW · 16 comments

Contents

  I. Maximal skyscraper
  II. Only one shot
  III. Against “apparently the scary math says so?”
  IV. Is being an EUM-er trivial?

(Cross-posted from Hands and Cities)

    Summary:

Suppose that you’re trying to do something. Maybe: get a job, or pick a restaurant, or raise money to pay medical bills.

Some people think that unless you’re messing up in silly ways, you should be acting “as if” you’re maximizing expected utility – i.e., assigning consistent, real-numbered probabilities and utilities to the possible outcomes of your actions, and picking the action with the highest expected utility (the sum, across the action’s possible outcomes, of each outcome’s utility multiplied by its probability).

But in combination with plausible ethical views, expected utility maximization (EUM) can lead to a focus on lower-probability, higher-stakes events — a focus that can be emotionally difficult. For example, faced with a chance to save someone’s life for certain, it directs you to choose a 1% chance of saving 1000 lives instead – even though this choice will probably benefit no one. And EUM says to do this even for one-shot, or few-shot, choices – for example, choices about your career.

Why do this? The “quick argument” – namely, that EUM maximizes actual utility, given repeated choices and independent trials – isn’t enough for few-shot cases. And neither are appeals to “collective action” across EUM-ish people with your values.

Rather, I think the strongest arguments for EUM come from a cluster of related theorems, which say something like: your choices conform to certain attractive ideals of rationality if and only if you act like an EUM-er. But the theorems themselves often go unexplained. Informal discussions typically list some set of axioms, and then state the theorem, but they leave the proof, and the basic dynamics underlying it, as a black box.

Early on in my exposure to such theorems, I found this frustrating. If I was going to make big decisions (especially one-shot decisions) using EUM as a guiding ideal, I wanted to understand its rationale more deeply. But the full proofs are often lengthy and difficult.

This series of essays is an attempt to write the type of thing I would’ve loved to read, at that point in my life.

Exactly how much support these theorems give to EUM as a normative ideal is a further question, which I don’t try to tackle comprehensively (though collectively, I find them pretty compelling). And there are lots of further issues in this vicinity I don’t discuss. Regardless of whether you embrace EUM, though, I think engaging with these theorems at a level deeper than “apparently the math says blah” can result in a more visceral and clear-headed relationship with the moves and vibes that structure EUM-ish thinking. I’ve found such engagement useful, and I hope some readers will too.

Thanks to Katja Grace, Cate Hall, Petra Kosonen, Ketan Ramakrishnan, and especially to John Wentworth for discussion.

I. Maximal skyscraper

What does an expected utility maximizer (EUM-er) do? Three things:

  1. Assigns probabilities to the possible outcomes of their actions (call this “probabilism”).
  2. Assigns real-numbered “utilities” to these outcomes, representing something like their value/preferred-ness (call this “utility-ism”).
  3. Chooses the action that yields the highest expected utility – i.e., the sum, across the action’s possible outcomes, of each outcome’s utility multiplied by its probability.

Thus, suppose you can save (A) one life for certain, or (B) a thousand lives with a 1% chance. And suppose you value each life equally, such that saving a thousand lives is a thousand times better than saving one. EUM then says to choose B, because it saves ten lives in expectation, whereas A saves only one.
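To make that arithmetic concrete, here is a minimal sketch of the comparison (the utility scale, one unit per life saved, is just an illustrative assumption of mine):

```python
# Each action is a list of (probability, utility) pairs over its possible outcomes.
# Utilities here are "lives saved" - an assumed, linear-in-lives scale for illustration.
def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

A = [(1.00, 1)]                 # save one life for certain
B = [(0.01, 1000), (0.99, 0)]   # 1% chance of saving a thousand lives, else no one

print(expected_utility(A))  # 1.0
print(expected_utility(B))  # 10.0, so EUM picks B
```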

We can think of EUM via what I call a “skyscraper” model. Probabilities, recall, behave like the area of a space, like a 1×1 square (see here and here for more). Thus, in the choice above:

Utilities then behave like an extra dimension.

So each action gets a “city-scape” – that is, a probability square as the ground, and “skyscrapers” sprouting up from the different regions, with bases corresponding to the probability of the outcome in that region, and heights corresponding to that outcome’s utility. And EUM says to choose the action with the largest total volume of skyscraper. It’s a pro-housing vibe.

Thus, A is a very short skyscraper taking up the whole city. B is mostly an empty lot, but it’s got a very tall and thin skyscraper sitting in the corner. And this skyscraper is sufficiently tall that B actually has more total housing overall (the drawing above does not capture this well). So EUM chooses B.

(At times in this series, I’ll also use 2d visualizations, with probability on the horizontal axis, and utility on the vertical axis, where the expected utility is then the area of the 2d “housing.” This is often simpler and better – but I like the way the 3d picture can combine with visualizing probability as 2d, which I find generally useful in other contexts.)

II. Only one shot

OK, but: what’s supposed to be cool about this? In particular: it seems like B, here, probably does nothing. If you choose B over A, you predictably lose. At least a thousand people die, and you could’ve saved one of them for certain (throughout this series, I’ll use the term “losing” broadly, to mean “ending up in a dis-preferred state”). And if EUM results in predictably losing, why do it?

One argument is: EUM does well in the long run. In particular: it maximizes actual utility, if you make the same choice enough times. For example, if you choose B over A a zillion times (and the 1%, in B, is e.g. a new dice-roll each time), then you’re ~guaranteed to save ~10x the lives, relative to choosing A over B instead.

This is the “quick argument.” But it’s also the “not good enough” argument. For one thing: sometimes failures are correlated. If the 1% comes from saving 1000 lives if the n-th through (n+6)-th digits of pi are all odd (really, this would be ~0.8%), and they’re not – but you never get to find out whether you’re saving people or not – then you can choose B over and over, and never save anyone.
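As a rough illustration of both points – the long-run argument, and the way correlated failures break it – here is a small simulation (the setup and numbers are mine, purely for illustration):

```python
import random

TRIALS = 100_000

# Independent trials: each choice of B is a fresh 1% draw.
independent = sum(1000 if random.random() < 0.01 else 0 for _ in range(TRIALS))
print(independent / TRIALS)  # ~10 lives saved per choice, vs. exactly 1 for A

# Correlated trials: a single hidden fact (e.g. some digits of pi) settles every draw at once.
favorable = random.random() < 0.01
correlated = sum(1000 if favorable else 0 for _ in range(TRIALS))
print(correlated / TRIALS)   # 99% of the time this is 0 - repeating the choice gains nothing
```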

But also: suppose that you’re only choosing between A and B once. Suppose, indeed, that choosing between A and B is literally the only thing you will ever do. What then? EUM fans still say: B. But why? The quick argument is silent.

What’s more, this silence matters in real life. Consider careers. I’m in favor of applying EUM-ish reasoning to career choice. For example, I think it can be worth spending your whole career trying to prevent some form of existential catastrophe, even if the overall risk of that type of catastrophe is low, and all your work will probably end up irrelevant. The stakes are that large.

But careers are especially bad candidates for “if you repeat it enough times, you’ll eventually come out ahead.” You can only change careers so many times, and feedback about whether you’re in an “everything I’m doing is irrelevant” world doesn’t always come readily. And beyond this, you’ve only got one life – one single chance to do something in this world. You’ve got to make it count. But if you choose via EUM, then sometimes, it probably won’t.

Perhaps you say: “yes, but if everyone with your values does EUM, then you get to repeat the choice across people, rather than across time.” But this isn’t enough, either.

Pretty clearly, some other argument is needed.

III. Against “apparently the scary math says so?”

The type of argument I like best appeals to a cluster of related theorems pointing at EUM’s equivalence to satisfying various attractive constraints on rationality (often formulated as “axioms”). But I don’t like the way this argument is often left as a black box of scary math.

In particular, when I was first learning about EUM, it felt like people would often mention the theorems, without explaining how or why they work. Sometimes, they would get as far as listing and debating some set of axioms: but then they’d just move from discussing the axioms to stating the theorem, while skipping the proof. Indeed, it remains unclear to me how many fans of EUM have ever actually engaged with the proofs in question.

Perhaps you say: “But, if the proofs are valid, do you need to understand them?” And indeed, not necessarily. But I wanted to. In particular: it felt like EUM was asking a lot of me. It was asking me, for example, to predictably lose – to predictably let people die, to predictably waste my time, for the sake of … something. “Rationality?” Rationality is not an end in itself. What, then? If I was going to make few-shot, high-stakes, predictably-losing life choices on EUM-ish grounds, I wanted to get it “in my gut.” And I hoped that being able to see the flow of logic, from premises to conclusion, would help.

What’s more, absent deeper understanding, I felt some worry about internal coercion. If I left the theorems as black boxes, it felt easy to round them off to: “apparently the scary math I’ll never understand says that I have to do EUM, otherwise I’m silly and bad.” And this didn’t feel like a healthy or sustainable setup – especially if EUM was going to ask me to do lots of emotionally unrewarding things. Indeed, I wonder how many people currently relate to EUM this way – and about the costs of doing so.

Plus, there was this stuff about fanaticism (i.e., cases where EUM leads to obsession with tiny probabilities of very high stakes outcomes). It felt like people liked EUM until they didn’t. Some probabilities were “pascalian,” and some weren’t. 1%, I guess, wasn’t (I do think this is right). Why not? How do you tell the difference? Hand-wave, hand-wave, we don’t know. (I, at least, still don’t know. My current best guess is something about bounded utility functions, which you maybe want anyway [LW · GW], but I haven’t worked it out, and I don’t expect to like it.) Bit of a red flag? Maybe not a time to just sit pretty with the definitely-fine math (see also: infinite ethics). And maybe understanding it better could shed light (indeed: the proof of the VNM theorem I’ll present explicitly assumes bounded utility functions, which don’t lead to fanaticism – but it’s easy to not know that “blah proof assumes bounded utility functions,” if you’ve never actually looked at the proof in question).

(Also: some people think these theorems are relevant, or aren’t, to whether we should expect advanced AI systems to kill us all by default – see e.g. Omohundro (2008), Yudkowsky (1 [LW · GW], 2), Shah (2018) [AF · GW], Ngo (2019) [AF · GW], Grace (2021) [AF · GW]. I’m not going to delve into this topic, but I think it’s another reason to actually understand how the theorems work — and it’s part of what prompts my own interest.)

I’m hoping, here, to write the type of thing I would’ve loved to read, when I was first looking into EUM. I want to acknowledge, though, that the relevant audience might be correspondingly limited. In particular: there’s going to be a bit of (basic) math, but I’m also going to work through it slowly, informally, and with various simplifying assumptions. So I worry I’ll lose the math-phobic as soon as we hit the symbols, and bore/frustrate the math-fluent. So it goes. Hopefully, there are a few readers in my specific demographic, for whom it scratches the right itch.

I also want to acknowledge that I’m not speaking from a place of “I’ve gone through and really understood the original versions of all these proofs, including for infinite cases.” Often, indeed, the original version has been too long/intense for me. Rather, I’ve tried to find, understand, and present some relatively short and comprehensible explanation of the basic thing going on in at least some finite version of the proof. For me, at this point, it’s enough. Those who seek more depth, though, can consult the original sources, which I’ll link to (and for those who want a more comprehensive but still accessible introduction to EUM, I recommend Peterson (2017) – though he doesn’t include various of the theorems I’ll discuss).

IV. Is being an EUM-er trivial?

I want to head off one other objection before we dive in.

No one (sensible) thinks you should go around doing comprehensive, explicit EU calculations for every decision. Nor, indeed, do philosophical fans of EUM necessarily claim that ideally rational agents do this. Rather, they generally claim that ideally rational agents act like EUM-ers. That is, if you’re an ideally rational agent with respect to some set of alternatives A, it’s possible to construct a probability assignment p and a utility function u, such that you make the same choices about A that an EUM-er with u and p would make.

But now perhaps we wonder: is constructing such a p and u trivial?

The worry, then, is that representation is too cheap. A rock can be “represented” as preferring to sit exactly where it sits; a twitching madman, as maximally preferring his exact twitches; someone getting money-pumped trading fruit, as just valuing fruit differently depending on when it changes hands. I can represent anything as an EUM-er, or as violating EUM, if I try.

I do think there are tricky issues in this vicinity. But for a given physical system S, and a given set of alternatives A, we should take care to distinguish between:

  1. Does it make sense to think of S as having preferences over A, and if so, how do we tell what they are?
  2. For any given set of preferences over A, is it trivial to represent this set of preferences as maximizing expected utility?

(1) is hard, and I won’t tackle it here. But (2) is easy: the answer is no. For example, if your preferences over A are intransitive, they can’t be represented as maximizing expected utility: maybe you prefer puppies to flowers to mud to puppies, but there is no function u to the real numbers such that u(puppies) > u(flowers) > u(mud) > u(puppies). Indeed, this is the type of thing we learn from the theorems I’ll discuss.
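As a toy check of that point (my own sketch, not part of the theorems): a finite set of strict preferences has a real-valued utility representation exactly when the preference graph has no cycle – which the puppies/flowers/mud preferences do.

```python
# Strict preferences as directed edges: (x, y) means "x is strictly preferred to y".
prefs = [("puppies", "flowers"), ("flowers", "mud"), ("mud", "puppies")]

def has_utility_representation(prefs):
    # A u with u(x) > u(y) for every edge exists iff the graph has no directed
    # cycle (any topological order then yields such a u).
    items = {x for edge in prefs for x in edge}
    successors = {x: [] for x in items}
    for x, y in prefs:
        successors[x].append(y)

    visiting, done = set(), set()
    def on_cycle(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(on_cycle(nxt) for nxt in successors[node])
        visiting.discard(node)
        done.add(node)
        return found

    return not any(on_cycle(x) for x in items)

print(has_utility_representation(prefs))  # False: no such u exists
```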

The examples above get their force from not knowing how to answer (1). Presented with a given episode of real-world behavior in a physical system S, they observe that this episode is compatible, in principle, with some set of preferences over some set of alternatives, and some utility function representing those preferences, such that the episode is EU-maximizing (or not). And because they don’t know how to answer (1), they assume these preferences are viable candidates for S’s preferences.

Plausibly, though, (1) has an answer. A rock, for example, doesn’t have preferences. A twitching madman doesn’t, actually, maximally prefer a world where he twitches in exactly X way (intuition pump: if God took him to a realm “beyond the world,” and offered him a chance to create an “I twitch in exactly X way” world by pressing button A, or an “I am sane and healthy” world by pressing button B, he would not, consistently, press button A. Rather, he would twitch until he hit a random button). And presumably, the answer to (1) would allow us to evaluate whether your feelings about fruit are sensitive to their temporal location in some consistent way; or whether you do, in fact, prefer one life saved to a thousand.

But the theorems I’ll discuss aren’t about (1). They assume that you are a preference-haver, and that you are trying to define preferences over some set of alternatives. And they tell you that you’re representable as an EUM-er if and only if you define those preferences in XYZ attractive ways. And doing it in XYZ ways is not trivial at all. You can’t just randomize, or flail around. Rocks will have trouble. Madmen, too.

What’s more, in the real world, excuses that appeal to (1)-ish ambiguities seem like cold comfort – at least when I try to apply them to myself. Suppose I choose A over B above. I can say, if I want, that actually, I am an EUM-er; I just value one life saved more than a thousand. But do I? Or suppose I get money-pumped. I can say, if I want, that I actually just like it when fruit changes hands. But do I?

Maybe, if I say these things, I don’t have to label myself “irrational,” or my choice a “mistake.” Maybe I’ve blocked some sort of abstract insult. But if my values aren’t actually like this, am I coming out ahead? This was supposed to be about lives, or fruit. Who cares about abstract insults? That’s not the bullying to worry about. And anyway, if it’s not, actually, me who is EUM-ish, but rather some forced re-interpretation, am I, perhaps, still insulted?

Granted, figuring out your “true values” is tough and ambiguous stuff (see here for trickiness); and figuring out the “true values” of some other physical system, even more so. But even absent some nice theory, we shouldn’t mistake this trickiness for “anything goes,” or confuse it with the (false) claim that all preference relations over alternatives are representable as EU-maximizing.

In the next post, I start diving into substantive arguments in favor of EUM, starting with choices like A vs. B above.

16 comments

Comments sorted by top scores.

comment by Donald Hobson (donald-hobson) · 2022-03-17T19:09:34.020Z · LW(p) · GW(p)

Your use of the word loose is interesting.

Like suppose option A gets utility -47. And option B gets 99% chance of -48 and 1% of 952. 

You seem to define loosing as any outcome of utility below -47.5, and then ask why choose outcome B that usually looses. 

You could just as well define a loss as any state below 150 and say outcome B is obviously better because its the only state with any chance of winning.

Replies from: Measure
comment by Measure · 2022-03-21T02:39:16.561Z · LW(p) · GW(p)

Yes, but loose ≠ lose.

comment by aphyer · 2022-03-17T13:50:04.976Z · LW(p) · GW(p)

You are allowed to assign non-linear utilities to things. This is common in cases where we're considering money: Utility = Log(Money) is a common thing to hear. You don't need to discard a utilitarian framework to do that, you just need a utility function that drops off.

It's somewhat stranger to assign non-linear utilities to 'number of lives saved' over the range 1-1000, and the explanations I can think of for why one might do that are not entirely flattering.

Replies from: JBlack
comment by JBlack · 2022-03-19T01:19:43.746Z · LW(p) · GW(p)

Being unflattering isn't an objection to stating a truth. If my utility function does not correspond to a perfect moral standard, then I wish to know that my utility function does not correspond to a perfect moral standard. If I then so desire, I can attempt to change myself.

In practice, a lot of the utility of saving someone's life will be from the internal good feelings and the reputational benefits of having done it. Neither of those are anything like linear. There may be also an intellectual desire to do what ought to be done, but for almost all people concerning distant strangers it will usually be at most comparable to the immediate practical benefits of doing something. There's little point in trying to discuss utility maximization by assuming implicitly that utility = morality.

Much like many previous examples of decision theories, it seems to me that by introducing scenarios involving "saving lives" or "burning to death", writers are tangling any further thought and discussion up into emotive moral extremes instead of focusing on the actual topic.

comment by Dagon · 2022-03-17T16:00:42.875Z · LW(p) · GW(p)

I like this exploration, but I think you need to go a little deeper into the difference between utility and outcome.  You probably shouldn't entertain non-transitive preferences and still call it "rational" - that's provably inefficient.  But you really need to include non-linear valuation of world-states.  

It's perfectly consistent with rationality and utility maximization to value one life more than you value a distribution of 0 or 1000 that's mostly zeros.  This implies a great value to you of saving one person, and a lesser value for saving numbers 2-1000.  There's a LOT of cases where declining marginal value kicks in like this.  It's a common assumption that many things have logarithmic utility, and there's some interesting math about how to maximize expectation in that case - https://en.wikipedia.org/wiki/Kelly_criterion .

Note that the expected utility framework is often reverse-engineered, not forward-enforced.  If your actions imply a consistent utility function, then you are exhibiting the underlying axioms of rational decisions.  It's about consistency of decision, not about the internal process.  In that sense, the rock is rationally pursuing a preference to just sit there.  It never takes an action inconsistent with that utility curve.  See https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem for the axioms - transitivity is the most common one for humans to violate, showing their irrationality.

Side-note, not changing the main point: If the lives are close to fungible for you (they're distant strangers) and reputation effects don't come into play (nobody will find out about your decision), then it's hard to credit the implication that you care so much about the one that you'll let the expected 99 you could have saved die - that's probably scope insensitivity rather than reasoned valuation of outcomes.  One of the best outcomes of learning this topic well is the ability to recognize inconsistency in yourself, as areas of personal growth to improve.  You might improve by understanding your utility and valuation more clearly (you're imagining you know the one, and don't know the 999, or you're thinking about credit and how you're perceived, or many other factors).  Or by realizing that some of your intuitions are wrong, and re-training yourself, in order to get better at improving the state of the world (according to your preferences).

comment by tailcalled · 2022-03-17T13:42:41.275Z · LW(p) · GW(p)

In later posts, do you intend to address the question of what inputs the utility functions should be over, as well as whether path-dependence is irrational? To me these are the two big cruxes for expected utility maximization.

comment by AlexMennen · 2022-03-17T01:27:15.970Z · LW(p) · GW(p)

(Perhaps you were going to address this in a later post, but) the iterated decisions type of argument for EUM and the one-shot arguments like VNM seem not comparable to me in that they don't actually support the same conclusions. The iterated decision arguments tell you what your utility function should be (linear in amount of good things if future opportunities don't depend on past results; possibly nonlinear otherwise, as in the Kelly criterion), and the one-shot arguments importantly don't, instead simply concluding that there should exist some utility function accurately reflecting your preferences.

Replies from: Equilibrate
comment by Eric Chen (Equilibrate) · 2022-03-17T12:30:02.561Z · LW(p) · GW(p)

The 'iterated decisions'-type arguments support EUM in a given decision problem if you expect to face the exact same decision problem over and over again. The 'representation theorem' arguments support EUM for a given decision problem, without qualification. 

In either case, your utility function is meant to be constructed from your underlying preference relation over the set of alternatives for the given problem. The form of the function can be linear in some things or not, that's something to be determined by your preference relation and not the arguments for EUM.

Replies from: AlexMennen
comment by AlexMennen · 2022-03-17T19:25:57.380Z · LW(p) · GW(p)

In either case, your utility function is meant to be constructed from your underlying preference relation over the set of alternatives for the given problem. The form of the function can be linear in some things or not, that's something to be determined by your preference relation and not the arguments for EUM.

No, what I was trying to say is that this is true only for representation theorem arguments, but not for the iterated decisions type of argument.

Suppose your utility function is some monotonically increasing function of your eventual wealth. If you're facing a choice between some set of lotteries over monetary payouts, and you expect to face an extremely large number of i.i.d. iterations of this choice, then by the law of large numbers, you should pick the option with the highest expected monetary value each time, as this maximizes your actual eventual wealth (and thus your actual utility) with probability near 1.

Or suppose you expect to face an extremely large number of similarly-distributed opportunities to place bets at some given odds at whatever stakes you choose on each step, subject to the constraint that you can't bet more money than you have. Then the Kelly criterion says that if you choose the stakes that maximize your expected log wealth each time, this will maximize your eventual actual wealth (and thus your actual utility, since that's monotonically increasing with your eventual wealth) with probability near 1.

So, in the first case, we concluded that you should maximize a linear function of money, and in the second case, we concluded that you should maximize a logarithmic function of money, but in both cases, we assumed nothing about your preferences besides "more money is better", and the function you're told to maximize isn't necessarily your utility function as in the VNM representation theorem. The shape of the function you're told you should maximize comes from the assumptions behind the iteration, not from your actual preferences.
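(A quick numerical sketch of that contrast, with made-up numbers – a repeated even-odds bet that wins 60% of the time – just for illustration:)

```python
import random

def final_wealth(fraction, rounds=1000, p_win=0.6, seed=0):
    """Bet `fraction` of current wealth each round on an even-odds, 60%-win gamble."""
    rng = random.Random(seed)
    wealth = 1.0
    for _ in range(rounds):
        stake = fraction * wealth
        wealth += stake if rng.random() < p_win else -stake
    return wealth

kelly = 0.6 - 0.4           # f* = p - q/odds = 0.2 here: maximizes expected log wealth
print(final_wealth(kelly))  # typically grows by many orders of magnitude
print(final_wealth(1.0))    # maximizes expected wealth each round, but almost surely ends at 0
```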

Replies from: Equilibrate
comment by Eric Chen (Equilibrate) · 2022-03-18T13:56:05.334Z · LW(p) · GW(p)

Yeah, that's a good argument that if your utility is monotonically increasing in some good X (e.g. wealth), then the type of iterated decision you expect to face involving lotteries over that good can determine that the best way to maximize your utility is to maximize a particular function (e.g. linear) of that good.

But this is not what the 'iterated decisions' argument for EUM amounts to. In a sense, it's quite a bit less interesting. The 'iterated decisions' argument does not start with some weak assumption on your utility function and then attempts to impose more structure on your utility function in iterated choice situations. They don't assume anything about your utility function, other than that you have one (or can be represented as having one). 

All it's saying is that, if you expect to face arbitrarily many i.i.d. iterations of a choice among lotteries (i.e. known probability distributions) over outcomes that you have assigned utilities to already, you should pick the lottery that has the highest expected utility. Note, the utility assignments do not have to be linear or monotonically increasing in any particular feature of the outcomes (such as the amount of money I gain if that outcome obtains), and that the utility function is basically assumed to be there. 

Replies from: AlexMennen
comment by AlexMennen · 2022-03-19T03:56:41.335Z · LW(p) · GW(p)

Oh, are you talking about the kind of argument that starts from the assumption that your goal is to maximize a sum over time-steps of some function of what you get at that time-step? (This is, in fact, a strong assumption about the nature of the preferences involved, which representation theorems like VNM don't make.)

Replies from: Equilibrate
comment by Eric Chen (Equilibrate) · 2022-03-19T20:08:37.217Z · LW(p) · GW(p)

The assumption is that you want to maximize your actual utility. Then, if you expect to face arbitrarily many i.i.d. iterations of a choice among lotteries over outcomes with certain utilities, picking the lottery with the highest expected utility each time gives you the highest actual utility. 

It's really not that interesting of an argument, nor is it very compelling as a general argument for EUM. In practice, you will almost never face the exact same decision problem, with the same options, same outcomes, same probability, and same utilities, over and over again. 

Replies from: AlexMennen
comment by AlexMennen · 2022-03-19T20:32:44.086Z · LW(p) · GW(p)

Ah, I think that is what I was talking about. By "actual utility", you mean the sum over the utility of the outcome of each decision problem you face, right? What I was getting at is that your utility function splitting as a sum like this is an assumption about your preferences, not just about the relationship between the various decision problems you face.

Replies from: Equilibrate
comment by Eric Chen (Equilibrate) · 2022-03-22T00:44:01.686Z · LW(p) · GW(p)

Yeah, by "actual utility" I mean the sum of the utilities you get from the outcomes of each decision problem you face. You're right that if my utility function were defined over lifetime trajectories, then this would amount to quite a substantive assumption, i.e. the utility of each iteration contributes equally to the overall utility and what not. 

And I think I get what you mean now, and I agree that for the iterated decisions argument to be internally motivating for an agent, it does require stronger assumptions than the representation theorem arguments. In the standard 'iterated decisions' argument, my utility function is defined over outcomes which are the prizes in the lotteries that I choose from in each iterated decision. It simply underspecifies what my preferences over trajectories of decision problems might look like (or whether I even have one). In this sense, the 'iterated decisions' argument is not as self-contained as (i.e., requires stronger assumptions than) 'representation theorem' arguments, in the sense that representation theorems justify EUM entirely in reference to the agent's existing attitudes, whereas the 'iterated decisions' argument relies on external considerations that are not fixed by the attitudes of the agent.  

Does this get at the point you were making?

Replies from: AlexMennen
comment by AlexMennen · 2022-03-22T02:30:24.158Z · LW(p) · GW(p)

Yes, I think we're on the same page now.

comment by Tamsin Leake (carado-1) · 2024-03-31T06:14:21.947Z · LW(p) · GW(p)

Unlike on your blog, the images on the lesswrong version of this post are now broken.