# Why Universal Comparability of Utility?

post by AK · 2018-05-13T00:10:20.944Z · score: 27 (8 votes) · LW · GW · 16 comments

Apologies if this is answered elsewhere and I couldn't find it. In AI reading I come across an agent's utility function, $U$, mapping world-states to real numbers.

The existence of $U$ is justified by the VNM-utility theorem. The first axiom required for VNM utility is 'Completeness' -- in the context of AI this means for every pair of world-states, $A$ and $B$, the agent knows $A \prec B$, $B \prec A$, or $A \sim B$.

Completeness over world-states seems like a huge assumption. Every agent we make this assumption for must already have the tools to compare 'world where, all else equal, the only food is peach ice cream' v. 'world where, all else equal, Shakespeare never existed.'* I have no idea how I'd reliably make that comparison as a human, and that's a far cry from '~', being indifferent between the options.

Am I missing something that makes the completeness assumption reasonable? Is 'world-state' used loosely, referring to a point in a vastly smaller space, with the exact space never being specified? Essentially, I'm confused, can anyone help me out?

*if it's important I can try to cook up better-defined difficult comparisons. 'all else equal' is totally under-specified... where does the ice cream come from?

## 16 comments

Let's talk about why a VNM utility is useful in the first place. The first reason is prescriptive: if you don't have a VNM utility function, you risk being mugged by wandering Bayesians (similar to Dutch Book arguments). The second is descriptive: humans definitely aren't perfect VNM-rational agents, but it's very often a useful approximation. These two use-cases give different answers regarding the role of completeness.

First use-case: avoiding losing one's shirt to an unfriendly Bayesian, who I'll call Dr Malicious. The risk here is that, if we don't even have well-ordered preferences in some region of world-space, then Dr Malicious could push us into that region and then money-pump us. But this really only matters to the extent that someone might actually attempt to pull a Dr Malicious on us, and could feasibly push us into a region where we don't have well-ordered preferences. No one can feasibly push us into a world of peach ice-cream, and if they could, they'd probably have easier ways to make money than money-pumping us.
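
A minimal sketch of what such a money pump could look like, assuming a hypothetical agent whose strict preferences form a cycle (A below B, B below C, C below A) and who will pay a small fee to swap up. The item names, fee, and function name are illustrative assumptions, not anything from the thread.

```python
# Hypothetical agent with cyclic strict preferences.
# Each pair (held, offered) means the agent strictly prefers `offered` to `held`.
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}


def money_pump(start_item, start_money, fee, rounds):
    """Each round, offer the agent an item it prefers to its current one, for a fee."""
    item, money = start_item, start_money
    for _ in range(rounds):
        # Items the agent strictly prefers to what it currently holds.
        offers = [offered for (held, offered) in PREFERS if held == item]
        if not offers or money < fee:
            break  # nothing the agent would pay for, or it is out of money
        item = offers[0]
        money -= fee
    return item, money


# Six trades take the agent twice around the cycle.
final_item, final_money = money_pump("A", start_money=10, fee=1, rounds=6)
print(final_item, final_money)
```

The agent ends up holding the item it started with and is poorer by six fees. Note that this sketch exploits a cycle in the preferences; it says nothing about what happens where preferences are merely missing.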

Second use-case: prediction based on approximate-VNM. Just like the first use-case, completeness really only matters over regions of world-space likely to come up in the problem at hand. If someone has no implicit utility outside that region, it usually won't matter for our predictions.

So to close: this is an instance of spherical cow in a vacuum. In general, the spherical-cow-vacuum assumption is useful right up until it isn't. Use common sense, remember that the real world does not perfectly follow the math, but the math is still really useful. You can add in corrections if and when you need them.

I'm not sure about the first case:

> if you don't have a VNM utility function, you risk being mugged by wandering Bayesians

I don't see why this is true. While "VNM utility function => safe from wandering Bayesians", it's not clear to me that "no VNM utility function => vulnerable to wandering Bayesians." I think the vulnerability to wandering Bayesians comes from failing to satisfy Transitivity rather than failing to satisfy Completeness. I have not done the math on that.

But the general point, about approximation, I like. Utility functions in game theory (decision theory?) problems normally involve only a small space. I think completeness is an entirely safe assumption when talking about humans deciding which route to take to their destination, or what bets to make in a specified game. My question comes from the use of VNM utility in AI papers like this one: http://intelligence.org/files/FormalizingConvergentGoals.pdf, where agents have a utility function over possible states of the universe (with the restriction that the space is finite).

Is the assumption that an AGI reasoning about universe-states has a utility function an example of reasonable use, for you?

Your intuition about transitivity being the key requirement is a good intuition. Completeness is more of a model foundation; we need completeness in order to even have preferences which can be transitive in the first place. A failure of completeness would mean that there "aren't preferences" in some region of world-space. In practice, that's probably a failure of the model - if the real system is offered a choice, it's going to do *something*, even if that something amounts to really weird implied preferences.

So when I talk about Dr Malicious pushing us into a region without ordered preferences, that's what I'm talking about. Even if our model contains no preferences in some region, we're still going to have some actual behavior in that region. Unless that behavior implies ordered preferences, it's going to be exploitable.

As for AIs reasoning about universe-states...

First, remember that there's no rule saying that the utility must depend on all of the state variables. I don't care about the exact position of every molecule in my ice cream, and that's fine. Your universe can be defined by an infinite-dimensional state vector, and your AI can be indifferent to all but the first five variables. That's fine.
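
As a rough sketch of that point, here is a utility function over a long state vector that only looks at the first five entries. The weights and the state values are made-up assumptions for illustration; a finite list stands in for the "infinite-dimensional" vector.

```python
from typing import Sequence

# Hypothetical weights: the agent cares about only the first five state variables.
WEIGHTS = (1.0, 0.5, 0.0, -2.0, 3.0)


def utility(state: Sequence[float]) -> float:
    """Utility depends only on the first len(WEIGHTS) entries of the state."""
    # zip stops at the shorter sequence, so every later entry is ignored.
    return sum(w * x for w, x in zip(WEIGHTS, state))


# Two states that differ only in variables the agent does not care about.
state_1 = [1.0, 2.0, 3.0, 4.0, 5.0] + [0.0] * 1000
state_2 = [1.0, 2.0, 3.0, 4.0, 5.0] + [7.0] * 1000

print(utility(state_1), utility(state_2))
```

Because every state maps to a real number, any two states can be compared, and the two states above come out exactly indifferent.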

Other than that, the above comments on completeness still apply. Faced with a choice, the AI is going to do *something*. Unless its behavior implies ordered preferences, it's going to be exploitable, at least when faced with those kinds of choices. And as long as that exploitability is there, Dr Malicious will have an incentive to push the AI into the region where completeness fails. But if the AI has ordered preferences in all scenarios, Dr Malicious won't have any reason to develop peach-ice-cream-destroying nanobots, and we probably just won't need to worry about it.

> if you don’t have a VNM utility function, you risk being mugged by wandering Bayesians (similar to Dutch Book arguments)

Ok, but since you don’t *actually* risk being mugged by wandering Bayesians, this isn’t *actually* a problem, right? (And I don’t mean this in a trivial “I live in a nice neighborhood and the police would save me” sense, but in a deep, decision-theoretic sense.)

> prediction based on approximate-VNM

Could you give some examples of cases where this has actually been used?

The approximation of VNM rationality is foundational to most of economics. The whole field is basically "hey, what happens if you stick together VNM agents with different utility functions, information and resource baskets?". So pretty much any successful prediction of economics is an example of humans approximating VNM-rational behavior. This includes really basic things like "prices increase when supply is expected to decrease". If people lacked (approximate) utility functions, then prices wouldn't increase (we'd just trade things in circles). If people weren't taking the expectation of that utility function, then the mere expectation of shortage wouldn't increase prices.

This is the sort of thing you need VNM utility for: it's the underlying reason for lots of simple, everyday things. People pursue goals, despite having imperfect information about their environment - that's VNM utility at work. Yes, people violate the math in many corner cases, but this is remarkable precisely because people *do* approximate VNM pretty well most of the time. Violations of transitivity, for instance, require fairly unusual conditions.

As for the risk of mugging, there are situations where you will definitely be money-pumped for violating VNM - think Wall Street or Vegas. In those situations, it's either really cheap to money-pump someone (Wall Street), or lots of people are violating VNM (Vegas). In most day-to-day life, it's not worth the effort to go hunting for people with inconsistent preferences or poor probability skills. Even if you found someone, they'd catch onto your money-pumping pretty quick, at which point they'd update to better approximate VNM rationality. Since it's not profitable, people don't usually do it. But as Wall Street and Vegas suggest, if a supply of VNM irrationality can be exploited with reasonable payoff-to-effort, people will exploit it.

Your example reminds me of an old Eliezer quote:

> "Would you kill Santa Claus or the Easter Bunny?" is an important question if and only if you have trouble deciding. I'd definitely kill the Easter Bunny, by the way, so I don't think it's an important question.

In this case, I think I'd kill Shakespeare, for the simple reason that however important, Shakespeare is just one writer, whereas all foods except for peach ice cream are, well, almost all foods.

(I guess this comment is meant to illustrate that it is possible to compare changes that seem different in kind from each other, even if it doesn't seem so at first. Of course, there are still pairs of changes that it is very difficult to compare, e.g. losing Shakespeare vs X dust specks, for some suitably chosen value of X.)

Moved to frontpage.

Although I don't expect mods to have the time to articulate every frontpage/non-frontpage decision, I wanted to highlight here that I really appreciated this post's brevity and the sense of having "done the homework": making an effort to get up to speed before questioning some common assumptions in the rationalsphere, and phrasing the objections and questions in a fashion that evoked more curiosity than argumentativeness. I've seen some similar posts lean toward argumentativeness, and I did not move those to frontpage.

The VNM theorem is best understood as an operator that applies to a preference function $\prec$ that obeys the axioms and rewrites that function in the form $A \prec B \iff E[U(A)] < E[U(B)]$, where $U$ is the resulting "utility function" producing a real number. So it rewrites your function into one that compares "expected utilities".

To apply this to something in the real world, a human or an AI, one must decide exactly what $\prec$ refers to and how $A$ and $B$ are interpreted.

- We can interpret $\prec$ as the actual revealed choices of the agent. I.e., when put in a position to take action to cause either $A$ or $B$ to happen, what do they do? If the agent's thinking doesn't terminate (within the allotted time), or it chooses randomly, we can interpret that as $A \sim B$. The possibilities are fully enumerated, so completeness holds. However, you will find that any real agent fails to obey some of the other axioms.
- We can interpret $\prec$ as the expressed preferences of the agent. That is to say, present the hypothetical and ask what the agent prefers. Then we say that $A \prec B$ if the agent says they prefer $B$; we say that $B \prec A$ if the agent says they prefer $A$; and we say that $A \sim B$ if the agent says they are equal or can't decide (within the allotted time). Again completeness holds, but you will again always find that some of the other axioms will fail.
- In the case of humans, we can interpret $\prec$ as some extrapolated volition of a particular human. In which case we say that $A \prec B$ if the person *would* choose $B$ if only they thought faster, knew more, were smarter, were more the person they wished they would be, etc. One might fancifully describe this as defining $\prec$ as the person's "true preferences". This is not a practical interpretation, since we don't know how to compute extrapolated volition in the general case. But it's perfectly mathematically valid, and it's not hard to see how it could be defined so that completeness holds. It's plausible that the other axioms could hold too -- most people consider the rationality axioms generally desirable to conform to, so "more the person they wished they would be" plausibly points in a direction that results in such rationality.
- For some AIs whose source code we have access to, we might be able to just read the source code and define $\prec$ using the actual code that computes preferences.

There are a lot of variables here. One could interpret the domain of $\prec$ as being a restricted set of lotteries. This is the likely interpretation in something like a psychology experiment where we are constrained to only asking about different flavours of ice cream or something. In that case the resulting utility function will only be valid in this particular restricted domain.
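
To make the "compares expected utilities" form concrete, here is a small sketch over a restricted domain of ice-cream lotteries. The flavours, utility values, and function names are assumptions chosen for illustration; the comparison rule is the standard one, where one lottery is ranked below another exactly when its expected utility is smaller.

```python
# Hypothetical utility function, defined only on a restricted set of outcomes.
UTILITY = {"vanilla": 0.0, "chocolate": 1.0, "peach": 0.25}


def expected_utility(lottery, utility):
    """lottery maps each outcome to its probability; probabilities sum to 1."""
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())


def compare(lottery_a, lottery_b, utility):
    """Return '<', '>' or '~' according to the expected utilities of a and b."""
    ea = expected_utility(lottery_a, utility)
    eb = expected_utility(lottery_b, utility)
    if ea < eb:
        return "<"
    if ea > eb:
        return ">"
    return "~"


sure_peach = {"peach": 1.0}
coin_flip = {"chocolate": 0.5, "vanilla": 0.5}

print(compare(sure_peach, coin_flip, UTILITY))
```

Any lottery that mentions an outcome outside `UTILITY` raises a `KeyError`, which is one way of seeing that the function is only valid on the restricted domain it was built for.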

I was going to say the same thing as the first bullet point here -- you can interpret the preference ordering as "If you were to give the agent two buttons that could cause world state 1 and world state 2 respectively, which would it choose?" (Indifference could be modeled as a third button which chooses randomly.) This gives you a definition of the full preference ordering which is complete by construction.
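
A minimal sketch of the button construction, assuming a made-up agent whose choices happen to be cyclic. The relation read off from its button presses is complete by construction, since every pair gets one of three answers, but nothing in the construction guarantees transitivity. All names here are hypothetical.

```python
from itertools import combinations

STATES = ["rock", "paper", "scissors"]  # hypothetical world-states

# Hypothetical choice behaviour: (x, y) means "offered x and y, the agent picks x".
PICKS = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}


def press_button(a, b):
    """Return the state the agent chooses, or None if it presses the 'random' button."""
    if (a, b) in PICKS:
        return a
    if (b, a) in PICKS:
        return b
    return None


def relation(a, b):
    """Read a preference off the observed choice: '>', '<' or '~'."""
    choice = press_button(a, b)
    if choice is None:
        return "~"
    return ">" if choice == a else "<"


def is_complete(states):
    # Every pair receives exactly one of the three labels, so this always holds.
    return all(relation(a, b) in {"<", ">", "~"} for a, b in combinations(states, 2))


def is_transitive(states):
    def strictly(a, b):
        return relation(a, b) == ">"

    return all(
        strictly(a, c) or not (strictly(a, b) and strictly(b, c))
        for a in states for b in states for c in states
    )


print(is_complete(STATES), is_transitive(STATES))
```

For this particular agent, completeness holds and transitivity fails, which is the pattern the revealed-choice reading permits.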

In practice, you only need to have utilities over world states you actually have to decide between, but I think the VNM utility theorem will apply in the same way to the world states which you actually care about.

Thanks for this response. On notation: I want world-states, $A$, to be specific outcomes rather than random variables. As such, $U(A)$ is a real number, and the expectation of a real number could only be defined as itself: $E[U(A)] = U(A)$ in all cases. I left aside all the discussion of 'lotteries' in the VNM Wikipedia article, though maybe I ought not have done so.

I think your first two bullet points are wrong. We can't reasonably interpret ~ as 'the agent's thinking doesn't terminate'. ~ refers to indifference between two options, so if $A \prec B$ and $B \sim C$, then $A \prec C$. Equating 'unable to decide between two options' and 'two options are equally preferable' will lead to a contradiction or a trivial case when combined with transitivity. I can cook up something more explicit if you'd like?
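
One possible way to make this explicit, offered as a sketch under assumptions rather than a proof: take a hypothetical agent that strictly prefers `A_plus` to `A` (say, the same outcome plus a dollar) and cannot decide between either of them and some unrelated outcome `B`. If "cannot decide" is recorded as ~, then ~ stops being transitive.

```python
# Hypothetical agent: the only strict preference it has is A below A_plus.
STRICT = {("A", "A_plus")}  # (x, y) means y is strictly preferred to x
STATES = ["A", "A_plus", "B"]


def strictly_less(x, y):
    return (x, y) in STRICT


def indifferent(x, y):
    """The disputed reading: any pair the agent cannot rank is recorded as '~'."""
    return not strictly_less(x, y) and not strictly_less(y, x)


def indifference_violations(states):
    """Triples (x, y, z) with x ~ y and y ~ z, but not x ~ z."""
    return [
        (x, y, z)
        for x in states for y in states for z in states
        if indifferent(x, y) and indifferent(y, z) and not indifferent(x, z)
    ]


violations = indifference_violations(STATES)
print(violations)
```

Under this reading the agent has `A ~ B` and `B ~ A_plus`, yet it is not indifferent between `A` and `A_plus`, so treating undecidability as indifference conflicts with transitivity for this agent.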

There's a similar problem with ~ meaning 'the agent chooses randomly', provided the random choice isn't prompted by equality of preferences.

This comment has sharpened my thinking, and it would be good for me to directly prove my claims above -- will edit if I get there.

Yes, you’re missing something, but it’s not what you think.

Actual humans do not conform to the VNM axioms. You correctly point out one serious problem with axiom 1 (completeness); there are also serious problems with axiom 3 (continuity) and axiom 4 (independence). (It also happens to be an *empirical* fact that many humans violate axiom 2 (transitivity), but there’s much less reason to believe that such violations are anything but irrational, by any reasonable standard of rationality.)

Oskar Morgenstern himself regarded the VNM theorem as an interesting mathematical result with little practical application in the real world, much less application to *actual humans*. This is also my own view, and that of others I’ve spoken to. It is, in any case, not at *all* obvious that conformance to the axioms is mandatory for any rational agent (as is sometimes claimed).

The assumption which I advise you to discard is that the VNM theorem is either *descriptive* of real (or realistic) agents, or that it’s *prescriptive* for “rational” agents. It is neither; rather, it’s a precise mathematical result, and no more. Anything more we make of it requires importing additional assumptions, which the theorem certainly does not dictate.

Downvoting, because it is prescriptive, and the comment doesn't even bother to argue why it wouldn't be. VNM utility generalizes both the Dutch Book arguments and deterministic utility, and similar arguments apply.

> doesn’t even bother to argue why it wouldn’t be

Why *would* it be?

My comment was intended to be informative, not to offer a proof of anything. The idea that one has to *argue for* everything one says in all comments is, frankly, bizarre. (If you disagree, fair enough, you can challenge my statement and ask for support, etc., but “you didn’t argue for [thing X which you said]” is a very strange objection in this case.)

> Dutch Book arguments

If you’re familiar with Dutch Book arguments, then no doubt you’re also familiar with all the reasons why Dutch Book arguments fail to apply in the real world except in certain specific circumstances, and with the non-trivial assumptions that are required in order for the arguments to carry forth.

> deterministic utility

I’m not sure I get this reference. Mind giving a link, at least to point to the sort of thing you have in mind?

I think....

The strategy is to first assume completeness, then adjust for limited, incomplete, unknowable, or possibly wrong information at a different point in the calculation.

But you have to start from somewhere.

Tangential -- in that this is going to do absolutely zero more to justify the completeness assumption than anything else you've read -- but this seems like a good place to point out that utility functions can also be grounded in (and IMO, are better grounded in) Savage's theorem, in addition to the VNM theorem.

Ahead of time, you can't really tell precisely what problems you'll be faced with - reality is allowed to throw pretty much anything at you. It's a useful property, then, if decisions are possible to make in all situations, so you can guarantee that e.g. new physics won't throw you into undefined behavior.