Expected utility without the independence axiom

post by Stuart_Armstrong · 2009-10-28T14:40:08.464Z · LW · GW · Legacy · 68 comments

John von Neumann and Oskar Morgenstern developed a system of four axioms that they claimed any rational decision maker must follow. The major consequence of these axioms is that when faced with a decision, you should always act solely to increase your expected utility. All four axioms have been attacked at various times and from various directions; but three of them are very solid. The fourth - independence - is the most controversial.

To understand the axioms, let A, B and C be lotteries - processes that result in different outcomes, positive or negative, with a certain probability of each. For 0<p<1, the mixed lottery pA + (1-p)B implies that you have a probability p of being in lottery A, and a probability (1-p) of being in lottery B. Writing A>B means that you prefer lottery A to lottery B, A<B is the reverse, and A=B means that you are indifferent between the two. Then the von Neumann-Morgenstern axioms are:

1. Completeness: for any lotteries A and B, exactly one of A<B, A=B or A>B holds.
2. Transitivity: if A<B and B<C, then A<C (and similarly with = in place of <).
3. Continuity: if A<B<C, then there exists a probability p such that pA + (1-p)C = B.
4. Independence: A<B if and only if pA + (1-p)C < pB + (1-p)C, for any lottery C and any 0<p<1.
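
As a concrete sketch of this setup, here is a minimal Python illustration of lotteries and the mixing operation (the Lottery class and mix function are illustrative names of mine, not part of the formalism):

    import random

    class Lottery:
        """A finite probability distribution over payoffs."""
        def __init__(self, outcomes):
            # outcomes: list of (probability, payoff) pairs summing to 1
            assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
            self.outcomes = outcomes

        def sample(self):
            r, acc = random.random(), 0.0
            for p, payoff in self.outcomes:
                acc += p
                if r < acc:
                    return payoff
            return self.outcomes[-1][1]

    def mix(p, a, b):
        """The mixed lottery pA + (1-p)B: probability p of facing A, else B."""
        return Lottery([(p * q, x) for q, x in a.outcomes]
                       + [((1 - p) * q, x) for q, x in b.outcomes])

    # a 50/50 mix of "£1 for certain" and "nothing for certain":
    print(mix(0.5, Lottery([(1.0, 1)]), Lottery([(1.0, 0)])).sample())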

In this post, I'll try and prove that even without the Independence axiom, you should continue to use expected utility in most situations. This requires some mild extra conditions, of course. The problem is that although these conditions are considerably weaker than Independence, they are harder to phrase. So please bear with me here.

The whole insight in this post rests on the fact that a lottery that has 99.999% chance of giving you £1 is very close to being a lottery that gives you £1 with certainty. I want to express this fact by looking at the narrowness of the probability distribution, using the standard deviation. However, this narrowness is not an intrinsic property of the distribution, but depends on our utility function. Even in the example above, if I decide that receiving £1 gives me a utility of one, while receiving zero gives me a utility of minus ten billion, then I no longer have a narrow distribution, but a wide one. So, unlike the traditional set-up, we have to assume a utility function as being given. Once this is chosen, we can talk about the mean and standard deviation of a lottery.
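
To see the point numerically, here is a small sketch using the example above (mean_sd and the two utility functions are illustrative choices of mine):

    def mean_sd(outcomes, u):
        # outcomes: list of (probability, payoff) pairs; u: the chosen utility
        mean = sum(p * u(x) for p, x in outcomes)
        var = sum(p * (u(x) - mean) ** 2 for p, x in outcomes)
        return mean, var ** 0.5

    near_certain = [(0.99999, 1), (0.00001, 0)]    # 99.999% chance of £1

    print(mean_sd(near_certain, lambda x: x))
    # (0.99999, ~0.003): a very narrow distribution

    print(mean_sd(near_certain, lambda x: 1 if x == 1 else -1e10))
    # (~-99999, ~3.2e7): the same lottery, with utility of zero set to
    # minus ten billion - now a very wide distribution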

Then if you define c(μ) as the lottery giving you a certain return of μ, you can use the following axiom instead of independence:

Standard deviation bound: for every ε>0, there exists a δ>0 such that any lottery with mean μ>0 and standard deviation less than δμ is preferred to c((1-ε)μ).

This seems complicated, but all that it says, in mathematical terms, is that if we have a probability distribution that is "narrow enough" around its mean μ, then we should value it as being very close to a certain return of μ. The narrowness is expressed in terms of its standard deviation - a lottery with zero SD is a guaranteed return of μ, and as the SD gets larger, the distribution gets wider, and the chances of getting values far away from μ increase. So risk, in other words, scales (approximately) with the SD.

We also need to make sure that we are not risk loving - if we are inveterate gamblers for the sake of being gamblers, our behaviour may be a lot more complicated:

Not risk loving: a lottery with mean μ is never preferred to a certain return of μ or more; that is, it is never better than c(μ') for any μ' ≥ μ.

I.e. we don't love a worse rate of return just because of the risk. This axiom can and maybe should be weakened, but it's a good approximation for the moment - most people are not risk loving with huge risks.

Theorem: Assume you are going to have to choose n different times whether to accept independent lotteries with fixed mean β>0, and all with SD less than a fixed upper bound K. Then if you are not risk loving and n is large enough, you must accept an arbitrarily large proportion of the lotteries.

Proof: From now on, I'll use a different convention for adding and scaling lotteries. Treating them as random variables, A+B will mean the lottery consisting of A and B together, while xA will mean the same lottery as A, but with all returns (positive or negative) scaled by x.

Let X1, X2, ... , Xn be these n independent lotteries, with means β and variances vj. Then since the standard deviations are less than K, the variances must be less than K².

Let Y = X1 + X2 + ... + Xn. The mean of Y is nβ. The variance of Y is the sum of the vj, which is less than nK². Hence the SD of Y is less than K√(n). Now pick an ε>0, and the resulting δ>0 from the standard deviation bound axiom. For large enough n, nβδ must be larger than K√(n); hence, for large enough n, Y > c((1-ε)nβ). Now, if we were to refuse more than εn of the lotteries, we would be left with a distribution with mean ≤ (1-ε)nβ, which, since we are not risk loving, is worse than c((1-ε)nβ), which is worse than Y. Hence we must accept more than a proportion (1-ε) of the lotteries on offer.
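
As a quick numerical check of the key step (with assumed values β = 1 and K = 10; any positive values show the same pattern):

    import math

    beta, K = 1.0, 10.0    # assumed per-lottery mean and SD bound
    for n in (100, 10_000, 1_000_000):
        # bound on SD(Y)/mean(Y), namely K*sqrt(n)/(n*beta)
        print(n, K * math.sqrt(n) / (n * beta))
    # 100 -> 1.0, 10000 -> 0.1, 1000000 -> 0.01: the ratio shrinks like
    # 1/sqrt(n), so it eventually falls below the δ given by the axiom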

This only applies to lotteries that share the same mean, but we can generalise the result as:

Theorem: Assume you are going to have to choose n different times whether to accept independent lotteries all with means greater than a fixed β>0, and all with SD less than a fixed upper bound K. Then if you are not risk loving and n is large enough, you must accept lotteries whose means represent an arbitrarily large proportion of the total mean of all lotteries on offer.

Proof: The same proof works as before, with nβ now being a lower bound on the true mean μ of Y. Thus we get Y > c((1-ε)μ), and we must accept lotteries whose total mean is greater than (1-ε)μ.

 

Analysis: Since we rejected independence, we must now consider the lotteries when taken as a whole, rather than just seeing them individually. When considered as a whole, "reasonable" lotteries are more tightly bunched around their total mean than they are individually. Hence the more lotteries we consider, the more we should treat them as if only their mean mattered. So if we are not risk loving, and expect to meet many lotteries with bounded SD in our lives, we should follow expected utility. Deprived of independence, expected utility sneaks in via aggregation.

Note: This restates the first half of my previous post - a post so confusingly written it should be staked through the heart and left to die on a crossroad at noon.

Edit: Rewrote a part to emphasise the fact that a utility function needs to be chosen in advance - thanks to Peter de Blanc and Nick Hay for bringing this up.

 

68 comments


comment by SilasBarta · 2009-10-28T15:36:29.438Z · LW(p) · GW(p)

Since we rejected independence, we must now consider the lotteries when taken as a whole, rather than just seeing them individually. When considered as a whole, "reasonable" lotteries are more tightly bunched around their total mean than they are individually. Hence the more lotteries we consider, the more we should treat them as if only their mean mattered.

You are absolutely correct, and it pains me because this issue should have been settled a long time ago.

When Eliezer Yudkowsky first brought up the breakdown of independence in humans, way, way back during the discussion of the Allais Paradox, the poster "Gray Area" explained why people aren't being money-pumped, even though they violate independence. He/she came to the same conclusion in the quote above.

Here's what Gray Area said back then:

Finally, the 'money pump' argument fails because you are changing the rules of the game. The original question was, I assume, asking whether you would play the game once, whereas you would presumably iterate the money pump until the pennies turn into millions. The problem, though, is if you asked people to make the original choices a million times, they would, correctly, maximize expectations. Because when you are talking about a million tries, expectations are the appropriate framework. When you are talking about 1 try, they are not. [bold added]

I didn't see anyone even reply to Gray Area anywhere in that series, or anytime since.

So I bring up essentially the same point whenever Eliezer uses the Allais result, always concluding with a zinger like: If getting lottery tickets is being exploited, I don't want to be empowered.

Please, folks, stop equating a hypothetical money pump with the actual scenario.

Replies from: alyssavance, Psychohistorian, Stuart_Armstrong
comment by alyssavance · 2009-10-28T20:17:39.431Z · LW(p) · GW(p)

The Allais Paradox is not about risk aversion or lack thereof; it's about people's decisions being inconsistent. There are definitely situations in which you would want to choose a 50% chance of $1M over a 10% chance of $10M. However, if you would do so, you should also then choose a 5% chance of $1M over a 1% chance of $10M, because the relative risk is the same. See Eliezer's followup post, Zut Allais.

Turning a person into a money pump also isn't about playing the same gamble a zillion times (as any good investor will tell you, if you play the gamble a zillion times, all the risk disappears and you're left with only expected return, which leaves you with a different problem). The money pump works thusly: I sell you gamble A for $5. You then trade with me gamble A for gamble B. You then sell me back gamble B for $4. I then sell you gamble A for $5... wash, rinse, repeat. Nowhere in the cycle is either gamble actually paid out.
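
Spelled out as a toy ledger (the $100 starting balance is an assumption of mine; the $5 and $4 prices are from the cycle above):

    wallet = 100.0              # assumed starting cash, purely illustrative
    for cycle in range(3):
        wallet -= 5             # buy gamble A for $5
                                # trade A for B (B is preferred, so accepted)
        wallet += 4             # sell gamble B back for $4
        print(f"after cycle {cycle + 1}: ${wallet:.0f}")
    # after cycle 3: $97 - three dollars gone, and no gamble ever paid out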

Replies from: SilasBarta
comment by SilasBarta · 2009-10-28T20:34:51.062Z · LW(p) · GW(p)

Are you sure you're responding to the right person here?

1) I wasn't claiming that Allais is about risk aversion.

2) I was claiming it doesn't show an inconsistency (and IMO succeeded).

3) I did read Zut Allais, and the other Allais article with the other ridiculous French pun, and it wasn't responsive to the point that Gray Area raised. (You may note that a strapping lad named "Silas" even noted this at the time.)

However, if you would do so, you should also then choose

4) You cannot substantiate the charge that you should do the latter if you did the former, since no negative consequence actually results from violating that "should" in the one-shot case. You know, the one people were actually tested on.

ETA: (I think the second paragraph was just added in tommccabe's post.)

Turning a person into a money pump also isn't about playing the same gamble a zillion times.

My point never hinged on it being otherwise.

The money pump works thusly: I sell you gamble A for $5. You then trade with me gamble A for gamble B. You then sell me back gamble B for $4. I then sell you gamble A for $5... wash, rinse, repeat. Nowhere in the cycle is either gamble actually paid out.

Okay, and where in the Allais experiment did it permit any of those exchanges to happen? Right, nowhere.

Believe it or not, when I say, "I prefer B to A", it doesn't mean "I hereby legally obligate myself to redeem on demand any B for an A", yet your money pump requires that.

Replies from: RobinZ, alyssavance
comment by RobinZ · 2009-10-28T20:40:53.559Z · LW(p) · GW(p)

The problem is that you're losing money doing it once. You would agree that c(0) > c(-2), yes? If they are willing to trade A for B in a one-shot game, they shouldn't be willing to pay more for A than for B in a one-shot - you don't trade the more valuable item for the less valuable. That their preferences may reverse in the iterated situation has no bearing on the Allais problem.

Edit: The text above following the question mark is incorrect. See my later comment quoting Eliezer for the correct statement.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-28T20:50:29.735Z · LW(p) · GW(p)

The problem is that you're losing money doing it once.

Again, if suddenly being offered the choice of 1A/1B then 2A/2B as described here, but being "inconsistent", is what you call "losing money", then I don't want to gain money!

If they are willing to trade A for B in a one-shot game, they shouldn't be willing to pay more for A than for B in a one-shot

But that's not what's happening in the paradox. They're (doing something isomorphic to) preferring A to B once and then p*B to p*A once. At no point do they "pay" more for B than A while preferring A to B. At no point does anyone make or offer the money-pumping trades with the subjects, nor have they obligated themselves to do so!

Replies from: RobinZ, alyssavance
comment by RobinZ · 2009-10-28T21:57:28.189Z · LW(p) · GW(p)

Consider Eliezer's final remarks in The Allais Paradox (I link purely for the convenience of those coming in in the middle):

Suppose that at 12:00PM I roll a hundred-sided die. If the die shows a number greater than 34, the game terminates. Otherwise, at 12:05PM I consult a switch with two settings, A and B. If the setting is A, I pay you $24,000. If the setting is B, I roll a 34-sided die and pay you $27,000 unless the die shows "34", in which case I pay you nothing.

Let's say you prefer 1A over 1B, and 2B over 2A, and you would pay a single penny to indulge each preference. The switch starts in state A. Before 12:00PM, you pay me a penny to throw the switch to B. The die comes up 12. After 12:00PM and before 12:05PM, you pay me a penny to throw the switch to A.

I have taken your two cents on the subject.

You're right insofar as Eliezer invokes the Axiom of Independence when he resolves the Allais Paradox using expected value; I do not yet see any way in which Stuart_Armstrong's criteria rule out the preferences (1A > 1B) and (2A < 2B). However, in the scenario Eliezer describes, an agent with those preferences either loses one cent or two cents relative to the agent with (1A > 1B) and (2A > 2B).

comment by alyssavance · 2009-10-28T21:02:18.492Z · LW(p) · GW(p)

Your preferences between A and B might reasonably change if you actually receive the money from either gamble, so that you have more money in your bank account now than you did before. However, that's not what's happening; the experimenter can use you as a money pump without ever actually paying out on either gamble.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-28T21:10:58.038Z · LW(p) · GW(p)

Yes, I know that a money pump doesn't involve doing the gamble itself. You don't have to repeat yourself, but apparently, I do have to repeat myself when I say:

The money pump does require that the experimenter make actual further trades with you, not just imagine hypothetical ones. The subjects didn't make these trades, and if they saw many more lottery tickets potentially coming into play, so as to smooth out returns, they would quickly revert to standard EU maximization, as predicted by Armstrong's derivation.

Replies from: alyssavance
comment by alyssavance · 2009-10-28T21:13:49.952Z · LW(p) · GW(p)

"Potentially coming into play, so as to smooth out returns" requires that there be the possibility of the subject actually taking more than one gamble, which never happens. If you mean that people might get suspicious after the tenth time the experimenter takes their money and gives them nothing in return, and thereafter stop doing it, I agree with you; however, all this proves is that making the original trade was stupid, and that people are able to learn to not make stupid decisions given sufficient repetition.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T14:51:08.844Z · LW(p) · GW(p)

"Potentially coming into play, so as to smooth out returns" requires that there be the possibility of the subject actually taking more than one gamble, which never happens.

The possibility has to happen, if you're cycling all these tickets through the subject's hands. What, are they fake tickets that can't actually be used now?

There are factors that come into play when you get to do lots of runs, but aren't present with only one run. A subject's choice in a one-shot scenario does not imply that they'll make the money-losing trades you describe. They might, but you would have to actually test it out. They don't become irrational until such a thing actually happens.

Replies from: alyssavance
comment by alyssavance · 2009-10-29T15:31:32.942Z · LW(p) · GW(p)

"What, are they fake tickets that can't actually be used now?"

No, they're just the same tickets. There's only ever one of each. If I sell you a chocolate bar, trade the chocolate bar for a bag of Skittles, buy the bag of Skittles, and repeat ten thousand times, this does not mean I have ten thousand of each; I'm just re-using the same ones.

"They might, but you would have to actually test it out. They don't become irrational until such a thing actually happens."

We did test it out, and yes, people did act as money pumps. See The Construction of Preference by Sarah Lichtenstein and Paul Slovic.

Replies from: Toby_Ord
comment by Toby_Ord · 2009-10-29T19:56:34.671Z · LW(p) · GW(p)

You can also listen to an interview with one of Sarah Lichtenstein's subjects who refused to make his preferences consistent even after the money-pump aspect was explained:

http://www.decisionresearch.org/publications/books/construction-preference/listen.html

Replies from: Douglas_Knight, tut
comment by Douglas_Knight · 2009-10-30T02:53:30.381Z · LW(p) · GW(p)

You can also listen to an interview with one of Sarah Lichtenstein's subjects who refused to make his preferences consistent even after the money-pump aspect was explained:

http://www.decisionresearch.org/publications/books/construction-preference/listen.html

That is an incredible interview.

Admitting that the set of preferences is inconsistent, but refusing to fix it is not so bad a conclusion - maybe he'd just make it worse (eg, by raising the bid on B to 550). At times he seems to admit that the overall pattern is irrational ("It shows my reasoning process isn't too good"). At other times, he doesn't admit the problem, but I think you're too harsh on him in framing it as refusal.

I may be misunderstanding, but he seems to say that the game doesn't allow him to bid higher than 400 on B. If he values B higher than 400 (yes, an absurd mistake), but sells it for 401, merely because he wasn't allowed to value it higher, then that seems to me to be the biggest mistake. It fits the book's title, though.

Maybe he just means that his sense of math is that the cap should be 400, which would be the lone example of math helping him. He seems torn between authority figures, the "rationality" of non-circular preferences and the unnamed math of expected values. I'm somewhat surprised that he doesn't see them as the same oracle. Maybe he was scarred by childhood math teachers, and a lone psychologist can't match that intimidation?

comment by tut · 2009-10-29T20:16:43.930Z · LW(p) · GW(p)

That sounds to me as though he is using expected utility to come up with his numbers, but doesn't understand expected utility, so when asked which he prefers he uses some other emotional system.

comment by alyssavance · 2009-10-28T20:56:17.291Z · LW(p) · GW(p)

"1) I wasn't claiming that Allais is about risk aversion."

The difference between your preferences over choosing lottery A vs. lottery B when both are performed a million times, and your preferences over choosing A vs. B when both are performed once, is a measurement of your risk aversion; this is what Gray Area was talking about, is it not?

"Believe it or not, when I say, "I prefer B to A", it doesn't mean "I hereby legally obligate myself to redeem on demand any B for an A""

Then you must be using a different (and, I might add, quite unusual) definition of the word "preference". To quote dictionary.com:

pre⋅fer /prɪˈfɜr/ [pri-fur] –verb (used with object), -ferred, -fer⋅ring.

  1. to set or hold before or above other persons or things in estimation; like better; choose rather than: to prefer beef to chicken.

What does it mean to say that you prefer B to A, if you wouldn't trade B for A if the trade is offered? Could I say that I prefer torture to candy, even if I always choose candy when the choice is offered to me?

Typo: Did you mean "prefer A to B"?

Replies from: Psychohistorian, SilasBarta
comment by Psychohistorian · 2009-10-28T22:15:22.369Z · LW(p) · GW(p)

I prefer B to A does not imply I prefer 10B to 10A, or even I prefer 2B to 2A. Expected utility != expected return.

I agree pretty much completely with Silas. If you want to prove that people are money pumps, you need to actually get a random sample of people and then actually pump money out of them. You can't just take a single-shot hypothetical and extrapolate to other hypotheticals when the whole issue is how people deal with the variability of returns.

Replies from: alyssavance, RobinZ
comment by alyssavance · 2009-10-28T22:35:09.665Z · LW(p) · GW(p)

"I prefer B to A does not imply I prefer 10B to 10A, or even I prefer 2B to 2A. Expected utility != expected return."

Of course, but, as I've said (I think?) five times now, you never actually get 2B or 2A at any point during the money-pumping process. You go from A, to B, to nothing, to A, to B... etc.

For examples of Vegas gamblers actually having money pumped out of them, see The Construction of Preference by Sarah Lichtenstein and Paul Slovic.

comment by RobinZ · 2009-10-28T23:08:14.728Z · LW(p) · GW(p)

Strictly speaking, Eliezer's formulation of the Allais Paradox is not the one that has been experimentally tested. I believe a similar money pump can be implemented for the canonical version, however -- and Zut Allais! shows that people can be turned into money pumps in other situations.

comment by SilasBarta · 2009-10-28T21:19:29.851Z · LW(p) · GW(p)

The difference between your preferences over choosing lottery A vs. lottery B when both are performed a million times, and your preferences over choosing A vs. B when both are performed once, is a measurement of your risk aversion; this is what Gray Area was talking about, is it not?

No, it's not, and the problem asserted by the Allais paradox is that the utility function is inconsistent, no matter what the risk preference.

Then you must be using a different (and, I might add, quite unusual) definition of the word "preference". To quote dictionary.com:

  1. to set or hold before or above other persons or things in estimation; like better; choose rather than: to prefer beef to chicken.

I don't see anything in there about how many times the choice has to happen, which is the very issue at stake.

If there's any unusualness, it's definitely on your side. When you buy a chocolate bar for a dollar, that "preference of a chocolate bar to a dollar" does not somehow mean that you are willing to trade every dollar you have for a chocolate bar, nor have you legally obligated yourself to redeem chocolate bars for dollars on demand (as a money pump would require), nor does anyone expect that you will trade the rest of your dollars this way.

It's called diminishing marginal utility. In fact, it's called marginal analysis in general.

What does it mean to say that you prefer B to A, if you wouldn't trade B for A if the trade is offered?

It means you would trade B for A on the next opportunity to do so, not that you would indefinitely do it forever, as the money pump requires.

Replies from: alyssavance
comment by alyssavance · 2009-10-28T21:25:59.277Z · LW(p) · GW(p)

"When you buy a chocolate bar for a dollar, that "preference of a chocolate bar to a dollar" does not somehow mean that you are willing to trade every dollar you have for a chocolate bar, nor have you legally obligated yourself to redeem chocolate bars for dollars on demand (as a money pump would require), nor does anyone expect that you will trade the rest of your dollars this way."

Under normal circumstances, this is true, because the situation has changed after I bought the chocolate bar: I now have an additional chocolate bar, or (more likely) an additional bar's worth of chocolate in my stomach. My preferences change, because the situation has changed.

However, after you have bought A, and swapped A for B, and sold B, you have not gained anything (such as a chocolate bar, or a full stomach), and you have not lost anything (such as a dollar); you are in precisely the same position that you were before. Hence, consistency dictates that you should make the same decision as you did before. If, after buying the chocolate bar, it fell down a well, and another dollar was added to my bank account because of the chocolate bar insurance I bought, then yes, I should keep buying chocolate bars forever if I want to be consistent (assuming that there is no cost to my time, which there essentially isn't in this case).

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T14:59:17.423Z · LW(p) · GW(p)

And something about your state has likewise changed after the swaps you described, just like when I have bought the first chocolate bar.

Jeez, where's Alicorn when you need her? We need someone to make a point about how, "Just because a woman sleeps with you once, doesn't mean she's inconsistent by ..." and then show the mapping to the logic being used here.

ETA: Forget the position I imputed to Alicorn for the moment. I'm making the point: how is this bizarre extrapolation of preferences any different from a very unfortunate overextrapolation often used by men?

Replies from: jimrandomh, Cyan, Nick_Tarleton
comment by jimrandomh · 2009-10-29T15:23:28.043Z · LW(p) · GW(p)

Jeez, where's Alicorn when you need her? We need someone to make a point about how, "Just because a woman sleeps with you once, doesn't mean she's inconsistent by ..." and then show the mapping to the logic being used here.

What, exactly, are you trying to accomplish here? Your last interaction with Alicorn made it pretty clear that projecting non-sequitur sexual references onto her was unwelcome. Are you trolling?

Replies from: SilasBarta, Alicorn
comment by SilasBarta · 2009-10-29T15:51:42.499Z · LW(p) · GW(p)

The last interaction wasn't a "sexual reference", even by Alicorn's definition. I was trying to point out that her phrasing was a reference to LauraABJ's implied beliefs about when a woman is rejecting a man, not necessarily in a sexual context.

I'd be interested to know why the follow-up kept getting modded down. As far as I can tell, people just didn't understand.

And I don't know how this is non-sequitur or projecting sexual references. People here are drawing absurd inferences about someone's preferences from one-time choices. It looks to me like the same kind of questionable reasoning used in the context I mentioned, and the same kind of thing Alicorn enjoys refuting.

Sorry for having an insufficiently refined red-flag detector, and for whatever offense I may have caused. Just make sure your offense is because of the topic, not because you just realized what your overextrapolation looks like in other contexts.

Replies from: RobinZ
comment by RobinZ · 2009-10-29T16:03:30.434Z · LW(p) · GW(p)

Just to raise the most obvious possible objection to your phrasing: there was nothing to prevent you from making whatever metaphor you suggested Alicorn could have employed. It is generally poor manners to invoke uninvolved people as supporters of your arguments without their permission, and in this situation, if Alicorn were interested in becoming involved in this thread, she could have posted herself.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T16:05:11.555Z · LW(p) · GW(p)

Thanks, that makes much more sense.

comment by Alicorn · 2009-10-29T15:53:59.359Z · LW(p) · GW(p)

The sexual references in particular are a subset of a broad class of things from SilasBarta that I do not welcome. That class of things is "anything involving me and SilasBarta directly interacting ever again". Just so no one interprets that last interaction too finely.

comment by Cyan · 2009-10-29T16:05:19.228Z · LW(p) · GW(p)

It would probably be best to make your point in your own voice and not to put words in Alicorn's mouth (however indirectly), since you know that she will not interact directly with you to correct any misapprehensions about her views you may have.

ETA: Whoops, I see RobinZ got there first.

Replies from: RobinZ
comment by RobinZ · 2009-10-29T16:33:05.871Z · LW(p) · GW(p)

Your point about Alicorn not being likely to correct Silas is no less apt than mine about not dragging neutral parties into an argument - in fact, it is scarcely less general.

comment by Nick_Tarleton · 2009-10-29T15:13:33.821Z · LW(p) · GW(p)

And something about your state has likewise changed after the swaps you described, just like when I have bought the first chocolate bar.

Yes, but having made the swaps seems highly questionable as a dimension of your state that affects your preferences.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T15:16:18.362Z · LW(p) · GW(p)

It's highly-questionable as a relevant state dimension because ... you need it to be to make the results come out right?

comment by Psychohistorian · 2009-10-28T22:56:15.821Z · LW(p) · GW(p)

the poster "Gray Area" explained why people aren't being money-pumped, even though they violate independence.

I actually think that (for some examples) it's actually simpler than that. The Allais paradox assumes that the proposal of the bet itself has no effect on the utility of the proposee. In reality, if I took a 5% chance at $100M, instead of a 100% chance at $4M, there's a 95% chance I'd be kicking myself every time I opened my wallet for the rest of my life. Thus, taking the bet and losing is significantly worse than never having the bet proposed at all. If this is factored in correctly, EY's original formulation of the Allais Paradox is no longer functional: I prefer certainty, because losing when certainty was an option carries lower utility than never having bet.

This is more about how you calculate outcomes than it is about independence directly. If losing when you could have had a guaranteed (or nearly-guaranteed) win carries negative utility, and if you can only play once, it does not seem like it contradicts independence.

comment by Stuart_Armstrong · 2009-10-29T10:34:00.672Z · LW(p) · GW(p)

Glad this formulation is useful! I do indeed think that people often behave like you describe, without generally losing huge sums of cash.

However, the conclusion of my post is that it is irrational to deviate from expected utility for small sums. Aggregating every small decision you make will give you expected utility.

comment by Vladimir_Nesov · 2009-10-28T18:42:41.738Z · LW(p) · GW(p)

Folks, please write at least short reviews on technical articles: if someone parsed the math, whether it appears sensible, whether the message appears interesting, and what exactly this message consists in. Also, this article lacks references: is the stuff it describes standard, how does it relate to the field?

Replies from: Stuart_Armstrong, RobinZ, timtyler
comment by Stuart_Armstrong · 2009-10-29T01:05:07.318Z · LW(p) · GW(p)

The result is my own work, but the reasoning is not particularly complex, and might well have been done before.

It's kind of a poor man's version of the central limit theorem, for differing distributions.

By this I mean that it's known that if you take the mean of identical independent distributions, it will tend to a narrow spike as the number of distributions increase. This post shows that similar things happen with non-identical distributions, if we bound the variances.

And please do point out any errors that anyone finds!

comment by RobinZ · 2009-10-28T19:53:07.757Z · LW(p) · GW(p)

The math looks valid - I believe the content is original to Stuart_Armstrong, attempting to show a novel set of preferences which imply expected-value calculation in (sufficiently) iterated cases but not in isolated cases.

Edit: For example, an agent whose decision-making satisfies Stuart_Armstrong's criteria might refuse to bet $1 for a 50% chance of winning $2.50 and 50% chance of losing his initial dollar if it were a one-off gamble, but would be willing to make 50 such bets in a row if the odds of winning each were independent. In both cases, the expected value is positive, but only in the latter case is the probable variation from the expected value small enough to overcome the risk aversion.
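
A quick Monte Carlo sketch of that example, reading a win as +$1.50 net (the $2.50 payout less the $1 stake) and a loss as -$1:

    import random

    def net(bets):
        return sum(1.5 if random.random() < 0.5 else -1.0 for _ in range(bets))

    trials = 100_000
    for bets in (1, 50):
        p_loss = sum(net(bets) < 0 for _ in range(trials)) / trials
        print(f"{bets} bet(s): P(net loss) ~ {p_loss:.2f}")
    # ~0.50 for one bet, ~0.08 for fifty: aggregation shrinks the downside
    # risk while the expected value stays proportional to the number of bets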

comment by timtyler · 2009-10-28T18:55:05.181Z · LW(p) · GW(p)

This article had an interesting title so I scanned it - but it lacked an abstract, a conclusion, had lots of maths in it - and I haven't liked most of Stuart's other articles - so I gave up on it early.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2009-10-29T00:59:43.443Z · LW(p) · GW(p)

The article attempts to show that you don't need the independence axiom to justify using expected utility. So I replaced the independence axiom with another axiom that basically says that a very thin distribution is pretty much the same as a guaranteed return.

Then I showed that if you had a lot of "reasonable" lotteries and put them together, you should behave approximately according to expected utility.

There's a lot of maths in it because the result is novel, and therefore has to be firmly justified. I hope to explore non-independent lotteries in future posts, so the foundations need to be solid.

comment by Douglas_Knight · 2009-10-29T05:33:33.968Z · LW(p) · GW(p)

I think the post is saying "if your preferences are somewhat coupled to the preferences of an expectation maximizer, then in some limit, your preferences match that expectation maximizer."

But so what? Why should your preferences have any relation to a real-valued function of the world? If you satisfy all the axioms, your preferences are exactly expectation-maximizing for a function that vN and M tell you how to build. But if the whole point is to drop one of the axioms, why should you still expect such a function to be relevant?

(this has been said elsewhere on the thread, but not too tentatively, and not at the top level.)

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2009-10-29T20:51:47.518Z · LW(p) · GW(p)

The results are on the "expected" part of expected utility, not on the "utility" part. Independence is overstrong; replacing it with a somewhat-coupling to an expectation maximizer is much weaker. And yet in the limit it mimics the expectation requirement, which is a very useful result.

(dropping independence completely leaves you flailing all over the place)

comment by Peter_de_Blanc · 2009-10-29T01:28:08.656Z · LW(p) · GW(p)

Your axiom talks about expected utility, but you have not defined that term yet.

Replies from: RobinZ, Stuart_Armstrong
comment by RobinZ · 2009-10-29T01:37:55.218Z · LW(p) · GW(p)

The post assumes a knowledge of basic statistics throughout - in such a context, the meaning of "expected utility" is transparent.

Replies from: Peter_de_Blanc
comment by Peter_de_Blanc · 2009-10-29T02:00:52.571Z · LW(p) · GW(p)

Sorry, I meant the definition of utility.

[edit: this should have been a reply to Stuart Armstrong's comment below RobinZ's.]

Replies from: RobinZ
comment by RobinZ · 2009-10-29T02:08:47.670Z · LW(p) · GW(p)

Utility is the thing you want to maximize in your decision-making.

Replies from: Peter_de_Blanc
comment by Peter_de_Blanc · 2009-10-29T02:34:02.076Z · LW(p) · GW(p)

A decision-maker in general isn't necessarily maximizing anything. Von Neumann and Morgenstern showed that if you satisfy axioms 1 through 4, then you do in fact take actions which maximize expected utility for some utility function. But this post is ignoring axiom 4 and assuming only axioms 1 through 3. In that case, why should we expect there to be a utility function?

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2009-10-29T12:07:30.618Z · LW(p) · GW(p)

Thanks for bringing this up, and I've changed my post to reflect your comments. Unfortunately, I have to decree a utility function ahead of time for this to make any sense, as I can change the mean and SD of any distribution by just changing my utility function.

I have a new post up that argues that where small sums are concerned, you have to have a utility function linear in cash.

comment by Stuart_Armstrong · 2009-10-29T01:36:13.680Z · LW(p) · GW(p)

? This is just the standard definition. The mean of the random variable, when it is expressed in terms of utils.

Should this be specified in the post, or is it common knowledge on this list?

Replies from: nickjhay, Peter_de_Blanc
comment by Nick Hay (nickjhay) · 2009-10-29T02:08:58.398Z · LW(p) · GW(p)

The von Neumann-Morgenstern axioms talk just about preference over lotteries, which are simply probability distributions over outcomes. That is, you have an unstructured set O of outcomes, and you have a total preordering over Dist(O), the set of probability distributions over O. They do not talk about a utility function. This is quite elegant, because to make decisions you must have preferences over distributions over outcomes, but you don't need to assume that O has a certain structure, e.g. that of the reals.

The expected utility theorem says that preferences which satisfy the first four axioms are exactly those which can be represented by:

A <= B iff E[U;A] <= E[U;B]

for some utility function U: O -> R, where

E[U;A] = Σ_o A(o) U(o)

However, U is only defined up to positive affine transformation, i.e. aU+b will work equally well for any a>0. In particular, you can amplify the standard deviation as much as you like by redefining U.
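
A two-line check of that amplification (the 50/50 lottery and the scale factor 100 are arbitrary choices of mine):

    def sd(outcomes, u):
        m = sum(p * u(x) for p, x in outcomes)
        return sum(p * (u(x) - m) ** 2 for p, x in outcomes) ** 0.5

    coin = [(0.5, 0), (0.5, 1)]               # a 50/50 lottery
    print(sd(coin, lambda x: x))              # 0.5
    print(sd(coin, lambda x: 100 * x + 7))    # 50.0: same preferences, 100x SD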

Your axioms require you to pick a particular representation of U for them to make sense. How do you choose this U? Even with a mechanism for choosing U, e.g. assume bounded nontrivial preferences and pick the unique U such that sup_x U(x) = 1 and inf_x U(x) = 0, this is still less elegant than talking directly about lotteries.

Can you redefine your axioms to talk only about lotteries over outcomes?

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2009-10-29T12:08:46.054Z · LW(p) · GW(p)

Can you redefine your axioms to talk only about lotteries over outcomes?

Alas no. I've changed my post to explain the difficulties as I can change the mean and SD of any distribution by just changing my utility function.

I have a new post up that argues that where small sums are concerned, you have to have a utility function linear in cash.

comment by Peter_de_Blanc · 2009-10-29T02:08:21.944Z · LW(p) · GW(p)

You started out by assuming a preference relation on lotteries with various properties. The completeness, transitivity, and continuity axioms talk about this preference relation. Your "standard deviation bound" axiom, however, talks about a utility function. What utility function?

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-10-29T03:24:01.298Z · LW(p) · GW(p)

Is the new axiom sufficient to show that the agent cannot be money-pumped?

Replies from: Stuart_Armstrong, RobinZ
comment by Stuart_Armstrong · 2009-10-29T10:40:18.926Z · LW(p) · GW(p)

It's enough to show that an agent cannot be repeatedly money-pumped. The more opportunities for money pumping, the fewer chances there are of it succeeding.

Contrast household appliance insurance versus health insurance. Both are a one-shot money-pump, as you get less than your expected utility out of them. An agent following these axioms will probably health-insure, but will not appliance-insure.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-10-29T12:28:38.354Z · LW(p) · GW(p)

Can you write out the math on that? To me it looks like the Allais Paradox or a simple variant would still go through. It is easy for the expected variance of a bet to increase as a result of learning additional information - in fact the Allais Paradox describes exactly this. So you could prefer A to B when they are bundled with variance-reducing most probable outcome C, and then after C is ruled out by further evidence, prefer B to A. Thus you'd pay a penny at the start to get A rather than B if not-C, and then after learning not-C, pay another penny to get B rather than A.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2009-10-29T13:00:54.478Z · LW(p) · GW(p)

I'll try and do the maths. This is somewhat complex without independence, as you have to estimate what the total result of following a certain strategy is, over all the bets you are likely to face. Obviously you can't money pump me if I know you are going to do it; I just combine all the bets and see it's a money pump, and so don't follow it.

So if you tried to money pump me repeatedly, I'd estimate it was likely that I'd be money pumped, and adjust my strategy accordingly.

comment by RobinZ · 2009-10-29T03:27:07.833Z · LW(p) · GW(p)

I believe SilasBarta has correctly (if that is the word) noted that it does not - it is perfectly possible for an agent to satisfy the new axioms and fall victim to the Allais Paradox.

Edit: correction - he does not state this.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T03:43:59.299Z · LW(p) · GW(p)

That sounds more like the exact opposite of my position.

Replies from: RobinZ
comment by RobinZ · 2009-10-29T04:20:09.475Z · LW(p) · GW(p)

I apologize. In the course of conversation with you, I came to that conclusion, but you reject that position.

Replies from: SilasBarta
comment by SilasBarta · 2009-10-29T14:46:36.964Z · LW(p) · GW(p)

To summarize my point: if you follow the new axioms, you will act differently in one-shot vs. massive-shot scenarios. Acting like the former in the latter will cause you to be money-pumped, but per the axioms, you never actually do it. So you can follow the new axioms, and still not get money-pumped.

comment by RobinZ · 2009-10-28T16:50:04.988Z · LW(p) · GW(p)

It's a good result, but I wonder if the standard deviation is the best parameter. Loss-averse agents react differently to asymmetrical distributions allowing large losses than those allowing large gains.

Edit: For example, the exponential distribution f(x;L) = L e^(-Lx) has mean and standard deviation 1/L, but a loss-averse agent is likely to prefer it to the normal distribution N(1/L, 1/L^2), which has the same mean and standard deviation.
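
As a numerical companion (L = 2 is an arbitrary choice), both distributions share mean and SD 1/L, but only the normal puts probability mass below zero - which is exactly what a loss-averse agent penalises:

    import math

    L = 2.0
    mean, sd = 1 / L, 1 / L                   # shared by both distributions
    z = (0 - mean) / sd                       # = -1 regardless of L
    p_negative = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    print(p_negative)                         # ~0.159 for the normal
    # the exponential's support is [0, inf), so its P(X < 0) is exactly 0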

Replies from: Stuart_Armstrong, Cyan
comment by Stuart_Armstrong · 2009-10-29T00:52:15.670Z · LW(p) · GW(p)

Once you abandon independence, the possibilities are literally infinite - and not just easily controllable infinities, either. I worked with SD as that's the simplest model I could use; but skewness, kurtosis or, Bayes help us, the higher moments, are also valid choices.

You just have to be careful that your choice of units is consistent; the SD and the mean are in the same unit, the variance is in units squared, the skewness and kurtosis are unitless, the k-th moment is in units to the power k, etc...

Replies from: RobinZ
comment by RobinZ · 2009-10-29T01:09:07.575Z · LW(p) · GW(p)

That's true - and it occurred to me after I posted the comment that your criteria don't define the decision system anyway, so even using some other method you might still be able to prove that it meets your conditions.

comment by Cyan · 2009-10-28T18:00:51.687Z · LW(p) · GW(p)

See also semivariance in the context of investment (and betting in general). NB: "semivariance" has a different meaning in the context of spatial statistics.

comment by bgrah449 · 2009-10-28T16:26:16.166Z · LW(p) · GW(p)

"The mean of Y is nβ. The variance of Y is the sum of the vj, which is less than nK2." Been a while for me, but doesn't this require the lotteries to be uncorrelated? If so, that should be listed with your axioms.

Replies from: RobinZ
comment by RobinZ · 2009-10-28T16:31:26.008Z · LW(p) · GW(p)

It requires the lotteries to be independent, which implies uncorrelated. Stuart_Armstrong specified independence.

Replies from: bgrah449
comment by bgrah449 · 2009-10-28T16:36:22.921Z · LW(p) · GW(p)

Ugh, color me stupid - I assumed the "independence" we were relaxing was probability-related. Thanks RobinZ.

Replies from: Stuart_Armstrong, RobinZ
comment by Stuart_Armstrong · 2009-10-29T01:07:22.773Z · LW(p) · GW(p)

You know, I didn't even realise I'd used "independence" both ways! Most of the time, it's only worth pointing out the fact if the random variables are not independent.

comment by RobinZ · 2009-10-28T16:39:40.384Z · LW(p) · GW(p)

No problem. (Don't you love it when people use the same symbol for multiple things in the same work? I know as a mechanical engineer, I got so much joy from remembering which "h" is the heat transfer coefficient and which is the height!)

comment by bgrah449 · 2009-10-28T16:23:36.809Z · LW(p) · GW(p)

"The variance of Y is the sum of the vj, which is less than nK2." You need to specify that the lotteries are uncorrelated for this to be true.

Replies from: RobinZ
comment by RobinZ · 2009-10-28T16:25:25.323Z · LW(p) · GW(p)

He specified "independent" - uncorrelated is implied.