Average utilitarianism must be correct?

philgoetz

Average utilitarianism must be correct?

post by PhilGoetz · 2009-04-06T17:10:02.598Z · LW · GW · Legacy · 168 comments

168 comments

I said this in a comment on Real-life entropic weirdness, but it's getting off-topic there, so I'm posting it here.

My original writeup was confusing, because I used some non-standard terminology, and because I wasn't familiar with the crucial theorem. We cleared up the terminological confusion (thanks esp. to conchis and Vladimir Nesov), but the question remains. I rewrote the title yet again, and have here a restatement that I hope is clearer.

We have a utility function u(outcome) that gives a utility for one possible outcome. (Note the word utility. That means your diminishing marginal utility, and all your preferences, and your aggregation function for a single outcome, are already incorporated into this function. There is no need to analyze u further, as long as we agree on using a utility function.)
We have a utility function U(lottery) that gives a utility for a probability distribution over all possible outcomes.
The von Neumann-Morgenstern theorem indicates that, given 4 reasonable axioms about U, the only reasonable form for U is to calculate the expected value of u(outcome) over all possible outcomes. This is why we constantly talk on LW about rationality as maximizing expected utility.
This means that your utility function U is indifferent with regard to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.
This is the same ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population; modulo the problems that population can change and that not all people are equal. This is clearer if you use a many-worlds interpretation, and think of maximizing expected value over possible futures as applying average utilitarianism to the population of all possible future yous.
Therefore, I think that, if the 4 axioms are valid when calculating U(lottery), they are probably also valid when calculating not our private utility, but a social utility function s(outcome), which sums over people in a similar way to how U(lottery) sums over possible worlds. The theorem then shows that we should set s(outcome) = the average value of all of the utilities for the different people involved. (In other words, average utilitarianism is correct). Either that, or the axioms are inappropriate for both U and s, and we should not define rationality as maximizing expected utility.
(I am not saying that the theorem reaches down through U to say anything directly about the form of u(outcome). I am saying that choosing a shape for U(lottery) is the same type of ethical decision as choosing a shape for s(outcome); and the theorem tells us what U(lottery) should look like; and if that ethical decision is right for U(lottery), it should also be right for s(outcome). )
And yet, average utilitarianism asserts that equity of utility, even among equals, has no utility. This is shocking, especially to Americans.
It is even more shocking that it is thus possible to prove, given reasonable assumptions, which type of utilitarianism is correct. One then wonders what other seemingly arbitrary ethical valuations actually have provable answers given reasonable assumptions.

Some problems with average utilitarianism from the Stanford Encyclopedia of Philosophy:

Despite these advantages, average utilitarianism has not obtained much acceptance in the philosophical literature. This is due to the fact that the principle has implications generally regarded as highly counterintuitive. For instance, the principle implies that for any population consisting of very good lives there is a better population consisting of just one person leading a life at a slightly higher level of well-being (Parfit 1984 chapter 19). More dramatically, the principle also implies that for a population consisting of just one person leading a life at a very negative level of well-being, e.g., a life of constant torture, there is another population which is better even though it contains millions of lives at just a slightly less negative level of well-being (Parfit 1984). That total well-being should not matter when we are considering lives worth ending is hard to accept. Moreover, average utilitarianism has implications very similar to the Repugnant Conclusion (see Sikora 1975; Anglin 1977).

(If you assign different weights to the utilities of different people, we could probably get the same result by considering a person with weight W to be equivalent to W copies of a person with weight 1.)

168 comments

Comments sorted by top scores.

comment by Wei Dai (Wei_Dai) · 2009-04-08T01:54:50.901Z · LW(p) · GW(p)

Among the four axioms used to derive the von Neumann-Morgenstern theorem, one stands out as not being axiomatic when applied to the aggregation of individual utilities into a social utility:

Axiom (Independence): Let A and B be two lotteries with A > B, and let t \in (0, 1] then tA + (1 − t)C > tB + (1 − t)C .

In terms of preferences over social outcomes, this axiom means that if you prefer A to B, then you must prefer A+C to B+C for all C, with A+C meaning adding another group of people with outcome C to outcome A.

It's the social version of this axiom that implies "equity of utility, even among equals, has no utility". To see that considerations of equity violates the social Axiom of Independence, suppose my u(outcome) = difference between the highest and lowest individual utilities in outcome. In other words, I prefer A to B as long as A has a smaller range of individual utilities than B, regardless of their averages. It should be easy to see that adding a person C to both A and B can cause A’s range to increase more than B’s, thereby reversing my preference between them.

Replies from: gjm

↑ comment by gjm · 2009-04-08T02:03:32.148Z · LW(p) · GW(p)

You're right that this is the axiom that's starkly nonobvious in Phil's attempted application (by analogy) of the theorem. I'd go further, and say that it basically amounts to assuming the controversial bit of what Phil is seeking to prove.

And I'll go further still and suggest that in the original von Neumann-Morgenstern theorem, this axiom is again basically smuggling in a key part of the conclusion, in exactly the same way. (Is it obviously irrational to seek to reduce the variance in the outcomes that you face? vN-M are effectively assuming that the answer is yes. Notoriously, actual human preferences typically have features like that.)

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-08T21:31:03.957Z · LW(p) · GW(p)

I think the two comments above by Wei Dai and gjm are SPOT ON. Thank you.

And my final conclusion is, then:

Either become an average utilitarian; or stop defining rationality as expectation maximization.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-08T21:37:33.271Z · LW(p) · GW(p)

And my final conclusion is, then:
Either become an average utilitarian; or stop describing rationality as expectation maximization.

That's unwarranted. Axioms are being applied to describe very different processes, so you should look at their applications separately. In any case, reaching a "final conclusion" without an explicit write-up (or discovering a preexisting write-up) to check the sanity of conclusion is in most cases a very shaky step, predictably irrational.

Replies from: PhilGoetz, PhilGoetz

↑ comment by PhilGoetz · 2014-01-27T22:26:49.665Z · LW(p) · GW(p)

Okay: Suppose you have two friends, Betty and Veronica, and one balloon. They both like balloons, but Veronica likes them a little bit more. Therefore, you give the balloon to Veronica.

You get one balloon every day. Do you give it to Veronica every day?

Ignore whether Betty feels slighted by never getting a balloon. If we considered utility and disutility due to the perception of equity and inequity, then average utilitarianism would also produce somewhat equitable results. The claim that inequity is a problem in average utilitarianism does not depend on the subjects perceiving the inequity.

Just to be clear about it, Betty and Veronica live in a nursing home, and never remember who got the balloon previously.

You might be tempted to adopt a policy like this: p(v) = .8, p(b) = .2, meaning you give the balloon to Veronica eight times out of 10. But the axiom of independence assumes that it is better to use the policy p(v) = 1, p(b) = 0.

This is straightforward application of the theorem, without any mucking about with possible worlds. Are you comfortable with giving Veronica the balloon every day? Or does valuing equity mean that expectation maximization is wrong? I think those are the only choices.

Replies from: private_messaging, Jiro

↑ comment by private_messaging · 2014-05-29T05:18:57.972Z · LW(p) · GW(p)

I can have

u(world) = total_lifetime_veronica_happiness + total_lifetime_betty_happiness - | total_lifetime_veronica_happiness - total_lifetime_betty_happiness |

This will compel me, on the day one, to compare different ways how I can organize the world, and adopt one that has the future where veronica gets more balloons, but not excessively so (as giving all to veronica has utility of 0). Note: the 'world' defines it's future. As a consequence, I'd allocate a balloon counter, write up a balloon schedule, or the like.

I don't see how is that at odds with expected utility maximization. If it were at odds, I'd expect you to be able to come up with a Dutch Book style scenario demonstrating some inconsistency between my choices (and I would expect myself to be able to come with such scenario).

Replies from: blacktrance, Vaniver

↑ comment by blacktrance · 2014-05-29T19:38:27.363Z · LW(p) · GW(p)

It's compatible with utility maximization (you have a utility function and you're maximizing it) but it's not compatible with world utility maximization, which is required for utilitarianism.

Replies from: private_messaging

↑ comment by private_messaging · 2014-05-30T04:41:41.344Z · LW(p) · GW(p)

That utility function takes world as an input, I'm not sure what you mean by "world utility maximization".

Replies from: blacktrance

↑ comment by blacktrance · 2014-05-30T15:31:58.867Z · LW(p) · GW(p)

The maximization of the sum (or average) of the utilities of all beings in the world.

↑ comment by Vaniver · 2014-05-29T19:32:33.068Z · LW(p) · GW(p)

I believe this line of the grandparent discusses what you're discussing:

If we considered utility and disutility due to the perception of equity and inequity, then average utilitarianism would also produce somewhat equitable results.

Replies from: private_messaging

↑ comment by private_messaging · 2014-05-30T04:46:54.780Z · LW(p) · GW(p)

Betty and Veronica don't need to know of one another. The formula I gave produces rather silly results, but the point is that you can consistently define the utility of a world state in such a way that it intrinsically values equality.

Replies from: Vaniver

↑ comment by Vaniver · 2014-05-30T18:18:37.156Z · LW(p) · GW(p)

Betty and Veronica don't need to know of one another.

Right, then blacktrance's complaint holds that you're not just adding up the utilities of all the agents in the world, which is a condition of utilitarianism.

Replies from: private_messaging

↑ comment by private_messaging · 2014-05-30T18:42:29.478Z · LW(p) · GW(p)

Right, then blacktrance's complaint holds that you're not just adding up the utilities of all the agents in the world, which is a condition of utilitarianism.

PhilGoetz was trying to show that to be correct or necessary, from some first principles not inclusive of simple assertion of such. Had his point been "average utilitarianism must be correct because summation is a condition of utilitarianism", I wouldn't have bothered replying (and he wouldn't have bothered writing a long post).

Besides, universe is not made of "agents", an "agent" is just a loosely fitting abstraction that falls apart if you try to zoom in at the details. And summation of agent's utility across agents is entirely nonsensical for the reason that utility is only defined up to a positive affine transformation.

edit: also, hedonistic utilitarianism, at least as originally conceived, sums pleasure, rather than utility. Those are distinct, in that pleasure may be numerically quantifiable - we may one day have a function that looks at some high resolution 3d image, and tells how much pleasure the mechanism depicted in that image is feeling (a real number that can be compared across distinct structures).

↑ comment by Jiro · 2014-01-28T01:23:03.600Z · LW(p) · GW(p)

Imagine that instead of balloons you're giving food. Veronica has no food source and a day's worth of food has a high utility to her--she'd go hungry without it. Betty has a food source, but the food is a little bland, and she would still gain some small amount of utility from being given food. Today you have one person-day worth of food and decide that Veronica needs it more, so you give it to Veronica. Repeat ad nauseum; every day you give Veronica food but give Betty nothing.

This scenario is basically the same as yours, but with food instead of balloons--yet in this scenario most people would be perfectly happy with the idea that only Veronica gets anything.

Replies from: None

↑ comment by [deleted] · 2014-02-15T17:01:34.393Z · LW(p) · GW(p)

Alternatively, Veronica and Betty both have secure food sources. Veronica's is slightly more bland relative to her preferences than Betty's. A simple analysis yields the same result: you give the rations to Veronica every day.

Of course, if you compare across the people's entire lives, you would find yourself switching between the two, favoring Veronica slightly. And if Veronica would have no food without your charity, you might have her go hungry on rare occasions in order to improve Betty's food for a day.

This talks about whether you should analyze the delta utility of an action versus the end total utility of people. It doesn't talk about, when deciding what to do with a population, you should use average utility per person versus total utility of the population in your cost function. That second problem only crops up when deciding whether to add or remove people from a population -- average utilitarianism in that sense recommends killing people who are happy with their lives but not as happy as average, while total utilitarianism would recommend increasing the population to the point of destitution and near-starvation as long as it could be done efficiently enough.

Replies from: Jiro

↑ comment by Jiro · 2014-02-15T18:48:44.639Z · LW(p) · GW(p)

The point is that the "most people wouldn't like this" test fails.

It's just not true that always giving to one person and never giving to another person is a situation that most people would, as a rule, object to Most people would sometimes oibject, and sometimes not, depending on circumstances--they'd object when you're giving toys such as balloons, but they won't object when you're giving necessities such as giving food to the hungry.

Pointing out an additional situation when most people would object (giving food when the food is not a necessity) doesn't change this.

↑ comment by PhilGoetz · 2009-04-08T22:00:15.924Z · LW(p) · GW(p)

We haven't proved that you must either become an average utilitarian, or stop describing rationality as expectation maximization. But we've shown that there are strong reasons to believe that proposition. Without equally strong reasons to doubt it, it is in most cases rational to act as if it were true (depending on the utility of its truth or falsehood).

(And, yes, I'm in danger of falling back into expectation maximization in that last sentence. I don't know what else to do.)

Replies from: conchis, Vladimir_Nesov

↑ comment by conchis · 2009-04-08T23:56:04.995Z · LW(p) · GW(p)

Phil, I've finally managed to find a paper addressing this issue that doesn't appear to be behind a paywall.

Weymark, John (2005) "Measurement Theory and the Foundations of Utilitarianism"

Please read it. Even if you don't agree with it, it should at the very least give you an appreciation that there are strong reasons to doubt your conclusion, and that there are people smarter/more knowledgeable about this than either of us who would not accept it. (For my part, learning that John Broome thinks there could be something to the argument has shifted my credence in it slightly, even if Weymark ultimately concludes that Broome's argument doesn't quite work.)

The discussion is framed around Harsanyi's axiomatic "proof" of utilitarianism, but I'm fairly sure that if Harsanyi's argument fails for the reasons discussed, then so will yours.

EDIT: I'd very much like to know whether (a) reading this shifts your estimate of either (i) whether your argument has provided strong reasons for anything, or (ii) whether utilitarianism is true (conditional on expectation maximization being rational); and (b) if not, why not?

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-10T00:32:01.384Z · LW(p) · GW(p)

I haven't read it yet. I'll probably go back and change the word "strong"; it is too subjective, and provokes resistance, and is a big distraction. People get caught up protesting that the evidence isn't "strong", which I think is beside the point. Even weak evidence for the argument I'm presenting should still be very interesting.

↑ comment by Vladimir_Nesov · 2009-04-08T23:46:03.941Z · LW(p) · GW(p)

When there are strong reasons, it should be possible to construct a strong argument, one you can go around crushing sceptics with. I don't see anything salient in this case, to either support or debunk, so I'm either blind, or the argument is not as strong as you write it to be. It is generally a good practice to do every available verification routine, where it helps to find your way in the murky pond of weakly predictable creativity.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-09T04:19:10.346Z · LW(p) · GW(p)

When there are strong reasons, it should be possible to construct a strong argument, one you can go around crushing sceptics with.

I really only need a preponderance of evidence for one side (utilities being equal). If have a jar with 100 coins in it and you ask me to bet on a coin flip, and I know that one coin in the jar has two heads on it, I should bet heads. And you have to bet in this case - you have to have some utility function, if you're claiming to be a rational utility-maximizer.

The fact that I have given any reason at all to think that you have to choose between being an average utilitarian, or stop defining rationality as expectation maximization, is in itself interesting, because of the extreme importance of the subject.

I don't see anything salient in this case, to either support or debunk, so I'm either blind, or the argument is not as strong as you write it to be.

Do you mean that you don't see anything in the original argument, or in some further discussion of the original argument? Replies from: conchis

↑ comment by conchis · 2009-04-09T12:19:52.051Z · LW(p) · GW(p)

If you "don't see anything salient", then identify a flaw in my argument. Otherwise, you're just saying, "I can't find any problems with your argument, but I choose not to update anyway."

I'm sympathetic to this, but I'm not sure it's entirely fair. It probably just means you're talking past each other. It's very difficult to identify specific flaws in an argument when you just don't see how it is supposed to be relevant to the supposed conclusion.

If this were a fair criticism of Vladimir, then I think it would also be a fair criticism of you. I've provided what I view as extensive, and convincing (to me! (and to Amartya Sen)) criticisms of your argument, to which you general response has been, not to point out a flaw in my argument, but instead to say "I don't see how this is relevant".

This is incredibly frustrating to me, just as Vladimir's response probably seems frustrating to you. But I'd like to think it's more a failure of communication than it is bloody-mindedness on your or Vladimir's part.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-10T00:28:08.530Z · LW(p) · GW(p)

Fair enough. It sounded to me like Vladimir was saying something like, "I think your argument is all right; but now I want another argument to support the case for actually applying your argument".

I haven't read that paper you referenced yet. If you have others that are behind firewalls, I can likely get a copy for us.

comment by steven0461 · 2009-04-07T16:47:40.708Z · LW(p) · GW(p)

Oh! I realized only now that this isn't about average utilitarianism vs. total utilitarianism, but about utilitarianism vs. egalitarianism. As far as I understand the word, utilitarianism means summing people's welfare; if you place any intrinsic value on equality, you aren't any kind of utilitarian. The terminology is sort of confusing: most expected utility maximizers are not utilitarians. (edit: though I guess this would mean only total utilitarianism counts, so there's a case that if average utilitarianism can be called utilitarianism, then egalitarianism can be called utilitarianism... ack)

In this light the question Phil raises is kind of interesting. If in all the axioms of the expected utility theorem you replace lotteries by distributions of individual welfare, then the theorem proves that you have to accept utilitarianism. People who place intrinsic value on inequality would deny that some of the axioms, like maybe transitivity or independence, hold for distributions of individual welfare. And the question now is, if they're not necessarily irrational to do so, is it necessarily irrational to deny the same axioms as applying to merely possible worlds?

(Harsanyi proved a theorem that also has utilitarianism follow from some axioms, but I can't find a good link. It may come down to the same thing.)

Replies from: conchis, PhilGoetz

↑ comment by conchis · 2009-04-08T00:21:55.030Z · LW(p) · GW(p)

FWIW, this isn't quite Harsanyi's argument. Though he does build on the von Neuman-Morgenstern/Marschak results, it's in slightly different way to that proposed here (and there's still a lot of debate about whether it works or not).

Replies from: conchis

↑ comment by conchis · 2009-04-08T17:50:16.655Z · LW(p) · GW(p)

In case anyone's interested, here are some references for (a) the original Harsanyi (1955) axiomatization, and (b) the subsequent debate between Harsanyi and Sen about it's meaning. There is much more out there than this, but section 2 of Sen (1976) probably captures two key points, both of which seem equally applicable to Phil's argument.

(1) The independence axiom is seems more problematic when shifting from individual to social choice (as Wei Dai has already pointed out)

(2) Even if it weren't, the axioms don't really say much about utilitarianism as it is is commonly understood (which is what I've been trying, unsuccessfully, to communicate to Phil in the thread beginning here)

Harsanyi, John (1955), "Cardinal Welfare, Individualistic Ethics and Interpersonal Comparisons of Utility", Journal of Political Economy 63.
Diamond, P. (1967) "Cardinal Welfare, Individualistic Ethics and Interpersonal Comparisons of Utility: A Comment", Journal of Political Economy 61 (especially on the validity of the independence axiom in social vs. individual choice.)
Harsanyi, John (1975) "Nonlinear Social Welfare Functions: Do Welfare Economists Have a Special Exemption from Bayesian Rationality?" Theory and Decision 6(3): 311-332.
Sen, Amartya (1976) "Welfare Inequalities and Rawlsian Axiomatics," Theory and Decision, 7(4): 243-262 (reprinted in R. Butts and J. Hintikka eds. (1977) Foundational Problems in the Special Sciences (Boston: Reidel). (esp. section 2)
Harsanyi, John (1977) "Nonlinear Social Welfare Functions: A Rejoinder to Professor Sen," in Butts and Hintikka
Sen, Amartya (1977) "Non-linear Social Welfare Functions: A Reply to Professor Harsanyi," in Butts and Hintikka
Sen, Amartya (1979) "Utilitarianism and Welfarism" The Journal of Philosophy 76(9): 463-489 (esp. section 2)

Parts of the Hintikka and Butts volume are available in Google Books.

↑ comment by PhilGoetz · 2009-04-07T22:03:43.424Z · LW(p) · GW(p)

As far as I understand the word, utilitarianism means summing people's welfare; if you place any intrinsic value on equality, you aren't any kind of utilitarian.

Utilitarianism means computing a utility function. It doesn't AFAIK have to be a sum.

If in all the axioms of the expected utility theorem you replace lotteries by distributions of individual welfare, then the theorem proves that you have to accept utilitarianism. People who place intrinsic value on inequality would deny that some of the axioms, like maybe transitivity or independence, hold for distributions of individual welfare. And the question now is, if they're not necessarily irrational to do so, is it necessarily irrational to deny the same axioms as applying to merely possible worlds?

(average utilitarianism, that is)

YES YES YES! Thank you!

You're the first person to understand.

The theorem doesn't actually prove it, because you need to account for different people having different weights in the combination function; and more especially for comparing situations with different population sizes.

And who knows, total utilities across two different populations might turn out to be incommensurate.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T17:11:12.798Z · LW(p) · GW(p)

We should maximize average utility across all living people.

(Actually all people, but dead people are hard to help.)

Replies from: PhilGoetz, None, steven0461, cousin_it, cousin_it

↑ comment by PhilGoetz · 2009-04-06T17:38:58.761Z · LW(p) · GW(p)

As is well known, I have a poor model of Eliezer.

(I realize Eliezer is familiar with the problems with taking average utility; I write this for those following the conversation.)

So, if we are to choose between supporting a population of 1,000,000 people with a utility of 10, or 1 person with a utility of 11, we should choose the latter? If someone's children are going to be born into below-average circumstances, it would be better for us to prevent them from having children?

(I know that you spoke of all living people; but we need a definition of rationality that addresses changes in population.)

Inequitable distributions of utility are as good as equitable distributions of utility? You have no preference between 1 person with a utility of 100, and 9 people with utilities of 0, versus 10 people with utilities of 10? (Do not invoke economics to claim that inequitable distributions of utility are necessary for productivity. This has nothing to do with that.)

Ursula LeGuin wrote a short story about this, called "The ones who walk away from Omelas", which won the Hugo in 1974. (I'm not endorsing it; merely noting it.)

Replies from: Eliezer_Yudkowsky, ciphergoth

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T18:05:58.419Z · LW(p) · GW(p)

You don't interpret "utility" the same way others here do, just like the word "happiness". Our utility inherently includes terms for things like inequity. What you are using the word "utility" here for would be better described as "happiness".

Since your title said "maximizing expected utility is wrong" I assumed that the term "average" was to be taken in the sense of "average over probabilities", but yes, in a Big and possibly Infinite World I tend toward average utilitarianism.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T19:28:35.439Z · LW(p) · GW(p)

You don't interpret "utility" the same way others here do, just like the word "happiness". Our utility inherently includes terms for things like inequity. What you are using the word "utility" here for would be better described as "happiness".

We had the happiness discussion already. I'm using the same utility-happiness distinction now as then.

(You're doing that "speaking for everyone" thing again. Also, what you would call "speaking for me", and misinterpreting me. But that's okay. I expect that to happen in conversations.)

Our utility inherently includes terms for things like inequity.

The little-u u(situation) can include terms for inequity. The big-U U(lottery of situations) can't, if you're an expected utility maximizer. You are constrained to aggregate over different outcomes by averaging.

Since the von Neumann-Morgenstern theorem indicates that averaging is necessary in order to avoid violating their reasonable-seeming axioms of utility, my question is then whether it is inconsistent to use expected utility over possible outcomes, and NOT use expected utility across people.

Since you do both, that's perfectly consistent. The question is whether anything else makes sense in light of the von Neumann-Morgenstern theorem.

If you maximize expected utility, that means that an action that results in utility 101 for one future you in one possible world, and utility 0 for 9 future yous in 9 equally-likely possible worlds; is preferable to an action that results in utility 10 for all 10 future yous. That is very similar to saying that you would rather give utilty 101 to 1 person and utility 0 to 9 other people, than utility 10 to 10 people.

Replies from: Emile, Vladimir_Nesov, conchis, Jonathan_Graehl, loqi

↑ comment by Emile · 2009-04-06T20:59:03.259Z · LW(p) · GW(p)

If your utility function were defined over all possible worlds, you would just say "maximize utility" instead of "maximize expected utility".

I disagree: that's only the case if you have perfect knowledge.

Case A: I'm wondering whether to flip the switch of my machine. The machine causes a chrono-synclastic infundibulum, which is a physical phenomenon that has a 50% chance of causing a lot of awesomeness (+100 utility), and a 50% chance of blowing up my town (-50 utility).

Case B: I'm wondering whether to flip the switch of my machine, a friendly AI I just programmed. I don't know whether I programmed it right, if I did it will bring forth an awesome future (+100 utility), if I didn't it will try to enslave mankind (-50 utility). I estimate that my program has 50% chances of being right.

Both cases are different, and if you have a utility function that's defined over all possible future words (that just takes the average), you could say that flipping the switch in the first case has utility of +50, and in the second case, expected utility of +50 (actually, utility of +100 or -50, but you don't know which).

↑ comment by Vladimir_Nesov · 2009-04-06T20:59:55.222Z · LW(p) · GW(p)

Phil, this is something eerie, totally different from the standard von Neumann-Morgenstern expected utility over the world histories, which is what people usually refer to when talking about the ideal view on the expected utility maximization. Why do you construct this particular preference order? What do you answer to the standard view?

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T21:16:59.091Z · LW(p) · GW(p)

I don't understand the question. Did I define a preference order? I thought I was just pointing out an unspoken assumption. What is the difference between what I have described as maximizing expected utility, and the standard view?

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-06T21:25:18.991Z · LW(p) · GW(p)

The following passage is very strange, it shows either lack of understanding, or some twisted terminology.

A utility measure discounts for inequities within any single possible outcome. It does not discount for utilities across the different possible outcomes. It can't, because utility functions are defined over a single world, not over the set of all possible worlds. If your utility function were defined over all possible worlds, you would just say "maximize utility" instead of "maximize expected utility".

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T23:39:05.873Z · LW(p) · GW(p)

It shows twisted terminology. I rewrote the main post to try to fix it.

I'd like to delete the whole post in shame, but I'm still confused as to whether we can be expected utility maximizers without being average utilitarianists.

Replies from: loqi, loqi

↑ comment by loqi · 2009-04-07T02:44:36.662Z · LW(p) · GW(p)

I've thought about this a bit more, and I'm back to the intuition that you're mixing up different concepts of "utility" somewhere, but I can't make that notion any more precise. You seem to be suggesting that certain seemingly plausible preferences cannot be properly expressed as utility functions. Can you give a stripped-down, "single-player" example of this that doesn't involve other people or selves?

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T03:33:58.376Z · LW(p) · GW(p)

You seem to be suggesting that certain seemingly plausible preferences cannot be properly expressed as utility functions.

Here's a restatement:

We have a utility function u(outcome) that gives a utility for one possible outcome.
We have a utility function U(lottery) that gives a utility for a probability distribution over all possible outcomes.
The von Neumann-Morgenstern theorem indicates that the only reasonable form for U is to calculate the expected value of u(outcome) over all possible outcomes.
This means that your utility function U is indifferent with regard to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.
This is the same sort of ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population.
Therefore, I think that the von Neumann-Morgenstern theorem does not prove, but provides very strong reasons for thinking, that average utilitarianism is correct.
And yet, average utilitarianism asserts that equity of utility, even among equals has no utility. This is shocking.

Replies from: Peter_de_Blanc, Vladimir_Nesov, loqi

↑ comment by Peter_de_Blanc · 2009-04-07T03:55:54.407Z · LW(p) · GW(p)

If you want a more equitable distribution of utility among future selves, then your utility function u(outcome) may be a different function than you thought it was; e.g. the log of the function you thought it was.

More generally, if u is the function that you thought was your utility function, and f is any monotonically increasing function on the reals with f'' < 0, then by Jensen's inequality, an expected f''(u)-maximizer would prefer to distribute u-utility equitably among its future selves.

Replies from: conchis, PhilGoetz

↑ comment by conchis · 2009-04-07T11:22:15.149Z · LW(p) · GW(p)

Exactly. (I didn't realize the comments were continuing down here and made the essentially same point here after Phil amended the post.)

The interesting point that Phil raises is whether there's any reason to have a particular risk preference with respect to u. I'm not sure that the analogy between being inequality averse amongst possible "me"s and and inequality averse amongst actual others gets much traction once we remember that probability is in the mind. But it's an interesting question nonetheless.

Allais, in particular argued that any form of risk preference over u should be allowable, and Broome finds this view "very plausible". All of which seems to make rational decision-making under uncertainty much more difficult, particularly as it's far from obvious that we have intuitive access to these risk preferences. (I certainly don't have intuitive access to mine.)

P.S. I assume you mean f(u)-maximizer rather than f''(u)-maximizer?

Replies from: Peter_de_Blanc

↑ comment by Peter_de_Blanc · 2009-04-07T15:49:33.275Z · LW(p) · GW(p)

Yes, I did mean an f(u)-maximizer.

↑ comment by PhilGoetz · 2009-04-17T21:04:02.835Z · LW(p) · GW(p)

Yes - and then the f(u)-maximizer is not maximizing expected utility! Maximizing expected utility requires not wanting equitable distribution of utility among future selves.

↑ comment by Vladimir_Nesov · 2009-04-07T11:04:47.617Z · LW(p) · GW(p)

This is the same sort of ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population.

Nope. You can have u(10 people alive) = -10 and u(only 1 person is alive)=100 or u(1 person is OK and another suffers)=100 and u(2 people are OK)=-10.

Replies from: PhilGoetz, thomblake

↑ comment by PhilGoetz · 2009-04-07T14:42:49.467Z · LW(p) · GW(p)

Not unless you mean something very different than I do by average utilitarianism.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-07T17:12:25.766Z · LW(p) · GW(p)

I objected to drawing the analogy, and gave the examples that show where the analogy breaks. Utility over specific outcomes values the whole world, with all people in it, together. Alternative possibilities for the whole world figuring into the expected utility calculation are not at all the same as different people. People that the average utilitarianism talks about are not from the alternative worlds, and they do not each constitute the whole world, the whole outcome. This is a completely separate argument, having only surface similarity to the expected utility computation.

↑ comment by thomblake · 2009-04-07T14:35:45.431Z · LW(p) · GW(p)

Maybe I'm missing the brackets between your conjunctions/disjunctions, but I'm not sure how you're making a statement about Average Utilitarianism.

↑ comment by loqi · 2009-04-07T04:01:02.357Z · LW(p) · GW(p)

We have a utility function u(outcome) that gives a utility for one possible outcome.

We have a utility function U(lottery) that gives a utility for a probability distribution over all possible outcomes.

The von Neumann-Morgenstern theorem indicates that the only reasonable form for U is to calculate the expected value of u(outcome) over all possible outcomes.

I'm with you so far.

This means that your utility function U is indifferent with regard to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.

What do you mean by "distribute utility to your future selves"? You can value certain circumstances involving future selves higher than others, but when you speak of "their utility" you're talking about a completely different thing than the term u in your current calculation. u already completely accounts for how much they value their situation and how much you care whether or not they value it.

This is the same sort of ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population.

I don't see how this at all makes the case for adopting average utilitarianism as a value framework, but I think I'm missing the connection you're trying to draw.

↑ comment by loqi · 2009-04-07T01:04:24.013Z · LW(p) · GW(p)

I'd hate to see it go. I think you've raised a really interesting point, despite not communicating it clearly (not that I can probably even verbalize it yet). Once I got your drift it confused the hell out of me, in a good way.

Assuming I'm correct that it was basically unrelated, I think your previous talk of "happiness vs utility" might have primed a few folks to assume the worst here.

↑ comment by conchis · 2009-04-06T19:44:54.419Z · LW(p) · GW(p)

Phil, you're making a claim that what others say about utility (i.e. that it's good to maximize its expectation) is wrong. But it's only on your idiosyncratic definition of utility that your argument has any traction.

You are free to use words any way you want (even if I personally find your usage frustrating at times). But you are not free to redefine others' terms to generate an artificial problem that isn't really there.

The injunction to "maximize expected utility" is entirely capable of incorporating your concerns. It can be "inequality-averse" if you want, simply by making it a concave function of experienced utility.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T21:01:32.450Z · LW(p) · GW(p)

The injunction to "maximize expected utility" is entirely capable of incorporating your concerns. It can be "inequality-averse" if you want, simply by making it a concave function of experienced utility

No. I've said this 3 times already, including in the very comment that you are replying to. The utility function is not defined across all possible outcomes. A utility function is defined over a single outcome; it evaluates a single outcome. It can discount inequalities within that outcome. It cannot discount across possible worlds. If it operated across all possible worlds, all you would say is "maximize utility". The fact that you use the word "expected" means "average over all possible outcomes". That is what "expected" means. It is a mathematical term whose meaning is already established.

Replies from: loqi, conchis, Vladimir_Nesov

↑ comment by loqi · 2009-04-06T21:10:32.917Z · LW(p) · GW(p)

You can safely ignore my previous reply, I think I finally see what you're saying. Not sure what to make of it yet, but I was definitely misinterpreting you.

↑ comment by conchis · 2009-04-06T22:07:32.446Z · LW(p) · GW(p)

Repeating your definition of a utility function over and over again doesn't oblige anybody else to use it. In particular, it doesn't oblige all those people who have argued for expected utility maximization in the past to have adopted it before you tried to force it on them.

A von Neumann-Morgenstern utility function (which is what people are supposed to maximize the expectation of) is a representation of a set of consistent preferences over gambles. That is all it is. If your proposal results in a set of consistent preferences over gambles (I see no particular reason for it not to, but I could be wrong) then it corresponds to expected utility maximization for some utility function. If it doesn't, then either it is inconsistent, or you have a beef with the axioms that runs deeper than an analogy to average utilitarianism.

↑ comment by Vladimir_Nesov · 2009-04-06T21:14:40.943Z · LW(p) · GW(p)

"Expected" means expected value of utility function of possible outcomes, according to the probability distribution on the possible outcomes.

↑ comment by Jonathan_Graehl · 2009-04-06T22:43:42.554Z · LW(p) · GW(p)

If you don't prefer 10% chance of 101 utilons to 100% chance of 10, then you can rescale your utility function (in a non-affine manner). I bet you're thinking of 101 as "barely more than 10 times as much" of something that faces diminishing returns. Such diminishing returns should already be accounted for in your utility function.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T03:17:03.970Z · LW(p) · GW(p)

I bet you're thinking of 101 as "barely more than 10 times as much" of something that faces diminishing returns.

No. I've explained this in several of the other comments. That's why I used the term "utility function", to indicate that diminishing returns are already taken into account.

↑ comment by loqi · 2009-04-06T21:03:43.726Z · LW(p) · GW(p)

It can't, because utility functions are defined over a single world, not over the set of all possible worlds. If your utility function were defined over all possible worlds, you would just say "maximize utility" instead of "maximize expected utility".

This doesn't sound right to me. Assuming "world" means "world at time t", a utility function at the very least has type (World -> Utilons). It maps a single world to a single utility measure, but it's still defined over all worlds, the same way that (+3) is defined over all integers. If it was only defined for a single world it wouldn't really be much of a function, it'd be a constant.

We use expected utility due to uncertainty. If we had perfect information, we could maximize utility by searching over all action sequences, computing utility for each resulting world, and returning the sequence with the highest total utility.

If you maximize expected utility, that means that an action that results in utility 101 for one future you in one possible world, and utility 0 for 9 future yous in 9 equally-likely possible worlds

I think this illustrates the problem with your definition. The utility you're maximizing is not the same as the "utility 101 for one future you". You first have to map future you's utility to just plain utility for any of this to make sense.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T03:26:00.448Z · LW(p) · GW(p)

It maps a single world to a single utility measure, but it's still defined over all worlds,

I meant "the domain of a utility function is a single world."

However, it turns out that the standard terminology includes both utility functions over a single world ("outcome"), and a big utility function over all possible worlds ("lottery").

My question/observation is still the same as it was, but my misuse of the terminology has mangled this whole thread.

↑ comment by Paul Crowley (ciphergoth) · 2009-04-06T20:20:31.007Z · LW(p) · GW(p)

The reason why an inequitable distribution of money is problematic is that money has diminishing marginal utility; so if a millionaire gives $1000 to a poor person, the poor person gains more than the millionaire loses.

If your instincts are telling you that an inequitable distribution of utility is bad, are you sure you're not falling into the "diminishing marginal utility of utility" error that people have been empirically shown to exhibit? (can't find link now, sorry, I saw it here).

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T20:57:46.614Z · LW(p) · GW(p)

The reason why an inequitable distribution of money is problematic is that money has diminishing marginal utility; so if a millionaire gives $1000 to a poor person, the poor person gains more than the millionaire loses.

That's why I said "utility" instead of "money".

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2009-04-06T22:24:43.109Z · LW(p) · GW(p)

Er, I know, I'm contrasting money and utility. Could you expand a little more on what you're trying to say about my point?

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T22:31:20.939Z · LW(p) · GW(p)

The term "utility" means that I'm taking diminishing marginal returns into account.

My instincts are confused on the point, but my impression is that most people find average utilitarianism reprehensible.

Replies from: dclayh

↑ comment by dclayh · 2009-04-06T22:57:07.920Z · LW(p) · GW(p)

Perhaps it would help if you gave a specific example of an action that (a) follows from average utilitarianism as you understand it, and (b) you believe most people would find reprehensible?

Replies from: Nick_Tarleton, Nick_Tarleton

↑ comment by Nick_Tarleton · 2009-04-06T23:12:40.739Z · LW(p) · GW(p)

The standard answer is killing a person with below-average well-being*, assuming no further consequences follow from this. This assumes dying has zero disutility, however.

See comments on For The People Who Are Still Alive for lots of related discussion.

*The term "experienced utility" seems to be producing a lot of confusion. Utility is a decision-theoretic construction only. Humans, as is, don't have utility functions.

Replies from: CarlShulman, ciphergoth

↑ comment by CarlShulman · 2009-04-07T02:23:29.082Z · LW(p) · GW(p)

It also involves maximizing average instantaneous welfare, rather than the average of whole-life satisfaction.

↑ comment by Paul Crowley (ciphergoth) · 2009-04-06T23:24:08.534Z · LW(p) · GW(p)

Yes, I'm surprised that it's average rather than total utility is being measured. All other things being equal, twice as many people is twice as good to me.

↑ comment by Nick_Tarleton · 2009-04-06T23:09:23.380Z · LW(p) · GW(p)

The standard answer is killing a person with below-average well-being*, assuming no further consequences follow from this. This assumes dying has zero disutility, however.

See comments on For The People Who Are Still Alive for lots of related discussion.

*I consider the term "experienced utility" harmful. Utility is a decision-theoretic abstraction, not an experience.

↑ comment by [deleted] · 2014-02-15T17:18:19.336Z · LW(p) · GW(p)

Dead people presumably count as zero utility. I was rather frightened before I saw that -- if you only count living people, then you'd be willing to kill people for the crime of not being sufficiently happy or fulfilled.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-02-15T17:37:40.220Z · LW(p) · GW(p)

Dead people presumably count as zero utility.

This sentence doesn't really mean much. A dead person doesn't have preferences or utility (zero or otherwise) when dead any more than a rock does, a dead person had preferences and utility when alive. The death of a living person (who preferred to live) reduces average utility because the living person preferred to not die, and that preference is violated!

you'd be willing to kill people for the crime of not being sufficiently happy or fulfilled

I support the right to euthanasia for people who truly prefer to be killed, e.g. because they suffer from terminal painful diseases. Do you oppose it?

Perhaps you should see it more clearly if you think of it as maximizing the average preference utility across the timeline, rather than the average utility at a single point in time.

Replies from: None, Vulture

↑ comment by [deleted] · 2014-02-15T21:25:21.418Z · LW(p) · GW(p)

The death of a living person (who preferred to live) reduces average utility because the living person preferred to not die, and that preference is violated!

But after the fact, they are not alive, so they do not impact the average utility across all living things, so you are increasing the average utility across all living things.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-02-15T22:05:42.246Z · LW(p) · GW(p)

Here's what I mean, roughly expressed. Two possible timelines.
(A)

2010 Alice loves her life (she wants to live with continued life being a preference satisfaction of 7 per year). Bob merely likes his life (he wants to live with continued life being a preference satisfaction of 6 per year).
2011 Alice and 2011 Bob both alive as before.
2012 Alice and 2012 Bob both alive as before.
2013 Alice and 2013 Bob both alive as before.

2010 Bob wants 2010 Bob to exist, 2011 Bob to exist, 2012 Bob to exist, 2013 Bob to exist.
2010 Alice wants 2010 Alice to exist, 2011 Alice to exist, 2012 Alice to exist, 2013 Alice to exist.

2010 average utility is therefore (4x7 + 4x6) / 2= 26 and that also remains the average for the whole timeline.

(B)

2010 Alice and Bob same as before.
2011 Alice is alive. Bob has just been killed.
2012 Alice alive as before.
2013 Alice alive as before.

2010 average utility is: 4x7 +6 = (4x7 + 1x6)/2 = 17
2011 average utility is: 4x7 = 28
2012 average utility is: 4x7 = 28
2013 average utility is: 4x7 = 28

So Bob's death increased the average utility indicated in the preferences of a single year. But average utilty across the timeline is now (28 + 6 + 28 + 28 + 28) / 5 = 23.6

In short the average utility of the timeline as a whole is decreased by taking out Bob.

Replies from: None

↑ comment by [deleted] · 2014-02-16T00:24:04.431Z · LW(p) · GW(p)

You are averaging based on the population at the start of the experiment. In essence, you are counting dead people in your average, like Eliezer's offhanded comment implied he would. Also, you are summing over the population rather than averaging.

Correcting those discrepancies, we would see (ua => "utils average"; murder happening New Year's Day 2011):

(A) 2010: 6.5ua; 2011: 6.5ua; 2012: 6.5ua; 2013: 6.5ua
(B) 2010: 6.5ua; 2011: 7.0ua; 2012: 7.0ua; 2013: 7.0ua

The murder was a clear advantage.

Now, let's say we are using procreation instead of murder as the interesting behavior. Let's say each act of procreation reduces the average utility by 1, and it starts at 100 at the beginning of the experiment, with an initial population of 10.

In the first year, we can decrease the average utility by 10 in order to add one human with 99 utility. When do we stop adding humans? Well, it's clear that the average utility in this contrived example is equal to 110 minus the total population, and the total utility is equal to the average times the population size. If we have 60 people, that means our average utility is 50, with a total of 3,000 utils. Three times as good for everyone, except half as good for our original ten people.

We maximize the utility at a population of 55 in this example (and 55 average utility) -- but that's because we can't add new people very efficiently. If we had a very efficient way of adding more people, we'd end up with the average utility being just barely better than death, but we'd make up for it in volume. That's what you are suggesting we do.

That also isn't a universe I want to live in. Eliezer is suggesting we count dead people in our averages, nothing more. That's sufficient to go from kill-almost-all-humans to something we can maybe live with. (Of course, if we counted their preferences, that would be a conservatizing force that we could never get rid of, which is similarly worrying, albeit not as much so. In the worst case, we could use an expanding immortal population to counter it. Still-living people can change their preferences.)

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-02-16T03:03:55.955Z · LW(p) · GW(p)

You are averaging based on the population at the start of the experiment. In essence, you are counting dead people in your average, like Eliezer's offhanded comment implied he would

I consider every moment of living experience as of equal weight. You may call that "counting dead people" if you want, but that's only because when considering the entire timeline I consider every living moment -- given a single timeline, there's no living people vs dead people, there's just people living in different times. If you calculate the global population it doesn't matter what country you live in -- if you calculate the utility of a fixed timeline, it doesn't matter what time you live in.

But the main thing I'm not sure you get is that I believe preferences are valid also when concerning the future, not just when concerning the present.

If 2014 Carl wants the state of the world to be X in 2024, that's still a preference to be counted, even if Carl ends up dead in the meantime. That Carl severely does NOT want to be dead in 2024, means that there's a heavy disutility penalty for the 2014 function of his utility if he ends up nonetheless dead in 2024.

Of course, if we counted their preferences, that would be a conservatizing force that we could never get rid of

If e.g. someone wants to be buried at sea because he loves the sea, I consider it good that we bury them at sea.
But if someone wants to be buried at sea only because he believes such a ritual is necessary for his soul to be resurrected by God Poseidon, his preference is dependent on false beliefs -- it doesn't represent true terminal values; and that's the ones I'm concerned about.

If conservatism is e.g. motivated by either wrong epistemic beliefs, or by fear, rather than true different terminal values, it should likewise not modify our actions, if we're acting from an epistemically superior position (we know what they didn't).

Replies from: Jiro, None

↑ comment by Jiro · 2014-02-16T05:50:25.312Z · LW(p) · GW(p)

when considering the entire timeline I consider every living moment -- given a single timeline, there's no living people vs dead people, there's just people living in different times. If you calculate the global population it doesn't matter what country you live in -- if you calculate the utility of a fixed timeline, it doesn't matter what time you live in.

That's an ingenious fix, but when I think about it I'm not sure it works. The problem is that although you are calculating the utility integrated over the timeline, the values that you are integrating are still based on a particular moment. In other words, calculating the utility of the 2014-2024 timeline by 2014 preferences might not produce the same result as calculating the utility of the 2014-2024 timeline by 2024 preferences. Worse yet, if you're comparing two timelines and the two timelines have different 2024s in them, and you try to compare them by 2024 preferences, which timeline's 2024 preferences do you use?

For instance, consider timeline A: Carl is alive in 2014 and is killed soon afterwards, but two new people are born who are alive in 2024. timeline B: Carl is alive in 2014 and in 2024, but the two people from A never existed.

If you compare the timelines by Carl's 2014 preferences or Carl's timeline B 2024 preferences, timeline B is better, because timeline B has a lot of utility integrated over Carl's life. If you compare the timelines by the other people's timeline A 2024 preferences, timeline A is better.

It's tempting to try to fix this argument by saying that rather than using preferences at a particular moment, you will use preferences integrated over the timeline, but if you do that in the obvious way (by weighting the preferences according to the person-hours spent with that preference), then killing someone early reduces the contribution of their preference to the integrated utility, causing a problem similar to the original one.

↑ comment by [deleted] · 2014-02-16T03:32:30.265Z · LW(p) · GW(p)

I think you're arguing against my argument against a position you don't hold, but which I called by a term that sounds to you like your position.

Assuming you have a function that yields the utility that one person has at one particular second, what do you want to optimize for?

And maybe I should wait until I'm less than 102 degrees Fahrenheit to continue this discussion.

↑ comment by Vulture · 2014-02-15T18:15:26.762Z · LW(p) · GW(p)

you'd be willing to kill people for the crime of not being sufficiently happy or fulfilled

I support the right to euthanasia for people who truly prefer to be killed, e.g. because they suffer from terminal painful diseases. Do you oppose it?

I believe that what dhasenan was getting at is that without the assumption that a dead person has 0 utility, you would be willing to kill people who are happy (positive utility), but just not as happy as they could be. I'm not sure how exactly this would go mathematically, but the point is that killing a +utility person being a reduction in utility is a vital axiom

Replies from: None, ArisKatsaris

↑ comment by [deleted] · 2014-02-15T21:24:07.545Z · LW(p) · GW(p)

It's not that they could be happier. Rather, if the average happiness is greater than my happiness, the average happiness in the population will be increased if I die (assuming the other effects of a person dying are minimal or sufficiently mitigated).

↑ comment by ArisKatsaris · 2014-02-15T19:43:17.423Z · LW(p) · GW(p)

but the point is that killing a +utility person being a reduction in utility is a vital axiom

I don't know if we need have it as an axiom rather than this being a natural consequence of happy people preferring not to be killed, and of us likewise preferring not to kill them, and of pretty much everyone preferring their continued lives to their deaths... The good of preference utilitarianism is that it takes all these preferences as an input.

If preference average utilitarianism nonetheless leads to such an abominable conclusion, I'll choose to abandon preference average utilitarianism, considering it a failed/misguided attempt at describing my sense of morality -- but I'm not certain it needs lead to such a conclusion at all.

↑ comment by steven0461 · 2009-04-06T23:05:06.133Z · LW(p) · GW(p)

If I agreed, I'd be extremely curious as to what the average utility for all people across the multiverse actually is. (Is it dominated by people with extremely short lifespans, because they use so little computing power in a 4D sense?)

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T23:46:43.959Z · LW(p) · GW(p)

On average? 67 utilons.

Replies from: cousin_it

↑ comment by cousin_it · 2012-07-26T19:06:59.036Z · LW(p) · GW(p)

If you want to maximize a value, please don't compute it unconditionally, instead compute its dependence on your actions.

Replies from: wedrifid

↑ comment by wedrifid · 2012-07-27T03:03:37.562Z · LW(p) · GW(p)

If you want to maximize a value, please don't compute it unconditionally, instead compute its dependence on your actions.

This seems like a useful request to make in a different context. It doesn't appear relevant to the grandparent.

Replies from: cousin_it

↑ comment by cousin_it · 2012-07-27T10:25:18.238Z · LW(p) · GW(p)

Why? Eliezer said he wanted to maximize the average utility of all people, then said that average utility was 67. Now he faces the difficult task of maximizing 67. Or maybe maximizing the utility of people who share a planet with him, at the expense of other people in existence, so the average stays 67. Am I missing something? :-)

↑ comment by cousin_it · 2009-04-06T20:30:52.222Z · LW(p) · GW(p)

Excuse me, what's "average utility"? How do you compare utils of different people? Don't say you're doing it through the lens of your own utility function - this is begging the question.

Replies from: Jonathan_Graehl

↑ comment by Jonathan_Graehl · 2009-04-06T22:32:31.639Z · LW(p) · GW(p)

Coherent Extrapolated Volition tries to resolve conflicting individual utilities - see "Fred wants to kill Steve".

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-06T22:37:07.159Z · LW(p) · GW(p)

At this point, it looks like resolving conflicts should just be carried out as cooperation of individual preferences.

↑ comment by cousin_it · 2009-04-06T17:21:42.174Z · LW(p) · GW(p)

They're easy to help if their utility functions included terms that outlived them, e.g. "world peace forever". But it still feels somehow wrong to include them in the calculation, because this will necessarily be at the expense of living and future people.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T17:25:18.726Z · LW(p) · GW(p)

In my non-professional capacity, when I try to help others, I'm doing so to optimize my utility function over them: I want them to be happy, and only living people can fulfill this aspect of my utility function. It's in this sense that I say "we should" meaning "altruists should do the analogue for their own utility functions".

comment by private_messaging · 2014-01-28T19:55:07.064Z · LW(p) · GW(p)

Giving diminished returns on some valuable quantity X, equal distribution of X is preferable anyway.

I think you're quite confused about the constraints imposed by von Neumann-Morgenstern theorem.

In particular, it doesn't in any way imply that if you slice a large region of space into smaller regions of space, the utility of the large region of space has to be equal to the sum of utilities of smaller regions of space considered independently by what ever function gives you the utility within a region of space. Space being the whole universe, smaller regions of space being, say, spheres fitted around people's brains. You get the idea.

comment by timtyler · 2012-12-16T00:38:20.915Z · LW(p) · GW(p)

This post seems incoherent to me :-( It starts out talking about personal utilities, and then draws conclusions about the social utilities used in utilitarianism. Needless to say, the argument is not a logical one.

Proving that average utilitarianism is correct seems like a silly goal to me. What does it even mean to prove an ethical theory correct? It doesn't mean anything. In reality, evolved creatures exhibit a diverse range of ethical theories, that help them to attain their mutually-conflicting goals.

comment by Pablo (Pablo_Stafforini) · 2012-03-06T20:56:28.307Z · LW(p) · GW(p)

[Average utilitarianism] implies that for any population consisting of very good lives there is a better population consisting of just one person leading a life at a slightly higher level of well-being (Parfit 1984 chapter 19). More dramatically, the principle also implies that for a population consisting of just one person leading a life at a very negative level of well-being, e.g., a life of constant torture, there is another population which is better even though it contains millions of lives at just a slightly less negative level of well-being (Parfit 1984). That total well-being should not matter when we are considering lives worth ending is hard to accept. Moreover, average utilitarianism has implications very similar to the Repugnant Conclusion (see Sikora 1975; Anglin 1977).

Average utilitarianism has even more implausible implications. Consider a world A in which people experience nothing but agonizing pain. Consider next a different world B which contains all the people in A, plus arbitrarily many people all experiencing pain only slightly less intense. Since the average pain in B is less than the average pain in A, average utilitarianism implies that B is better than A. This is clearly absurd, since B differs from A only in containing a surplus of arbitrarily many people experiencing nothing but intense pain. How could one possibly improve a world by merely adding lots of pain to it?

Replies from: ArisKatsaris, PhilGoetz

↑ comment by ArisKatsaris · 2014-01-27T23:24:20.343Z · LW(p) · GW(p)

There's a confusion in asking "which world is better" without specifying for whom. 'Better' is a two-place word, better for whom?

World B is better than world A for an average person. Because an average person is going to be experiencing less pain in World B than an average person in world A.

World B is also better if you have to choose which of the two worlds you want to be an average person in.

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2014-01-28T11:15:37.140Z · LW(p) · GW(p)

'Better' is a two-place word, better for whom?

Argument needed.

ETA: Dear downvoter: okay, next time I will abstain from suggesting that assertions should be supported by arguments, even if these assertions are disputed by the relevant community of experts.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-01-28T11:52:05.964Z · LW(p) · GW(p)

If you're defining 'better' in a way that doesn't involve someone judging something to be preferable to the alternative, then please let me know how you're defining it.

This isn't about arguments, this is about what we mean when we use words. What do you mean when you say 'better'?

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2014-01-28T13:28:28.553Z · LW(p) · GW(p)

That a state of affairs can only be better if it is better for someone is a substantive and controversial position in moral philosophy. Statements of the form "World A is better than world B, both of which contain no people" are meaningful, at least to many of those who have considered this question in detail. In general, I don't think it's a good idea to dismiss a claim as "confused" when it is made by a group of smart people who are considered experts in the field (e.g. G. E. Moore or Larry Temkin).

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-01-28T14:57:48.457Z · LW(p) · GW(p)

You've used a phrase like "clearly absurd" to characterize a belief that World B is "better" than World A, but you say that the semantics of the word 'better' are controversial.

You've used personal suffering to judge the quality of World B vs the quality of World A, and yet it seems "better" doesn't need concern itself with personal viewpoints at all.

From the inside, every inhabitant of World A would prefer to have the average existence of World B, and yet it's "clearly absurd" to believe that this means World B is better.

Yes, these claims seem confused and contradictory to me.

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2014-01-28T19:12:36.040Z · LW(p) · GW(p)

You've used a phrase like "clearly absurd" to characterize a belief that World B is "better" than World A, but you say that the semantics of the word 'better' are controversial.

The absurdity that I was pointing out doesn't turn on the semantic ambiguity that you are referring to. Regardless of what you think about the meaning of 'better', it is clearly absurd to say that a world that differs from another only in containing a surplus of (undeserved, instrumentally useless) suffering is better. This is why virtually every contemporary moral philosopher rejects average utilitarianism.

From the inside, every inhabitant of World A would prefer to have the average existence of World B, and yet it's "clearly absurd" to believe that this means World B is better.

This is not an accurate characterization of the scenario I described. Please re-read my earlier comment.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-01-28T19:21:32.795Z · LW(p) · GW(p)

It is clearly absurd to say that a world that differs from another only in containing a surplus of (undeserved, instrumentally useless) suffering is better.

Except when you add "...for the average person in it" at the end of the sentence, in short when you specify who is it supposed to be better for, rather than use a vague, confused and unspecified notion of 'better'.

From the outside you say that a world with 10 people suffering LOTS is better than a world where (10 people suffer LOTS and 90 people suffer slightly less).

But given a choice, you'll choose to one of the people in the second world rather than one of the people in the first world. So what exactly is the first world "better" at? Better at not horrifying you when viewed from the outside, not better at not hurting you when experienced from the inside.

So, again, I dispute how "clearly absurd" preferring the second world is.

This is why virtually every contemporary moral philosopher rejects average utilitarianism.

I don't care what every contemporary moral philosopher says. If someone can't decide what something is supposed to be better at, then they have no business using the word 'better' in the first place.

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2014-02-01T13:33:29.818Z · LW(p) · GW(p)

From the outside you say that a world with 10 people suffering LOTS is better than a world where (10 people suffer LOTS and 90 people suffer slightly less).

But given a choice, you'll choose to one of the people in the second world rather than one of the people in the first world. So what exactly is the first world "better" at? Better at not horrifying you when viewed from the outside, not better at not hurting you when experienced from the inside.

No, you are conflating two distinct questions here: the question of whether a world is better "from the inside" versus "from the outside", and the question of which world you would pick, if you knew the welfare levels of every person in either world, but ignored the identities of these people. Concerning the first question, which is the question relevant to our original discussion, it is clear that the world with the surplus of agony cannot be better, regardless of whether or not you believe that things can be better only if they are better for someone (i.e. better "from the inside").

I don't care what every contemporary moral philosopher says. If someone can't decide what something is supposed to be better at, then they have no business using the word 'better' in the first place.

I was trying to get you understand that it is probably unwise to assume that people much smarter than you, who have thought about the issue for much longer than you have, have somehow failed to notice a distinction (impersonally better versus better for someone) that strikes you as glaringly obvious. And, as a matter of fact, these people haven't failed to notice that distinction, as anyone with a passing knowledge of the literature will readily attest.

Here's a simple scenario that should persuade you that things are not nearly as simple as you seem to believe they are. (This case was first introduced by Derek Parfit, and has been discussed extensively by population ethicists.) Suppose a woman who desires to have children is told by a competent doctor that, if she conceives a child within the next month, the child will very likely be born with a major disability. However, if the mother waits and conceives the child after this critical period, the baby's chances of being born with that disability are close to zero. Most reasonable people will agree that the mother should wait, or at least that she has a strong reason to wait. But why should she? If the mother doesn't wait and has a disabled child, this child couldn't say, "Mother, you wronged me. If you had waited, I wouldn't have been born with this disability." If the mother had waited, a different child would have been born. So if we want to insist that the mother does have a reason for waiting, we must drop, or at least revise, the principle that things can be better only if they are better for someone.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2014-02-01T17:47:15.096Z · LW(p) · GW(p)

it is clear that the world with the surplus of agony cannot be better,

Okay, do you have any argument other than "it is clear". Once again: it may be clear to you, it's not clear to me. "Clear" being another two-place word.

I was trying to get you understand that it is probably unwise to assume that people much smarter than you,

What's your reason to believe that they're much smarter than me?

And of course, for you to use the argument of their smartness, and for me to accept it, I wouldn't just have to accept they're smarter than me, I would also have to accept that they're smarter than me and that you are interpreting and expressing their views correctly.

I'd rather discuss the issue directly, rather than yield to the views of authorities which I haven't read.

So if we want to insist that the mother does have a reason for waiting, we must drop, or at least revise, the principle that things can be better only if they are better for someone.

It's me who was arguing on the side of average utilitarianism, in short the idea that we can consider the life of the average person, not a real specific person. Average utilitarianism clearly sides with the idea that the woman ought wait.

As for the particular example you gave, any decision in our presents makes future people in our light-cones "different" than they would otherwise have been.

If we're making a distinction between "John-A, born on Oct-1, and suffers from a crippling ailment" and "John-B, born on Nov-1, perfectly health", then we should also be making a distinction between "John-A, born on Oct-1, and suffers from a crippling ailment" and "John-C, born on Oct-1, used to suffer from a crippling ailment and was cured via a medical procedure shortly afterwards".

↑ comment by PhilGoetz · 2014-01-27T22:08:19.044Z · LW(p) · GW(p)

You realize you just repeated the scenario described in the quote?

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2014-01-28T11:11:39.709Z · LW(p) · GW(p)

I don't think I did. The implication I noted is even more absurd than those mentioned in the comment I was replying to. To see the difference, pay attention to the italicized text.

comment by Psy-Kosh · 2009-04-06T21:36:16.429Z · LW(p) · GW(p)

I may be misunderstanding here, but I think there's a distinction you're failing to make:

Max expected utility over possible future states (only one of which turns out to be real, so I guess max utility over expected future properties of the amplitude field over configuration space, rather than properties over individual configurations, if one want's to get nitpicky...), while average/total/whatever utilitarianism has to do with how you deal with summing the good experienced/recieved among people that would exist in the various modeled states.

At least that's my understanding.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T23:12:47.763Z · LW(p) · GW(p)

I'm looking at that distinction and un-making it. I don't see how you can choose not to average utility within an outcome, yet choose to average utility over possible future states.

Replies from: Psy-Kosh

↑ comment by Psy-Kosh · 2009-04-07T00:03:08.554Z · LW(p) · GW(p)

Oh, okay then. The version of the post that I read seemed to be more failing to notice it rather than trying to explicitly deal with it head on. Anyways, I'd still say there's a distinction that makes it not quite obvious that one implies the other.

Anyways, the whole maximize expected utility over future states (rather than future selves, I guess) comes straight out of the various theorems used to derive decision theory. Via the vulnerability arguments, etc, it's basically a "how not to be stupid, no matter what your values are" thing.

The average vs total utilitarianism thing would more be a moral position, a property of one's utility function itself. So that would have to come from somehow appealing to the various bits of us that process moral reasoning. At least in part. First, it requires an assumption of equal (in some form) inherent value of humans, in some sense. (Though now that I think about this, that condition may be weakenable)

Next, one has to basically, well, ultimately figure out whether maximizing average or total good across every person is preferable. Max good can produces oddities like the repugnant conclusion, of course.

Things like an appeal to, say, a sense of fairness would be an example of an argument for average utilitarianism.

An argument against would be, say, something like average utilitarianism would seem to imply that the inherent value of a person, that is, how important it is when something good or bad happens to them, would seem to decrease as population increases. This rubs my moral sense the wrong way.

(note, it's not currently obvious to me which is the Right Way, so am not trying to push either one. I'm merely giving examples of what kinds of arguments would, I think, be relevant here. ie, stuff that appeals to our moral senses or implications theirof rather than trying to make the internal structure of one's utility function correspond to the structure of decision theory itself. While utility functions like that may, in a certain sense, have a mathematical elegance, it's not really the type of argument that I'd think is at all relevant.)

comment by cja · 2009-04-07T01:41:44.525Z · LW(p) · GW(p)

This is back to the original argument, and not on the definition of expected utility functions or the status of utilitiarianism in general.

PhilGoetz's argument appears to contain a contradiction similar to that which Moore discusses in Principia Ethica, where he argues that the principle egoism does not entail utilitarianism.

Egoism: X ought to do what maximizes X's happiness.
Utilitarianism: X ought to do what maximizes EVERYONE's happiness

(or put Xo for X. and X_sub_x for Everyone).

X's happiness is not logically equivalent to Everyone's happiness. The important takeway here is that because happiness is indexed to an individual person (at least as defined in the egoistic principle), each person's happiness is an independent logical term.

We have to broaden the scope of egoism slightly to include whatever concept of the utility function you use, and the discussion of possible selves. However, unless you have a pretty weird concept of self/identity, I don't see why it wouldn't work. In that situation, X's future self in all possible worlds bears a relationship to X at time 0, such that future X's happiness is independent of future Everyone's happiness.

Anyway, using Von-Neumann Morgenstern doesn't work here. There is no logical reason to believe that averaging possible states with regard to an individual's utility has any implications for averaging happiness over many different individuals.

As addendum, neither average nor total utility provides a solution to the fairness, or justice, issue (i.e. how utility is distributed among people, which at least has some common sense gravity to it). Individual utility maximization more or less does not have to deal with that issue at all (their might be some issues with time-ordering of preferences, etc., but that's not close to the same thing). That's another sign Von-Neumann Morgenstern just doesn't give an answer as to which ethical system is more rational.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T03:52:11.850Z · LW(p) · GW(p)

There is no logical reason to believe that averaging possible states with regard to an individual's utility has any implications for averaging happiness over many different individuals.

How is it different? Aren't all of the different possible future yous different people? In both cases you are averaging utility over many different individuals. It's just that in one case, all of them are copies of you.

Replies from: cja

↑ comment by cja · 2009-04-07T04:46:20.984Z · LW(p) · GW(p)

That's why I threw in the disclaimer about needing some theory of self/identity. Possible future Phil's must bear a special relationship to the current Phil, which is not shared by all other future people--or else you lose egoism altogether when speaking about the future.

There are certainly some well thought out arguments that when thinking about your possible future, you're thinking about an entirely different person, or a variety of different possible people. But the more you go down that road, the less clear it is that classical decision theory has any rational claim on what you ought to do. The Ramsey/Von Neumann-Morgenstern framework tacitly requires that when a person acts so has to maximize his expected utility, he does so with the assumption that he his actually maximizing HIS expected utility, not someone else's.

This framework only makes sense if each possible person over which the utility function is defined is the agents future self, not another agent altogether. There needs to be some logical or physical relationship between the current agent and the class of future possible agent's such that their self/identity is maintained.

The less clear that the identity is maintained, the less clear that there is a rational maxim that the agent should maximize the future agent's utility...which among other things, is a philosopher's explanation for why we discount future value when performing actions, beyond what you get from simple time value of money.

So you still have the problem that the utility is, for instance, defined over all possible future Phil's utilities, not over all possible future people's. Possible Phil's are among the class of possible people (i presume), but not vice versa. So there is no logical relationship that a process that holds for possible phil's holds for possible future people.

Replies from: loqi

↑ comment by loqi · 2009-04-07T05:19:06.614Z · LW(p) · GW(p)

The Ramsey/Von Neumann-Morgenstern framework tacitly requires that when a person acts so has to maximize his expected utility, he does so with the assumption that he his actually maximizing HIS expected utility, not someone else's.

Sure, and when you actually do the expected utility calculation, you hold the utility function constant, regardless of who specifically is theoretically acting. For example, I can maximize my expected utility by sabotaging a future evil self. To do this, I have to make an expected utility calculation involving a future self, but my speculative calculation does not incorporate his utility function (except possibly as useful information).

The less clear that the identity is maintained, the less clear that there is a rational maxim that the agent should maximize the future agent's utility

This maxim isn't at all clear to me to begin with. Maximizing your future self's utility is not the same as maximizing your current self's utility. The only time these are necessarily the same is when there is no difference in utility function between current and future self, but at that point you might as well just speak of your utility, period. If you and all your future selves possess the same utility function, you all by definition want exactly the same thing, so it makes no sense to talk about providing "more utility" to one future self than another. The decision you make carries exactly the same utility for all of you.

Replies from: cja

↑ comment by cja · 2009-04-07T07:30:26.597Z · LW(p) · GW(p)

You're right that insofar as the utility function of the my future self is the same as my current utility function, I should want to maximize the utility of my future self. But my point with that statement is precisely that one's future self can have very different interests that one's current self, as you said (hence, the heroin addict example EDIT: Just realized I deleted that from the prior post! Put back in at the bottom of this one!).

Many (or arguably most) actions we perform can be explained (rationally) only in terms of future benefits. Insofar as my future self just is me, there's no problem at all. It is MY present actions are maximizing MY utility (where actions are present, and utility not necessarily indexed by time, and if it is indexed by time, not by reference to present and future selves, just to ME). I take something like that to be the everyday view of things. There is only one utility function, though it might evolve over time

(the evolution brings about its own complexities. If a 15 year old who dislikes wine is offered a $50,000 bottle of wine for $10, to be given to him when he is 30, should he buy the wine? Taking a shortsighted look, he should turn it down. But if he knows by age 30 he's going to be a wine connoisseur, maybe he should buy it after all cause it's a great deal).

However, on the view brought up by Phil, that an expected utility function is definied over many different future selves, who just are many different people, you have to make things more complicated (or at the very least, we're on the edge of having to complicate things). Some people will argue that John age 18, John age 30, and John age 50 are three completely different people. On this view, it is not clear that John age 18 rationally ought to perform actions that will make the lives of Johns age 30/50 better (at little detriment to his present day). On the extreme view, John's desire to have a good job at age 30 does not provide a reason to go college--because John 18 will never be 30, some other guy will reap the benefits (admittedly, John likely receives some utility from the deceived view that he is progressing toward his goal; but then progression, not the goal itself, is the end that rationalizes his actions). Unless you establish an utilitarian or altruistic rational norm, etc., the principles of reason do not straightforwardly tell us to maximize other peoples utilities.

The logic naturally breaks apart even more when we talk about many possible John age 30s, all of whom live different quite different lives and are not the same agent at all as John age 18. It really breaks down if John age 18 + 1 second is not the same as John age 18. (On a short time scale, very few actions, if any, derive immediate utility. e.g. I flip the light swich to turn on the light, but there is at least a milisecond between performing the basic action and the desired effect occurring).

Which is why, if many of our actions are to make rational sense, an agent's identity has to be maintained through time...at least in some manner. And that's all I really wanted establish, so as to show that the utilities in an expected utility calculation are still indexed to an individual, not a collection of people that have nothing to do with each other (maybe John1, John2, etc are slightly different--but not so much as John1 and Michael are). However, if someone wants to take the view that John age 18 and John age 18 + 1s are as different as John and Michael, I admittedly can't prove that someone wrong.

EDIT: Heroin example: sorry for any confusion

you are having surgery tomorrow. There's a 50% chance that (a) you will wake up with no regard for former interests and relationships, and hopelessly addicted to heroin. There's a 50% chance that (b) you will wake up with no major change to your personality. You know that in (a) you'll be really happy if you come home from surgery to a pile full of heroin. And in (b) if you come home and remember that you wasted your life savings on heroin, you will only be mildly upset.

In order to maximize the expected utility of the guy who’s going to come out of surgery, you should go out and buy all of the heroin you can (and maybe pay someone to prevent you from ODing). But it’s by no means clear that you rationally ought to do this. You are trying to maximize your utility. Insofar as you question whether or not the heroin addict in (a) counts as yourself, you should minimize the importance of his fate in your expected utility calculation. Standing here today, I don’t care what that guy’s life is like, even if it is my physical body. I would rather make the utility of myself in (b) slightly higher, even at the risk of making the utility of the person in (a) significantly lower.

Replies from: loqi

↑ comment by loqi · 2009-04-07T16:16:18.621Z · LW(p) · GW(p)

Many (or arguably most) actions we perform can be explained (rationally) only in terms of future benefits.

Mostly true, but Newcomb-like problems can muddy this distinction.

There is only one utility function, though it might evolve over time

No, it can't. If the same utility function can "evolve over time", it's got type (Time -> Outcome -> Utilons), but a utility function just has type (Outcome -> Utilons).

Unless you establish an utilitarian or altruistic rational norm, etc., the principles of reason do not straightforwardly tell us to maximize other peoples utilities.

Agreed. The same principle applies to the utility of future selves.

It really breaks down if John age 18 + 1 second is not the same as John age 18.

No, it really doesn't. John age 18 has a utility function that involves John age 18 + 1 second, who probably has a similar utility function. Flipping the light grants both of them utility.

Insofar as you question whether or not the heroin addict in (a) counts as yourself, you should minimize the importance of his fate in your expected utility calculation.

I don't see how this follows. The importance of the heroin addict in my expected utility calculation reflects my values. Identity is (possibly) just another factor to consider, but it has no intrinsic special privilege.

I would rather make the utility of myself in (b) slightly higher, even at the risk of making the utility of the person in (a) significantly lower.

That may be, but your use of the word "utility" here is confusing the issue. The statement "I would rather" is your utility function. When you speak of "making the utiity of (b) slightly higher", then I think you can only be doing so because "he agrees with me on most everything, so I'm actually just directly increasing my own utility" or because "I'm arbitrarily dedicating X% of my utility function to his values, whatever they are".

comment by smoofra · 2009-04-06T19:49:07.893Z · LW(p) · GW(p)

I haven't done the math, so take the following with a grain of salt.

We humans care about what will happen in the future. We care about how things will turn out. Call each possible future an "outcome". We humans prefer some outcomes over another. We ought to steer the future towards the outcomes we prefer. Mathematically, we have a (perhaps partial) order on the set of outcomes, and if we had perfect knowledge of how our actions affected the future, our decision procedure would just be "pick the best outcome".

So far I don't think I've said anything controversial.

But we don't have perfect knowledge of the future. We must reason and act under uncertainty. The best we can hope to do is assign conditional probabilities of to outcomes based on our possible actions. But in order to choose actions based on probabilities rather than certainties, we have to know a little more about what our preferences actually are. It's not enough to know that one outcome is better than another, we have to know how much better. Let me give an example. If you are given the choice between winning a little money with probability .1 and a lot of money with probability .01, which option do you choose? Well, I haven't given you enough information. If "a little" is $1 and "a lot" is $1 million, you should go for "a lot". But if "a lot" is only $2, you're better off going for "a little".

So it turns out that if you want to have a consistent decision theory under these circumstances, there's only one form it can take. Instead of a partial order on outcomes, you have to express your preference for each outcome as a number, called the utility of that outcome. And instead of selecting action that leads to the best outcome, you select the actions that lead to the highest expected utility.

The function that maps outcomes to utilities is called the utility function, and it is unique (given your preferences) up to an positive affine transformation. In other words, you can multiply the whole utility function by a positive scalar, or add any number to it, and it's meaning does not change.

So you see, "maximize expected utility" doesn't mean you have to maximize say, your profit, or your personal happiness, or even the number of human lives you save. All it really means is that your preferences ought to be consistent, because if they are, and if you're trying the best you can to steer the future towards them, then "maximizing expected utility" is what you're already doing.

comment by Nick_Tarleton · 2009-04-06T19:01:12.086Z · LW(p) · GW(p)

Average utilitarianism is actually a common position.

"Utility", as Eliezer says, is just the thing that an agent maximizes. As I pointed out before, a utility function need not be defined over persons or timeslices of persons (before aggregation or averaging); its domain could be 4D histories of the entire universe, or other large structures. In fact, since you are not indifferent between any two distributions of what you call "utility" with the same total and the same average, your actual preferences must have this form. This makes questions of "distribution of utility across people" into type errors.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T20:51:15.116Z · LW(p) · GW(p)

If your utility were defined over all possible futures, you wouldn't speak of maximizing expected utility. You would speak of maximizing utility. "The expected value of X" means "the average value of X". The word "expected" means that your aggregation function over possible outcomes is simple averaging. Everything you said applies to one evaluation of the utility function, over one possible outcome; these evaluations are then averaged together. That is what "expected utility" means.

Replies from: steven0461

↑ comment by steven0461 · 2009-04-06T22:56:01.928Z · LW(p) · GW(p)

If your utility were defined over all possible futures, you wouldn't speak of maximizing expected utility. You would speak of maximizing utility.

The utility function defined on lotteries is the expectation value of the utility function defined on futures, so maximizing one means maximizing the expectation value of the other. When we say "maximizing expected utility" we're referring to the utility function defined on futures, not the utility function defined on lotteries. (As far as I know, all such utility functions are by definition defined over all possible futures; else the formalism wouldn't work.)

edit: you seem to be thinking in terms of maximizing the expectation of some number stored in your brain, but you should be thinking more in terms of maximizing the expectation of some number Platonically attached to each possible future.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T23:53:09.945Z · LW(p) · GW(p)

The utility function defined on lotteries is the expectation value of the utility function defined on futures, so maximizing one means maximizing the expectation value of the other.

Yes. I realize that now.

Replies from: steven0461

↑ comment by steven0461 · 2009-04-07T00:06:04.184Z · LW(p) · GW(p)

Ah yes, sorry, I should have known from your "EDIT 2". I don't agree that you were right in essence; averaging over all outcomes and totaling over all outcomes mean the exact same thing as far as I can tell, and maximizing expected utility does correspond to averaging over all outcomes and not just the subset where you're alive.

comment by conchis · 2009-04-07T09:01:03.080Z · LW(p) · GW(p)

The von Neumann-Morgenstern theorem indicates that the only reasonable form for U is to calculate the expected value of u(outcome) over all possible outcomes

I'm afraid that's not what it says. It says that any consistent set of choices over gambles can be represented as the maximization of some utility function. It does not say that that utility function has to be u. In fact, it can be any positive monotonic transform of u. Call such a transform u*.

This means that your utility function U is indifferent with regard to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.

I'm afraid this still isn't right either. To take an example, suppose u* = ln(u+1). Assuming 50-50 odds for each outcome, the Eu for your first gamble is ln(11). The Eu* for your second gamble is ln(36), which is higher. So the second gamble is preferred, contra your claim of indifference. In fact, this sort of "inequality aversion" (which is actually just risk aversion with respect to u) will be present whenever u* is a concave function of u.

The rest of the argument breaks down at this point too, but you do raise an interesting question: are there arguments that we should have a particular type of risk aversion with respect to utility (u), or are all risk preferences equally rational?

EDIT: John Broome's paper "A Mistaken Argument Against the Expected Utility Theory of Rationality" (paywalled, sorry) has a good discussion of some of these issues, responding to an argument made by Maurice Allais along the lines of your original post, and showing that it rested on a mathematical error.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T13:30:57.562Z · LW(p) · GW(p)

I read summaries on the web yesterday (can't find them now) that concluded that the theorem proves that any utility function U that satisfies the axioms must simply be the expected value of the utility functions u. Wikipedia says U must be a linear combination of the function u and the probabilities of each outcome, which is more vague. But in either case, U = ln(u+1) is not allowed.

To take an example, suppose u = ln(u+1). Assuming 50-50 odds for each outcome, the Eu for your first gamble is ln(11). The Eu* for your second gamble is ln(36), which is higher. So the second gamble is preferred, contra your claim of indifference.

If u is just a function of u, then u is the single-outcome utility function. What is u, and why do you even bring it up? Are you thinking that I mean pre-diminishing-marginal-utility objects when I say utility?

The second gamble is preferred because it has a higher average utility. The actual utilities involved here are ln(11) and 2ln(5); ln(11) is less than twice ln(5). So this is not a case in which we are compensating for inequity; total utility is higher in the 2ln(5) case.

I think you are misinterpreting me and thinking that I mean u(outcome) is an object measure like dollars. It's not. It's utility.

Wait! I think you're thinking that I'm thinking that the theorem reaches directly down through the level of U, and also says something about u. That's not what I'm thinking.

Replies from: conchis

↑ comment by conchis · 2009-04-07T18:16:41.096Z · LW(p) · GW(p)

I'm afraid I have no idea what you're thinking any more (if I ever did). I am, however, fairly confident that whatever you're thinking, it does not imply utilitarianism, average or otherwise (unless you're also using those terms in a non-standard way).

My point is whatever you think of as utility, I can apply a positive monotonic transformation to it, maximize the expectation of that transformation, and this will still be rational (in the sense of complying with the Savage axioms).

Accepting, arguendo, that your analogy to interpersonal utility aggregation holds up (I don't think it does, but that's another matter), this means that, whatever utilities individuals have, I can apply a positive monotonic transformation to them all, maximize the expectation of the transformation, and this will be rational (in some sense that is as yet, not entirely clear). If that transformation is a concave function, then I can take account of inequality. (Or, more precisely, I can be prioritarian.) This is all pretty standard stuff.

in either case, U = ln(u+1) is not allowed.

u* = ln(u+1), not U=ln(u+1). u* = ln(u+1) is allowed.

What is u, and why do you even bring it up?

It was intended to reflect whatever you conceived of utility as. I brought it up because you did.

Are you thinking that I mean pre-diminishing-marginal-utility objects when I say utility?

No.

The second gamble is preferred because it has a higher average utility.

It has higher average utility*, but it doesn't have higher average utility.

The actual utilities involved here are ln(11) and 2ln(5); ln(11) is less than twice ln(5). So this is not a case in which we are compensating for inequity; total utility is higher in the 2ln(5) case.

It's 2ln(6) = ln(36), but that doesn't affect the point.

I think you are misinterpreting me and thinking that I mean u(outcome) is an object measure like dollars.

Nope.

Wait! I think you're thinking that I'm thinking that the theorem reaches directly down through the level of U, and also says something about u. That's not what I'm thinking.

I'm afraid I don't know what you mean by this.

EDIT:

Wikipedia says U must be a linear combination of the function u and the probabilities of each outcome, which is more vague.

Actually, it's slightly more precise. Maximizing any linear combination of the u and the probabilities will describe the same set of preferences over gambles as maximizing E(u). The same goes for any positive affine transformation of u.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T22:00:28.513Z · LW(p) · GW(p)

My point is whatever you think of as utility, I can apply a positive monotonic transformation to it, maximize the expectation of that transformation, and this will still be rational (in the sense of complying with the Savage axioms).

Sure. That has no bearing on what I'm saying. You are still maximizing expectation of your utility. Your utility is not the function pre-transformation. The axioms apply only if the thing you are maximizing the expectation of is your utility function. There's no reason to bring up applying a transformation to u to get a different u. You're really not understanding me if you think that's relevant.

Maximizing any linear combination of the u and the probabilities will describe the same set of preferences over gambles as maximizing E(u).

Not at all. You can multiply each probability by a different constant if you do that. Or you can multiply them all by -1, and you would be minimizing E(u).

Replies from: conchis

↑ comment by conchis · 2009-04-07T23:29:46.661Z · LW(p) · GW(p)

Sure. That has no bearing on what I'm saying.

Did you even read the next paragraph where I tried to explain why it does have a bearing on what you're saying? Do you have a response?

Not at all. You can multiply each probability by a different constant if you do that.

Fair. I assumed a positive constant. I shouldn't have.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-07T23:40:43.383Z · LW(p) · GW(p)

Did you even read the next paragraph where I tried to explain why it does have a bearing on what you're saying? Do you have a response?

I read it. I don't understand why you keep bringing up "u", whatever that is. You use u to represent the utility function on a possible world. We don't care what is inside that utility function for the purposes of this argument. And you can't* get out of taking the expected value of your utility function by transforming it into another utility function. Then you just have to take the expected value of that new utility function.

Read steven0461's comment above. He has it spot on.

Replies from: conchis

↑ comment by conchis · 2009-04-08T01:39:41.937Z · LW(p) · GW(p)

I read it.

But you still haven't engaged with it at all. I'm going to give this one last go before I give up.

Utilitarianism starts with a set of functions describing each individuals' welfare. To purge ourselves of any confusion over what u means, let's call these w(i). It then defines W as the average (or the sum, the distinction isn't relevant for the moment) of the w(i), and ranks certain (i.e. non-risky) states of the world higher if they have higher W. Depending on the type of utilitarianism you adopt, the w(i) could be defined in terms of pleasure, desire-satisfaction, or any number of other things.

The Savage/von Neuman-Morgenstern/Marschak approach starts from a set of axioms that consistent decision-makers are supposed to adhere to when faced with choices over gambles. It says that, for any consistent set of choices you might make, there exists a function f (mapping states of the world into real numbers), such that your choices correspond to maximizing E(f). As I think you realize, it puts no particular constraints on f.

Substituting distributions of individuals for probability distributions over the states of the world (and ignoring for the moment the other problems with this), the axioms now imply that for any consistent set of choices we might make, there exists a function f, such that our choices correspond to maximizing E(f).

Again, there are no particular constraints on f. As a result, (and this is the crucial part) nothing in the axioms say that f has to have anything to do with the w(i). Because the f does not have to have anything in particular to do with the w(i), E(f) does not have to have anything to do with W, and so the fact that we are maximizing E(f) says nothing about we are or should be good little utilitarians.

Replies from: gjm, PhilGoetz

↑ comment by gjm · 2009-04-08T01:57:02.525Z · LW(p) · GW(p)

I think you may be missing PhilGoetz's point.

E(f) needn't have anything to do with W. But it has to do with the f-values of all the different versions of you that there might be in the future (or in different Everett branches, or whatever). And it treats all of them symmetrically, looking only at their average and not at their distribution.

That is: the way in which you decide what to do can be expressed in terms of some preference you have about the state of the world, viewed from the perspective of one possible-future-you, but all those possible-future-yous have to be treated symmetrically, just averaging their preferences. (By "preference" I just mean anything that plays the role of f.)

So, Phil says, if that's the only consistent way to treat all the different versions of you, then surely it's also the only consistent way to treat all the different people in the world.

(This is of course the controversial bit. It's far from obvious that you should see the possible-future-yous in the same way as you see the actual-other-people. For instance, because it's more credible to think of those possible-future-yous as having the same utility function as one another. And because we all tend to care more about the welfare of our future selves than we do about other people's. And so on.)

If so, then the following would be true: To act consistently, your actions must be such as to maximize the average of something (which, yes, need have nothing to do with the functions w, but it had better in some sense be the same something) over all actual and possible people.

I think Phil is wrong, but your criticisms don't seem fair to me.

Replies from: Nick_Tarleton, conchis, PhilGoetz, conchis

↑ comment by Nick_Tarleton · 2009-04-08T03:15:21.350Z · LW(p) · GW(p)

But it has to do with the f-values of all the different versions of you that there might be in the future (or in different Everett branches, or whatever).

I think this is part of the problem. Expected utility is based on epistemic uncertainty, which has nothing to do with objective frequency, including objective frequency across Everett branches or the like.

Replies from: gjm, PhilGoetz

↑ comment by gjm · 2009-04-08T19:45:43.899Z · LW(p) · GW(p)

Nothing to do with objective frequency? Surely that's wrong: e.g., as you gain information, your subjective probabilities ought to converge on the objective frequencies.

But I agree: the relevant sense of expectation here has to be the subjective one (since all an agent can use in deciding what to prefer is its own subjective probabilities, not whatever objective ones there may be), and this does seem like it's a problem with what Phil's saying.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-09T04:26:05.637Z · LW(p) · GW(p)

the relevant sense of expectation here has to be the subjective one (since all an agent can use in deciding what to prefer is its own subjective probabilities, not whatever objective ones there may be), and this does seem like it's a problem with what Phil's saying.

It's just as much a problem with all of decision theory, and all expectation maximization, as with anything I'm saying. This may be a difficulty, but it's completely orthogonal to the issue at hand, since all of the alternatives have that same weakness.

Replies from: conchis

↑ comment by conchis · 2009-04-09T13:55:55.512Z · LW(p) · GW(p)

I think the point is that, if probability is only in the mind, the analogy between averaging over future yous and averaging over other people is weaker than it might initially appear.

It sort of seems like there might be an analogy if you're talking about averaging over versions of you that end up in different Everett branches. But if we're talking about subjective probabilities, then there's no sense in which these "future yous" exist, except in your mind, and it's more difficult to see the analogy between averaging over them, and averaging over actual people.

↑ comment by PhilGoetz · 2009-04-08T04:11:49.451Z · LW(p) · GW(p)

Expected utility is based on epistemic uncertainty

What? So if you had accurate information about the probability distribution of outcomes, then you couldn't use expected utility? I don't think that's right. In fact, it's exactly the reverse. Expected utility doesn't really work when you have epistemic uncertainty.

Replies from: Nick_Tarleton

↑ comment by Nick_Tarleton · 2009-04-08T13:44:38.849Z · LW(p) · GW(p)

What does "accurate information about the probability distribution" mean? Probability is in the mind.

If I'm using subjective probabilities at all – if I don't know the exact outcome – I'm working under uncertainty and using expected utility. If, in a multiverse, I know with certainty the objective frequency distribution of single-world outcomes, then yes, I just pick the action with the highest utility.

Replies from: steven0461, Vladimir_Nesov, PhilGoetz

↑ comment by steven0461 · 2009-04-08T16:39:08.947Z · LW(p) · GW(p)

I don't know that this affects your point, but I think we can make good sense of objective probabilities as being something else than either subjective probabilities or objective frequencies. See for example this.

↑ comment by Vladimir_Nesov · 2009-04-08T15:22:45.549Z · LW(p) · GW(p)

If, in a multiverse, I know with certainty the objective frequency distribution of single-world outcomes, then yes, I just pick the action with the highest utility.

You still need to multiply utility of those events by their probability. Objective frequency is not as objective as it may seem, it's just a point at which the posterior is no longer expected to change, given more information. Or, alternatively, it's "physical probability", a parameter in your model that has nothing to do with subjective probability and expected utility maximization, and has the status similar to that of, say, mass.

↑ comment by PhilGoetz · 2009-04-08T15:28:51.198Z · LW(p) · GW(p)

You said

Expected utility is based on epistemic uncertainty, which has nothing to do with objective frequency

If you have accurate information about the probability distribution, you don't have epistemic uncertainty.

What was you original comment supposed to mean?

If, in a multiverse, I know with certainty the objective frequency distribution of single-world outcomes, then yes, I just pick the action with the highest utility.

This is the situation which the theory of expected utility was designed for. I think your claim is exactly backwards of what it should be.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-08T16:24:13.256Z · LW(p) · GW(p)

If you have accurate information about the probability distribution, you don't have epistemic uncertainty.

This is not what is meant by epistemic uncertainty. In a framework of Bayesian probability theory, you start with a fixed, exactly defined prior distribution. The uncertainty comes from working with big events on state space, some of them coming in form of states of variables, as opposed to individual states. See Probability Space, Random Variable, also E.T. Jaynes (1990) "Probability Theory as Logic" may be helpful.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-09T04:28:50.527Z · LW(p) · GW(p)

According to Wikipedia, that is what is meant by epistemic uncertainty. It says that one type of uncertainty is

"1. Uncertainty due to variability of input and / or model parameters when the characterization of the variability is available (e.g., with probability density functions, pdf),"

and that all other types of uncertainty are epistemic uncertainty.

And here's a quote from "Separating natural and epistemic uncertainty in flood frequency analysis", Bruno Merz and Annegret H. Thieken, J. of Hydrology 2004, which also agrees with me:

"Natural uncertainty stems from variability of the underlying stochastic process. Epistemic uncertainty results from incomplete knowledge about the process under study."

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2009-04-09T10:47:11.179Z · LW(p) · GW(p)

This "natural uncertainty" is a property of distributions, while epistemic uncertainty to which you refer here corresponds to what I meant. When you have incomplete knowledge about the process under study, you are working with one of the multiple possible processes, you are operating inside a wide event that includes all these possibilities. I suspect you are still confusing the prior on global state space with marginal probability distributions on variables. Follow the links I gave before.

↑ comment by conchis · 2009-04-08T02:38:28.335Z · LW(p) · GW(p)

To act consistently, your actions must be such as to maximize the average of something

Yes, but maximizing the average of something implies neither utilitarianism nor indifference to equity, as Phil has claimed it does. I don't see how pointing this out is unfair.

↑ comment by PhilGoetz · 2009-04-08T04:02:37.987Z · LW(p) · GW(p)

You understand what I'm saying, so I'd very much like to know why you think it's wrong.

Note that I'm not claiming that average utilitarianism must be correct. The axioms could be unreasonable, or a strict proof could fail for some as-yet unknown reason. But I think the axioms are either reasonable in both cases, or unreasonable in both cases; and so expected-value maximization and average utilitarianism go together.

Replies from: gjm

↑ comment by gjm · 2009-04-08T19:51:02.153Z · LW(p) · GW(p)

See (1) the paragraph in my comment above beginning "This is of course the controversial bit", (2) Wei_Dai's comment further down and my reply to it, and (3) Nick Tarleton's (basically correct) objection to my description of E(f) as being derived from "the f-values of all the different versions of you".

↑ comment by conchis · 2009-04-08T02:23:34.652Z · LW(p) · GW(p)

E(f) needn't have anything to do with W. But it has to do with the f-values of all the different versions of you that there might be in the future.

I think this is part of where things are going wrong. f values aren't things that each future version of me has. They are especially not the value that a particular future me places on a given outcome, or the preferences of that particular future me. f values are simply mathematical constructs built to formalize the choices that current me happens to make over gambles.

Despite its name, expected utility maximization is not actually averaging over the preferences of future me-s; it's just averaging a more-or-less arbitrary function, that may or may not have anything to do with those preferences.

↑ comment by PhilGoetz · 2009-04-08T03:56:26.699Z · LW(p) · GW(p)

As a result, (and this is the crucial part) nothing in the axioms say that f has to have anything to do with the w(i).

Here we disagree. f is the utility function for a world state. If it were an arbitrary function, we'd have no reason to think that the axioms should hold for it. Positing the axioms is based on our commonsense notion of what utility is like.

I'm not assuming that there are a bunch of individual w(i) functions. Think instead of a situation where one person is calculating only their private utility. f is simply their utility function. You may be thinking that I have some definition of "utilitarianism" that places restrictions on f. "Average utilitarianism" does, but I don't think "utilitarianism" does; and if it did, then I wouldn't apply it here. The phrase "average utilitarianism" has not yet come into play in my argument by this point. All I ask at this point in the argument, is what the theorem asks - that there be a utility function for the outcome.

I thikn you're thinking that I'm saying that the theorem says that f has to be a sum or average of the w(i), and therefore we have to be average utilitarians. That's not what I'm saying at all. I tried to explain that already before. Read steven0461's comment above, and my response to it.

Replies from: conchis, PhilGoetz

↑ comment by conchis · 2009-04-08T04:39:21.417Z · LW(p) · GW(p)

The claim I am taking exception to is the claim that the vNM axioms provide support to (average) utilitarianism, or suggest that we need not be concerned with inequality. This is what I took your bullet points 6 and 8 (in the main post) to be suggesting (not to mention the title of the post!)

If you are not claiming either of these things, then I apologize for misunderstanding you. If you are claiming either of these things, then my criticisms stand.

As far as I can tell, most of your first two paragraphs are inaccurate descriptions of the theory. In particular, f is not just an individual's private utility function. To the extent that the vNM argument generalizes in the way you want it to, f can be any monotonic transform of a private utility function, which means, amongst other things, that we are allowed to care about inequality, and (average) utilitarianism is not implied.

But I've repeated myself enough. I doubt this conversation is productive any more, if it ever was, so I'm going to forego adding any more noise from now on.

Read steven0461's comment above, and my response to it.

I read both of them when they were originally posted, and have looked over them again at your exhortation, but have sadly not discovered whatever enlightenment you want me to find there.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-08T04:51:30.282Z · LW(p) · GW(p)

The claim I am taking exception to is the claim that the vNM axioms provide support to (average) utilitarianism, or suggest that we need not be concerned with inequality. This is what I took your bullet points 6 and 8 (in the main post) to be suggesting (not to mention the title of the post!)

As steven0461 said,

If in all the axioms of the expected utility theorem you replace lotteries by distributions of individual welfare, then the theorem proves that you have to accept utilitarianism.

Not "proven", really, but he's got the idea.

As far as I can tell, most of your first two paragraphs are inaccurate descriptions of the theory. In particular, f is not just an individual's private utility function. To the extent that the vNM argument generalizes in the way you want it to, f can be any monotonic transform of a private utility function, which means, amongst other things, that we are allowed to care about inequality, and (average) utilitarianism is not implied.

I am pretty confident that you're mistaken. f is a utility function. Furthermore, it doesn't matter that the vNM argument can apply to things that satisfy the axioms but aren't utility functions, as long as it applies to the utility functions that we maximize when we are maximizing expected utility.

Either my first two bullet points are correct, or most of the highest-page-ranked explanations of the theory on the Web are wrong. So perhaps you could be specific about how they are wrong.

Replies from: conchis

↑ comment by conchis · 2009-04-08T17:41:47.710Z · LW(p) · GW(p)

I understand what steven0461 said. I get the idea too, I just think it's wrong. I've tried to explain why it's wrong numerous times, but I've clearly failed, and don't see myself making much further progress.

In lieu of further failed attempts to explain myself, I'm lodging a gratuitous appeal to Nobel Laureate authority, leaving some further references, and bowing out.

The following quote from Amartya Sen (1979) pretty much sums up my position (in the context of a similar debate between him and Harsanyi about the meaning of Harsanyi's supposed axiomatic proof of utilitarianism).

[I]t is possible to define individual utilities in such a way that the only way of aggregating them is by summation. By confining his attention to utilities defined in that way, John Harsanyi has denied the credibility of "nonlinear social welfare functions." That denial holds perfectly well for the utility measures to which Harsanyi confines his attention, but has no general validity outside that limited framework. Thus, sum-ranking remains an open issue to be discussed in terms of its moral merits-and in particular, our concern for equality of utilities-and cannot be "thrust upon" us on grounds of consistency.

Further refs, if anyone's interested:

Harsanyi, John (1955), "Cardinal Welfare, Individualistic Ethics and Interpersonal Comparisons of Utility", Journal of Political Economy 63. (Harsanyi's axiomatic "proof" of utilitarianism.)
Diamond, P. (1967) "Cardinal Welfare, Individualistic Ethics and Interpersonal Comparisons of Utility: A Comment", Journal of Political Economy 61
Harsanyi, John (1975) "Nonlinear Social Welfare Functions: Do Welfare Economists Have a Special Exemption from Bayesian Rationality?" Theory and Decision 6(3): 311-332.
Sen, Amartya (1976) "Welfare Inequalities and Rawlsian Axiomatics," Theory and Decision, 7(4): 243-262 (reprinted in R. Butts and J. Hintikka eds., Foundational Problems in the Special Sciences (Boston: Reidel, 1977). (esp. section 2: Focuses on two objections to Haysanyi's derivation: the first is the application of the independence axiom to social choice (as Wei Dai has pointed out), the second is the point that I've been making about the link to utilitarianism.)
Harsanyi, John (1977) "Nonlinear Social Welfare Functions: A Rejoinder to Professor Sen," in Butts and Hintikka
Sen, Amartya (1977) "Non-linear Social Welfare Functions: A Reply to Professor Harsanyi," in Butts and Hintikka
Sen, Amartya (1979) "Utilitarianism and Welfarism" The Journal of Philosophy 76(9): 463-489 (esp. section 2)

Parts of the Hintikka and Butts volume are available in Google Books.

(I'll put these in the Harsanyi thread above as well.)

↑ comment by PhilGoetz · 2009-04-08T04:00:19.904Z · LW(p) · GW(p)

You know how the Reddit code is very clever, and you write a comment, and post it, and immediately see it on your screen?

Well, I just wrote the above comment out, and clicked "comment", and it immediately appeared on my screen. And it had a score of 0 points when it first appeard.

And that's the second time that's happened to me.

Does this happen to anybody else? Is there some rule to the karma system that can make a comment have 0 starting points?

EDIT: This comment, too. Had 0 points at the start.

comment by conchis · 2009-04-07T08:54:37.434Z · LW(p) · GW(p)

This means that your utility function U is indifferent with regard to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.

I'm afraid there's still some confusion here, because this isn't right. To take an example, suppose U = ln(u).

comment by Nick_Tarleton · 2009-04-06T18:59:59.691Z · LW(p) · GW(p)

Average utilitarianism is actually a common position.

"Utility", as Eliezer says, is the thing that is maximized, not happiness. As I pointed out before, a utility function need not be defined over persons or timeslices of persons (before aggregation or averaging); its domain could be 4D histories of the entire universe, or other large structures. In fact, since you are not indifferent between any two distributons of what you call "utility" with the same total and the same average, your actual preferences must have this form. This makes questions of "distribution of utility across people" into type errors.

comment by Nick_Tarleton · 2009-04-06T18:52:19.749Z · LW(p) · GW(p)

Average utilitarianism is actually a common position.

comment by conchis · 2009-04-06T18:51:55.276Z · LW(p) · GW(p)

Because if your choices under uncertainty do not maximize the expected value of some utility function, you are behaving inconsistently (in a particular sense, axiomatized by Savage, and others - there's a decent introduction here).

These axioms are contestable, but the reasons for contesting them have little to do with population ethics.

Also, as I've said before, the utility function that consistent agents maximize the expectation of need not be identical with an experienced utility function, though it will usually need to be a positive monotonic transform of one.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T21:44:06.836Z · LW(p) · GW(p)

Because if your choices under uncertainty do not maximize the expected value of some utility function, you are behaving inconsistently (in a particular sense, axiomatized by Savage, and others - there's a decent introduction here).

Can you name a specific chapter in the linked-to Choice under Risk and Uncertainty?

Unfortunately, the equations are not rendered correctly in my browser.

comment by loqi · 2009-04-06T18:15:19.847Z · LW(p) · GW(p)

Why do we think it's reasonable to say that we should maximize average utility across all our possible future selves

Because that's what we want, even if our future selves don't. If I know I have a 50/50 chance of becoming a werewolf (permanently, to make things simple) and eating a bunch of tasty campers on the next full moon, then I can increase loqi's expected utility by passing out silver bullets at the campsite ahead of time, at the expense of wereloqi's utility.

In other words, one can attempt to improve one's expected utility as defined by their current utility function by anticipating situations where they no longer implement said function.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2009-04-06T18:30:22.390Z · LW(p) · GW(p)

I'm not asking questions about identity. I'm pointing out that almost everyone considers equitable distributions of utility better than inequitable distributions. So why do we not consider equitable distributions of utility among our future selves to be better than inequitable distributions?

Replies from: loqi, rwallace, Nick_Tarleton

↑ comment by loqi · 2009-04-06T19:19:51.757Z · LW(p) · GW(p)

I'm pointing out that almost everyone considers equitable distributions of utility better than inequitable distributions.

If that is true, then that means their utility is a function of the distribution of others' utility, and they will maximize their expected utility by maximizing the expected equity of others' utility.

So why do we not consider equitable distributions of utility among our future selves to be better than inequitable distributions?

Is this the case? I don't know how you reached this conclusion. Even if it is the case, I also don't see how this is necessarily inconsistent unless one also makes the claim that they make no value distinction between future selves and other people.

↑ comment by rwallace · 2009-04-06T19:40:30.308Z · LW(p) · GW(p)

I don't consider equitable distributions of utility better than inequitable distributions. I consider fair distributions better then unfair ones, which is not quite the same thing.

Put it that way, the answer to the original question is simple: if my future selves are me, then I am entitled to be unfair to some of myself whenever in my sole judgment I have sufficient reason.

Replies from: PhilGoetz, dclayh

↑ comment by PhilGoetz · 2009-04-06T22:29:58.743Z · LW(p) · GW(p)

That's a different question. That's the sort of thing that a utility function incorporates; e.g., whether the system of distribution of rewards will encourage productivity.

If you say you don't consider equitable distributions of utility better than inequitable distributions, you don't get to specify which inequitable distributions can occur. You mean all inequitable distributions, including the ones in which the productive people get nothing and the parasites get everything.

↑ comment by dclayh · 2009-04-06T20:54:33.807Z · LW(p) · GW(p)

I consider fair distributions better then unfair ones

What definition of "fair" are you using such that that isn't a tautology?

Replies from: rwallace

↑ comment by rwallace · 2009-04-06T20:58:03.819Z · LW(p) · GW(p)

Example: my belief that my neighbor's money would yield more utility in my hands than his, doesn't entitle me to steal it.

↑ comment by Nick_Tarleton · 2009-04-06T18:50:00.341Z · LW(p) · GW(p)

I'm pointing out that almost everyone considers equitable distributions of utility better than inequitable distributions.

Do they? I don't see this.

comment by PhilGoetz · 2014-01-25T06:06:03.247Z · LW(p) · GW(p)

I figured out what the problem is. Axiom 4 (Independence) implies average utilitarianism is correct.

Suppose you have two apple pies, and two friends, Betty and Veronica. Let B denote the number of pies you give to Betty, and V the number you give to Veronica. Let v(n) denote the outcome that Veronica gets n apple pies, and similarly define b(n). Let u_v(S) denote Veronica's utility in situation S, and u_b(S) denote Betty's utility.

Betty likes apple pies, but Veronica loves them, so much so that u_v(v(2), b(0)) > u_b(b(1), v(1)) + u_v(b(1), v(1)). We want to know whether average utilitarianism is correct to know whether to give Veronica both pies.

Independence, the fourth axiom of the von Neumann-Morgenstern theorem, implies that if the outcome L is preferable to outcome M, then one outcome of L and one outcome of N is preferable to one outcome of M and one outcome of N.

Let L represent giving one pie to Veronica and M represent giving one pastry to Betty. Now let’s be sneaky and let N also represent giving one pastry to Veronica. The fourth axiom says that L + N—giving two pies to Veronica—is preferable to L + M—giving one to Veronica and one to Betty. We have to assume that to use the theorem.

But that’s the question we wanted to ask--whether our utility function U should prefer the solution that gives two pies to Veronica, or one to Betty and one to Veronica! Assuming the fourth axiom builds average utilitarianism into the von Neumann-Morgenstern theorem.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2014-01-27T22:10:23.404Z · LW(p) · GW(p)

Argh; never mind. This is what Wei_Dai already said below.

↑ comment by timtyler · 2012-12-16T01:51:56.335Z · LW(p) · GW(p)

I tend to think that utilitarianism is a pretty naive and outdated ethical philosophy - but I also think that total utilitarianism is a bit less silly than average utilitarianism. Having read this post, my opinion on the topic is unchanged. I don't see why I should update towards Phil's position.

comment by PhilGoetz · 2009-04-10T17:23:24.817Z · LW(p) · GW(p)

I started reading the Weymark article that conchis linked to. We have 4 possible functions:

u(world), one person's utility for a world state
s(world), social utility for a world state
U(lottery), one person's utility given a probability distribution over future world states
S(lottery), social utility given a probability distribution over future world states

I was imagining a set of dependencies like this:

Combine multiple u(world) to get s(world)
Combine multiple s(world) to get S(lottery)

Weymark describes it like this:

Combine multiple u(world) to get U(lottery)
Combine multiple U(lottery) to get S(lottery)

Does anyone have insight into whether there is any important difference between these approaches?

Replies from: conchis

↑ comment by conchis · 2009-04-13T11:32:13.583Z · LW(p) · GW(p)

FWIW, I tend to think about it the same way as you.

My sense is that the difference isn't important as long as you're willing to force all individuals to use the same subjective probabilities as are used in S.* As Weymark notes, Harsanyi's axioms can't in general be satisfied unless this constraint is imposed; which suggests that the approaches are the same in any case where the latter approach works at all. (And Aumann's result also tends to suggest that imposing the constraint would be reasonable.)

* If U_i is the expectation of u_i(x), and S is a (weighted) sum of the U_i, then S is also the expectation of s(x), where s(x) is defined as a (weighted) sum of the u_i(x), provided all the expectations are taken with respect to the same probability distribution.

comment by chaosmosis · 2012-12-16T01:10:32.803Z · LW(p) · GW(p)

This is the same ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population

This is the part that I think is wrong. You don't assess your average utility when you evaluate your utility function, you evaluate your aggregate utility. 10 utilons plus 0 utilons is equivalent to 5 utilons plus 5 utilons not because their average is the same but because their total is the same.

Replies from: thomblake

↑ comment by thomblake · 2012-12-19T16:55:37.928Z · LW(p) · GW(p)

10 utilons plus 0 utilons is equivalent to 5 utilons plus 5 utilons not because their average is the same but because their total is the same.

This is incoherent. "Average is the same" and "total is the same" are logically equivalent for cases where n is the same, which I think are all we're concerned about here.

Replies from: chaosmosis

↑ comment by chaosmosis · 2012-12-20T07:31:50.189Z · LW(p) · GW(p)

It could be either, so he's not justified in assuming that it's the average one in order to support his conclusion. He's extrapolating beyond the scope of their actual equivalence, that's the reason his argument is bringing anything new to the table at all.

He's using their mathematical overlap in certain cases as prove that in cases where they don't overlap the average should be used as superior to the total. That makes no sense at all, when thought of in this way. That is what I think the hole in his argument is.

Replies from: thomblake

↑ comment by thomblake · 2012-12-20T14:26:50.898Z · LW(p) · GW(p)

Can you give an example of a case where they don't overlap, that PhilGoetz is arguing about?

Replies from: chaosmosis

↑ comment by chaosmosis · 2012-12-20T23:26:32.758Z · LW(p) · GW(p)

Giving one future self u=10 and another u=0 is equally as good as giving one u=5 and another u=5.

So, to give a concrete example, you have $10 dollars. You can choose between gaining 5 utilons today and five tomorrow by spending half of the money today and half of the money tomorrow, or between spending all of it today and gaining 10 utilons today and 0 tomorrow. These outcomes both give you equal numbers of utilons, so they're equal.

Phil says that the moral reason they're both equal is because they both have the same amount of average utility distributed across instances of you. He then uses that as a reason that average utilitarianism is correct across different people, since there's nothing special about you.

However, an equally plausible interpretation is that the reason they are morally equal in the first instance is because the aggregate utilities are the same. Although average utilitarianism and aggregate utilitarianism overlap when N = 1, in many other cases they disagree. Average utilitarianism would rather have one extremely happy person than twenty moderately happy people, for example. This disagreement means that average and aggregate utilitarianism are not the same (as well as the fact that they have different metaethical justifications which are used as support), which means he's not justified in either his initial privileging of average utilitarianism or his extrapolation of it to large groups of people.

Average utilitarianism must be correct?

Contents

168 comments