Can you define "utility" in utilitarianism without using words for specific human emotions?

post by SurvivalBias (alex_lw) · 2022-09-21T03:29:34.261Z · LW · GW · No comments

This is a question post.


I'm trying to get a slightly better grasp of utilitarianism as it is understood in rat/EA circles, and here's my biggest confusion at the moment.

How do you actually define "utility", not in the sense of how to compute it, but in the sense of specifying wtf you are even trying to compute? People talk about "welfare", "happiness" or "satisfaction", but those are intrinsically human concepts, and most people seem to assume that non-human agents can, at least in theory, have utility. So let's taboo those words, and all other words referring to specific human emotions (you can still use the word "human" or "emotion" itself if you have to). Caveats:

  1. Your definition should exclude things like AlphaZero or a $50 robot toy following a light spot.
  2. If you use the word "sentient" or synonyms, provide at least some explanation of what you mean by it.

If the answer is different for different flavors of utilitarianism, please clarify which one(s) your definition(s) apply to.

Alternatively, if "utility" is defined in human terms by design, can you explain what the supposed process is for mapping the internal states of those non-human agents into human terms?

Answers

answer by Richard_Kennaway · 2022-09-21T17:49:23.710Z · LW(p) · GW(p)

"Utilitarianism" has two different, but related meanings. Historically, it generally means "the morally right action is the action that produces the most good", or as Bentham put it, "the greatest amount of good for the greatest number". Leave aside for the moment that this ignores the tradeoff between how much good and how many people, and exactly what the good is. Bentham and like-minded thinkers mean by "good" things like material well-being, flourishing, "happiness", and so on. They are pointing in a certain direction, even if a bit vaguely. Utilitarianism in this sense is about people, and its conception of the good consists of what humans generally want. It is necessarily expressed in terms of human concepts, because that is what it is about.

The other thing that the word "utilitarianism" has become used for is the thing that various theorems prove can be constructed from a preference relation satisfying certain axioms. Von Neumann and Morgenstern are the usual names mentioned, but there are also Savage, Cox, and others. Collectively, these are, as Eliezer has put it [LW · GW], "multiple spotlights all shining on the same core mathematical structure". The theory is independent of any specific preference relation and of what the utility function determined by those preferences comes out to be. (ETA: This use of the word might be specific to the rationalist community. "Utility theory" is I think the more widely used term. Accordingly I've replaced "VNMU" by "VNMUT" below.)

To distinguish these two concepts I shall call them "Benthamite utilitarianism" and "Von Neumann-Morgenstern utility theory", or BU and VNMUT for short. How do they relate to each other, and what does either have to say about AI?

  1. BU has a specific notion of the individual good. VNMUT does not. VNMUT is concerned only with the structure of the preference relation, not its content. In VNMUT, the preference relation is anything satisfying the axioms; in BU it is a specific thing, not up for grabs, described by words such as "welfare", "happiness", or "satisfaction".

By analogy: BU is like studying the structure of some particular group, such as the Monster Group, while VNMUT is like group theory, which studies all groups and does not care where they came from or what they are used for.

  2. VNMUT is made of theorems. BU is not. BU contains no mathematical structure to elucidate what is meant by "the greatest good for the greatest number". The slogan is a rallying call, but leaves many hard decisions to be made.

  3. Neither BU nor VNMUT has a satisfactory concept of collective good. BU is silent about the tradeoff between the greatest good and the greatest number. There is no generally agreed on extension of VNMUT to mathematically construct a collective preference relation or utility function. There have been many attempts, on both the practical side (BU) and the theoretical side (VNMUT), but the body of such work does not have the coherence of those "multiple spotlights all shining on the same core mathematical structure". The differing attitudes we observe to the Repugnant Conclusion illustrate the lack of consensus.

What do either of these have to do with AI?

If a program is trained to produce outputs that maximise some objective function, that value is at least similar to a utility in the VNMUT sense, although it is not derived from a preference relation. The utility (objective function) is primitive and a preference relation can be derived from it: the program "prefers" a higher value to a lower.
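As a minimal sketch of that last point (illustrative only; the objective and numbers are made up, not taken from the answer):

```python
# An objective function is primitive here; a "preference relation" is derived
# from it by comparing scores, i.e. the reverse of the VNM direction
# (which goes from preferences to utility).
def objective(outcome: float) -> float:
    # hypothetical objective: score outcomes by closeness to a target value
    return -abs(outcome - 42.0)

def prefers(a: float, b: float) -> bool:
    # the program "prefers" whichever outcome scores higher
    return objective(a) > objective(b)

print(prefers(40.0, 10.0))  # True: 40 is closer to the target than 10
```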

As for BU, whether a program optimises for the human good is up to what its designers choose to have it optimise. Optimise for deadly poisons and that may be what you get. (I don't know if anyone has experimented with the compounds that that experiment suggested, although it seems to me quite likely that some military lab somewhere is doing so, if they weren't already.) Optimise for peace and love, and maybe you get something like that, or maybe you end up painting smiley faces onto everything. The AI itself is not feeling or emoting. Its concepts of "welfare", "happiness", or "satisfaction", such as they are, are embodied in the training procedure its programmers used to judge its outputs as desired or undesired.

comment by M. Y. Zuo · 2022-10-03T20:46:05.241Z · LW(p) · GW(p)

How can we know conclusively that 'The AI itself is not feeling or emoting.'? 

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2022-10-03T20:58:08.182Z · LW(p) · GW(p)

"Conclusively" is doing too much work there. Do you attribute feelings or emotions to current AIs? I deny them on the same grounds as I deny them to any of the other software I use, and to rocks. I say current AIs, because that is what I had in mind, and because there would be no point in arguing "But suppose someone did make an AI with emotions! Then it would have emotions!"

Replies from: M. Y. Zuo
comment by M. Y. Zuo · 2022-10-03T22:31:37.735Z · LW(p) · GW(p)

If the objectionable word is removed:

How can we know - that 'The AI itself is not feeling or emoting.'? 

Would you have a different answer?

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2022-10-04T07:36:22.019Z · LW(p) · GW(p)

I just gave my answer. For more, there's this from my recent ding-dong with Signer [LW(p) · GW(p)]. Briefly, in the absence of any method of actually detecting and measuring consciousness (a concept in which I include feelings and emotions), a consciousnessometer, we must fall back on the experiences that give rise to the very concept, on the basis of which we attribute consciousness to people besides ourselves, and to some extent other animals. On that basis I see no reason to attribute it to any extant piece of software.

Replies from: M. Y. Zuo
comment by M. Y. Zuo · 2022-10-12T18:16:19.262Z · LW(p) · GW(p)

Briefly, in the absence of any method of actually detecting and measuring consciousness (a concept in which I include feelings and emotions), a consciousnessometer, we must fall back on the experiences that give rise to the very concept, on the basis of which we attribute consciousness to people besides ourselves, and to some extent other animals.

That seems like a less popular understanding. 

Why must consciousness include 'feelings' and 'emotions'?

If someone has the portion of their brain responsible for emotional processing damaged, do they become less conscious?

Merriam-Webster also lists that as number 2 in their dictionary, and a different definition in the number one position:

Definition of consciousness

1a: the quality or state of being aware especially of something within oneself

b: the state or fact of being conscious of an external object, state, or fact

c: AWARENESS; especially: concern for some social or political cause ("The organization aims to raise the political consciousness of teenagers.")

2: the state of being characterized by sensation, emotion, volition, and thought : MIND

3: the totality of conscious states of an individual

4: the normal state of conscious life ("regained consciousness")

5: the upper level of mental life of which the person is aware as contrasted with unconscious processes

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2022-10-13T07:21:22.610Z · LW(p) · GW(p)

Why must consciousness include 'feelings' and 'emotions'?

If they are present, they are part of consciousness. They are included in the things of which one is aware within oneself (and in item 2 of the definition you quote). I did not intend any implication that they must be present, for consciousness to be present.

answer by oumuamua · 2022-09-21T09:09:22.275Z · LW(p) · GW(p)

People talk about "welfare", "happiness" or "satisfaction", but those are intrinsically human concepts

No, they are not. Animals can feel e.g. happiness as well.

If you use the word "sentient" or synonyms, provide at least some explanation of what do you mean by it.

Something is sentient if being that thing is like something. For instance, it is a certain way to be a dog, so a dog is sentient. As a contrast, most people who aren't panpsychists do not believe that it is like anything to be a rock, so most of us wouldn't say of a rock that it is sentient.

Sentient beings have conscious states, each of which is (to a classical utilitarian) desirable to some degree (which might be negative, of course). That is what utilitarians mean by "utility": The desirability of a certain state of consciousness.

I expect that you'll be unhappy with my answer, because "desirability of a certain state of consciousness" does not come with an algorithm for computing that, and that is because we simply do not have an understanding of how consciousness can be explained in terms of computation.

Of course having such an explanation would be desirable, but its absence doesn't render utilitarianism meaningless, because humans still have an understanding of approximately what we mean by terms such as "pleasure", "suffering", "happiness", even if it is merely in an "I know it when I see it" kind of way.

comment by SurvivalBias (alex_lw) · 2022-09-22T02:03:30.587Z · LW(p) · GW(p)

No, they are not. Animals can feel e.g. happiness as well.

Yeah, but the problem here is that we perceive happiness in animals only in as much as it looks like our own happiness. Did you notice that the closer an animal is to a human, the more likely we are to agree it can feel emotions? An ape can definitely display something like human happiness, so we're pretty sure it can experience it. A dog can display something mostly like human happiness, so most likely they can feel it too. A lizard - meh, maybe but probably not. An insect, most people would say no. Maybe I'm wrong and there's an argument that animals can experience happiness which is not based on their similarity to us; in that case I'm very curious to see this argument.

Sentience

For the record, I believe we do have at least a crude mechanistic model of how consciousness works in general, and yes, that goes for the hard problem of consciousness in particular (the latter being a bit of a wrong question [LW · GW]).

Otherwise, I actually think it somewhat answers my question. One qualm of mine would be that sentience does seem to come on a spectrum - but that can in theory be addressed by some scaling factor. The bigger issue for me is that it implies that a hardcore total utilitarian would be fine with a future populated by trillions of sentient but otherwise completely alien AIs successfully achieving their alien goals (e.g. maximizing paperclips) and experiencing desirable-states-of-consciousness about it. But I think some hardcore utilitarians would bite this bullet, and that wouldn't be the biggest bullet for a utilitarian to bite either.

answer by Dagon · 2022-09-21T15:35:51.307Z · LW(p) · GW(p)

[note: anti-realist non-Utilitarian here; I don't believe "utility" is actually a universal measurable thing, nor that it's comparable across entities (nor across time for any real entity).  Consider this my attempt at an ITT on this topic for Utilitarianism]

One possible answer is that it's true that those emotions are pretty core to most people's conception of utility (at least most people I've discussed it with).  But this does NOT mean that the emotions ARE the utility, they're just an evolved mechanism which points to utility, and not necessarily the only possible mechanism.  Goodhart's Law hits pretty hard if you think of the emotions directly as utility.  

Utility itself is an abstraction over the level of satisfaction of goals/preferences about the state of the universe for an entity.  Or in some conceptions, the eu-satisfaction of the goals the entity would have if it were fully informed.

comment by SurvivalBias (alex_lw) · 2022-09-21T20:08:30.871Z · LW(p) · GW(p)

>Utility itself is an abstraction over the level of satisfaction of goals/preferences about the state of the universe for an entity.

You can say that a robot toy has a goal of following a light source. Or that a thermostat has a goal of keeping the room temperature at a certain setting. But I've yet to hear anyone count those things towards total utility calculations.

Of course a counterargument would be "but those are not actual goals, those are the goals of the humans that set it", but in that case you've just hidden all the references to humans in the word "goal" and are back to square 1.

answer by Viktor Rehnberg · 2022-09-21T08:17:51.620Z · LW(p) · GW(p)

Utility when it comes to a single entity is simply about preferences.

The entity should satisfy the following:

  1. For any two outcomes/states of the world $A$ and $B$, the entity should prefer one over the other or consider them equally preferable.
  2. The entity should be coherent in its preferences, such that if it prefers $A$ to $B$ and $B$ to $C$, then the entity prefers $A$ to $C$.
  3. When it comes to probabilities, if the entity prefers $A$ to $B$, then the entity prefers $A$ with probability $p$ to $B$ with probability $p$, all else equal. Furthermore, for any $C$ between $A$ and $B$ in the preference ordering from 2, there exists a probability $p$ such that getting $A$ with probability $p$ and $B$ with probability $1-p$ is equally preferable to getting $C$ with certainty.

This is simply Von Neumann--Morgenstern utility theory and means that for such an entity you can translate the preference ordering into a real-valued utility function over outcomes. When we only consider a single agent, this function is determined only up to scaling by a positive scalar value and shifting by a scalar value.
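For reference, the representation theorem being invoked can be stated as follows (a standard formulation, not quoted from the answer): there exists a function $u$ from outcomes to real numbers such that for all lotteries $L$ and $M$,

$L \succeq M \iff \mathbb{E}_{L}[u] \ge \mathbb{E}_{M}[u],$

and $u$ is unique up to positive affine transformation, i.e. any other such $u'$ satisfies $u' = a\,u + b$ with $a > 0$.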

Usually I'd like to add the expected utility hypothesis as well, that

$U(pA) = p\,U(A),$

where $pA$ is getting $A$ with probability $p$.

(Edit: Apparently step 3 implies the expected utility hypothesis. And cubefox [LW(p) · GW(p)] pointed out that my notation here was weird. An improved notation would be that

$U(A) = \mathbb{E}[u(X) \mid A] = \sum_{x} P(X = x \mid A)\, u(x),$

where $X$ is a random variable over the set of states. Then I'd say that the expected utility hypothesis is the step $U(A) = \mathbb{E}[u(X) \mid A]$.

end of edit.)

Now the tricky part to me is when it comes to multiple entities with utility functions. How do you combine these into a single valued function, how are they aggregated.

Here there are differences in

  1. Aggregation function. Should you sum the contributions (total utilitarianism), average them, take the minimum (for a maximin strategy), ...? (See the sketch after this list.)
  2. Weighting. For each individual utility function we have a freedom of scale and shift. If we fix utility 0 as "this entity does not exist" or "the world does not exist", then what remains is a scale for each utility function, which effectively functions as a weighting in aggregations like sum and average. Here arise questions like: how many cows living lives worth living are needed to choose that over a human having a life worth living, and how do you determine where on the scale a given life worth living sits?
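A minimal sketch of how the aggregation choices above differ (toy numbers of my own, purely illustrative):

```python
# Hypothetical individual utilities and per-entity weights (the scale/shift
# freedom mentioned above); the aggregation rule is the contested choice.
utilities = {"alice": 3.0, "bob": -1.0, "carol": 0.5}
weights = {"alice": 1.0, "bob": 1.0, "carol": 2.0}

weighted = [weights[k] * u for k, u in utilities.items()]

total = sum(weighted)             # total utilitarianism
average = total / len(weighted)   # average utilitarianism
maximin = min(weighted)           # maximin: judge a world by its worst-off member

print(total, average, maximin)    # 3.0 1.0 -1.0
```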

Another tricky part is that humans and other entities are not coherent enough to satisfy the axioms of Von Neumann--Morgenstern utility theory. What to do then? Which preferences are "rational" and which are not?

comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-21T08:24:33.873Z · LW(p) · GW(p)

You could perhaps argue that "preference" is a human concept. You could extend it with something like coherent extrapolated volition to be what the entity would prefer if it knew all that was relevant, had all the time needed to think about it and was more coherent. But, in the end if something has no preference, then it would be best to leave it out of the aggregation.

comment by SurvivalBias (alex_lw) · 2022-09-21T20:02:46.242Z · LW(p) · GW(p)

So utility theory is a useful tool, but as far as I understand it's not directly used as a source of moral guidance (although I assume once you have some other source you can use utility theory to maximize it). Whereas utilitarianism as a metaethics school is concerned exactly with that, and you can hear people in EA talking about "maximizing utility" as the end in and of itself all the time. It was in this latter sense that I was asking.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-22T09:14:51.210Z · LW(p) · GW(p)

Perhaps most people don't have this in the back of their mind when they think of utility. But for me this is what I'm thinking about. The aggregation is still confusing to me, but as a simple case example: if I want to maximise total utility and am in a situation that only impacts a single entity, then increasing utility is the same to me as getting this entity into states that are more preferable for them.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-22T09:27:03.434Z · LW(p) · GW(p)

Having read some of your other comments, I expect you to ask if the top preference of a thermostat is its goal temperature. And to this I have no good answer.

For things like a thermostat and a toy robot you can obviously see that there is a behavioral objective which we could use to infer preferences. But is the reason that thermostats are not included in utility calculations that the behavioral objective does not actually map to a preference ordering, or that their weight when aggregated is 0?

comment by cubefox · 2022-09-22T02:57:46.611Z · LW(p) · GW(p)

Could you explain the "expected utility hypothesis"? Where does this formula come from? Very intriguing!

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-22T09:11:08.526Z · LW(p) · GW(p)

The expected utility hypothesis is that $U(pA) = p\,U(A)$. To make it more concrete, suppose that outcome $A$ is worth $u$ utils for you. Then getting $A$ with probability $p$ is worth $p \cdot u$ utils. This is not necessarily true; there could be an entity that prefers outcomes comparatively more if they are probable/improbable. The name comes from the fact that if you assume it to be true you can simply take expectations of utils and be fine. I find it very agreeable for me.

Replies from: cubefox
comment by cubefox · 2022-09-22T09:50:20.350Z · LW(p) · GW(p)

I'm probably missing something here, but how is $U(pA)$ a defined expression? I thought $U$ takes as inputs events or outcomes or something like that, not a real number like something which could be multiplied with $p$? It seems you treat $A$ not as an event but as some kind of number? (I get $p\,U(A)$ of course, since $U(A)$ returns a real number.)

The thing I would have associated with "expected utility hypothesis": If $A$ and $B$ are mutually exclusive, then

$P(A \lor B)\,U(A \lor B) = P(A)\,U(A) + P(B)\,U(B).$

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-23T09:37:13.862Z · LW(p) · GW(p)

Hmm, I usually don't think too deeply about the theory, so I had to refresh some things to answer this.

First off, the expected utility hypothesis is apparently implied by the VNM axioms. So that is not something that needs to be added on. To be honest, I usually only think of a coherent preference ordering and expected utilities as two separate things and hadn't realized that VNM combines them.

About notation: with $U(A)$ I mean the utility of getting $A$ with certainty, and with $U(pA)$ I mean the utility of getting $A$ with probability $p$. If you don't have the expected utility hypothesis I don't think you can separate an event from its probability. I tried to look around for the usual notation but didn't find anything great.

Wikipedia used something like

$U(A) = \mathbb{E}[u(X) \mid A] = \sum_{x} P(X = x \mid A)\, u(x),$

where $X$ is a random variable over the set of states. Then I'd say that the expected utility hypothesis is the step $U(A) = \mathbb{E}[u(X) \mid A]$.

Replies from: cubefox
comment by cubefox · 2022-09-23T11:11:54.654Z · LW(p) · GW(p)

Ah, thanks. I still find this strange, since in your case $A$ and $B$ are events, which can be assigned specific probabilities and utilities, while $X$ is apparently a random variable. A random variable is, as far as I understand, basically a set of mutually exclusive and exhaustive events. E.g. $X$ = the weather tomorrow = {good, neutral, bad}. Each of those events can be assigned a probability (and they must sum to 1, since they are mutually exclusive and exhaustive) and a utility. So it seems it doesn't make sense to assign $X$ itself a utility (or a probability). But I might be just confused here...

Edit: It would make more sense, and in fact agree with the formula I posted in my last comment, if a random variable corresponded to the event that is the disjunction of its possible values. E.g. $X$ = the weather will be good or neutral or bad. In which case the probability of a random variable will always be 1, such that the expected utility of the disjunction is just its utility, and my formula above is identical to yours.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-26T07:23:33.605Z · LW(p) · GW(p)

What I found confusing with $U(A \lor \lnot A)$ was that to me this reads as $\frac{P(A)\,U(A) + P(\lnot A)\,U(\lnot A)}{P(A) + P(\lnot A)}$, which should always(?) depend on $P(A)$, but with this notation it is hidden to me. (Here I picked $B$ as the mutually exclusive event $\lnot A$, but I don't think it should remove much from the point.)

That is also why I want some way of expressing that in the notation. I could imagine writing $U(A \lor B)$ as $U(P(A)A \lor P(B)B)$, as that is the cleanest way I can come up with to satisfy both of us. Then with expected utility $U(P(A)A \lor P(B)B) = P(A)\,U(A) + P(B)\,U(B)$.

When we accept the expected utility hypothesis then we can always write it as an expectation/sum of its parts and then there is no confusion either.

Replies from: cubefox
comment by cubefox · 2022-09-26T10:52:55.648Z · LW(p) · GW(p)

Well, the "expected value" of something is just the value multiplied by its probability. It follows that, if the thing in question has probability 1, its value is equal to the expected value. Since is a tautology, it is clear that .

Yes, this fact is independent of , but this shouldn't be surprising I think. After all, we are talking about the utility of a tautology here, not about the utility of itself! In general, is usually not 1 ( and are only presumed to be mutually exclusive, not necessarily exhaustive), so its utility and expected utility can diverge.

In fact, in his book "The Logic of Decision" Richard Jeffrey proposed for his utility theory that the utility of any tautology is zero: This should make sense, since learning a tautology has no value for us, neither positive not negative. This assumption also has other interesting consequences. Consider his "desirability axiom", which he adds to the usual axioms of probability to obtain his utility theory:

If and are mutually exclusive, then (Alternatively, this axiom is provable from the expected utility hypothesis I posted a few days ago, by dividing both sides of the equation by .)

If we combine this axiom with the assumption (tautologies have utility zero), it is provable that if then . Jeffrey explains this as follows: Interpreting utility subjectively as degree of desire, we can only desire things we don't have, or more precisely, things we are not certain are true. If something is certain, the desire for it is already satisfied, for better or for worse. Another way to look at it is that the "news value" of a certain proposition is zero. If the utility of a proposition is how good or bad it would be if we learned that it is true, then learning a certain proposition doesn't have any value, positive or negative, since we knew it all along. So it should be assigned the value 0.

Another provable consequence is this: If (with not necessarily being certain), then . In other words, if we don't care whether is true or not, if we are indifferent between and , then the utility of is zero. This seems highly plausible.

Yet another provable consequence is that we actually obtain a negation rule for utilities: In other words, the utility of the negation of is the utility of times its negative odds.
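These consequences are easy to sanity-check numerically. A rough sketch (my own, under the toy assumption that a proposition is a set of mutually exclusive "worlds" and that $U$ is the probability-weighted average value of its worlds, shifted so that $U(\top) = 0$):

```python
# Toy model: worlds with (probability, raw value); propositions are sets of worlds.
worlds = {"w1": (0.2, 5.0), "w2": (0.5, 1.0), "w3": (0.3, -2.0)}
baseline = sum(p * v for p, v in worlds.values())  # raw value of the tautology

def P(prop):
    return sum(worlds[w][0] for w in prop)

def U(prop):
    # probability-weighted average value, shifted so that U(top) = 0
    return sum(worlds[w][0] * (worlds[w][1] - baseline) for w in prop) / P(prop)

top = set(worlds)
A = {"w1"}
B = {"w2"}          # mutually exclusive with A
not_A = top - A

print(abs(U(top)) < 1e-12)                                              # U(tautology) = 0
print(abs(U(A | B) - (P(A)*U(A) + P(B)*U(B)) / (P(A) + P(B))) < 1e-12)  # desirability axiom
print(abs(U(not_A) + U(A) * P(A) / P(not_A)) < 1e-12)                   # negation rule
```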

I also wondered whether it is then possible to also derive other rules for utility theory, such as for $U(A \lor B)$ where $A$ and $B$ are not presumed to be mutually exclusive, or for $U(A \land B)$. It would also be helpful to have a definition of conditional utility $U(A \mid B)$, i.e. the utility of $A$ under the assumption that $B$ is satisfied (certain). Presumably we would then have facts like $U(A \land B) = U(B) + U(A \mid B)$.

Regarding the problem with the random variable $X$: Since I believe probabilities of the values of a random variable sum to 1, I think we would have to assign all random variables probability 1 if we interpret the probability of a random variable as the probability of the disjunction of its values, and consequently utility zero if we accept that tautologies have utility zero.

But I'm not very familiar with random variables, and I'm not sure we even need them in subjective utility theory, a theory of instrumental rationality where we deal with propositions ("events") which can be believed and desired (assigned a probability and a utility). A random variable does not straightforwardly correspond to a proposition, except the binary random variable which has the two values "true" and "false".

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-26T14:38:53.616Z · LW(p) · GW(p)

Ok, so this is a lot to take in, but I'll give you my first takes as a start.

My only disagreement prior to your previous comment seems to be in the legibility of the desirability axiom, which I think should contain some reference to the actual probabilities of $A$ and $B$.

Now, I gather that this disagreement probably originates from the fact that I defined $U(A)$ as the utility of getting $A$ with certainty, while in your framework $U(A)$ already depends on the current probability of $A$.

Something that appears problematic to me is if we consider the tautology (in Jeffrey notation) $U(\text{Doom} \lor \lnot\text{Doom}) = 0$. This would mean that reducing the risk of Doom has $0$ net utility. In particular, certain Doom and certain $\lnot$Doom are equally preferable ($U(\text{Doom}) = U(\lnot\text{Doom}) = 0$ once certain). Which I don't think either of us agree with. Perhaps I've missed something.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-26T14:49:48.776Z · LW(p) · GW(p)

Oh, I think I see what confuses me. In the subjective utility framework the expected utilities are shifted to $0$ after each Bayesian update?

So then the utility of doing action $a$ to prevent a Doom is $U(a) > 0$ beforehand. But when action $a$ has been done, then the utility scale is shifted again.

Replies from: cubefox
comment by cubefox · 2022-09-26T23:36:49.517Z · LW(p) · GW(p)

I'm not perfectly sure what the connection with Bayesian updates is here. In general it is provable from the desirability axiom that $U(A) = P(B \mid A)\,U(A \land B) + P(\lnot B \mid A)\,U(A \land \lnot B)$. This is because any proposition $X$ (e.g. $A$) is logically equivalent to $(X \land Y) \lor (X \land \lnot Y)$ for any $Y$ (e.g. $B$), which also leads to the "law of total probability". Then we have a disjunction which we can use with the desirability axiom. The denominator cancels out and gives us $P(B \mid A)$ in the numerator instead of $P(A \land B)$, which is very convenient because we presumably don't know the prior probability of an action $A$. After all, we want to figure out whether we should do $A$ (= make $P(A) = 1$) by calculating $U(A)$ first. It is also interesting to note that a utility maximizer (an instrumentally rational agent) indeed chooses the actions with the highest utility, not the actions with the highest expected utility, as is sometimes claimed.
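Spelled out, that step looks roughly like this (using the desirability axiom as stated above):

$U(A) = U\big((A \land B) \lor (A \land \lnot B)\big) = \frac{P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B)}{P(A \land B) + P(A \land \lnot B)} = P(B \mid A)\,U(A \land B) + P(\lnot B \mid A)\,U(A \land \lnot B),$

since the denominator is just $P(A)$.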

Yes, after you do an action you become certain you have done it; its probability becomes 1 and its utility 0. But I don't see that as counterintuitive, since "doing it again", or "continuing to do it", would be a different action which does not have utility 0. Is that what you meant?

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-27T06:30:17.570Z · LW(p) · GW(p)

Well, deciding to do action $a$ would also make its utility 0 (edit: or close enough, considering remaining uncertainties) even before it is done. At least if you're committed to the action; and then you could just as well consider the decision to be the same as the action.

It would mean that a "perfect" utility maximizer always does the action with utility $0$ (edit: but the decision can have positive utility(?)). Which isn't a problem in any way, except that it is alien to how I usually think about utility.

Put another way: while I'm thinking about which possible action I should take, the utilities fluctuate until I've decided on an action, and then that action has utility $0$. I can see the appeal of just considering changes to the status quo, but the part where everything jumps around makes it an extra thing for me to keep track of.

Replies from: cubefox
comment by cubefox · 2022-09-27T10:34:48.961Z · LW(p) · GW(p)

The way I think about it: The utility maximizer looks for the available action with the highest utility and only then decides to do that action. A decision is the event of setting the probability of the action to 1, and, because of that, its utility to 0. It's not that an agent decides for an action (sets it to probability 1) because it has utility 0. That would be backwards.

There seems to be some temporal dimension involved, some "updating" of utilities. Similar to how assuming the principle of conditionalization formalizes classical Bayesian updating when something is observed: it sets $P(H)$ to a new value, and (or because?) it sets $P(E)$ to 1.

A rule for utility updating over time, on the other hand, would need to update both probabilities and utilities, and I'm not sure how it would have to be formalized.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-09-27T15:27:24.549Z · LW(p) · GW(p)

Ah, those timestep subscripts are just what I was missing. I hadn't realised how much I needed that grounding until I noticed how good it felt when I saw them.

So to summarise (below all sets have mutually exclusive members). In Jeffrey-ish notation we have the axiom

$U_t\!\left(\bigvee_i A_i\right) = \frac{\sum_i P_t(A_i)\, U_t(A_i)}{\sum_i P_t(A_i)},$

and normally you would want to indicate what distribution you have over the $A_i$ in the left-hand side. However, we always renormalize such that the distribution is our current prior. We can indicate this by labeling the utilities with the timestep they come from (the agent should probably be included as well, but let's skip this for now).

That way we don't have to worry about $P_t$ being shifted during the sum in the right-hand side or something. (I mean notationally that would just be absurd, but if I were to sit down and estimate the consequences of possible actions I wouldn't be able to not let this shift my expectation for what action I should take before I was done.)

We can also bring up the utility of an action to be

$U_t(a) = \sum_i P_t(O_i)\, U_t(O_i \land a).$

Furthermore, for most actions it is quite clear that we can drop the subscript, as we know that we are considering the same timestep consistently for the same calculation:

$U(a) = \sum_i P(O_i)\, U(O_i \land a).$

Now I'm fine with this because I will have those subscript $t$s in the back of my mind.


I still haven't commented on $U(A \lor B)$ in general or $U(A \mid B)$. My intuition is that they should be able to be described from $P$, $U(A)$ and $U(B)$, but it isn't immediately obvious to me how to do that while keeping $U(\top) = 0$.

I tried considering a toy case where $A$ and $B$ are mutually exclusive ($P(A \land B) = 0$) and then

$U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B)}{P(A) + P(B)},$

but I couldn't see how it would be possible without assuming some things about how $A$, $B$ and $A \lor B$ relate to each other, which I can't in general.

Replies from: cubefox, cubefox
comment by cubefox · 2022-09-27T17:01:21.941Z · LW(p) · GW(p)

Interesting! I have a few remarks, but my reply will have to wait a few days as I have to finish something.

comment by cubefox · 2022-10-03T02:41:51.724Z · LW(p) · GW(p)

Regarding the time stamp: Yeah, this is the right way to think about it, at least in the case of subjective utility theory, where utilities represent desires and probabilities represent beliefs, and it is also the right way to think about it for Bayesianism (subjective probability theory). $P$ and $U$ only represent the subjective state of an agent at a particular point in time. They don't say anything about how they should be changed over time. They only say that at any point in time, these functions (the agents) should satisfy the axioms.

Rules for change over time would need separate assumptions. In Bayesian probability theory this is usually the rule of classical conditionalization or the more general rule of Jeffrey conditionalization. (Bayes' theorem alone doesn't say anything about updating. Bayes' rule = classical conditionalization + Bayes' theorem)
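For reference, the standard statements of those two update rules (my summary, not quoted from the comment): upon learning $E$ with certainty, classical conditionalization sets

$P_{\text{new}}(H) = P_{\text{old}}(H \mid E),$

while Jeffrey conditionalization covers the case where the update only shifts $P(E)$ to some new value $q$:

$P_{\text{new}}(H) = q\,P_{\text{old}}(H \mid E) + (1 - q)\,P_{\text{old}}(H \mid \lnot E).$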

Regarding the utility of an action $a$: you write the probability part in the sum as $P_t(O_i)$. But it is actually just $P(O_i \mid a)$!

To see this, start with the desirability axiom: $U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B)}{P(A) + P(B)}$ for mutually exclusive $A$ and $B$. This doesn't tell us how to calculate $U(A)$, only $U(A \lor B)$. But we can write $A$ as the logically equivalent $(A \land B) \lor (A \land \lnot B)$. This is a disjunction, so we can apply the desirability axiom: $U(A) = \frac{P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B)}{P(A \land B) + P(A \land \lnot B)}$. This is equal to $\frac{P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B)}{P(A)}$. Since $P(A \land B)/P(A) = P(B \mid A)$, we have $U(A) = P(B \mid A)\,U(A \land B) + P(\lnot B \mid A)\,U(A \land \lnot B)$. Since $B$ was chosen arbitrarily, it can be any proposition whatsoever. And since in Jeffrey's framework we only consider propositions, all actions are also described by propositions. Presumably of the form "I now do x". Hence, $U(a) = P(B \mid a)\,U(a \land B) + P(\lnot B \mid a)\,U(a \land \lnot B)$ for any action $a$.

This proof could also be extended to longer disjunctions between mutually exclusive propositions apart from $A \land B$ and $A \land \lnot B$. Hence, for a set of mutually exclusive propositions $\{X_i\}$, $U(A) = \sum_i P(X_i \mid A)\, U(A \land X_i)$. The set $W$, the "set of all outcomes", is a special case of $\{X_i\}$ where the probabilities of the mutually exclusive elements of $W$ sum to 1. One interpretation is to regard each $w \in W$ as describing one complete possible world. So, $U(a) = \sum_{w} P(w \mid a)\, U(a \land w)$. But of course this holds for any proposition, not just an action $a$. This is the elegant thing about Jeffrey's decision theory which makes it so general: He doesn't need special types of objects (acts, states of the world, outcomes etc) and definitions associated with those.

Regarding the general formula for $U(A \lor B)$. Your suggestion makes sense; I also think it should be expressible in terms of $P(A)$, $P(B)$, $U(A)$ and $U(B)$. I think I've got a proof.

Consider $\top \equiv (A \land B) \lor (A \land \lnot B) \lor (\lnot A \land B) \lor (\lnot A \land \lnot B)$. The disjunctions are exclusive. By the expected utility hypothesis (which should be provable from the desirability axiom) and by the assumption $U(\top) = 0$, we have
$P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B) + P(\lnot A \land B)\,U(\lnot A \land B) + P(\lnot A \land \lnot B)\,U(\lnot A \land \lnot B) = 0.$
Then subtract the last term:
$P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B) + P(\lnot A \land B)\,U(\lnot A \land B) = -P(\lnot A \land \lnot B)\,U(\lnot A \land \lnot B).$
Now since $U(X \lor \lnot X) = 0$ for any $X$, we have $-P(\lnot X)\,U(\lnot X) = P(X)\,U(X)$. By De Morgan, $\lnot A \land \lnot B \equiv \lnot(A \lor B)$. Therefore
$P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B) + P(\lnot A \land B)\,U(\lnot A \land B) = P(A \lor B)\,U(A \lor B).$
Now add $P(A \land B)\,U(A \land B)$ to both sides:
$2P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B) + P(\lnot A \land B)\,U(\lnot A \land B) = P(A \lor B)\,U(A \lor B) + P(A \land B)\,U(A \land B).$
Notice that $P(A)\,U(A) = P(A \land B)\,U(A \land B) + P(A \land \lnot B)\,U(A \land \lnot B)$ and $P(B)\,U(B) = P(A \land B)\,U(A \land B) + P(\lnot A \land B)\,U(\lnot A \land B)$. Therefore we can write
$P(A)\,U(A) + P(B)\,U(B) = P(A \lor B)\,U(A \lor B) + P(A \land B)\,U(A \land B).$
Now subtract $P(A \land B)\,U(A \land B)$ and we have
$P(A)\,U(A) + P(B)\,U(B) - P(A \land B)\,U(A \land B) = P(A \lor B)\,U(A \lor B),$
which is equal to $\big(P(A) + P(B) - P(A \land B)\big)\,U(A \lor B)$. So we have
$P(A)\,U(A) + P(B)\,U(B) - P(A \land B)\,U(A \land B) = \big(P(A) + P(B) - P(A \land B)\big)\,U(A \lor B),$
and hence our theorem
$U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B) - P(A \land B)\,U(A \land B)}{P(A) + P(B) - P(A \land B)},$
which we can also write as
$U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B) - P(A \land B)\,U(A \land B)}{P(A \lor B)}.$
Success!

Okay, now with $U(A \lor B)$ solved, what about the definition of $U(A \mid B)$? I think I got it: $U(A \mid B) = U(A \land B) - U(B)$. This correctly predicts that $U(A \mid \top) = U(A)$. And it immediately leads to the plausible consequence $U(A \land B) = U(B) + U(A \mid B)$. I don't know how to further check whether this is the right definition, but I'm pretty sure it is.
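A quick numerical check of these two results (my own, reusing the toy world-model from the earlier sketch, where a proposition is a set of worlds and $U$ is the probability-weighted average value shifted so that $U(\top) = 0$):

```python
worlds = {"w1": (0.1, 4.0), "w2": (0.4, 1.0), "w3": (0.3, -3.0), "w4": (0.2, 0.5)}
baseline = sum(p * v for p, v in worlds.values())

def P(prop):
    return sum(worlds[w][0] for w in prop)

def U(prop):
    return sum(worlds[w][0] * (worlds[w][1] - baseline) for w in prop) / P(prop)

top = set(worlds)
A, B = {"w1", "w2"}, {"w2", "w3"}   # overlapping propositions

lhs = U(A | B)
rhs = (P(A)*U(A) + P(B)*U(B) - P(A & B)*U(A & B)) / P(A | B)
print(abs(lhs - rhs) < 1e-12)       # general disjunction formula

def U_cond(x, given):
    # candidate definition U(X|Y) = U(X and Y) - U(Y)
    return U(x & given) - U(given)

print(abs(U_cond(A, A)) < 1e-12)            # U(A|A) = 0
print(abs(U_cond(A, top) - U(A)) < 1e-12)   # U(A|top) = U(A)
```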

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-03T09:12:40.727Z · LW(p) · GW(p)

Some first reflections on the results before I go into examining all the steps.

Hmm, yes my expression seems wrong when I look at it a second time. I think I still confused the timesteps and should have written

$U_t(a) = \sum_i P_t(O_i \mid a)\, U_t(O_i \land a) - \sum_i P_t(O_i \mid \lnot a)\, U_t(O_i \land \lnot a).$

The extra negation comes from a reflex from when not using Jeffrey's decision theory. With Jeffrey's decision theory it reduces to your expression, as the negated terms sum to $0$. But, still, I probably should learn not to guess at theorems and properly do all the steps in the future. I suppose that is a point in favor of Jeffrey's decision theory, that the expressions usually are cleaner.

As for your derivation, you used that $B \lor \lnot B = \top$ in the derivation, but that is not the case for a general set of mutually exclusive propositions. This is a note to self to check whether this still holds for general $\{X_i\}$.


Edit: My writing is confused here, disregard it. My conclusion is still

$U_t(a) = \sum_i P_t(O_i \mid a)\, U_t(O_i \land a).$

Your expression for $U(a)$ is nice,

$U(a) = \sum_i P(O_i \mid a)\, U(O_i \land a),$

and what I would have expected. The problem I had was that I didn't realize that $\sum_i P_t(O_i)\, U_t(O_i) = 0$ (which should have been obvious). Furthermore, your expression checks out with my toy example (if I remove the false expectation I had before).

Consider a lottery where you guess the sequence of 3 numbers, and $A_1$, $A_2$ and $A_3$ are the corresponding propositions that you guessed numbers $1$, $2$ and $3$ correctly. You only have preferences over whether you win or not, i.e. over $A_1 \land A_2 \land A_3$.

Replies from: cubefox
comment by cubefox · 2022-10-03T19:46:42.052Z · LW(p) · GW(p)

I don't understand what you mean in the beginning here; how is $\sum_i P_t(O_i \mid \lnot a)\, U_t(O_i \land \lnot a)$ the same as $0$?

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-04T10:31:37.296Z · LW(p) · GW(p)

$\sum_i P_t(O_i)\,U_t(O_i) = U_t(\top) = 0$: that was one of the premises, no? You expect utility $0$ from your prior.

Replies from: cubefox
comment by cubefox · 2022-10-04T12:18:52.963Z · LW(p) · GW(p)

Oh yes, of course! (I probably thought this was supposed to be valid for our $X_i$ as well, which are assumed to be mutually exclusive but, unlike the $O_i$, not exhaustive.)

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-04T15:16:23.592Z · LW(p) · GW(p)

General $\{X_i\}$ (even if mutually exclusive) is tricky; I'm not sure the expression is as nice then.

Replies from: cubefox
comment by cubefox · 2022-10-04T16:48:46.199Z · LW(p) · GW(p)

But we have my result above, i.e.

This proof could also be extended to longer disjunctions between mutually exclusive propositions apart from $A \land B$ and $A \land \lnot B$. Hence, for a set of mutually exclusive propositions $\{X_i\}$, $U(A) = \sum_i P(X_i \mid A)\, U(A \land X_i)$,

which does not rely on the assumption of $\sum_i P(X_i)$ being equal to $1$. After all, I only used the desirability axiom for the derivation, not the assumption $U(\top) = 0$. So we get a "nice" expression anyway as long as our disjunction is mutually exclusive. Right? (Maybe I misunderstood your point.)

Regarding $U(A \mid B)$, I am now no longer sure that $U(A \mid B) = U(A \land B) - U(B)$ is the right definition. Maybe we instead have to define it via the utility function that results from updating on $B$. In which case it would follow that $U(A \mid B)$ can come apart from $U(A \land B) - U(B)$. They are both compatible with $U(A \mid A) = 0$, and I'm not sure which further plausible conditions would have to be met and which could decide which is the right definition.

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-05T07:18:16.584Z · LW(p) · GW(p)

Didn't you use that $\sum_i P(X_i) = 1$? I can see how to extend the derivation for more steps, but only if $P(X_1 \lor \dots \lor X_n) = 1$. The sums

$\sum_i P(X_i \mid A)\, U(A \land X_i)$

and

$\frac{\sum_i P(A \land X_i)\, U(A \land X_i)}{\sum_i P(A \land X_i)}$

for arbitrary $\{X_i\}$ are equal if and only if $\sum_i P(X_i \mid A) = 1$.

The other alternative I see is if (and I'm unsure about this) we assume something extra about how $P$ and $U$ treat the part of $A$ that the $X_i$ don't cover.


What I would think that $U(A \mid B)$ would mean is the utility of $A$ after we've updated probabilities and utilities from the fact that $B$ is certain. I think that would be the first one, but I'm not sure. I can't tell which one that would be.

Replies from: cubefox
comment by cubefox · 2022-10-05T21:04:23.140Z · LW(p) · GW(p)

Yeah, you are right. I used the fact that $A \equiv (A \land B) \lor (A \land \lnot B)$. This makes use of the fact that $B$ and $\lnot B$ are both mutually exclusive and exhaustive, i.e. $B \land \lnot B = \bot$ and $B \lor \lnot B = \top$. For $X_1 \lor X_2$, where $X_1$ and $X_2$ are mutually exclusive but not exhaustive, $\lnot X_1$ is not equivalent to $X_2$, since $\lnot X_1$ can be true without either of $X_1$ or $X_2$ being true.

It should however work if $P(X_1 \lor \dots \lor X_n) = 1$, since then $A \equiv (A \land X_1) \lor \dots \lor (A \land X_n)$ (up to probability zero). So for $U(A) = \sum_i P(X_i \mid A)\, U(A \land X_i)$ to hold, $\{X_i\}$ would have to be a "partition" of $\top$, exhaustively enumerating all the incompatible ways it can be true.


Regarding conditional utility, I agree. This would mean that $U(A \mid B)$ is just $U(A)$ if $B$ is already certain. I found an old paper by someone who analyzes conditional utility in detail, though with zero citations according to Google Scholar. Unfortunately the paper is hard to read because of eccentric notation, and since the author, an economist, was apparently only aware of Savage's more complicated utility theory (which has acts, states of the world, and prospects), he doesn't work in Jeffrey's simpler and more general theory. But his conclusions seem intriguing, since he e.g. also says that $U(A \mid A) = 0$, despite, as far as I know, Savage not having an axiom which demands utility 0 for certainty. Unfortunately I really don't understand his notation and I'm not quite an expert on Savage either...

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-07T11:09:33.251Z · LW(p) · GW(p)

I agree with $P(X_1 \lor \dots \lor X_n) = 1$ as a sufficient criterion to only sum over the $X_i$; the other steps I'll have to think about before I get them.


I found this newer paper https://personal.lse.ac.uk/bradleyr/pdf/Unification.pdf and, having skimmed it, it seemed like it had similar premises, but they defined $U(A \mid B)$ (instead of deriving it).

Replies from: cubefox
comment by cubefox · 2022-10-10T15:05:27.824Z · LW(p) · GW(p)

Thanks for the Bradley reference. He does indeed work in Jeffrey's framework. On conditional utility ("conditional desirability", in Jeffrey terminology) Bradley references another paper from 1999 where he goes into a bit more detail on the motivation:

To arrive at our candidate expression for conditional desirabilities in terms of unconditional ones, we reason as follows. Getting the news that XY is true is just the same as getting both the news that X is true and the news that Y is true. But DesXY is not necessarily equal to DesX + DesY because of the way in which the desirabilities of X and Y might depend on one another. Unless X and Y are probabilistically independent, for instance, the news that X is true will affect the probability and, hence, the desirability of Y. Or it might affect the desirability of Y directly, because it is the sort of condition that makes Y less or more desirable. It is natural then to think of DesXY as equal, not to the sum of the desirabilities of X and Y, but to the sum of the desirability of X and the desirability of Y given that X is true.

(With DesXY he means $U(X \land Y)$.)

I also found a more recent (2017) book from him, where he defines $U(Y \mid X) = U(X \land Y) - U(X)$ and where he uses the probability axioms, Jeffrey's desirability axiom, and $U(\top) = 0$ as axioms. So pretty much the same way we did here.

So yeah, I think that settles conditional utility.

In the book Bradley has also some other interesting discussions, such as this one:

[...] Richard Jeffrey is often said to have defended a specific one, namely the ‘news value’ conception of benefit. It is true that news value is a type of value that unambiguously satisfies the desirability axioms. Consider getting the news that a trip to the beach is planned and suppose that one enjoys the beach in sunny weather but hates it in the rain. Then, whether this is good news or not will depend on how likely it is that it is going to be sunny or rainy. If you like, what the news means for you, what its implications are, depends on your beliefs. If it’s going to rain, then the news means a day of being wet and cold; if it’s going to be sunny, then the news means an enjoyable day swimming. In the absence of certainty about the weather, one’s attitude to the prospect will lie somewhere between one’s attitude to these two prospects, but closer to the one that is more probable. This explains why news value should respect the axiom of desirability. It also gives a rationale for the axiom of normality, for news that is certain is no news at all and hence cannot be good or bad.

Nonetheless, considerable caution should be exercised in giving Desirabilism this interpretation. In particular, it should not be inferred that Jeffrey’s claim is that we value something because of its news value. News value tracks desirability but does not constitute it. Moreover, it does not always track it accurately. Sometimes getting the news that X tells us more than just that X is the case because of the conditions under which we get the news. To give an extreme example: if I believe that I am isolated, then I cannot receive any news without learning that this is not the case. This ‘extra’ content is no part of the desirability of X.

Our main interest is in desirability as a certain kind of grounds for acting in conditions of uncertainty. In this respect, it is perhaps more helpful to fix one's intuitions using the concept of willingness to pay than that of news value. For if one imagines that all action is a matter of paying to have prospects made true, then the desirabilities of these prospects will measure (when appropriately scaled) the price that one is willing to pay for them. It is clear that one should not be willing to pay anything to make a tautology true and quite plausible that one should price the prospect of either X or Y by the sum of the probability-discounted prices of each. So this interpretation is both formally adequate and exhibits the required relationship between desirability and action.

Anyway, someone should do a writeup of our findings, right? :)

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-11T16:59:55.373Z · LW(p) · GW(p)

Anyway, someone should do a writeup of our findings, right? :)

Sure, I've found it to be an interesting framework to think in so I suppose someone else might too. You're the one who's done the heavy lifting so far so I'll let you have an executive role.

If you want me to write up a first draft I can probably do it end of next week. I'm a bit busy for at least the next few days.

Replies from: cubefox
comment by cubefox · 2022-10-12T11:50:59.901Z · LW(p) · GW(p)

I think I will write a somewhat longer post as a full introduction to Jeffrey-style utility theory. But I'm still not quite sure about some things. For example, Bradley suggests that we can also interpret the utility of some proposition as the maximum amount of money we would pay (to God, say) to make it true. But I'm not sure whether that money would rather track expected utility (probability times utility), or not. Generally the interpretation of expected utility versus the interpretation of utility is not quite clear to me yet. Have to think a bit more about it...

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-12T15:59:08.324Z · LW(p) · GW(p)

Isn't that just a question of whether you assume expected utility or not? In the general case it is only utility, not expected utility, that matters.

Replies from: cubefox
comment by cubefox · 2022-10-12T20:37:52.133Z · LW(p) · GW(p)

I'm not sure this is what you mean, but yes, in case of acts, it is indeed so that only the utility of an action matters for our choice, not the expected utility, since we don't care about probabilities of, or assign probabilities to, possible actions when we choose among them, we just pick the action with the highest utility.

But only some propositions describe acts. I can't choose (make true/certain) that the sun shines tomorrow, so the probability of the sun shining tomorrow matters, not just its utility. Now if the utility of the sun shining tomorrow is the maximum amount of money I would pay for the sun shining tomorrow, is that plausible? Assuming the utility of sunshine tomorrow is a fixed value $u$, wouldn't I pay less money if sunshine is very likely anyway, and more if sunshine is unlikely?

On the other hand, I believe (but am uncertain) that the utility of a proposition being true moves towards 0 as its probability rises. (Which would correctly predict that I pay less for sunshine when it is likely anyway.) But I notice I don't have a real understanding of why or in which sense this happens! Specifically, we know that tautologies have utility 0, but I don't even see how to prove that it follows that all propositions with probability 1 (even non-tautologies) have utility 0. Jeffrey says it as if it's obvious, but he doesn't actually give a proof. And then, more generally, it also isn't clear to me why the utility of a proposition would move towards 0 as its probability moves towards 1, if that's the case.

I notice I'm still far from having a good level of understanding of (Jeffrey's) utility theory...

Replies from: viktor.rehnberg
comment by Viktor Rehnberg (viktor.rehnberg) · 2022-10-13T06:48:23.908Z · LW(p) · GW(p)

So we have that

[...] Richard Jeffrey is often said to have defended a specific one, namely the ‘news value’ conception of benefit. It is true that news value is a type of value that unambiguously satisfies the desirability axioms.

but at the same time

News value tracks desirability but does not constitute it. Moreover, it does not always track it accurately. Sometimes getting the news that X tells us more than just that X is the case because of the conditions under which we get the news.

And I can see how, starting from this, you would get that $U(\top) = 0$. However, I think one of the remaining confusions is how you would go in the other direction. How can you go from the premise that we shift utilities to be $0$ for tautologies to saying that we value something in large part based on how unlikely it is?

And then we also have the desirability axiom

$U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B)}{P(A) + P(B)}$

for all $A$ and $B$ such that $A \land B = \bot$, together with Bayesian probability theory.

What I was talking about in my previous comment goes against the desirability axiom, in the sense that I meant that in the more general case there could be subjects that prefer certain outcomes proportionally more (or less) than usual, such that $U(pA) \neq p\,U(A)$ for some probabilities $p$. As the equality derives directly from the desirability axiom, it was wrong of me to generalise that far.

But, to get back to the confusion at hand, we need to unpack the tautology axiom a bit. If we say that a proposition $T$ is a tautology if and only if $P(T) = 1$[1], then we can see that any proposition that is no news to us has zero utils as well.

And I think it might be well to keep in mind that learning that e.g. sun tomorrow is more probable than we once thought does not necessarily make us prefer sun tomorrow less, but the amount of utils for sun tomorrow has decreased (in an absolute sense). This comes in nicely with the money analogy, because you wouldn't buy something that you expect with certainty anyway[2], but this doesn't mean that you prefer it any less compared to some other, worse outcome that you expected some time earlier. It is just that we've updated from our observations such that the utility function now reflects our current beliefs. If you prefer $A$ to $B$, then this is a fact regardless of the probabilities of those outcomes. When the probabilities change, what is changing is the mapping from proposition to real number (the utility function), and it is only changing by a shift (and possibly a scaling) by a real number.

At least that is the interpretation that I've done.


  1. This seems reasonable but non-trivial to prove depending on how we translate between logic and probability. ↩︎

  2. If you do, you either don't actually expect it or have a bad sense of business. ↩︎
