# Probabilities Small Enough To Ignore: An attack on Pascal's Mugging

post by Kaj_Sotala · 2015-09-16T10:45:56.792Z · score: 20 (25 votes) · LW · GW · Legacy · 176 comments

*Summary: the problem with Pascal's Mugging arguments is that, intuitively, some probabilities are just too small to care about. There might be a principled reason for ignoring some probabilities, namely that they violate an implicit assumption behind expected utility theory. This suggests a possible approach for formally defining a "probability small enough to ignore", though there's still a bit of arbitrariness in it.*

I'm **not** talking about the bastardized version of Pascal's Mugging that's gotten popular of late, where it's used to refer to any argument involving low probabilities and huge stakes (e.g. low chance of thwarting unsafe AI vs. astronomical stakes). Neither am I talking specifically about the "mugging" illustration, where a "mugger" shows up to threaten you.

**Intuition: how Pascal's Mugging breaks implicit assumptions in expected utility theory**

We *need* a way to look at just the probability component of the expected utility calculation and ignore the utility component, since the core of PM is that the utility can always be arbitrarily increased to overwhelm the low probability.

**Our first attempt**

**Looking at the first attempt in detail**

**ELU example**

**Defining R**

**General thoughts on this approach**

## 176 comments


I don't know if this solves very much. As you say, if we use the number 1, then we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings. But if we start going for less than one, then we're just defining away Pascal's Mugging by fiat, saying "this is the level at which I am willing to stop worrying about this".

Also, as some people elsewhere in the comments have pointed out, this makes probability non-additive in an awkward sort of way. Suppose that if you eat unhealthy, you increase your risk of one million different diseases by plus one-in-a-million chance of getting each. Suppose also that eating healthy is a mildly unpleasant sacrifice, but getting a disease is much worse. If we calculate this out disease-by-disease, each disease is a Pascal's Mugging and we should choose to eat unhealthy. But if we calculate this out in the broad category of "getting some disease or other", then our chances are quite high and we should eat healthy. But it's very strange that our ontology/categorization scheme should affect our decision-making. This becomes much more dangerous when we start talking about AIs.
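The aggregation arithmetic in this comment is easy to check numerically. A quick sketch, using the comment's hypothetical one-in-a-million figures and assuming the disease risks are independent:

```python
# The comment's hypothetical: eating unhealthily adds a one-in-a-million
# chance of each of 1,000,000 different diseases. Per disease the risk
# looks ignorable, but (assuming independence) the aggregate is not:
p_per_disease = 1e-6
n_diseases = 10**6

# Probability of getting at least one of the diseases.
p_any_disease = 1 - (1 - p_per_disease) ** n_diseases
print(p_any_disease)  # ≈ 0.632, i.e. about 1 - 1/e
```

So whether the agent sees one 63% risk or a million one-in-a-million risks depends purely on how the outcomes are categorized, which is exactly the comment's worry.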

Also, does this create weird nonlinear thresholds? For example, suppose that you live on average 80 years. If some event which causes you near-infinite disutility happens every 80.01 years, you should ignore it; if it happens every 79.99 years, then preventing it becomes the entire focus of your existence. But it seems nonsensical for your behavior to change so drastically based on whether an event is every 79.99 years or every 80.01 years.

Also, a world where people follow this plan is a world where I make a killing on the Inverse Lottery (rules: 10,000 people take tickets; each ticket holder gets paid $1, except a randomly chosen "winner" who must pay $20,000).

we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings

How confident are you that this is false?

suppose that you live on average 80 years. [...] 80.01 years [...] 79.99 years

I started writing a *completely wrong* response to this, and it seems worth mentioning for the benefit of anyone else whose brain has the same bugs as mine.

I was going to propose replacing "Compute how many times you expect this to happen to you; treat it as normal if n >= 1 and ignore completely if n < 1" with "treat it as normal if n >= 1, ignore completely if n < 0.01, and interpolate smoothly between those for intermediate n".

But all this does, if the (dis)utility of the event is hugely greater than that of everything else you care about, is to push the threshold where you jump between "ignore everything except this" and "ignore this" further out, maybe from "once per 80 years" to "once per 8000 years".
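To see concretely why smoothing only relocates the jump, here is a sketch of an interpolation rule of the kind described above; the log-linear interpolation and the 1e30 disutility figure are illustrative assumptions, not anything from the comment:

```python
import math

# Weight an event by w(n), where n is the expected number of occurrences
# in your lifetime: w = 1 for n >= 1, w = 0 for n <= 0.01, log-linear
# in between.
def weight(n):
    if n >= 1:
        return 1.0
    if n <= 0.01:
        return 0.0
    return (math.log10(n) + 2) / 2

U_huge = 1e30  # illustrative "near-infinite" disutility of the rare event
U_rest = 1.0   # total (dis)utility of everything else you care about

# The event dominates your decisions whenever w(n) * U_huge > U_rest,
# which happens almost immediately above the n = 0.01 cutoff: the jump
# between "ignore this" and "ignore everything except this" has merely
# moved from n = 1 down to roughly n = 0.01.
for n in [0.009, 0.011, 0.5, 2.0]:
    print(n, weight(n) * U_huge > U_rest)
```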

I fear we really do need something like bounded utility to make that problem go away.

I fear we really do need something like bounded utility to make that problem go away.

If what you dislike is a discontinuity, you *still* get a discontinuity at the bound.

I am not a utilitarian, but I would look for a way to deal with the issue at the meta level. Why would you believe the bet that Pascal's Mugger offers you?

At a more prosaic level (e.g. seat belts) this looks to be a simple matter of risk tolerance and not that much of a problem.

What do you mean by "you still get a discontinuity at the bound"? (I am wondering whether by "bounded utility" you mean something like "unbounded utility followed by clipping at some fixed bounds", which would certainly introduce weird discontinuities but isn't at all what I have in mind when I imagine an agent with bounded utilities.)

I agree that doubting the mugger is a good idea, and in particular I think it's entirely reasonable to suppose that the probability that anyone can affect your utility by an amount U must decrease at least as fast as 1/U for large U, which is essentially (except that I was assuming a Solomonoff-like probability assignment) what I proposed on LW back in 2007.

Now, of course an agent's probability and utility assignments are whatever they are. Is there some reason *other* than wanting to avoid a Pascal's mugging why that condition should hold? Well, if it doesn't hold then your expected utility diverges, which seems fairly bad. -- Though I seem to recall seeing an argument from Stuart Armstrong or someone of the sort to the effect that if your utilities aren't bounded then your expected utility in some situations pretty much has to diverge anyway.

(We can't hope for a much stronger reason, I think. In particular, your utilities *can* be just about anything, so there's clearly no outright impossibility or inconsistency about having utilities that "increase too fast" relative to your probability assignments.)

but isn't at all what I have in mind when I imagine an agent with bounded utilities

What *do* you have in mind?

Crudely, something like this: Divide the value-laden world up into individuals and into short time-slices. Rate the happiness of each individual in each time-slice on a scale from -1 to +1. (So we suppose there are limits to *momentary intensity* of satisfaction or dissatisfaction, which seems reasonable to me.) Now, let h be a tiny positive number, and assign overall utility 1/h tanh(sum of atanh(h * local utility)).

Because h is extremely small, for modest lifespans and numbers of agents this is very close to net utility = sum of local utility. But we never get |overall utility| > 1/h.
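A quick numerical sketch of the formula above, with an illustrative h of 1e-9 (so the bound 1/h is 1e9):

```python
import math

H = 1e-9  # a tiny positive h; |overall utility| is then bounded by 1/H = 1e9

def overall_utility(sum_of_atanh_terms, h=H):
    return math.tanh(sum_of_atanh_terms) / h

def term(local_utility, h=H):
    # Local utility is in [-1, 1]; h * u is tiny, so atanh(h * u) ≈ h * u.
    return math.atanh(h * local_utility)

# Modest case: 1000 agent-slices at happiness 0.5 behaves like the plain sum.
modest = overall_utility(1000 * term(0.5))   # ≈ 500.0

# Huge case: 10**18 agent-slices at 0.5 would naively sum to 5e17, but
# tanh saturates, so overall utility stays at the bound 1/H = 1e9.
huge = overall_utility(1e18 * term(0.5))
print(modest, huge)
```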

Now, this feels like an ad hoc trick rather than a principled description of how we "should" value things, and I am not seriously proposing it as what anyone's utility function "should" (or does) look like. But I think one could make something of a case for an agent that works more like this: just add up local utilities linearly, *but* weight the utility for agent A in timeslice T in a way that decreases exponentially (with small constant) with the description-length of (A,T), where the way in which we describe things is inevitably somewhat referenced to ourselves. So it's pretty easy to say "me, now" and not that much harder to say "my wife, an hour from now", so these two are weighted similarly; but you need a longer description to specify a person and time much further away, and if you have 3^^^3 people then you're going to get weights about as small as 1/3^^^3.

Let me see if I understand you correctly.

You have a matrix of (number of individuals) x (number of time-slices). Each matrix cell has value ("happiness") that's constrained to lie in the [-1..1] interval. You call the cell value "local utility", right?

And then you, basically, sum up the cell values, re-scale the sum to fit into a pre-defined range and, in the process, add a transformation that makes sure the bounds are not sharp cut-offs, but rather limits which you approach asymptotically.

As to the second part, I have trouble visualising the language in which the description-length would work as you want. It seems to me it will have to involve a lot of scaffolding which might collapse under its own weight.

"You have a matrix ...": correct. "And then ...": whether that's correct depends on what you mean by "in the process", but it's certainly not entirely unlike what I meant :-).

Your last paragraph is too metaphorical for me to work out whether I share your concerns. (My description was extremely handwavy so I'm in no position to complain.) I think the scaffolding required is basically just the agent's knowledge. (To clarify a couple of points: not necessarily *minimum* description length, which of course is uncomputable, but something like "shortest description the agent can readily come up with"; and of course in practice what I describe is way too onerous computationally but some crude approximation might be manageable.)

The basic issue is whether the utility weights ("description lengths") reflect the subjective preferences. If they do, it's an entirely different kettle of fish. If they don't, I don't see why "my wife" should get much more weight than "the girl next to me on a bus".

I think real people have preferences whose weights decay with distance -- geographical, temporal and conceptual. I think it would be reasonable for artificial agents to do likewise. Whether the particular mode of decay I describe resembles real people's, or would make an artificial agent tend to behave in ways we'd want, I don't know. As I've already indicated, I'm not claiming to be doing more than sketch what some kinda-plausible bounded-utility agents might look like.

One easy way to do this is to map an unbounded utility function onto a finite interval. You will end up with the same order of preferences, but your choices won't always be the same. In particular you will start avoiding cases of the mugging.

In particular you will start avoiding cases of the mugging.

Not really avoiding -- a bound on your utility in the context of a Pascal's Mugging is basically a bound on what the Mugger can offer you. For any probability of what the Mugger promises there is some non-zero amount that you would be willing to pay and that amount is a function of your bound (and of the probability, of course).

However utility asymptotically approaching a bound is likely to have its own set of problems. Here is a scenario after five seconds of thinking:

That vexatious chap Omega approaches you (again!) and this time instead of boxes offers you two buttons, let's say one of them is teal-coloured and the other is cyan-coloured. He says that if you press the teal button, 1,000,001 people will be cured of terminal cancer. But if you press the cyan button, 1,000,000 people will be cured of terminal cancer plus he'll give you a dollar. You consult your utility function, happily press the cyan button and walk away richer by a dollar. Did something go wrong?

Yes, something went wrong in your analysis.

I suggested mapping an unbounded utility function onto a finite interval. This preserves the order of the preferences in the unbounded utility function.

In my "unbounded" function, I prefer saving 1,000,001 people to saving 1,000,000 and getting a dollar. So I have the same preference with the bounded function, and so I press the teal button.

If you want to do all operations -- notably, adding utility and dollars -- *before* mapping to the finite interval, you still fall prey to the Pascal's Mugging and I don't see the point of the mapping at all in this case.

The mapping is of utility values, e.g.

In my unbounded function I might have:

- Saving 1,000,000 lives = 10,000,000,000,000 utility.
- Saving 1,000,001 lives = 10,000,010,000,000 utility.
- Getting a dollar = 1 utility.
- Saving 1,000,000 lives and getting a dollar = 10,000,000,000,001 utility.

Here we have getting a dollar < saving 1,000,000 lives < saving 1,000,000 lives and getting a dollar < saving 1,000,001 lives.

The mapping is a one-to-one function that maps values between negative and positive infinity to a finite interval, and preserves the order of the values. There are a lot of ways to do this, and it will mean that the utility of saving 1,000,001 lives will remain higher than the utility of saving 1,000,000 lives and getting a dollar.

But it preserves this order, not everything else, and so it can still avoid Pascal's Mugging. Basically the mugging depends on multiplying the utility by a probability. But since the utility has a numerical bound, that means when the probability gets too low, this multiplied value will tend toward zero. This does mean that my system can give different results when betting is involved. But that's what we wanted, anyway.
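As a sketch of this, here is one such order-preserving mapping; tanh with an arbitrary 1e-13 scale factor is purely an illustrative choice, applied to the utilities from the example above:

```python
import math

# One order-preserving map from (-inf, +inf) onto (-1, 1); any strictly
# increasing bounded bijection works similarly. The 1e-13 scale keeps
# these particular utilities in tanh's near-linear region.
def bounded(u):
    return math.tanh(u * 1e-13)

save_1000000 = 10_000_000_000_000            # utilities from the example
save_1000001 = 10_000_010_000_000
save_1000000_and_dollar = 10_000_000_000_001

# The preference order survives the mapping, so you still press teal:
assert bounded(save_1000000) < bounded(save_1000000_and_dollar) < bounded(save_1000001)

# But the mugger's astronomical offer maps to a value below 1, so its
# expected utility is capped by the probability itself:
p_mugger = 1e-50
eu_mugging = p_mugger * bounded(10**100)
assert eu_mugging <= p_mugger  # negligible no matter how big the raw offer
```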

As you say, if we use the number 1, then we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings.

And in fact, it has taken lots of pushing to make all of those things common enough that we can no longer say that no one does them. (In fact, looking back at the 90s and early 2000s, it feels like wearing one's seatbelt at all times was pretty contrarian, where I live. This is only changing thanks to intense advertising campaigns.)

if we use the number 1, then we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings

Isn't this dealt with in the above by aggregating all the deals of a certain probability together?

(number of deals that you can make in your life that have this probability) * (PEST) < 1

Maybe the expected number of major car crashes or dangerous fires, etc that you experience are each less than 1, but the expectation for the number of all such things that happen to you might be greater than 1.

There might be issues with how to group such events though, since only considering things with the exact same probability together doesn't make sense.

What are the odds that our ideas about eating healthy are wrong?

But it seems nonsensical for your behavior to change so drastically based on whether an event is every 79.99 years or every 80.01 years.

Doesn't it actually make sense to put that threshold at the predicted usable lifespan of the universe?

For example, suppose that you live on average 80 years. If some event which causes you near-infinite disutility happens every 80.01 years, you should ignore it; if it happens every 79.99 years, then preventing it becomes the entire focus of your existence.

That only applies if you're going to live *exactly* 80 years. If your lifespan is some distribution which is centered around 80 years, you should gradually stop caring as the frequency of the event goes up past 80 years, the amount by which you've stopped caring depending on the distribution. It doesn't go all the way to zero until the chance that you'll live that long is zero.

(Of course, you could reply that your chance of living to some age doesn't go to *exactly* zero, but all that is necessary to prevent the mugging is that it goes down fast enough.)
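This smoothing can be made concrete with a toy lifespan distribution; the normal shape with mean 80 and standard deviation 10 is purely an illustrative assumption:

```python
import math

# The weight given to an event that strikes once every T years is roughly
# the probability of living at least T years: a smooth curve, not a cliff.
def p_live_at_least(t, mean=80.0, sd=10.0):
    # Survival function of a normal distribution, via the complementary
    # error function.
    return 0.5 * math.erfc((t - mean) / (sd * math.sqrt(2)))

for t in [79.99, 80.01, 100, 120]:
    print(t, p_live_at_least(t))
# 79.99 and 80.01 give nearly identical weights; the weight only becomes
# negligible for events much rarer than the plausible lifespan range.
```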

As you say, if we use the number 1, then we shouldn't wear seatbelts, get fire insurance, or eat healthy to avoid getting cancer, since all of those can be classified as Pascal's Muggings. But if we start going for less than one, then we're just defining away Pascal's Mugging by fiat, saying "this is the level at which I am willing to stop worrying about this".

The point of Pascal's mugging is things that have basically infinitely small probability. Things that will never happen, ever, ever, once in 3^^^3 universes and possibly much more. People do get in car accidents and get cancer all the time. You shouldn't ignore those probabilities.

Having a policy of heeding small risks like those is fine. Over the course of your life, they add up. There will be a large chance that you will be better off than not.

But having a policy of paying the mugger, of following expected utility in extreme cases, will never ever pay off. You will always be worse off than you otherwise would be.

So in that sense it isn't arbitrary. There is an actual number where ignoring risks below that threshold gives you the best median outcome. Following expected utility above that threshold works out for the best. Following EU on risks below the threshold is more likely to make you worse off.

If you knew your full probability distribution of possible outcomes, you could exactly calculate that number.

A rule-of-thumb I've found use for in similar situations: There are approximately ten billion people alive, of whom it's a safe conclusion that at least one is having a subjective experience that is completely disconnected from objective reality. There is no way to tell that I'm not that one-in-ten-billion. Thus, I can never be more than one minus one-in-ten-billion sure that my sensory experience is even roughly correlated with reality. Thus, it would require extraordinary circumstances for me to have any reason to worry about any probability of less than one-in-ten-billion magnitude.

There are all sorts of questionable issues with the assumptions and reasoning involved; and yet, it seems roughly as helpful as remembering that I've only got around a 99.997% chance of surviving the next 24 hours, another rule-of-thumb which handily eliminates certain probability-based problems.

Thus, I can never be more than one minus one-in-ten-billion sure that my sensory experience is even roughly correlated with reality. Thus, it would require extraordinary circumstances for me to have any reason to worry about any probability of less than one-in-ten-billion magnitude.

No. The reason not to spend much time thinking about the I-am-undetectably-insane scenario is not, in general, that it's extraordinarily unlikely. The reason is that you can't make good predictions about what would be good choices for you in worlds where you're insane and totally unable to tell.

This holds even if the probability for the scenario goes up.

/A/ reason not to spend much time thinking about the I-am-undetectably-insane scenario is as you describe; however, it's not the /only/ reason not to spend much time thinking about it.

I often have trouble explaining myself, and need multiple descriptions of an idea to get a point across, so allow me to try again:

There is roughly a 30 out of 1,000,000 chance that I will die in the next 24 hours. Over a week, simplifying a bit, that's roughly 200 out of 1,000,000 odds of me dying. If I were to buy a 1-in-a-million lottery ticket a week, then, by one rule of thumb, I should be spending 200 times as much of my attention on my forthcoming demise than I should on buying that ticket and imagining what to do with the winnings.
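For what it's worth, the "simplifying a bit" step checks out: computing the weekly figure exactly, rather than just multiplying by seven, gives nearly the same number:

```python
# The comment's numbers: 30-in-a-million daily risk of death, and a
# simplified weekly figure of about 200-in-a-million.
daily = 30 / 1_000_000
weekly_exact = 1 - (1 - daily) ** 7
print(weekly_exact)  # ≈ 0.00021, i.e. about 210 in a million
```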

In parallel, if I am to compare two independent scenarios, the at-least-one-in-ten-billion odds that I'm hallucinating all this, and the darned-near-zero odds of a Pascal's Mugging attempt, then I should be spending proportionately that much more time dealing with the Matrix scenario than that the Pascal's Mugging attempt is true; which works out to darned-near-zero seconds spent bothering with the Mugging, no matter how much or how little time I spend contemplating the Matrix.

(There are, of course, alternative viewpoints which may make it worth spending more time on the low-probability scenarios in each case; for example, buying a lottery ticket can be viewed as one of the few low-cost ways to funnel money from most of your parallel-universe selves so that a certain few of your parallel-universe selves have enough resources to work on certain projects that are otherwise infeasibly expensive. But these alternatives require careful consideration and construction, at least enough to be able to have enough logical weight behind them to counter the standard rule-of-thumb I'm trying to propose here.)

In parallel, if I am to compare two independent scenarios, the at-least-one-in-ten-billion odds that I'm hallucinating all this, and the darned-near-zero odds of a Pascal's Mugging attempt, then I should be spending proportionately that much more time dealing with the Matrix scenario than that the Pascal's Mugging attempt is true

That still sounds wrong. You appear to be deciding on what to precompute for purely by probability, without considering that some possible futures will give you the chance to shift more utility around.

If I don't know anything about Newcomb's problem and estimate a 10% chance of Omega showing up and posing it to me tomorrow, I'll definitely spend more than 10% of my planning time for tomorrow reading up on and thinking about it. Why? Because I'll be able to make far more money in that possible future than the others, which means that the expected utility differentials are larger, and so it makes sense to spend more resources on preparing for it.

The I-am-undetectably-insane case is the opposite of this, a scenario that it's pretty much impossible to usefully prepare for.

And a PM scenario is (at least for an expected-utility maximizer) a more extreme variant of my first scenario - low probabilities of ridiculously large outcomes, that are because of that still worth thinking about.

**[deleted]**· 2015-09-17T21:35:53.865Z · score: 0 (2 votes) · LW(p) · GW(p)

In parallel, if I am to compare two independent scenarios, the at-least-one-in-ten-billion odds that I'm hallucinating all this, and the darned-near-zero odds of a Pascal's Mugging attempt, then I should be spending proportionately that much more time dealing with the Matrix scenario than that the Pascal's Mugging attempt is true

That still sounds wrong. You appear to be deciding on what to precompute for purely by probability, without considering that some possible futures will give you the chance to shift more utility around.

I agree, but I think I see where DataPacRat is going with his/her comments.

First, it seems as if we only think about the Pascalian scenarios that are presented to us. If we are presented with one of these scenarios, e.g. mugging, we should consider all other scenarios of equal or greater expected impact.

In addition, low probability events that we fail to consider can possibly obsolete the dilemma posed by PM. For example, say a mugger demands your wallet or he will destroy the universe. There is a nonzero probability that he has the capability to destroy the universe, but it is important to consider the much greater, but still low, probability that he dies of a heart attack right before your eyes.

In the I-am-undetectably-insane scenario, your predictions about the worlds where you're insane don't even matter, because your subjective experience doesn't actually take place in those worlds anyways.

Not only is this basically my opinion as well, but Terrence Tao once said something incredibly similar to this in response to Holden Karnofsky of GiveWell.

A satisfactory solution should work not only on humans.

A satisfactory solution should also work if the em population explodes to several quadrillion. As I said, 'all sorts of questionable issues'; it's a rule-of-thumb to keep certain troubling edge cases from being quite so troublesome, not a fixed axiom to use as the foundation for an ethical system that can be used by all sapient beings in all circumstances. Once I learn of something more coherent to use instead of the rule-of-thumb, I'll be happy to use that something-else instead.

Thanks for these thoughts, Kaj.

It's a worthwhile effort to overcome this problem, but let me offer a mode of criticising it. A lot of people are not going to want the principles of rationality to be contingent on how long you expect to live. There are a bunch of reasons for this. One is that how long you expect to live might not be well-defined. In particular, some people will want to say that there's no right answer to the question of whether you become a new person each time you wake up in the morning, or each time some of your brain cells die. On the other extreme, it might be the case that to a significant degree, some of your life continues after your heart stops beating, either through your ideas living on in others' minds, or by freezing yourself. If you freeze yourself for 1000 years, then wake up again for another hundred, should the frozen years be included in defining PESTs, or not? It seems weird that rationality should be dependent on how we formalise the philosophy of identity in the real world. Why should PESTs be defined based on how long you expect to live, rather than on how long you expect humanity as a whole to live, or on the expected lifetime of anything else that you might care about?

Anyhow, despite my criticism, this is an interesting answer - cheers for writing this up.

Thanks!

I understand that line of reasoning, but to me it feels similar to the philosophy where one thinks that the principles of rationality should be totally objective and shouldn't involve things like subjective probabilities, so then one settles on a frequentist interpretation of probability and tries to get rid of subjective (Bayesian) probabilities entirely. Which doesn't really work in the real world.

One is that how long you expect to live might not be well-defined. In particular, some people will want to say that there's no right answer to the question of whether you become a new person each time you wake up in the morning, or each time some of your brain cells die.

But most people already base their reasoning on an assumption of being the same person tomorrow; if you seriously start making your EU calculations based on the assumption that you're only going to live for one day or for an even shorter period, *lots* of things are going to get weird and broken, even without my approach.

It seems weird that rationality should be dependent on how we formalise the philosophy of identity in the real world. Why should PESTs be defined based on how long you expect to live, compared to how long you expect humanity as a whole to live, or on the expected lifetime of anything else that you might care about.

It doesn't seem all that weird to me; rationality has always been a tool for us to best achieve the things we care about, so its exact form will always be dependent on the things that we care about. The kinds of deals we're willing to consider *already* depend on how long we expect to live. For example, if you offered me a deal that had a 99% chance of killing me on the spot and a 1% chance of giving me an extra 20 years of healthy life, the rational answer would be to say "no" if it was offered to me now, but "yes" if it was offered to me when I was on my deathbed.

If you say "rationality is dependent on how we formalize the philosophy of identity in the real world", it does sound counter-intuitive, but if you say "you shouldn't make deals that you never expect to be around to benefit from", it doesn't sound quite so weird anymore. If you expected to die in 10 years, you wouldn't make a deal that would give you lots of money in 30. (Of course it could still be rational if someone else you cared about would get the money after your death, but let's assume that you could only collect the payoff personally.)

Using subjective information within a decision-making framework seems fine. The troublesome part is that the idea of 'lifespan' is being used to create the framework.

Making practical decisions about how long I expect to live seems fine and normal currently. If I want an icecream tomorrow, that's not contingent on whether 'tomorrow-me' is the same person as I was today or a different one. My lifespan is uncertain, and a lot of my values might be fulfilled after it ends. Weirdnesses like the possibility of being a Boltzmann brain are tricky, but at least they don't interfere with the machinery/principles of rationality - I can still do an expected value calculation. Weirdness on the object level I can deal with.

Allowing 'lifespan' to introduce weirdness into the decision-making framework itself seems less nice. Now, whether my frozen life counts as 'being alive' is extra important. Things like being frozen for a long time, or lots of Boltzmann brains existing, could interfere with what risks I should be willing to accept on this planet - a puzzle that would require resolution.

Using subjective information within a decision-making framework seems fine. The troublesome part is that the idea of 'lifespan' is being used to create the framework.

I'm not sure that the within/outside the framework distinction is meaningful. I feel like the expected lifetime component is also just another variable that you plug into the framework, similarly to your probabilities and values (and your general world-model, assuming that the probabilities don't come out of nowhere). The rational course of action already depends on the state of the world, and your expected remaining lifetime is a part of the state of the world.

I also actually feel that the fact that we're forced to think about our lifetime is a *good* sign. EU maximization is a tool for getting what we want, and Pascal's Mugging is a scenario where it causes us to do things that don't get us what we want. If a potential answer to PM reveals that EU maximization is broken because it doesn't properly take into account everything that we want, and forces us to consider previously-swept-under-the-rug questions about what we do want... then that seems like a sign that the proposed answer is on the right track.

I have this intuition that I'm having slight difficulties putting into words... but roughly, EU maximization is a rule of how to behave in different situations, which abstracts over the details of those situations while being ultimately derived from them. I feel that attempts to resolve Pascal's Mugging on purely "rational" grounds are mostly about trying to follow a certain aesthetic that favors deriving things from purely logical considerations and a priori principles. And that aesthetic necessarily ends up treating EU maximization as just a formal rule, neglecting to consider the *actual* situations it abstracts over, and loses sight of the actual *purpose* of the rule, which is to give us good outcomes. If you forget about trying to follow the aesthetic and look at the actual behavior that something like PEST leads to, you'll see that it's the agents who ignore PESTs who actually end up winning... which is the thing that should really matter.

If I want an icecream tomorrow, that's not contingent on whether 'tomorrow-me' is the same person as I was today or a different one.

Really? If I really only cared about the tomorrow!me for the same amount that I cared about some random stranger, my behavior would be a lot different. I wouldn't bother with any long-term plans, for one.

Of course, even people who believe that they'll be another person tomorrow still mostly act the same as everyone else. One explanation would be that their implicit behavior doesn't match their explicit beliefs... but even if it did match, there would still be a rational case for caring about their future self more than they cared about random strangers, because the future self would have more similar values to them than a random stranger. In particular, their future self would be likely to follow the same decision algorithm as they were.

So if they cared about things that happened after their death, it would be reasonable to still behave like they expected their total lifetime to be the same as with a more traditional theory of personal identity, and this is the case regardless of whether we're talking traditional EU maximization or PEST.

So, um:

Which axiom does this violate?

Continuity and independence.

Continuity: Consider the scenario where each of the [LMN] bets refers to one (guaranteed) outcome, which we'll also call L, M and N for simplicity.

Let U(L) = 0, U(M) = 1, U(N) = 10**100

For a simple EU maximizer, you can then satisfy continuity by picking p=(1-1/10**100). A PESTI agent, OTOH, may just discard a (1-p) of 1/10**100, which leaves no other options to satisfy it.

The 10**100 value is chosen without loss of generality. For PESTI agents that still track probabilities of this magnitude, increase it until they don't.

Independence: Set p to a number small enough that it's Small Enough To Ignore. At that point, the terms for getting L and M by that probability become zero, and you get equality between both sides.
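To make this concrete, here's a minimal sketch (the threshold and utilities are my own illustrative choices): a PESTI-style agent that discards sufficiently small probabilities can't be made indifferent between M and any mix of L and N, so Continuity fails.

```python
from fractions import Fraction

# Hedged sketch: a hypothetical agent that drops probabilities below a
# threshold before computing expected utility. Exact rationals are used
# so the 1/10**100 branch isn't lost to floating-point rounding.
THRESHOLD = Fraction(1, 10**10)  # illustrative "small enough to ignore" cutoff

def eu(lottery):
    """Expected utility of a lottery given as [(probability, utility), ...]."""
    return sum(p * u for p, u in lottery)

def pesti_eu(lottery):
    """Same, but branches with probability below THRESHOLD are simply dropped."""
    return sum(p * u for p, u in lottery if p >= THRESHOLD)

U_L, U_M, U_N = Fraction(0), Fraction(1), Fraction(10)**100
# Continuity asks for a p with p*L + (1-p)*N indifferent to M (utility 1):
p = 1 - Fraction(1, 10**100)
mix = [(p, U_L), (1 - p, U_N)]
print(eu(mix))        # 1: a plain EU maximizer is indifferent to M
print(pesti_eu(mix))  # 0: the PESTI agent dropped the 1/10**100 branch
```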

Theoretically, that's the question he's asking about Pascal's Mugging, since accepting the mugger's argument would tell you that expected utility never converges. And since we could rephrase the problem in terms of (say) diamond creation for a diamond maximizer, it does look like an issue of probability rather than goals.

Theoretically, that's the question he's asking about Pascal's Mugging, since accepting the mugger's argument would tell you that expected utility never converges

Of course, and the paper cited in http://wiki.lesswrong.com/wiki/Pascal's_mugging makes that argument rigorous.

And since we could rephrase the problem in terms of (say) diamond creation for a diamond maximizer, it does look like an issue of probability rather than goals.

It's a problem of expected utility, not necessarily probability. And I still would like to know which axiom it ends up violating. I suspect Continuity.

We can replace Continuity with the Archimedean property (or, 'You would accept some chance of a bad outcome from crossing the street.') By my reading, this ELU idea trivially follows Archimedes by ignoring the part of a compound 'lottery' that involves a sufficiently small probability. In which case it would violate Independence, and would do so by treating the two sides as effectively equal when the differing outcomes have small enough probability.

I like Scott Aaronson's approach for resolving paradoxes that seemingly violate intuitions -- see if the situation makes physical sense.

Like people bring up "blockhead," a big lookup table that can hold an intelligent conversation with you for [length of time], and wonder whether this has ramifications for the Turing test. But blockhead is not really physically realizable for reasonable lengths.

Similarly for creating 10^100 happy lives, how exactly would you go about doing that in our Universe?

Similarly for creating 10^100 happy lives, how exactly would you go about doing that in our Universe?

By some alternative theory of physics that has a, say, .000000000000000000001 probability of being true.

Right, the point is to throw away certain deals. I am suggesting another approach from the OP.

The OP says: ignore deals involving small numbers. I say: ignore deals that violate physical intuitions (as they are). Where my heuristic differs from the OP is my heuristic is willing to listen to someone trying to sell me the Brooklyn bridge if I think the story fundamentally makes sense to me, given how I think physics ought to work. I am worried about long shot cases not forbidden by physics explicitly (which the OP will ignore if the shot is long enough). My heuristic will fail if humans are missing something important about physics, but I am willing to bet we are not at this point.

In your example, the OP and I will both reject, for different reasons. I because it will violate my intuition and the OP because there is a small number involved.

Relativity seems totally, insanely physically impossible to me. That doesn't mean that taking a trillion to one bet on the Michelson Morley experiment wouldn't have been a good idea.

May I recommend Feynman's lectures then? I am not sure what the point is. Aristotle was a smart guy, but his physics intuition was pretty awful. I think we are in a good enough state now that I am comfortable using physical principles to rule things out.

Arguably quantum mechanics is a better example here than relativity. But I think a lot of what makes QM weird isn't about physics but about the underlying probability theory being non-standard (similarly to how complex numbers are kinda weird). So, e.g. Bell violations say there is no hidden variable DAG model underlying QM -- but hidden variable DAG models are defined on classical probabilities, and amplitudes aren't classical probabilities. Our intuitive notion of "hidden variable" is somehow tied to classical probability.

It all has to bottom out somewhere -- what criteria do you use to rule out solutions? I think physics is in better shape today than basically any other empirical discipline.

Do you know, offhand, if Bayesian networks have been extended with complex numbers as probabilities, or (reaching here) if you can do belief propagation by passing around qubits instead of bits? I'm not sure what I mean by either of these things, but I'm throwing keywords out there to see if anything sticks.

Yes they have, but there is no single generalization. I am not even sure what conditioning should mean.

Scott A is a better guy to ask.

I don't think the consensus of physicists is good enough for you to place *that* much faith in it. As I understand modern day cosmology, the consensus view holds that universe once grew by a factor of 10^78 *for no reason*. Would you pass up a 1 penny to $10,000,000,000 bet that cosmologists of the future will believe creating 10^100 happy humans is *physically* possible?

what criteria do you use to rule out solutions?

I don't know :-(. Certainly I like physics as a darn good heuristic, but I don't think I should reject bets with super-exponentially good odds based on my understanding of physics. A few *bits* of information from an expert would be enough to convince me that I'm wrong about physics, and I don't think I should reject a bet with a payout better than 1 / the odds I will see those bits.

Which particular event has P = 10^-21? It seems like part of the pascal's mugging problem is a type error: We have a utility function U(W) over physical worlds but we're trying to calculate expected utility over strings of English words instead.

Pascal's Mugging is a constructive proof that trying to maximize expected utility over logically possible worlds doesn't work in any particular world, at least with the theories we've got now. Anything that doesn't solve reflective reasoning under probabilistic uncertainty won't help against Muggings promising things from other possible worlds unless we just ignore the other worlds.

I'd say that if you assign a 10^-22 probability to a theory of physics that allows somebody to create 10^100 happy lives depending on your action, then you're doing physics wrong.

If you assign probability 10^-(10^100) to 10^100 lives, 10^-(10^1000) to 10^1000 lives, 10^-(10^10000) to 10^10000 lives, and so on, then you are doing physics right and you will not fall for Pascal's Mugging.

There seems to be no obvious reason to assume that the probability falls exactly in proportion to the number of lives saved.

If GiveWell told me they thought that real-life intervention A could save one life with probability PA and real-life intervention B could save a hundred lives with probability PB, I'm pretty sure that dividing PB by 100 would be the wrong move to make.

There seems to be no obvious reason to assume that the probability falls exactly in proportion to the number of lives saved.

It is an assumption to make asymptotically (that is, for the tails of the distribution), which is reasonable due to all the nice properties of exponential family distributions.

If GiveWell told me they thought that real-life intervention A could save one life with probability PA and real-life intervention B could save a hundred lives with probability PB, I'm pretty sure that dividing PB by 100 would be the wrong move to make.

I'm not implying that.

EDIT:

As a simple example, if you model the number of lives saved by each intervention as a normal distribution, you are immune to Pascal's Muggings. In fact, if your utility is linear in the number of lives saved, you'll just need to compare the means of these distributions and take the maximum. Black swan events at the tails don't affect your decision process.

Using normal distributions may be appropriate when evaluating GiveWell interventions, but for a general-purpose decision process you will have, for each action, a probability distribution over possible future world state trajectories, which when combined with a utility function will yield a generally complicated and multimodal distribution over utility. But as long as the shape of the distribution at the tails is normal-like, you wouldn't be affected by Pascal's Muggings.
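As a rough sketch of this (all parameters are my own illustrative choices): model each intervention's lives saved as a normal distribution; with utility linear in lives, only the means matter, so fattening the tails can't "mug" the decision.

```python
import random

random.seed(0)  # deterministic for the example

def mean_outcome(mean, sd, n=100_000):
    """Monte Carlo estimate of the expected lives saved under a normal model."""
    return sum(random.gauss(mean, sd) for _ in range(n)) / n

a = mean_outcome(mean=10, sd=2)   # modest, reliable intervention
b = mean_outcome(mean=9, sd=50)   # same family, much wilder tails
# With utility linear in lives, the ranking follows the means, not the
# tails: inflating b's variance doesn't move its expected value.
print(a > b)  # True
```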

But it looks like the shape of the distributions *isn't* normal-like? In fact, that's one of the standard EA arguments for why it's important to spend energy on finding the most effective thing you can do: if possible intervention outcomes really *were* approximately normally distributed, then your exact choice of an intervention wouldn't matter all that much. But actually the distribution of outcomes looks very skewed; to quote The moral imperative towards cost-effectiveness:

DCP2 includes cost-effectiveness estimates for 108 health interventions, which are presented in the chart below, arranged from least effective to most effective [...] This larger sample of interventions is even more disparate in terms of cost-effectiveness. The least effective intervention analysed is still the treatment for Kaposi’s sarcoma, but there are also interventions up to ten times more cost-effective than education for high risk groups. In total, the interventions are spread over more than four orders of magnitude, ranging from 0.02 to 300 DALYs per $1,000, with a median of 5. Thus, moving money from the least effective intervention to the most effective would produce about 15,000 times the benefit, and even moving it from the median intervention to the most effective would produce about 60 times the benefit.

It can also be seen that due to the skewed distribution, the most effective interventions produce a disproportionate amount of the benefits. According to the DCP2 data, if we funded all of these interventions equally, 80% of the benefits would be produced by the top 20% of the interventions. [...]

Moreover, there have been health interventions that are even more effective than any of those studied in the DCP2. [...] For instance in the case of smallpox, the total cost of eradication was about $400 million. Since more than 100 million lives have been saved so far, this has come to less than $4 per life saved — significantly superior to all interventions in the DCP2.

I think you misunderstood what I said or I didn't explain myself well: I'm not assuming that the DALY distribution obtained if you choose interventions at random is normal. I'm assuming that for each intervention, the DALY distribution it produces is normal, with an intervention-dependent mean and variance.

I think that for the kind of interventions that GiveWell considers, this is a reasonable assumption: if the number of DALYs produced by each intervention is the result of a sum of many roughly independent variables (e.g. DALYs gained by helping Alice, DALYs gained by helping Bob, etc.) the total should be approximately normally distributed, due to the central limit theorem.

For other types of interventions, e.g. whether to fund a research project, you may want to use a more general family of distributions that allows non-zero skewness (e.g. skew-normal distributions), but as long as the distribution is light-tailed and you don't use extreme values for the parameters, you would not run into Pascal's Mugging issues.

It's easy if they have access to running detailed simulations, and while the probability that someone secretly has that ability is very low, it's not nearly as low as the probabilities Kaj mentioned here.

It is? How much energy are you going to need to run detailed sims of 10^100 people?

How do you know you don't exist in the matrix? And that the true universe above ours doesn't have infinite computing power (or huge but bounded, if you don't believe in infinity.) How do you know the true laws of physics in our own universe don't allow such possibilities?

You can say these things are *unlikely*. That's literally specified in the problem. That doesn't resolve the paradox at all though.

I don't know, but my heuristic says to ignore stories that violate sensible physics I know about.

That's fine. You can just follow your intuition, and that usually won't lead you too wrong. Usually. However the issue here is programming an AI which doesn't share our intuitions. We need to actually formalize our intuitions to get it to behave as we would.

What criterion do you use to rule out solutions?

If you assume that the probability of somebody creating X lives decreases asymptotically as exp(-X) then you will not accept the deal. In fact, the larger the number they say, the lower the expected utility you'll estimate (assuming that your utility is linear in the number of lives).

It seems to me that such epistemic models are natural. Pascal's Mugging arises as a thought experiment only if you consider arbitrary probability distributions and arbitrary utility functions, which in fact may even cause the expectations to become undefined in the general case.
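A quick sketch of how such an epistemic model disarms the mugger (the exp(-X) form is taken from the comment above; utility is assumed linear in lives):

```python
import math

def deal_value(x):
    # Expected value of the deal: promised lives x, weighted by a prior
    # that falls off like exp(-x). For x > 1 this is strictly decreasing,
    # so naming a bigger number *weakens* the mugger's offer.
    return x * math.exp(-x)

values = [deal_value(x) for x in (1, 10, 100)]
print(values)  # strictly decreasing
```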

If you assume that the probability of somebody creating X lives decreases asymptotically as exp(-X) then you will not accept the deal.

I don't assume this. And I don't see any reason why I should assume this. It's quite possible that there exist powerful ways of simulating large numbers of humans. I don't think it's likely, but it's not *literally impossible* like you are suggesting.

Maybe it even is likely. I mean the universe seems quite large. We could theoretically colonize it and make trillions of humans. By your logic, that is incredibly improbable. For no other reason than that it involves a large number. Not that there is any physical law that suggests we can't colonize the universe.

I don't think it's likely, but it's not literally impossible like you are suggesting.

I'm not saying it's literally impossible, I'm saying that its probability should decrease with the number of humans, faster than the number of humans.

Maybe it even is likely. I mean the universe seems quite large. We could theoretically colonize it and make trillions of humans. By your logic, that is incredibly improbable. For no other reason than that it involves a large number.

Not really. I said "asymptotically". I was considering the tails of the distribution.

We can observe our universe and deduce the typical scale of the stuff in it. Trillion of humans may not be very likely but they don't appear to be physically impossible in our universe. 10^100 humans, on the other hand, are off scale. They would require a physical theory very different than ours. Hence we should assign to it a vanishingly small probability.

I'm not saying it's literally impossible

1/3^^^3 is so unfathomably huge, you might as well be saying it's literally impossible. I don't think humans are confident enough to assign probabilities so low, ever.

10^100 humans, on the other hand, are off scale. They would require a physical theory very different than ours. Hence we should assign to it a vanishingly small probability.

I think EY had the best counter argument. He had a fictional scenario where a physicist proposed a new theory that was simple and fit all of our data perfectly. But the theory also implies a new law of physics that could be exploited for computing power, and would allow unfathomably large amounts of computing power. And that computing power could be used to create simulated humans.

Therefore, if it's true, anyone alive today has a small probability of affecting large amounts of simulated people. Since that has "vanishingly small probability", the theory must be wrong. It doesn't matter if it's simple or if it fits the data perfectly.

But it seems like a theory that is simple and fits all the data should be very likely. And it seems like all agents with the same knowledge, should have the same beliefs about reality. Reality is totally uncaring about what our values are. What is true is already so. We should try to model it as accurately as possible. Not refuse to believe things because we don't like the consequences. That's actually a logical fallacy.

1/3^^^3 is so unfathomably huge, you might as well be saying it's literally impossible. I don't think humans are confident enough to assign probabilities so low, ever.

Same thing with numbers like 10^100 or 3^^^3.

I think EY had the best counter argument. He had a fictional scenario where a physicist proposed a new theory that was simple and fit all of our data perfectly. But the theory also implies a new law of physics that could be exploited for computing power, and would allow unfathomably large amounts of computing power. And that computing power could be used to create simulated humans.

EY can imagine all the fictional scenarios he wants; this doesn't mean that we should assign non-negligible probabilities to them.

It doesn't matter if it's simple or if it fits the data perfectly.

If.

But it seems like a theory that is simple and fits all the data should be very likely. And it seems like all agents with the same knowledge, should have the same beliefs about reality. Reality is totally uncaring about what our values are. What is true is already so. We should try to model it as accurately as possible. Not refuse to believe things because we don't like the consequences.

If your epistemic model generates undefined expectations when you combine it with your utility function, then I'm pretty sure we can say that at least one of them is broken.

EDIT:

To expand: just because we can imagine something and give it a short English description, it doesn't mean that it is simple in epistemic terms. That's the reason why "God" is not a simple hypothesis.

EY can imagine all the fictional scenario he wants, this doesn't mean that we should assign non-negligible probabilities to them.

Not negligible, zero. You literally cannot believe in a theory of physics that allows large amounts of computing power. If we discover that an existing theory like quantum physics allows us to create large computers, we will be forced to abandon it.

If your epistemic model generates undefined expectations when you combine it with your utility function, then I'm pretty sure we can say that at least one of them is broken.

Yes, something is broken, but it's definitely not our prior probabilities. Something like Solomonoff induction should generate perfectly sensible predictions about the world. If knowing those predictions makes you do weird things, that's a problem with your decision procedure, not the probability function.

Not negligible, zero.

You seem to have a problem with very small probabilities but not with very large numbers. I've also noticed this in Scott Alexander and others. If very small probabilities are zeros, then very large numbers are infinities.

You literally can not believe in an theory of physics that allows large amounts of computing power. If we discover that an existing theory like quantum physics allows us to create large computers, we will be forced to abandon it.

Sure. But since we know no such theory, there is no a priori reason to assume it exists with non-negligible probability.

Something like Solomonoff induction should generate perfectly sensible predictions about the world.

Nope, it doesn't. If you apply Solomonoff induction to predict arbitrary integers, you get undefined expectations.
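A toy illustration of why the expectation blows up (this is a simplicity-style prior of my own construction, not actual Solomonoff induction): if a hypothesis of description length ~k gets weight ~2^-k but promises a payoff of 2^k, every term contributes 2^-k * 2^k = 1, and the partial sums grow without bound.

```python
def partial_expectation(k_max):
    # Each hypothesis k has prior weight 2**-k and payoff 2**k, so each
    # term is exactly 1 and the partial sums diverge linearly in k_max.
    return sum(2**-k * 2**k for k in range(1, k_max + 1))

print(partial_expectation(10))   # 10.0
print(partial_expectation(100))  # 100.0 -- no sign of convergence
```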

Solomonoff induction combined with an unbounded utility function gives undefined expectations. But Solomonoff induction combined with a bounded utility function can give defined expectations.

And Solomonoff induction by itself gives defined predictions.

Solomonoff induction combined with an unbounded utility function gives undefined expectations. But Solomonoff induction combined with a bounded utility function can give defined expectations.

Yes.

And Solomonoff induction by itself gives defined predictions.

If you try to use it to estimate the expectation of any unbounded variable, you get an undefined value.

Probability is a bounded variable.

Yes, I understand that 3^^^3 is finite. But it's so unfathomably large, it might as well be infinity to us mere mortals. To say an event has probability 1/3^^^3 is to say you are certain it will never happen, ever. No matter how much evidence you are provided. Even if the sky opens up and the voice of god bellows to you and says "ya its true". Even if he comes down and explains why it is true to you, and shows you all the evidence you can imagine.

The word "negligible" is obscuring your true meaning. There is a massive - no, *unfathomable* - difference between 1/3^^^3 and "small" numbers like 1/10^80 (1 divided by the number of atoms in the universe.)

To use this method is to say there are hypotheses with relatively short descriptions which you will refuse to believe. Not just about muggers, but even simple things like theories of physics which *might* allow large amounts of computing power. Using this method, you might be forced to believe vastly more complicated and arbitrary theories that fit the data worse.

If you apply Solomonoff induction to predict arbitrary integers, you get undefined expectations.

Solomonoff induction's *predictions* will be perfectly reasonable, and I would trust them far more than any other method you can come up with. What you choose to do with the predictions could generate nonsense results. But that's not a flaw with SI, but with your method.

Point, but not a hard one to get around.

There is a theoretical lower bound on energy per computation, but it's extremely small, and the timescale they'll be run in isn't specified. Also, unless Scott Aaronson's speculative consciousness-requires-quantum-entanglement-decoherence theory of identity is true, there are ways to use reversible computing to get around the lower bounds and achieve theoretically limitless computation as long as you don't need it to output results. Having that be extant adds improbability, but not much on the scale we're talking about.

I'll need some background here. Why aren't bounded utilities the *default* assumption? You'd need some extraordinary arguments to convince me that anyone has an unbounded utility function. Yet this post and many others on LW seem to implicitly assume unbounded utility functions.

1) We don't need an unbounded utility function to demonstrate Pascal's Mugging. Plain old large numbers like 10^100 are enough.

2) It seems reasonable for utility to be linear in things we care about, e.g. human lives. This could run into a problem with non-uniqueness, i.e., if I run an identical computer program of you twice, maybe that shouldn't count as two. But I think this is sufficiently murky as to not make bounded utility clearly correct.

We don't need an unbounded utility function to demonstrate Pascal's Mugging. Plain old large numbers like 10^100 are enough.

The scale is arbitrary. If your utility function is designed such that utilities for common scenarios are not very small compared to the maximum utility, then you wouldn't have Pascal's Muggings.

It seems reasonable for utility to be linear in things we care about, e.g. human lives.

Does anybody really have linear preferences in anything? This seems at odds with empirical evidence.

Like V_V, I don't find it "reasonable" for utility to be linear in things we care about.

I will write a discussion topic about the issue shortly.

EDIT: Link to the topic: http://lesswrong.com/r/discussion/lw/mv3/unbounded_linear_utility_functions/

Why aren't bounded utilities the default assumption?

Because here the default utility is the one specified by the Von Neumann-Morgenstern theorem and there is no requirement (or indication) that it is bounded.

Humans, of course, don't operate according to VNM axioms, but most of LW thinks it's a bug to be fixed X-/

But VNM theory allows for bounded utility functions, so if we are designing an agent, why not design it with a bounded utility function?

It would systematically solve Pascal's Mugging, and more formally, it would prevent the expectations from ever becoming undefined.

VNM theory allows for bounded utility functions

Does it? As far as I know, all it says is that the utility function *exists*. Maybe it's bounded or maybe not -- VNM does not say.

It would systematically solve Pascal's Mugging

I don't think it would because the bounds are arbitrary and if you make them wide enough, Pascal's Mugging will still work perfectly well.

Does it? As far as I know, all it says is that the utility function exists. Maybe it's bounded or maybe not -- VNM does not say.

VNM's main theorem proves that if you have a set of preferences consistent with some requirements, then a utility function exists such that maximizing its expectation satisfies your preferences.

If you are designing an agent ex novo, you can choose a bounded utility function. This restricts the set of allowed preferences, in a way that essentially prevents Pascal's Mugging.

I don't think it would because the bounds are arbitrary and if you make them wide enough, Pascal's Mugging will still work perfectly well.

Yes, but if the expected utility for common scenarios is not very far from the bounds, then Pascal's Mugging will not apply.
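As a rough sketch of this point (the functional form and scale are my own illustrative choices, not anything from the thread): with a bounded utility whose bound sits close to everyday outcomes, no promise, however large, can outweigh a certain small cost.

```python
import math

SCALE = 100.0  # hypothetical: common outcomes sit near this scale

def u(lives):
    # A bounded utility: approaches 1 from below as lives grows.
    return 1 - math.exp(-lives / SCALE)

gain = 1e-21 * u(1e100)   # mugger: tiny probability times *capped* utility
cost = u(100) - u(99)     # a certain, mundane loss (~0.0037)
print(gain < cost)        # True: decline the mugger
```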

you can choose a bounded utility function. This restricts the set of allowed preferences

How does that work? VNM preferences are basically ordering or ranking. What kind of VNM preferences would be disallowed under a bounded utility function?

if the expected utility for common scenarios is not very far from the bounds, then Pascal's Mugging will not apply

Are you saying that you can/should set the bounds narrowly? You lose your ability to correctly react to rare events, then -- and black swans are VERY influential.

VNM preferences are basically ordering or ranking.

Only in the deterministic case. If you have uncertainty, this doesn't apply anymore: utility is invariant to positive affine transforms, not to arbitrary monotone transforms.

What kind of VNM preferences would be disallowed under a bounded utility function?

Any risk-neutral (or risk-seeking) preference in any quantity.

If you have uncertainty, this doesn't apply anymore

I am not sure I understand. Uncertainty in what? Plus, if you are going beyond the VNM Theorem, what is the utility function we're talking about, anyway?

I am not sure I understand. Uncertainty in what?

In the outcome of each action. If the world is deterministic, then all that matters is a preference ranking over outcomes. This is called ordinal utility.

If the outcomes for each action are sampled from some action-dependent probability distribution, then a simple ranking isn't enough to express your preferences. VNM theory allows you to specify a cardinal utility function, which is invariant only up to positive affine transform.

In practice this is needed to model common human preferences like risk-aversion w.r.t. money.

If the outcomes for each action are sampled from some action-dependent probability distribution, then a simple ranking isn't enough to express your preference.

Yes, you need risk tolerance / risk preference as well, but once we have that, aren't we already outside of the VNM universe?

No, risk tolerance / risk preference can be modeled with VNM theory.

Link?

Consistent risk preferences can be encapsulated in the shape of the utility function--preferring a certain $40 to a half chance of $100 and half chance of nothing, for example, is accomplished by a broad class of utility functions. Preferences on probabilities--treating 95% as different than midway between 90% and 100%--cannot be expressed in VNM utility, but that seems like a feature, not a bug.
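A minimal numerical sketch of that example (sqrt is an arbitrary choice from the broad class of concave utility functions mentioned):

```python
import math

# A concave utility prefers a certain $40 to a 50/50 gamble on $100:
certain = math.sqrt(40)                             # ~6.32
gamble = 0.5 * math.sqrt(100) + 0.5 * math.sqrt(0)  # 5.0
print(certain > gamble)  # True: risk aversion falls out of the curvature
```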

In principle, utility non-linear in money produces various amounts of risk aversion or risk seeking. However, this fundamental paper proves that observed levels of risk aversion cannot be thus explained. The results have been generalised here to a class of preference theories broader than expected utility.

However, this fundamental paper proves that observed levels of risk aversion cannot be thus explained.

This paper has come up before, and I still don't think it proves anything of the sort. Yes, if you choose crazy inputs a sensible function will have crazy outputs--why did this get published?

In general, prospect theory is a better descriptive theory of human decision-making, but I think it makes for a terrible normative theory relative to utility theory. (This is why I specified *consistent* risk preferences--yes, you can't express transaction or probabilistic framing effects in utility theory. As said in the grandparent, that seems like a feature, not a bug.)

Because here the default utility is the one specified by the Von Neumann-Morgenstern theorem and there is no requirement (or indication) that it is bounded.

Except, the VNM theorem in the form given applies to situations with finitely many possibilities. If there are infinitely many possibilities, then the generalized theorem does require bounded utility. This follows from precisely the Pascal's mugging-type arguments like the ones being considered here.

(And with finitely many possibilities, the utility function cannot possibly be unbounded, because any finite set of reals has a maximum.)

Except, the VNM theorem in the form given applies to situations with finitely many possibilities.

In the page cited a proof outline is given for the finite case, but the theorem itself has no such restriction, whether "in the form given" or, well, the theorem itself.

If there are infinitely many possibilities, then the generalized theorem does require bounded utility.

What are you referring to as the generalised theorem? Something other than the one that VNM proved? That certainly does not require or assume bounded utility.

This follows from precisely the Pascal's mugging-type arguments like the ones being considered here.

If you're referring to the issue in the paper that entirelyuseless cited, Lumifer correctly pointed out that it is outside the setting of VNM (and someone downvoted him for it).

The paper does raise a real issue, though, for the setting it discusses. Bounding the utility is one of several possibilities that it briefly mentions to salvage the concept.

The paper is also useful in clarifying the real problem of Pascal's Mugger. It is not that you will give all your money away to strangers promising 3^^^3 utility. It is that the calculation of utility in that setting is dominated by extremes of remote possibility of vast positive and negative utility, and nowhere converges.

Physicists ran into something of the sort in quantum mechanics, but I don't know if the similarity is any more than superficial, or if the methods they worked out to deal with it have any analogue here.

What are you referring to as the generalised theorem?

Try this:

Theorem: Using the notation from here, except we will allow lotteries to have infinitely many outcomes as long as the probabilities sum to 1.

If an ordering satisfies the four axioms of completeness, transitivity, continuity, and independence, and the following additional axiom:

Axiom (5): Let L = Sum(i=0...infinity, p_i M_i) with Sum(i=0...infinity, p_i) = 1. If N >= Sum(i=0...n, p_i M_i)/Sum(i=0...n, p_i) for all n, then N >= L. And similarly with the arrows reversed.

An agent satisfying axioms (1)-(5) has preferences given by a bounded utility function *u* such that L > M iff Eu(L) > Eu(M).

Edit: fixed formatting.

Axiom (5): Let L = Sum(i=0...infinity, p_i M_i) with Sum(i=0...infinity, p_i) = 1, and N >= Sum(i=0...n, p_i M_i)/Sum(i=0...n, p_i); then N >= L. And similarly with the arrows reversed.

That appears to be an axiom that probabilities go to zero enough faster than utilities that total utility converges (in a setting in which the sure outcomes are a countable set). It lacks something in precision of formulation (e.g. what is being quantified over, and in what order?) but it is fairly clear what it is doing. There's nothing like it in VNM's book or the Wiki article, though. Where does it come from?

Yes, in the same way that VNM's axioms are just what is needed to get affine utilities, an axiom something like this will give you bounded utilities. Does the axiom have any intuitive appeal, separate from it providing that consequence? If not, the axiom does not provide a justification for bounded utilities, just an indirect way of getting them, and you might just as well add an axiom saying straight out that utilities are bounded.

None of which solves the problem that entirelyuseless cited. The above axiom forbids the Solomonoff prior (for which p_i M_i grows with busy beaver fastness), but does not suggest any replacement universal prior.

That appears to be an axiom that probabilities go to zero enough faster than utilities that total utility converges (in a setting in which the sure outcomes are a countable set).

No, the axiom doesn't put any constraints on the probability distribution. It merely constrains preferences, specifically it says that preferences for infinite lotteries should be the 'limits' of the preference for finite lotteries. One can think of it as a slightly stronger version of the following:

Axiom (5'): Let L = Sum(i=0...infinity, p_i M_i) with Sum(i=0...infinity, p_i) = 1. Then if N >= M_i for all i, then N >= L. And similarly with the arrows reversed. (In other words, if N is preferred over every element of a lottery, then N is preferred over the lottery.)

In fact, I'm pretty sure that axiom (5') is strong enough, but I haven't worked out all the details.

It lacks something in precision of formulation (e.g. what is being quantified over, and in what order?)

Sorry, there were some formatting problems, hopefully it's better now.

(for which p_i M_i grows with busy beaver fastness)

The M_i's are lotteries that the agent has preferences over, not utility values. Thus it doesn't *a priori* make sense to talk about their growth rate.

I think I understand what the axiom is doing. I'm not sure it's strong enough, though. There is no guarantee that there is any N that is >= M_i for all i (or for all large enough i, a weaker version which I think is what is needed), nor an N that is <= them. But suppose there are such an upper Nu and a lower Nl, thus giving a continuous range between them of Np = p Nl + (1-p) Nu for all p in 0..1. There is no guarantee that the supremum of those p for which Np is a lower bound is equal to the infimum of those for which it is an upper bound. The axiom needs to stipulate that lower and upper bounds Nl and Nu exist, and that there is no gap in the behaviours of the family Np.

One also needs some axioms to the effect that a formal infinite sum Sum{i>=0: p_i M_i} actually behaves like one, otherwise "Sum" is just a suggestively named but uninterpreted symbol. Such axioms might be invariance under permutation, equivalence to a finite weighted average when only finitely many p_i are nonzero, and distribution of the mixture process to the components for infinite lotteries having the same sequence of component lotteries. I'm not sure that this is yet strong enough.

The task these axioms have to perform is to uniquely extend the preference relation from finite lotteries to infinite lotteries. It may be possible to do that, but having thought for a while and not come up with a suitable set of axioms, I looked for a counterexample.

Consider the situation in which there is exactly one sure-thing lottery M. The infinite lotteries, with the axioms I suggested in the second paragraph, can be identified with the probability distributions over the non-negative integers, and they are equivalent when they are permutations of each other. All of the distributions with finite support (call these the finite lotteries) are equivalent to M, and must be assigned the same utility, call it u. Take any distribution with infinite support, and assign it an arbitrary utility v. This determines the utility of all lotteries that are weighted averages of that one with M. But that won't cover all lotteries yet. Take another one and give it an arbitrary utility w. This determines the utility of some more lotteries. And so on. I don't think any inconsistency is going to arise. This allows for infinitely many different preference orderings, and hence infinitely many different utility functions.

The construction is somewhat analogous to constructing an additive function from reals to reals, i.e. one satisfying f(a+b) = f(a) + f(b). The only continuous additive functions are multiplication by a constant, but there are infinitely many non-continuous additive functions.

An alternative approach would be to first take any preference ordering consistent with the axioms, then use the VNM axioms to construct a utility function for that preference ordering, and then to impose an axiom about the behaviour of that utility function, because once we have utilities it's easy to talk about limits. The most straightforward such axiom would be to stipulate that U(Sum{i>=0: p_i M_i}) = Sum{i>=0: p_i U(M_i)}, where the sum on the right hand side is an ordinary infinite sum of real numbers. The axiom would require this to converge.

This axiom has the immediate consequence that utilities are bounded, for if they were not, then for any probability distribution {i>=0: p_i} with infinite support, one could choose a sequence of lotteries whose utilities grew fast enough that Sum{i>=0: p_i U(M_i)} would fail to converge.
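A quick numerical sketch of why convergence of that sum forces a bound (the probabilities and utilities here are my own illustrative choices, not anything from the thread):

```python
# Illustrative only: p_i = 2^-(i+1) and unbounded utilities U(M_i) = 2^i
# make every term p_i * U(M_i) equal to 1/2, so Sum p_i U(M_i) diverges.
# Capping the same utilities at any bound B restores convergence.

def partial_sum(utility, n_terms):
    """Partial sums of Sum_i p_i * utility(i), with p_i = 2^-(i+1)."""
    return sum(2 ** -(i + 1) * utility(i) for i in range(n_terms))

unbounded = lambda i: 2.0 ** i            # grows exactly as fast as p_i shrinks
bounded = lambda i: min(2.0 ** i, 100.0)  # same sequence, capped at B = 100

print(partial_sum(unbounded, 10))    # 5.0  (n/2: grows without limit)
print(partial_sum(unbounded, 1000))  # 500.0
print(partial_sum(bounded, 1000))    # ~4.28: converges
```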

Personally, I am not convinced that bounded utility is the way to go to avoid Pascal's Mugging, because I see no principled way to choose the bound. The larger you make it, the more Muggings you are vulnerable to, but the smaller you make it, the more low-hanging fruit you will ignore: substantial chances of stupendous rewards.

In one of Eliezer's talks, he makes a point about how bad an existential risk to humanity is. It must be measured not by the number of people who die in it when it happens, but the loss of a potentially enormous future of humanity spreading to the stars. That is the real difference between "only" 1 billion of us dying, and all 7 billion. If you are moved by this argument, you must see a substantial gap between the welfare of 7 billion people and that of however many 10^n you foresee if we avoid these risks. That already gives substantial headroom for Muggings.

I think I understand what the axiom is doing. I'm not sure it's strong enough, though. There is no guarantee that there is any N that is >= M_i for all i (or for all large enough i, a weaker version which I think is what is needed), nor an N that is <= them.

The M_i's can themselves be lotteries. The idea is to group events into finite lotteries so that the M_i's are >= N.

Personally, I am not convinced that bounded utility is the way to go to avoid Pascal's Mugging, because I see no principled way to choose the bound.

There is no principled way to choose utility functions either, yet people seem to be fine with them.

My point is that if one takes the VNM theory seriously as justification for having a utility function, the same logic means it must be bounded.

There is no principled way to choose utility functions either, yet people seem to be fine with them.

The VNM axioms are the principled way. That's not to say that it's a way I agree with, but it is a principled way. The axioms are the principles, codifying an idea of what it means for a set of preferences to be rational. Preferences are assumed given, not chosen.

My point is that if one takes the VNM theory seriously as justification for having a utility function, the same logic means it must be bounded.

Boundedness does not follow from the VNM axioms. It follows from VNM plus an additional construction of infinite lotteries, plus additional axioms about infinite lotteries such as those we have been discussing. Basically, if utilities are unbounded, then there are St. Petersburg-style infinite lotteries with divergent utilities; if all infinite lotteries are required to have defined utilities, then utilities are bounded.

This is indeed a problem. Either utilities are bounded, or some infinite lotteries have no defined value. When probabilities are given by algorithmic probability, the situation is even worse: if utilities are unbounded then no expected utilities are defined.

But the problem is not solved by saying, "utilities must be bounded then". Perhaps utilities must be bounded. Perhaps Solomonoff induction is the wrong way to go. Perhaps infinite lotteries should be excluded. (Finitists would go for that one.) Perhaps some more fundamental change to the conceptual structure of rational expectations in the face of uncertainty is called for.

The VNM axioms are the principled way.

They show that you must have a utility function, not what it should be.

Boundedness does not follow from the VNM axioms. It follows from VNM plus an additional construction of infinite lotteries, plus additional axioms about infinite lotteries such as those we have been discussing.

Well, the additional axiom is as intuitive as the VNM ones, and you need infinite lotteries if you are to model a world with infinite possibilities.

Perhaps Solomonoff induction is the wrong way to go.

This amounts to rejecting completeness. Suppose Omega offered to create a universe based on a Solomonoff prior; you'd have no way to evaluate this proposal.

The VNM axioms are the principled way.

They show that you must have a utility function, not what it should be.

Given your preferences, they do show what your utility function should be (up to affine transformation).

Well the additional axiom is as intuitive as the VNM ones, and you need infinite lotteries if you are too model a world with infinite possibilities.

You need some, but not all of them.

This amounts to rejecting completeness.

By completeness I assume you mean assigning a finite utility to every lottery, including the infinite ones. Why not reject completeness? The St. Petersburg lottery is plainly one that cannot exist. I therefore see no need to assign it any utility.

Bounded utility does not solve Pascal's Mugging, it merely offers an uneasy compromise between being mugged by remote promises of large payoffs and passing up unremote possibilities of large payoffs.

Suppose omega offered to create a universe based on a Solomonoff prior, you'd have to way to evaluate this proposal.

I don't care. This is a question I see no need to have any answer to. But why invoke Omega? The Solomonoff prior is already put forward by some as a universal prior, and it is already known to have problems with unbounded utility. As far as I know this problem is still unsolved.

Given your preferences, they do show what your utility function should be (up to affine transformation).

Assuming your preferences satisfy the axioms.

By completeness I assume you mean assigning a finite utility to every lottery, including the infinite ones.

No, by completeness I mean that for any two lotteries you prefer one over the other.

Why not reject completeness?

So why not reject it in the finite case as well?

The St. Petersburg lottery is plainly one that cannot exist.

Care to assign a probability to that statement.

So why not reject it in the finite case as well?

Actually, I would, but that's digressing from the subject of infinite lotteries. As I have been pointing out, infinite lotteries are outside the scope of the VNM axioms and need additional axioms to be defined. It seems no more reasonable to me to require completeness of the preference ordering over St. Petersburg lotteries than to require that all sequences of real numbers converge.

Care to assign a probability to that statement.

"True." At some point, probability always becomes subordinate to logic, which knows only 0 and 1. If you can come up with a system in which it's probabilities all the way down, write it up for a mathematics journal.

If you're going to cite this (which makes a valid point, but people usually repeat the password in place of understanding the idea), tell me what probability you assign to A conditional on A, to 1+1=2, and to an omnipotent God being able to make a weight so heavy he can't lift it.

"True." At some point, probability always becomes subordinate to logic, which knows only 0 and 1. If you can come up with a system in which it's probabilities all the way down, write it up for a mathematics journal.

Ok, so care to present an *a priori* pure logic argument for why St. Petersburg lottery-like situations can't exist.

Ok, so care to present an a priori pure logic argument for why St. Petersburg lottery-like situations can't exist.

Finite approximations to the St. Petersburg lottery have unbounded values. The sequence does not converge to a limit.

In contrast, a sequence of individual gambles with expectations 1, 1/2, 1/4, etc. does have a limit, and it is reasonable to allow the idealised infinite sequence of them a place in the set of lotteries.

You might as well ask why the sum of an infinite number of ones doesn't exist. There are ways of extending the real numbers with various sorts of infinite numbers, but they are extensions. The real numbers do not include them. The difficulty of devising an extension that allows for the convergence of all infinite sums is not an argument that the real numbers should be bounded.
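Both behaviours are easy to check numerically (a throwaway sketch using the standard St. Petersburg payoffs):

```python
# Partial sums for the two sequences above. Each St. Petersburg round
# contributes probability 2^-i times payoff 2^i, i.e. 1 per round, so the
# partial sums grow without bound; the gambles with expectations
# 1, 1/2, 1/4, ... converge to a limit of 2.

def st_petersburg_partial(n):
    # round i: probability 2^-i, payoff 2^i, so each term contributes 1
    return sum(2 ** -i * 2 ** i for i in range(1, n + 1))

def geometric_partial(n):
    # gambles with expectations 1, 1/2, 1/4, ...
    return sum(0.5 ** i for i in range(n))

print(st_petersburg_partial(100))  # 100.0: grows without bound as n does
print(geometric_partial(100))      # ~2.0: a well-defined limit
```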

Finite approximations to the St. Petersburg lottery have unbounded values. The sequence does not converge to a limit.

They have unbounded *expected values*, that doesn't mean the St. Petersburg lottery can't exist, only that its expected value doesn't.

If there are infinitely many possibilities, then the generalized theorem does require bounded utility.

I am not sure I understand. Link?

http://arxiv.org/pdf/0907.5598.pdf

Our main result implies that if you have an unbounded, perception determined, computable utility function, and you use a Solomonoff-like prior (Solomonoff, 1964), then you have no way to choose between policies using expected utility.

So, it's within the AIXI context and you feed your utility function infinite (!) sequences of "perceptions".

We're not in VNM land any more.

I think this simplifies. Not sure, but here's the reasoning:

L (or expected L) is a consequence of S, so not an independent parameter. If R=1, then is this median maximalisation? http://lesswrong.com/r/discussion/lw/mqa/median_utility_rather_than_mean/ It feels close to that, anyway.

I'll think some more...

This is just a complicated way of saying, "Let's use bounded utility." In other words, the fact that people don't want to take deals where they will overall expect to get nothing out of it (in fact), means that they don't value bets of that kind enough to take them. Which means they have bounded utility. Bounded utility is the correct response to PM.

This is just a complicated way of saying, "Let's use bounded utility."

But nothing about the approach implies that our utility functions would need to be bounded when considering deals involving non-PEST probabilities?

If you don't want to violate the independence axiom (which perhaps you did), then you will need bounded utility also when considering deals with non-PEST probabilities.

In any case, if you effectively give probability a lower bound, unbounded utility doesn't have any specific meaning. The whole point of double utility is that you would be willing to accept the doubled utility at half the probability. Once you won't accept it at half the probability (as will happen in your situation), there is no point in saying that something has twice the utility.

It's weird, but it's not quite the same as bounded utility (though it looks pretty similar). In particular, there's still a point in saying it has double the utility even though you sometimes won't accept it at half the probability. Note the caveat "sometimes": at other times, you will accept it.

Suppose event X has utility U(X) = 2 * U(Y). Normally, you'll accept it instead of Y at anything over half the probability. But if you reduce the probabilities of *both* events enough, that changes. If you simply had a bound on utility, you would get a different behavior: you'd always accept X at over half the probability of Y, for any P(Y), unless the utility of Y was too high. These behaviors are both fairly weird (except in the universe where there's no possible construction of an outcome with double the utility of Y, or the universe where you can't construct a sufficiently low probability for some reason), but they're not the same.
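The difference can be made concrete with a toy comparison (the threshold, the cap, and all the numbers are illustrative choices of mine; `eu_pest` is my name for the ignore-small-probabilities rule discussed in the post):

```python
EPS = 1e-6      # an illustrative "probability small enough to ignore"
BOUND = 1000.0  # an illustrative utility cap

def eu_pest(p, u):
    """Expected utility, treating outcomes below the probability threshold as worthless."""
    return p * u if p >= EPS else 0.0

def eu_bounded(p, u):
    """Expected utility with the utility capped at BOUND."""
    return p * min(u, BOUND)

# X is "double the utility of Y at half the probability".
# At ordinary probabilities and modest utilities, both agents are indifferent:
assert eu_pest(0.5, 10) == eu_pest(0.25, 20)
assert eu_bounded(0.5, 10) == eu_bounded(0.25, 20)

# The threshold agent switches once the halved probability dips below EPS:
assert eu_pest(1.5e-6, 10) > eu_pest(0.75e-6, 20)

# The bounded agent switches once the doubled utility exceeds the cap,
# at *any* probability:
assert eu_bounded(0.5, 900) > eu_bounded(0.25, 1800)
```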

Ok. This is mathematically correct, except that bounded utility means that if U(Y) is too high, U(X) cannot have a double utility, which means that the behavior is not so weird anymore. So in this case my question is why Kaj suggests his proposal instead of using bounded utility. Bounded utility will preserve the thing he seems to be mainly interested in, namely not accepting bets with extremely low probabilities, at least under normal circumstances, and it can preserve the order of our preferences (because even if utility is bounded, there are an infinite number of possible values for a utility.)

But Kaj's method will also lead to the Allais paradox and the like, which won't happen with bounded utility. This seems like undesirable behavior, so unless there is some additional reason why this is better than bounded utility, I don't see why it would be a good proposal.

So in this case my question is why Kaj suggests his proposal instead of using bounded utility.

Two reasons.

First, like was mentioned elsewhere in the thread, bounded utility seems to produce unwanted effects: we want utility to be linear in human lives, and bounded utility seems to fail that.

Second, the way I arrived at this proposal was that RyanCarey asked me what's my approach for dealing with Pascal's Mugging. I replied that I just ignore probabilities that are small enough, which seems to be thing that most people do in practice. He objected that that seemed rather ad-hoc and wanted to have a more principled approach, so I started thinking about *why* exactly it would make sense to ignore sufficiently small probabilities, and came up with this as a somewhat principled answer.

Admittedly, as a principled answer to which probabilities are actually small enough to ignore, this isn't all that satisfying of an answer, since it still depends on a rather arbitrary parameter. But it still seemed to point to some hidden assumptions behind utility maximization as well as raising some very interesting questions about what it is that we actually care about.

First, like was mentioned elsewhere in the thread, bounded utility seems to produce unwanted effects: we want utility to be linear in human lives, and bounded utility seems to fail that.

This is not quite what happens. When you do UDT properly, the result is that the Tegmark level IV multiverse has finite capacity for human lives (when human lives are counted with 2^-(Kolmogorov complexity) weights, as they should be). Therefore the "bare" utility function has some kind of diminishing returns, but the "effective" utility function is roughly linear in human lives once you take their "measure of existence" into account.

I consider it highly likely that bounded utility is the correct solution.

I agree that bounded utility implies that utility is not linear in human lives or in other similar matters.

But I have two problems with saying that we should try to get this property. First of all, no one in real life actually acts like it is linear. That's why we talk about scope insensitivity, because people don't treat it as linear. That suggests that people's real utility functions, insofar as there are such things, are bounded.

Second, I think it won't be possible to have a logically coherent set of preferences if you do that (at least combined with your proposal), namely because you will lose the independence property.

I agree that, insofar as people have something like utility functions, those are probably bounded. But I don't think that an AI's utility function should have the same properties as my utility function, or for that matter the same properties as the utility function of any human. I wouldn't want the AI to discount the well-being of me or my close ones simply because a billion *other* people are already doing pretty well.

Though ironically given my answer to your first point, I'm somewhat unconcerned by your second point, because humans probably don't have coherent preferences either, and still seem to do fine. My hunch is that rather than trying to make your preferences perfectly coherent, one is better off making a system for detecting sets of circular trades and similar exploits as they happen, and then making local adjustments to fix that particular inconsistency.

I edited this comment to include the statement that "bounded utility means that if U(Y) is too high, U(X) cannot have a double utility etc." But then it occurred to me that I should say something else, which I'm adding here because I don't want to keep changing the comment.

Evand's statement that "these behaviors are both fairly weird (except in the universe where there's no possible construction for an outcome with double the utility of Y, or the universe where you can't construct a sufficiently low probability for some reason)" implies a particular understanding of bounded utility.

For example, someone could say, "My utility is in lives saved, and goes up to 10,000,000,000." In this way he would say that saving 7 billion lives has a utility of 7 billion, saving 9 billion lives has a utility of 9 billion, and so on. But since he is bounding his utility, he would say that saving 20 billion lives has a utility of 10 billion, and so with saving any other number of lives over 10 billion.

This is definitely weird behavior. But this is not what I am suggesting by bounded utility. Basically I am saying that someone might bound his utility at 10 billion, but keep his order of preferences so that e.g. he would always prefer saving more lives to saving less.

This of course leads to something that could be considered scope insensitivity: a person will prefer a chance of saving 10 billion lives to a chance ten times as small of saving 100 billion lives, rather than being indifferent. But basically according to Kaj's post this is the behavior we were trying to get to in the first place, namely ignoring the smaller probability bet in certain circumstances. It does correspond to people's behavior in real life, and it doesn't have the "switching preferences" effect that Kaj's method will have when you change probabilities.
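A bound that preserves the order of preferences is easy to exhibit (the bound, the scale, and the exponential shape below are all arbitrary illustrative choices of mine):

```python
import math

B = 1.0   # the utility bound (arbitrary scale)
K = 5e9   # lives scale at which returns start saturating; illustrative

def u(lives):
    """Bounded but strictly increasing: approaches B, never reaches it."""
    return B * (1 - math.exp(-lives / K))

# The order of preferences is preserved: saving more lives is always better...
assert u(2e10) > u(1e10) > u(1e9)

# ...but a sure save of 10 billion beats a tenth the chance at 100 billion,
# where a linear utility would be exactly indifferent:
assert 1.0 * u(1e10) > 0.1 * u(1e11)
assert 1.0 * 1e10 == 0.1 * 1e11  # the linear comparison
```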

I think I agree that the OP does not follow independence, but everything else here seems wrong.

Actions A and B are identical except that A gives me 2 utils with .5 probability, while B gives me Graham's number with .5 probability. I do B. (Likewise if there are ~Graham's number of alternatives with intermediate payoffs.)

I'm not sure how you thought this was relevant to what I said.

What I was saying was this:

Suppose I say that A has utility 5, and B has utility 10. Basically the statement that B has twice the utility A has, has no particular meaning except that if I would like to have A at a probability of 10%, I would equally like to have B at a probability of 5%. If I would take the 10% chance and not the 5% chance, then there is no longer any meaning to saying that B has "double" the utility of A.

This does totally defy the intuitive understanding of expected utility. Intuitively, you can just set your utility function as whatever you want. If you want to maximize something like saving human lives, you can do that. As in one person dying is exactly half as bad as 2 people dying, which itself is exactly half as bad as 4 people dying, etc.

The justification for expected utility is that as the number of bets you take approaches infinity, it becomes the optimal strategy. You save more lives than you would with any other strategy. You lose some people on some bets, but gain many more on other bets.

But this justification and intuitive understanding totally breaks down in the real world. Where there are finite horizons. You don't get to take an infinite number of bets. An agent following expected utility will just continuously bet away human lives on mugger-like bets, without ever gaining anything. It will always do worse than other strategies.

You can do some tricks to maybe fix this somewhat by modifying the utility function. But that seems wrong. Why are 2 lives not twice as valuable as 1 life? Why are 400 lives not twice as valuable as 200 lives? Will this change the decisions you make in everyday bets, without mugger-like excessively low probabilities? It seems like it would.

Or we could just keep the intuitive justification for expected utility, and generalize it to work on finite horizons. There are some proposals for methods that do this, like mean of quantiles.
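"Mean of quantiles" is only named above; as a simplified stand-in for a quantile-based decision rule, here is a median rule (my own illustrative construction, with made-up numbers) that ignores a mugger-like bet which expected utility would take:

```python
# A mugger-like bet: pay 1 util for a 1e-9 chance at 1e20 utils.
# Expected utility says take it; the median of the outcome distribution
# says decline, because the typical outcome of taking the bet is a loss.

def median_outcome(outcomes):
    """Median utility of a lottery given as (probability, utility) pairs."""
    acc = 0.0
    for p, util in sorted(outcomes, key=lambda x: x[1]):
        acc += p
        if acc >= 0.5:
            return util

take_bet = [(1 - 1e-9, -1.0), (1e-9, 1e20 - 1.0)]
decline = [(1.0, 0.0)]

assert sum(p * util for p, util in take_bet) > 0       # EU: take the bet
assert median_outcome(take_bet) < median_outcome(decline)  # median: decline
```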

I don't think that "the justification for expected utility is that as the number of bets you take approach infinity, it becomes the optimal strategy," is quite true. Kaj did say something similar to this, but that seems to me a problem with his approach.

Basically, expected utility is supposed to give a mathematical formalization to people's preferences. But consider this fact: in itself, it does not have any particular sense to say that "I like vanilla ice cream twice as much as chocolate." It makes sense to say I like it more than chocolate. This means if I am given a choice between vanilla and chocolate, I will choose vanilla. But what on earth does it mean to say that I like it "twice" as much as chocolate? In itself, nothing. We have to define this in order to construct a mathematical analysis of our preferences.

In practice we make this definition by saying that I like vanilla so much that I am indifferent between having chocolate for sure, and having a 50% chance of vanilla and a 50% chance of nothing.

Perhaps I justify this by saying that it will get me a certain amount of vanilla in my life. But perhaps I don't - the definition does not justify the preference, it simply says what it means. This means that in order to say I like it twice as much, I have to say that I am indifferent to the 50% bet and to the certain chocolate, no matter what the justification for this might or might not be. If I change my preference when the number of cases goes down, then it will not be mathematically consistent to say that I like it twice as much as chocolate, unless we change the definition of "like it twice as much."

Basically I think you are mixing up things like "lives", which can be mathematically quantified in themselves, more or less, and people's preferences, which only have a quantity if we define one.

It may be possible for Kaj to come up with a new definition for the amount of someone's preference, but I suspect that it will result in a situation basically the same as keeping our definition, but admitting that people have only a limited amount of preference for things. In other words, they might prefer saving 100,000 lives to saving 10,000 lives, but they certainly do not prefer it 10 times as much, meaning they will not always accept the 100,000 lives saved at a 10% chance, compared to a 100% chance of saving 10,000.
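The definition being discussed, where the strength of a preference is fixed by an indifference probability, can be written down directly (the helper name is mine and the numbers are illustrative):

```python
def utility_ratio(indifference_prob):
    """If I'm indifferent between B for sure and a gamble giving A with
    probability q (nothing otherwise), then by definition u(A)/u(B) = 1/q."""
    return 1.0 / indifference_prob

# "Liking vanilla twice as much as chocolate" just *means* indifference
# between chocolate for sure and vanilla at a 50% chance:
assert utility_ratio(0.5) == 2.0

# If the stated indifference point shifts when the stakes grow (taking
# 10,000 lives for sure over a 10% shot at 100,000), then no single ratio
# as large as 10 is consistent with those preferences.
assert utility_ratio(0.1) == 10.0
```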

But what on earth does it mean to say that I like it "twice" as much as chocolate?

Obviously it means you would be willing to trade 2 units of chocolate ice cream for 1 unit of vanilla. And over the course of your life, you would prefer to have more vanilla ice cream than chocolate ice cream. Perhaps before you die, you will add up all the ice creams you've ever eaten. And you would prefer for that number to be higher rather than lower.

Nowhere in the above description did I talk about probability. And the utility function is already completely defined. I just need to decide on a decision procedure to maximize it.

Expected utility seems like a good choice, because, over the course of my life, different bets I make on ice cream should average themselves out, and I should do better than otherwise. But that might not be true if there are ice cream muggers. Which promise lots of ice cream in exchange for a down payment, but usually lie.

So trying to convince the ice cream maximizer to follow expected utility is a lost cause. They will just end up losing all their ice cream to muggers. They need a system which ignores muggers.

This is definitely not what I mean if I say I like vanilla twice as much as chocolate. I might like it twice as much even though there is no chance that I can ever eat more than one serving of ice cream. If I have the choice of a small serving of vanilla or a triple serving of chocolate, I might still choose the vanilla. That does not mean I like it three times as much.

It is not about "How much ice cream." It is about "how much wanting".

I'm saying that the experience of eating chocolate is objectively twice as valuable. Maybe there is a limit on how much ice cream you can eat at a single sitting. But you can still choose to give up eating vanilla today and tomorrow, in exchange for eating chocolate once.

Again, you are assuming there is a quantitative measure over eating chocolate and eating vanilla, and that this determines the measure of my utility. This is not necessarily true, since these are arbitrary examples. I can still value one twice as much as the other, even if they both are experiences that can happen only once in a lifetime, or even only once in the lifetime of the universe.

Sure. But there is still some internal, objective measure of value of experiences. The constraints you add make it harder to determine what they are. But in simple cases, like trading ice cream, it's easy to determine how much value a person has for a thing.

This has nothing to do with bounded utility. Bounded utility means you don't care about any utilities above a certain large amount. Like if you care about saving lives, and you save 1,000 lives, after that you just stop caring. No amount of lives after that matters at all.

This solution allows for unbounded utility. Because you can always care about saving more lives. You just won't take bets that could save huge numbers of lives, but have very very small probabilities.

This isn't what I meant by bounded utility. I explained that in another comment. It refers to utility as a real number and simply sets a limit on that number. It does not mean that at any point "you just stop caring."

If your utility has a limit, then you can't care about anything past that limit. Even a continuous limit doesn't work, because you care less and less about obtaining more utility as you get closer to it. You would treat a 50% chance of saving 2 people the same as a guaranteed save of 1 person, but not a 50% chance of saving 2,000 people the same as a guaranteed 1,000.

Yes, that would be the effect in general, that you would be less willing to take chances when the numbers involved are higher. That's why you wouldn't get mugged.

But that still doesn't mean that "you don't care." You still prefer saving 2,000 lives to saving 1,000, whenever the chances are equal; your preference for the two cases does not suddenly become equal, as you originally said.

If utility is strictly bounded, then past the bound you literally do not care whether you save 1,000 lives or 2,000.

You *can* fix that with an asymptote. Then you do have a preference for 2,000. But the preference is only very slight. You wouldn't take a 1% risk of losing 1,000 people, to save 2,000 people otherwise. Even though the risk is very small and the gain is huge.

So it does fix Pascal's mugging, but causes a whole new class of issues.

Your understanding of "strictly bounded" is artificial, and not what I was talking about. I was talking about assigning a strict, numerical bound to utility. That does not prevent having an infinite number of values underneath that bound.

It would be silly to assign a bound and a function low enough that "You wouldn't take a 1% risk of losing 1,000 people, to save 2,000 people otherwise," if you meant this literally, with these values.

But it is easy enough to assign a bound and a function that result in the choices we actually make in terms of real world values. It is true that if you increase the values enough, something like that will happen. And that is exactly the way real people would behave, as well.

Your understanding of "strictly bounded" is artificial, and not what I was talking about. I was talking about assigning a strict, numerical bound to utility. That does not prevent having an infinite number of values underneath that bound.

Isn't that the same as an asymptote, which I talked about?

It would be silly to assign a bound and a function low enough that "You wouldn't take a 1% risk of losing 1,000 people, to save 2,000 people otherwise," if you meant this literally, with these values.

You can set the bound wherever you want. It's arbitrary. My point is that if you ever reach it, you start behaving weird. It is not a very natural fix. It creates other issues.

It is true that if you increase the values enough, something like that will happen. And that is exactly the way real people would behave, as well.

Maybe human utility functions are bounded. Maybe they aren't. We don't know for sure. Assuming they are is a big risk. And even if they are bounded, it doesn't mean we should put that into an AI. If, somehow, it ever runs into a situation where it can help 3^^^3 people, it really should.

"If, somehow, it ever runs into a situation where it can help 3^^^3 people, it really should."

I thought the whole idea behind this proposal was that the probability of this happening is essentially zero.

If you think this is something with a reasonable probability, you should accept the mugging.

You were speaking about bounded utility functions. Not bounded probability functions.

The whole point of the Pascal's mugger scenario is that these scenarios *aren't* impossible. Solomonoff induction halves the probability of each hypothesis based on how many additional bits it takes to describe. This means the probability of different models decreases fairly rapidly. But not as rapidly as functions like 3^^^3 grow. So there are hypotheses that describe things that are 3^^^3 units large in *much* fewer than log(3^^^3) bits.

So the utility of hypotheses can grow *much* faster than their probability shrinks.
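For a sense of scale, here is a sketch of Knuth's up-arrow notation (the recursion is the standard definition; the function names are mine):

```python
# Knuth up-arrow notation: each additional arrow iterates the previous
# operation. The program describing 3^^^3 is tiny, but its value is a
# power tower of 3s whose height is itself 3^27 = 7,625,597,484,987.
def up(a, b, arrows):
    if arrows == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up(a, result, arrows - 1)
    return result

print(up(3, 3, 1))  # 3^3 = 27
print(up(3, 3, 2))  # 3^^3 = 3^(3^3) = 7625597484987
# up(3, 3, 3) would be 3^^^3: a tower of 7.6 trillion 3s -- far beyond
# anything computable here, yet described in a handful of bits.
```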

If you think this is something with a reasonable probability, you should accept the mugging.

Well the probability isn't *reasonable*. It's just not as unreasonably small as 3^^^3 is big.

But yes you could bite the bullet and say that the expected utility is so big, it doesn't matter what the probability is, and pay the mugger.

The problem is, expected utility *doesn't even converge*. There is a hypothesis that paying the mugger saves 3^^^3 lives. And there's an even more unlikely hypothesis that not paying him will save 3^^^^3 lives. And an even more complicated hypothesis that he will really save 3^^^^^3 lives. Etc. The expected utility of every action grows to infinity, and never converges on any finite value. More and more unlikely hypotheses totally dominate the calculation.
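A toy model (with made-up numbers standing in for actual Solomonoff weights) shows the partial sums running away:

```python
# Toy model of the divergence: hypothesis k costs k extra bits, so its
# probability is 2**-k, but its promised utility grows doubly-exponentially
# (a stand-in for towers like 3^^^3, which grow far faster still).
def expected_utility_partial_sum(n):
    return sum(2**-k * 2**(2**k) for k in range(1, n + 1))

for n in range(1, 6):
    print(n, expected_utility_partial_sum(n))
# The partial sums grow without bound: the ever-more-unlikely hypotheses
# dominate the calculation, so the expectation never converges.
```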

Solomonoff induction halves the probability of each hypothesis based on how many additional bits it takes to describe.

See, I told everyone that people here say this.

Fake muggings with large numbers are more profitable to the mugger than fake muggings with small numbers because the fake mugging with the larger number is more likely to convince a naive rationalist. And the profitability depends on the size of the number, not the number of bits in the number. Which makes the likelihood of a large number being fake grow *faster* than the number of bits in the number.

You are solving the specific problem of the mugger, and not the general problem of tiny bets with huge rewards.

Regardless, there's no way the probability decreases faster than the reward the mugger promises grows. I don't think you can assign 1/3^^^3 probability to anything. That's an unfathomably small probability. You are literally saying there is no amount of evidence the mugger could give you to convince you otherwise. Even if he showed you his matrix powers, and the computer simulation of 3^^^3 people, you still wouldn't believe him.

How could he show you "the computer simulation of 3^^^3 people"? What could you do to verify that 3^^^3 people were really being simulated?

You probably couldn't verify it. There's always the possibility that any evidence you see is made up. For all you know you are just in a computer simulation and the entire thing is virtual.

I'm just saying he can show you evidence which increases the probability. Show you the racks of servers, show you the computer system, explain the physics that allows it, let you do the experiments that show that physics is correct. You could solve any NP-complete problem on the computer. And you could run programs that take known numbers of steps to compute. Like actually calculating 3^^^3, etc.

Sure. But I think there are generally going to be more parsimonious explanations than any that involve him having the power to torture 3^^^3 people, let alone having that power *and caring about whether I give him some money*.

Parsimonious, sure. The possibility is very unlikely. But it doesn't just need to be "very unlikely", it needs to have smaller than 1/3^^^3 probability.

Sure. But if you have an argument that some guy who shows me apparent magical powers has the power to torture 3^^^3 people with probability substantially over 1/3^^^3, then I bet I can turn it into an argument that *anyone*, with or without a demonstration of magical powers, with or without any sort of claim that they have such powers, has the power to torture 3^^^3 people with probability nearly as substantially over 1/3^^^3. Because surely for anyone under any circumstances, Pr(I experience what seems to be a convincing demonstration that they have such powers) is much larger than 1/3^^^3, whether they actually have such powers or not.

Sure. But if you have an argument that some guy who shows me apparent magical powers has the power to torture 3^^^3 people with probability substantially over 1/3^^^3, then I bet I can turn it into an argument that anyone, with or without a demonstration of magical powers, with or without any sort of claim that they have such powers, has the power to torture 3^^^3 people with probability nearly as substantially over 1/3^^^3.

Correct. That still doesn't solve the decision theory problem, it makes it worse. Since you have to take into account the possibility that anyone you meet might have the power to torture (or reward with utopia) 3^^^3 people.

I don't think you can assign 1/3^^^3 probability to anything. That's an unfathomably small probability.

About as unfathomably small as the number of 3^^^3 people is unfathomably large?

I think you're relying on "but I **feel** this can't be right!" a bit too much.

I don't see what your point is. Yes that's a small number. It's not a feeling, that's just math. If you are assigning things 1/3^^^3 probability, you are basically saying they are impossible and no amount of evidence could convince you otherwise.

You can do that and be perfectly consistent. If that's your point I don't disagree. You can't argue about priors. We can only agree to disagree, if those are your true priors.

Just remember that reality could always say "WRONG!" and punish you for assigning 0 probability to something. If you don't want to be wrong, don't assign 1/3^^^3 probability to things you aren't 99.9999...% sure absolutely can't happen.

Eliezer showed a problem with that reasoning in his post on Pascal's Muggle.

Basically, human beings do not have an actual prior probability distribution. This should be obvious, since it means assigning a numerical probability to every possible state of affairs. No human being has ever done this, or ever will.

But you have something like a prior, but you build the prior itself based on your experience. At the moment we don't have a specific number for the probability of the mugging situation coming up, but just think it's very improbable, so that we don't expect any evidence to ever come up that would convince us. But if the mugger shows matrix powers, we would change our prior so that the probability of the mugging situation was high enough to be convinced by being shown matrix powers.

You might say that means it was already that high, but it does not mean this, given the objective fact that people do not have real priors.

Maybe humans don't really have probability distributions. But that doesn't help us actually build an AI which reproduces the same result. If we had infinite computing power and could do ideal Solomonoff induction, it would pay the mugger.

Though I would argue that humans do have approximate probability functions and approximate priors. We wouldn't be able to function in a probabilistic world if we didn't. But it's not relevant.

But if the mugger shows matrix powers, we would change our prior so that the probability of the mugging situation was high enough to be convinced by being shown matrix powers.

That's just a regular Bayesian probability update! You don't need to change terminology and call it something different.

At the moment we don't have a specific number for the probability of the mugging situation coming up, but just think it's very improbable, so that we don't expect any evidence to ever come up that would convince us.

That's fine. I too think the situation is *extraordinarily* implausible. Even Solomonoff induction would agree with us. The probability that the mugger is real would be something like 1/10^100. Or perhaps the exponent should be orders of magnitude larger than that. That's small enough that it shouldn't even remotely register as a plausible hypothesis in your mind. But big enough some amount of evidence could convince you.

You don't need to posit new models of how probability theory should work. Regular probability works fine at assigning really implausible hypotheses really low probability.

But that is still way, way bigger than 1/3^^^3.

So there are hypotheses that describe things that are 3^^^3 units large in much fewer than log(3^^^3) bits.

So the utility of hypotheses can grow much faster than their probability shrinks.

If utility is straightforwardly additive, yes. But perhaps it isn't. Imagine two possible worlds. In one, there are a billion copies of our planet and its population, all somehow leading exactly the same lives. In another, there are a billion planets like ours, with different people on them. Now someone proposes to blow up one of the planets. I find that I feel less awful about this in the first case than the second (though of course either is awful) because what's being lost from the universe is something of which we have a billion copies anyway. If we stipulate that the destruction of the planet is instantaneous and painless, and that the people really are living exactly identical lives on each planet, then actually I'm not sure I care very much that one planet is gone. (But my feelings about this fluctuate.)

A world with 3^^^3 inhabitants that's described by (say) no more than a billion bits seems a little like the first of those hypothetical worlds.

I'm not very sure about this. For instance, perhaps the description would take the form: "Seed a good random number generator as follows. [...] Now use it to generate 3^^^3 person-like agents in a deterministic universe with such-and-such laws. Now run it for 20 years." and maybe you can get 3^^^3 genuinely non-redundant lives that way. But 3^^^3 is a very large number, and I'm not even quite sure there's such a thing as 3^^^3 genuinely non-redundant lives even in principle.

Well what if instead of killing them, he tortured them for an hour? Death might not matter in a Big World, but total suffering still does.

I dunno. If I imagine a world with a billion identical copies of me living identical lives, having all of them tortured doesn't seem a billion times worse than having one tortured. Would an AI's experiences matter more if, to reduce the impact of hardware error, all its computations were performed on ten identical computers?

What if any of the Big World hypotheses are true? E.g. many worlds interpretation, multiverse theories, Tegmark's hypothesis, or just a regular infinite universe. In that case anything that can exist does exist. There already are a billion versions of you being tortured. An infinite number actually. All you can ever really do is reduce the probability that you will find yourself in a good world or a bad one.

Bounded utility functions effectively give "bounded probability functions," in the sense that you (more or less) stop caring about things with very low probability.

For example, if my maximum utility is 1,000, then my maximum utility for something with a probability of one in a billion is .000001, an extremely small utility, so something that I will care about very little. The probability of the 3^^^3 scenarios may be more than one in 3^^^3. But it will still be small enough that a bounded utility function won't care about situations like that, at least not to any significant extent.

That is precisely the reason that it will do the things you object to, if that situation comes up.

That is no different from pointing out that the post's proposal will reject a "mugging" even when it will actually cost 3^^^3 lives.

Both proposals have that particular downside. That is not something peculiar to mine.

Bounded utility functions mean you stop caring about things with very high utility. That you care less about certain low probability events is just a side effect. But those events can also have very high probability and you still don't care.

If you want to just stop caring about really low probability events, why not just do that?

I just explained. There is no situation involving 3^^^3 people which will ever have a high probability. Telling me I need to adopt a utility function which will handle such situations well is trying to mug me, because such situations will never come up.

Also, I don't care about the difference between 3^^^^^3 people and 3^^^^^^3 people even if the probability is 100%, and neither does anyone else. So it isn't true that I just want to stop caring about low probability events. My utility is actually bounded. That's why I suggest using a bounded utility function, like everyone else does.

There is no situation involving 3^^^3 people which will ever have a high probability.

Really? No situation? Not even if we discover new laws of physics that allow us to have infinite computing power?

Telling me I need to adopt a utility function which will handle such situations well is trying to mug me, because such situations will never come up.

We are talking about utility functions. Probability is irrelevant. All that matters for the utility function is that *if* the situation came up, you would care about it.

Also, I don't care about the difference between 3^^^^^3 people and 3^^^^^^3 people even if the probability is 100%, and neither does anyone else.

I totally disagree with you. These numbers are so incomprehensibly huge you can't picture them in your head, sure. There is massive scope insensitivity. But if you had to make moral choices that affect those two numbers of people, you should always value the bigger number proportionally more.

E.g. if you had to torture 3^^^^^3 people to save 3^^^^^^3 people from getting dust specks in their eyes. Or make bets involving probabilities between various things happening to the different groups. Etc. I don't think you can make these decisions correctly if you have a bounded utility function.

If you don't make them correctly, well, those 3^^^3 people probably contain a basically infinite number of copies of you. By making the correct tradeoffs, you maximize the probability that the other versions of yourself find themselves in a universe with higher utility.

What if one considers the following approach: Let e be a probability small enough that, if I were to accept all bets offered to me with probability p <= e, then the expected number of such bets that I win is less than one. The approach is to ignore any bet where p <= e.

This solves Yvain's problem with wearing seatbelts or eating unhealthy, for example. It also means that "sub-dividing" a risk no longer changes whether you ignore the risk.
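A minimal sketch of this rule (the bet counts and risk figures below are made-up illustrations, not real actuarial numbers):

```python
# Sketch of the proposed rule: ignore a bet if its probability p is so
# small that, across all N bets of that kind you expect to face, the
# expected number of wins N * p stays below one.
def should_ignore(p, lifetime_bets):
    return lifetime_bets * p < 1

# Hypothetical numbers: ~10,000 car trips, each with a 1-in-10-million
# fatality risk, gives N * p = 0.001 -- individually ignorable...
print(should_ignore(1e-7, 10_000))        # True
# ...but lumped together with every other daily micro-risk (say, a
# hundred million comparable exposures), the category is not ignorable.
print(should_ignore(1e-7, 100_000_000))   # False
```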

I'm probably way late to this thread, but I was thinking about this the other day in response to a different thread, and considered using the Kelly criterion to address something like Pascal's Mugging.

Trying to figure out your current 'bankroll' in terms of utility is probably open to interpretation, but for some broad estimates, you could probably use your assets, or your expected free time, or some utility function that includes those plus whatever else.

When calculating optimal bet size using the Kelly criterion, you end up with a percentage of your current bankroll you should bet. This percentage will never exceed the probability of the event occurring, regardless of the size of the reward. This basically means that if I'm using my current net worth as an approximation for my 'bankroll', I shouldn't even consider betting a dollar on something I think has a one-in-a-million chance, unless my net worth is at least a million dollars.

I think this could be a bit more formalized, but might help serve as a rule-of-thumb for evaluating Pascal's Wager type scenarios.
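The bound is easy to check numerically (a sketch, assuming the standard b-to-1 payout form of the Kelly formula):

```python
# Kelly criterion: for a bet paying b-to-1 with win probability p, the
# optimal fraction of bankroll to stake is f* = p - (1 - p) / b,
# floored at zero (never bet when the edge is negative).
def kelly_fraction(p, b):
    return max(0.0, p - (1 - p) / b)

# The fraction never exceeds p, no matter how large the payout b gets:
p = 1e-6
for b in (10, 1e6, 1e100):
    f = kelly_fraction(p, b)
    print(b, f, f <= p)
```

So even as the promised payout grows toward Pascal's-Mugging scale, the stake a Kelly bettor is willing to risk stays capped by the tiny probability itself.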

I think the mugger can modify their offer to include "...and I will offer you this deal X times today, so it's in your interest to take the deal every time," where X is sufficiently large, and the amount requested in each individual offer is tiny but calibrated to add up to the amount that the mugger wants. If the odds are a million to one, then to gain $1000, the mugger can request $0.001 a million times.

Rolling all 60 years of bets up into one probability distribution as in your example, we get:

- 0.999999999998 chance of -1 billion * cost-per-bet
- (1 - 0.999999999998 - epsilon) chance of 10^100 lives - 1 billion * cost-per-bet
- epsilon chance of n * 10^100 lives, etc.

I think what this shows is that the aggregating technique you propose is no different than just dealing with a 1-shot bet. So if you can't solve the one-shot Pascal's mugging, aggregating it won't help in general.

**[deleted]**· 2015-09-16T23:17:41.275Z · score: 1 (1 votes) · LW(p) · GW(p)

The way I see it, if one believes that the range of possible extremes of positive or negative values is greater than the range of possible probabilities then one would have reason to treat rare high/low value events as more important than more frequent events.

**[deleted]**· 2015-09-16T15:49:42.532Z · score: 1 (1 votes) · LW(p) · GW(p)

Would that mean that if I expect to use transport n times over the next m years, with probability p of dying during a commute, and I want to calculate the PEST of, for example, fatal poisoning from canned food f, which I estimate could happen about t times during the same m years, I have to lump the two dangers together and see if the total is still < 1? I mean, I can work from home and never eat canned food... But this doesn't seem to be what you write about when you talk about different deals.

(Sorry for possibly stupid question.)

Some more extreme possibilities on the lifespan problem: Should you figure in the possibility of life extension? The possibility of immortality?

What about Many Worlds? If you count alternate versions of yourself as you, then low probability bets make more sense.

This is an interesting heuristic, but I don't believe that it answers the question of, "What should a rational agent do here?"

The reasoning for why one should rely on expected value in one-off situations can be used to circumvent the article's conclusion. It is mentioned in the article, but I would like to raise it explicitly.

If I personally have a 0.1 chance of getting a high reward within my lifetime, then 10 persons like me would, on average, hit the jackpot once.

Or, in reverse: if one takes the conclusion seriously, one needs to start rejecting one-offs, because there isn't sufficient repetition to tend to the mean. You could say that value is personal, and thus the relevant repetition class is a lifetime's decisions. But if we take the value at stake to be "human value", then the relevant repetition class is choices made by Homo sapiens (and possibly beyond).

Say someone offers to create 10^100 happy lives in exchange for something, and you assign them a 0.000000000000000000001 probability to them being capable and willing to carry through their promise. Naively, this has an overwhelmingly positive expected value.

If the stated probability is what you really assign then yes, positive expected value.

I see the key flaw in that the more exceptional the promise is, the lower the probability you must assign to it.

Would you give more credibility to someone offering you 10^2 US$ or 10^7 US$?

I see the key flaw in that the more exceptional the promise is, the lower the probability you must assign to it.

According to common LessWrong ideas, lowering the probability based on the exceptionality of the promise would mean lowering it based on the Kolmogorov complexity of the promise.

If you do that, you won't lower the probability enough to defeat the mugging.

If you can lower the probability more than that, of course you can defeat the mugging.

**[deleted]**· 2015-09-17T00:47:01.295Z · score: 1 (1 votes) · LW(p) · GW(p)

If you can lower the probability more than that, of course you can defeat the mugging. And one of the key problems with lowering it more is that it becomes really really hard to update when you get evidence that the mugging is real.

If you do that, you won't lower the probability enough to defeat the mugging.

If you do that, your decision system just breaks down, since the expectation over arbitrary integers with probabilities computed by Solomonoff induction is undefined. That's the reason why AIXI uses bounded rewards.

As an alternative to probability thresholds and bounded utilities (already mentioned in the comments), you could constrain the epistemic model such that for any state and any candidate action, the probability distribution of utility is light-tailed.

The effect is similar to a probability threshold: the tails of the distribution don't dominate the expectation, but this way it is "softer" and more theoretically principled, since light-tailed distributions, like those in the exponential family, are, in a certain sense, "natural".