Pascal's mugging and Bayes

dmytry

post by Dmytry · 2012-03-30T19:55:44.917Z · LW · GW · Legacy · 14 comments

Suppose that your prior probability that giving $1000 to a stranger will save precisely N beings is P($1000 saves N beings) = f(N), where f is some probability distribution over N.

When the stranger claims that he will torture N beings unless you give him the $1000, Bayes' rule says the probability has to be updated to

P($1000 saves N beings | asking for $1000 to save N beings) = f(N) * P(asking for $1000 to save N beings | $1000 saves N beings) / P(asking for $1000 to save N beings)

The probability is increased by a factor of P(asking for $1000 to save N beings | $1000 saves N beings) / P(asking for $1000 to save N beings) <= 1 / P(asking for $1000 to save N beings), because the numerator is a probability and so cannot exceed 1.

If you are attending philosophical events and being Pascal-mugged by a philosopher, 1/P(asking for $1000 to save N beings) can be less than 100. Being asked then raises the probability by at most a factor of 100 over your f(N). If only one person in the world had come up with Pascal's mugging, the factor would be at most a few billion.
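The bound above can be sketched numerically. This is only an illustration: the function name `posterior_upper_bound` and the example prior of 1e-30 are assumptions made for the sketch, not values from the post; the two values of P(asking) correspond to the "factor of 100" and "a few billion" cases.

```python
def posterior_upper_bound(prior: float, p_ask: float) -> float:
    """Upper bound on P(saves N | asked), using P(asked | saves N) <= 1.

    prior  -- f(N), the prior probability that $1000 saves N beings
    p_ask  -- P(asking for $1000 to save N beings)
    """
    if not 0.0 < p_ask <= 1.0:
        raise ValueError("p_ask must be in (0, 1]")
    # A probability can never exceed 1, so the bound is clipped there.
    return min(1.0, prior / p_ask)


# Hypothetical prior, chosen only to show the scale of the update.
example_prior = 1e-30

# Mugged by a philosopher at a philosophy event: factor of at most 100.
bound_philosopher = posterior_upper_bound(example_prior, 1 / 100)

# Only one person in a world of a few billion would ever ask.
bound_lone_mugger = posterior_upper_bound(example_prior, 1 / 5e9)

print(bound_philosopher, bound_lone_mugger)
```

Whatever the prior is, the request itself can only multiply it by 1/P(asking), so a sufficiently small f(N) stays small after the update.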

edit: Note (it may not be very clear from the post) that if your f(N) is not small enough, not only should you let yourself be Pascal-mugged, you should also give money to a random stranger who did not even Pascal-mug you, unless the utility of the mugging is very close to $1000.

I think it is fairly clear that it is reasonable to have an f(N) that decreases monotonically with N. It also has to sum to 1, which implies that it has to fall off faster than 1/N. So f(3^^^3) is much, much smaller than 1/(3^^^3). Anyone who does not adopt such a prior is not only prone to being Pascal-mugged; they should be running around screaming 'take my money and please don't torture 3^^^3 beings' at random people.
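The step from "sums to 1" to "falls off faster than 1/N" can be spelled out as a short derivation. This is a sketch under two assumptions that the post does not state explicitly: N ranges over the positive integers, and f is non-increasing.

```latex
% Each of the first N terms is at least f(N), since f is non-increasing:
\[
  N\, f(N) \;\le\; \sum_{k=1}^{N} f(k) \;\le\; \sum_{k=1}^{\infty} f(k) \;=\; 1
  \quad\Longrightarrow\quad
  f(N) \;\le\; \frac{1}{N}.
\]
% The block of terms from ceil(N/2) to N has at least N/2 terms, each >= f(N),
% and it is part of the tail of a convergent series, so it vanishes:
\[
  \frac{N}{2}\, f(N) \;\le\; \sum_{k=\lceil N/2 \rceil}^{N} f(k)
  \;\longrightarrow\; 0 \quad (N \to \infty)
  \quad\Longrightarrow\quad
  N f(N) \;\to\; 0 .
\]
```

The first line gives the hard bound f(3^^^3) <= 1/(3^^^3); the second shows that f(N) is eventually smaller than any fixed multiple of 1/N.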

[Of course there is still a problem if one assigns the prior probability of N via Kolmogorov complexity, but it seems to me that it doesn't make much sense to do so, as such an f won't be monotonically decreasing.]

Another issue is the claim of 'more than 3^^^3 beings', but any reasonable f(N) seems to eat up that sum as well: the total probability of the whole tail above 3^^^3 is still tiny.

This highlights a practically important problem with the use of probabilistic reasoning in decision making. A proposition may be pulled out of an immensely large space of similar propositions, which should give it an appropriately small prior. But we typically don't know the competing propositions, especially when the proposition was passed from person to person, and we substitute 'do we trust that person' for the original statement. One needs to be very careful when trying to be rational and abandoning intuitions: it is very difficult to transform word problems into mathematical problems, and this operation itself relies on intuitions. One can thus easily make a gross mistake that one's intuitions correctly veto, even though they provide only a very vague hint along the lines of "anyone can make this claim".

While typing this up I found a post that goes into greater detail on the issue.

(This sort of outgrew the reply I wanted to post in the other thread)

14 comments

Comments sorted by top scores.

comment by CarlShulman · 2012-03-31T00:16:27.749Z · LW(p) · GW(p)

So to summarize: conditional on it being possible to produce huge amounts of stuff, e.g. happy or unhappy human minds, there are likely to be more effective ways (in expectation) of using money to produce huge amounts of desired stuff than handing it to muggers.

comment by Manfred · 2012-03-30T22:42:39.301Z · LW(p) · GW(p)

I think it is fairly clear that it is reasonable to have an f(N) that decreases monotonically with N

Lots of reasonable-seeming things are false. Why is this particular thing true? That seems to be the key question.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-30T23:29:53.719Z · LW(p) · GW(p)

Show a problem where a non-monotonically decreasing f(N) escapes some issue. You really can choose whatever f(N) you want (subject to it summing to 1), but in Pascal's mugging, if your f(N) is sufficiently stupid (e.g. zero everywhere except for 1 at N=3^^^3), you'll pay the money. A non-monotonically decreasing f(N) leads to cases where claiming 1 extra person raises your probability that the claim is true. Of course, an agent can be made to act this way; an agent can be made that would pay; nobody is proving that an agent that would pay can't exist.

One can hypothesise a case in which there is some simple way of simulating the people that produces exactly some particular number of people. An agent that receives a claim with this number would be inclined to believe it, because a false claimant wouldn't know the number, and the agent would then have to pay any demand (the number acting as a password). That, however, is not part of Pascal's mugging. One can also come up with a universe where it is, in fact, best to pay.

comment by DanielLC · 2012-03-31T01:32:02.444Z · LW(p) · GW(p)

Being asked then raises the probability by at most a factor of 100 over your f(N). If only one person in the world had come up with Pascal's mugging, the factor would be at most a few billion.

What's the prior probability? Probably somewhere above 1/3^^^^3. The expected negative utility is enormous, and his threat makes it more so.

I've noticed that if you have a problem with Pascal's mugging, you most likely have an even bigger problem in that expected value does not converge. There could be 3^^^3 utility, or -3^^^3, or 3^^^^3, or -3^^^^3, etc. Multiply these by the probabilities (it won't make much of a difference) and add them, and the sum won't converge. It likely won't even increase/decrease without limit, though you can intentionally make it do either by adding them in different orders. In fact, you could make it converge to any number, if you divide up the probabilities right.
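The claim that the order of summation can steer the total to any number is the behaviour described by the Riemann rearrangement theorem for conditionally convergent series. A minimal sketch follows, using the alternating harmonic series as a stand-in; that series, the function name `rearranged_partial_sum`, and the targets are assumptions for illustration, not the utilities discussed in the comment.

```python
from itertools import count


def rearranged_partial_sum(target: float, n_terms: int) -> float:
    """Greedily reorder the terms of 1 - 1/2 + 1/3 - 1/4 + ... toward `target`.

    Positive terms (1, 1/3, 1/5, ...) and negative terms (-1/2, -1/4, ...)
    are each used in their original order, but interleaved differently:
    add a positive term while at or below the target, otherwise a negative one.
    """
    positive_terms = (1.0 / k for k in count(1, 2))
    negative_terms = (-1.0 / k for k in count(2, 2))
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += next(positive_terms)
        else:
            total += next(negative_terms)
    return total


# In its usual order the series sums to ln(2), about 0.693.
# Reordered, the same terms approach whatever target is chosen.
print(rearranged_partial_sum(2.0, 100_000))
print(rearranged_partial_sum(-1.0, 100_000))
```

Every term of the original series is eventually used exactly once, so this is a genuine reordering and not a different series.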

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-31T06:31:41.700Z · LW(p) · GW(p)

What's the prior probability? Probably somewhere above 1/3^^^^3

Then you have to give money to a random person who didn't even Pascal-mug you, too.

I don't see why it would be somewhere above 1/3^^^^3. The f(N) has to sum to 1 (or less than 1). In any case, imo the issue with Pascal's mugging is that when we are told something, we automatically assign it a much, much higher probability than we assigned before, as a sort of reflex.

For someone who has never been told of Pascal's mugging before, the f(N) is probably really small, something along the lines of the whole of it summing to zero, as evidenced by them not shouting 'take my money please and don't torture 3^^^^3 people' at random strangers. Then when we are told something we have to assign some nonzero prior probability, and there we fail, because there may not be enough information for a sensible prior.

Replies from: DanielLC

↑ comment by DanielLC · 2012-03-31T19:24:28.986Z · LW(p) · GW(p)

I don't see why it would be somewhere above 1/3^^^^3.

Because it's a round number. The universe essentially is a program. Just as a much shorter program could be written to output 3^^^^3 than a similarly large number, a much simpler universe can kill 3^^^^3 people than a similarly large number.

If I typed a computer program at random, and it was long enough that it could output 3^^^^3, and I calculated the expected output given that it actually stops, then it would be dominated by such large numbers.

In the case that he threatens you with a random number of similar magnitude, it works out differently. In this case, the prior is crazy low, but he has a similarly low chance of giving that exact number, so the probability goes way up.

Then you have to give money to a random person who didn't even Pascal-mug you, too.

Before it gets multiplied, it's about as likely that he would kill the people if I don't give him money.

It's not exactly as likely, so my choices would still be dominated by things like this. Except, as I already stated, I have the even bigger problem of not even being able to calculate expected value.

Eliezer didn't tell us this with the intention of convincing us to submit to muggers. We all know it gives insane results. The problem is coming up with and justifying a decision method that doesn't give insane results for this reason.

For someone who has never been told of Pascal's mugging before, the f(N) is probably really small, something along the lines of the whole of it summing to zero, as evidenced by them not shouting 'take my money please and don't torture 3^^^^3 people' at random strangers.

Perhaps they're not acting rationally. For one thing, they're not logically omniscient. Maybe they didn't think of it.

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-31T19:26:33.591Z · LW(p) · GW(p)

Because it's a round number. The universe essentially is a program. Just as a much shorter program could be written to output 3^^^^3 than a similarly large number, a much simpler universe can kill 3^^^^3 people than a similarly large number.

I doubt it. Let's suppose our universe is close to being the simplest universe. How likely is it that our universe tortures a 'round number' of beings? The number of beings that our universe (or the quantum many-worlds universe) tortures is probably only encoded in a short way by simulating the universe and counting. That's probably the roundest way to express it. You would need some very bizarre and complex laws of physics to make the universe produce a number of tortured beings that is round in more than one way.

Meanwhile a made up number (or a number that is a product of faulty reasoning) is much more likely to be round.

Replies from: DanielLC

↑ comment by DanielLC · 2012-04-01T02:54:37.998Z · LW(p) · GW(p)

You would need some very bizarre and complex laws of physics

Very bizarre compared to what we have. Not very bizarre compared to how big a number 3^^^^3 is. Once you have those basic laws, you can make it as big a number as you want.

The mugger may have the ability to kill an arbitrary number of people. If he does, he can kill 3^^^^3 people as easily as he can express it. Him killing that many people and him making that number up will be similar in likelihood.

Replies from: Dmytry

↑ comment by Dmytry · 2012-04-01T06:29:43.488Z · LW(p) · GW(p)

The point is that our universe already codes some specific number of beings that are tortured, without stacking on some extra laws. That specific number would look utterly random to anyone who is not simulating the universe. Furthermore, the issue with the informal use of complexities is... consider a simple short program that iterates over every program and runs it for 3^^^^3 steps. Somewhere along the road, this includes stuff that you would deem to have high complexity.

Replies from: DanielLC

↑ comment by DanielLC · 2012-04-01T06:36:35.631Z · LW(p) · GW(p)

The point is that our universe already codes some specific number of beings that are tortured, without stacking on some extra laws.

It doesn't matter what our universe does. What matters is what some universe does that has a probability well over 1/3^^^^3. Stacking on extra laws can get you a universe like that, without lowering its probability that much.

Replies from: Dmytry

↑ comment by Dmytry · 2012-04-01T06:38:50.048Z · LW(p) · GW(p)

I'm getting sick of the informal use of complexities. Indexing 3^^^^3 beings in the universe (I mean, somehow listing their addresses) can have complexity greater than 3^^^^3. If you don't care about the complexity of indexing, then all Kolmogorov complexities greater than that of a program which iterates through all programs and runs them for an infinite number of steps each (ha, I can do that if I choose the right language) are equal. That short program produces all the universes, and all the beings, and everything.

Replies from: DanielLC

↑ comment by DanielLC · 2012-04-01T18:58:53.391Z · LW(p) · GW(p)

Suppose the universe allows for hypercomputers (presumably this has a finite likelihood, and it won't be proportional to whatever number someone puts into one later on), but building one is hard enough that it doesn't happen naturally. At some point, a sapient species evolves, and a member builds a hypercomputer. He simulates a universe on it, in a program called the Matrix. At some point, just for kicks, he contacts someone inside the Matrix and threatens to use his powers from outside the Matrix to kill 3^^^^3 (a number easy to make up) people if they don't give him five dollars. If they don't, he writes a program that can create people, and sets it to randomly create and kill 3^^^^3 of them.

Each step of this is unlikely. The unlikelihood multiplies with each successive step. At no point does it even vaguely begin to approach 1/3^^^^3.
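The scale of that comparison can be sketched on a log scale. The step probabilities below are purely hypothetical placeholders (twenty steps at 1e-10 each), and the comparison uses 3^^4 = 3^(3^27), which is vastly smaller than 3^^^^3, because the larger number cannot be represented even through its logarithm.

```python
import math

# Hypothetical probabilities for each step of the story (not from the thread).
step_probabilities = [1e-10] * 20

# log10 of the joint probability of all steps, assuming they simply multiply.
log10_story = sum(math.log10(p) for p in step_probabilities)

# 3^^4 = 3^(3^(3^3)) = 3^(3^27); log10(3^^4) = 3^27 * log10(3).
log10_three_tetrated_4 = (3 ** 27) * math.log10(3)
log10_inverse_three_tetrated_4 = -log10_three_tetrated_4

print(f"log10 P(story)  ~ {log10_story:.0f}")
print(f"log10 (1/3^^4)  ~ {log10_inverse_three_tetrated_4:.3e}")
```

Under these made-up inputs the story has probability around 10^-200, while even 1/3^^4 is around 10^-(3.6 trillion), and 1/3^^^^3 is smaller still.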

comment by Oscar_Cunningham · 2012-03-30T21:06:51.560Z · LW(p) · GW(p)

[Of course there is still a problem if one assigns the prior probability of N via Kolmogorov complexity, but it seems to me that it doesn't make much sense to do so, as such an f won't be monotonically decreasing.]

I think this is the problem that most people are worried about. What's more likely, that a big business is sold for $1,000,000,000 or that it's sold for $945,234,567?

Replies from: Dmytry

↑ comment by Dmytry · 2012-03-30T21:10:22.585Z · LW(p) · GW(p)

What is more likely, that a big business is worth $1,000,000,000 or that it is worth $945,234,567 ? ;)

We round off the numbers; the $1,000,000,000 figure accumulates some of the values from a nearby range.

Intuitively, the complexity of generating and torturing this many beings is so enormous that the change in Kolmogorov complexity due to their sheer count is unimportant. (Plus I, personally, do not count duplicates, so the N falls well short of 10^(10^30).)

edit: For example, the lowest-complexity way to torture really many beings could be to simply implement our universe. The number of beings here really is unlikely to be anything as obvious as 3^^^^3, and probably can't be generated in any other way than by running the universe yourself and counting. edit: Or it might well be uncountably infinite. In any case, for Kolmogorov complexity to serve as a prior, the super-universe needs infinite computing power. And so do you, to calculate that prior.