Testing lords over foolish lords: gaming Pascal's mugging

post by Stuart_Armstrong · 2013-05-07T18:47:07.412Z · LW · GW · Legacy · 18 comments

There are two separate reasons to reject Pascal's mugger's demands. The first is that your system of priors, or your method of updating, precludes you from going along with the deal. The second is that if it becomes known that you accept Pascal's mugging situations, people are going to seek you out and take advantage of you.

I think it's useful to keep the two reasons very separate. If Pascal's mugger were a force of nature - a new theory of physics, maybe - then the case for keeping to expected utility maximisation may be quite strong. But when there are opponents, everything gets much more complicated - which is why game theory has thousands of published research papers, while expected utility maximisation is taught in passing in other subjects.

But does this really affect the argument? It means that someone approaching you with a Pascal's mugging today is much less likely to be honest (and much more likely to have simply read about it on Less Wrong). But that's a relatively small shift in probability, in an area where the numbers are already so huge/tiny.

Nevertheless, it seems that "reject Pascal's muggings (and other easily exploitable gambles)" may be a reasonable position to take, even if you agreed with the expected utility calculation. First, of course, you gain by rejecting all the human attempts to exploit you. But there's another dynamic: the "Lords of the Matrix" are players too. They propose certain deals to you for certain reasons, and fail to propose them to you for other reasons. We can model three kinds of lords:

  1. The foolish lords, who will offer a Pascal's mugging no matter what they predict your reaction will be.
  2. The sadistic lords, who will offer a deal you won't accept.
  3. The testing lords, who will offer a deal you will accept, but push you to the edge of your logic and value system.

Precommitting to rejecting the mugging burns you only with the foolish lords. The sadistic lords won't offer an acceptable deal anyway, and the testing lords will offer you a better deal if you've made such a precommitment. So the trade-off is a loss with (some of) the foolish lords versus a gain with the testing lords. Depending on your probability distribution over the lord types, this can be a reasonable thing to do, even if you would accept the impersonal version of the mugging.
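As a purely illustrative sketch (not from the post), here is one way the comparison could be set out, with made-up priors over the lord types and made-up payoffs; every number below is a hypothetical placeholder:

```python
# Illustrative sketch only: all priors and payoffs are made up.
# "precommit" = publicly commit to rejecting Pascal's-mugging-style offers.

priors = {"foolish": 0.2, "sadistic": 0.3, "testing": 0.5}  # hypothetical

payoffs = {
    # Foolish lords offer the mugging regardless; accepting has some expected
    # value (tiny probability, huge payoff), while precommitting forfeits it.
    "foolish":  {"accept": 10.0, "precommit": 0.0},
    # Sadistic lords only offer deals you won't take, so both policies get nothing.
    "sadistic": {"accept": 0.0, "precommit": 0.0},
    # Testing lords tailor the offer: a precommitted agent is offered a better deal.
    "testing":  {"accept": 5.0, "precommit": 20.0},
}

def expected_value(policy):
    """Expected utility of a policy under the prior over lord types."""
    return sum(priors[lord] * payoffs[lord][policy] for lord in priors)

for policy in ("accept", "precommit"):
    print(policy, expected_value(policy))
# With these made-up numbers, precommitting wins: 10.0 versus 4.5.
```

Whether precommitment comes out ahead depends entirely on the numbers you plug in, which is the point: it is a bet on the distribution of lord types, not a theorem.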

18 comments

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-08T05:14:42.880Z · LW(p) · GW(p)

Note that in Bostrom's version and my revised version, the Mugger is offering a positive trade, not making a threat. Isn't it great if more and more people offer you a googolplex expected utilons for $5? :)

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2013-05-08T07:35:14.184Z · LW(p) · GW(p)

Isn't it better if these people offered you even more utilons, at better odds? :)

But I do think we should keep the impersonal mugging separate - real people have lots of instincts that kick in for the original model with an actual "mugger"/beggar. I would be tempted to dismiss the original mugger out of hand, but I realised, to my surprise, that I wasn't so quick and eager to do so if the same problem was presented impersonally.

comment by Shmi (shminux) · 2013-05-07T23:46:51.969Z · LW(p) · GW(p)

Let us remember that Pascal's mugging is almost never an issue for humans, who instinctively ignore possibilities they consider to be of low probability. I mean subjective probability only. For example, people who are afraid of flying alieve in the high probability of a plane crash happening to them, so there is no Pascal's mugging there. Same with lottery players. Or the original Pascal's wager. No normal person will give $5 to the mugger in exchange for the promise not to create and torture a bazillion simulated humans. Well, maybe some hapless LW reader thinking that they might be in a simulation would.

The issue only arises for an AGI, where you have to balance calculated infinitesimal odds against calculated enormous payoffs/penalties - because only an AGI would bother calculating (and be able to calculate) them properly to begin with.

Repeat conning is not an issue if you are an AGI. Neither are the Matrix Lords. And precommitting to rejecting the mugging is what humans already do naturally, so your suggestion has a rather low surprise value :)

An interesting issue is the one pointed out by Eliezer, where the odds are increased enormously by the provided non-anthropic evidence, but are still infinitesimally small.

Replies from: somervta
comment by somervta · 2013-05-10T04:50:30.878Z · LW(p) · GW(p)

For example, people tho are afraid

"tho" should be "who".

Replies from: shminux
comment by Shmi (shminux) · 2013-05-10T07:02:13.465Z · LW(p) · GW(p)

Thanks.

comment by endoself · 2013-05-08T06:54:13.396Z · LW(p) · GW(p)

If Pascal's mugger was a force of nature - a new theory of physics, maybe - then the case for keeping to expected utility maximisation may be quite strong.

There's still the failure of convergence. If the theory that made you think that it would be a good idea to accept Pascal's mugging tells you to sum an infinite series, and that infinite series diverges, then the theory is wrong.
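A standard illustration of the kind of divergence at issue (my example, not one taken from the discussion): suppose outcome $n$ has probability $2^{-n}$ and utility $3^n$. Then the expected utility is

$$\sum_{n=1}^{\infty} 2^{-n} \cdot 3^{n} = \sum_{n=1}^{\infty} \left(\frac{3}{2}\right)^{n} = \infty,$$

so the sum the theory asks you to compute is not defined, and the theory gives no recommendation at all.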

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2013-05-08T07:41:52.863Z · LW(p) · GW(p)

The convergence can be solved using the arguments I presented in: http://lesswrong.com/lw/giu/naturalism_versus_unbounded_or_unmaximisable/

Essentially, take advantage of the fact that we are finite state probabilistic machines (or analogous to that), and therefore there is a maximum to the number of choices we can expect to make. So our option set is actually finite (though brutally large).

Replies from: endoself
comment by endoself · 2013-05-08T08:10:47.821Z · LW(p) · GW(p)

I'm referring to an infinity of possible outcomes, not an infinity of possible choices. This problem still applies if the agent must pick from a finite list of actions.

Specifically, I'm referring to the problem discussed in this paper, which is mostly the same problem as Pascal's mugging.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2013-05-08T09:07:55.278Z · LW(p) · GW(p)

Interesting problem, thanks! I personally felt that there could be a good case made for insisting your utility be bounded, and that paper's an argument in that direction.

Replies from: endoself, Wei_Dai
comment by endoself · 2013-05-09T02:03:29.323Z · LW(p) · GW(p)

Pascal's mugging is less of a problem if your utility function is bounded, and it completely goes away if the bound is reasonably low, since then there just isn't any amount of utility that would outweigh the improbability of the mugger being truthful.
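To spell out the arithmetic (a sketch, not a quote from the comment): if utility is bounded by some constant $B$ and you assign probability $p$ to the mugger being truthful, then the offer is worth at most $p \cdot B$ in expectation. Whenever

$$p \cdot B < u(\$5),$$

which holds for any reasonably low bound $B$ and the minuscule $p$ one actually assigns, the mugging is rejected no matter how large a number the mugger names.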

comment by Wei Dai (Wei_Dai) · 2013-05-08T23:14:32.936Z · LW(p) · GW(p)

Weren't you working on ways to compare infinite/divergent expectations? I'm confused that you're now writing as if the problem is new to you...

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2013-05-09T09:46:18.921Z · LW(p) · GW(p)

I was working on a way to do that - but I'm also aware that not all divergent expectations can be compared, and so that there might be a case to avoid using unbounded utilities.

comment by [deleted] · 2013-05-07T19:47:49.566Z · LW(p) · GW(p)

the impersonal version of the mugging

To make sure I understand this correctly: is one example of this essentially a Pascal's Mugging collection box - the box has a note on it with a Pascal's Mugging aimed at those who read it, but it's an inanimate box, so you can't actually interact with it or ask any clarifying questions?

Replies from: ModusPonies, ArisKatsaris, Stuart_Armstrong
comment by ModusPonies · 2013-05-07T21:23:56.891Z · LW(p) · GW(p)

I interpreted this to mean a situation where the mugging isn't being offered by an agent, but instead is simply a fact about the universe. For example, if you're the first person to think of cryonics*, then precommitments don't matter. Either the universe is such that cryonics will work, or it is not, and game theory doesn't enter into it.

*Assume for the sake of the example that cryonics has an infinitesimal chance of working and that the payoff of revival is nigh-uncountably huge. (I believe neither of these things.)

Replies from: None
comment by [deleted] · 2013-05-08T13:20:42.908Z · LW(p) · GW(p)

Thank you for the clarification. I'm still confused about something, and to explain where I was getting stuck: I think it may have been the decider's prediction of how the Mugger (or the expected fact of the universe) would respond to the question "Can you show me more evidence that you are a Matrix Lord (or a fact of the universe set up with comparable probability and utility)?"

For instance, the Mugger might say:

A: "Sure, let me open a firey portal in the sky."

B: "Let me call the next several coinflips you toss."

C: "No, you'll just have to judge based on the current evidence."

D: "You were correct to question me, this was actually a scam."

E: "You question me? The offer is now invalidated and/or I have killed those people."

F: "You can't investigate this right now because your evidence gathering abilities are too low, but you could use these techniques to increase your maximum evidence gathering abilities. With sufficient repeated application, you would be able to investigate the original problem."

On the other hand, a fact of the universe may be such that:

A: Further investigation leads to sudden dramatic shifts, such as fiery portals.

B: Further investigation leads to more evidence that it's right, but nothing dramatic.

C: Further investigation leads nowhere new. You'll have to decide on current evidence.

D: Further investigation shows worrying about this was a waste of time.

E: Further investigation causes you to lose the opportunity: it was time-sensitive.

F: Further investigation leads you to better investigative techniques, but you still can't actually investigate the original problem. Perhaps you should try again?

And I was thinking "If it's impersonal and simple, such as the box, maybe you're stuck with C. But Foolish, Sadistic, or Testing Lords may give you anywhere from A to F." (A Testing Lord in particular seems likely to give you scenario F.)

However, from your, Stuart_Armstrong's, and ArisKatsaris's replies, this is not actually the area currently of concern; but I'm still somewhat confused about which position A-F I should be taking - whether it is just irrelevant to the problem and all would be handled the same, or whether some/each represents an entirely separate scenario that should be handled on its own.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2013-05-08T16:49:47.669Z · LW(p) · GW(p)

Maybe material for a further post...

comment by ArisKatsaris · 2013-05-08T11:43:27.902Z · LW(p) · GW(p)

No, it really just doesn't have to be a statement that someone else provides at all. From the perspective of a pure Bayesian agent, Bob telling you "I'm a Matrix Lord" is merely evidence that updates (not necessarily in a positive direction) the probability of the pre-existing hypothesis "Bob is a Matrix Lord".

And Bob telling you "If you built a temple to worship this rock, 3^^^3 lives will find happiness" is merely Bayesian evidence to update the probability of the pre-existing hypothesis "If I built a temple to worship this rock, 3^^^3 lives will find happiness" - a hypothesis that a mind can construct all by itself; it doesn't need another mind to construct it for it.

The problem is the probability you assign to the hypothesis, not that someone else provided you with the hypothesis. Such explicit statements made by others are barely significant at all; as evidence they're probably near worthless. If I wanted to find potential Matrix Lords, I'd probably have better luck focusing on the people who fart the least or have had the fewest cases of diarrhea, rather than the people who say "I'm a Matrix Lord." :-)
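In symbols (a standard Bayesian gloss on the point, not ArisKatsaris's own formula): with $H$ = "Bob is a Matrix Lord" and $E$ = "Bob says he is a Matrix Lord",

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)},$$

and whether this is higher or lower than $P(H)$ depends only on the likelihood ratio $P(E \mid H)/P(E \mid \neg H)$: if non-Matrix-Lords are more prone to making the claim than genuine Matrix Lords, the announcement is evidence against the hypothesis.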

comment by Stuart_Armstrong · 2013-05-08T07:30:44.885Z · LW(p) · GW(p)

Another example would be a new theory of physics, maybe one that would allow the creation of/access to parallel worlds, and where you had the opportunity to contribute towards the development of said theory.