# A simple game that has no solution

post by James_Miller · 2014-07-20T18:36:54.636Z · LW · GW · Legacy · 124 comments

The following simple game has one solution that seems correct, but isn’t. Can you figure out why?

**The Game**

Player One moves first. He must pick A, B, or C. If Player One picks A, the game ends and Player Two does nothing. If Player One picks B or C, Player Two will be told that Player One picked B or C, but will not be told which of these two strategies Player One picked. Player Two must then pick X or Y, and then the game ends. The following shows the players’ payoffs for each possible outcome. Player One’s payoff is listed first.

A 3,0 [And Player Two never got to move.]

B,X 2,0

B,Y 2,2

C,X 0,1

C,Y 6,0

The players are rational, each player cares only about maximizing his own payoff, the players can’t communicate, they play the game only once, this game is all that will ever matter to them, and all of this plus the payoffs and the game structure is common knowledge.

Guess what will happen. Imagine you are really playing the game and decide what you would do as either Player One, or as Player Two if you have been told that you will get to move. To figure out what you would do you must formulate a belief about what the other player has done or will do, and this will in part be based on your belief about his belief about what you have done or will do.

**An Incorrect Argument for A**

If Player One picks A he gets 3, whereas if he picks B he gets 2 regardless of what Player Two does. Consequently, Player One should never pick B. If Player One picks C he might get 0 or 6 so we can’t rule out Player One picking C, at least without first figuring out what Player Two will do.

Player Two should assume that Player One will never pick B. Consequently, if Player Two gets to move he should assume that C was played and therefore Player Two should respond with X. If Player One believes that Player Two will, if given the chance to move, pick X, then Player One is best off picking A. In conclusion, Player One will pick A and Player Two will never get to move.
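As a sanity check on the two steps above, here is a minimal Python sketch (the table layout and helper names are illustrative assumptions, not anything from the post). It confirms only the mechanical claims: A pays Player One more than B whatever Player Two does, and X is Player Two's best reply *if* he is certain C was played. Whether he is entitled to that certainty is a separate question.

```python
# Payoffs (Player One, Player Two) as listed in the post.
PAYOFFS = {
    ("B", "X"): (2, 0),
    ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1),
    ("C", "Y"): (6, 0),
}
PAYOFF_A = (3, 0)  # A ends the game; Player Two never moves


def a_strictly_dominates(move):
    """True if A pays Player One more than `move` against every reply."""
    return all(PAYOFF_A[0] > PAYOFFS[(move, reply)][0] for reply in "XY")


def best_reply_if_certain(move):
    """Player Two's best reply when he is sure which move was played."""
    return max("XY", key=lambda reply: PAYOFFS[(move, reply)][1])


print(a_strictly_dominates("B"))   # does A beat B against both replies?
print(a_strictly_dominates("C"))   # does A beat C against both replies?
print(best_reply_if_certain("C"))  # Player Two's reply if sure of C
```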

**Why the Game Has No Solution**

I believe that the above logic is wrong, and indeed the game has no solution. My reasoning is given in rot13. (Copy the text below and paste it into a rot13 decoder to convert it to English.)

Vs gur nobir nanylfvf jrer pbeerpg Cynlre Gjb jbhyq oryvrir ur jvyy arire zbir. Fb jung unccraf vs Cynlre Gjb qbrf trg gb zbir? Vs Cynlre Gjb trgf gb zbir jung fubhyq uvf oryvrs or nobhg jung Cynlre Bar qvq tvira gung Cynlre Gjb xabjf Cynlre Bar qvq abg cvpx N? Cynlre Gjb pna’g nffhzr gung P jnf cynlrq. Vs vg jrer gehr gung vg’f pbzzba xabjyrqtr gung Cynlre Bar jbhyq arire cynl O, gura vg fubhyq or pbzzba xabjyrqtr gung Cynlre Gjb jbhyq arire cynl L, juvpu jbhyq zrna gung Cynlre Bar jbhyq arire cynl P, ohg pyrneyl Cynlre Bar unf cvpxrq O be P fb fbzrguvat vf jebat.

Zber nofgenpgyl, vs V qrirybc n gurbel gung lbh jba’g gnxr npgvba Y, naq guvf arprffnevyl erfhygf va gur vzcyvpngvba gung lbh jba’g qb npgvba Z, gura vs lbh unir pyrneyl qbar rvgure Y be Z zl bevtvany gurbel vf vainyvq. V’z abg nyybjrq gb nffhzr gung lbh zhfg unir qbar Z whfg orpnhfr zl vavgvny cebbs ubyqvat gung lbh jba’g qb Y gbbx srjre fgrcf guna zl cebbs sbe jul lbh jba’g qb Z qvq.

Abar vs guvf jbhyq or n ceboyrz vs vg jrer veengvbany sbe Cynlre Bar gb abg cvpx N. Nsgre nyy, V unir nffhzrq engvbanyvgl fb V’z abg nyybjrq gb cbfghyngr gung Cynlre Bar jvyy qb fbzrguvat veengvbany. Ohg vg’f veengvbany sbe Cynlre Bar gb Cvpx P bayl vs ur rfgvzngrf gung gur cebonovyvgl bs Cynlre Gjb erfcbaqvat jvgu L vf fhssvpvragyl ybj. Cynlre Gjb’f zbir jvyy qrcraq ba uvf oryvrsf bs jung Cynlre Bar unf qbar vs Cynlre Bar unf abg cvpxrq N. Pbafrdhragyl, jr pna bayl fnl vg vf veengvbany sbe Cynlre Bar gb abg cvpx N nsgre jr unir svtherq bhg jung oryvrs Cynlre Gjb jbhyq unir vs Cynlre Gjb trgf gb cynl. Naq guvf oryvrs bs Cynlre Gjb pna’g or onfrq ba gur nffhzcgvba gung Cynlre Bar jvyy arire cvpx O orpnhfr guvf erfhygf va Cynlre Gjb oryvrivat gung Cynlre Bar jvyy arire cvpx P rvgure, ohg pyrneyl vs Cynlre Gjb trgf gb zbir rvgure O be P unf orra cvpxrq.

Va fhz, gb svaq n fbyhgvba sbe gur tnzr jr arrq gb xabj jung Cynlre Gjb jbhyq qb vs ur trgf gb zbir, ohg gur bayl ernfbanoyr pnaqvqngr fbyhgvba unf Cynlre Gjb arire zbivat fb jr unir n pbagenqvpgvba naq V unir ab vqrn jung gur evtug nafjre vf. Guvf vf n trareny ceboyrz va tnzr gurbel jurer n fbyhgvba erdhverf svthevat bhg jung n cynlre jbhyq qb vs ur trgf gb zbir, ohg nyy gur ernfbanoyr fbyhgvbaf unir guvf cynlre arire zbivat.

Update: Emile has a great answer if you assume a "trembling hand."

## 124 comments

Comments sorted by top scores.

## comment by Manfred · 2014-07-21T07:29:54.310Z · LW(p) · GW(p)

Consider the Prisoner's Dilemma, modified so that one person moves first and the other person gets to observe their move before choosing.

Obviously the classically correct first move is to defect first. Thus the second player will never have to deal with a move of Cooperate.

Therefore if a move of Cooperate *is* made, the second player's move is classically undefined (if one accepts the logic of this post). And yet, if both players play cooperate it's better than (D,D), and so which move gets made first depends on the actions of the second player if Cooperate is played first. Therefore, this prisoner's dilemma has no solution.

(I consider this to be a reductio).

Viewed this way, there's an obvious relationship with the formal-agent problem of "I can prove what option is best, and I know I'll take the best option - therefore, if I do something else all logical statements are conditioning on a falsehood, and so it's true that I can get the best results by doing nothing." The solution there is to not use logical conditioning inside the decision-making process like that, and instead use causal insertion.

Similarly, we might never expect the second player in an ordered Prisoner's Dilemma to have to deal with cooperation. But we can still talk about that counterfactual by declaring by fiat that the first move was cooperation and looking at the choices that result. Note the similarity between causal insertion and the trembling hand - almost like this trembling hand stuff works for a deeper reason. If our second PD player is an ordinary classical agent, they will choose Defect - problem resolved.
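To make the "declare the first move by fiat" idea concrete, here is a small sketch of backward induction on a sequential Prisoner's Dilemma. The payoff numbers are assumed purely for illustration (the comment gives none), and the function names are hypothetical. The second mover's reply is computed at both nodes, including the Cooperate node that is never reached on the predicted path.

```python
# Hypothetical payoffs (first mover, second mover); assumed for illustration only.
PD = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def second_mover_reply(first_move):
    """Best reply once the first move is fixed by fiat, expected or not."""
    return max("CD", key=lambda reply: PD[(first_move, reply)][1])


def first_mover_choice():
    """Backward induction: the first mover anticipates the reply at each node."""
    return max("CD", key=lambda move: PD[(move, second_mover_reply(move))][0])


print(second_mover_reply("C"))  # reply at the counterfactual Cooperate node
print(second_mover_reply("D"))  # reply at the Defect node
print(first_mover_choice())     # first move under backward induction
```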

The game you present has an extra dimension, but upon learning that B or C were chosen (again, via causal surgery, not logical conditioning), a classical agent without additional information will just play the Nash equilibrium of the sub-game where only B or C are available - see JGWeissman's comment for the correct numbers.

## comment by JGWeissman · 2014-07-20T20:20:39.987Z · LW(p) · GW(p)

Classical game theory says that player 1 should choose A for expected utility 3, as this is better than the sub game of choosing between B and C where the best player 1 can do against a classically rational player 2 is to play B with probability 1/3 and C with probability 2/3 (and player 2 plays X with probability 2/3 and Y with probability 1/3), for an expected value of 2.
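The mixed strategies quoted above come from the usual indifference conditions for a 2x2 game. A minimal sketch using exact fractions (variable names are mine, and the game is restricted to B/C versus X/Y as in the comment):

```python
from fractions import Fraction

# Restricted game after "not A". Payoffs taken from the post.
P1 = {("B", "X"): 2, ("B", "Y"): 2, ("C", "X"): 0, ("C", "Y"): 6}
P2 = {("B", "X"): 0, ("B", "Y"): 2, ("C", "X"): 1, ("C", "Y"): 0}

# b = probability player 1 plays B, chosen so player 2 is indifferent:
#   b*P2[B,X] + (1-b)*P2[C,X] == b*P2[B,Y] + (1-b)*P2[C,Y]
b = Fraction(
    P2[("C", "Y")] - P2[("C", "X")],
    P2[("B", "X")] - P2[("B", "Y")] - P2[("C", "X")] + P2[("C", "Y")],
)

# x = probability player 2 plays X, chosen so player 1 is indifferent:
#   x*P1[B,X] + (1-x)*P1[B,Y] == x*P1[C,X] + (1-x)*P1[C,Y]
x = Fraction(
    P1[("C", "Y")] - P1[("B", "Y")],
    P1[("B", "X")] - P1[("B", "Y")] - P1[("C", "X")] + P1[("C", "Y")],
)

# Player 1's expected value in the restricted game (same for B and C at equilibrium).
value = x * P1[("B", "X")] + (1 - x) * P1[("B", "Y")]

print(b, x, value)
```

The value of the restricted game to player 1 is below the 3 he gets from A, which is the comparison the comment relies on.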

But there are Pareto improvements available. Player 1's classically optimal strategy gives player 1 expected utility 3 and player 2 expected utility 0. But suppose instead Player 1 plays C, and player 2 plays X with probability 1/3 and Y with probability 2/3. Then the expected utility for player 1 is 4 and for player 2 it is 1/3. Of course, a classically rational player 2 would want to play X with greater probability, to increase its own expected utility at the expense of player 1. It would want to increase the probability beyond 1/2, which is the break-even point for player 1, but then player 1 would rather just play A.
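The numbers in this paragraph follow from two lines of arithmetic. A small sketch (the function name is hypothetical) for the case where player 1 plays C for certain:

```python
from fractions import Fraction


def payoffs_when_one_plays_c(x):
    """Expected (player 1, player 2) payoffs if 1 plays C and 2 plays X with probability x."""
    x = Fraction(x)
    # C,Y pays (6, 0) and C,X pays (0, 1).
    return (6 * (1 - x), x)


print(payoffs_when_one_plays_c(Fraction(1, 3)))  # the proposed Pareto improvement
print(payoffs_when_one_plays_c(Fraction(1, 2)))  # player 1's break-even point versus A
```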

So, what would 2 TDT/UDT players do in this game? Would they manage to find a point on the Pareto frontier, and if so, which point?

Replies from: Joshua_Blaine, James_Miller

## ↑ comment by Joshua_Blaine · 2014-07-23T03:20:32.917Z · LW(p) · GW(p)

Two TDT players have 3 plausible outcomes, it seems to me. This comes from my admittedly inexperienced intuitions, and not much rigorous math. The first two plausible points that occurred to me are 1) both players choose C,Y with certainty, or 2) they sit at exactly the equilibrium for p1, giving him an expected payout of 3, and p2 an expected payout of .5. Both of these improve on the global utility payout of 3 that's gotten if p1 just chooses A (giving 6 and 3.5, respectively), which is a positive thing, right?

The argument that supports these possibilities isn't unfamiliar to TDT. p2 does not expect to be given a choice, except in the cases where p1 is using TDT; therefore she has the choice of Y, with a payout of 0, or not having been given a chance to choose at all. Both of these possibilities have no payout, so p2 is neutral about what choice to make, and therefore choosing Y makes some sense. Alternatively, p1 has to choose between A for 3 or C for p(.5)*(6), which have the same payout. C, however, gives p2 .5 more utility than she'd otherwise get, so it makes some sense for p1 to pick C.

Alternatively, and what occurred to me last, both these agents have some way to equally share their "profit" over Classical Decision Theory. For however much more utility than 3 p1 gets, p2 gets the same amount. This payoff point (p1-3=p2) does exist, but I'm not sure where it is without doing more math. Is this a well-formulated game theoretic concept? I don't know, but it makes some sense to my idea of "fairness", and the kind of point two well-formulated agents should converge on.
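For what it's worth, one such point is easy to locate if we restrict attention to profiles where p1 plays pure C and p2 plays X with probability x. That restriction is a simplifying assumption of this sketch; mixtures in which p1 sometimes plays B are not explored. Under it, p1 gets 6(1-x) and p2 gets x, so the condition p1 - 3 = p2 is a single linear equation:

```python
from fractions import Fraction

# Assumption: p1 plays C for sure, p2 plays X with probability x.
# p1's payoff is 6*(1 - x); p2's payoff is x.
# Equal gains over p1's guaranteed 3:  6*(1 - x) - 3 == x  =>  3 == 7*x
x = Fraction(3, 7)
p1_payoff = 6 * (1 - x)
p2_payoff = x

print(x, p1_payoff, p2_payoff)
```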

## ↑ comment by James_Miller · 2014-07-20T20:26:36.279Z · LW(p) · GW(p)

"Classical game theory says that player 1 should choose A for expected utility 3, as this is better than the sub game of choosing between B and C"

No, since this is not a subgame because of the uncertainty. From Wikipedia: "In game theory, a subgame is any part (a subset) of a game that meets the following criteria... It has a single initial node that is the only member of that node's information set..."

I'm uncertain about what TDT/UDT would say.

Replies from: JGWeissman, ThisSpaceAvailable

## ↑ comment by JGWeissman · 2014-07-20T21:24:59.845Z · LW(p) · GW(p)

To see that it is indeed a subgame:

Represent the whole game with a tree whose root node represents player 1 choosing whether to play A (leads to leaf node), or to enter the subgame at node S. Node S is the root of the subgame, representing player 1's choices to play B or C leading to nodes representing player 2 choice to play X or Y in those respective cases, each leading to leaf nodes.

Node S is the only node in its information set. The subgame contains all the descendants of S. The subgame contains all nodes in the same information set as any node in the subgame. It meets the criteria.

There is no uncertainty that screws up my argument. The whole point of talking about the subgame was to stop thinking about the possibility that player 1 chose A, because that had been observed not to happen. (Of course, I also argue that player 2 should be interested in logically causing player 1 not to have chosen A, but that gets beyond classical game theory.)

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-20T21:34:32.568Z · LW(p) · GW(p)

I'm sorry but "subgame" has a very specific definition in game theory which you are not being consistent with. Also, intuitively when you are in a subgame you can ignore everything outside of the subgame, playing as if it didn't exist. But when Player 2 moves he can't ignore A because the fact that Player 1 could have picked A but did not provides insight into whether Player 1 picked B or C. I am a game theorist.

Replies from: JGWeissman

## ↑ comment by JGWeissman · 2014-07-20T22:43:56.143Z · LW(p) · GW(p)

> I'm sorry but "subgame" has a very specific definition in game theory which you are not being consistent with.

I just explained in detail how the subgame I described meets the definition you linked to. If you are going to disagree, you should be pointing to some aspect of the definition I am not meeting.

> Also, intuitively when you are in a subgame you can ignore everything outside of the subgame, playing as if it didn't exist. But when Player 2 moves he can't ignore A because the fact that Player 1 could have picked A but did not provides insight into whether Player 1 picked B or C.

If it is somehow the case that giving player 2 info about player 1 is advantageous for player 1, then player 2 should just ignore the info, and everything still plays out as in my analysis. If it is advantageous for player 2, then it just strengthens the case that player 1 should choose A.

> I am a game theorist.

I still think you are making a mistake, and should pay more attention to the object level discussion.

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-20T22:50:07.822Z · LW(p) · GW(p)

Let's try to find the source of our disagreement. Would you agree with the following:

"You can only have a subgame that excludes A if the fact that Player 1 has not picked A provides no useful information to Player 2 if Player 2 gets to move."

Replies from: JGWeissman

## ↑ comment by JGWeissman · 2014-07-20T23:10:17.369Z · LW(p) · GW(p)

The definition you linked to doesn't say anything about entering a subgame not giving the players information, so no, I would not agree with that.

I would agree that if it gave player 2 useful information, that should influence the analysis of the subgame.

(I also don't care very much whether we call this object within the game, that is, how the strategies play out given that player 1 doesn't choose A, a "subgame". I did not intend that technical definition when I used the term, but it did seem to match when I checked carefully after you objected, thinking that maybe there was a good motivation for the definition, so it could indicate a problem with my argument if it didn't fit.)

I also disagree that player 1 not picking A provides useful information to player 2.

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-20T23:31:07.862Z · LW(p) · GW(p)

"I also disagree that player 1 not picking A provides useful information to player 2."

Player 1 gets 3 if he picks A and 2 if he picks B, so doesn't knowing that Player 1 did not pick A provide useful information as to whether he picked B?

Replies from: JGWeissman

## ↑ comment by JGWeissman · 2014-07-21T01:00:42.211Z · LW(p) · GW(p)

The reason player 1 would choose B is not because it directly has a higher payout but because including B in a mixed strategy gives player 2 an incentive to include Y in its own mixed strategy, increasing the expected payoff of C for player 1. The fact that A dominates B is irrelevant. The fact that A has better expected utility than the subgame with B and C indicates that player 1 not choosing A is somehow irrational, but that doesn't give a useful way for player 2 to exploit this irrationality. (And in order for this to make sense for player 1, player 1 would need a way to counter exploit player 2's exploit, and for player 2 to try its exploit despite this possibility.)

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-21T01:51:36.991Z · LW(p) · GW(p)

"The reason player 1 would choose B is not because it directly has a higher payout but because including B in a mixed strategy gives player 2 an incentive to include Y in its own mixed strategy, "

No, since Player 2 only observes Player 1's choice, not what probabilities Player 1 used.

Replies from: jbay

## ↑ comment by jbay · 2014-07-21T04:26:15.980Z · LW(p) · GW(p)

Player 2 observes "not A" as a choice. Doesn't player 2 still need to estimate the relative probabilities that B was chosen vs. that C was chosen?

Of course Player 2 doesn't have access to Player 1's source code, but that's not an excuse to set those probabilities in a completely arbitrary manner. Player 2 has to decide the probability of B in a rational way, given the available (albeit scarce) evidence, which is the payoff matrix and the fact that A was not chosen.

It seems reasonable to imagine a space of strategies which would lead player 1 to not choose A, and assign probabilities to which strategy player 1 is using. Player 1 is probably making a shot for 6 points, meaning they are trying to tempt player 2 into choosing Y. Player 2 has to decide the probability that (Player 1 is using a strategy which results in [probability of B > 0]), in order to make that choice.

## ↑ comment by ThisSpaceAvailable · 2014-07-24T02:40:09.094Z · LW(p) · GW(p)

Can you give an example of a pair G1, G2 such that you consider G2 to be a "subgame" of G1?

## comment by cousin_it · 2014-07-21T14:31:53.319Z · LW(p) · GW(p)

Let's say player 1 submits a computer program that will receive no input and print either A, B or C. Player 2 submits a computer program that will receive a single bit as input (telling it whether P1's program printed A), and print either X or Y. Both programs also have access to a fair random number generator. That's a simultaneous move game where every Nash equilibrium leads to payoffs (3,0). Hopefully it's not too much of a stretch to say that we should play the game in the same way that the best program would play it.
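A quick way to see part of this claim is to enumerate the deterministic programs only, i.e. the pure-strategy profiles of the simultaneous-move game. The sketch below (helper names are mine) does just that; it does not search mixed strategies, so it illustrates the claim rather than proving it for programs that use the random number generator.

```python
from itertools import product

TABLE = {
    ("B", "X"): (2, 0),
    ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1),
    ("C", "Y"): (6, 0),
}


def payoff(one, two):
    """Payoffs (P1, P2); if P1 prints A, P2's output is irrelevant."""
    return (3, 0) if one == "A" else TABLE[(one, two)]


def pure_nash_equilibria():
    """All pure profiles where neither player gains by deviating alone."""
    equilibria = []
    for one, two in product("ABC", "XY"):
        u1, u2 = payoff(one, two)
        one_ok = all(payoff(alt, two)[0] <= u1 for alt in "ABC")
        two_ok = all(payoff(one, alt)[1] <= u2 for alt in "XY")
        if one_ok and two_ok:
            equilibria.append((one, two))
    return equilibria


for profile in pure_nash_equilibria():
    print(profile, payoff(*profile))
```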

If additionally each program receives the other's source code as input, many better Nash equilibria become achievable, like the outcome (4,1) proposed by Eliezer. In this case I think it's a bargaining problem. The Nash bargaining solution proposed by Squark might be relevant, though I don't know how to handle such problems in general.

Replies from: Wei_Dai

## ↑ comment by Wei_Dai · 2014-07-21T23:11:57.461Z · LW(p) · GW(p)

> Hopefully it's not too much of a stretch to say that we should play the game in the same way that the best program would play it.

Should we (humans) play like the best programs that *don't* have access to each other's source code play it, or play like the best programs that *do* have access to each other's source code play it? I mean, figuratively we have *some* information about the other player's source code...

## ↑ comment by Squark · 2014-07-22T09:35:11.200Z · LW(p) · GW(p)

I think that if we know about one another that we believe in playing like programs with access to each other's source when playing against opponents about which we know [QUINE], then we are justified to play like programs with access to each other's source. :)

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2014-07-20T21:38:34.999Z · LW(p) · GW(p)

P1: .5C .5B

P2: Y

It's not a Nash equilibrium, but it could be a timeless one. Possibly more trustworthy than usual for oneshots, since P2 knows that P1 was not a Nash agent assuming the other player was a Nash agent (classical game theorist) if P2 gets to move at all.
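For reference, the expected payoffs of this profile, and the reason it is not a Nash equilibrium, can be checked directly. This is a sketch with illustrative names; it assumes nothing beyond the payoff table in the post.

```python
from fractions import Fraction

TABLE = {
    ("B", "X"): (2, 0),
    ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1),
    ("C", "Y"): (6, 0),
}


def expected_payoffs(p_b, p_c, p_x):
    """Expected (P1, P2) payoffs; P1 plays A with the leftover probability."""
    p_a = 1 - p_b - p_c
    p_y = 1 - p_x
    one = 3 * p_a  # A pays (3, 0)
    two = 0
    for move, p_move in (("B", p_b), ("C", p_c)):
        for reply, p_reply in (("X", p_x), ("Y", p_y)):
            u1, u2 = TABLE[(move, reply)]
            one += p_move * p_reply * u1
            two += p_move * p_reply * u2
    return one, two


half = Fraction(1, 2)
proposed = expected_payoffs(half, half, 0)  # P1: .5C .5B, P2: Y
deviation = expected_payoffs(0, 1, 0)       # P1 switches to pure C, P2 still plays Y

print(proposed)
print(deviation)
```

Against a P2 who plays Y for certain, P1 does better by dropping B entirely, which is why the profile is not a Nash equilibrium; both players nonetheless do better under it than under "P1: A".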

Replies from: itaibn0, ESRogs, DefectiveAlgorithm

## ↑ comment by itaibn0 · 2014-07-21T16:53:01.431Z · LW(p) · GW(p)

I have no idea where those numbers came from. Why not "P1: .3C .7B" to make "P2: Y" rational? Otherwise, why does P2 play Y at all? Why not "P1: C, P2: Y", which maximizes the sum of the two utilities, and is the optimal precommitment under the Rawlsian veil-of-ignorance prior? Heck, why not just play the unique Nash equilibrium "P1: A"? Most importantly, if there's no principled way to make these decisions, why assume your opponent will timelessly make them the same way?

Replies from: JGWeissman

## ↑ comment by JGWeissman · 2014-07-21T17:28:00.516Z · LW(p) · GW(p)

> Why not "P1: C, P2: Y", which maximizes the sum of the two utilities, and is the optimal precommitment under the Rawlsian veil-of-ignorance prior?

If we multiply player 2's utility function by 100, that shouldn't change anything because it is an affine transformation to a utility function. But then "P1: B, P2: Y" would maximize the sum. Adding values from different utility functions is a meaningless operation.
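The point is easy to demonstrate numerically. In this sketch (names are illustrative), the outcome that maximizes the sum of utilities changes when player 2's utility is rescaled, even though the rescaled function represents the same preferences:

```python
# Outcomes and payoffs (player 1, player 2) from the post.
OUTCOMES = {
    ("A", None): (3, 0),
    ("B", "X"): (2, 0),
    ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1),
    ("C", "Y"): (6, 0),
}


def sum_maximizer(scale_two=1):
    """Outcome maximizing u1 + scale_two * u2."""
    return max(OUTCOMES, key=lambda o: OUTCOMES[o][0] + scale_two * OUTCOMES[o][1])


print(sum_maximizer())     # with the payoffs as written
print(sum_maximizer(100))  # after multiplying player 2's utility by 100
```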

Replies from: itaibn0

## ↑ comment by itaibn0 · 2014-07-21T18:35:31.126Z · LW(p) · GW(p)

You're right. I'm not actually advocating this option. Rather, I was comparing EY's seemingly arbitrary strategy with other seemingly arbitrary strategies. The only one I actually endorse is "P1: A". It's true that this specific criterion is not invariant under affine transformations of utility functions, but how do I know EY's proposed strategy wouldn't change if we multiply player 2's utility function by 100 as you propose?

(In a similar vein, I don't see how I can justify my proposal of "P1: 3/10 C 7/10 B". Where did the 10 come from? "P1: 2/7 C 5/7 B" works equally well. I only chose it because it is convenient to write down in decimal.)

Replies from: JGWeissman

## ↑ comment by JGWeissman · 2014-07-21T19:10:24.368Z · LW(p) · GW(p)

Eliezer's "arbitrary" strategy has the nice property that it gives both players more expected utility than the Nash equilibrium. Of course there are other strategies with this property, and indeed multiple strategies that are not themselves dominated in this way. It isn't clear how ideally rational players would select one of these strategies or which one they would choose, but they should choose one of them.

## ↑ comment by ESRogs · 2014-07-23T09:22:45.498Z · LW(p) · GW(p)

Eliezer, would your ideas from this post apply here?

> There could be many acceptable negotiating equilibria between what you think is the 'fair' point on the Pareto boundary, and the Nash equilibrium. So long as each step down in what you think is 'fairness' reduces the total payoff to the other agent, even if it reduces your own payoff even more.

If I'm not too confused, the Nash equilibrium is [P1: A], and the Pareto boundary extends from [P1: B, P2: Y] to [P1: C, P2: Y]. So the gains from trade give P1 1-3 extra points, and P2 0-2 extra points. As others have pointed out, a case could be made for [P1: C, P2: Y], as it maximizes the total gains from trade, but maybe, taking your idea of different concepts of fairness from the linked post, P2 should hold P1 hostage by playing some kind of mixed X,Y strategy unless P1 offers a "more fair" split.

Is that behavior by P2 the kind of thing that the reasoning in the linked post endorses?

## ↑ comment by DefectiveAlgorithm · 2014-07-23T01:55:32.121Z · LW(p) · GW(p)

I think this should get better and better for P1 the closer P1 gets to (2/3)C (1/3)B (without actually reaching it).

## comment by Wei_Dai · 2014-07-20T22:21:33.473Z · LW(p) · GW(p)

I think the following is the unique proper equilibrium of this game:

Player One plays A with probability 1-ϵ, B with probability 1/3 ϵ, C with probability 2/3 ϵ. Player Two plays X with probability 2/3 and Y with probability 1/3.

JGWeissman had essentially the right idea, but used the wrong terminology.

ETA: I've changed my mind and no longer think the proper equilibrium solution makes sense for this game. See later in this thread as well as this comment for the explanation.

Replies from: satt, James_Miller

## ↑ comment by satt · 2014-07-20T23:48:40.239Z · LW(p) · GW(p)

> Player Two plays X with probability 2/3 and Y with probability 1/3.

Should that be the other way round?

As written, player 1 expects to score 3(1-ϵ) + 2(ϵ/3) + (2ϵ/3)(6/3) = 3 - 3ϵ + 2ϵ/3 + 4ϵ/3 = 3-ϵ (assuming I haven't made a dumb error there), and so would do better by unilaterally switching to the pure strategy A.

But if player 2 plays *Y* with probability 2/3 and *X* with probability 1/3, player 1 can expect to score 3(1-ϵ) + 2(ϵ/3) + (2ϵ/3)(12/3) = 3 - 3ϵ + 2ϵ/3 + 8ϵ/3 = 3 + ϵ/3, which beats the pure strategy A.

**Edit:** no, ignore me, I forgot that the whole point of proper equilibrium is that ϵ is an arbitrary parameter imposed from outside and assumed nonzero. Player 1 isn't allowed to set it to zero. [*Slaps self on forehead.*]
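The two expected-value computations in this comment can be checked with exact arithmetic. A small sketch (the function name and the sample value of ϵ are arbitrary choices for illustration):

```python
from fractions import Fraction


def one_expected(eps, prob_y):
    """Player One's expected payoff when he plays A w.p. 1-eps, B w.p. eps/3,
    C w.p. 2*eps/3, and Player Two plays Y w.p. prob_y."""
    eps, prob_y = Fraction(eps), Fraction(prob_y)
    # A pays 3, B pays 2 regardless of the reply, C pays 6 only against Y.
    return 3 * (1 - eps) + 2 * (eps / 3) + (2 * eps / 3) * (6 * prob_y)


eps = Fraction(1, 100)
print(one_expected(eps, Fraction(1, 3)) == 3 - eps)      # Two plays X 2/3, Y 1/3
print(one_expected(eps, Fraction(2, 3)) == 3 + eps / 3)  # Two plays X 1/3, Y 2/3
```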

## ↑ comment by James_Miller · 2014-07-20T22:43:38.341Z · LW(p) · GW(p)

You are probably right but the proper equilibrium assumption that "more costly trembles are made with significantly smaller probability than less costly ones" is a huge one.

Replies from: Wei_Dai

## ↑ comment by Wei_Dai · 2014-07-20T23:55:45.727Z · LW(p) · GW(p)

That assumption isn't really relevant here, since there aren't actually any second level trembles with probability ϵ^2 in the solution. (Maybe trembling hand perfect equilibrium already gives us this unique solution and I didn't really need to invoke proper equilibrium, but the latter seems easier to work with.) Instead of talking about technicalities though, let's just think about this intuitively.

Suppose there is some definite probability of physically making a mistake when Player One intends to choose A. Let's say when he intends to choose A, there's probability 1/100 for accidentally pressing B and 1/100 for accidentally pressing C, and this is common knowledge. Now we can just solve this modified game using standard Nash equilibrium, and the only solution is that Player One has to make his choices so that after taking both "trembling" and deliberate choice into account, the probabilities that Player Two faces are such that C is twice as likely as B. (I.e., Player One has to deliberately choose C with probability about 1/100.) That's the only way that Player Two would be willing to choose a mixed strategy, which must be the case in order to have an equilibrium.
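The "about 1/100" figure can be made exact under one extra assumption that the comment does not state: that Player One trembles only when he intends A, and never when he deliberately picks C. With that assumption (and illustrative variable names), the required probability of deliberately choosing C is:

```python
from fractions import Fraction

# Chance of hitting B, and separately of hitting C, when intending A.
tremble = Fraction(1, 100)


def realized_b_and_c(q):
    """Probabilities that B and C are actually played if Player One intends C
    with probability q and A otherwise (assumption: no tremble when intending C)."""
    intend_a = 1 - q
    prob_b = intend_a * tremble
    prob_c = intend_a * tremble + q
    return prob_b, prob_c


# Solve prob_c == 2 * prob_b:  (1-q)*t + q == 2*(1-q)*t  =>  q == t / (1 + t)
q = tremble / (1 + tremble)
prob_b, prob_c = realized_b_and_c(q)

print(q, prob_b, prob_c)
```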

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-21T00:14:49.110Z · LW(p) · GW(p)

Yes, I think you are right. I wonder if there is a way of changing the payoffs so there isn't any trembling mixed equilibrium.

Replies from: Wei_Dai

## ↑ comment by Wei_Dai · 2014-07-21T00:30:50.428Z · LW(p) · GW(p)

Hmm, I just realized that Player Two's strategy in the Nash equilibrium of this modified game is different than in the proper equilibrium of the original game. Because here Player Two has to make his choice so that Player One is indifferent between A and C, whereas in the proper equilibrium Player Two has to make his choices so that Player One is indifferent between B and C.

I think my "intuitive analysis" does make sense, so I'm going to change my mind and say that perhaps proper equilibrium isn't the right solution concept here...

## comment by **[deleted]** · 2014-07-20T20:48:53.683Z · LW(p) · GW(p)

Assume that each player's hand may tremble with a small non-zero probability *p*, then take the limit as *p* approaches zero from above.

## ↑ comment by Emile · 2014-07-20T21:35:50.233Z · LW(p) · GW(p)

... Let's do that!

Simple model: One plays A, B and C with probabilities a, b, and c, with the constraint that each must be above the trembling probability t (= p/3 using the p above). (Two doesn't tremble, for simplicity's sake.)

Two picks X with probability x and Y with probability (1-x).

So their expected utilities are:

One: 3a + 2b + 6c(1-x)

Two: 2b(1-x) + cx = 2b + (c - 2b)x

It seems pretty clear that One wants b to be as low as possible (either a or c will always be better), so we can set b=t.

So One's utility is (constant) - 3c + 6c - 6cx = (constant) + 3c(1-2x)

So One wants c to maximize (1-2x)c, and Two wants x to maximize (c-2t)x

The Nash equilibrium is at 1-2x=0 and c-2t=0, so c=2t and x=0.5

So in other words, if One's hand can tremble then he should also sometimes deliberately pick C to make it twice as likely as B, and Two should flip a coin.

(and as t converges towards 0, we do indeed get One always picking A)
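Here is a small numerical check of the equilibrium derived above, using the utility formulas from this comment. The helper names, the sample value of t, and the size of the probability shift are arbitrary choices made for illustration.

```python
from fractions import Fraction


def utilities(a, b, c, x):
    """Expected utilities: One gets 3a + 2b + 6c(1-x), Two gets 2b(1-x) + cx."""
    return 3 * a + 2 * b + 6 * c * (1 - x), 2 * b * (1 - x) + c * x


def equilibrium(t):
    """Candidate equilibrium for trembling probability t: b = t, c = 2t, x = 1/2."""
    b, c, x = t, 2 * t, Fraction(1, 2)
    return 1 - b - c, b, c, x


t = Fraction(1, 100)
a, b, c, x = equilibrium(t)

# One is indifferent to shifting weight between A and C when x = 1/2 ...
shift = Fraction(1, 1000)
one_same = utilities(a - shift, b, c + shift, x)[0] == utilities(a, b, c, x)[0]
# ... and Two is indifferent between pure Y (x=0) and pure X (x=1) when c = 2b.
two_same = utilities(a, b, c, 0)[1] == utilities(a, b, c, 1)[1]

print(one_same, two_same)
print(equilibrium(Fraction(1, 10**6))[0])  # One's probability of A for a tiny t
```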

Replies from: James_Miller

## ↑ comment by James_Miller · 2014-07-20T21:57:25.091Z · LW(p) · GW(p)

Excellent! This does indeed work given the assumption that Player 1 cannot set the probability of himself picking B at zero. But if Player 1 can set the probability of him picking B at zero, the game still has no solution.

## ↑ comment by James_Miller · 2014-07-20T20:54:35.606Z · LW(p) · GW(p)

Good thinking, but I picked the payoffs so this approach wouldn't give an easy solution. Consider an equilibrium where Player 1 intends to pick A, but there is a small but equal chance he will pick B or C by mistake. In this equilibrium Player 2 would pick Y if he got to move, but then Player 1 would always intend to pick C, effectively pretending he had made a mistake.

Replies from: Wei_Dai

## ↑ comment by Wei_Dai · 2014-07-20T21:47:42.629Z · LW(p) · GW(p)

From Fudenberg & Tirole (1995 edition, chapter 8):

> Section 8.4 then describes a refinement of trembling-hand perfect equilibrium due to Myerson (1978). A "proper equilibrium" requires that a player tremble less on strategies that are worse responses.

## comment by D_Alex · 2014-07-21T09:02:28.353Z · LW(p) · GW(p)

Hmm, is this not the correct solution for two super-rational players:

Player One: Pick C with probability of 2/3 - e; pick B with probability of 1/3 + e, e being some very small but not negligible number. Player Two: Pick Y

Expected payoff for Player One is 4 2/3 - 4e, way better than playing A. For Player Two it is 2/3 + 2e, a tiny bit better than playing X - so Player Two **will** play Y, since he knows that Player One is totally rational and would have picked this very strategy.
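The arithmetic behind these payoffs, as a short sketch (the function name and the sample value of e are illustrative):

```python
from fractions import Fraction


def proposed_payoffs(e):
    """Player One plays C w.p. 2/3 - e and B w.p. 1/3 + e.
    Returns (One's payoff if Two plays Y, Two's payoff from Y, Two's payoff from X)."""
    e = Fraction(e)
    p_c = Fraction(2, 3) - e
    p_b = Fraction(1, 3) + e
    one_vs_y = 6 * p_c + 2 * p_b  # C,Y pays One 6; B,Y pays One 2
    two_y = 2 * p_b               # B,Y pays Two 2; C,Y pays Two 0
    two_x = 1 * p_c               # C,X pays Two 1; B,X pays Two 0
    return one_vs_y, two_y, two_x


e = Fraction(1, 100)
one_vs_y, two_y, two_x = proposed_payoffs(e)
print(one_vs_y == Fraction(14, 3) - 4 * e)  # 4 2/3 - 4e
print(two_y == Fraction(2, 3) + 2 * e)      # 2/3 + 2e
print(two_y > two_x)                        # Y beats X for Player Two when e > 0
```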

## comment by TrE · 2014-07-21T03:51:00.534Z · LW(p) · GW(p)

The mere information that Player One behaves irrationally doesn't give Player Two any more information on Player One's behaviour than the fact that it is not rational. So other than knowing Player One didn't use the strategy "play A with 100% probability", Player Two doesn't know anything about Player One's behaviour. What can Player Two do on that basis? They can assume that any deviations from rational choice are small, which brings us to the trembling hand solution. Or they can use a different model.

Which model of Player One's behaviour is the "correct" one to assume in this situation is not at all clear. Perhaps Player One can be modelled as an RNG? Then pick Y. Perhaps One always models its opponents as RNGs (which is, given no information about their irrationality, irrational)? Then pick X (since One is indifferent between A and C in this case). Just as reversed stupidity is not intelligence, [not intelligence] doesn't tell you anything about the kind of stupidity, unless given more information.

Most game theory problems in general can be said to have no solution if one of the players behaves irrationally. But that's not a problem for game theory because the Rational Choice assumption (perhaps allowing small deviations) is perfectly fine in the real world, and really the only sane way one can solve game-theoretic problems, narrowing down the space of possible behaviours tremendously!

And by the way, what use has a payoff matrix if players don't do their best acting on that information?

Replies from: Lumifer, James_Miller

## ↑ comment by Lumifer · 2014-07-21T16:47:42.420Z · LW(p) · GW(p)

> But that's not a problem for game theory because the Rational Choice assumption (perhaps allowing small deviations) is perfectly fine in the real world

Not in the real world I'm familiar with.

Replies from: TrE

## ↑ comment by TrE · 2014-07-21T20:13:46.124Z · LW(p) · GW(p)

Unless you have more specific information about the problem in question, it's the best concept to consider. At least in the limit of large stakes, long pondering times, and decisions jointly made by large organizations, the assumption holds. Although, thinking about it, I'd really like to see game theory for predictably irrational agents, suffering from exactly those biases untrained humans fall prey to.

Replies from: Lumifer

## ↑ comment by Lumifer · 2014-07-21T21:18:15.185Z · LW(p) · GW(p)

> it's the best concept to consider

I am not convinced about that at all.

Let's consider the top headlines of the moment: the Russian separatists in the Ukraine shot down a passenger jet and the IDF invaded Gaza. Both situations (the separatist movement and the Middle Eastern conflict) could be modeled in the game theory framework. Would you be comfortable applying the "Rational Choice assumption" to these situations?

Replies from: TrE## ↑ comment by TrE · 2014-07-22T04:55:51.984Z · LW(p) · GW(p)

I would attribute the shooting of the passenger jet to incompetence; the IDF invading Gaza yet again certainly makes sense from their perspective.

Considering the widespread false information in both cases, I'd argue that by and large, the agents (mostly the larger ones like Russia and Israel, less so the separatists and the Palestinian fighters) act rationally on the information they have. Take a look at Russia, neither actively fighting the separatists nor openly supporting them. I could argue that this is the best strategy for territorial expansion, avoiding a UN mission while strengthening the separatists. Spreading false information does its part.

I don't know enough about the Palestinian fighters and the information they act on to evaluate whether or not their behaviour makes sense.

I only consider instrumental rationality here, not epistemic rationality.

Replies from: Lumifer## ↑ comment by Lumifer · 2014-07-22T06:49:34.785Z · LW(p) · GW(p)

certainly makes sense from their perspective.

That may well be so, but this is a rather different claim than the "Rational Choice assumption".

We know quite well that people are not rational. Why would you model them as rational agents in game theory?

Replies from: TrE## ↑ comment by TrE · 2014-07-22T07:16:45.193Z · LW(p) · GW(p)

As I wrote above, in the limit of large stakes, long pondering times, and decisions jointly made by large organizations, people do actually behave rationally. As an example: bidding for oil drilling rights can be modelled as auctions with incomplete and imperfect information. Naïve bidding strategies fall prey to the winner's curse. Game theory can model these situations as Bayesian games and compute the emerging Bayesian Nash equilibria.

Guess what? The companies actually bid the way game theory predicts!

Replies from: Lumifer## ↑ comment by Lumifer · 2014-07-22T14:47:10.901Z · LW(p) · GW(p)

in the limit of large stakes, long pondering times, and decisions jointly made by large organizations, people do actually behave rationally.

I still don't think so. To be a bit more precise, certainly people behave rationally *sometimes* and I will agree that things like long deliberations or joint decisions (given sufficient diversity of the deciding group) tend to increase the rationality. But I don't think that even in the limit assuming rationality is a "safe" or a "fine" assumption.

Example: international politics. Another example: organized religions.

I also think that in analyzing this issue there is the danger of constructing rational narratives *post-factum* via the claim of revealed preferences. Let's say entity A decides to do B. It's very tempting to say "Aha! It would be rational for A to decide to do B if A really wants X, therefore A wants X and behaves rationally". And certainly, it does happen like that on a regular basis. However, what also happens is that A really wants Y and decides to do B on non-rational grounds or just makes a mistake. In this case our analysis of A's rationality is false, but it's hard for us to detect that without knowing whether A really wants X or Y.

## ↑ comment by James_Miller · 2014-07-21T04:08:21.055Z · LW(p) · GW(p)

"They can assume that any deviations from rational choice are small,"

Not playing A might be rational depending on Player Two's beliefs.

"And by the way, what use has a payoff matrix if players don't do their best acting on that information?" The game's uncertainty and sequential moves mean you can't use a standard payoff matrix.

Replies from: TrE## ↑ comment by TrE · 2014-07-21T04:23:46.637Z · LW(p) · GW(p)

Not playing A might be rational depending on Player Two's beliefs.

That, too, is the case in many games. The success in always assuming mutually rational behaviour first lies in its nice properties, like inability to be exploited, existence of equilibrium, and a certain resemblance (albeit not a perfect one) to the real world.

The game's uncertainty and sequential moves mean you can't use a standard payoff matrix.

Well, I mean, why specify payoffs at all if you then assume players won't care about them? If Player One cared about their payoff and were modelling Two as a rational choice agent, they would've played A. So either they don't model Two as a rational choice agent (which is stupid in its own right), or they simply don't care about their payoff.

In any case, the fact that a game with irrational players doesn't have a solution, at least as long as the nature of players' irrationality is not clear, doesn't surprise me. Dealing with madmen has never been easy.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-21T05:01:10.765Z · LW(p) · GW(p)

If Player One cared about their payoff and were modelling Two as a rational choice agent, they would've played A

You can only prove this if you first tell me what Player two's beliefs would be if he got to move.

In any case, the fact that a game with irrational players doesn't have a solution, at least as long as the nature of players' irrationality is not clear, doesn't surprise me.

I agree, but I meant for the players in the game to be rational.

Replies from: TrE## ↑ comment by TrE · 2014-07-21T05:49:04.976Z · LW(p) · GW(p)

Player two could simply play the equilibrium strategy for the 2x2-subgame.

And to counter your response that it's all one game, not two games, I can split the game by adding an extra node without changing the structure of the game. Then, we arrive at the standard subgame perfect equilibrium, only that the 2x2 subgame is in normal form, which shouldn't really change things since we can just compute a Nash equilibrium for that.

After solving the subgame, we see that player one not playing A is not credible, and we can eliminate that branch.

Replies from: Wei_Dai, James_Miller## ↑ comment by Wei_Dai · 2014-07-22T08:26:40.888Z · LW(p) · GW(p)

I can split the game by adding an extra node without changing the structure of the game

That actually does change the structure of the game, if we assume that physical mistakes can happen with some probability, which seems realistic. (Think about playing this game in real life as player 2. If you do get to move, you must think there's some non-zero probability that it was because Player 1 just accidentally pressed the wrong button, right?) With the extra node, you get the occasional "crap, I just accidentally pressed not-A, now I have to decide which of B or C to choose" which has no analogy in the original game where you never get to choose between B or C without A as an option.

Replies from: TrE## ↑ comment by TrE · 2014-07-22T09:23:25.684Z · LW(p) · GW(p)

Okay, I agree. But what do you think about the extensive-form game in the image below? Is the structure changed there?

Replies from: Wei_Dai## ↑ comment by Wei_Dai · 2014-07-22T10:16:54.989Z · LW(p) · GW(p)

The structure isn't changed there, but without the extra node, there is no subgame. That extra node is necessary in order to have a subgame, because only then can Player 2 think "the probabilities I'm facing are the result of Player 1's choice between just B and C", which allows them to solve that subgame independently of the rest of the game. Also, see this comment and its grandchild for why specifically, given the possibility of accidental presses, I don't think Player 2's strategy in the overall game should be the same as the equilibrium of the 2x2 "reduced game". In short, in the reduced game, Player 2 has to make Player 1 indifferent between B and C, but in the overall game with accidental presses, Player 2 has to make Player 1 indifferent between A and C.

Replies from: TrE## ↑ comment by TrE · 2014-07-22T10:44:16.501Z · LW(p) · GW(p)

In the 2x2 reduced game, Player One's strategy is 1/3 B, 2/3 C; Two's strategy is 2/3 X, 1/3 Y. In the complete game with trembling hands, Player Two's strategy remains unchanged, as you wrote in the starter of the linked thread, invoking proper equilibrium.
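The reduced-game numbers follow from the usual indifference conditions. A minimal sketch, using the payoffs from the original post and exact fractions (the variable and function names are illustrative only):

```python
from fractions import Fraction

# (Player One payoff, Player Two payoff) in the 2x2 reduced game.
U = {
    ("B", "X"): (2, 0), ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1), ("C", "Y"): (6, 0),
}

def u(player, row, col):
    return Fraction(U[(row, col)][player])

# y = probability that Two plays Y, chosen so that One is indifferent between B and C.
y = (u(0, "C", "X") - u(0, "B", "X")) / (
    u(0, "B", "Y") - u(0, "B", "X") - u(0, "C", "Y") + u(0, "C", "X"))

# b = probability that One plays B, chosen so that Two is indifferent between X and Y.
b = (u(1, "C", "Y") - u(1, "C", "X")) / (
    u(1, "B", "X") - u(1, "C", "X") - u(1, "B", "Y") + u(1, "C", "Y"))

# Expected payoffs under the resulting mixed profile.
prob = {("B", "X"): b * (1 - y), ("B", "Y"): b * y,
        ("C", "X"): (1 - b) * (1 - y), ("C", "Y"): (1 - b) * y}
eu_one = sum(prob[k] * U[k][0] for k in U)
eu_two = sum(prob[k] * U[k][1] for k in U)

print("One: B", b, "C", 1 - b, "| Two: X", 1 - y, "Y", y)
print("expected payoffs:", eu_one, eu_two)
```

This reproduces 1/3 B, 2/3 C for One and 2/3 X, 1/3 Y for Two, with One's expected payoff in the reduced game equal to 2, below the 3 available from A.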

Replies from: Wei_Dai## ↑ comment by Wei_Dai · 2014-07-22T12:10:15.892Z · LW(p) · GW(p)

Later on in the linked thread, I realized that the proper equilibrium solution doesn't make sense. Think about it: why does Player 1 "tremble" so that C is exactly twice the probability of B? Other than pure coincidence, the only way that could happen is if some of the button presses of B and/or C are actually deliberate. Clearly Player 1 would never deliberately press B while A is still an option, so Player 1 must actually be playing a mixed strategy between A and C, while also accidentally pressing B and C with some small probability. But that implies Player 2 must be playing a mixed strategy that makes Player 1 indifferent between A and C, not between B and C.
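The difference between the two indifference conditions can be made concrete. The sketch below only solves for the probability of Y that equalizes Player 1's payoffs; it does not settle whether Player 2 would be willing to mix that way. The names are illustrative.

```python
from fractions import Fraction

# Player 1's payoffs from the original post.
ONE_A = Fraction(3)    # A ends the game
ONE_B = Fraction(2)    # B pays 2 whatever Player 2 does
ONE_CX = Fraction(0)   # C against X
ONE_CY = Fraction(6)   # C against Y

def prob_y_equalizing_c_with(target):
    """Probability of Y at which C's expected payoff equals `target`."""
    # Solve ONE_CX * (1 - y) + ONE_CY * y == target for y.
    return (target - ONE_CX) / (ONE_CY - ONE_CX)

y_b_vs_c = prob_y_equalizing_c_with(ONE_B)  # reduced game: indifferent between B and C
y_a_vs_c = prob_y_equalizing_c_with(ONE_A)  # full game: indifferent between A and C

print(y_b_vs_c, y_a_vs_c)
```

Under these payoffs the first condition gives Y with probability 1/3 and the second gives Y with probability 1/2, so the two requirements on Player 2's strategy really are different.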

## ↑ comment by James_Miller · 2014-07-22T02:53:37.053Z · LW(p) · GW(p)

After solving the subgame, we see that player one not playing A is not credible, and we can eliminate that branch.

But Player 1 can make this perfectly credible by actually not playing A.

Replies from: TrE## ↑ comment by TrE · 2014-07-22T04:57:37.053Z · LW(p) · GW(p)

But I just showed that this is irrational as they would get less payoff in that subgame!

If that's your attitude, then you have to abandon the concept of subgame perfect equilibrium entirely. Are you willing to do that?

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T05:03:17.468Z · LW(p) · GW(p)

I think that adding the extra node does change the structure of the game. I also think that we have different views of what credibility means.

Replies from: TrE## ↑ comment by TrE · 2014-07-22T05:34:37.151Z · LW(p) · GW(p)

How does it change the structure of the game? Of course, it was in normal form before, and is now in extensive form, but really, the way you set it up means it shouldn't matter which representation we choose, since player two is getting exactly the same information.

Also, your argument about player two getting information about One's behaviour can easily be applied to "normal" extensive form games. Regardless of whether you intended to, if your argument were correct, it would render the concept of subgame perfect equilibrium useless.

I know that credibility is normally applied as in "make credible threats". But if I change payoffs to A: (3, 5) and add a few nodes above, then player 1's threat to not play A (which in this case is a threat) is not credible, and (3,5) carries over to the parent node.

By the logic of the extensive form formulation, Two should simply play the equilibrium strategy for the 2x2 subgame.

Edit: Here is what the game looks like in extensive form:

The dotted ellipse indicates that 2 can't differentiate between the two contained nodes. I don't see how any of the players has any more or less information or any more or less choices available.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T21:28:19.746Z · LW(p) · GW(p)

Yes this is the same game, but you can not create a subgame that has B and C but not A.

## comment by Tyrrell_McAllister · 2014-07-22T18:00:20.893Z · LW(p) · GW(p)

[The following argument is made with tongue somewhat in cheek.]

Rationality (with a capital R) is supposed to be an idealized algorithm that is universally valid for all agents. Therefore, this algorithm, as such, doesn't know whether it will be instantiated in Player One (P1) or Player Two (P2). Yes, each player knows which one they are. But this knowledge is input that they *feed to* their Rationality subroutine. The subroutine itself doesn't come with this knowledge built in.

Since Rationality doesn't know where it will end up, it doesn't know which outcomes will maximize its utility. Thus, the most rational thing for Rationality (qua idealized algorithm) to do is to precommit to strategies for P1 and P2 that maximize its expected utility given this uncertainty.

That is, let EU₁(*s*, *t*) be the expected utility to P1 if P1 implements strategy *s* and P2 implements strategy *t*. Define EU₂(*s*, *t*) similarly. Then Rationality wants to precommit the players to the respective strategies *s* and *t* that maximize

*p* EU₁(*s*, *t*) + *q* EU₂(*s*, *t*),

where *p* (respectively, *q*) is the measure of the copies of Rationality that end up in P1 (respectively, P2).

Assuming that *p* = *q*, Rationality will therefore have P1 choose C (with probability 1) and have P2 choose Y (with probability 1).
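As a quick check of that claim, here is a sketch that evaluates the weighted sum over the pure outcomes of the game. It assumes it is enough to look at pure strategy profiles, which holds here because the objective is linear in each player's mixing probabilities; the names are illustrative.

```python
from fractions import Fraction

# (P1 payoff, P2 payoff) for each outcome in the original post.
OUTCOMES = {
    "A": (3, 0),
    "B,X": (2, 0), "B,Y": (2, 2),
    "C,X": (0, 1), "C,Y": (6, 0),
}

def weighted_utility(outcome, p, q):
    """p * EU1 + q * EU2 for a pure outcome."""
    u1, u2 = OUTCOMES[outcome]
    return p * u1 + q * u2

p = q = Fraction(1, 2)
scores = {name: weighted_utility(name, p, q) for name in OUTCOMES}
best = max(scores, key=scores.get)

print(scores)
print("selected outcome:", best)
```

With equal weights the largest score belongs to C,Y, matching the strategies named above.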

That's all well and good, but will the players actually act that way? After all, they are stipulated to care only about themselves, so P2 in particular will not want to act according to this selfless strategy.

Yes, but this selfish desire on P2's part is not built into P2's Rationality subroutine (call it R), because Rationality is universal. P2's selfish desires must be implemented elsewhere, outside of R. To be sure, P2 is free to feed the information that it is P2 to R, but R won't do anything with this information, because R is already precommitted to a strategy for the reasons given above.

And since the players are given to be rational, they are forced to act according to the strategies pre-selected by their Rationality subroutines, despite their wishes to the contrary. Therefore, they will in fact act as Rationality determined.

[If this comment has any point, it is that there is a strong tension, if not a contradiction, between the idea of rationality as a universally valid mode of reasoning, on the one hand, and the idea of rational agents whose revealed preferences are selfish, on the other.]

Replies from: JGWeissman## ↑ comment by JGWeissman · 2014-07-22T19:21:32.281Z · LW(p) · GW(p)

Error: Adding values from different utility functions.

See this comment.

Replies from: Tyrrell_McAllister## ↑ comment by Tyrrell_McAllister · 2014-07-22T20:39:07.458Z · LW(p) · GW(p)

[Resuming my tongue-in-cheek argument...]

It is true that adding different utility functions is in general an error. However, for agents bound to follow Rationality (and Rationality alone), the different utility functions are best thought of as the same utility function conditioned on different hypotheses, where the different hypotheses look like "The utility to P2 turns out to be what really matters".

After all, if the agents are making their decisions on the basis of Rationality alone, then Rationality alone must have a utility function. Since Rationality is universal, the utility function must be universal. What alternative does Rationality have, given the constraints of the problem, other than a weighted sum of the utility functions of the different individuals who might turn out to matter?

Replies from: JGWeissman## ↑ comment by JGWeissman · 2014-07-22T20:50:43.838Z · LW(p) · GW(p)

"Rationality" seems to give different answer to the same problem posed with different affine transformations of the players' utility functions.

Replies from: Tyrrell_McAllister## ↑ comment by Tyrrell_McAllister · 2014-07-22T21:24:47.040Z · LW(p) · GW(p)

[Still arguing with tongue in cheek...]

That's where the measures *p* and *q* come in.
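The exchange above can be illustrated with a small sketch. It rescales P2's payoffs by a hypothetical factor of 10 (a positive affine transformation that leaves P2's preferences unchanged) and shows that the weighted-sum rule then selects a different outcome unless the weights are adjusted to compensate. Whether adjusting the measures this way is legitimate is exactly what the tongue-in-cheek argument leaves open.

```python
# (P1 payoff, P2 payoff) for each outcome in the original post.
OUTCOMES = {
    "A": (3, 0),
    "B,X": (2, 0), "B,Y": (2, 2),
    "C,X": (0, 1), "C,Y": (6, 0),
}

def selected_outcome(p, q, scale2=1):
    """Outcome maximizing p*u1 + q*(scale2*u2); scale2 is a hypothetical rescaling of P2's utility."""
    def score(name):
        u1, u2 = OUTCOMES[name]
        return p * u1 + q * (scale2 * u2)
    return max(OUTCOMES, key=score)

original = selected_outcome(p=1, q=1)                 # payoffs as given
rescaled = selected_outcome(p=1, q=1, scale2=10)      # same weights, P2's utility times 10
reweighted = selected_outcome(p=10, q=1, scale2=10)   # weights chosen to undo the rescaling

print(original, rescaled, reweighted)
```

With the payoffs as given the rule picks C,Y; after the rescaling it picks B,Y; and with the compensating weights it picks C,Y again.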

## comment by Squark · 2014-07-21T07:54:43.509Z · LW(p) · GW(p)

I suspect that UDT players always reach the Nash bargaining solution although I have no proof.

For this game I proved that there is a local maximum of the Nash product when 1 plays B with probability 3/8 and C with probability 5/8 and 2 plays Y. I'm not sure whether it's global (can the Nash product have non-global local maxima?)
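A rough numerical check of those probabilities, under two assumptions that are not stated in the comment: the disagreement point is taken to be (3, 0), the payoffs when 1 plays A, and the search is restricted to the one-parameter family where 2 plays Y and 1 mixes only between B and C. It therefore says nothing about whether the maximum is global.

```python
from fractions import Fraction

# Assumption: disagreement point = payoffs when player 1 plays A.
DISAGREEMENT = (Fraction(3), Fraction(0))

def nash_product(b):
    """Nash product when 1 plays B with probability b, C with 1 - b, and 2 plays Y."""
    u1 = 2 * b + 6 * (1 - b)
    u2 = 2 * b
    return (u1 - DISAGREEMENT[0]) * (u2 - DISAGREEMENT[1])

# Exact grid search over b = 0, 1/1000, ..., 1.
grid = [Fraction(k, 1000) for k in range(1001)]
best_b = max(grid, key=nash_product)

print("B:", best_b, "C:", 1 - best_b, "product:", nash_product(best_b))
```

Under these assumptions the product is (3 - 4b)(2b), which peaks at b = 3/8, consistent with the 3/8 and 5/8 quoted above.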

## comment by Dagon · 2014-07-20T20:17:33.132Z · LW(p) · GW(p)

"assume both players are rational. What should player 2 do when player 1 acts irrationally?"

Player 2 should realize that his model is incorrect and come up with a new theory for player 1's motivation. If one is human-like, then two should guess that one was tempted by the shot at the 6 payoff and chose C. Play X.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T20:29:04.641Z · LW(p) · GW(p)

But unless you can prove that Player 2 would respond with X, you can't tell me that Player 1 is irrational for not picking A.

## comment by blake8086 · 2014-07-20T18:57:16.734Z · LW(p) · GW(p)

I think you can just compute the Nash Equilibria. For example, use this site: http://banach.lse.ac.uk/

The answer appears to be "always pick A". Player 2 will never get to move.

Replies from: James_Miller, satt## ↑ comment by James_Miller · 2014-07-20T18:59:01.618Z · LW(p) · GW(p)

In the Nash equilibrium, what is Player 2's belief if he gets to move? Also, the link you gave is for solving simultaneous move games, and the game I presented is a sequential move game.

Replies from: Metus, ThisSpaceAvailable, blake8086## ↑ comment by Metus · 2014-07-20T19:04:19.791Z · LW(p) · GW(p)

That player one is not rational and he should abandon classical game theory.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T19:07:37.595Z · LW(p) · GW(p)

You are implicitly using circular reasoning. Not picking A is only irrational for some but not all possible beliefs that Player 2 could have if Player 1 does not pick A. And even if we grant your assumption, what should Player 2 do if he gets to move, and if your answer allows for the possibility that he picks Y how can you be sure that Player 1 is irrational?

Replies from: Metus## ↑ comment by Metus · 2014-07-20T19:19:03.947Z · LW(p) · GW(p)

I'm not using circular reasoning. The choice for player one between A and either B or C is a choice between a certain payoff of 3 and an as yet uncertain payoff. If player one already chose to play either B or C, the game transforms into a game with a simple 2x2 payoff matrix. Writing down the matrix we see that there is no pure dominant strategy for either player. We know though that there is a mixed strategy equilibrium *as there always is one*. Player one assumes that player two will play such that player one's choice does not matter and equalises his expected payoff to 2. Player two again assumes that player one plays in such a way that their choice does not matter and equalises his expected payoff to 2/3. As the expected payoff for player one in the second game is lower than in the first game, at least one of the following assumptions has to be false about player one:

- Player one maximises expected utility in terms of game payoff
- Player one cares only about his own utility, not the utility of player two
- Player one assumes player two to not act according to similar principles

## ↑ comment by James_Miller · 2014-07-20T19:23:56.011Z · LW(p) · GW(p)

"If player [1] already chose to play either B or C, the game transforms into a game with a simple 2x2 payoff matrix."

No, because Player 2 knows you did not pick A and this might give him insight into what you did pick. So even after Player 1 picks B or C the existence of strategy A might still affect the game because of uncertainty.

Replies from: Metus## ↑ comment by Metus · 2014-07-20T19:31:00.137Z · LW(p) · GW(p)

Distinguish between the reasoning and the outcome. Game theoretic reasoning is memory-less: the exact choice of action of one player does not matter to the other one *in the hypothetical*. As the rules are known, both players come to the same conclusion and can predict how the game will play out. If in practice this model is violated by one player, the other player immediately knows that the first player is irrational.

## ↑ comment by James_Miller · 2014-07-20T19:33:45.750Z · LW(p) · GW(p)

"Game theoretic reasoning is memory-less"

No. Consider the tit-for-tat strategy in the infinitely repeated prisoners' dilemma game.

Why is it irrational for Player 1 to not pick A? Your answer must include beliefs that Player 2 would have if he gets to move.

## ↑ comment by ThisSpaceAvailable · 2014-07-24T02:58:57.260Z · LW(p) · GW(p)

Sequential move games are essentially a subset of simultaneous move games. If two players both write source code for programs that will play a sequential move game, then the writing of the code is a simultaneous move game.

## ↑ comment by blake8086 · 2014-07-20T21:53:04.797Z · LW(p) · GW(p)

I don't think your game is sequential, if Player 2 doesn't know Player 1's move.

You really have two games:

Game 1: Sequential game of Player 1 chooses A or B/C, and determines whether game 2 occurs.

Game 2: Simultaneous game of Player 2 maybe choosing X or Y, against Player 1's unknown selection of B/C.

edit: And the equilibrium case for Player 1 in the second game is an expected payout of 2, so he should always choose A.

## ↑ comment by satt · 2014-07-20T23:10:17.274Z · LW(p) · GW(p)

I think you can just compute the Nash Equilibria.

Things don't feel so simple to me. (A, X) is a Nash equilibrium (and the only pure strategy NE for this game), but is nonetheless unsatisfactory to me; if player 1 compares that pure strategy against the mixed strategy proposed by Wei_Dai, they'll choose to play Wei_Dai's strategy instead. Nash equilibrium doesn't seem to be a strong enough requirement ("solution concept") to force a plausible-looking solution. [**Edit:** oops, disregard this paragraph. I misinterpreted Wei_Dai's solution so switching to it from the NE pure equilibrium won't actually get player 1 a better payoff.]
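The claim that (A, X) is the only pure strategy Nash equilibrium can be checked by brute force. The sketch below treats the game in normal form, where player 2's strategy is a plan for what to do if asked to move and both (A, X) and (A, Y) pay (3, 0); the helper names are illustrative.

```python
# Normal-form payoffs: (player 1, player 2).
U = {
    ("A", "X"): (3, 0), ("A", "Y"): (3, 0),
    ("B", "X"): (2, 0), ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1), ("C", "Y"): (6, 0),
}
ROWS, COLS = "ABC", "XY"

def is_pure_nash(row, col):
    """True if neither player gains by unilaterally switching to another pure strategy."""
    u1, u2 = U[(row, col)]
    row_ok = all(U[(r, col)][0] <= u1 for r in ROWS)
    col_ok = all(U[(row, c)][1] <= u2 for c in COLS)
    return row_ok and col_ok

pure_nash = [(r, c) for r in ROWS for c in COLS if is_pure_nash(r, c)]
print(pure_nash)
```

The only profile that survives is (A, X): at (A, Y) player 1 would switch to C, and at (C, Y) player 2 would switch to X.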

(I also tried computing the mixed strategy NE by finding the player 1 move probabilities that maximized their expected return, but obtained a contradiction! Maybe I screwed up the maths.)

## comment by DanielLC · 2014-07-21T01:45:02.024Z · LW(p) · GW(p)

I thought the answer was Player One picks B with 1/3+ϵ probability and C with 2/3-ϵ probability. Player Two picks Y.

This gives Player One an expected value of 2(1/3+ϵ) + 6(2/3-ϵ) = 14/3-4ϵ and Player Two an expected value of 2(1/3+ϵ) = 2/3+2ϵ.

If Player Two picked X, he'd have an expected value of 2/3-ϵ, and miss out on 3ϵ.

If Player One picked B with a higher probability, Player Two would still pick Y, and Player One wouldn't gain anything. If Player One picked C with a higher probability, Player Two would pick X, and Player One would get nothing. If Player One picked A, he'd only get 3, and miss out on 5/3-4ϵ.
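The arithmetic above can be checked with exact fractions. The value ϵ = 1/100 is an arbitrary illustrative choice, and the sketch only verifies the expected values; it does not address whether the proposed strategies form an equilibrium.

```python
from fractions import Fraction

eps = Fraction(1, 100)       # illustrative value of epsilon
p_b = Fraction(1, 3) + eps   # Player One plays B
p_c = Fraction(2, 3) - eps   # Player One plays C

one_if_two_plays_y = 2 * p_b + 6 * p_c   # Player One's expected value against Y
two_if_y = 2 * p_b                       # Player Two's expected value from Y
two_if_x = 1 * p_c                       # Player Two's expected value from X

print(one_if_two_plays_y, two_if_y, two_if_x)
print("Two's loss from X:", two_if_y - two_if_x)
print("One's loss from A:", one_if_two_plays_y - 3)
```

The results agree with the expressions in the comment: 14/3-4ϵ for Player One, 2/3+2ϵ versus 2/3-ϵ for Player Two, and a gap of 5/3-4ϵ over picking A.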

Did I mess up somewhere?

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-21T02:00:59.674Z · LW(p) · GW(p)

If Player One believes that Player Two is going to pick Y, then Player One will pick C, but of course this isn't an equilibrium since Player Two would regret his strategy. All Player Two ever sees is Player One's move, not the probabilities that Player One might have used, so if C is played Player Two doesn't know if it was because Player One played C with probability 1 or probability 2/3-ϵ.

Replies from: DanielLC## comment by shminux · 2014-07-20T19:23:48.406Z · LW(p) · GW(p)

So what happens if Player Two does get to move?

This is equivalent to the implicit "what if Omega can't predict what I would do?" reasoning done by two-boxers in Newcomb. **Neither possibility is in the solution domain, provided that one does not fight the hypothetical** (in your case "the players are rational"; in Newcomb's, "Omega is a perfect predictor"). Player two does not get to move, so there is no point considering that. Omega knows exactly what you'd do, so no point considering what to do if he is wrong.

## ↑ comment by buybuydandavis · 2014-07-20T21:33:57.465Z · LW(p) · GW(p)

And this is the problem I have with all the Newcomb/Omega business.

The hypothetical should be fought. We should no more assign absolute certainty to Omega's predictive power than we should assign absolute certainty to CDT's predictive power.

Instead of assigning 0 probability to the theory that Omega can make a mistake, assign DeltaOmega. Similarly assign DeltaCDT to the probability that CDT analysis is wrong. I'm too lazy to actually do the math, but do you have any doubt that the right decision will depend on the ratio of the two deltas?

There's really nothing to see here. This is just another case of generating paradoxes in probability theory when you don't do a full analysis using finite, nonzero probability assignments.

This is a similar issue the OP has come up against. Proposition1 is that A obeys certain game theoretic rules. Proposition2 is the report that implicates A violating those rules. When your propositions seem mutually contradictory, because you have lazily assigned 0 probability to them, hilarity ensues. Assign finite values, and the mysteries are resolved.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T21:37:44.334Z · LW(p) · GW(p)

You are basically using the trembling hand equilibrium concept. I picked the payoffs so this would not yield an easy solution. Consider an equilibrium where Player 1 intends to pick A, but there is a small but equal chance he will pick B or C by mistake. In this equilibrium Player 2 would pick Y if he got to move, but then Player 1 would always intend to pick C, effectively pretending he had made a mistake.
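The chain of best responses described here can be traced step by step. This is a simplified sketch: it represents Player 2's belief as a single probability that B rather than C was played, and it ignores the small tremble probabilities once Player 1 deviates deliberately. The function names are illustrative.

```python
from fractions import Fraction

# (Player 1, Player 2) payoffs after B or C; A pays Player 1 a sure 3.
U = {("B", "X"): (2, 0), ("B", "Y"): (2, 2),
     ("C", "X"): (0, 1), ("C", "Y"): (6, 0)}
ONE_PAYOFF_FROM_A = 3

def two_best_reply(belief_b):
    """Player 2's best move given the probability that B (rather than C) was played."""
    eu = {move: belief_b * U[("B", move)][1] + (1 - belief_b) * U[("C", move)][1]
          for move in "XY"}
    return max(eu, key=eu.get)

def one_best_intention(two_move):
    """Player 1's best intended move if he expects Player 2 to play `two_move`."""
    options = {"A": ONE_PAYOFF_FROM_A,
               "B": U[("B", two_move)][0],
               "C": U[("C", two_move)][0]}
    return max(options, key=options.get)

reply_to_equal_trembles = two_best_reply(Fraction(1, 2))      # B and C equally likely mistakes
one_then_intends = one_best_intention(reply_to_equal_trembles)
reply_to_deliberate_c = two_best_reply(Fraction(0))           # C is now (almost) certain
one_after_that = one_best_intention(reply_to_deliberate_c)

print(reply_to_equal_trembles, one_then_intends, reply_to_deliberate_c, one_after_that)
```

The sequence runs Y, then C, then X, then A: each belief about the other player undermines the behaviour that justified it.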

## ↑ comment by James_Miller · 2014-07-20T19:28:51.224Z · LW(p) · GW(p)

Player two does not get to move, so there is no point considering that.

First, you are implicitly using circular reasoning. You can not tell me that picking B or C is irrational until you tell me what beliefs Player 2 would have if B or C were picked.

Also, imagine you are playing the game against someone you think is rational. You are Player 2. You are told that A was not picked. What do you do?

Replies from: VAuroch, shminux## ↑ comment by VAuroch · 2014-07-20T21:45:48.559Z · LW(p) · GW(p)

Also, imagine you are playing the game against someone you think is rational. You are Player 2. You are told that A was not picked. What do you do?

If I think Player 1 is rational, I assume he must be modeling my decision-making process somehow. If his model of my decision-making process makes picking B or C seem rational, he must be modeling my choice of X and Y in a way that gives him a chance of a higher payoff than he can get by choosing A. Since every combination of (B,C) and (X,Y) is lower than his return from A except [C,Y], no model of my decision-making process would make B a good option, while some models (though inaccurate) would recommend C as a potentially good option. So while it's uncertain, it's very likely I'm at C. In that case, I should pick X, and shake my head at my opponent for drastically discounting how rational *I* am, if he thought he could somehow go one level higher and get the big payoff.

## ↑ comment by shminux · 2014-07-20T19:34:30.235Z · LW(p) · GW(p)

imagine you are playing the game against someone you think is rational. You are Player 2. You are told that A was not picked.

That's the contradiction right there. If you are player 2 and get to move, Player 1 is not rational, because you can always reduce their payoff by picking X.

Replies from: Vladimir_Nesov, satt, James_Miller## ↑ comment by Vladimir_Nesov · 2014-07-20T20:03:00.031Z · LW(p) · GW(p)

Your behavior in impossible-in-reality but in some sense possible-to-think-about situations may well influence others' decisions, so it may be useful to decide what to do in impossible situations if you expect to be dealing with others who are moved by such considerations. Since decisions make their alternatives impossible, but are based on evaluation of those alternatives, considering situations that eventually turn out to be impossible (as a result of being decided to become impossible) is a very natural thing to do.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T20:30:41.793Z · LW(p) · GW(p)

But why is not picking A "impossible-in-reality"? You can not answer until you tell me what Player 2's beliefs would be if A was not picked.

Replies from: Vladimir_Nesov, LimberLarry## ↑ comment by Vladimir_Nesov · 2014-07-20T21:04:14.589Z · LW(p) · GW(p)

I was making the more general point that impossible situations (abstract arguments that aren't modeled by any of the "possible" situations being considered) can matter, that impossibility is not necessarily significant. Apart from that, I agree that we don't actually have a good argument for impossibility of any given action by Player 1, if it depends on what Player 2 could be thinking.

## ↑ comment by LimberLarry · 2014-07-21T11:00:33.271Z · LW(p) · GW(p)

Because for Player 1 to increase his payoff over picking A, the only option he can choose is C, based on an accurate prediction via some process of reasoning that Player 2 will pick Y, thereby making a false prediction about Player 1's behaviour. You have stated both players are rational, so I will assume they have equal powers of reason, in which case if it is possible for Player 2 to make a false prediction based on their powers of reason then Player 1 must be equally capable of making a wrong prediction, meaning that Player 1 should avoid the uncertainty and always go for the guaranteed payoff.

Replies from: LimberLarry## ↑ comment by LimberLarry · 2014-07-21T11:03:13.505Z · LW(p) · GW(p)

To formulate this mathematically you would need to determine the probability of making a false prediction and factor that into the odds, which I regret is beyond my ability.

## ↑ comment by satt · 2014-07-20T23:54:18.537Z · LW(p) · GW(p)

That's the contradiction right there. If you are player 2 and get to move, Player 1 is not rational, because you can always reduce their payoff by picking X.

Note that "each player cares only about maximizing his own payoff". By assumption, player 2 has only a selfish preference, not a sadistic one, so they'll only choose X (or be more likely to choose X) if they expect that to improve their own expected score. *If* player 1 can credibly expect player 2 to play Y often enough when given the opportunity, it is not irrational for player 1 to give player 2 that opportunity by playing B or C.

## ↑ comment by James_Miller · 2014-07-20T19:37:56.366Z · LW(p) · GW(p)

Please answer the question, what would you do if you are player 2 and get to move? Might you pick Y? And if so, how can you conclude that Player 1 was irrational to not pick A?

Replies from: shminux## ↑ comment by shminux · 2014-07-20T22:35:40.276Z · LW(p) · GW(p)

what would you do if you are player 2 and get to move?

I will realize that I was lied to, and the player 1 is not rational. Now, if you are asking what player 2 should do in a situation where Player 1 does not follow the best possible strategy, I think Eliezer's solution above works in this case. Or Emile's. It depends on how you model irrationality.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T05:13:08.834Z · LW(p) · GW(p)

I don't agree since you can't prove that not picking A is irrational until you tell me what player 2 would do if he gets to move and we can't answer this last question.

## comment by **[deleted]** · 2014-07-21T14:12:07.386Z · LW(p) · GW(p)

Hmm. The results appear quite different if you allow communication and repeated plays. And they also introduce something which seems slightly different from Trembling Hand (perhaps Trembling Memory?).

With communication and repeated plays:

Assume all potential Player 1's credibly precommit to flip a fair coin and based on the toss, pick B half the time, and C half the time.

All potential Player 2's would know this, and assuming they expect Player 1 to almost always follow the precommitment, would pick Y, because they would maximize their expected payout. (50% chance of 2 means expected payout of 1, compared to picking X, where 50% chance of 1 means expected payout of 1/2.)

All potential Player 1's, based on following that precommitment universally, and having Player 2's always pick Y, will get 2 50% of the time and 6 50% of the time, which would get an expected per game payout of 4.

This seems better for everyone (by about one point per game) than Player 1 only choosing A.

With Trembling Memory:

Assume further that some of the time, Player 2's memory trembles and he forgets about the precommitments and the fact that this game is played repeatedly.

So if Player 2 suspects, for instance, that they MIGHT be in the case above, but have forgotten important facts (they are incorrect about this being a one-off game, they are incorrectly assessing the state of common knowledge, but they are correctly assessing the current payoff structure of this particular game) then following those suspicions, it would still make sense for them to choose Y, and it would also explain why Player 1 chose something that wasn't A.

However, nothing would seem to prevent Player 2 from suspecting other possibilities. (For instance, under the assumption that Player 1 hits Player 2 with Amnesia dust before every game, knowing that Player 2 will be forced into a memory tremble and will believe the above, Player 1 could play C every time, with Player 2 drawing predictably incorrect conclusions and playing Y at no benefit.)

I'm not sure how to model a situation with trembling memory, though, so I would not be surprised if I was missing something.

## comment by LimberLarry · 2014-07-21T04:21:42.766Z · LW(p) · GW(p)

I'm not overly familiar with game theory, so forgive me if I'm making some elementary mistake, but surely the only possible outcome is Player 1 always picking A. Either other option is essentially Player 1 choosing a smaller or no payoff, which would violate the stated condition of both players being rational. A nonsensical game doesn't have to make sense.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-21T04:58:12.870Z · LW(p) · GW(p)

To know that A gives you a higher payoff than C you have to know what Player 2 would do if he got to move, but since Player 2 expects to never move how do you figure this out?

Replies from: LimberLarry## ↑ comment by LimberLarry · 2014-07-21T05:32:03.357Z · LW(p) · GW(p)

Right that makes sense, but wouldn't Player 1 simply realize that making an accurate forecast of player 2's actions is functionally impossible, and still go with the certain payout of A?

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T02:56:25.040Z · LW(p) · GW(p)

By definition of rationality in game theory, Player 1 will maximize his expected payoff and so needs to have some belief as to the probabilities. If you can't figure out a way of estimating these probabilities, the game has no solution in classical game theory land.

Replies from: LimberLarry## ↑ comment by LimberLarry · 2014-07-22T04:48:42.196Z · LW(p) · GW(p)

Well, as I said, I'm not familiar with the mathematics or rules of game theory, so the game may well be unsolvable in a mathematical sense. However, it still seems to me that Player 1 choosing A is the only rational choice. Having thought about it some more I would state my reasoning as follows. For Player 1, there is NO POSSIBLE way for him to maximize utility by selecting B in a non-iterated game; it cannot ever be a rational choice, and you have stated the player is rational. Choosing C can conceivably result in greater utility, so it can't be immediately discarded as a rational choice. If Player 2 finds himself with a move against a rational player, then the only possible choice that player could have made is C, so a rational Player 2 must choose X. Both players, being rational, can see this, and so Player 1 cannot possibly choose anything other than A without being irrational. Unless you can justify some scenario in which a rational player can maximize utility by choosing B, neither player can consider that as a rational option.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T05:09:33.197Z · LW(p) · GW(p)

Then please answer the question, "if Player 2 gets to move what should he believe Player 1 has picked?"

Until you can answer this question you can not solve the game. If it is not possible to answer the question, then the game can not be solved. I know that you want to say "Not picking A would prove Player 1 is irrational" but you can't claim this until you tell me what Player 2 would do if he got to move, and you can't answer this last question until you tell me what Player 2 would believe Player 1 had done if Player 1 does not pick A.

Replies from: LimberLarry## ↑ comment by LimberLarry · 2014-07-22T05:24:17.328Z · LW(p) · GW(p)

If Player 2 gets to move, then the only possible choice for a rational Player 1 to have made is to pick C, because B cannot possibly maximize Player 1's utility. The probability for a rational Player 1 to pick B is always 0, so the probability of picking C has to be 1. For Player 1, there is no rational reason to ever pick B, and picking C means that a rational Player 2 will always pick X, negating Player 1's utility. So a rational Player 1 must pick A.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T21:33:38.069Z · LW(p) · GW(p)

So are you saying that if Player 2 gets to move he will believe that Player 1 picked C?

Replies from: LimberLarry## ↑ comment by LimberLarry · 2014-07-23T00:50:36.192Z · LW(p) · GW(p)

Yes.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-23T04:34:32.619Z · LW(p) · GW(p)

But this does not make sense because then player 1 will know that player 2 will play X, so Player 1 would have been better off playing A or B over C.

Replies from: LimberLarry, LimberLarry## ↑ comment by LimberLarry · 2014-07-23T05:11:49.972Z · LW(p) · GW(p)

You seem to be treating the sub-problem, "what would Player 2 believe if he got a move" as if it is separate from and uninformed by Player 1's original choice. Assuming Player 1 is a utility-maximizer and Player 2 knows this, Player 2 immediately knows that if he gets a move, then Player 1 believed he could get greater utility from either option B or C than he could get from option A. As option B can never offer greater utility than option A, a rational Player 1 could never have selected it in preference to A. But of course that only leaves C as a possibility for Player 1 to have selected and Player 2 will select X and deny any utility to Player 1. So neither option B nor C can ever produce more utility than option A if both players are rational.
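The chain of reasoning in this comment can be written out as a short sketch. The names `P1`, `P2` and `strictly_dominated` are illustrative. Step 2 encodes the assumption that Player 2, on getting to move, still believes only undominated moves could have been played; that is exactly the assumption disputed elsewhere in this thread, so the sketch shows what the argument implies rather than settling whether it is valid.

```python
# Player 1's payoffs by (own move, Player 2's reply); A ends the game at 3.
P1 = {"A": {"X": 3, "Y": 3}, "B": {"X": 2, "Y": 2}, "C": {"X": 0, "Y": 6}}
# Player 2's payoffs by (Player 1's move, own reply), defined only if he moves.
P2 = {"B": {"X": 0, "Y": 2}, "C": {"X": 1, "Y": 0}}

def strictly_dominated(move):
    """True if some other move of Player 1 beats `move` against every reply."""
    return any(
        all(P1[other][r] > P1[move][r] for r in ("X", "Y"))
        for other in P1 if other != move
    )

# Step 1: among the moves that let Player 2 act, drop the strictly dominated ones.
surviving = [m for m in ("B", "C") if not strictly_dominated(m)]

# Step 2 (assumption): Player 2 believes a surviving move was played and best-responds.
believed_move = surviving[0]
reply = max(("X", "Y"), key=lambda r: P2[believed_move][r])

# Step 3: Player 1 best-responds to the reply he anticipates.
p1_choice = max(P1, key=lambda m: P1[m][reply])
```

With the post's payoffs, only C survives step 1, the reply is X, and Player 1's best response to X is A.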

## ↑ comment by LimberLarry · 2014-07-23T04:48:14.801Z · LW(p) · GW(p)

Exactly, but B is never a preferable option over A, so the only rational option is for Player 1 to have chosen A in the first place, so any circumstance in which Player 2 has a move necessitates an irrational Player 1. The probability of Player 1 ever selecting B to maximize utility is always 0.

## comment by ike · 2014-09-04T14:43:58.217Z · LW(p) · GW(p)

Semi-plausible interpretation of the game:

Player One and Two are in a war. Player One can send a message to his team, and Player Two can intercept it. There is a price for sending the message, and a price for intercepting it.

A is "don't send any message". B is "send a useless (blank, or random) message". (Uses up 1 utility. Also gives the other player 2 free utility points.) C is "send a useful message". (Uses up 1 utility, but gains 4 if not intercepted, and loses extra 1 if intercepted.) X is "intercept", which costs 2 utility. However, intercepting a useful message gains you 3 utility (after paying 2). Y is "don't intercept".

(I think I got the numbers right, if they don't match up somewhere, tell me and I'll adjust them.)

The analog to the problem is this: sending a useless message can never beat doing nothing. (We're assuming you don't gain from the opponent wasting energy. It's possible if the energy is small enough to be outweighed by the cost of sending the message.) So if Player Two sees a message, he should always assume it is useful. Therefore Two will intercept all messages. Therefore Player One should never send any messages.

However, if One does, in fact, send a message, then Two can't rely on his rationality, and can't assume the message has value to intercept.

## comment by Punoxysm · 2014-07-22T18:19:20.619Z · LW(p) · GW(p)

This can be analyzed as a regular 2-player game with payoff matrix

-- X Y

A 3,0 3,0

B 2,0 2,2

C 0,1 6,0

Player 2's indifference between X and Y when player 1 plays A means that player 2 only considers whether player 1 plays B or C.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T21:30:34.993Z · LW(p) · GW(p)

Your model doesn't incorporate the uncertainty of my game. Even if Player 2 knows that Player 1 didn't play A, the fact that he could have impacts his estimate of whether Player 1 picked B or C.

Replies from: Punoxysm## ↑ comment by Punoxysm · 2014-07-23T00:30:40.131Z · LW(p) · GW(p)

I'm saying that Player 2's reward is strictly controlled by whatever fraction of the time player 1 plays B or C, since if player 1 plays A player 2's reward is guaranteed to be zero, and diminishes expected reward from X and Y in the same proportion.

If player 1 moves and, whichever of A, B or C they pick, player 2 is told only "player 1 picked A, B or C", then player 2 can reduce it to only considering the possibility of B and C, because even though A strictly dominates B, player 2's reward is only non-zero in the case where B or C is played.

This analysis would change if A,X were 3,0.5 or even 3,0.01
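A minimal sketch of this point, using the simultaneous-move form described in this comment. The function name `p2_expected`, the parameter `p2_payoff_after_a`, and the particular mix for player 1 are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

def p2_expected(mix, reply, p2_payoff_after_a=0):
    """Player 2's expected payoff in the simultaneous-move form, given
    player 1's mix (prob_a, prob_b, prob_c) and player 2's fixed reply."""
    prob_a, prob_b, prob_c = mix
    after_b = {"X": 0, "Y": 2}[reply]
    after_c = {"X": 1, "Y": 0}[reply]
    # p2_payoff_after_a applies to (A, X) only; (A, Y) stays at 0.
    after_a = p2_payoff_after_a if reply == "X" else 0
    return prob_a * after_a + prob_b * after_b + prob_c * after_c

# Hypothetical mix: A 80% of the time, B 15%, C 5%.
mix = (Fraction(4, 5), Fraction(3, 20), Fraction(1, 20))

original_x = p2_expected(mix, "X")                  # 1/20
original_y = p2_expected(mix, "Y")                  # 3/10
changed_x = p2_expected(mix, "X", Fraction(1, 2))   # 9/20
changed_y = p2_expected(mix, "Y", Fraction(1, 2))   # 3/10
```

With a payoff of 0 after A, the comparison between X and Y depends only on the weights on B and C, and Y comes out ahead for this mix. Once (A,X) pays player 2 0.5, the weight on A enters the comparison and X comes out ahead instead.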

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-23T04:37:11.429Z · LW(p) · GW(p)

We look at game theory in different ways. By my analysis it is irrelevant what Player 2 would get if A were played, it could be $1 trillion or -$1 trillion and it would have no impact on the game as I see it. But then I don't use timeless decision theory, and you might be. This could be the source of our disagreement.

Replies from: Punoxysm## comment by drethelin · 2014-07-20T22:15:46.702Z · LW(p) · GW(p)

What is this game supposed to analogize to in reality? I usually don't like to fight the hypothetical but in this sort of situation I feel like as player 2 I would assume they are including considerations from outside the game's rationality like a sense of fairness or generosity.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T22:35:37.668Z · LW(p) · GW(p)

As far as I know the game has no direct real life analogy. But I believe it shows a weakness of game theory, and shows that sometimes to know what you are going to do you need to take into account what another person would do if that person were in a situation they would never expect to be in.

## comment by twanvl · 2014-07-20T20:51:54.794Z · LW(p) · GW(p)

This game is exactly equivalent to the standard one where player one chooses from (A,B,C) and player two chooses from (X,Y), with the payoff for (A,X) and for (A,Y) equal to (3,0). When choosing what choice to make, player two can ignore the case where player one chooses A, since the payoffs are the same in that case.

And as others have said, the pure strategy (A,X) is a Nash equilibrium.
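The Nash claim can be checked mechanically on the simultaneous-move table described above. This is a sketch that only examines pure strategies; the names `GAME` and `pure_nash` are illustrative.

```python
from itertools import product

# Payoffs (player 1, player 2) in the simultaneous-move form.
GAME = {
    ("A", "X"): (3, 0), ("A", "Y"): (3, 0),
    ("B", "X"): (2, 0), ("B", "Y"): (2, 2),
    ("C", "X"): (0, 1), ("C", "Y"): (6, 0),
}
ROWS, COLS = ("A", "B", "C"), ("X", "Y")

def pure_nash(game):
    """Pure-strategy profiles where neither player gains by deviating alone."""
    found = []
    for r, c in product(ROWS, COLS):
        u1, u2 = game[(r, c)]
        row_ok = all(game[(alt, c)][0] <= u1 for alt in ROWS)
        col_ok = all(game[(r, alt)][1] <= u2 for alt in COLS)
        if row_ok and col_ok:
            found.append((r, c))
    return found

equilibria = pure_nash(GAME)
```

In this sketch (A,X) passes the check: player 1 cannot do better than 3 against X, and player 2 gets 0 either way. Mixed replies by player 2 are not examined here.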

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-20T21:04:32.041Z · LW(p) · GW(p)

It's not equivalent because of the uncertainty.

Also, even if it were, lots of games have Nash equilibria that are not reasonable solutions, so saying "this is a Nash equilibrium" doesn't mean you have found a good solution. For example, consider the simultaneous move game where we each pick A or B. If we both pick B we both get 1. If anything else happens we both get 0. Both of us picking A is a Nash equilibrium, but is also clearly unreasonable.

Replies from: twanvl## ↑ comment by twanvl · 2014-07-21T10:51:43.319Z · LW(p) · GW(p)

> It's not equivalent because of the uncertainty.

Could you explain what you mean? What uncertainty is there?

> Also, even if it were, lots of games have Nash equilibria that are not reasonable solutions, so saying "this is a Nash equilibrium" doesn't mean you have found a good solution.
>
> For example, consider the simultaneous move game where we each pick A or B. If we both pick B we both get 1. If anything else happens we both get 0. Both of us picking A is a Nash equilibrium, but is also clearly unreasonable.

This game has two equilibria: a bad one at (A,A) and good one at (B,B). The game from this post also has two equilibria, but both involve player one picking A, in which case it doesn't matter what player two does (or in your version, he doesn't get to do anything).

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T03:01:44.603Z · LW(p) · GW(p)

> Could you explain what you mean? What uncertainty is there?

If Player 2 gets to move he is uncertain as to what Player 1 did. He might have a different probability estimate in the game I gave than one in which strategy A did not exist, or one in which he is told what Player 1 did.
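To make the role of that probability estimate concrete, here is a small sketch of Player 2's best reply as a function of his belief, using the payoffs from the post. The function name `p2_best_reply` is illustrative, and the sketch says nothing about which belief Player 2 ought to hold.

```python
from fractions import Fraction

def p2_best_reply(prob_b):
    """Player 2's payoff-maximizing reply given belief prob_b that B was played
    (and 1 - prob_b that C was played), conditional on getting to move."""
    value_x = prob_b * 0 + (1 - prob_b) * 1  # X pays 0 after B, 1 after C
    value_y = prob_b * 2 + (1 - prob_b) * 0  # Y pays 2 after B, 0 after C
    if value_x == value_y:
        return "indifferent"
    return "X" if value_x > value_y else "Y"

replies = {p: p2_best_reply(p) for p in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2))}
```

Player 2 prefers X when he puts less than 1/3 on B, prefers Y when he puts more than 1/3 on B, and is indifferent at exactly 1/3, so the answer to "what would Player 2 do?" turns entirely on that belief.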

I'm not convinced that the game has any equilibrium unless you allow for trembling hands. For A,A to be an equilibrium you have to tell me what belief Player 2 would have if he got to move, or tell me that Player 1's belief about Player 2's belief can't affect the game.

Replies from: twanvl## ↑ comment by twanvl · 2014-07-22T10:08:18.849Z · LW(p) · GW(p)

> If Player 2 gets to move he is uncertain as to what Player 1 did. He might have a different probability estimate in the game I gave than one in which strategy A did not exist, or one in which he is told what Player 1 did.

In a classical game all the players move simultaneously. So to repeat, your game is:

- player 1 chooses A, B or C
- then, player 2 is told whether player 1 chose one of B or C (but not which), and in that case he chooses X or Y
- payoffs are (A,-) -> (3,0); (B,X) -> (2,0); (B,Y) -> (2,2); (C,X) -> (0,1); (C,Y) -> (6,0)

The classical game equivalent is

- player 1 chooses A, B or C
- without being told the choice of player 1, player 2 chooses X or Y
- payoffs are as before, with (A,X) -> (3,0); (A,Y) -> (3,0).

I hope you agree that the fact that player 2 gets to make a (useless) move in the case that player 1 chooses A doesn't change the fundamentals of the game.

In this classic game player 2 also has less information before making his move. In particular, player 2 is not told whether or not player 1 chose A. But this information is completely irrelevant for player 2's strategy, since if player 1 chooses A there is nothing that player 2 can do with that information.

> I'm not convinced that the game has any equilibrium unless you allow for trembling hands.

If the players choose (A,X), then the payoff is (3,0). Changing his choice to B or C will not improve the payoff for player 1, and switching to Y doesn't improve the payoff for player 2. Therefore this is a Nash equilibrium. It is not stable, since player 2 can switch to Y without getting a worse payoff.

Replies from: James_Miller## ↑ comment by James_Miller · 2014-07-22T21:40:56.346Z · LW(p) · GW(p)

> In a classical game all the players move simultaneously.

I'm not sure what you mean by "classical game" but my game is not a simultaneous move game. Many sequential move games do not have equivalent simultaneous move versions.

"I hope you agree that the fact that player 2 gets to make a (useless) move in the case that player 1 chooses A doesn't change the fundamentals of the game."

I do not agree. Consider these payoffs for the same game:

A 3,0 [And Player Two never got to move.]

B,X 2,10000

B,Y 2,2

C,X 0,1

C,Y 4,4

Now although Player 1 will never pick A, its existence is really important to the outcome by convincing Player 2 that if he moves C has been played.
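A rough sketch of why Player 2's belief matters so much with these payoffs. The helper `indifference_belief` is illustrative and assumes, as holds in this modified game, that X is Player 2's better reply after B and Y is his better reply after C.

```python
from fractions import Fraction

# Player 2's payoffs in the modified game, by (Player 1's move, reply).
P2_MODIFIED = {("B", "X"): 10000, ("B", "Y"): 2, ("C", "X"): 1, ("C", "Y"): 4}

def indifference_belief(p2):
    """Belief in B at which Player 2 is indifferent between X and Y.
    Assumes X is better after B and Y is better after C."""
    gain_x_after_b = p2[("B", "X")] - p2[("B", "Y")]
    gain_y_after_c = p2[("C", "Y")] - p2[("C", "X")]
    return Fraction(gain_y_after_c, gain_x_after_b + gain_y_after_c)

threshold = indifference_belief(P2_MODIFIED)
```

The threshold works out to 3/10001: if Player 2 puts even slightly more probability than that on B, he prefers X. Only a belief that B is essentially ruled out, which is what the presence of A supplies in the argument above, makes Y his reply.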

Replies from: twanvl## ↑ comment by twanvl · 2014-07-23T09:29:12.898Z · LW(p) · GW(p)

> I do not agree. Consider these payoffs for the same game: ...

Different payoffs imply a different game. But even in this different game, the simultaneous move version would be equivalent. With regards to choosing between X and Y, the existence of choice A still doesn't matter, because if player 1 chose A, X and Y have the same payoff. The only difference is how much player 2 knows about what player 1 did, and therefore how much player 2 knows about the payoff he can expect. But that doesn't affect his strategy or the payoff that he gets in the end.

## comment by JoeTheUser · 2014-07-22T19:21:50.028Z · LW(p) · GW(p)

There's no mathematical solution for single-player, non-zero sum games of any sort. All these constructs lead to is arguments about "what is rational". If you had a full math model of a "rational entity", then you could get a mathematically defined solution.

This is why I prefer evolutionary game theory to classical game theory. Evolutionary game theory generally has models of its actors and thus guarantees a solution to the problems it posits. One can argue with the models and I would say that's where such arguments most fruitfully should be.