Why do the empirical results of the Traveller’s Dilemma deviate strongly from the Nash Equilibrium and seem to be close to the social optimum?

post by RocksBasil (RorscHak) · 2019-05-23T12:36:47.564Z · LW · GW · 15 comments

Epistemic status: I think I'm over complicating the matter.

In the Traveller’s Dilemma (TD below for short), the only Nash Equilibrium in theory is for both players (I'll call them Alice and Bob) to reason their way down to the lowest claim (thanks to Stuart_Armstrong for pointing this out to me): starting from a claim of $100, Alice realises she can gain more by claiming $99, so Bob’s best choice is to claim $98, but Alice knows this too and claims $97, and so on…until the race to the bottom finishes at the lowest claim of $2.
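The undercutting argument can be traced mechanically. The sketch below assumes the usual TD rule, consistent with the numbers quoted in the comments: both players receive the lower claim, the lower claimant gets a $2 bonus and the higher claimant pays a $2 penalty. The names `payoff`, `best_response` and `undercut_path` are illustrative only.

```python
def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    # Lower claimant earns the bonus, higher claimant pays the penalty.
    return low + bonus if mine < theirs else low - bonus


def best_response(theirs, lo=2, hi=100):
    """Claim in [lo, hi] that maximises payoff against a known opposing claim."""
    return max(range(lo, hi + 1), key=lambda c: payoff(c, theirs))


def undercut_path(start=100):
    """Follow best responses from `start` until a claim is its own best response."""
    path = [start]
    while True:
        nxt = best_response(path[-1])
        if nxt == path[-1]:
            return path
        path.append(nxt)


path = undercut_path()
print(path[:4], "...", path[-3:])
```

Each step undercuts the previous claim by exactly $1, and the path only stops at $2, the one claim that is a best response to itself.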

Empirically this doesn’t seem to happen: both players are likely to be cooperative and bid high, which seems rather bizarre to me for a single-round simultaneous game.

Notice that TD is very similar to the dollar auction (DA below), a form of war of attrition: in the dollar auction, both players pay for their bids, with the higher bidding player receiving the auctioned dollar.

We can slightly modify the two-player dollar auction to make it even more similar to TD: the player with the lower bid would also have to pay for the same bid as the winner.

This modified dollar auction has the same payoff rules as a TD with the range of claims being (-infinity,0]

The only difference is the bidding process: in TD, both players choose their claims simultaneously, while in DA the two players engage in multiple rounds of competing bids.

Given the difference between TD and DA, I believe there is something wrong with the reasoning we assume when deriving the Nash Equilibrium of TD. If Alice believes that Bob is fully rational, she would not believe that Bob would follow this line of reasoning in his own head and give a claim of $2.

Imagine that Alice and Bob are allowed to communicate before choosing their claims, but they only discuss their claims in approximate terms (e.g. would it be a high claim close to $100? A low claim close to $2? Somewhere in between?)

Would a rational Alice want to convince Bob that she would give a low claim, or would she want to convince Bob that she would give a high claim?

If Alice convinced Bob that she would give a low claim, Bob’s best response is to give a low claim. Knowing this, Alice would give a low claim and both Alice and Bob will receive a low payoff.

While if Alice convinced Bob that she would give a high claim, Bob’s best response is to give a high claim. Knowing Bob will give a high claim, Alice would give a high claim and both Alice and Bob will receive a high payoff.

It appears that Alice has an incentive to convince Bob that she would give a high claim close to $100, instead of a low claim close to $2.
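As a rough check on this reasoning, the sketch below computes Bob's best response to a vague promise. It assumes the usual ±$2 TD rule and, purely for illustration, models "close to $100" as a uniform belief over $90..$100 and "close to $2" as a uniform belief over $2..$12. Those ranges and the function names are assumptions, not part of the original argument.

```python
from fractions import Fraction


def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + bonus if mine < theirs else low - bonus


def expected_payoff(mine, belief):
    """Exact average payoff of `mine` against a uniform belief over claims."""
    return Fraction(sum(payoff(mine, a) for a in belief), len(belief))


def best_response_to_belief(belief, lo=2, hi=100):
    """Claim in [lo, hi] with the highest expected payoff under `belief`."""
    return max(range(lo, hi + 1), key=lambda c: expected_payoff(c, belief))


high_promise = range(90, 101)  # assumed reading of "close to $100"
low_promise = range(2, 13)     # assumed reading of "close to $2"

reply_to_high = best_response_to_belief(high_promise)
reply_to_low = best_response_to_belief(low_promise)

print("reply to a high promise:", reply_to_high,
      "expected payoff:", float(expected_payoff(reply_to_high, high_promise)))
print("reply to a low promise:", reply_to_low,
      "expected payoff:", float(expected_payoff(reply_to_low, low_promise)))
```

Under these assumed beliefs, Bob's best reply stays inside the promised range in both cases, and replying to a high promise pays far more than replying to a low one.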

Also, even if Alice’s promise is not binding, her best response is to keep it: she runs a greater risk of losing $2 if she claims high after promising a low claim, and will likely lose a lot if she claims low after promising a high claim. As a result, Bob would still trust Alice’s promises even when he knows that Alice is fully capable of lying.

Conclusion:

If we imagine a round of “communication in approximate terms” for Alice and Bob, the line of reasoning for an equilibrium with both players bidding high becomes visible. A rational player would prefer the other player to believe that they will be cooperative in this game, and in this particular case they have an incentive to keep their promise of cooperation. Even if we disrupt the communication round and make each player’s promise invisible to the other (thereby creating a round with imperfect information, so that the resulting game is functionally identical to the original TD), each player can still guess what the other player would have communicated, and how they would plan their subsequent bid based on that unspoken communication.

I haven’t done anything to evaluate this process rigorously, as “low”, “middle” and “high” bids are rather vague terms that do not allow me to draw clear boundaries between them. However, it appears to me that the strategy for both players in the communication round would be a mixed strategy that skews towards the cooperative (high bid) end.

As a result, the Bob who claims $2 in Alice’s imagination is probably not a rational player. A rational Bob will promise a high claim and keep his promise.

I am aware that the vacuous terms of low, middle, and high claims are extremely slippery, but I believe the absence of precise information does stop TD from going continuously downhill: it is almost impossible to claim exactly $1 below the claim of the other player.

I think that's how it stays less disastrous than the dollar auction.

I think we can apply the same logic to the centipede game to explain why defecting in the first round is not empirically common: both players have an incentive to be believed to be cooperative until late in the game, and (depending on the parameters of the game) it is rational to keep the promise of long-term cooperation if the other player trusts you.

15 comments

Comments sorted by top scores.

comment by Zvi · 2019-05-23T13:32:01.388Z · LW(p) · GW(p)

I think your epistemic status is right. Nash is not a good guide to what happens in one-shot games.

You give a high bid in TD (assuming no one cares what the bags are worth, only EV) because your expected returns are higher when you bid high, given most others will either not get to the Nash logic, or get far enough to realize that many others won't get there and therefore they shouldn't use it, or realize that those who do get there will realize others won't get there, or even just realize that even if everyone gets there some will choose to ignore it because even if a few others ignore it, you do better that way. And so on.

A thought experiment is, what is the right mixed strategy if your opponent will know what your exact mixed strategy is? I think of this as 'asymmetric Nash.'

Replies from: Dagon

↑ comment by Dagon · 2019-05-24T18:55:25.605Z · LW(p) · GW(p)

Is a mixed strategy enough in this case, or does it require communication and trust (in this case, "trust" is equivalent to changing the payout structure to include points for self-image and social cohesion)?

A mixed-strategy would be to bid between $2 and $100 in some probability distribution that gives some weight to each value, and adds up to 1. Assume 1.01% to each value 2..100 to start. The hypothetical counter to this never includes 100 (99 dominates it as a pure strategy, and as a sub-strategy), so either distributes 1.02% to 2..99 or 1.01% to 2..98 and 2.02% to 99, unsure which, but it doesn't matter because we're going to iterate further. The obvious response to this counter-strategy is to never bid 99 or 100, redistributing those probabilities. Continue until you're 100% $2 bids.
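The elimination described here can be sketched as iterated removal of weakly dominated claims. The code assumes the usual ±$2 TD rule and only ever tests whether the highest remaining claim is weakly dominated by the one just below it, which is the step the argument relies on. Function names are illustrative.

```python
def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + bonus if mine < theirs else low - bonus


def weakly_dominates(a, b, opponents):
    """True if claim `a` is never worse than `b` and sometimes strictly better."""
    never_worse = all(payoff(a, o) >= payoff(b, o) for o in opponents)
    sometimes_better = any(payoff(a, o) > payoff(b, o) for o in opponents)
    return never_worse and sometimes_better


def iterated_elimination(lo=2, hi=100):
    """Repeatedly drop the top claim while the next claim down dominates it."""
    remaining = list(range(lo, hi + 1))
    while len(remaining) > 1:
        top, runner_up = remaining[-1], remaining[-2]
        if not weakly_dominates(runner_up, top, remaining):
            break
        remaining.pop()
    return remaining


survivors = iterated_elimination()
print(survivors)
print("uniform weight over 2..100:", round(100 / 99, 2), "%")
print("uniform weight over 2..99:", round(100 / 98, 2), "%")
```

The dominance is only weak: against any opposing claim below $99, claiming $99 and claiming $100 pay exactly the same.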

Nash is Nash, there's no asymmetry available. The only way to win is to play a different game - communication allows you to change the payoff matrix by getting your opponent to consider future interactions and image considerations as valid parts of the result.

Replies from: RorscHak

↑ comment by RocksBasil (RorscHak) · 2019-05-27T04:38:11.786Z · LW(p) · GW(p)

"in this case, "trust" is equivalent to changing the payout structure to include points for self-image and social cohesion"

I guess I'm just trying to model trust in TD without changing the payoff matrix. The payoff matrix of the "vague" TD works in promoting trust: a player has no incentive to break a promise.

Replies from: Dagon

↑ comment by Dagon · 2019-05-28T05:01:33.373Z · LW(p) · GW(p)

You're just avoiding acknowledging the change in payoff matrix, not avoiding the change itself. If "breaking a promise" has a cost or "keeping a promise" has a benefit (even if it's only a brief good feeling), that's part of the utility calculation, and is part of the actual payoff matrix used for decision-making.

Replies from: RorscHak

↑ comment by RocksBasil (RorscHak) · 2019-05-28T09:29:56.797Z · LW(p) · GW(p)

"breaking a promise" or "keeping a promise" has no intrinsic utilities here.

What I state is that under this formulation, if the other player believes your promise and plays the best response to your promise, your best response is to keep the promise.

Replies from: Dagon

↑ comment by Dagon · 2019-05-28T14:40:36.972Z · LW(p) · GW(p)

What utility do you get from keeping the promise, and how does it outweigh an extra $1 from bidding $99 (and getting $101) instead of $100?

If you're invoking Hofstadter's super-rationality (the idea that your keeping a promise is causally linked to the other person keeping theirs), fine. If you're acknowledging that you get outside-game utility from being a promise-keeper, also fine (but you've got a different payout structure than written). Otherwise, why are you giving up the $1?

And if you are willing to go $99 to get another $1 payout, why isn't the other player (kind of an inverse super-rationality argument)?

Replies from: RorscHak

↑ comment by RocksBasil (RorscHak) · 2019-05-29T06:01:16.550Z · LW(p) · GW(p)

My assumption is that promises are "vague": playing $99 and playing $100 both fulfil the promise of giving a high claim close to $100, a promise there is no incentive to break.

I think the vagueness stops the race to the bottom in TD, compared to the dollar auction in which every bid can be outmatched by a tiny step without risking going overboard immediately.

I do think I overcomplicated the matter to avoid modifying the payoff matrix.

comment by Dagon · 2019-05-23T13:40:01.277Z · LW(p) · GW(p)

It's very hard (perhaps not possible in humans) to have communication without changing the payoff matrix. As soon as the framing changes to make the other player slightly more trustworthy or empathetic, the actual human evaluation will include other factors (kindness, self-image, etc.). In other words, most people's utility function _does_ include happiness of others. The terms can vary widely, and even vary in sign, based on framing and evaluation of the other, though.

More importantly, the Nash equilibrium is kind of irrelevant to non-zero-sum games. There is no reason to believe that any optimization process is seeking it. edit: I retract this paragraph. The Nash equilibrium is relevant to some non-zero-sum games, but there are truly ZERO one-shot independent games that humans participate in. Any trial or demonstration cannot avoid the fact that utility payout is not linear with the stated matrix.

comment by Charlie Steiner · 2019-05-25T04:41:23.097Z · LW(p) · GW(p)

I would guess that people don't actually compute the Nash equilibrium or expect other people to.

Instead, they use the same heuristic reasoning methods that they evolved to learn, and which have served them well in social situations their entire life, and expect other people to do the same.

I think we should expect these heuristics to be close to rational (not for the utilities of humans, but for the fitness of genes) in the ancestral environment. But there's no particular reason to think they're going to be rational by any standard in games chosen specifically because the Nash equilibrium is counterintuitive to humans.

comment by sapphire (deluks917) · 2019-05-24T21:29:32.803Z · LW(p) · GW(p)

If you bid $2 you get at most $4. If you bid $100 you have a decent chance to get much more. If even 10% of people bid ~$100 and everyone else bids $2, you are better off bidding $100. Even in a 5% $100 / 95% $2 split the two strategies have a similar expected value. In order for bidding $2 to be a good strategy you have to assume almost everyone else will bid $2.
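The expected values in this comparison can be worked out directly. The sketch below assumes the usual ±$2 TD rule and a population in which a share `p_high` claims exactly $100 and everyone else claims $2; the function names and the two-point population are illustrative simplifications.

```python
def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + bonus if mine < theirs else low - bonus


def expected_payoff(mine, p_high, high=100, low=2):
    """Expected payoff when a share `p_high` of opponents claim `high`, the rest `low`."""
    return p_high * payoff(mine, high) + (1 - p_high) * payoff(mine, low)


for share in (0.10, 0.05):
    print(f"{share:.0%} high bidders:",
          "claim $100 ->", round(expected_payoff(100, share), 2),
          "| claim $2 ->", round(expected_payoff(2, share), 2))

# Solving 100p = 4p + 2(1 - p) gives the share at which both claims pay the same.
break_even_share = 2 / 98
print("break-even share of high bidders:", round(break_even_share, 4))
```

Under these assumptions the two claims pay the same only when roughly 2% of opponents claim $100; above that share, claiming $100 has the higher expected payoff.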

Replies from: Dagon, RorscHak

↑ comment by Dagon · 2019-05-27T17:03:54.293Z · LW(p) · GW(p)

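If you bid $2 you get at least $2 (you might get $4 if your partner bids above $2, but there's no partner bid that can get you less than $2). If you bid anything more than $2, you might get $0, if the other party bids $2. Bidding $2 is simply the state where no other-player choice can push your payout below $2 (strictly speaking that is the maximin property; the Nash equilibrium is the state where neither player can gain by deviating unilaterally, and in TD both point to $2).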

If you're trying to maximize average/expected payout, and you have some reason to believe that the other player is empathetic, super-rational, or playing a different game than stated (like part of their payout is thinking of themselves as cooperative), you should usually bid $100. Playing against an alien or an algorithm who you expect is extremely loss-averse and trying to maximize their minimum payout, you should do the same and bid $2.
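The "maximize their minimum payout" criterion can be made concrete with a short sketch. It assumes the usual ±$2 TD rule and claims restricted to whole dollars from $2 to $100; `worst_case` and `maximin_claim` are illustrative names.

```python
def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`."""
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + bonus if mine < theirs else low - bonus


def worst_case(mine, lo=2, hi=100):
    """Lowest payoff `mine` can receive over every possible opposing claim."""
    return min(payoff(mine, other) for other in range(lo, hi + 1))


# The maximin claim is the one whose worst case is as good as possible.
maximin_claim = max(range(2, 101), key=worst_case)

print("maximin claim:", maximin_claim)
print("worst case when claiming $2:", worst_case(2))
print("worst case when claiming $100:", worst_case(100))
```

Every claim above $2 has a worst case of $0, reached when the opponent claims $2, so a player who cares only about the guaranteed minimum ends up at $2.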

↑ comment by RocksBasil (RorscHak) · 2019-05-27T04:34:59.663Z · LW(p) · GW(p)

This is true. The issue is that the Nash Equilibrium formulation of TD predicts that everyone else will bid $2, which is counter-intuitive and does not match the empirical findings.

I'm trying to convince myself that the NE formulation in TD is not entirely rational.

comment by Gurkenglas · 2019-05-24T09:24:48.543Z · LW(p) · GW(p)

"While if Alice convinced Bob that she would give a high claim, Bob’s best response is to give a high claim." Why? He gets a higher payoff by giving a low claim.

Replies from: RorscHak

↑ comment by RocksBasil (RorscHak) · 2019-05-27T04:32:18.729Z · LW(p) · GW(p)

If Alice claims close to $100 (say, $80), Bob gets a higher payoff claiming $100 (getting $78) instead of claiming $2 (getting $4).
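The arithmetic here follows from a flat $2 bonus/penalty on top of the lower claim, rather than $2 per dollar of difference. A minimal sketch under that rule, with `payoff` as an illustrative name:

```python
def payoff(mine, theirs, bonus=2):
    """Payoff to the player claiming `mine` against a claim of `theirs`.

    Both players receive the lower claim; the lower claimant gains a flat
    bonus and the higher claimant pays the same flat penalty.
    """
    if mine == theirs:
        return mine
    low = min(mine, theirs)
    return low + bonus if mine < theirs else low - bonus


alice_claim = 80
print("Bob claims $100:", payoff(100, alice_claim))
print("Bob claims $2:", payoff(2, alice_claim))
```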

Replies from: Gurkenglas

↑ comment by Gurkenglas · 2019-05-27T10:50:47.637Z · LW(p) · GW(p)

Ohh, I thought it's 2$ per dollar of difference between them. Okay.
