The Calculus of Nash Equilibria
post by Heighn · 2022-04-01T14:40:59.937Z
Now that we know a bit about derivatives, it's time to use them to find dominant strategies and Nash equilibria. It helps if the reader is familiar with Nash equilibria already.
Prisoner's dilemma
The payoff matrix of the Prisoner's dilemma can be as follows:
We can see that the payoff for Prisoner 1 depends on her own action (Cooperate/Defect) but also on the action of Prisoner 2. Therefore, the payoff function for Prisoner 1 is a multivariable function: \(P_1(a_1, a_2)\), where \(a_i\) is the action of Prisoner \(i\) (and \(i \in \{1, 2\}\)). Let's say \(a_i = 0\) when the action of Prisoner \(i\) is Cooperate, and \(a_i = 1\) for Defect. So \(a_i \in \{0, 1\}\). Then \(P_1\) is linear in \(a_1\) and \(a_2\), and crucially, \(\frac{\partial P_1}{\partial a_1} = 10\). So for Defect (\(a_1 = 1\)), Prisoner 1's payoff will be $10 higher than for Cooperate (\(a_1 = 0\)), as can be confirmed in the table. Note that \(a_2\) doesn't show up in \(\frac{\partial P_1}{\partial a_1}\): Defect gives more for Prisoner 1 regardless of what Prisoner 2 does, which makes Defect a dominant strategy. Don't get me wrong: Prisoner 1's payoff certainly does depend on what Prisoner 2 does. The point is that no matter what Prisoner 2 does, Prisoner 1's payoff will be $10 higher when she (Prisoner 1) defects - and that's what's reflected in \(\frac{\partial P_1}{\partial a_1}\).
Since the payoff matrix is symmetrical, \(\frac{\partial P_2}{\partial a_2} = 10\) as well. Prisoner 2 therefore also has a dominant strategy: Defect. The Prisoner's dilemma, then, has a Nash equilibrium: when both prisoners defect. With the partial derivatives, we demonstrated that when both prisoners defect, neither prisoner can do better by changing her action to Cooperate. If e.g. Prisoner 1 were to do this, then \(a_1\) would go from \(1\) to \(0\), and since \(\frac{\partial P_1}{\partial a_1} > 0\), that would lower \(P_1\) (regardless of \(a_2\)). By symmetry, the same is true for Prisoner 2.
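The dominance argument can be checked numerically. The Python sketch below uses a hypothetical linear payoff function that has the one property the text relies on: defecting adds $10 no matter what the other prisoner does. The constants 20 and -20, the 0/1 coding of Cooperate/Defect, and the names payoff_1, payoff_2, gain_from_defecting and is_nash are illustrative assumptions, not values taken from the payoff matrix.

```python
from itertools import product

COOPERATE, DEFECT = 0, 1  # assumed coding of the two actions

def payoff_1(a1, a2):
    # Hypothetical linear payoff: only the +10 per unit of a1 comes from the text.
    return 20 + 10 * a1 - 20 * a2

def payoff_2(a1, a2):
    # Symmetric game: Prisoner 2's payoff mirrors Prisoner 1's.
    return payoff_1(a2, a1)

def gain_from_defecting(a2):
    # Discrete analogue of the partial derivative of payoff_1 with respect to a1.
    return payoff_1(DEFECT, a2) - payoff_1(COOPERATE, a2)

def is_nash(a1, a2):
    # Nash equilibrium: no unilateral deviation strictly improves the deviator's payoff.
    best_1 = all(payoff_1(a1, a2) >= payoff_1(d, a2) for d in (COOPERATE, DEFECT))
    best_2 = all(payoff_2(a1, a2) >= payoff_2(a1, d) for d in (COOPERATE, DEFECT))
    return best_1 and best_2

gains = [gain_from_defecting(a2) for a2 in (COOPERATE, DEFECT)]
equilibria = [p for p in product((COOPERATE, DEFECT), repeat=2) if is_nash(*p)]
print(gains)       # gain from defecting, for each action of Prisoner 2
print(equilibria)  # action pairs that survive the deviation check
```

Under these assumed payoffs the gain from defecting is the same for both actions of Prisoner 2, and the only pair that passes the deviation check is mutual defection.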
Nonlinear payoff functions
In the Prisoner's dilemma, the payoffs of both players (prisoners) can be modelled by linear payoff functions. What if the payoffs are nonlinear?
Let's say and . Then and . A Nash equilibrium is a point where no player can do better by doing another action given the action of the other player; therefore, should be maximized with respect to while keeping constant, whereas should be maximized with respect to while keeping constant. If has a peak value with respect to , must be in that point. gives . So could represent a peak, but also a valley, since would be in both. If , . So represents a local maximum in (when is held constant)! Since is quadratic, we can be sure this local maximum is the global maximum too (so there are no values for for which is higher when is held constant).
gives and . , so again represents a local maximum. is quadratic, so this is a global maximum as well.
So represents a global maximum for (for a constant ), and represents a global maximum for (for a constant ). That means is a dominant strategy for player 1, is a dominant strategy for player 2 and we have a Nash equilibrium in .
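To make the first-order and second-order checks concrete, here is a minimal Python sketch with two hypothetical concave quadratic payoffs. The functions payoff_1 and payoff_2, their peak locations (2 and 3), and the helpers d_own, d2_own and best_response are assumptions chosen for illustration and are not the payoff functions of this section. Derivatives are estimated with central differences, and the best response is found by bisection on the first derivative.

```python
def payoff_1(a1, a2):
    # Hypothetical: concave quadratic in a1; a2 only shifts the level.
    return -(a1 - 2.0) ** 2 + a2

def payoff_2(a1, a2):
    # Hypothetical: concave quadratic in a2; a1 only shifts the level.
    return -(a2 - 3.0) ** 2 + a1

def d_own(f, own, other, h=1e-4):
    # Central-difference estimate of the first derivative w.r.t. the own action.
    return (f(own + h, other) - f(own - h, other)) / (2 * h)

def d2_own(f, own, other, h=1e-4):
    # Central-difference estimate of the second derivative w.r.t. the own action.
    return (f(own + h, other) - 2 * f(own, other) + f(own - h, other)) / h ** 2

def best_response(f, other, lo=-10.0, hi=10.0):
    # Bisection on the first derivative, which is decreasing for a concave payoff.
    for _ in range(60):
        mid = (lo + hi) / 2
        if d_own(f, mid, other) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Put player 2's own action first so the same helpers apply to both players.
payoff_2_own_first = lambda own, other: payoff_2(other, own)

responses_1 = [best_response(payoff_1, a2) for a2 in (-1.0, 0.0, 5.0)]
responses_2 = [best_response(payoff_2_own_first, a1) for a1 in (-1.0, 0.0, 5.0)]
print(responses_1, responses_2)
print(d2_own(payoff_1, 2.0, 0.0), d2_own(payoff_2_own_first, 3.0, 0.0))
```

In this sketch each player's best response comes out the same for every tested action of the other player, which is the property that makes it a dominant strategy and not just a best response. The negative second derivatives confirm that the stationary points are maxima.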
Making things a bit more complicated
Let's now define and . Then for , we have . , which is negative when .
For we have . , so this is a local optimum - and also the global one, since is quadratic. For , . Solving for gives (which we found earlier as well). And since and therefore , we now have a local maximum for ! For a constant , is quadratic, so this is the global maximum as well. We found a Nash equilibrium: .
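The same procedure can be sketched in Python for a case where one player's best action depends on the other's. The payoffs below are hypothetical stand-ins, not the functions defined in this section. Player 2's optimum is fixed, player 1's first-order condition is solved given that optimum, and the resulting point is then checked against a grid of unilateral deviations. The names best_response_1, best_response_2 and deviations are likewise assumptions.

```python
def payoff_1(a1, a2):
    # Hypothetical: the best a1 depends on a2 (first-order condition: a1 = a2 / 2).
    return -a1 ** 2 + a1 * a2

def payoff_2(a1, a2):
    # Hypothetical: the best a2 is 2 regardless of a1.
    return -(a2 - 2.0) ** 2

def best_response_1(a2):
    # Solves d(payoff_1)/d(a1) = -2*a1 + a2 = 0; the second derivative is -2 < 0.
    return a2 / 2.0

def best_response_2(a1):
    # Solves d(payoff_2)/d(a2) = -2*(a2 - 2) = 0; the second derivative is -2 < 0.
    return 2.0

# Player 2's optimum does not depend on a1, so compute it first,
# then plug it into player 1's best response.
a2_star = best_response_2(0.0)
a1_star = best_response_1(a2_star)

# Check that no unilateral deviation on a grid improves either payoff.
deviations = [k / 10.0 for k in range(-50, 51)]
no_gain_1 = all(
    payoff_1(a1_star + d, a2_star) <= payoff_1(a1_star, a2_star) for d in deviations
)
no_gain_2 = all(
    payoff_2(a1_star, a2_star + d) <= payoff_2(a1_star, a2_star) for d in deviations
)
print((a1_star, a2_star), no_gain_1, no_gain_2)
```

Here player 1 no longer has a dominant strategy, since her best action changes with the other player's action. The point found is still a Nash equilibrium because each action is a best response to the other.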