## Posts

## Comments

**JonasMoss** on Open problem: how can we quantify player alignment in 2x2 normal-form games? · 2021-06-19T07:35:12.366Z · LW · GW

It's a game, just a trivial one. Snakes and Ladders is also a game, and its payoff matrix is similar to this one, just with a little bit of randomness involved.

My intuition says that this game not only has maximal alignment, but is the only game (up to equivalence) with maximal alignment for every pair of strategies. No matter what player 1 and player 2 do, the world is as good as it could be.

The case can be compared to the $R^2$ when the variance of the dependent variable is 0. How much of the variance in the dependent variable does the independent variable explain in this case? I'd say it's all of it.

**JonasMoss** on Variables Don't Represent The Physical World (And That's OK) · 2021-06-18T10:10:17.119Z · LW · GW

This reminds me of the propensity of social scientists to drop inference when studying the entire population, claiming that confidence intervals do not make any sense when we have every single existing data point. But confidence intervals do make sense even then, as the entire observed population isn't equal to the theoretical population. The observed population does not give us exact knowledge about any properties of the data generating mechanism, except in edge cases.

(Not that confidence intervals are very useful when looking at linear regressions with millions of data points anyway, but make sure to have your justification right.)

**JonasMoss** on Open problem: how can we quantify player alignment in 2x2 normal-form games? · 2021-06-18T08:41:31.723Z · LW · GW

I believe the upper right-hand corner of the alignment matrix shouldn't be 1; even if both players are acting in each other's best interest, they are not acting in their *own* best interest. And alignment is about having both at the same time. The configuration of the Prisoner's dilemma makes it impossible to make both players maximally satisfied at the same time, so I believe it cannot have maximal alignment for any strategy.

Anyhow, your concept of alignment might involve altruism only, which is fair enough. In that case, Vanessa Kosoy has a similar proposal to mine, but not working with sums, which probably does exactly what you are looking for.

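Getting alignment in the upper right-hand corner in the Prisoner's dilemma matrix to be 1 may be possible if we redefine the upper bound $M$ to $\max_{i,j} (A_{ij} + B_{ij})$, the best attainable payoff sum, where $A$ and $B$ are the two players' payoff matrices. But then zero-sum games will have maximal instead of minimal alignment! (This is one reason why I defined $M = \max_{i,j} A_{ij} + \max_{i,j} B_{ij}$.)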
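To make the zero-sum point concrete, here is a minimal sketch. It assumes the alignment coefficient is (expected payoff sum − lower bound) / (upper bound − lower bound), with the value set to 1 when the two bounds coincide, as in the proposal further down this page. The function name and the `attainable_upper_bound` flag are hypothetical, introduced only for illustration.

```python
def alignment(A, B, s, r, attainable_upper_bound=False):
    """Alignment of mixed strategies s (rows) and r (columns) for payoff matrices A, B."""
    rows, cols = range(len(A)), range(len(A[0]))
    sums = [A[i][j] + B[i][j] for i in rows for j in cols]
    total = sum(s[i] * (A[i][j] + B[i][j]) * r[j] for i in rows for j in cols)
    m = min(sums)  # worst payoff sum
    if attainable_upper_bound:
        M = max(sums)  # redefinition: best attainable payoff sum
    else:
        M = max(max(row) for row in A) + max(max(row) for row in B)  # independent-payoff bound
    return 1 if M == m else (total - m) / (M - m)

# Matching pennies, a zero-sum game (B = -A).
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
s = r = [0.5, 0.5]

original = alignment(A, B, s, r)
redefined = alignment(A, B, s, r, attainable_upper_bound=True)
print(original, redefined)
```

With the independent-payoff bound the zero-sum game gets alignment 0. With the redefined bound the two bounds coincide, so the convention for that case assigns alignment 1.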

(Btw, the coefficient isn't symmetric; it's only symmetric for symmetric games. No alignment coefficient depending on the strategies can be symmetric, as the vectors can have different lengths.)

**JonasMoss** on Open problem: how can we quantify player alignment in 2x2 normal-form games? · 2021-06-17T13:18:00.757Z · LW · GW

Alright, here comes a pretty detailed proposal! The idea is to find out if the sum of expected utility for both players is “small” or “large” using the appropriate normalizers.

First, let's define some quantities. (I'm not overly familiar with game theory, and my notation and terminology are probably non-standard. Please correct me if that's the case!)

- $A$: the payoff matrix for player 1.
- $B$: the payoff matrix for player 2.
- $s, r$: the mixed strategies for players 1 and 2. These are probability vectors, i.e., vectors of non-negative numbers summing to 1.

Then the expected payoff for player 1 is the bilinear form $s^\top A r$ and the expected payoff for player 2 is $s^\top B r$. The sum of payoffs is $s^\top (A + B) r$.

But we're not done defining stuff yet. I interpret alignment to be about welfare, or how large the sum of utilities is when compared to the best-case scenario and the worst-case scenario. To make an alignment coefficient out of this idea, we will need

- $m$: the lower bound to the sum of payoffs, $m = \min_{s, r} s^\top (A + B) r$, where $s, r$ are probability vectors. Evidently, $m = \min_{i,j} (A_{ij} + B_{ij})$.
- $M$: the upper bound to the sum of payoffs in the counterfactual situation where the payoff to player 1 is not affected by the actions of player 2, and vice versa. Then $M = \max_{s, r} s^\top A r + \max_{s, r} s^\top B r$. Now we find that $M = \max_{i,j} A_{ij} + \max_{i,j} B_{ij}$.

Now define the alignment coefficient of the strategies $(s, r)$ in the game defined by the payoff matrices $(A, B)$ as

$$a(s, r) = \frac{s^\top (A + B) r - m}{M - m}.$$

The intuition is that alignment quantifies how the expected payoff sum $s^\top (A + B) r$ compares to the best possible payoff sum $M$ attainable when the payoffs are independent. If they are equal, we have perfect alignment ($a = 1$). On the other hand, if $s^\top (A + B) r = m$, the expected payoff sum is as bad as it could possibly be, and we have minimal alignment ($a = 0$).

The only problem is that $M = m$ makes the denominator equal to 0; but in this case, $s^\top (A + B) r - m = 0$ as well, which I believe means that defining $a = 1$ is correct. (It's also true that $s^\top (A + B) r = m$, but I don't think this matters too much. The players get the best possible outcome no matter how they play, which deserves $a = 1$.) This is an extreme edge case, as it only holds for the special payoff matrices ($A = c_A J$, $B = c_B J$, with $J$ the all-ones matrix) that contain the same element ($c_A$, $c_B$) in every cell.
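The definition can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than a reference implementation: the names `expected_sum` and `alignment` are made up here, payoff matrices are nested lists, and the case where the two bounds coincide returns 1, following the convention argued for above.

```python
def expected_sum(A, B, s, r):
    """Expected payoff sum s^T (A + B) r for mixed strategies s (rows) and r (columns)."""
    return sum(s[i] * (A[i][j] + B[i][j]) * r[j]
               for i in range(len(s)) for j in range(len(r)))

def alignment(A, B, s, r):
    """(expected sum - m) / (M - m), with the degenerate case M == m set to 1."""
    cells = [(i, j) for i in range(len(A)) for j in range(len(A[0]))]
    m = min(A[i][j] + B[i][j] for i, j in cells)  # worst payoff sum over all cells
    # Upper bound when each player's payoff is maximized independently.
    M = max(A[i][j] for i, j in cells) + max(B[i][j] for i, j in cells)
    if M == m:
        return 1  # constant payoff matrices: every outcome is the best possible
    return (expected_sum(A, B, s, r) - m) / (M - m)

# Illustrative coordination game: both players get 1 if they match, 0 otherwise.
coord = [[1, 0], [0, 1]]
print(alignment(coord, coord, [1, 0], [1, 0]))          # both play the first action
print(alignment(coord, coord, [0.5, 0.5], [0.5, 0.5]))  # both mix uniformly
print(alignment([[3, 3], [3, 3]], [[1, 1], [1, 1]], [1, 0], [0, 1]))  # constant payoffs
```

Because the extrema of a bilinear form over probability vectors are attained at pure strategies, both bounds are computed directly from the matrix cells.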

Let's look at some properties:

- A pure coordination game has at least one maximal alignment equilibrium, i.e., $a(s, r) = 1$ for some pair of strategies $(s, r)$. All of these are necessarily Nash equilibria.
- A zero-sum game (that isn't game-theoretically equivalent to the 0 matrix) has $a(s, r) = 0$ for every pair of strategies $(s, r)$. This is because the expected payoff sum equals its lower bound, $s^\top (A + B) r = m$, for every $(s, r)$. The total payoff is always the worst possible.
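- The alignment coefficient is linear in a specific sense, i.e., it is unchanged when the payoffs are shifted and positively rescaled: replacing the payoff matrices $(A, B)$ by $(cA + d_1 J, cB + d_2 J)$ with $c > 0$ leaves the coefficient the same, where $J$ is the matrix consisting of only 1s.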
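These properties can be spot-checked numerically. The sketch below is self-contained and makes a few assumptions: the coefficient is (expected payoff sum − lower bound) / (upper bound − lower bound) with the degenerate case set to 1, the example games are chosen here for illustration, and the third check reads the last bullet as invariance under adding a constant to every cell and under a common positive rescaling.

```python
from fractions import Fraction

def alignment(A, B, s, r):
    rows, cols = range(len(A)), range(len(A[0]))
    total = sum(s[i] * (A[i][j] + B[i][j]) * r[j] for i in rows for j in cols)
    m = min(A[i][j] + B[i][j] for i in rows for j in cols)
    M = max(max(row) for row in A) + max(max(row) for row in B)
    return 1 if M == m else (total - m) / (M - m)

def pure(k, n):
    """Pure strategy: play action k out of n with certainty."""
    return [1 if i == k else 0 for i in range(n)]

pairs = [(i, j) for i in range(2) for j in range(2)]

# 1. Pure coordination game (both players share one matrix): some pair reaches 1.
coordination = [[2, 0], [0, 1]]
best_coordination = max(alignment(coordination, coordination, pure(i, 2), pure(j, 2))
                        for i, j in pairs)

# 2. Zero-sum game (matching pennies): alignment is 0 everywhere.
pennies_A = [[1, -1], [-1, 1]]
pennies_B = [[-x for x in row] for row in pennies_A]
zero_sum_values = {alignment(pennies_A, pennies_B, pure(i, 2), pure(j, 2)) for i, j in pairs}

# 3. Shifting payoffs by constants, or rescaling both by c > 0, changes nothing.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
s = [Fraction(1, 2), Fraction(1, 2)]
r = [Fraction(1, 4), Fraction(3, 4)]
base = alignment(A, B, s, r)
shifted = alignment([[x + 5 for x in row] for row in A],
                    [[x - 3 for x in row] for row in B], s, r)
scaled = alignment([[2 * x for x in row] for row in A],
                   [[2 * x for x in row] for row in B], s, r)

print(best_coordination, zero_sum_values, base, shifted, scaled)
```

Exact fractions are used in the third check so that the comparison does not depend on floating-point rounding.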

Now let's take a look at a variant of the Prisoner's dilemma with joint payoff matrix

Then

The alignment coefficient at is

Assuming pure strategies, we find the following matrix of alignment, where entry $(i, j)$ is the alignment when player 1 plays action $i$ with certainty and player 2 plays action $j$ with certainty.

Since mutual defection is the only Nash equilibrium, the “alignment at rationality” is 0. By taking convex combinations, the range of alignment coefficients is $[0, 1/2]$.
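The payoff matrix of this example did not survive in the text, so the sketch below uses hypothetical Prisoner's dilemma payoffs (an assumption, not the original numbers) chosen to agree with the stated results: alignment 0 at the only Nash equilibrium and a maximum of 1/2. It computes the alignment for each pair of pure strategies and finds the pure Nash equilibria by brute force.

```python
# Hypothetical payoffs (not the original matrix): action 0 = cooperate, 1 = defect.
A = [[-1, -3], [0, -2]]   # player 1 (row player)
B = [[-1, 0], [-3, -2]]   # player 2 (column player)

cells = [(i, j) for i in range(2) for j in range(2)]
m = min(A[i][j] + B[i][j] for i, j in cells)  # worst payoff sum
M = max(A[i][j] for i, j in cells) + max(B[i][j] for i, j in cells)  # independent-payoff bound

# Alignment for each pair of pure strategies.
alignment_matrix = [[(A[i][j] + B[i][j] - m) / (M - m) for j in range(2)]
                    for i in range(2)]

# Pure Nash equilibria: neither player gains by deviating unilaterally.
nash = [(i, j) for i, j in cells
        if all(A[i][j] >= A[k][j] for k in range(2))
        and all(B[i][j] >= B[i][l] for l in range(2))]

print(alignment_matrix)  # [[0.5, 0.25], [0.25, 0.0]]
print(nash)              # [(1, 1)]
```

The expected payoff sum under mixed strategies is a convex combination of the four cell sums, so every mixed-strategy alignment for these payoffs lies between the smallest and largest entries of `alignment_matrix`.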

Some further comments:

- Any general alignment coefficient probably has to be a function of the strategies $(s, r)$, as we need to allow them to vary when doing game theory.
- Specialized coefficients would only report the alignment at Nash equilibria, maybe the maximal Nash equilibrium.
- One may report the maximal alignment without caring about equilibrium points, but then the strategies do not have to be in equilibrium, which I am uneasy with. The maximal alignment for the Prisoner's dilemma is 1/2, but does this matter? Not if we want to quantify the tendency for rational actors to maximize their total utility, at least.

- Using e.g. the correlation between the payoffs is not a good idea, as it implicitly assumes the uniform distribution on the cells of the payoff matrix, i.e., that both players mix uniformly. And why would you do that?
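The point about correlation can be illustrated with a small sketch. It assumes the correlation is taken across the cells of the two payoff matrices, and it reuses hypothetical Prisoner's dilemma payoffs that are not taken from the original comment. The helper `weighted_corr` is a made-up name: it weights cell $(i, j)$ by the probability that the strategies actually produce it, so the plain correlation is the special case where both players mix uniformly.

```python
from math import sqrt

def weighted_corr(A, B, s, r):
    """Correlation between the players' payoffs when cell (i, j) has probability s[i] * r[j]."""
    cells = [(i, j) for i in range(len(s)) for j in range(len(r))]
    w = {(i, j): s[i] * r[j] for i, j in cells}
    mean_a = sum(w[i, j] * A[i][j] for i, j in cells)
    mean_b = sum(w[i, j] * B[i][j] for i, j in cells)
    cov = sum(w[i, j] * (A[i][j] - mean_a) * (B[i][j] - mean_b) for i, j in cells)
    var_a = sum(w[i, j] * (A[i][j] - mean_a) ** 2 for i, j in cells)
    var_b = sum(w[i, j] * (B[i][j] - mean_b) ** 2 for i, j in cells)
    return cov / sqrt(var_a * var_b)

# Hypothetical Prisoner's dilemma payoffs: action 0 = cooperate, 1 = defect.
A = [[-1, -3], [0, -2]]
B = [[-1, 0], [-3, -2]]

uniform = weighted_corr(A, B, [0.5, 0.5], [0.5, 0.5])  # same as the plain cell correlation
skewed = weighted_corr(A, B, [0.5, 0.5], [0.1, 0.9])   # player 2 mostly defects
print(uniform, skewed)
```

The two values differ, which shows that a single correlation number silently fixes one particular pair of strategies.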