Handicapping competitive games

post by DanielFilan · 2021-07-22T03:00:00.498Z · LW · GW · 19 comments

[epistemic status: thing I thought of while falling asleep and just wrote up]

Suppose you’re playing a competitive game. By that, I mean a game where there are multiple players, and each is trying to win by beating the others. An example of a game like this is Go. But, if you think about it, soccer is also kind of like this: each ‘player’ is composed of a team of people, and the two ‘players’ are competing against each other. We’ll say that that also counts.

Sometimes, you’d like to play a competitive game with a friend or multiple friends, but the problem is that one of you is stronger than the other. It’s easy to see what this means in Go, and in the case of soccer, you could imagine that you’re part of a pre-set team, and so are your friends, and it wouldn’t be as fun to swap people between teams to even it out (perhaps because e.g. the teams are based on where you live). This is kind of sad because it means that by default, the stronger player or team will predictably win, which makes it a bit less fun. A way to get around this is by handicapping the stronger player: giving them some disadvantage so that the weaker player has a decent chance of winning, even if the stronger player tries their best. In Go, the standard way of doing this is to have the weaker player start with some well-placed stones already on the board. I don’t know how exactly this works in soccer - perhaps by having the stronger team play with fewer members than usual?

If you’re in this situation, but you don’t know the standard way to handicap - for instance, if you’re me and the game is soccer - it might be useful to have a taxonomy of ways to handicap games to choose between. Or if you’re bored of the standard way of handicapping, a taxonomy might inspire you to create new ideas. In this post, I’ll detail what I think is an exhaustive taxonomy.

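To think about how to handicap competitive games, I find it helpful to think about what a competitive game is. I think that a competitive game is specified by the following things:

  • the starting state
  • the number of players
  • the set of options available to each player
  • the win condition
  • the transition function
  • the observability function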

One way of handicapping is to change the starting state in order to give one player an initial advantage. This is how I’d think about handicapping in Go: the weaker player starts with more stones on the board than the stronger player [1]. In soccer, you could imagine the kick-off happening closer to the stronger team’s goal, which might make it easier for the weaker player to score. I’d say this is usually a good option.

I don’t think it really makes sense to vary the number of players in the game - in soccer and Go this wouldn’t make much sense, and in general it’s hard to see how this would help the weaker player relative to the stronger player.

Changing the set of options for the players can be a possible handicapping scheme. In Go, it’s hard to see how to do this without significantly changing the game - the closest thing I can think of is banning the stronger player from playing on certain points on the board, or maybe forbidding the stronger player from killing or cutting groups. I think it makes more sense in soccer, however: one team could accept a limitation on how fast they can run. In order not to change the game, I imagine it will usually look like narrowing the option set for the stronger player, since enlarging the option set for the weaker player seems like it would significantly change the game.

The win condition can be a promising way to handicap, especially for points-based games like Go and soccer: one can simply add some number of points to the weaker player’s total at the end, before deciding the winner [2]. Especially in Go, I think this handicap system is under-used - in my opinion, it changes the game less than giving the weaker player handicap stones on the board at the start. However, it’s less clear how to apply it in games that do not determine the winner by keeping a score, such as chess.
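As a minimal sketch of this kind of handicap (the function name and the example scores are hypothetical, not taken from any rule set), the adjustment is a single addition before the comparison:

```python
def winner_with_point_handicap(stronger_score, weaker_score, handicap_points):
    """Decide the winner after adding handicap points to the weaker side's total.

    Returns "stronger", "weaker", or "draw". All names here are illustrative.
    """
    adjusted_weaker_score = weaker_score + handicap_points
    if adjusted_weaker_score > stronger_score:
        return "weaker"
    if adjusted_weaker_score < stronger_score:
        return "stronger"
    return "draw"
```

Using a non-integer handicap (say 2.5 in a game with whole-number scores) rules out draws, which is the same trick that half-point komi uses in Go.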

The transition function is pretty core to the identity of a game, and therefore not to be trifled with. That being said, it’s possible that minor tweaks could provide a decent handicapping system - for instance, one could imagine a high-tech soccer ball that acted as though it was heavier when the stronger team kicked it and lighter when the weaker team kicked it, or a version of Go where 5% of the time the stronger player’s move was replaced by a random move.
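The random-move variant could be sketched roughly as follows. This is only an illustration: the function name, the `noise` parameter, and the idea of passing in a list of legal moves are all assumptions, and a real implementation would need a Go engine to supply that list.

```python
import random


def apply_move_noise(intended_move, legal_moves, noise=0.05, rng=None):
    """With probability `noise`, replace the intended move by a random legal one.

    `legal_moves` is assumed to be supplied by the game; this sketch does not
    compute legality itself.
    """
    rng = rng or random.Random()
    if rng.random() < noise:
        return rng.choice(list(legal_moves))
    return intended_move
```

The size of the handicap can then be tuned by changing `noise`, and it would be applied only to the stronger player's moves.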

The observability function seems easier to tweak while retaining the character of a game. In Go, for example, one player could be required to play without seeing the board, only hearing the coordinates of each move and saying the coordinates they would like to play on. A less extreme case would be to use technology to allow the weaker player to see which stones are which colour, but make the stones’ identities invisible to the stronger player. In order to adjust the degree of handicap, one could change the number of moves in which this observability constraint applies. For soccer, you could imagine requiring one team to wear glasses that slightly distorted their vision.

That concludes my list of aspects of a game to tweak for handicapping. But there’s one more crucial ingredient that goes into playing a game - the computation available to each player. One can handicap a player by limiting the computation they have available. For instance, in Go, one could use asymmetric time controls, where the weaker player gets more time to think about their moves than the stronger player. In soccer, this could look like requiring all members of the stronger team to use earplugs, so that they can’t communicate with one another as easily [3].

I think this is an exhaustive taxonomy. I also think it’s useful: as far as I’m aware, most ways of handicapping fall pretty cleanly into just one of these, and it’s helped me come up with handicap ideas (in the process of writing the post). I hope you also find it useful.
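As a rough illustration of the taxonomy, here is a minimal Python sketch that treats a game as a record with one field per aspect discussed above, and a handicap as a function that returns a modified copy. Everything in it (the `Game` class, `restrict_options`, and the toy race game) is hypothetical and only meant to make the structure concrete; computation limits such as time controls sit outside the record, as in the post.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Optional, Sequence


@dataclass(frozen=True)
class Game:
    start_state: Any
    num_players: int
    options: Callable[[Any, int], Sequence[Any]]  # moves open to a player in a state
    transition: Callable[[Any, int, Any], Any]    # state after a player's move
    observe: Callable[[Any, int], Any]            # what a player sees of the state
    winner: Callable[[Any], Optional[int]]        # winning player's index, or None


def restrict_options(game, player, allowed):
    """Handicap by narrowing one player's option set."""
    def options(state, p):
        moves = game.options(state, p)
        return [m for m in moves if allowed(m)] if p == player else list(moves)
    return replace(game, options=options)


# Toy game: each turn a player advances 1 or 2 steps; first to reach 10 wins.
def advance(state, player, move):
    return tuple(pos + move if i == player else pos for i, pos in enumerate(state))


def first_to_ten(state):
    return next((i for i, pos in enumerate(state) if pos >= 10), None)


race = Game(
    start_state=(0, 0),
    num_players=2,
    options=lambda state, player: [1, 2],
    transition=advance,
    observe=lambda state, player: state,
    winner=first_to_ten,
)

# Player 0 is the stronger player in this sketch.
head_start = replace(race, start_state=(0, 3))              # starting-state handicap
slowed = restrict_options(race, 0, lambda move: move == 1)  # option-set handicap
```

Handicaps on the win condition, transition function, and observability function would be written the same way, by wrapping the corresponding field.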

[1] If the weaker player gets to choose where to put the stones, then this isn’t quite just a modification of the starting state. But normally the stones are put in a set position.

[2] This could also be seen as a modification of the starting state.

[3] This also changes the observability function, but I think that’s not its main effect.


Comments sorted by top scores.

comment by David Quarel (david-quarel) · 2021-09-09T05:04:25.264Z · LW(p) · GW(p)

I think you left out a very crucial point: perverse incentives. You want to handicap the stronger player, but in such a way that neither player would drastically change their policy from the one they would normally use against a player of similar strength.
I've had this problem before in real-time strategy games (like Age of Empires 2, which is what I play). The gap between the skill floor and skill ceiling is massive, so when playing with friends (especially 1v1s) there's often a large difference in skill between two players (making it not fun for either player), and the game doesn't have any nice built-in features to easily handicap.
One approach is to have the stronger player sit idle for the first X minutes of the game, but that leads to the weaker player being forced to play much more aggressively and "rush", or build a bunch of defensive towers in the enemy base, or other such things, since otherwise they lose the advantage of the stronger player going idle.
Similarly, barring the stronger player from using a particular class of units (say, cavalry) will again mean the weaker player will shift from the meta: they won't bother making units that counter cavalry, and will make more of the units that are countered by cavalry (the game has a rough rock-paper-scissors model of unit types).
The only policy-preserving handicap I can think of would be to force the stronger player to not use hotkeys, and manually click every action with the mouse. I think I would not drastically change my behaviour with no hotkeys; I would just execute actions more slowly. The downside is that it's less fun for the stronger player: they're now fighting the brain-to-computer interface rather than a skilled opponent.

Replies from: pktechgirl
comment by Elizabeth (pktechgirl) · 2021-09-09T06:35:30.074Z · LW(p) · GW(p)

For AoE in particular, would any of the following be policy-preserving?:

  • unit or building costs are higher/resources are slower to gather
  • units move slower
  • units are proportionally weaker but maintain the same ordinal ranking of what they are vulnerable to and strong against
  • units or buildings take longer to build

I only played AoE very casually as a teenager so I don't have a sense of what matters at a competitive level, but it is interesting to think about.

Replies from: david-quarel
comment by David Quarel (david-quarel) · 2021-09-09T13:42:27.202Z · LW(p) · GW(p)

I can only speak about those that are policy-preserving for a particular player (me).
(In retrospect, I should maybe have been optimising for fun-preserving instead, though policy-preserving is important if you spend a lot of time playing against people of drastically different skill levels: you don't want to unlearn how to play the game correctly when you're back to playing a normal game.)

That being said,  

  • unit or building costs are higher/resources are slower to gather.
    I think if you scaled the cost of everything up by x1.5 or so, it would ruin my build order (though I suppose I could just learn one for the new costs), and I also might have to play more aggressively (as in the late game you'd be at a severe disadvantage). For the same reason, the enemy would be incentivised to play defensively and focus on economy. So I think this fails policy-preservation, but I still think this would be a reasonable handicap, and it'd still be fun for both sides.
  • units move slower
    This would really mess things up. An enemy now gets to choose which engagements they want, archers can no longer hit-and-run infantry, enemy siege would be proportionally too fast, etc. I think this would break core mechanics of the game, or at least drastically shift how I would play (probably becoming overly reliant on units that can move quickly, like cavalry, or playing heavy defense).
  • units are proportionally weaker but maintain the same ordinal ranking of what they are vulnerable to and strong against
    What do you mean by "weaker": do my units deal less damage, or do they have less health? In the first case, I would probably focus more on ranged units, to push the exponent of Lanchester's Law[1] closer to 2 (as I wouldn't be able to win any melee battle without drastically overproducing units). In the latter case, I would focus mainly on units that can hit and run (cavalry archers, conquistadors, etc.) or whose lack of health is already offset by something else (monks, siege).
    For large armies this would mostly average out, but since units deal a discrete amount of damage and have a discrete number of health points, the thing that matters for very small numbers of units is "number of attacks to kill", so unfortunately the effect on the game would vary discontinuously with a health penalty adjustment. I don't see why the enemy couldn't just waltz over very early and cripple me near the start of the game.
    It would probably still preserve fun though.
  • units or buildings take longer to build
    It would make for a frustrating penalty to play with, and I would play defensively until I could get enough production buildings running. But I think a modified version of this where I have a limit on the number of production buildings I'm allowed to make, or on the number of villagers I'm allowed to have, might better preserve policy. 

Of these, I think the third option is probably the best for game balancing (as it'd be easy to dial in the penalty as needed), but the fourth would (at least for me) probably be the most policy-preserving.
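To make the "number of attacks to kill" point from the third bullet concrete, here is a small sketch. The health and damage values are made-up round numbers chosen for illustration, not actual Age of Empires 2 unit stats, and the function name is hypothetical.

```python
def attacks_to_kill(health, damage_per_attack):
    """Whole number of attacks needed to bring `health` to zero or below."""
    # Ceiling division on integers, avoiding floating-point rounding.
    return -(-health // damage_per_attack)


# A hypothetical unit with 40 health facing attacks that deal 6 damage each:
baseline = attacks_to_kill(40, 6)       # no penalty
small_penalty = attacks_to_kill(38, 6)  # 5% less health
larger_penalty = attacks_to_kill(36, 6) # 10% less health
```

With these numbers the 5% penalty changes nothing (still 7 attacks), while the 10% penalty removes a whole attack (6 instead of 7), which is the discontinuity described above.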

[1] https://en.wikipedia.org/wiki/Lanchester%27s_laws
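A rough sketch of why the exponent matters for a damage penalty, under a simplified reading of Lanchester's laws in which a side's fighting strength is (per-unit effectiveness) × (number of units)^exponent, with exponent 1 for the linear law (melee-like) and 2 for the square law (ranged-like). The function name and the 50% penalty are illustrative assumptions.

```python
def required_unit_ratio(damage_multiplier, exponent):
    """How many times more units the penalised side needs to break even.

    Solves damage_multiplier * n**exponent == 1 for n, i.e. equal fighting
    strength against one opposing unit's worth of force.
    """
    return (1.0 / damage_multiplier) ** (1.0 / exponent)


melee_ratio = required_unit_ratio(0.5, exponent=1)   # linear law
ranged_ratio = required_unit_ratio(0.5, exponent=2)  # square law
```

With damage halved, the linear law asks for twice as many units, while the square law asks for only about 1.41 times as many, which is consistent with preferring ranged units under a damage penalty.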

Replies from: pktechgirl
comment by Elizabeth (pktechgirl) · 2021-09-09T19:23:14.477Z · LW(p) · GW(p)

This was very interesting, thank you.

comment by sanxiyn · 2021-07-22T06:16:58.068Z · LW(p) · GW(p)

"Especially in Go, I think this handicap system is under-used - in my opinion, it changes the game less than giving the weaker player handicap stones on the board at the start."

I think "the character of a game" preserved by the existing handicap systems of Go/Chess/Shogi (handicap stones and piece odds) is the evaluation function from board position to winning probability. That is, you use the same evaluation function to decide whether you are winning or losing given a board position.

With the point odds in Go that you are suggesting, you can be winning while the board position is losing. If the evaluation function returned a point difference you could do the adjustment, but I think the evaluation function's return type is closer to winning probability in practice.

Replies from: DanielFilan, mikkel-wilson
comment by DanielFilan · 2021-07-22T06:50:11.085Z · LW(p) · GW(p)

This argument would be more compelling to me if komi weren't already used - given that you already have to factor that number in, it doesn't seem like such a big deal to use a different number instead.

comment by MikkW (mikkel-wilson) · 2021-07-22T16:25:31.314Z · LW(p) · GW(p)

Handicap Go does traditionally also alter the point odds: in an even game, a komi of ~7 points is added to the second player's score, but in handicap games the komi is set to 0 instead, so the mapping from board state to winning is not perfectly preserved. That said, a komi difference of ~7 is much more subtle than the difference that would be required to completely balance a many-stone handicap.

comment by MikkW (mikkel-wilson) · 2021-07-22T16:30:45.717Z · LW(p) · GW(p)

It seems reasonable that in association football, removing players from one team to create an unbalanced 8 vs 9 scenario is a decent way to handicap a sufficiently stronger team.

Replies from: Measure
comment by Measure · 2021-07-23T03:06:24.094Z · LW(p) · GW(p)

I think the OP was using "players" to refer to the entire teams in keeping with their opening description of the term.

comment by ChristianKl · 2021-09-09T10:26:31.747Z · LW(p) · GW(p)

"Especially in Go, I think this handicap system is under-used - in my opinion, it changes the game less than giving the weaker player handicap stones on the board at the start."

Having all the handicap in points usually means that the stronger player needs to kill more of the stones of the weaker player. I expect that normally to be less fun and also to teach the weaker player the wrong things if they have to be overly worried about being attacked. 

comment by Firinn · 2021-07-23T01:13:29.418Z · LW(p) · GW(p)

Changing the number of players is a pretty popular option in Overwatch custom games and content; people love "can six bronze players beat three grandmasters?" videos.

We easily have the option to change many aspects of the game - for instance, we can let the weaker team deal 150% damage or give the stronger team longer cooldowns - but in my experience it isn't popular. People learn split-second gut-level reactions and habits for certain things, and part of being a "good player" is knowing instinctively whether you can tank a certain shot when you peek it or knowing when your ability will come off cooldown. Handicapping people by changing those learned values messes with their instincts, and it doesn't feel good to be handicapped that way; people enjoy making the challenge more difficult much more than they enjoy changes that negate their pre-existing skill and nullify their hard work.

Replies from: Linch
comment by Linch · 2021-07-25T02:55:22.362Z · LW(p) · GW(p)

"can six bronze players beat three grandmasters?"

Well, can they? 

It surprises me that this is remotely in question: 3 GMs will almost certainly smoke 6 bronze players in StarCraft (I've seen far more impressive feats), and naively I'd expect shooter games to be even more asymmetric (if the GM player has much better aim, they can beat ~infinite bronze players).

Replies from: Firinn
comment by Firinn · 2021-07-26T07:16:54.989Z · LW(p) · GW(p)

Overwatch is a hero shooter where every player has a different role and different abilities. As an experiment maybe a year ago, I once asked the best monkey player I knew at the time (4200 elo on a 0-5000 scale) to 1v1 the worst Bastion player I knew (under 1000 elo). In the neutral, the Bastion player consistently won despite the yawning chasm between their ratings. This is because monkey is a tank designed to take space and counter snipers and isolate squishy targets from their healers, and is not a character designed to 1v1 a Bastion. If you are missing three people from your team, you are missing three of the six key roles. The best player of all time playing Reinhardt could still probably lose a 1v1 to a bronze Pharah.

Running 2-3 higher-skill players versus 4-6 lower-skill players in variety PUGs, I've generally found that the lower-skill players very consistently win unless we give the 3 higher-skilled players an additional advantage like extra HP or damage. But that's with a ton of obvious confounding factors - my higher rated players might be more inclined to just play for fun, plus the lower-skill players in my community are still reasonably strategic from exposure to team environments.

The first result of my YouTube search is https://www.youtube.com/watch?v=ZfhdHUQbcNA which, as you predict, goes in favour of the GMs. But I think there are very easy tweaks (such as to team composition) that would allow the bronze players to do better. You can see that the first round actually goes to overtime for quite a while, so on paper it's pretty close. Not really analysing this in depth as it's 8am.

comment by Linch · 2021-07-22T07:03:03.481Z · LW(p) · GW(p)

Modifying the number of players seems promising as a handicap. E.g., if there are 3 players who want to play Go, you can pair the strongest player with the weakest player against the medium player, with the strongest and weakest players alternating moves for their side (no comms).

I've also seen versions of this for StarCraft, where e.g. the professional player is in charge of microstrategy and the weaker player is in charge of macrostrategy, or vice versa.

Replies from: DanielFilan, GWS, purge
comment by DanielFilan · 2021-07-22T07:24:38.038Z · LW(p) · GW(p)

Imo this is better modelled as splitting players into a team in the taxonomy of this post, giving the weaker side a computational advantage. But it points to an awkwardness in the formalism.

comment by Stephen Bennett (Previously GWS) (GWS) · 2021-09-09T15:06:19.383Z · LW(p) · GW(p)

Continuing the thread of splitting what are usually considered atomic players into a team:

Chess has a fun variant called Hand and Brain that lets players of disparate skill levels enjoy the game concurrently. A single chess player is broken into a team: the hand and the brain. Generally the stronger player serves as the brain, who names a chess piece type on each move (e.g. “pawn”). It is then up to the hand to play a legal move with that piece type (e.g. by moving a particular pawn to a particular square). Frequently pairs of hands and brains will play against one another, but a single hand and brain combo could play against unitary players and would be moderately stronger than the hand alone. What are the benefits of such a game mode?

The brain is forced to find as many good moves as possible, and therefore enjoys what is engaging about chess. However, the brain is also forced to engage in metacognitive and social reasoning about the board from the perspective of their partner. If moving the rook to e4 is a blunder that requires you to recognize a sneaky tactic, perhaps the quiet bishop move will set things up better down the road even if another rook move would be slightly stronger. The brain can alternatively say a piece type that only has one good move and many obviously terrible ones in the hope that the hand can successfully rule out the blunders.

From the perspective of the hand, the game gives them an opportunity to learn from the stronger player: by looking at a narrower subset of the board, they can find individual moves that are stronger than they might otherwise. In a sense they are increasing the amount of computation they bring to the game. If they could, in a regular game, run this process in parallel for each of the piece types and then choose the one that is best via an oracle (the brain), then they would presumably play very well! By finding strong moves in the more limited case, they will be more likely to find them again in future games. It can also build confidence in their ability: “you can indeed find strong moves, you just need to also take the time to find the right piece”

Lastly, the game has a strong social component. I usually see it played in the context of coaches goofing off with their students (against other coach-student pairings), but it’s goofing off that lets the coach see where their student is misunderstanding the game. What moves do they rule out too early or simply not see at all? Can I get my student to play strong moves that are both aggressive and defensive? The hand is also usually encouraged to think aloud, which helps the brain identify both what to suggest for the current game and also what to work on in study.

Sadly this variant doesn’t translate well to something like go, since there isn’t a good way to let the stronger player narrow down the space of possible moves. I suppose they could literally narrow down the space by giving a quadrant or something, but it’s not clear to me that the weaker player would get much out of this, neither in the immediate game nor in their general understanding of go.

comment by purge · 2021-07-22T20:45:06.876Z · LW(p) · GW(p)

Two teams of two players (strong + weak vs. medium + medium) is fairly common, I think. It's called rengo. But 2 vs. 1 would be different: the team of 2 players would be handicapped not just by the weaker player, but also by the lack of communication. This is a possible way to handicap, sure, but it can't be tuned as precisely as komi or even star-point handicap stones. Precision is an important consideration for handicapping.

I've also seen another method where two players of unequal strength played an even game, but a stronger third player teamed up with the weakest player.  They didn't communicate, and didn't alternate turns within their team--instead, the strong player was allotted a certain number of stones at the beginning of the game.  Then when he spotted an especially big mistake by the weaker player, he could spend a stone to correct that move.  This might be categorized like asymmetric time controls: the weaker player gets more resources.

comment by MikkW (mikkel-wilson) · 2021-07-22T16:31:25.430Z · LW(p) · GW(p)

Or, less likely to get the players to completely ignore the game, get them a little bit drunk

(This was a response to a (now deleted?) comment saying simply "on acid")

comment by Ericf · 2021-07-22T13:07:56.292Z · LW(p) · GW(p)

Re-stating your conclusion: To apply a handicap, you can change one (or more) of the following:

  1. The starting conditions
  2. The amount of out-of-game resources each player gets
  3. The ending victory point count

Taking the example of the 100 yard dash:

  1. Give one player a head start
  2. One player has less oxygen to use (eg by doing 50 jumping jacks right before the race)
  3. Add a fixed number of seconds to one player's time

Or the example of Magic: The Gathering:

  1. Players have different decks
  2. One player has to do a distracting thing while playing (eg, a second game of Magic with a 3rd player)
  3. Play first-to-N wins, with different Ns.

Or you could change the game rules to something else, which is equivalent to playing a different (and hopefully more balanced) game.
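For the "first-to-N wins, with different Ns" idea, the size of the handicap can be estimated with a short calculation. This sketch assumes each individual game is independent and that the player of interest wins any single game with a fixed probability `p`; both are simplifications, and the function name is hypothetical.

```python
from math import comb


def race_win_probability(p, wins_needed, opponent_wins_needed):
    """Probability of reaching `wins_needed` wins before the opponent
    reaches `opponent_wins_needed`, with per-game win probability `p`.

    Sums over k, the number of games the opponent wins before our final win.
    """
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * (1 - p) ** k
        for k in range(opponent_wins_needed)
    )


# A weaker player who wins 30% of individual games:
even_match = race_win_probability(0.3, 3, 3)   # both need 3 wins
handicapped = race_win_probability(0.3, 2, 5)  # weaker needs 2, stronger needs 5
```

Trying a few values of the two Ns this way gives a quick sense of which pairing brings the match closest to even.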