Balancing Games

post by jefftk (jkaufman) · 2024-02-24T14:40:04.237Z · LW · GW · 18 comments

When I play an N-player game I want everyone to both:

  • Try to win
  • Win about 1/N of the time

With many games and groups of participants these are in conflict: if I play bridge against my kids I'm going to win all the time, but I'm not very good at the game so if I play against people who are serious about it I'm going to lose ~all the time.

One way some games handle this is by including a lot of luck. The more random the outcomes are, the more you'll approach 1/N regardless of player skill. Kid games where you make no choices, like Candyland or War, take this to the extreme.
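A minimal sketch of that effect for the two-player case, where 1/N is 1/2. The model is my assumption, not something from the games themselves: each player's result is their skill plus independent Gaussian noise with standard deviation `luck`, and the higher result wins. The names `win_probability`, `skill_gap` and `luck` are made up for illustration.

```python
import math

def win_probability(skill_gap: float, luck: float) -> float:
    """Chance that the stronger player wins a two-player game.

    Assumed model: result = skill + Gaussian noise with std `luck`,
    drawn independently for each player; the higher result wins.
    """
    if luck <= 0:
        # No luck at all: any positive skill gap decides the game.
        return 1.0 if skill_gap > 0 else (0.5 if skill_gap == 0 else 0.0)
    # The difference of the two noise terms has std luck * sqrt(2).
    z = skill_gap / (luck * math.sqrt(2))
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

for luck in (0.1, 1.0, 10.0, 100.0):
    p = win_probability(skill_gap=1.0, luck=luck)
    print(f"luck={luck:>5}: stronger player wins with probability {p:.3f}")
```

With the skill gap held fixed, the printed probability falls toward 0.5 as `luck` grows, which is the "approach 1/N" behavior in the simplest case.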

Instead, I think handicapping is a much better approach. For example in Go the weaker player can start with several stones already on the board, which gives them an advantage while still keeping it interesting and without turning it into a different-feeling game. When I was little and playing Go with my dad I remember slowly reducing the number of handicaps I needed over months, which was really rewarding: each game was fun and challenging, and I could see my progress.

Other examples:

I like it when games are designed in a way that makes this kind of adjustment easy and granular. You can calibrate by removing a handicap after the weaker player wins some number of games in a row (I think three is about right though it depends on granularity) and vice versa.
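Here is a rough sketch of that calibration rule, assuming the handicap is a whole number of units (stones, points, whatever the game uses) given to the weaker player. `HandicapTracker` and its field names are hypothetical, and the streak length of three is just the default I suggested above.

```python
from dataclasses import dataclass

@dataclass
class HandicapTracker:
    handicap: int              # units currently given to the weaker player
    streak_to_adjust: int = 3  # wins in a row needed before changing the handicap
    streak: int = 0            # > 0: weaker player's run, < 0: stronger player's run

    def record(self, weaker_won: bool) -> int:
        """Record one game and return the handicap to use for the next one."""
        if weaker_won:
            self.streak = self.streak + 1 if self.streak > 0 else 1
        else:
            self.streak = self.streak - 1 if self.streak < 0 else -1

        if self.streak >= self.streak_to_adjust:
            # Weaker player is winning too often: remove one handicap unit.
            self.handicap = max(0, self.handicap - 1)
            self.streak = 0
        elif self.streak <= -self.streak_to_adjust:
            # Stronger player is winning too often: add one unit back.
            self.handicap += 1
            self.streak = 0
        return self.handicap

tracker = HandicapTracker(handicap=5)
for weaker_won in [True, True, True, False, True, False, False, False]:
    print(tracker.record(weaker_won))
```

A losing game resets the other side's streak, so the handicap only moves when one player wins several games in a row. This sketch stops at zero rather than flipping the handicap to the other player.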

I'm curious, though: why isn't this more common? It's very normal in Go, mostly of historical interest in chess, and in most game cultures I'm around it seems like the expectation is that weaker players will just lose a lot or stronger players will "go easy" on them. Is it that acknowledging that some players are stronger than others is awkward? Is it too hard to calculate for games with more than two players?

18 comments

Comments sorted by top scores.

comment by Slapstick · 2024-02-24T23:20:37.402Z · LW(p) · GW(p)

I think one reason I don't like that sort of thing is there's more ambiguity in "what it took to win the game"

It's hard to know whether an artificial advantage is proportional to the skill gap. If I win, I won't know the extent to which I should attribute that win to good play (that I ought to be proud of, and that will impress others) versus a potentially greater than 1/N chance of winning (that I came by artificially).

If the greater skill is the absolute advantage that leads me to a win, I will discount the achievement on account of having an absolute advantage, but I'll still feel satisfied that I have achieved a relatively higher skill level.

If an improperly calibrated handicap is the absolute advantage that leads me to a win, it's a win I'd discount on account of there being an absolute advantage, but in this case I'd garner no satisfaction from having an (artificial) absolute advantage.

Moreover, the win might feel insulting or condescending if I was given a disproportionately large advantage due to my friends'/competitors' underestimation of my expected quality of play.

My win will also not necessarily give my competitors an update as to whether they underestimated my expected quality of play.

If the expectation is that I will win 1/N times, they won't update on my skill level if I win. (Maybe very slightly, and eventually as you play more games)

If I win when the odds are against me, people update significantly on my expected quality of play.

It feels good to know people are updating favourably on my expected quality of play.

comment by GoteNoSente (aron-gohr) · 2024-02-25T05:53:03.394Z · LW(p) · GW(p)

In chess, I think there are a few reasons why handicaps are not more broadly used:

  1. Chess in its modern form is a game of European origin, and it is my impression that European cultures have always valued "equal starting conditions for everyone" more highly than "similar chances for everyone to get their desired outcome". This might have made use of handicaps less appealing, because with handicaps, the game starts from a position that is essentially known to be lost for one side.
  2. There is no good way to combine handicaps in chess with Elo ratings, making it impossible to have rated handicap games. It is also not easy to use handicap results informally to predict optimal handicap between players who haven't met (if John can give me a knight, and I can give f7 pawn and move to James, it is not at all clear what the appropriate handicap for John against James would be). This is different in Go. 
  3. Material handicaps significantly change the flow of the game (the stronger side can try to just trade down into a winning endgame, and for larger handicaps, this becomes easy to execute), and completely invalidate opening theory. This is different in Go and also in more chess-like games such as Shogi, where I understand handicaps are more popular.
  4. Professional players (grandmasters and above) are probably strong enough to convert even a small material handicap like pawn and move fairly reliably into a win against any human (computers, at a few hundred Elo points above the best humans, can probably give about a pawn to top players and win at tournament time controls). This implies any handicap system would use only very few handicaps in games between players strong enough that their games are of public interest (Go professionals, I understand, have probably 3-4 handicap stones between the weakest and the best professional, and maybe two stones vs the best computers). I think that would have been different in the 19th century, when material handicaps in chess were more popular than today.
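
To illustrate point 2, here is a small sketch of the standard Elo expected-score formula. The three ratings are hypothetical numbers picked only for the example. The point is that rating gaps add up along a chain of players, so a prediction for two players who have never met falls out directly, whereas a knight handicap and a pawn-and-move handicap have no such common scale.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Hypothetical ratings, for illustration only.
john, me, james = 2200.0, 1900.0, 1700.0

print(f"John vs me:    {elo_expected_score(john, me):.2f}")
print(f"me vs James:   {elo_expected_score(me, james):.2f}")
# John and James have never played, but their rating gap is simply
# (john - me) + (me - james), so the same formula still applies.
print(f"John vs James: {elo_expected_score(john, james):.2f}")
```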


That said, chess does use handicaps in some settings, but they are not material handicaps. In informal blitz play, time handicaps are sometimes used, often in a format where players start at five minutes for the game and lose a minute if they win, until one of the players arrives at zero minutes. Simultaneous exhibitions and blindfold play are also handicaps that are practiced relatively widely. Judging just by the number of games played in each handicap mode, I'd say though that time handicap is by far the most popular variant at the club player level.
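
A quick sketch of that blitz format as I described it: both clocks start at five minutes, the winner of each game gets one minute less for the next, and the series stops when someone reaches zero. The function name and the "A"/"B" labels are made up for the example.

```python
def time_ladder(winners, start_minutes=5):
    """Replay a series of games and return the clock settings after each one.

    `winners` is a sequence of "A" or "B". Each entry of the returned list
    is (minutes for A, minutes for B); the first entry is the starting clocks.
    """
    clocks = {"A": start_minutes, "B": start_minutes}
    history = [(clocks["A"], clocks["B"])]
    for winner in winners:
        clocks[winner] -= 1  # the winner gets one minute less next game
        history.append((clocks["A"], clocks["B"]))
        if clocks[winner] == 0:
            break  # series ends once a player arrives at zero minutes
    return history

print(time_ladder(["A", "A", "B", "A"]))
```

If one player is clearly stronger, their clock shrinks until the games become roughly even, so the handicap calibrates itself over the course of the session.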

Replies from: antanaclasis, thomas-kwa
comment by antanaclasis · 2024-02-29T08:39:35.033Z · LW(p) · GW(p)

For chess in particular the piece-trading nature of the game also makes piece handicaps pretty huge in impact. Compare to shogi: in shogi having multiple non-pawn pieces handicapped can still be a moderate handicap, whereas multiple non-pawns in chess is basically a predestined loss unless there is a truly gargantuan skill difference.

I haven’t played many handicapped chess games, but my rough feel for it is that each successive “step” of handicap in chess is something like 3 times as impactful as the comparable shogi handicap. This makes chess handicaps harder to use as there’s much more risk of over- or under-shooting the appropriate handicap level and ending up with one side being highly likely to win.

comment by Thomas Kwa (thomas-kwa) · 2024-02-28T22:09:30.674Z · LW(p) · GW(p)

Is the gap only 2 stones between best professionals and best computers? A reddit thread from 2 years ago said Shin Jinseo has a losing record getting 2 stones from FineArt, and computers have probably improved since then.

comment by AdamYedidia (babybeluga) · 2024-02-25T09:48:56.543Z · LW(p) · GW(p)

In Drawback Chess, each player gets a hidden random drawback, and the drawbacks themselves have Elo ratings (just like the players). As players' ratings converge, they'll end up winning about half the time, since they'll get a less stringent drawback than their opponent's.

The game is pretty different from ordinary chess, and has a heavy dose of hidden information, but it's a modern example of fluid handicaps in the context of chess.

comment by [deleted] · 2024-02-24T21:55:23.395Z · LW(p) · GW(p)

  • Win about 1/N of the time

Morphies Law does this.

https://store.steampowered.com/app/948960/Morphies_Law_Remorphed/

Doesn't seem to be a particularly successful implementation, but it's an FPS game where players with more kills grow bigger (and are easier to hit), while players with more deaths grow smaller (and are harder to hit and can hide in places the larger players cannot access)

This algorithm is supposed to make the KDR 1/1 over infinite time.

comment by avancil · 2024-02-25T07:53:34.625Z · LW(p) · GW(p)

In multiplayer games, one balancing factor is that other players can gang up on the person who is ahead. Depending on the game dynamic, this can even things out a lot. In some games, this even creates the dynamic where you don't want to look too strong, so that others don't focus their attention on you.

Playing games against my kids when they were young, rather than just slack off and let them win, it was more fun for me to figure out the best way to handicap myself: What algorithm for sub-optimal play would keep the game close? Solving that puzzle effectively became my victory condition, rather than the game's victory condition, and I was effectively competing against myself, a more balanced opponent.

comment by Lukas_Gloor · 2024-02-26T14:12:28.393Z · LW(p) · GW(p)

Small edges are why there's so much money gambled in poker. 

It's hard to reach a skill level where you make money on 50% of nights, but it's not that hard to reach a point where you're "only" losing 60% of the time. (That's still significantly worse than playing roulette, but compared to chess competitions where hobbyists never win any sort of prize, you've at least got chances.)

comment by RamblinDash · 2024-02-26T13:49:33.177Z · LW(p) · GW(p)

In a game where you play a higher number of shorter games, you can ideally have a handicap that adjusts after every game. For example, in Super Smash Bros, if you turn handicap to "auto" then the stronger player starts with damage, which (in two player) goes up 10% every time they win, and down 10% every time they lose. It gets a little more complicated in 3+ player games, and I'm not sure of the exact algorithm, but it works reasonably well. Maybe something to emulate in a game where handicaps can be reasonably granular?
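
A minimal sketch of the two-player version, to show the shape of the rule. This is not the game's actual algorithm, which I don't know. I'm assuming a single signed number is tracked, where positive means player A starts with that much damage and negative means player B does. All names here are hypothetical.

```python
from typing import Tuple

def update_handicap(handicap: int, a_won: bool, step: int = 10) -> int:
    """Shift the signed handicap (in percent of damage) against whoever just won."""
    return handicap + step if a_won else handicap - step

def starting_damage(handicap: int) -> Tuple[int, int]:
    """Convert the signed handicap into (A's starting damage, B's starting damage)."""
    return (max(handicap, 0), max(-handicap, 0))

handicap = 0
for a_won in [True, True, True, False, True]:
    handicap = update_handicap(handicap, a_won)
    print(starting_damage(handicap))
```

Because it moves after every game instead of waiting for a streak, this settles near the point where each player wins about half the time, at the cost of bouncing around that point by one step.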

comment by MondSemmel · 2024-02-24T20:04:34.787Z · LW(p) · GW(p)

When I play an N-player game I want everyone to both: 

  • Try to win
  • Win about 1/N of the time

This may be beside the point of your post, but: you can do even better than that, and without a need for handicapping, by playing co-op board games instead. Versus-style board games are just one type of game, and while you can modify their rules to come closer to equality of outcomes, that seems like a rather convoluted way of getting there. Like, in this situation, why play a zero-sum game when you could play a positive-sum game instead?[1]

Or if entirely co-op games don't seem appealing, another option along this axis is to play team-based games; then you can balance team strengths by which and how many people you assign to each team.

Some co-op board game recommendations suitable even for groups of widely disparate skill levels: Letter Jam, Just One.

A co-op game for groups that want a challenge: Hanabi.

Some team-based board game recommendations: Codenames, Decrypto. I wrote about these two games here [LW(p) · GW(p)].

  1. ^

    Speaking from my own experience, when I grew up I only knew versus board games, stuff like Monopoly or Settlers of Catan. But once I discovered co-op board games, I eventually realized that I had a lot more fun playing those with my siblings.

Replies from: GuySrinivasan, jkaufman
comment by SarahNibs (GuySrinivasan) · 2024-02-25T00:14:11.519Z · LW(p) · GW(p)

One of the reasons I tend to like playing zero-sum games rather than co-op games is that most other people seem to prefer:

  • Try to win
  • Win about 70% of the time

While I instead tend to prefer:

  • Try to win
  • Win about 20% of the time

comment by jefftk (jkaufman) · 2024-02-25T16:45:02.837Z · LW(p) · GW(p)

Many cooperative board games run into a problem where if there are people of differing skill levels on the same team then the strongest player ends up doing most of the playing. Hanabi is the only multiplayer game I've tried that successfully avoids this, where every player needs to be engaged and trying their best.

Replies from: MondSemmel
comment by MondSemmel · 2024-02-25T20:32:02.777Z · LW(p) · GW(p)

I know what you mean, and it used to absolutely be an issue in our group, especially with games like Eldritch Horror or Pandemic Legacy, i.e. multi-hour games where you have full information about everything every player is doing. That said, an obvious design which circumvents this problem is co-op games where every player has some private information: then other players can't play for you and vice versa.

Incidentally, all the non-team co-op games I suggested above have this design.

Just One is a co-op party game where the active player must guess a word and each other player independently provides a word hint. Then the hint givers compare hints and eliminate all hints that were given multiple times (hence the title, "Just One").

Resulting game flow: If everyone tries to give an "obvious" hint (e.g. giving the hint "metal" for the word "steel"), then multiple people will likely give the same hint, and as such this hint will be unavailable to the active player. Whereas if nobody gives obvious hints, there's a higher chance that there are no duplicate hints to eliminate, so the active player can work with a lot of hints but might get misled by all hints being non-obvious. This makes it an interesting challenge for what kinds of hints to give and how to interpret the hints one receives.
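
The elimination rule in Just One is simple enough to sketch in a few lines. Treating hints that differ only in capitalization or surrounding spaces as duplicates is my simplification, and `surviving_hints` is a made-up name.

```python
from collections import Counter

def surviving_hints(hints):
    """Return the hints the active player gets to see.

    Any hint given by more than one player is removed entirely.
    Hints are compared after trimming spaces and lowercasing (an assumption).
    """
    normalized = [hint.strip().lower() for hint in hints]
    counts = Counter(normalized)
    return [hint for hint in normalized if counts[hint] == 1]

# Word to guess: "steel". Two players went for the obvious hint.
print(surviving_hints(["metal", "Metal", "sword", "alloy"]))
```

In this example both copies of "metal" disappear, so the active player is left with only the less obvious hints.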

Meanwhile Letter Jam is a bit like Hanabi: Every player has one letter card facing away from themselves, so everyone but themselves knows what it is. The goal is for everyone to guess their 4-7 letter cards in as few rounds as possible. Every round one player (chosen by the group) gives a word hint to the other players based on the letters they see.

E.g. suppose there are four players. Then I would see the letter cards of the three other players, plus 1-2 letters visible to everyone, plus finally a joker which can substitute for any one letter. And suppose I see the player letters P L A, and an open letter T. Then I could make the word hint PLANT (by using the joker for the N). This hint is given silently by placing numbered poker chips next to the letters I want to use, e.g. the 1-chip in front of the player with the letter P. Here's what these hints look like to the other players: player 1 sees ?LA*T, player 2 P?A*T, player 3 PL?*T. Based on such hints, players try to narrow down what their own letter is.
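
The PLANT example can be reproduced with a short sketch. The token representation (a list of `("player", n)`, `("joker", None)` and `("open", letter)` pairs) is something I made up for illustration, not part of the game.

```python
def hint_as_seen_by(viewer, hint, player_letters):
    """Render a hint from one player's point of view.

    A player sees "?" in place of their own hidden letter,
    "*" for the joker, and every other letter as-is.
    """
    shown = []
    for kind, value in hint:
        if kind == "player":
            shown.append("?" if value == viewer else player_letters[value])
        elif kind == "joker":
            shown.append("*")
        elif kind == "open":
            shown.append(value)
        else:
            raise ValueError(f"unknown token kind: {kind!r}")
    return "".join(shown)

# The hint giver sees P, L, A in front of players 1-3, plus the open letter T.
player_letters = {1: "P", 2: "L", 3: "A"}
plant = [("player", 1), ("player", 2), ("player", 3), ("joker", None), ("open", "T")]

for viewer in (1, 2, 3):
    print(f"player {viewer} sees {hint_as_seen_by(viewer, plant, player_letters)}")
```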

The hint I gave involved the joker and thus doesn't provide much info on the hidden letters, whereas one great hint can directly help multiple players guess their current letter and proceed to the next one. But even if one player is much better at giving hints, they still rely on others to also provide hints, since you cannot identify your own letters when you give a hint. And even if you could give 5 perfect hints and would then need 5 perfect hints yourself, that's still much less efficient (i.e. it requires more rounds) than if each player can contribute a perfect hint.

comment by Dagon · 2024-02-24T17:42:13.149Z · LW(p) · GW(p)

I don't play golf, but the idea of "handicap" is a part of at least pop-culture renditions of the game.  In my boardgame and videogame groups, explicit in-game support for varying player skill levels is pretty well absent and not really considered.  Competitive sports tend to have "divisions" rather than individual compensations.

It's pretty obvious why it's not common in money games like poker or betting pools.  

In my experience for casual games, the expectation of equality of outcome just isn't there, and isn't all that important.  People still have fun in the optimization and game decisions, even when some of us are really known to be more likely to win.  Having a formal handicap system would BOTH be too much effort, AND make some players have less fun because it feels like they're not "really" playing the same game.  Maybe.  Maybe it's just the effort thing.  

Also, for many multiplayer games, there are social dynamics that make current leaders have a bit of a disadvantage in trading or interactions.   

comment by Andrew Currall (andrew-currall) · 2024-02-26T14:45:05.875Z · LW(p) · GW(p)

Bridge is a slightly odd choice of example in your opening section. A single hand of Bridge has very high randomness; it's quite likely the weaker partnership will "win", assuming they have at least basic competence in the game. The advantage of a stronger pair only really becomes apparent over a large number of hands. 

The same is true of Poker, even more so. In fact stronger players may not "win" very many more hands than weaker players at all; it's just that when they win they win more and when they lose they lose less.

This isn't true at all in Chess, of course.

Replies from: evin
comment by evin · 2024-02-26T21:09:04.557Z · LW(p) · GW(p)

Despite the randomness, bridge is an excellent example, as "people who are serious about it" play duplicate bridge. Duplicate poker exists, but doesn't seem as popular.

comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-25T23:06:56.357Z · LW(p) · GW(p)

mostly of historical interest in chess

Aren't time handicaps still common in chess?

Replies from: Chris Land
comment by Chris Land · 2024-02-26T02:54:02.733Z · LW(p) · GW(p)

You're correct, time handicaps (e.g. 2m vs. 5m) are more common than pawn/piece handicaps. Mostly for in-person play.

Master vs. Amateur handicaps can look crazy: 2m vs. 15m and -QRR is a slight advantage for the master simply because most amateurs are not used to playing with the clock. Another M v. A handicap is 'capped pawn': amateur picks a pawn, checkmate must be delivered with that pawn (pre-promotion). It's a bit like having two Kings, as if that pawn is captured the game is lost.