Number-guessing protocol?

post by abramdemski · 2020-12-07T15:07:48.019Z · LW · GW · 4 comments

This is a question post.

Contents

  Answers
    25 Ericf
    10 grumphrey
    8 RamblinDash
    7 Dagon
    5 Unnamed
    4 Donald Hobson
    3 SimonM
    2 Scott Garrabrant
    2 Scott Garrabrant
    2 GuySrinivasan
    1 Measure
    1 Bruce G
None
4 comments

Suppose several people are guessing a number, and then find an estimate to see who is right.

The super-common protocol is: whoever is closest, wins.

This protocol is really bad. If there are three people, and I guess 50, then the other two people can guess 51 and 49. This means I'll almost certainly lose. Unless it's within 1/2 of 50, one of the other guesses will be closer.

There are lots of ways to fix this protocol. However, most of them suffer from too much added complexity. For example, squared error incentivises everyone to guess their expected value (mean). However, people don't generally want to calculate squares, and if they did, they'd still feel like the lowest squared error was the winner (which amounts to the usual protocol).

Or, we could give confidence intervals. But how should they be scored?

My question is this: what are some ways to play this game that combine simplicity with good incentives? 

Answers

answer by Ericf · 2020-12-07T17:32:25.615Z · LW(p) · GW(p)

If you must find a single winner, allow people to change their guess after hearing the later ones, continuing until everyone is happy with their guess (and implicit range). Thus, if player 1 was thinking "somewhere around 50" they can now pick between "exactly 50" "lower" or "higher;" and there will be an auction to determine how far below 50 someone stops (or at what point the "51" player drops down, and maybe gets "overcut"

You could also use something like the "name that tune" algorithm:

  1. The first guesser picks a range of values
  2. The second guesser picks a range of values that either A. Does not overlap with the first guesses OR B. Covers a different number of values as the first guesser
  3. Each subsequent guesser has the same options A. Make a guess that doesn't overlap anyone B. Make a guess with a different number of values as anyone who is overlapped
  4. Allow people to revise their ranges, as above.

Then, whoever has the smallest range that contains the true value wins.

comment by jimmy · 2020-12-07T18:30:34.454Z · LW(p) · GW(p)

This answer is great because it takes the problem with the initial game (one person gets to update and the other doesn't) and returns the symmetry by allowing both players to update. The end result shows who is better at Aumann updating and should get you closer to the real answer.

If you'd rather know who has the best private beliefs to start with, you can resolve the asymmetry in the other direction and make everyone commit to their numbers before hearing anyone else's. This adds a slight bit of complexity if you can't trust the competitors to be honest, but it's easily solved by either paper/pencil or everyone texting their answer to the person who is going to keep their phone in their pocket and say their answer first.

comment by abramdemski · 2020-12-07T21:35:38.408Z · LW(p) · GW(p)

Two good suggestions!

I particularly like the first suggestion as an extremely simple and lightweight modification to the standard procedure. OTOH, it does incentivise rather dishonest reports (since players still want to strategically take under-covered ranges).

The second proposal is pretty good as well, although to avoid the complexity of "do not overlap OR cover a different number of values" I might prefer to simply allow ties.

answer by grumphrey · 2020-12-07T16:11:21.808Z · LW(p) · GW(p)

The problem with the existing protocol is that it forces the choice of a single winner.  If multiple players are all basically right, the protocol you describe forces them into a deathmatch because only one player can be "the winner".

(Another problem with the existing protocol is that it has some players making their predictions "before" others, in a way that is visible to the others.)

 

Here's a better protocol: everyone makes their prediction at the same time without seeing anyone else's prediction.  If someone is off by X units then their score for that round is 1/(X+1).  For best results, play several rounds and compute the average score.

 

You might also be interested in Wits And Wagers, which is the "everyone predicts a number" activity made into a six-player board game.  I've played it.  It's pretty fun.

comment by Ericf · 2020-12-07T16:52:11.194Z · LW(p) · GW(p)

What they said. Adjust the protocol to allow everyone equally close to be the "winner"

As a simple protocol, take the true number, and check at each order of magnitude:

0. Anyone who got it exact wins

  1. Anyone within 90% - 110% of the true value wins. If that's nobody
  2. Anyone within 50% - 200% of the true value wins. If that's still nobody
  3. Anyone within 10% - 1000% of the true value wins.
  4. If that's nobody, then everyone loses.
comment by abramdemski · 2020-12-07T21:37:49.460Z · LW(p) · GW(p)

Everyone settling on an answer before anyone speaks is a good norm in general to avoid anchoring, in many settings.

However, when playing with non-rationalists, I feel like one would need paper in order to implement it in a trustworthy way, which makes me think it's not going to be popular for that use-case.

answer by RamblinDash · 2020-12-07T18:55:13.453Z · LW(p) · GW(p)

It seems like the most straightforward simple fix is for everyone to guess simultaneously. Then people can't just grab big ranges because they are big. E.g. the player who guesses 49 in your example would have to have actually had a lower estimate for the value rather than merely cutting you off at the knees. I haven't studied in depth but I think the incentives (if each person is trying to account what the others, N-layers deep) are to just guess your true prediction.

comment by abramdemski · 2020-12-07T21:42:46.503Z · LW(p) · GW(p)

I suspect that the incentive is far from that; instead, the strategic response is to randomize somewhat, so that you expect to land in a relatively unclaimed region.

However, this strategy is counter-intuitive enough that humans may just be honest instead, relying on differences in beliefs to inject some randomization for you.

answer by Dagon · 2020-12-07T19:09:22.585Z · LW(p) · GW(p)

When played as a game, the purpose is rarely for the organizer to get a good estimate.   It's either for entertainment or for point-scoring by the participants.  As such, the bracketing strategy is just part of the fun.  If you want to "fix" it, just make it simultaneous - you get better information about independent estimates, and you remove (some of) the strategic adjustment of guesses.

You could also give up the "one winner" idea - scoring based on distance from actual (or better, by impact of error, which can vary by topic), without caring whether someone else was closer, would incent true estimate submission.

Finally, why not just set up a betting market?  Let participants weight their estimates by strength of information.

 

answer by Unnamed · 2020-12-07T21:33:51.157Z · LW(p) · GW(p)

If you're just doing this occasionally without recordkeeping, then it seems convenient to have the game result in "winners" rather than a more fine-grained score. But it could be fine to sometimes have multiple winners, or zero winners. Here's a simple protocol that does that:

The person who asks the question also defines what counts as "winning". e.g. "What's the value of such-and-such? Can anybody get it within 10%?" Then everyone guesses simultaneously, and all the people whose guesses are within 10% of the true value are "winners".

("Simultaneous" guessing can mean that first everyone comes up with their guess in their head, and then they take turns saying them out loud while on the honor system to not change their guess.)

Slightly more complicated, the asker could propose 2 standards of winning. "When did X happen? Grand prize if you guess the exact year, honorable mention if you get it within 5 years." Then if anyone guesses the exact year they're the big winner(s) and the people who get it within 5 years get the lesser glow of "honorable mention". And if no one guesses the exact year then the people who get it within 5 years feel more like winners.

If you continue farther in this direction you could get to one of Ericf's proposals [LW(p) · GW(p)]. I think my version has lower barriers to entry, while Ericf's version could work better among people who use it regularly.

answer by Donald Hobson · 2020-12-08T00:32:43.233Z · LW(p) · GW(p)

Get each player to assign a probability distribution over answers. Starting with an equal prior over players answers, update it based on the observation. Sample from the posterior. (So if Alice assigned twice as much prob as Bob to the correct outcome, then Alice is twice as likely to win. )

answer by SimonM · 2020-12-07T16:54:30.521Z · LW(p) · GW(p)

Or, we could give confidence intervals. But how should they be scored?

 

A relatively simply way to do this would be rather than CI, you give "location and scale" and are scored according to:

comment by Bruce G · 2020-12-07T17:39:24.020Z · LW(p) · GW(p)

Can you clarify (possibly by giving an example)?  Are players are trying to minimize their score as calculated by this method?

And if so, is there any incentive to not just pick a huge number for the scale to minimize that way?

answer by Scott Garrabrant · 2020-12-08T01:02:28.603Z · LW(p) · GW(p)

Here is a proposal. Not sure if I like it better or worse than my other one.

In all cases, add infinitesimal noise to everyones votes, and the final answer, to avoid ties.

If there are an odd number  of people, have everyone submit a  confidence interval. Alternate (choosing randomly which to do first) taking the unlabeled person with the lowest remaining lower bound, and labeling them "below", and taking the unlabeled person with the highest remaining upper bound, and labeling them "above." When one person remains unlabeled, they are given the region , where  is the greatest lower bound among people labeled "below," and  is the least upper bound among people labeled "above." Recursively distribute the remaining region less than  across people labeled below and the remaining region greater than  across people labeled above.

If there are an even number  of people, have everyone name their median guess, let  be the average of the two inner-most guesses, and recursively distribute the region less than  to the people who gave guesses less than , and the region greater than  to the people who gave guesses greater than 

If there is 1 person, they get the entire region.

answer by Scott Garrabrant · 2020-12-08T00:03:10.874Z · LW(p) · GW(p)

This is my initial proposal, but I am going to think about it more:

There are  people. Everyone privately names a number, where the prompt is "What number do you think the answer has probability  of being below." Lowest bidder gets the territory , where  is the second lowest bid. 

The remaining  people divide up the  using the same process, with the prompt "What number do you think the answer has probability  of being below conditioned on being at least . Lowest bidder getting the territory , where  is the second lowest bid.

Continue this pattern until there is only one person left, who gets the remaining region.

This is not symmetric with respect to negating the answers, but it does have the property that everyone has a strategy (honest reporting) that guarantees that they win with subjective probability at least 

comment by Scott Garrabrant · 2020-12-08T00:25:47.038Z · LW(p) · GW(p)

I am not sure that the second price part of this algorithm is actually doing anything useful, but it feels more fair, and closer to incentivizing honest reporting than the first price version. I am not actually sure whether it is better to leave at the beginning or end if everyone reports honestly for the second price version. The first price version is equivalent to:

Move the knife slowly across the cake from left to right. Anyone can say "cut" at any time and get the piece to the left of the knife and leave the game. Repeat.

Replies from: Scott Garrabrant
comment by Scott Garrabrant · 2020-12-08T00:27:48.449Z · LW(p) · GW(p)

It is not exactly equivalent, because people get to see each other's bids from the previous round.

answer by GuySrinivasan · 2020-12-07T17:56:41.801Z · LW(p) · GW(p)

Everyone has to name a guess (mean) and range (standard deviation) of a normal distribution. Whoever's pdf takes the largest value at the true answer wins. Bonus: you may opt to invertibly transform the input first in one of several acceptable ways, most notably taking the logarithm. Now take that and simplify it. ;)

comment by Bruce G · 2020-12-07T18:26:12.519Z · LW(p) · GW(p)

Something like that could work, but it seems like you would still need to have a rule that you must guess before you know the other players guesses.

Otherwise, player 2 could simply guess the same mean as player 1 - with a slightly larger standard deviation - and have a PDF that takes a higher value everywhere except for a very small interval around the mean itself.

Alternatively, if 3 players all guessed the same standard deviation, and the means they guessed were 49, 50, and 51, then we would have the same problem that the opening post mentions in the first place.

comment by abramdemski · 2020-12-07T21:44:31.862Z · LW(p) · GW(p)

This seems fairly nice, except that actually doing the math is far too demanding for most settings where I'd play this game.

answer by Measure · 2020-12-07T20:16:58.695Z · LW(p) · GW(p)
  1. Everyone guesses a number (could be sequentially or simultaneously, but duplicate guesses are allowed).

  2. Anyone who guessed the correct value in the previous step wins.

  3. If no guesses were exactly correct, everyone guesses a new number and repeat until one or more players guess correctly.

This only works for reasonably small discrete ranges. For continuous or very large ranges, modify step 2 so that anyone who guesses within some distance of the true value wins.

answer by Bruce G · 2020-12-07T17:25:46.799Z · LW(p) · GW(p)

Is this for a one-shot game or are you doing this over many iterations with players getting some number of points each round?

One simple method (if you are doing multiple rounds) is to rank players each round (Closest=1st, Second Closest=2nd, etc) and assign points as follows:

Points = Number of Players - Rank

So say there are 3 players who guess as follows:

Player 1 guesses 50

Player 2 guesses 49

Player 3 guesses 51

And say the actual number is 52.

So their ranks for that round would be:

Player 1: 2nd place (Rank 2)

Player 2: 3rd place (Rank 3)

Player 3: 1st place (Rank 1)

And their scores would be:

Player 1: 3 - 2 = 1 point

Player 2: 3 - 3 = 0 points

Player 3: 3 - 1 = 2 points

I think this works better if you are calculating a winner over many rounds, so that there is a new ranking and new awarding of points on each round.  The same is true of least squared error, which you mention, and most of the other methods of incentivizing players to try to guess the mean expected value.

I could also think of other ways to incentivize this, and to use confidence intervals, but they all add complexity to the points calculations.

comment by abramdemski · 2020-12-07T21:45:58.481Z · LW(p) · GW(p)

Alas, it's usually one-shot.

Replies from: Bruce G
comment by Bruce G · 2020-12-07T23:33:09.129Z · LW(p) · GW(p)

In that case, the options are really limited and the main simple ideas for that (eg: guess before you know other player's guesses) have been mentioned already.

One other simple method for one-shot number games I can think of is:

Automatic Interval Equalization:

When all players guesses are known, you take the two players whose guesses are closest and calculate half the difference between them.  That amount is the allowable error, and each player's interval is his or her guess, plus or minus that allowable error.

You win if and only if the answer is in your interval.

Example:

Player 1 guesses 44

Player 2 guesses 50

Player 3 guesses 60

The allowable error for this would be ((50-44)/2) = 3

So the winning intervals would be:

Player 1: 41-47

Player 2: 47-53

Player 3: 57-63

This would result in at most one winner (unless the answer is half way between the 2 closest guesses).  Everyone's winning interval would be the same size and none would overlap.  And nobody would have an incentive to guess near someone else's (stated or expected) guess, unless they thought the answer was actually close to that.

However, it has the disadvantage that a lot of such contests would end up with no winner.

4 comments

Comments sorted by top scores.

comment by abramdemski · 2020-12-07T21:46:43.044Z · LW(p) · GW(p)

I'm surprised by the combination of high response rate and low karma this post currently has.

Replies from: habryka4, Measure
comment by habryka (habryka4) · 2020-12-07T21:54:05.376Z · LW(p) · GW(p)

Yeah, me too. No idea why.

comment by Measure · 2020-12-07T22:43:46.107Z · LW(p) · GW(p)

It feels weird to me to upvote what sounds like a request for ideas rather than upvoting the ideas themselves.

Maybe also people jump right into designing and critiquing the mechanisms and bypass the voting step?

Replies from: Dagon
comment by Dagon · 2020-12-07T23:20:04.323Z · LW(p) · GW(p)

I commented, but didn't vote.  It's mildly interesting, but only mildly - I suspect I'm missing something about how frequent or important this game is to your group.  To me, I'm familiar with "The Price is Right" game show, and I've very occasionally seen this ruleset for "guess the number of jellybeans" kind of party game, but it's just not important enough to really put much weight on.