Comments sorted by top scores.

comment by lsusr · 2020-09-25T11:12:24.555Z · LW(p) · GW(p)

So by the end Glue is almost never chosen, while Rock reverts to being a regular option which risks defeat by paper, that in time becomes more popular.

This sentence confuses me. It seems to me like you're implying that there is a time when the probability of choosing rock exceeds the probability of choosing glue, when in fact the Nash equilibrium strategy is $\frac{1}{3}$ rock, $\frac{1}{3}$ paper, $\frac{1}{3}$ glue and $\frac{0}{3}$ scissors.

Replies from: johnswentworth, None, KyriakosCH

↑ comment by johnswentworth · 2020-09-25T16:18:10.405Z · LW(p) · GW(p)

For others who want to check those numbers: note that glue dominates scissors (both beat paper, lose to rock, and glue beats scissors), so scissors should never be played. With that simplification, it's an ordinary game of rock-paper-scissors, except "scissors" is now called "glue".
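
For anyone who would rather check this mechanically, here is a minimal Python sketch. It assumes the rules exactly as described above (rock beats scissors and glue, paper beats rock, scissors beats paper, glue beats paper and scissors) and a +1/0/-1 payoff for win/tie/loss. The names `BEATS`, `payoff`, `weakly_dominates` and `expected_payoff` are made up for this illustration.

```python
from fractions import Fraction

MOVES = ["rock", "paper", "scissors", "glue"]

# Assumed rules: each move maps to the set of moves it defeats.
BEATS = {
    "rock": {"scissors", "glue"},
    "paper": {"rock"},
    "scissors": {"paper"},
    "glue": {"paper", "scissors"},
}


def payoff(a, b):
    """Payoff to the player choosing `a` against `b`: +1 win, 0 tie, -1 loss."""
    if b in BEATS[a]:
        return 1
    if a in BEATS[b]:
        return -1
    return 0


def weakly_dominates(a, b):
    """True if `a` is never worse than `b` and strictly better at least once."""
    diffs = [payoff(a, m) - payoff(b, m) for m in MOVES]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def expected_payoff(move, mix):
    """Expected payoff of a pure move against a mixed strategy."""
    return sum(p * payoff(move, other) for other, p in mix.items())


third = Fraction(1, 3)
MIX = {"rock": third, "paper": third, "scissors": Fraction(0), "glue": third}

# Payoff of every pure move against the claimed equilibrium mix.
payoffs_vs_mix = {m: expected_payoff(m, MIX) for m in MOVES}
print(weakly_dominates("glue", "scissors"), payoffs_vs_mix)
```

Against that mix, rock, paper and glue each earn 0 and scissors earns -1/3, so no pure strategy does better than the game's value of 0, which is what the equilibrium claim requires.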

Replies from: KyriakosCH

↑ comment by KyriakosCH · 2020-09-25T19:23:35.508Z · LW(p) · GW(p)

Please read my edited reply to lsusr.

↑ comment by [deleted] · 2020-09-27T22:41:42.373Z · LW(p) · GW(p)

You mean Nash equilibrium strategy? Rock-Paper-Scissors is a zero-sum game, so Pareto optimal is a trivial notion here.

Replies from: lsusr

↑ comment by lsusr · 2020-10-06T04:32:26.565Z · LW(p) · GW(p)

Fixed.

↑ comment by KyriakosCH · 2020-09-25T11:57:40.580Z · LW(p) · GW(p)

Edit (I rewrote this reply, cause it was too vague in the original :) )

Very correct in regards to every player actually having identified this (indeed, if all players are aware of the new balance, they will pick up that glue is a better type of scissors so scissors should not be picked). But imagine a player comes in and hasn't picked up this identity, while (for different reasons) they have picked up an aversion to choose rock from previous players. Then scissors still has a chance to win (against paper), and effectively rock is largely out, so the triplet scissors-paper-glue has glue as the permanent winner. This in turn (after a couple of games) is picked up and stabilizes the game as having three options for all (scissors no longer chosen), until a new player who is unaware joins.

Essentially the dynamic of the 4-choice game allows for periodic returns to a 3-choice, which is what can be used to trigger ongoing corrections to other systems.

Replies from: None

↑ comment by [deleted] · 2020-09-27T22:39:02.793Z · LW(p) · GW(p)

Regardless of what the new player does, there is no reason to ever play scissors. I don't see any interesting "4-choice dynamic" here. Perhaps you should pick a different example with multiple Nash equilibria.

Replies from: KyriakosCH

↑ comment by KyriakosCH · 2020-09-28T07:24:14.369Z · LW(p) · GW(p)

You are confusing "reason to choose" (which is obviously not there; optimal strategy is trivial to find) with "happens to be chosen". I.e., you are looking at what is said from an angle which isn't crucial to the point.

Everyone is aware that scissors is not to be chosen at any time if the player has correctly evaluated the dynamic. Try asking a non-sentence in a formal logic system to stop existing because it evaluated the dynamic, and you'll get why your point is not sensible.

comment by Andy Jones (andyljones) · 2020-09-26T12:24:01.630Z · LW(p) · GW(p)

You may be interested in alpha-rank. It's an Elo-esque system for highly 'nontransitive' games, i.e., games where there are lots of rock-paper-scissors-esque cycles.

At a high level, what it does is set up a graph like the one you've drawn, then places a token on a random node and repeatedly follows the 'defeated-by' edges. The amount of time spent on a node gives the strength of the strategy.
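
A rough sketch of that high-level picture, not the published alpha-rank algorithm: it assumes the four-move game discussed in this thread, a walk that moves from the current strategy to a uniformly random strategy that defeats it, and it reads "time spent on a node" as the stationary distribution of that walk. All names (`DEFEATED_BY`, `step`, `stationary`) are hypothetical.

```python
MOVES = ["rock", "paper", "scissors", "glue"]

# Assumed 'defeated-by' edges: each move maps to the moves that beat it.
DEFEATED_BY = {
    "rock": ["paper"],
    "paper": ["scissors", "glue"],
    "scissors": ["rock", "glue"],
    "glue": ["rock"],
}


def step(dist):
    """Advance the token's probability distribution by one move of the walk."""
    new = {m: 0.0 for m in MOVES}
    for move, mass in dist.items():
        winners = DEFEATED_BY[move]
        for w in winners:
            # The token hops to each winning move with equal probability.
            new[w] += mass / len(winners)
    return new


def stationary(iterations=500):
    """Approximate the long-run share of time spent on each node."""
    dist = {m: 1.0 / len(MOVES) for m in MOVES}
    for _ in range(iterations):
        dist = step(dist)
    return dist


scores = stationary()
for move in MOVES:
    print(f"{move}: {scores[move]:.4f}")
```

With these assumptions the walk settles at 4/13 for rock and for paper, 3/13 for glue and 2/13 for scissors.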

You might also be interested in rectified policy space response oracles, which is one approach to finding new, better strategies in nontransitive games.

Replies from: KyriakosCH

↑ comment by KyriakosCH · 2020-09-26T14:20:02.733Z · LW(p) · GW(p)

Thank you, I will have a look!

My own interest in recollecting this variation (an actual thing, from my childhood years) is that intuitively it seems to me that this type of limited setting may be enough so that the inherent dynamic of 'new player will go for the less than optimal strategy', and the periodic ripple effect it creates, can (be made to) mimic some elements of a formal logic system, namely the interactions of non-sentences with sentences.

So I posted this as a possible trigger for more reflection, not for establishing the trivial (optimal strategy in this corrupted variation of the game) ^_^

comment by David Scrimshaw (david-scrimshaw) · 2020-09-25T13:55:45.075Z · LW(p) · GW(p)

To see what other rock-paper-scissors scholars have to say on this, you might want to investigate the controversial "dynamite" option.