Poker is a bad game for teaching epistemics. Figgie is a better one.

post by rossry · 2024-07-08T06:05:20.459Z · LW · GW · 47 comments

This is a link post for https://blog.rossry.net/figgie/


Editor's note: Somewhat after I posted this on my own blog, Max Chiswick cornered me at LessOnline / Manifest and gave me a whole new perspective on this topic. I now believe that there is a way to use poker to sharpen epistemics that works dramatically better than anything I had been considering. I hope to write it up—together with Max—when I have time. Anyway, I'm still happy to keep this post around as a record of my first thoughts on the matter, and because it's better than nothing in the time before Max and I get around to writing up our joint second thoughts.

As an epilogue to this story, Max and I are now running a beta test for a course on making AIs to play poker and other games. The course will be a synthesis of our respective theories of pedagogy re: games, and you can read more here or in the comments. The beta will run July 15-August 15, in-person in SF, and will be free, but signups have gone to waitlist-only. We're hoping to run additional iterations in-person in NYC and remote-first starting in September, so please sign up to the mailing list if either of those is of interest.


Some trading firms are driven by good decisions made by humans. (Some aren't, but we can set those aside. This post is about the ones that are.) Humans don't make better-than-average-quality decisions by default, so the better class of intellectually-driven quantitative trading firm realizes that they are in the business of training humans to make better decisions. (The second-best class of firm contents themselves with merely selecting talent.) Some firms, famously, use poker to teach traders about decision making under uncertainty.

First, the case for poker-as-educational-tool: You have to make decisions. (Goodbye, Candy Land.) You have to make them under uncertainty. (Goodbye, chess.) If you want to win against smart competition, you have to reverse-engineer the state of your competitors' uncertainty from their decisions, in order to make better decisions yourself. (Goodbye, blackjack.)

It's the last of these that is the rarest among games. In Camel Up—which is a great game for sharpening certain skills—you place bets and make trades on the outcome of a Candy Land-style camel race. Whether you should take one coin for sure or risk one to win five if the red camel holds the lead for another round... Turn after turn, you have to make these calculations and decisions under uncertainty. But there's no meaningful edge in scrutinizing your opponent's decision to pick the red camel. If they were right about the probabilities, you shouldn't have expected differently. And if they're wrong, it means they made a mistake, not that they know a secret about red camels.

Poker is different. Your decision is rarely dictated by the probabilities alone. Even if you draw the worst possible card, you can win if your opponent has been bluffing and has even worse—or if your next action convinces them that they should fold a hand that would have beaten yours. If you only play the odds that you see, and not the odds you see your opponent showing you, you will on average lose.

So as you grind and grind at poker, first you learn probabilities and how they should affect your decisions, then you learn to see what others' decisions imply about what they see, and then you can work on changing your decisions to avoid leaking what you know to the other players that are watching you. Or so I'm told. I would not describe myself as a particularly skilled poker player. I certainly have not ground and ground and ground.

Here's the thing, though: If you are a trading firm and you want to teach traders about making decisions under uncertainty, it's not enough that poker teaches it. Nor is it enough that poker, if you grind for thousands of hours, can teach quite a lot of it. A quantitative trading firm is primarily a socialist collective run for the benefit of its workers, but it is secondarily a capitalist enterprise trying to make money. The question, for our trader-curriculum designer, is whether poker is the most effective and efficient tool for teaching the epistemic skills you want. Ideally in the first hundred hours or so.

In Orson Scott Card's Ender's Game, the child-soldier-generals aren't taught formation tactics by quarterbacking American football. They're all going to Battle School for months(?) of training, and the International Fleet can afford to teach them a new and made-up game. So the Fleet does give the children an entirely new game that's better-aligned with the skills they care about, a kind of zero-gravity capture-the-flag.

Back on present-day earth, our trader-curriculum designer is looking for a game that yields its lessons over dozens of hours of play among a group of traders (or interns) that work for the firm. They're going to do this year after year for class after class of traders and interns. For them, it is absolutely a live option to invent a new game out of whole cloth, teach them all the rules in an hour or two, and use it as the tool for teaching trading epistemics.

Jane Street, the trading firm, recently released a new version of its game Figgie for iOS and Android, so maybe we should talk about that, especially as it compares to poker. Figgie, somewhat like the Battle School game, was invented in-house specifically to train interns in the skills of trading. The rules are here if you're curious, but this post should make sense even if you don't tab away to read them.


This might be a good time for a disclaimer. I worked at Jane Street from 2016 until 2022. For large parts of that time, I had responsibility for parts of the internship training program, including countless games of Figgie. I organized the first, second, and third Jane Street recruiting events where we taught Figgie to attendees on college campuses. And I won Jane Street's 2021 inter-office Figgie championship.

Okay, a slightly less self-congratulatory disclaimer: This month I learned that Jane Street had a public Figgie website at all. So I've been out of the world for a while, as it were.

Finally, Jane Street has not reviewed or endorsed the contents of this post, and has no editorial rights over what I write except those defined by the confidentiality agreements I signed as an employee. (I'm not under any non-disparagement agreement to Jane Street or any other former employer, for what it's worth.) This post is a review of the public features of a now-public game. My description of how Figgie might be used in a hypothetical educational curriculum should not be read as a close description of Jane Street's own use of the game, which in nontrivial ways differs on some of the points I suggest here.


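Actually, I want to talk about poker a bit more. What's bad about poker as a teaching tool? (I'll expand on each of these later on.)

  • Most decisions don't give you feedback about whether you were right for the right reasons.
  • If your playing partners aren't sufficiently skilled, you'll learn bad lessons.
  • It takes a long time to get reasonably good at the game.
  • Players spend most of their time at the table not making decisions.
  • A few situations turn the emotional stakes way up, past the level that's helpful.
  • Certain poker metaphors are perverse in real trading.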

Figgie, as an educational tool, has the advantages of poker that I listed and avoids these downsides. For that reason, it's a straightforwardly superior game for teaching traders (or anyone else) about making decisions under uncertainty, interpreting decisions made under uncertainty, disguising the interpretation of decisions you are making under uncertainty, and so on. (It has its own bad parts too, and if you use it as your only teaching tool, I suppose your trading firm will get what's coming to it.)

In Figgie, you make decisions, and you make them under uncertainty. More than that, you watch others make decisions under uncertainty and work to reverse-engineer what they know from their decisions. Even more so than in poker, the effects of your decisions interact directly with the nature of the uncertainty in a way that hammers in deep lessons about the hard parts of trading in markets. But also...

In poker, most decisions don't give you feedback about whether you were right for the right reasons. In traditional Texas Hold 'em, players nearly always fold their hands without revealing them to the other players, nor do they reveal their winning hand when their opponent folds. The only situation where anyone sees any cards other than their own is if two players stay in through the final round of betting (and even then, the second player might not show if they realize they've lost to the first-showing player). As a matter of competitive strategy, it's somewhat to your advantage to hide how you're playing certain combinations of cards from your tablemates.

But if you play 30 hands an hour and 5% of deals go to a showdown, there are just 2.25 player hands shown to the table every hour. This is terrible if you're a learning player trying to understand how better players play the game! On the rare occasion that I sat in on after-hours poker games with student-interns, I nearly always insisted that we fix this particular flaw by showing all folded and winning player-hands on any hand with betting, but even then it's not great.

By comparison, in Figgie, you see all four players' hands every game, and you might play 12 games an hour, for 36 chances to see why someone else played how they did. And when you do, the cards themselves can tell you how it worked for them.
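For concreteness, here is the back-of-the-envelope arithmetic behind those two figures as a small Python sketch. The hands-per-hour, showdown-rate, and games-per-hour numbers are the ones quoted above; the 1.5 hands revealed per showdown is an assumption (the second player mucks roughly half the time) chosen because it reproduces the 2.25 figure, and the constant and function names are purely illustrative.

```python
# Rough comparison of how many other players' hands a learner sees per hour.
# Rates come from the text; HANDS_SHOWN_PER_SHOWDOWN is an assumption.

POKER_HANDS_PER_HOUR = 30
SHOWDOWN_RATE = 0.05
HANDS_SHOWN_PER_SHOWDOWN = 1.5  # assumption: the second player mucks about half the time

FIGGIE_GAMES_PER_HOUR = 12
FIGGIE_OTHER_PLAYERS = 3  # four players at the table, minus yourself


def poker_hands_shown_per_hour() -> float:
    """Expected number of hands revealed to the table per hour of Hold 'em."""
    return POKER_HANDS_PER_HOUR * SHOWDOWN_RATE * HANDS_SHOWN_PER_SHOWDOWN


def figgie_hands_shown_per_hour() -> int:
    """Number of other players' hands you get to inspect per hour of Figgie."""
    return FIGGIE_GAMES_PER_HOUR * FIGGIE_OTHER_PLAYERS


if __name__ == "__main__":
    poker = poker_hands_shown_per_hour()
    figgie = figgie_hands_shown_per_hour()
    print(f"poker:  {poker:.2f} revealed hands per hour")
    print(f"figgie: {figgie} revealed hands per hour")
    print(f"ratio:  {figgie / poker:.0f}x")
```

Under these assumptions, Figgie reveals roughly 16 times as many opposing hands per hour as unmodified Hold 'em.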

If your poker playing partners aren't sufficiently skilled, you'll learn bad lessons. The rarity of revealed hands is particularly bad in a less-skilled or semi-skilled group, because nearly all of your actual feedback about hands won or lost will be based on the assumptions of your opponent in that hand.

If your opponent makes bad assumptions or bad decisions, your decisions won't be rewarded properly, and it can take you a very long time indeed to figure out from first principles that that is happening. If you are playing with a player who thinks that "all reds" is a strong hand, it can take you many, many hands to figure out that they're overestimating their hands instead of just getting anomalously lucky with their hidden cards while everyone else folds!

(Is someone who knows more about poker than I do going to tell me that this specific example is wrong-ish? We'll find out!)

There are certain strategies in Figgie that work on less-skilled players and don't work well on more-skilled players, as there are in any interesting game. But for the most part, a smart and dedicated group of new Figgie players in their first twenty or so games will have re-discovered roughly reasonable play that will reward better play. The game very nearly teaches itself, including its strategic depth, and makes it easy to update towards better habits even if your entire playgroup starts without a clue. Helping matters further, the misconceptions that you do have tend to get sanded down fairly rapidly by the game's results.

Making all this even worse (for poker), it takes a long time to get reasonably good at poker. The consensus opinion I found on poker forums is that it takes between 500 and 1,000 hours to become "good" at the game (according to forum-posters, I guess). I'll assert that no matter how educational you think poker is, it's not really efficient for your staff to spend three to six full-time months learning the game. And in my personal experience, the first part of that learning curve is a bit of an unforgiving wall where it is hard to be learning any transferable skills while you're still trying to get the game-specific fundamentals down.

By contrast, Figgie's learning curve is relatively forgiving, and it's mostly teaching good lessons even while you're scrambling (so long as you have the mechanics of trading down, which I claim takes barely less time than learning how and when to bet in Hold 'em). Players get a lot out of a few dozen hours without the long slog through gittin' gud.

Poker players spend most of the time at the table not making decisions. One of the greatest hazards for a beginning poker player is that they will make bad decisions because they want to play more poker instead of exiting hands just after seeing their cards. But this is understandable, because correct Texas Hold 'em play involves immediately folding something like 75% of the hands you are dealt!

Unhelpfully, when you correctly fold but two of your eight tablemates get non-foldable hands, then you get to spend several minutes watching them play poker, very likely won't see their cards, and then finally get dealt the next hand (which you are probably supposed to fold). In the rare hand that you do play, you'll spend half your time waiting for your opponent to make a decision. There's a reason that professional online players play four or more different tables at once—you spend only a small fraction of the time making decisions, and the vast majority of it waiting for others to play poker.

In Figgie, I'd estimate that every player at the table has something to be doing for 75% or more of a 4-minute round, and the dead time between rounds in a fast-moving table can be well under 20%. That's an action-to-dead-time ratio that pulls ahead of the John Wick movies (which blow nearly every other "action" movie out of the water).

A few poker situations turn the emotional stakes way up, past the level that's helpful. To a first approximation, the stakes of a decision in poker go up literally exponentially in the rounds of a single hand. In Hold 'em, it's not unusual for the stakes of the fourth round of betting to be several hundred times the initial stakes (unless someone folds before then). Since it's conventional for the initial stakes to be an amount of money that you'd at least notice losing (say, a dollar), stakes hundreds of times that can be...stressful.

It's commonly argued that it's helpful for traders to train a lower level of risk aversion for non-fatal bets, but I would submit that it's counterproductive to be training that risk tolerance while teaching another important lesson. Though these late-round high-stakes situations are rare under proper play, a player who makes systematically conservative choices in high-stakes situations (specifically, by folding more often) can be exploited by other players pushing them into the high stakes in order to get them to mis-play. So an emotional bias that is tough to scrub from a small set of situations can bias an entire table's worth of play for the worse and the less-educational.

Bets in Figgie range from 1 unit to 59 units, and in practice the vast majority of "big" decisions will only have stakes ten or fifteen times larger than the smallest ones. This amount of range rewards players for thinking about the more-valuable actions first, but still lets a group set the cents-per-betting-unit stakes to be meaningful at the small end without being unproductively stressful at the high end.

Certain poker metaphors are perverse in real trading. There's no natural analogue of a poker bluff in quantitative trading. While you may be trying to hide your very best trading among your merely-good trading so that the extremely-attentive don't find out what you're doing, I sincerely hope that you never have reason to hide your worst trades in with your best ones as part of a mixed strategy! Meanwhile, the skill and instinct of mixing ranges and reading mixed ranges is at the heart of mid- and high-level poker strategy (I am told; again, I'm not a particular expert here).

Figgie, as a game whose core metaphor is directly about distinguishing between positive-sum and adversarial trading, mostly trains instincts that make good fundamental sense in markets. For example:

There are some artificial tricks to learn ("when someone is buying cards and suddenly stops, it means they got five of that suit"), but far fewer than in poker.


I don't want to claim that Figgie is the perfect game; it has its own shortcomings and flaws.

These shortcomings, I should note, tend to have the effect of further disadvantaging student-players from historically under-represented groups. Any institutional educator using Figgie should thoughtfully account for that fact, or their efforts will feed structural biases already being pushed by the systems around them.

47 comments

Comments sorted by top scores.

comment by rossry · 2024-07-08T06:13:18.006Z · LW(p) · GW(p)

As mentioned in the opening note, Max Chiswick and I are working on launching an online class that provides a ladder of practical challenges between "write a bot that plays tic-tac-toe" and "write a bot that achieves the 2019 state of the art in no-limit texas holdem". I'm excited to be working on this and teaching it not because I think that programming game-playing AIs is the great important skill of our time, but because I think that thinking about systematically playing imperfect-information games is one of the best ways to sharpen your skills at systematically reasoning under uncertainty.

If this is of interest to you, we'll be running a beta test of the course material from July 15 to August 15, in-person, in San Francisco. More information here or reach out on any channel that reaches me. EDIT: Signups for the beta have gone to a waitlist, as we have more students than capacity. We're hoping to run additional iterations in-person in NYC and remote-first starting in September, so please sign up to the mailing list if either of those are of interest.

Replies from: Alexei
comment by Alexei · 2024-07-08T12:19:52.590Z · LW(p) · GW(p)

I’m interested! But I live in Portugal, so it would need to be remote.

Replies from: rossry
comment by rossry · 2024-07-08T15:29:13.159Z · LW(p) · GW(p)

You can certainly join our mailing list and you'll hear when we launch remotely!

comment by Max Entropy · 2024-07-12T02:40:56.923Z · LW(p) · GW(p)

I've played Figgie a fair bit and don't think it's a good tool for teaching epistemics outside of a Jane Street trading internship.

To actually learn with it, you first need a number of motivated playing partners. They need to be quite skilled for their actions to be informative, and the game sucks before they reach this point. In my experience people don't reach this skill level within their first couple hours of play. It also requires N custom-made decks for N rounds of play, or for everyone to install a specific app. So it's really only practical for a trading firm internship program.

I have what I would call unusually good epistemics, and I mostly got there by obsessively reading the financial news (Bloomberg, Reuters) and making concrete predictions about how various stories would develop, and seeing where I went wrong. I'd ask myself questions like "how, concretely, did this article come into being?" I did this for about 9 months, and by the end I was rarely surprised by news items, could spot the signs of failure modes like circular reporting (super common), and had working mental models of many important institutions.

The other valuable thing I did was putting probabilities on everyday occurrences (65% confident I get home within 15 mins, etc.). I recorded these, and reflected on what went wrong when I deviated too far from perfect calibration.
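One way such a record could be reviewed is with a small calibration table, sketched below. This is only an illustration under assumptions, not the commenter's actual method: the function name `calibration_table`, the ten fixed buckets, and the sample records are all hypothetical.

```python
from collections import defaultdict


def calibration_table(records):
    """Compare stated probabilities with observed frequencies.

    `records` is an iterable of (stated_probability, happened) pairs.
    Predictions are grouped into ten buckets (0-9%, 10-19%, ..., 90-100%).
    Returns a list of (mean_stated, hit_rate, count) rows, lowest bucket first.
    """
    buckets = defaultdict(list)
    for stated, happened in records:
        # Work in whole percentage points to avoid float edge cases; 100% joins the top bucket.
        index = min(round(stated * 100) // 10, 9)
        buckets[index].append((stated, happened))

    rows = []
    for index in sorted(buckets):
        entries = buckets[index]
        mean_stated = sum(p for p, _ in entries) / len(entries)
        hit_rate = sum(1 for _, h in entries if h) / len(entries)
        rows.append((mean_stated, hit_rate, len(entries)))
    return rows


if __name__ == "__main__":
    # Hypothetical log: four "65% confident" calls and two "90% confident" calls.
    sample = [(0.65, True), (0.65, True), (0.65, False), (0.65, True),
              (0.90, True), (0.90, True)]
    for mean_stated, hit_rate, n in calibration_table(sample):
        print(f"stated {mean_stated:.0%}  observed {hit_rate:.0%}  (n={n})")
```

A bucket whose observed frequency sits far from its stated probability is a candidate for the kind of reflection described above, though with only a handful of entries per bucket the gap may just be noise.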

Replies from: Fejfo, flusterclick, ben-millwood
comment by Fejfo · 2024-07-17T17:32:58.504Z · LW(p) · GW(p)

Figgie may not be a good game, but it's certainly better than poker. What game would be better than Figgie?

Replies from: rossry
comment by rossry · 2024-07-19T04:28:46.534Z · LW(p) · GW(p)

In a slightly different vein, I think the D&D.Sci series is great at training analysis and inference (though I will admit I haven't sat down to do one properly).

Depending on your exact goals, a simulated trading challenge might be better than that, which I have even more thoughts about (and hopefully, someday, plans for).

comment by flusterclick · 2024-07-12T11:33:13.309Z · LW(p) · GW(p)

Having learned about this game right now, this is my idea to set up a random deck (so not field-tested, and maybe I'm misunderstanding):

Separate into 4 piles of 12 cards, according to suit, face down.

Another person moves the piles about like in three-card Monte until no one knows where each suit is.

Randomly choose 3 piles and remove 2, 2, and 4 cards from them. Or always remove 4 from the first and 2 each from the next two, going left to right or by whatever convention.

Shuffle together remaining piles, deal.

comment by Ben Millwood (ben-millwood) · 2024-07-12T11:14:13.528Z · LW(p) · GW(p)

It also requires N custom-made decks for N rounds of play

You can make a figgie deck by taking a normal deck of cards and removing 1, 3, 3, and 5 cards from randomly-selected suits. This is easiest to do if you have a non-participant dealer, but if you don't, it's not too hard to come up with a protocol that allows you to do this as part of shuffling without anyone knowing the results.
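As a rough illustration of the construction described here, the sketch below builds such a deck in Python. It relies only on the 1 / 3 / 3 / 5 removal counts given in this comment; the function name `make_figgie_deck`, the card representation, and the choice to remove ranks at random within each suit are assumptions for the example, not anything taken from the official rules.

```python
import random

SUITS = ("spades", "clubs", "hearts", "diamonds")
RANKS = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")
REMOVALS = (1, 3, 3, 5)  # cards dropped from each suit, as described in the comment


def make_figgie_deck(rng=None):
    """Return a shuffled 40-card deck built from a standard 52-card deck."""
    if rng is None:
        rng = random.Random()

    # Assign the removal counts to the suits in a random order.
    removals = list(REMOVALS)
    rng.shuffle(removals)

    deck = []
    for suit, n_removed in zip(SUITS, removals):
        # Keep a random subset of ranks (assumption), so that seeing a
        # particular rank says nothing about how long its suit is.
        kept = rng.sample(RANKS, len(RANKS) - n_removed)
        deck.extend((rank, suit) for rank in kept)

    rng.shuffle(deck)
    return deck


if __name__ == "__main__":
    deck = make_figgie_deck()
    counts = {suit: sum(1 for _, s in deck if s == suit) for suit in SUITS}
    print(len(deck), counts)
```

A program plays the role of the non-participant dealer here: nobody at the table needs to see which suit lost how many cards.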

Replies from: rossry
comment by rossry · 2024-07-12T18:31:19.999Z · LW(p) · GW(p)

it's not too hard to come up with a protocol

For example: A moves the piles with B watching and C+D looking away, then C removes 1 / 3 / 3 / 5 cards from random piles and shuffles them together with D watching and A+B looking away.

Replies from: ben-millwood
comment by Ben Millwood (ben-millwood) · 2024-07-12T18:38:54.419Z · LW(p) · GW(p)

right, and as further small optimisations:

  1. you can just remove 1 card from each suit permanently before playing, leaving 0 / 2 / 2 / 4 to remove each game
  2. you don't need to split the entire deck into suits, just make 4 piles of 4 cards from each suit and remove from those (though I guess in practice the game often separates cards into suits anyway, so maybe this doesn't matter)
Replies from: philh
comment by philh · 2024-07-22T11:45:16.388Z · LW(p) · GW(p)

just make 4 piles of 4 cards from each suit and remove from those

I don't think you can do this because at least one person will see which cards are in those piles, and then seeing those cards in game will give them more info than they're supposed to have. E.g. if they see 9h in one of the piles and then 9h in game, they know hearts isn't the 8-card suit.

(The rules as written are unclear on this. But I assume that you're meant to remove cards at random from the suits, rather than having e.g. A-8 in one suit, A-Q in one, and A-10 in the other two. If you did that then getting dealt the Q or J would be a dead giveaway.)

Replies from: rossry
comment by rossry · 2024-07-22T16:59:10.852Z · LW(p) · GW(p)

I think all of Ben's and my proposals have assumed (without saying explicitly) that you shuffle within each suit. If you do that, then I think your concerns all go away? Let me know if you don't think so.

Replies from: philh
comment by philh · 2024-07-23T08:38:54.310Z · LW(p) · GW(p)

I think Ben's proposal is: between rounds, it takes a while to split the whole deck into suits, all hearts in one pile and all spades in another and so on. Instead you can just pick out four hearts, and four spades, and so on, and remove 0/2/2/4 cards from those piles, and shuffle the rest back into the deck. But no matter how you shuffle, I don't think you can do that without leaking information.

comment by kave · 2024-07-11T04:17:52.219Z · LW(p) · GW(p)

I have not read this post super carefully, so apologies if I've misread, but I think this post equivocates between "epistemics" and "how to trade". It may well be true that Figgie is better for teaching epistemics than poker, or that Figgie is good in an absolute sense! But I also think most interesting decision making under uncertainty is actually less adversarial than trading. Like, most of the problems are things like figuring out how to locate hypotheses (e.g. "which feature should I build?", or "what explains the fact the data goes up and then down like that?" or "why won't my car start?"), how and how much to pay for information ("who can we ask who will know about this?", "is it worth paying for this independent analysis of my design for a swing in my backyard, or do I trust my own?", "how much should I draw down user goodwill by throwing experimental features at them?"), or PvE bet-sizing ("how many times should I apply for a tenure track position before I give up" or "I got enough cheese and crackers for eight people, do you think that's enough?").

Replies from: rossry
comment by rossry · 2024-07-11T22:16:55.576Z · LW(p) · GW(p)

I agree that I'm conflating a few different teaching objectives, and there are dimensions of "epistemics" that trading in general doesn't teach. But on this I want to beg forgiveness on the grounds of, if I was fully recursively explicit about what I meant and didn't mean by every term, the post would have been even longer than it was.

I do have another long post to write with working title "What They Don't Teach You in Your Quant Trading Internship" about the ways that training in trading doesn't prepare you for other important things in the world, or will actively interfere with having good intuitions elsewhere.

All that being said, if you think "which feature should I build" doesn't have something to learn from Toward a Broader Conception of Adverse Selection, I posit that there's something missing.

comment by Michael Townsend (michael-townsend) · 2024-07-12T02:57:07.428Z · LW(p) · GW(p)

Interesting post! I used to play poker professionally, and think that this post is correct in identifying limitations of what people assume poker is primarily valuable for teaching (i.e., expected-value reasoning under uncertainty, and game theory) but misses what I think is most valuable about playing poker.

I feel fairly confident that within ~24 hours or so, I would be able to teach anyone enough strategy to, in principle, be a winning player who could maybe rack up $25 USD an hour at a typical US casino. But I don't think I could teach them the emotional control and discipline required to faithfully execute that strategy over a long period of time. That emotional control is what I see as the most difficult and generalisable aspect of the game.

Many of what you listed as disadvantages of the game are actually the very reason why poker is so testing.

Most decisions don't give you feedback on whether you were right for the right reasons, right for the wrong reasons, wrong for the right reasons, or wrong for the wrong reasons.

Yes, and this is extremely frustrating. It's also pretty analogous to the hazy feedback you get in real life. (Maybe trading is different.) Learning how to discern between when you lost but made the right move, and won but made the wrong one, is a constant struggle. A common (bad) habit is to look at some seemingly objective source of information (e.g., a "solver" or simulations that try to find the Nash equilibrium of a particular decision), but this is almost always just an emotional cop-out. I think there are analogs to this in real life. A lazy example might be: "well the expert said to do X" when actually, if you'd properly evaluated X you would have noticed the expert was wrong.

If your playing partners aren't sufficiently skilled at the game, you'll learn bad lessons.

This is one of my favourite misconceptions :).

Good players can make extraordinary profit against bad players, because their errors are highly predictable. Even the example you gave, of over-valuing red cards, results in highly predictable mistakes (e.g., over-valuing random hands). Generally, the way you should think about playing against less experienced players is just to account for the fact that they do random stuff, and there are better/worse strategies against random stuff.

But it is absolutely true that people find it frustrating losing to players worse than them, in ways that feel unfair. Getting used to that is another skill, similar to the one described above, where you have to learn to feel reward when you make a positive EV decision, rather than when you win money. Again, I think there are analogies in real life where this thinking is valuable.

Players spend the supermajority of their time at the table not playing the game and not making decisions.

True, but again, this forces you to confront how to make good decisions even when:

  • Bored.
  • You've been losing for a while and this is the first opportunity you have to maybe win money.

Certain poker metaphors are perverse in real trading.

You listed a few examples:

  • Bluffing
  • Mixed strategies being at the heart of poker

I think the main thing that would generalise here is that bluffing is quite scary, and unpleasant, but it's still often the right move. It's really difficult to balance having the courage to do it when profitable, without ending up being maniacal and doing it far too often.

A separate rant on mixed strategies: Yes, these are at the heart of poker theory. But in practice, I think even at surprisingly high levels of play, people misunderstand on some fundamental level why mixed strategies are part of theoretically optimal play. Those reasons apply in vastly fewer situations than people realise. To be honest, this is a bit distinct/off-topic from the thesis I'm arguing for in this comment, but if I tried to tie it in, I guess there's some generalisable skill in not just mimicking theoretically optimal play without actually understanding what parts of theory apply in practice?

To sum up:

In general, all the above factors you've listed as negative are also some of the main challenges to playing poker well. I think that what I got the most value from in my time playing was all the repetitions of:

  • Noticing a particular emotion or feeling, drawing me to a decision (e.g., "I'm scared --> fold" or "this guy has won the last five hands against me --> call")
  • Identifying what parts of that emotion are actually valuable and informative — just ignoring emotion is a mistake, it usually contains important information. The skill is disentangling what parts of the emotion you endorse versus which parts are irrational. 
  • Once you decide on what the best decision is, actually clicking the button. It's shocking how hard that is. (A funny example here is from one of the most famous poker players' "highlights" including repeatedly, correctly, identifying the specific hand his opponent has, and calling anyway. [1])

There is one part of what you're saying that rings true to me:

It takes too long to get good enough to squeeze the real educational juice out of the game.

Yep — and for this reason alone, I'd be reluctant to recommend anyone play poker for "epistemic training" reasons. Play poker if you find it fun, and gamble responsibly! Even though I'm arguing above that there are valuable lessons in poker that do generalise to real life, I spent thousands of hours playing, and it's honestly still a kind of novel rarity to go "oh there's something I learned from poker that can help me here." Most of what you'll get from playing poker is getting good at poker.

 

  1. ^

    Though, it might have been the right play if he was sufficiently uncertain! That's the frustrating thing about the game :).

Replies from: GuySrinivasan
comment by SarahNibs (GuySrinivasan) · 2024-07-12T20:17:45.704Z · LW(p) · GW(p)

it is absolutely true that people find it frustrating losing to players worse than them, in ways that feel unfair. Getting used to that is another skill, similar to the one described above, where you have to learn to feel reward when you make a positive EV decision, rather than when you win money

 

This is by far the most valuable thing I learned from poker. Reading Figgie's rules, it does seem like Figgie would teach it too, and faster.

comment by Dave Orr (dave-orr) · 2024-07-09T05:11:21.609Z · LW(p) · GW(p)

"If you are playing with a player who thinks that "all reds" is a strong hand, it can take you many, many hands to figure out that they're overestimating their hands instead of just getting anomalously lucky with their hidden cards while everyone else folds!"

As you guessed, this is wrong. If someone is playing a lot of hands, your first hypothesis is that they are too loose and making mistakes. At that point, each additional hand they play is evidence in favor of fishiness, and you can quickly become confident that they are bad.

Mistakes in the other direction are much harder to detect. If someone folds for 30 minutes, they plausibly just had a bad run of cards. We've all been there. They do have some discipline, but because folding is so common, each additional fold only adds a small bit of Bayesian evidence that the person is a rock.
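
A rough sketch of that asymmetry, with made-up numbers. The frequencies below are illustrative assumptions, not figures from this thread: suppose a solid player voluntarily plays about 20% of hands, a too-loose player about 50%, and a rock about 10%. The log-likelihood ratio of a single observation then shows how much each played hand or each fold should move you.

```python
import math

def evidence_bits(p_obs_given_hypothesis, p_obs_given_baseline, n=1):
    """Log-likelihood ratio, in bits, from n independent identical observations."""
    return n * math.log2(p_obs_given_hypothesis / p_obs_given_baseline)

# Hypothetical frequencies of voluntarily playing a hand (assumptions for illustration).
BASELINE, LOOSE, ROCK = 0.20, 0.50, 0.10

# Seeing one played hand: evidence for "too loose" over "solid".
bits_per_played_hand = evidence_bits(LOOSE, BASELINE)

# Seeing one fold: evidence for "rock" over "solid".
bits_per_fold = evidence_bits(1 - ROCK, 1 - BASELINE)

print(f"played hand: {bits_per_played_hand:.2f} bits, fold: {bits_per_fold:.2f} bits")
```

Under these assumed frequencies, one played hand carries roughly 1.3 bits toward "too loose", while one fold carries roughly 0.17 bits toward "rock", so it takes several folds to match the evidence from a single loose call.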

Replies from: rossry
comment by rossry · 2024-07-10T20:19:43.578Z · LW(p) · GW(p)

That all sounds right, but I want to invert your setup.

If someone is playing too many hands, your first hypothesis is that they are too loose and making mistakes. If someone folds for 30 minutes, then steals the blinds once, then folds some more, you will have a hard time telling whether they're playing wrong or have had a bad run of cards.

But in either case, it is going to be significantly harder for them to tell, from inside their own still-developing understanding of the game, whether the things that are happening to them are evidence about their own mistakes or anomalous luck or just the way the game is. Even more so if their opponents are playing something close to GTO rather than playing way-off-equilibrium exploits.

And, from a pedagogical perspective, the thing that I am usually trying to optimize for as a teacher is whether the game teaches itself to a student who is still largely confused -- not whether the game can be appreciated by a student who has already reached a level of understanding of the concepts it's being used to teach.

comment by Raemon · 2024-07-11T20:26:16.859Z · LW(p) · GW(p)

I'm curating this post, both for the post itself, as well as various followup discussion in the post disclaimer and comments that I found valuable.

I think the question of "how do we quickly/efficiently train epistemic skills?" is a very important one. I'm interested in the holy grail of training full-generality epistemic skills, and I'm interested in training more specific clusters of skills (such as ones relevant for trading). I agree with kave's comment that this post equivocates between "epistemics" and "trading" but I'm generally excited for LessWrong folk to develop the art of "designing games that efficiently teach nuanced skills that can transfer". 

I like rossry's attitude of "the main feedback loop of the game should help players become unconfused".

Replies from: rossry
comment by rossry · 2024-07-11T22:22:22.390Z · LW(p) · GW(p)

Appreciate it, Ray.

I definitely don't think this is the definitive word on how we [quickly, efficiently, usefully, comprehensively...] train epistemic skills. In my opinion, too many blog posts in the world try to be the definitive word on their thesis instead of one page in an ongoing conversation, and I'm trying to correct that instinct in myself. Plausibly I could have been clearer about this epistemic status up-front.

In any case, I'm looking forward to getting to revisit this post in the context of my LessOnline conversations with Max, and with the lessons we both learn as we design and run the AI-games course.

comment by Ben (ben-lang) · 2024-07-12T13:09:17.412Z · LW(p) · GW(p)

Something that confuses me a bit about Figgie is that not only is it a zero-sum game (which is fine), but every individual exchange is also zero-sum (which seems not fine). If I imagine a group of 4 people playing it, and two of them just say "I won't do any trading at all, just take my dealt hand (without looking at it) to the end of the round", and the other two players engage in trade, then (on average) the score of the two trading players will be the same as those of the two players who don't trade. This seems like it's a problem. If your assessment is that the other players are more skilled than you, then it is optimal to just not engage.

I haven't played it, so this idea might be very silly, but it feels like the scoring should be rewarding players who have made their hand very strongly contain one particular suit (even if it's not the goal suit). Then in the example above the two players engaging in trade can help one another to end up with lopsided hands (e.g. one has lots of hearts, the other lots of spades), so that the group that trades has a relative advantage over a group that doesn't.

As a candidate rule it would be something like: At round end every spade you have makes you pay 1 chip out to the person with the most spades (for all suits except the goal suit).

Replies from: rossry, Pixellation
comment by rossry · 2024-07-12T18:14:17.634Z · LW(p) · GW(p)

Notice your confusion! It isn't zero-sum at the level of each individual exchange. If you'd like the challenge of figuring out why not (which I think you can probably do if you load in a 4-minute bot game, don't make any trades yourself, double-check the scoring, and think about what is happening), then I think it would be a useful exercise!

If you want the spoiler:

The player with the most of the goal suit gets paid a bonus of 100 or 120; this is the portion of the pot not paid out as ten chips per card. When two players trade a particular card from A who has less of that suit to B who has more of that suit, it's zero-sum for them in terms of the per-card payout but positive-sum for them in terms of the bonus (at the expense of the players not participating), since it makes it more likely that the buyer will beat a non-participating player for the bonus (but not less likely that the seller will win it).
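
A minimal sketch of that arithmetic. It assumes a 200-chip pot (which is what makes the bonus 120 with 8 goal cards and 100 with 10) and that tied leaders split the bonus evenly; the function name and the example hands are hypothetical.

```python
def payouts(goal_counts, pot=200, per_card=10):
    """Chips paid to each player, given how many goal-suit cards each holds.

    Assumptions: the pot is 200 chips, every goal card pays `per_card`,
    and whatever is left over goes to the player(s) holding the most goal
    cards, split evenly on ties.
    """
    bonus = pot - per_card * sum(goal_counts)
    top = max(goal_counts)
    leaders = [i for i, c in enumerate(goal_counts) if c == top]
    return [
        per_card * c + (bonus / len(leaders) if i in leaders else 0)
        for i, c in enumerate(goal_counts)
    ]

# Players 0 and 1 trade with each other; players 2 and 3 sit out.
before = payouts([2, 3, 3, 0])  # players 1 and 2 are tied for the bonus
after = payouts([1, 4, 3, 0])   # player 0 sold one goal card to player 1

pair_gain = (after[0] + after[1]) - (before[0] + before[1])
print(before, after, pair_gain)
```

In this example the per-card payout just moves 10 chips from seller to buyer, but the pair's combined payout rises by 60 because the buyer now takes the whole bonus instead of splitting it with the non-trading player 2. Whatever price the two agree on is a transfer between them and nets out of the pair's total.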

Replies from: ben-lang
comment by Ben (ben-lang) · 2024-07-12T22:41:50.730Z · LW(p) · GW(p)

Confusion slain!

I forgot that there were leftover chips rewarded to the player with the most goal suit cards (I now remember seeing that in the rules, but wrote it off as a way of fixing the fact that the number of goal suit cards and players could both vary so there would be rounding errors, and didn't keep it in mind). That achieves the same kind of thing I was gesturing at (most of a suit), but much more elegantly.

Thank you for clarifying that.

comment by Minetta (Pixellation) · 2024-07-12T16:25:01.476Z · LW(p) · GW(p)

The real heart of this game is that the trades are not zero sum between the two parties. If you have 5 of a suit and your counterparty has 1, there are huge gains from trade to be had by interacting. This is a totally necessary condition for the game to be interesting. 

comment by max-sixty · 2024-07-12T00:31:17.314Z · LW(p) · GW(p)

(Excellent post, strongly agree at the object-level)

It's worth considering why poker is so popular relative to a game like Figgie: I'd claim it is significantly helped by the downsides you outline in obfuscating the quality of decisions and increasing the emotional stakes.

For a betting game to be successful, you need an ecosystem which includes lots of bad players, who ideally don't realize they're bad. So having your mistakes laid bare is prohibitive. And some emotional journey is fun, both for playing and watching others play.

Is there some way of making games like Figgie also have some of these properties? (preferably something more connected to the actual game than "winning a game of Figgie gives you a 60/40 chance to win $x"...)

Replies from: rossry
comment by rossry · 2024-07-12T18:25:46.501Z · LW(p) · GW(p)

Oh, I agree. Sort-of-relatedly, I asked a few poker pros at Manifest why we conventionally play 8-handed when we play socially, and my favorite answer was "because playing heads-up doesn't give you enough time to relax and chat". (My second-favorite, which is probably more explanatory, was "it's more economical for in-person casinos, and everyone else apes that.") And if you talk to home-game pros, they will absolutely have thoughts about how to win money on average while keeping their benefactors from knowing that they're reliably losing. The format of the game we play is shaped by social-emotional-economic factors other than pedagogy, but which are real incentives all the same.

Is there some way of making games like Figgie also have some of these properties?

I mean, Figgie itself is not purely skill-testing; you can always blame the cards, or blame A for feeding B and losing but also causing you to lose, or any number of other things.

If you wanted to make it fuzzier on purpose, I think you could do the thing that often gets proposed for dealing it at home, which is to deal 40 cards out of a 52-card deck and call the goal suit the opposite of the longest suit (which might not be 12), with some way to break ties. I think it's a worse pedagogical game for being less clear -- not unrelated to the fact that it will make it harder to figure out why you're winning or losing. And my guess is that the skill ceiling is higher, also not-unrelatedly.

comment by Lukas_Gloor · 2024-07-13T16:05:01.155Z · LW(p) · GW(p)

Some of the points you make don't apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game. 

  • If you did want to focus on the latter goal, you can play online poker (many players can play >12 tables at once) and after every session, run your hand histories through a program (e.g., "GTO Wizard") that will tell you where you made mistakes compared to optimal strategy, and how much they would cost you against an optimal-playing opponent. Then, for any mistake, you can even input the specific spot into the trainer program and practice it with similar hands 4-tabling against the computer, with immediate feedback every time on how you played the spot. 
Replies from: rossry
comment by rossry · 2024-07-15T07:57:22.160Z · LW(p) · GW(p)

But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game.

Strongly agree. I didn't realize this when I wrote the original post, but I'm now convinced. It has been the most interesting / useful thing that I've learned in the working-out of Cunningham's Law with respect to this post.

And so, there's a reason that the curriculum for my and Max's course shifts away from Nash equilibrium as the solution concept to optimizing winnings against an empirical (and non-Nash) field just as soon as we can manage it. For example, Practicum #3 (of 6) is "write a rock-paper-scissors bot that takes advantage of our not-exactly-random players as much as you can" without much further specification.
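
To give a flavor of what "taking advantage of not-exactly-random players" can mean, here is a minimal sketch of about the simplest exploitative approach: track how often the opponent has thrown each move and play whatever beats their most frequent one. This is only an illustration, not the practicum's reference solution; the class and method names are hypothetical.

```python
import random
from collections import Counter

# Maps each move to the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class FrequencyBot:
    """Best-responds to the opponent's empirical move frequencies."""

    def __init__(self, seed=None):
        self.counts = Counter()
        self.rng = random.Random(seed)

    def move(self):
        # With no history yet, fall back to a uniformly random move.
        if not self.counts:
            return self.rng.choice(list(BEATS))
        most_common_move, _ = self.counts.most_common(1)[0]
        return BEATS[most_common_move]

    def observe(self, opponent_move):
        self.counts[opponent_move] += 1

bot = FrequencyBot(seed=0)
for opponent_move in ["rock", "rock", "paper", "rock"]:
    bot.observe(opponent_move)
print(bot.move())  # the opponent has favored rock, so the bot answers with paper
```

A bot like this is itself very exploitable (an opponent who notices can stay one step ahead of it), which is exactly the trade-off between playing the equilibrium and optimizing against an empirical field.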

comment by FinalFormal2 · 2024-07-11T22:03:59.977Z · LW(p) · GW(p)

I've been interested in learning and playing figgie for a while. Unfortunately, when I tried the online platform I wasn't able to find any online games. Very enthused to learn there's an android option now, will be trying that out.

Your comparison of poker and figgie very much reminded me of Daniel Coyle's comparison of football and futsal, to which he attributed the disproportionate number of professional Brazilian footballers.

TL;DR futsal is a sort of indoor soccer favored in Brazil with a smaller heavier ball, a smaller field, and fewer players. Fewer players mean that more people get more ball time, and the ball and the field favor a focus on footwork and quick passing. Practicing futsal seems to make people better at football than practicing football.

Also, if anyone is interested in joining or hosting a game of figgie, that would be really cool and I'd be interested in that.

Replies from: MathiasKirkBonde, rossry, rossry
comment by MathiasKB (MathiasKirkBonde) · 2024-07-18T19:40:04.053Z · LW(p) · GW(p)

If someone wants to set up a figgie group to play, I'd love to join

comment by rossry · 2024-07-11T22:29:38.787Z · LW(p) · GW(p)

I'd also be happy to log on and play Figgie and/or post-match discussion sometime, if someone else wants to coordinate. I realistically won't be up for organizing a time, given what else competes for my cycles right now, but I would enthusiastically support the effort and show up if I can make it.

comment by rossry · 2024-07-11T22:27:10.710Z · LW(p) · GW(p)

You know, I had read the football / futsal thesis way back when I was doing curriculum design at Jane Street, though it had gotten buried in my mind somewhere. Thanks for bringing it back up!

If I'm being honest, it smells like something that doesn't literally replicate, but it has a plausible-enough kernel of truth that it's worth taking seriously even if it's not literally true of youth in Brazil. And I do take it seriously, whether consciously or not, in my own philosophy of pedagogical game design.

comment by followthesilence · 2024-07-09T05:12:23.935Z · LW(p) · GW(p)

Thanks for the intro to Figgie. It makes sense that it's a better game to teach trading concepts given it was designed specifically to teach trading interns, has its own trading platform with bid-ask pricing, and all the other good reasons you mention above.

I would take issue with the first part ("poker is a bad game for teaching epistemics"), especially relative to the universe of well-known games out there. To address your criticisms:

In poker, most decisions don't give you feedback about whether you were right for the right reasons.

This strikes me as more feature than bug. Just as it can be "to your advantage to hide how you're playing certain combinations of cards from your tablemates", so too is it typical for firms to try to disguise their motives and trading strategies from rivals. Poker (and trading) is about making optimal decisions with incomplete information. Learning to do this without immediate feedback is itself a valuable skill. Relying on results from a single hand/trade is too noisy and often the best you can do is guess/deduce the likelihood your play was +EV -- the most valuable feedback comes from your long-term results.

If your poker playing partners aren't sufficiently skilled, you'll learn bad lessons.

A big part of the game is understanding your relative skill and assessing your adversaries ("If you can't spot the sucker in your first half hour at the table, you are the sucker"). Once someone becomes proficient at poker, arguably the most lucrative skill becomes identifying unsophisticated players/markets and exploiting them. Clearly transferable to trading, though maybe not to being a decent human being.

My favorite poker concept applicable to trading and other areas of life is: "What level are they on?" The levels are, sequentially: "What do I have?", "What do they have?", "What do they think I have?", "What do they think I think they have?" and so on. 

I see this as applicable in speculative markets. For instance, when the last Bitcoin halving date was approaching, funny investment theses could abound: "(1) BTC Halving --> less supply --> BUY", "(2) Halving already priced in --> Level 1 thinkers will dump holdings when they don't get anticipated halving bounce --> SELL", "(3) Level 2 is right that BTC has already appreciated due to anticipated halving, but they don't realize that demand from new BTC ETF inflows is going to vastly outstrip newly constrained supply and we'll get a squeeze --> BUY", etc. Here some Level always risks learning a bad lesson (being right for the wrong reasons). The true skill is being able to deduce whether you can, over a larger sample, correctly assess the state/thinking of the market. 

It takes a long time to get reasonably good at poker

Good is a relative term here. Basic competence and understanding of key concepts that have transferability to trading can be achieved over much shorter timelines than those poker boards suggested. They are more referring to holding your own against professionals (or bots if playing online) for real money. 

Poker players spend most of the time at the table not making decisions.

Probably depends on what you're trading, but in my experience traders technically spent most of the time at their desks not making trades. Whether waiting to act or waiting for the next hand, there is value in gathering information and observing how your opponents are playing. 

A few poker situations turn the emotional stakes way up, past the level that's helpful.

This is another feature (not bug) to me. Even just setting up a toy game with play money or nickel stakes, poker has an amazing ability to put people "on tilt" where emotions distract from pursuit of optimal play or cause them to take outsized risks to chase losses. This can teach valuable lessons to junior traders learning to manage real assets. The best traders and poker professionals possess the skill, whether innate or learned, of tuning out the noise and not letting losing streaks get to their head.

Replies from: rossry, rossry
comment by rossry · 2024-07-10T19:59:08.557Z · LW(p) · GW(p)

I totally agree that poker (and I'll restrict to no-limit holdem especially) far surpasses nearly any other game at the broader cluster of goals. And I agree that there is a lot of value in the total of all the lessons you learn by fully mining out poker for insights.

My issue is really one of relative advantage / disadvantage, and of the ratio of grinding to insight across different parts of the learning curve. Relatedly, I think it's significantly more efficient to learn certain components separately and then to put them together than to approach them as one combined package. When I taught new traders, I thought it helpful to expose them to the emotional feeling of risk tolerance separately from the intuitive sense of adverse selection, separately from level-N efficiency / level-N+1 marginal, and separately from the skills of quantitative research. Then we'd work on putting the concepts together into increasingly complete exercises, building up to the scale of deploying research-derived algorithmic trading strategies to miniaturized stock markets (and then to real markets, though at some point that left my purview...).

I don't mean that it was a strict waterfall model -- it's sometimes extremely helpful to jump ahead temporarily to understand how things come together before going back to focus more on the fundamental components -- but as a matter of pedagogical design I feel reasonably confident that jumping straight into an environment with all of the concepts active is suboptimal, especially if having one under-developed makes it actively harder for you to learn another at the same time.

So yes, I think if you have nearly all of the right skills except for an impatience and a bias towards action, then playing in-person poker and practicing folding 80% of your hands can be just the prescription the doctor ordered. Or if you're trying to calibrate over-updating versus under-updating on limited information. Or if you're at a reasonable level at most of the things and are trying to stay sharp. But if you're early on the learning curve of four different things, then I want to claim it's not optimal to throw yourself at a game that wraps all of them up in interconnected ways, especially if they'll be harder to disentangle if you don't have a solid place to stand -- so to speak -- in the first place.

comment by rossry · 2024-07-10T20:12:20.788Z · LW(p) · GW(p)

(Separately from my sibling comment,) I think I agree that the richest source of insight from poker is to be had in evaluating other players' off-equilibrium behavior and determining how to respond with off-equilibrium behavior of your own.

I think that it is easy to dramatically over-estimate how much of this the typical student(*) will actually do in their first several-hundred hours of playing the game. At a minimum, I think the (I think common?) idea that GTO post-flop play is an intermediate-level technique and exploitative play is an advanced-level technique is correctly ordered if you're trying to reduce your $ losses at a strong social table, but backwards if you're trying to use the game as mental weightlifting. And the fact that it took me a decade after starting to casually learn the game to understand the preceding sentence is, at a minimum, a critique of how pedagogy-through-poker is nearly always done in practice.

(*) and I mean the term "student" broadly, to include professionals-in-training and adult learners looking to re-train

In fact, it wasn't until my conversation with Max that I appreciated that I had spent far too much time working on playing more GTO -- which I am still very far from -- and that I should probably have started trying to understand and exploit my opponents' play while I was still definitely bleeding money to my own exploitability. This is the largest thing that I've updated on since writing the post, and the thing I'd most want to cover in a part-2 follow-up.

comment by Review Bot · 2024-09-10T22:57:49.869Z · LW(p) · GW(p)

The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2025. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

comment by Joseph Miller (Josephm) · 2024-07-19T04:22:39.493Z · LW(p) · GW(p)

We just played Figgie at MATS 6.0, most players playing for the first time. I think we made lots of clearly bad decisions for the first 6 or 7 games. And reached a barely acceptable standard by about 10-15 games (but I say this as someone who was also playing for the first time).

Replies from: rossry
comment by rossry · 2024-07-19T04:30:58.079Z · LW(p) · GW(p)

Do you think it was educational even though you were making clearly bad decisions / not at "an acceptable standard" for the first dozen games?

Replies from: Josephm
comment by Joseph Miller (Josephm) · 2024-07-19T04:34:46.665Z · LW(p) · GW(p)

I think so. Mostly we learned about trading and the price discovery mechanism that is a core mechanic of the game. We started with minimal explanation of the rules, so I expect these things can be grokked faster by just saying them when introducing the game.

Replies from: rossry
comment by rossry · 2024-07-19T04:40:53.287Z · LW(p) · GW(p)

Good to hear it!

One of the things I find most remarkable about Figgie (cf. poker) is just how educational it can be with only a minimal explanation of the rules -- I'm generally pretty interested in what kinds of pedagogy can scale because it largely "teaches itself".

Replies from: Josephm
comment by Joseph Miller (Josephm) · 2024-07-19T19:29:40.306Z · LW(p) · GW(p)

Note that the group I was in only played on the app. I expect this makes it significantly harder to understand what's going on.

Replies from: rossry
comment by rossry · 2024-07-19T20:07:31.725Z · LW(p) · GW(p)

That makes sense; I am generally a big believer in the power of physical tokens in learning exercises. For example, I was pretty opposed to electronic transfers of the internal currency that Atlas Fellowship participants used to bet their beliefs (even though it was significantly more convenient than the physical stones that we also gave them to use).

I do think that the Figgie app has the advantage of taking care of the mechanics of figuring out who trades with who, or what the current markets are (which aren't core to the parts of the game I find most broadly useful), so I'm still trying to figure out whether I think the game is better taught with the app or with cards.

comment by Aleksander (Omnni) · 2024-07-13T22:00:17.737Z · LW(p) · GW(p)

I will pick out a specific and somewhat irrelevant part of this post because I want to leave a comment but don’t feel qualified to talk about any other part. This part is the segment about Ender’s Game. How hard battle school is really depends on whether we are talking about the books or the movie. In the movie, battle school is effectively a summer camp for learning how to kill aliens. In the books, however, battle school represents years of psychological torment and isolation which actually occur in multiple locations.

Replies from: rossry
comment by rossry · 2024-07-15T08:23:29.799Z · LW(p) · GW(p)

In my personal canon of literature, they never made a movie.

I think I've seen it...once? And cached the thought that it wasn't worth remembering or seeing again. When I wrote those paragraphs, I was thinking not at all about the portrayal in Hood's film, just what's in Card's novels and written works.

comment by angmoh · 2024-07-11T22:04:39.462Z · LW(p) · GW(p)

If your opponent makes bad assumptions or bad decisions, your decisions won't be rewarded properly, and it can take you a very long time indeed to figure out from first principles that that is happening. If you are playing with a player who thinks that "all reds" is a strong hand, it can take you many, many hands to figure out that they're overestimating their hands instead of just getting anomalously lucky with their hidden cards while everyone else folds!

(Is someone who knows more about poker than I do going to tell me that this specific example is wrong-ish? We'll find out!)

I'll take the bait since this is one of the cool meta aspects of poker!

There's a saying in online poker: "move up to where they respect your raises" - it's poking fun at the notion that it's possible to play well without modelling your opponents. The idea is that it's not valid to conclude that if you lose to a poor player, that you weren't "rewarded properly" - it is in fact your fault for lacking the situational awareness to adjust your strategy.

For a good player sitting with a person who thinks 'all reds' is a good hand, it'll be obvious before you ever see their cards. 

 

Anyway your point is right about the difficulty of learning 'organically' where you only play bad players.  A common failure mode in online poker involved players getting stuck at local maximums strategically - they'd adopt an autopilot-style strategy that did very well at lower limits surrounded by 'all reds' types, but get owned when they moved up to higher stakes and failed to adjust.

Replies from: rossry
comment by rossry · 2024-07-11T22:39:08.584Z · LW(p) · GW(p)

For a good player sitting with a person who thinks 'all reds' is a good hand, it'll be obvious before you ever see their cards.

I basically agree that it will be obvious to you (a reasonable poker player) or even to me (an interested and over-theorized amateur), but as I said in a cousin comment, what actually matters is whether it'll be obvious to the student making the mistake, which is a taller order.

I think that "all reds" is overstated as literally written (I mean, you'll eventually go to showdown and have it explained to you), but I mean it to gesture at a broader point, and because the scene in Eleven is too good not to quote.