D&D.Sci Holiday Special: How the Grinch Pessimized Christmas Evaluation & Ruleset
post by aphyer · 2022-01-11T01:29:59.816Z · LW · GW · 3 comments
This is a follow-up to last week's D&D.Sci scenario: if you intend to play that and haven't done so yet, you should do so now before spoiling yourself.
Full generation code is available here if you are interested, or you can read on.
RULESET
Each of the six toys makes a different amount of noise depending on what child receives it:
- A Blum-Blooper makes 6 noise.
- A Fum-Foozler is more popular with female Who children - it makes 8 noise with a female Who child but only 4 noise with a male one.
- A Gah-Ginka is very noisy, but too simple to hold the attention of older children for long. It makes 11 noise - 1/2 (rounded down) the child's age (so anywhere from 11 for an Age 1 child to 5 for an Age 12 child).
- A Sloo-Slonker is complicated, and small Who children won't use it very much. It makes 5 noise + 1/2 (rounded down) the child's age (so anywhere from 5 for an Age 1 child to 11 for an Age 12 child).
- A Trum-Troopa can be used to hold very noisy Trum-Troopa Battles with other Trum-Troopas. If another Who child in the same family (having the same middle name e.g. Cindy Lou Who is in the Lou family) also has a Trum-Troopa, it makes 10 noise. If not, it makes only 5.
- A Who-Whonker is more popular with male Who children (who prefer Whonking things for some reason). It makes 9 noise with a male Who child but only 5 noise with a female one.
Each child's noise is the sum of their two toys' noises. Additionally, however, each distinct child has a favorite toy. Any noise a child makes with their favorite toy is doubled.
For example, Johnny Drew Who, in our dataset, is a 4-year old Male Who Child whose favorite toy is the Blum-Blooper. If he is given:
- A Blum-Blooper and a Fum-Foozler, he will make 16 noise in total: 6x2 from the Blum-Blooper and 4 from the Fum-Foozler.
- A Gah-Ginka and a Trum-Troopa, he will make either 14 or 19 noise in total: 9 from the Gah-Ginka, plus 10 if one of his siblings (Betty, Phoebe and Cindy Drew Who) also gets a Trum-Troopa and 5 if none do.
- A Sloo-Slonker and a Who-Whonker, he will make 16 noise in total: 7 from the Sloo-Slonker and 9 from the Who-Whonker.
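The rules above are small enough to write down directly. What follows is a minimal sketch of them, not the actual generation code: the function names `toy_noise` and `child_noise` and the `sibling_has_trum_troopa` flag are made up for illustration, and the sketch assumes the two toys are different (the rules above don't say what happens with duplicates).

```python
def toy_noise(toy, age, gender, sibling_has_trum_troopa=False):
    """Base noise for one toy, before any Favorite Toy doubling."""
    if toy == "Blum-Blooper":
        return 6
    if toy == "Fum-Foozler":
        return 8 if gender == "F" else 4
    if toy == "Gah-Ginka":
        return 11 - age // 2  # half the age, rounded down
    if toy == "Sloo-Slonker":
        return 5 + age // 2
    if toy == "Trum-Troopa":
        return 10 if sibling_has_trum_troopa else 5
    if toy == "Who-Whonker":
        return 9 if gender == "M" else 5
    raise ValueError(f"unknown toy: {toy}")


def child_noise(toys, age, gender, favorite, sibling_has_trum_troopa=False):
    """Total noise for one child: sum over toys, doubling the favorite."""
    total = 0
    for toy in toys:
        noise = toy_noise(toy, age, gender, sibling_has_trum_troopa)
        total += 2 * noise if toy == favorite else noise
    return total


# Johnny Drew Who: 4 years old, male, favorite toy Blum-Blooper.
johnny = dict(age=4, gender="M", favorite="Blum-Blooper")

print(child_noise(("Blum-Blooper", "Fum-Foozler"), **johnny))   # 16
print(child_noise(("Gah-Ginka", "Trum-Troopa"), **johnny))      # 14
print(child_noise(("Gah-Ginka", "Trum-Troopa"),
                  sibling_has_trum_troopa=True, **johnny))      # 19
print(child_noise(("Sloo-Slonker", "Who-Whonker"), **johnny))   # 16
```

The four printed values match the three Johnny examples above, with the Trum-Troopa case shown both ways.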
DATASET GENERATION
There are three families: the Drew Whos, Lou Whos and Sue Whos.
Each year, Who Children age and may be born - each family rolls a d4 and has a new child if the roll is greater than the number of children they currently have. As such, there are never more than 12 children (4 in each of the 3 families), and the overall number is sometimes a bit lower (average is around 3 children per family at any given time). Then each Who child from age 1-12 is given two toys.
Each Who child, when they are born, is assigned a Favorite Toy at random.
There weren't actually any tricks in this one. I meddled with the RNG in only one minor way: the RNG by default had a 1-year-old Who Child just born this year, and I left that child out because there wouldn't be a way for you to determine their Favorite Toy. Aside from that, the ages of the Whos and their favorite toys were purely the RNG.
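If you want to see the birth process in action, here is a rough sketch of one family's yearly d4 roll. It is not the real generation code. It assumes that a newborn counts as age 1 in the year it is born (consistent with the left-out child mentioned above) and, more speculatively, that a child who ages past 12 stops counting toward the family's head count. The name `simulate_family` is hypothetical.

```python
import random

MAX_TOY_AGE = 12  # children aged 1-12 receive toys


def simulate_family(years, rng):
    """Return the family's head count (children aged 1-12) after each year."""
    ages = []
    sizes = []
    for _ in range(years):
        # Everyone ages a year. ASSUMPTION: children older than 12
        # drop out of the head count used for the birth roll.
        ages = [a + 1 for a in ages if a + 1 <= MAX_TOY_AGE]
        # Roll a d4; a child is born if the roll beats the head count.
        if rng.randint(1, 4) > len(ages):
            ages.append(1)
        sizes.append(len(ages))
    return sizes


rng = random.Random(0)
sizes = simulate_family(20000, rng)
print("largest family size seen:", max(sizes))
print("average family size:", sum(sizes) / len(sizes))
```

Under these assumptions a family can never exceed 4 children, since no d4 roll beats a head count of 4. The long-run average also works out to about 3: if the average head count is c, births happen at rate (4 - c)/4 per year and each child is counted for 12 years, so c = 12(4 - c)/4, which gives c = 3.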
STRATEGY
Once you know how the rules work, the noise-minimizing strategy is to:
- Assign each toy type to the child who likes it least: Fum-Foozlers to boys, Who-Whonkers to girls, Sloo-Slonkers to young children and Gah-Ginkas to older children.
- Assign at most one Trum-Troopa per family, so that no Trum-Troopa Battles break out.
- Avoid assigning any Who child their Favorite Toy.
The noise-maximizing strategy is essentially the reverse.
The Who children and their Favorite Toys are:
Name | Age | Gender | Favorite Toy |
Andy Sue Who | 12 | M | Sloo-Slonker |
Betty Drew Who | 11 | F | Fum-Foozler |
Sally Sue Who | 11 | F | Fum-Foozler |
Phoebe Drew Who | 9 | F | Sloo-Slonker |
Freddie Lou Who | 8 | M | Sloo-Slonker |
Eddie Sue Who | 8 | M | Who-Whonker |
Cindy Drew Who | 6 | F | Who-Whonker |
Mary Lou Who | 6 | F | Gah-Ginka |
Ollie Lou Who | 5 | M | Fum-Foozler |
Johnny Drew Who | 4 | M | Blum-Blooper |
One example of a noise-minimizing allocation is:
- Andy Sue Who: Fum-Foozler, Gah-Ginka (9)
- Betty Drew Who: Who-Whonker, Gah-Ginka (11)
- Sally Sue Who: Who-Whonker, Blum-Blooper (11)
- Phoebe Drew Who: Who-Whonker, Blum-Blooper (11)
- Freddie Lou Who: Fum-Foozler, Blum-Blooper (10)
- Eddie Sue Who: Fum-Foozler, Trum-Troopa (9)
- Cindy Drew Who: Blum-Blooper, Trum-Troopa (11)
- Mary Lou Who: Sloo-Slonker, Who-Whonker (13)
- Ollie Lou Who: Sloo-Slonker, Trum-Troopa (12)
- Johnny Drew Who: Fum-Foozler, Sloo-Slonker (11)
For a total of 108 noise. One example of a noise-maximizing allocation is:
- Andy Sue Who: Sloo-Slonker, Who-Whonker (31)
- Betty Drew Who: Fum-Foozler, Trum-Troopa (26)
- Sally Sue Who: Blum-Blooper, Fum-Foozler (22)
- Phoebe Drew Who: Sloo-Slonker, Trum-Troopa (28)
- Freddie Lou Who: Sloo-Slonker, Who-Whonker (27)
- Eddie Sue Who: Blum-Blooper, Who-Whonker (24)
- Cindy Drew Who: Trum-Troopa, Who-Whonker (20)
- Mary Lou Who: Fum-Foozler, Gah-Ginka (24)
- Ollie Lou Who: Blum-Blooper, Fum-Foozler (14)
- Johnny Drew Who: Blum-Blooper, Gah-Ginka (21)
For a total of 237 noise.
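Both totals can be checked mechanically. The sketch below re-encodes the ruleset, the table of children, and the two example allocations from this post, then sums the noise. The names `CHILDREN`, `base_noise` and `total_noise` are invented for this example; it scores a given allocation only and does not search for optimal ones.

```python
from collections import Counter

# name: (family, age, gender, favorite toy), copied from the table above
CHILDREN = {
    "Andy Sue Who": ("Sue", 12, "M", "Sloo-Slonker"),
    "Betty Drew Who": ("Drew", 11, "F", "Fum-Foozler"),
    "Sally Sue Who": ("Sue", 11, "F", "Fum-Foozler"),
    "Phoebe Drew Who": ("Drew", 9, "F", "Sloo-Slonker"),
    "Freddie Lou Who": ("Lou", 8, "M", "Sloo-Slonker"),
    "Eddie Sue Who": ("Sue", 8, "M", "Who-Whonker"),
    "Cindy Drew Who": ("Drew", 6, "F", "Who-Whonker"),
    "Mary Lou Who": ("Lou", 6, "F", "Gah-Ginka"),
    "Ollie Lou Who": ("Lou", 5, "M", "Fum-Foozler"),
    "Johnny Drew Who": ("Drew", 4, "M", "Blum-Blooper"),
}

MIN_ALLOCATION = {
    "Andy Sue Who": ("Fum-Foozler", "Gah-Ginka"),
    "Betty Drew Who": ("Who-Whonker", "Gah-Ginka"),
    "Sally Sue Who": ("Who-Whonker", "Blum-Blooper"),
    "Phoebe Drew Who": ("Who-Whonker", "Blum-Blooper"),
    "Freddie Lou Who": ("Fum-Foozler", "Blum-Blooper"),
    "Eddie Sue Who": ("Fum-Foozler", "Trum-Troopa"),
    "Cindy Drew Who": ("Blum-Blooper", "Trum-Troopa"),
    "Mary Lou Who": ("Sloo-Slonker", "Who-Whonker"),
    "Ollie Lou Who": ("Sloo-Slonker", "Trum-Troopa"),
    "Johnny Drew Who": ("Fum-Foozler", "Sloo-Slonker"),
}

MAX_ALLOCATION = {
    "Andy Sue Who": ("Sloo-Slonker", "Who-Whonker"),
    "Betty Drew Who": ("Fum-Foozler", "Trum-Troopa"),
    "Sally Sue Who": ("Blum-Blooper", "Fum-Foozler"),
    "Phoebe Drew Who": ("Sloo-Slonker", "Trum-Troopa"),
    "Freddie Lou Who": ("Sloo-Slonker", "Who-Whonker"),
    "Eddie Sue Who": ("Blum-Blooper", "Who-Whonker"),
    "Cindy Drew Who": ("Trum-Troopa", "Who-Whonker"),
    "Mary Lou Who": ("Fum-Foozler", "Gah-Ginka"),
    "Ollie Lou Who": ("Blum-Blooper", "Fum-Foozler"),
    "Johnny Drew Who": ("Blum-Blooper", "Gah-Ginka"),
}


def base_noise(toy, age, gender, battle):
    """Noise for one toy before Favorite Toy doubling."""
    half_age = age // 2  # rounded down
    return {
        "Blum-Blooper": 6,
        "Fum-Foozler": 8 if gender == "F" else 4,
        "Gah-Ginka": 11 - half_age,
        "Sloo-Slonker": 5 + half_age,
        "Trum-Troopa": 10 if battle else 5,
        "Who-Whonker": 9 if gender == "M" else 5,
    }[toy]


def total_noise(allocation):
    """Sum noise over all children for a {name: (toy, toy)} allocation."""
    # Count Trum-Troopa holders per family; two or more means a battle.
    holders = Counter(
        CHILDREN[name][0]
        for name, toys in allocation.items()
        if "Trum-Troopa" in toys
    )
    total = 0
    for name, toys in allocation.items():
        family, age, gender, favorite = CHILDREN[name]
        battle = holders[family] >= 2
        for toy in toys:
            noise = base_noise(toy, age, gender, battle)
            total += 2 * noise if toy == favorite else noise
    return total


print(total_noise(MIN_ALLOCATION))  # 108
print(total_noise(MAX_ALLOCATION))  # 237
```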
LEADERBOARD
Note: inputting the toy selections was a somewhat manual process, so if you think I've scored you wrong, let me know.
Player | Noise |
Noise-minimizing allocation | 108 |
GuySrinivasan (min) | 108 |
abstractapplic (min) | 118 |
simon (min) | 123 |
Yonge (min) | 128 |
MadHatter (min) | 129 |
Random allocation | 165 |
MadHatter (max) | 204 |
abstractapplic (max) | 207 |
simon (max) | 229 |
Noise-maximizing allocation | 237 |
Congratulations to everyone who submitted, particularly to GuySrinivasan (who figured out the entire ruleset and got the perfect noise-minimizing allocation) and also to simon (who had the best Friendly Grinch solution).
FEEDBACK REQUEST
As usual, I'm interested in feedback. If you played the scenario, what did you like and what did you not like? If you might have played but in the end did not, what drove you away? Is the timeline too long/too short/just right?
In particular, I tried to make this scenario simple - there were a couple of sneaky things (Favorite Toys and Trum-Troopas), but for the most part I think it was a lot less complex. Bearing this out, we had a perfect answer from GuySrinivasan and near-perfect answers from several others. How did the difficulty of this scenario compare to what you want to see?
3 comments
Comments sorted by top scores.
comment by abstractapplic · 2022-01-11T10:22:04.876Z
Reflections on my attempt:
It looks like I was basically right. Even in the place I came up short – figuring out Trum-Troopas – I knew I was probably missing something, since it would have been weird for that to be the only not-perfectly-predictable part of the problem.
Reflections on the challenge:
This is the first D&D.Sci which is a pure puzzle; that is, the first one without randomness in the linkage between explanatory and response variables. I think this would be unfair for something presented as a social science problem, except that a) the Seussian context was a pretty big hint that normal rules don’t apply and implausibly-tidy solutions are on the table, b) it was fairly obvious after an hour or two of looking at the data that there were unusually clean linkages between at least some toy choices and at least some noise levels (G+S equals exactly 16 more than half the time, many toy combos have a suspiciously low number of possible noise outputs), and c) using only ML can still net you a much-better-than-chance result. However, I suspect the anomalous neatness may have caused limited engagement: once someone posts a complete or near-complete solution, why bother investigating for yourself? And why bother writing about your analysis if it’s identical or near-identical to one already posted?
Others may have other opinions, but I really liked the 10-day length: giving players a week plus a choice of weekend provided a lot of breathing room. I also (continue to) think the problem introduction was the best-written one so far. And while I don’t know if the high puzzleishness was a good idea overall, it definitely enabled one heck of a eureka moment when I figured out (most of) the rules. Thank you very much for running this game.
comment by SarahSrinivasan (GuySrinivasan) · 2022-01-12T16:48:21.227Z
I enjoyed this. Due to Events taking many, many spoons, I would not have tried to engage with something that seemed less ... deterministic. But this clearly had very deterministic elements just by groupby toy-pair | uniq, and was pretty fun to find each successive "obvious" thing, model it, "subtract it out", find a new "obvious", repeat until done.
comment by MadHatter · 2022-01-11T02:33:39.127Z
Thanks for organizing!
Feedback: I was a little bit surprised to see a perfectly regular solution. (And I did relatively poorly because of my assumption that there would not be one.) I feel like real-world data is never as clean as this; on the other hand, all data benefits from taking a closer look at it and trying to understand if there are any regularities in the failure modes of your modeling toolkit, so maybe this is just a lesson for me. Hard to say!