D&D.Sci Dungeoncrawling: The Crown of Command Evaluation & Ruleset

post by aphyer · 2021-11-16T00:29:12.193Z · LW · GW · 12 comments

Contents

  RULESET
    THE PARTY
    TRAPS
    ENEMIES
  DATASET GENERATION
  STRATEGY
    The Lost Temple of Lemarchand
    The Infernal Den of Cheliax
    The Goblin Warrens of Khaz-Gorond
  LEADERBOARD
  FEEDBACK REQUEST
12 comments

This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.

Full generation & evaluation code is available here if you are interested, or you can read on.

RULESET

THE PARTY

A party has a combined HP total.  (This represents health but also spells, healing potions, and any other resources a party has). Each adventurer contributes (1 + level) HP to the total, so e.g. a team of 4 Level 3 characters has 16HP.  This HP total drops as the party goes through encounters.  If it hits 0, the party withdraws.
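As a quick illustration of the HP rule, here is a minimal sketch. The function name `party_hp` is made up for this example and is not taken from the generation code.

```python
def party_hp(levels):
    """Combined starting HP: each adventurer contributes (1 + level)."""
    return sum(1 + level for level in levels)


# Four Level 3 characters, as in the example above.
print(party_hp([3, 3, 3, 3]))  # 16
```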

Adventurers come in 6 classes, each with a different combat ability:

Encounters have two types: Traps and Enemies.

TRAPS

All traps work the same way: they deal 1d6 damage, reduced by the level of the highest-level appropriate adventurer in your party (to a minimum of 0):

So if your party has a Level 4 Fighter,  a Level 3 Fighter, a Level 3 Cleric, and a Level 1 Mage:

If you have a Level 6 character, you will take no damage from the associated trap type.  Beyond this there's less benefit to increased level - it gives extra HP to the party, but nothing else.
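The trap rule can be sketched as below. This is an illustration rather than the actual generation code: the function names are hypothetical, and `best_level` stands for the level of the highest-level adventurer of the appropriate class (0 if the party has none).

```python
def trap_damage(roll, best_level):
    """Damage from one trap: the 1d6 roll minus the best level, floored at 0."""
    return max(0, roll - best_level)


def expected_trap_damage(best_level):
    """Average damage over the six equally likely d6 outcomes."""
    return sum(trap_damage(roll, best_level) for roll in range(1, 7)) / 6


for level in (0, 3, 6):
    print(level, expected_trap_damage(level))
```

Under these rules the average drops from 3.5 with no appropriate adventurer to 1.0 at Level 3 and to 0 at Level 6, which is why levels beyond 6 add nothing against traps.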
 

ENEMIES

A list of the enemies:

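| Enemy | Damage | Attack Type | Notes |
|---|---|---|---|
| Goblins | 1d3 | Range | |
| Goblin Chieftain | 1d4 | Melee | |
| Orcs | 1d4 | Melee | |
| Wolves | 1d4 | Melee | Beast |
| Orc Shaman | 1d6 | Magic | |
| Orc Warlord | 1d8 | Melee | |
| Skeletons | 1d3 | Range, Magic | Undead |
| Zombies | 1d3 | Melee, Magic | Undead |
| Ghosts | 1d4 | Magic | Undead |
| Basilisk | 1d8 | Melee, Magic | Beast |
| Lich | 1d10 | Magic | Undead |
| Dragon | 2d6 | Melee, Range, Magic | |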

When you encounter an enemy, it rolls its Damage. (So Orcs roll 1d4, and might do e.g. 3 damage).

If it is using an attack type you do not have defense against, that damage is doubled.  (So if you have no Fighter, the Orcs above will instead do 6 damage).

If an enemy has multiple attack types, you must have defense against them all to avoid taking double damage - so a Dragon will deal double damage unless you have all three defenses - but it can only double once, even if you are missing multiple defenses.

This is unaffected by the level of the character granting the defense - higher overall level gives more party HP to absorb damage, but e.g. when fighting against Orcs there's no difference between a Level 1 Fighter + Level 7 Mage vs a Level 7 Fighter + Level 1 Mage.

Clerics can use their healing magic against Undead: if you have at least one Cleric, that can count as any one form of defense you are otherwise missing against any Undead enemy.

Druids can use Wild Empathy to convince beasts not to attack: before doubling, reduce any damage suffered from Beasts by the level of the highest-level Druid in your party.
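The enemy rules above can be sketched as a single function. This is a hedged illustration, not the real evaluation code. The `DEFENDER` mapping is an assumption: Fighter covering Melee follows from the Orcs example, while Ranger covering Range and Mage covering Magic are inferred from the note in the comments that a Dragon needs a Fighter, a Ranger, and a Mage. Flooring the Druid reduction at 0 is also an assumption, and the name `enemy_damage` is hypothetical.

```python
# Assumed mapping from attack type to the class that defends against it.
DEFENDER = {"Melee": "Fighter", "Range": "Ranger", "Magic": "Mage"}


def enemy_damage(roll, attack_types, tags, party):
    """Damage taken from one enemy, given its damage roll.

    party is a list of (class_name, level) tuples; tags holds notes such as
    "Undead" or "Beast".
    """
    classes = {name for name, _ in party}
    missing = [t for t in attack_types if DEFENDER[t] not in classes]

    # A Cleric stands in for one missing defense against Undead.
    if "Undead" in tags and "Cleric" in classes and missing:
        missing = missing[1:]

    damage = roll
    # Druids reduce Beast damage before any doubling (floor at 0 is assumed).
    if "Beast" in tags:
        druid_levels = [level for name, level in party if name == "Druid"]
        damage = max(0, damage - max(druid_levels, default=0))

    # Damage doubles at most once, however many defenses are missing.
    if missing:
        damage *= 2
    return damage


# Orcs rolling 3 against a party with no Fighter, as in the example above.
print(enemy_damage(3, ["Melee"], [], [("Mage", 3)]))  # 6
```

Note that the adventurers' levels only matter for the Druid reduction; the doubling check looks at class membership alone, matching the Level 1 Fighter versus Level 7 Fighter example.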



 

DATASET GENERATION

Dungeons come in three varieties:

Adventurers then approach these dungeons.  Adventurers are usually pretty good at guessing (from rumors about a dungeon, the number of villages laid waste near it, etc.) how threatening it will be, and adventurers of around the right level will approach it. Adventurers somewhat underestimate Goblins in general, and tend to attempt their dungeons at a lower level than they should (I think abstractapplic was the first person to notice this one).



STRATEGY

Once you understand how the system works, general strategy is:

With perfect play, it is possible to reach 100% win rate on all three dungeons using 36,000gp.  This requires going beyond the bounds of the parties you've seen and building parties with weirdly artificial level differences, though.  A more realistic goal of 'selecting optimal classes while bringing all Level 3 adventurers' achieves an 85.75% overall win rate, with most of the loss probability coming from the risk of unlucky rolls against the Dragon in the Infernal Den of Cheliax.

The Lost Temple of Lemarchand

This is a Ruin-type dungeon.  All encounters in this were known.  

The Infernal Den of Cheliax

This is a Lair-type dungeon.  The final encounter is listed as 'Unknown'.  However, it is guaranteed to be a Dragon (the 'Infernal' prefix shows up only with Dragon or Lich, and a Lich will use Undead rather than Orcs as servants).  Congratulations to Measure, who posted a mostly-complete explanation of what encounters would be in this and the next dungeon.

The Goblin Warrens of Khaz-Gorond

This is a Monster Camp-type dungeon.  Most encounters in this were listed as 'Unknown.'  However, based on the generation algorithm they are fully predictable except for the order: there will be 3 Boulder Traps, 6 Goblins, and one Goblin Chieftain at the end.  (Each Goblin camp only builds one type of trap, and while some smaller camps have no Chieftain, any camp with 6+ Goblins will have one).



LEADERBOARD

Note: win rates below were Monte-Carlo derived rather than explicitly calculated.

Current leaderboard:

| Submitter | Lost Temple of Lemarchand | Infernal Den of Cheliax | Goblin Warrens of Khaz-Gorond | Total |
|---|---|---|---|---|
| Fully Optimal Play | D5, Ro5, M1, C1 (100.00%) | D6, F3, Ra2, M2 (100.00%) | F6, Ra2, C1, C1 (100.00%) | 100.00% |
| Optimal Class Allocation | D3, Ro3, M3, C3 (98.80%) | D3, F3, Ra3, M3 (87.88%) | F3, Ra3, C3, C3 (98.76%) | 85.75% |
| abstractapplic | D2, Ro2, M2, C2 (50.45%) | D3, F5, Ra3, M3 (96.43%) | F4, Ra4, F3, C3 (99.27%) | 48.29% |
| simon | D2, Ro2, M2, C3 (62.40%) | F4, Ra4, C3, D4 (71.22%) | F4, R3, R3, D3 (58.12%) | 25.83% |
| Yonge | D2, Ro2, M2, C3 (62.40%) | F7, Ra3, D2, C2 (42.16%) | Ra6, F4, C2, Ro1 (97.97%) | 25.77% |
| Taleuntum | Ra1, Ro1, C4, M4 (22.54%) | M1, F3, D4, Ra5 (98.35%) | F3, Ra3, Ro3, C4 (90.09%) | 19.97% |
| Measure | F1, M1, C3, D1 (0.80%) | F4, Ra3, M7, C4 (88.07%) | F3, F2, Ra3, C4 (83.57%) | 0.59% |
| Adventurer's Guild | 4 random Level 3s (27.26%) | 4 random Level 3s (14.31%) | 4 random Level 3s (10.20%) | 0.40% |
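The Total column appears consistent with multiplying the three per-dungeon win rates, i.e. treating the dungeons as independent and requiring a win in all three. The sketch below checks that reading against three leaderboard rows; the function name is hypothetical and the independence assumption is mine, not something the generation code is quoted as stating.

```python
from math import prod


def overall_win_rate(per_dungeon_percent):
    """Chance (in percent) of winning every dungeon, assuming independence."""
    return 100 * prod(p / 100 for p in per_dungeon_percent)


rows = {
    "Optimal Class Allocation": (98.80, 87.88, 98.76),
    "abstractapplic": (50.45, 96.43, 99.27),
    "Adventurer's Guild": (27.26, 14.31, 10.20),
}
for name, rates in rows.items():
    print(f"{name}: {overall_win_rate(rates):.2f}%")
```

These come out to 85.75%, 48.29%, and 0.40%, matching the Total column for those rows.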

Congratulations to those who successfully assembled the Crown of Command and became the Tyrant of Calantha, reigning with an iron fist until at last a desperate team of heroes was able to topple you and restore freedom to the nation once more.  abstractapplic has the highest win-rate here: if a future scenario needs an evil god-tyrant in it, I'll know who to put there.

Condolences to those who failed at this goal, and were doomed to a life of squalor with only a few thousand servants at your command.

For any future players who want to test their performance, you can edit and run the following lines in the code to include and test your proposed teams:

specific_team_test_runs = True
if specific_team_test_runs:
    my_world = World(log=False)
    runs = 0
    wins = 0
    losses = 0
    while runs < 50000:
        #lots of teams commented out here
        #string_party = [('Mage', 3), ('Cleric', 3), ('Rogue', 3), ('Druid', 3)]   # Class based LTL
        #string_party = [('Fighter', 3), ('Ranger', 3), ('Mage', 3), ('Druid', 3)]  # Class based ITC
        string_party = [('Ranger', 3), ('Fighter', 3), ('Cleric', 3), ('Cleric', 3)] # Class based GWK
        party = my_world.get_party_by_name_and_levels( [(x[0], x[1]) for x in string_party] )
        dungeon = Dungeon(my_world)
        #dungeon.encounter_names = [ 'Skeletons', 'Skeletons', 'Poison Needle Trap', 'Zombies', 'Snake Pit', 'Poison Needle Trap', 'Ghosts', 'Snake Pit' ]
        #dungeon.encounter_names = [ 'Snake Pit', 'Orcs', 'Snake Pit', 'Wolves', 'Dragon' ]
        dungeon.encounter_names = [ 'Goblins', 'Boulder Trap', 'Goblins', 'Goblins', 'Boulder Trap', 'Goblins', 'Goblins', 'Boulder Trap', 'Goblins', 'Goblin Chieftain' ]
        dungeon.get_encounters_by_name()
        party.run_dungeon(dungeon, log=False)
        if party.current_hp > 0:
            wins = wins + 1
        else:
            losses = losses + 1
        runs = runs + 1
    print('Won {}/{} ({:.2f}%)'.format(wins, runs, wins * 100 / runs))

Or, if you aren't familiar with the code, you can DM me and I can run it for you.
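Since these win rates are Monte-Carlo estimates, it may help to know roughly how noisy they are. The sketch below uses the standard binomial standard error, sqrt(p(1-p)/n), and assumes independent runs. The 50,000 figure is taken from the test loop above; whether the leaderboard numbers used the same run count is an assumption.

```python
from math import sqrt


def win_rate_standard_error(win_rate, runs):
    """Standard error of an estimated win rate from independent runs."""
    return sqrt(win_rate * (1 - win_rate) / runs)


# A 98.76% win rate estimated from 50,000 runs.
se = win_rate_standard_error(0.9876, 50_000)
print(f"{100 * se:.2f} percentage points")
```

At 50,000 runs the standard error is at most about 0.22 percentage points (the worst case, at a 50% win rate), so differences of a few hundredths of a percent between teams should not be read as meaningful.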



FEEDBACK REQUEST

As usual, I'm interested in feedback.  If you played the scenario, what did you like and what did you not like?  If you might have played but in the end did not, what scared you away?

Additionally, if anyone who's played a few of these is willing to sign up to take a look at the text & first few rows of the data of future scenarios I write before I post them, I would really appreciate it.  (I don't think the LW 'Get Feedback' option is quite meant for this use case).  I'm interested to know in advance things like:

If you are willing to do this, I can't offer to pay you anything, but I can offer you ~~the rulership of all Asia~~ ~~the love of Helen of Troy~~ your name credited for help in the scenario.  If this somehow seems like a good deal to you, let me know and I'll message you when I have something for you to look at.

Thanks for playing!

12 comments

Comments sorted by top scores.

comment by simon · 2021-11-16T02:11:57.447Z · LW(p) · GW(p)

Thanks for making the scenario. 

I found I was slow to start because it seemed a bit intimidating, as there were so many variables that were obviously going to confuse each other. (This was part of why I looked at Threat Level, since it was something I could actually solve). Once I got going, there was plenty to consider but not enough time, which is OK. Maybe some more obviously low-hanging fruit could help people get into it.

A minor formatting preference: if you would add a unique ID to each row, this would help identify a particular row in a way that is preserved through sorting of the data. 

On my particular solution:

I didn't realize I needed all three of Fighter, Ranger and Mage on Dragon. I should have looked at class combos on Dragon in more detail; that should have been discoverable. Meanwhile, the optimal choice on Goblin Warrens was (by sheer coincidence) going to be my second choice, but even if I had chosen it, I would still have been behind abstractapplic who got the proper Dragon combo as well as a good Goblin party. Congrats abstractapplic.

I explicitly contemplated the idea that cleric might restore resources, not from the data but from priors, but didn't look into it (are Clerics good at easy-but-long dungeons would be one possible line of approach).   I also would have eventually looked at the relative importance of different classes being high level if I had enough time.

comment by abstractapplic · 2021-11-16T17:15:37.460Z · LW(p) · GW(p)

Reflections on my attempt:

I’m pleasantly surprised by how well I did in both a general and absolute sense (if you asked me yesterday I would not have put my strategy’s odds of overall success above 20%). Of course, ~half the credit for this victory goes to Measure, whose inferences about the dungeons’ likely populations I was shameless in making use of.

If I’d had more time and energy to spare, I would have looked into how reliably teams which counter all their encounters win, and how character levels affect this outcome; from what I read here, I think that would have been a good next step.

Reflections on the challenge:

  • The problem statement was the most fun-to-read D&D.Sci introduction so far, including (imo) all of my own.
  • I found myself surprisingly uncomfortable playing a villain. (If you don’t get why I’d be bothered by mostly-task-irrelevant skippable flavortext then that makes two of us.)
  • The mechanic of “you have fungible but limited resources to allocate between multiple tasks, all of which have to be completed for a win” turns out to be an extremely good fit for D&D.Sci, and I look forward to using it in ~~one~~ several of my scenarios.
  • The difficulty level was in an uncanny valley between “simple task that’s only hard because Inference and Application are inherently hard” and “arbitrary fractal complexity which rivals that of the real world”; I would have liked this game more if it were significantly harder or easier.
  • I got to play another D&D.Sci scenario! Which I didn’t make! And which probably helped me to get better at making D&D.Sci scenarios!

Regarding future feedback:

If you – I here refer both to the esteemed OP and to anyone else with a complete-but-unreleased D&D.Sci game – want me to proof a future challenge before it’s released to the wider public, dm me and I’d be happy to take a look. I’d also be (reluctantly) willing to give (a small amount of) more general support to people perpetuating my genre (even though this would disqualify me from playing the resulting games, and my advice would mostly be variations on “do the things I did but better”).

comment by Sir Edmund · 2021-11-16T15:46:48.299Z · LW(p) · GW(p)

Thank you for making this. I have enjoyed following the D&D.Sci series, even though I don't get around to posting full solutions.

One thing that the series has given me is a better awareness of the limits of data science. Given enough time and effort, you can parse a data set along as many dimensions as you please, but the amount of time and effort needed grows exponentially based on the number of possible variables. If this scenario were a video game, I imagine that just controlling a party through a few different dungeons, and thereby seeing which enemies did high damage against which parties, would quickly give an intuitive sense and at minimum would help a strategist to avoid a lot of clear mistakes. The same goes for the League of Defenders of the Storm scenario -- an actual player of the game would quickly learn that level 1's beat their corresponding level 6's in play, whereas that fact wasn't as obvious from observing the overall data set.

So both on-the-ground experience and data science have their uses. It's valuable to practice what can be done with data science alone, but one key takeaway from this series is that if I'm ever making a bet with my own life (or just a lot of money) on the line, I pray I'll get a chance to practice my strategy as well as to observe the relevant data.

Replies from: aphyer
comment by aphyer · 2021-11-16T16:09:15.236Z · LW(p) · GW(p)

+1.

The fact that no-one's gotten the optimal solution is very much intended.  (If anyone had, I would be both very impressed with them and somewhat disappointed with myself.)  You should not expect to be able to fully model a domain with data science: it's like trying to thread a needle wearing huge thick gloves.  But you can expect to figure out something about the domain, and use that to at least substantially outperform randomness.  (Our highest scorer this round, abstractapplic, had ~half the optimal winrate, but ~100x the 'random approach' winrate).

comment by Pattern · 2021-11-17T16:11:50.759Z · LW(p) · GW(p)
> Clerics have Healing.  This restores 0.5HP per cleric to the party after each encounter if the party hasn't already retreated, plus provides defense against Undead (see Enemies below).

No boost based on level? (I know it's not D&D, but is this just to balance things, because they're also helpful against undead?)

comment by Richard_Kennaway · 2021-11-16T20:25:52.250Z · LW(p) · GW(p)

It seems fairly straightforward for someone, given these rules, to calculate the optimal solutions. Does there currently exist any AI software that would be capable of reading the rules as you have set them out, and finding those solutions?

I recall Lenat's Eurisko that won several tournaments of some type of space war game, by playing massive numbers of games with itself and finding strategies that humans would have difficulty finding (e.g. using "lifeboats" as "armour"), but I've never heard of more advanced stuff in that line. There is Cyc, but nothing seems to have come of that.

Replies from: aphyer
comment by aphyer · 2021-11-16T20:45:49.276Z · LW(p) · GW(p)

I imagine that would be primarily a language-processing issue. I'm not super-familiar with the current standard of AI, but I don't think it's quite good enough to do that.

With that said, I think you might be misunderstanding the objective of this game.  Players aren't actually given the rules here until the game is over.  This is the wrapup doc from last week's D&D.Sci scenario, where players were given not these full rules but the records of ~3k dungeon crawls that occurred under these rules.  The objective is to use that data to figure out the rules (or at least as much of them as is possible).  If you've done that successfully, it is supposed to be pretty straightforward to calculate solutions given the rules.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2021-11-16T21:23:19.568Z · LW(p) · GW(p)

I understand the objective and the context. I was just wondering about the current state of getting an AI to output the implications of a piece of text such as these D&D rules, rather than either generating more text like it, or operating on data like the data set you originally provided.

comment by Measure · 2021-11-16T18:10:04.590Z · LW(p) · GW(p)

My biggest mistake was modeling the encounters as independent points of failure. I had considered the possibility of hit points or something similar, but I didn't put in the effort to check e.g. how encounter failure rates varied with depth.

comment by Yonge · 2021-11-16T17:14:01.766Z · LW(p) · GW(p)

Thank you for posting this. Overall I felt the level of complexity was about right for an average DandD problem. I was able to extract some useful information with a moderate amount of effort, but  reading through the ruleset I doubt anyone could figure out the perfect team from the dataset without a lot of luck.

comment by Taleuntum · 2021-11-16T10:09:41.725Z · LW(p) · GW(p)

Thanks for the game, I really enjoyed it and finally got to try out in practice some things I had learned. The solution I submitted was the best possible team according to an XGBClassifier calibrated with sklearn.calibration.CalibratedClassifierCV. Calibration performance on a test set and evaluating a solution with a different model (Dense NN) did make me realize that the solution is unlikely to be performant, but it was worth a try.

comment by Brendan Long (korin43) · 2021-11-18T21:07:34.039Z · LW(p) · GW(p)

I haven't been playing because I don't have enough time (i.e. I have other priorities right now), but they're really interesting and I read all of them. I just wanted to mention that.