D&D.Sci Dungeoncrawling: The Crown of Command Evaluation & Ruleset

aphyer

post by aphyer · 2021-11-16T00:29:12.193Z · LW · GW · 12 comments

  RULESET
    THE PARTY
    TRAPS
    ENEMIES
  DATASET GENERATION
  STRATEGY
    The Lost Temple of Lemarchand
    The Infernal Den of Cheliax
    The Goblin Warrens of Khaz-Gorond
  LEADERBOARD
  FEEDBACK REQUEST
12 comments

This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.

Full generation & evaluation code is available here if you are interested, or you can read on.

RULESET

THE PARTY

A party has a combined HP total. (This represents health but also spells, healing potions, and any other resources a party has). Each adventurer contributes (1 + level) HP to the total, so e.g. a team of 4 Level 3 characters has 16HP. This HP total drops as the party goes through encounters. If it hits 0, the party withdraws.
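As a quick illustration of the HP rule above, here is a minimal sketch. The function name `party_hp` is my own and is not taken from the generation code.

```python
def party_hp(levels):
    """Starting HP of a party: each adventurer contributes (1 + level)."""
    return sum(1 + level for level in levels)


# Four Level 3 characters, as in the example above.
print(party_hp([3, 3, 3, 3]))  # 16
```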

Adventurers come in 6 classes, each with a different combat ability:

Fighters have Melee Guard, which helps defend against Melee attacks. If you have no Fighters, enemies with Melee attacks will charge into melee with your squishy ranged characters and deal double damage.
Rangers have Range Guard, which helps defend against Range attacks similarly.
Mages have Magic Guard, which helps defend against Magic attacks similarly.
Clerics have Healing. This restores 0.5HP per cleric to the party after each encounter if the party hasn't already retreated, plus provides defense against Undead (see Enemies below).
Druids have Wild Empathy. This helps defend against Beasts (see Enemies below).
Rogues have Cowardice. This is useless to the party, but makes them tend to show up to lower-level dungeons and not to higher-level ones, so a rogue will usually be higher-level than the dungeon they are encountering.

Encounters have two types: Traps and Enemies.

TRAPS

All traps work the same way: they deal 1d6 damage, reduced by the level of the highest-level appropriate adventurer in your party (to a minimum of 0):

Fighters counter Boulder Traps, smashing boulders with their great strength.
Rangers counter Lever Puzzle Rooms, using their bows to shoot levers from afar and to fire grappling hooks across the room.
Mages counter Riddle Doors, solving the riddles with their keen intellect.
Clerics counter Cursed Altars, dispelling the evil magicks upon them.
Druids counter Snake Pits, making friends with the snakes.
Rogues counter Poison Dart Traps, disarming them in order to steal the poison darts.

So if your party has a Level 4 Fighter, a Level 3 Fighter, a Level 3 Cleric, and a Level 1 Mage:

A Boulder Trap will do 1d6-4 damage (the higher-level fighter is counted, the lower-level one is ignored).
A Cursed Altar will do 1d6-3 damage.
A Riddle Door will do 1d6-1 damage.
A Snake Pit, Lever Puzzle Room or Poison Dart Trap will do 1d6 damage.

If you have a Level 6 character, you will take no damage from the associated trap type. Beyond this there's less benefit to increased level - it gives extra HP to the party, but nothing else.
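The trap rule can be sketched as follows. This is an illustrative reimplementation, not the actual generation code: the names `TRAP_COUNTER`, `trap_damage` and `expected_trap_damage` are hypothetical, and a party is assumed to be a list of `(class, level)` pairs. Trap names follow the wording of the list above (the sample code later in this post spells one of them 'Poison Needle Trap').

```python
from fractions import Fraction

# Which class counters which trap, as listed above.
TRAP_COUNTER = {
    "Boulder Trap": "Fighter",
    "Lever Puzzle Room": "Ranger",
    "Riddle Door": "Mage",
    "Cursed Altar": "Cleric",
    "Snake Pit": "Druid",
    "Poison Dart Trap": "Rogue",
}


def trap_damage(party, trap, roll):
    """Damage from one trap given the 1d6 roll; only the highest-level counter counts."""
    counter = TRAP_COUNTER[trap]
    best = max((lvl for cls, lvl in party if cls == counter), default=0)
    return max(0, roll - best)


def expected_trap_damage(party, trap):
    """Average damage over the six equally likely rolls of 1d6."""
    return Fraction(sum(trap_damage(party, trap, r) for r in range(1, 7)), 6)


party = [("Fighter", 4), ("Fighter", 3), ("Cleric", 3), ("Mage", 1)]
for trap in TRAP_COUNTER:
    print(trap, expected_trap_damage(party, trap))
```

Using exact fractions makes it easy to see how quickly expected damage falls with level: an uncountered trap averages 3.5 damage, while a Level 4 counter brings that down to 0.5.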

ENEMIES

A list of the enemies:

Enemy	Damage	Attack Type	Notes
Goblins	1d3	Range
Goblin Chieftain	1d4	Melee
Orcs	1d4	Melee
Wolves	1d4	Melee	Beast
Orc Shaman	1d6	Magic
Orc Warlord	1d8	Melee
Skeletons	1d3	Range, Magic	Undead
Zombies	1d3	Melee, Magic	Undead
Ghosts	1d4	Magic	Undead
Basilisk	1d8	Melee, Magic	Beast
Lich	1d10	Magic	Undead
Dragon	2d6	Melee, Range, Magic

When you encounter an enemy, it rolls its Damage. (So Orcs roll 1d4, and might do e.g. 3 damage).

If it is using an attack type you do not have defense against, that damage is doubled. (So if you have no Fighter, the Orcs above will instead do 6 damage).

If an enemy has multiple attack types, you must have defense against them all to avoid taking double damage - so a Dragon will deal double damage unless you have all three defenses - but it can only double once, even if you are missing multiple defenses.

This is unaffected by the level of the character granting the defense - higher overall level gives more party HP to absorb damage, but e.g. when fighting against Orcs there's no difference between a Level 1 Fighter + Level 7 Mage vs a Level 7 Fighter + Level 1 Mage.

Clerics can use their healing magic against Undead: if you have at least one Cleric, that can count as any one form of defense you are otherwise missing against any Undead enemy.

Druids can use Wild Empathy to convince beasts not to attack: before doubling, reduce any damage suffered from Beasts by the level of the highest-level Druid in your party.
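Putting the enemy rules together, a rough sketch might look like the following. It is not the real evaluation code: `ENEMIES`, `GUARD` and `enemy_damage` are names I made up for illustration, parties are assumed to be lists of `(class, level)` pairs, and I am assuming that the Druid reduction cannot push damage below 0 (the text only states that minimum explicitly for traps).

```python
# name: (attack types, tag), copied from the enemy table above
ENEMIES = {
    "Goblins": ({"Range"}, None),
    "Goblin Chieftain": ({"Melee"}, None),
    "Orcs": ({"Melee"}, None),
    "Wolves": ({"Melee"}, "Beast"),
    "Orc Shaman": ({"Magic"}, None),
    "Orc Warlord": ({"Melee"}, None),
    "Skeletons": ({"Range", "Magic"}, "Undead"),
    "Zombies": ({"Melee", "Magic"}, "Undead"),
    "Ghosts": ({"Magic"}, "Undead"),
    "Basilisk": ({"Melee", "Magic"}, "Beast"),
    "Lich": ({"Magic"}, "Undead"),
    "Dragon": ({"Melee", "Range", "Magic"}, None),
}
GUARD = {"Melee": "Fighter", "Range": "Ranger", "Magic": "Mage"}


def enemy_damage(party, enemy, roll):
    """Damage taken from one enemy encounter, given the enemy's damage roll."""
    attacks, tag = ENEMIES[enemy]
    classes = {cls for cls, _ in party}
    missing = sum(1 for attack in attacks if GUARD[attack] not in classes)
    if tag == "Undead" and "Cleric" in classes:
        missing = max(0, missing - 1)  # a Cleric covers one missing defense
    damage = roll
    if tag == "Beast":
        druid = max((lvl for cls, lvl in party if cls == "Druid"), default=0)
        damage = max(0, damage - druid)  # assumed floor of 0; applied before doubling
    return damage * 2 if missing else damage  # doubles at most once


# The Orcs example above: a roll of 3 becomes 6 without a Fighter.
print(enemy_damage([("Mage", 7)], "Orcs", 3))
```

Note that levels only appear in the Druid branch: for the guard classes and the Cleric, presence in the party is all that matters.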

DATASET GENERATION

Dungeons come in three varieties:

Monster Camps are organized groups of a single kind of monster.
- They can be either Goblin, Orc, or Undead.
- This will be given in the name, e.g. 'Orc Keep', 'Undead Sepulcher', 'Goblin Warrens'
- The camp will have 2d4 encounters with that type of enemy, the last of which will usually be a tougher boss enemy.
- The camp then picks a trap type to build as its defense. Every 2 enemies in the camp build 1 trap of that type.
Lairs are places where a single powerful monster makes its lair:
- The powerful monster will be the last encounter. It can be a Basilisk, Lich or Dragon (leading to slightly different names).
- In addition, there are 1d3 encounters with its minions (it will pick a random enemy type and enslave them as its servants). Liches will always have Undead servants.
- There are also 1d3 traps (of any variety of kinds).
Ruins are remnants of old civilization. You can encounter a wide variety of things there:
- There will be 2d3 traps (of any variety of kinds).
- There will also be 1d4 enemies. Undead are most common, but almost any enemy can in theory be encountered in a ruin.
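The encounter counts for the three dungeon varieties can be sketched like this. It only covers how many enemies and traps appear, not which ones, and it assumes that 'every 2 enemies build 1 trap' rounds down (which is consistent with the 7 enemies and 3 traps of the Goblin Warrens discussed later). The helper names `roll` and `encounter_counts` are hypothetical.

```python
import random


def roll(count, sides, rng):
    """Roll `count` dice with `sides` sides each and return the total."""
    return sum(rng.randint(1, sides) for _ in range(count))


def encounter_counts(kind, rng):
    """Return (number of enemy encounters, number of traps) for one dungeon."""
    if kind == "Monster Camp":
        enemies = roll(2, 4, rng)      # 2d4, the last usually being a boss
        traps = enemies // 2           # assumed to round down
    elif kind == "Lair":
        enemies = roll(1, 3, rng) + 1  # 1d3 minions plus the powerful monster
        traps = roll(1, 3, rng)
    elif kind == "Ruins":
        traps = roll(2, 3, rng)
        enemies = roll(1, 4, rng)
    else:
        raise ValueError(f"unknown dungeon kind: {kind}")
    return enemies, traps


rng = random.Random(0)  # seeded only so that repeated runs match
for kind in ("Monster Camp", "Lair", "Ruins"):
    print(kind, encounter_counts(kind, rng))
```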

Adventurers then approach these dungeons. Adventurers are usually pretty good at guessing (from rumors about a dungeon, the number of villages laid waste near it, etc.) how threatening it will be, and adventurers of around the right level will approach it. Adventurers somewhat underestimate Goblins in general, and tend to attempt their dungeons at a lower level than they should (abstractapplic was, I think, the first person to notice this one).

STRATEGY

Once you understand how the system works, general strategy is:

Try to have in your party the defense types that counter enemy attack types to avoid taking double damage (or a Cleric can help with this against Undead). The characters doing this can be of any level.
Try to have in your party the classes that counter traps that are present. The characters doing this need to be high-level.
If there are Beasts to fight, bring a Druid (again high-level).
If you don't need any other characters, extra Clerics (of any level) are valuable for healing. Duplicate Clerics can be valuable, while duplicates of other classes never are.

With perfect play, it is possible to reach 100% win rate on all three dungeons using 36,000gp. This requires going beyond the bounds of the parties you've seen and building parties with weirdly artificial level differences, though. A more realistic goal of 'selecting optimal classes while bringing all Level 3 adventurers' achieves an 85.75% overall win rate, with most of the loss probability coming from the risk of unlucky rolls against the Dragon in the Infernal Den of Cheliax.
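The overall figure is just the product of the three per-dungeon win rates, since all three dungeons must be cleared and the runs are independent. A small check, using the all-Level-3 rates quoted in the dungeon write-ups below (the dictionary name is my own):

```python
import math

# Per-dungeon win rates for the all-Level-3 optimal-class parties.
level_3_rates = {
    "Lost Temple of Lemarchand": 0.9880,
    "Infernal Den of Cheliax": 0.8788,
    "Goblin Warrens of Khaz-Gorond": 0.9876,
}

overall = math.prod(level_3_rates.values())
print(f"{overall:.2%}")  # 85.75%
```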

The Lost Temple of Lemarchand

This is a Ruin-type dungeon. All encounters in this were known.

You want to bring a Druid and a Rogue for the Snake Pit and Poison Needle Traps.
You want to avoid taking double-damage from Skeletons, Zombies, and Ghosts. The only way to do this with two characters is one Mage (who counters Magic) and one Cleric (whose ability to counter Undead will work on either of the other attack types).
The Mage and the Cleric can be any level, the Druid and Rogue should be high-level to counter traps.
A party of Druid5, Rogue5, Mage1, Cleric1 (12 levels in total) is guaranteed to win:
- Maximum possible damage taken is 3 each from Skeletons (twice) and Zombies (once), 4 from Ghosts (once), and 1 from each trap (four times), for a total of 17.
- You start with 16HP.
- Additionally, you will receive 3HP of meaningful healing over the course of the adventure.
A party of Druid3, Rogue3, Mage3, Cleric3 has a 98.80% win-rate.

The Infernal Den of Cheliax

This is a Lair-type dungeon. The final encounter is listed as 'Unknown'. However, it is guaranteed to be a Dragon (the 'Infernal' prefix shows up only with Dragon or Lich, and a Lich will use Undead rather than Orcs as servants). Congratulations to measure, who posted a mostly-complete explanation of what encounters would be in this and the next dungeon.

You want to bring a high-level Druid for the Wolves and Snake Pits.
In order to avoid taking double-damage from the Dragon, you need to bring a Fighter, a Ranger and a Mage (of any level).
Total level needs to be high enough to absorb potentially high damage.
A party of Druid6, Fighter3, Ranger2, Mage2 (13 levels in total) is guaranteed to win. You can also distribute levels differently so long as the druid is at least Level 6:
- Maximum possible damage taken is 4 from the Orcs and 12 from the Dragon, for a total of 16.
- You start with 17HP.
A party of Druid3, Fighter3, Ranger3, Mage3 has an 87.88% win rate.

The Goblin Warrens of Khaz-Gorond

This is a Monster Camp-type dungeon. Most encounters in this were listed as 'Unknown.' However, based on the generation algorithm they are fully predictable except for the order: there will be 3 Boulder Traps, 6 Goblins, and one Goblin Chieftain at the end. (Each Goblin camp only builds one type of trap, and while some smaller camps have no Chieftain, any camp with 6+ Goblins will have one).

You want to bring a Ranger (of any level) to avoid taking double-damage from tons of Goblins.
You want to bring a Fighter (ideally high-level) to reduce damage from Boulder Traps and avoid taking double-damage from the Goblin Chieftain.
Since you don't have any other classes you really need, but the adventure is really long, Clerics will help keep your HP high.
A party of Fighter6, Ranger3, Cleric1, Cleric1 (11 levels in total) is guaranteed to win. You can also distribute the Ranger's second and third level to any other character:
- Maximum possible damage is 3 damage from each Goblin (six times) plus 4 from the Chieftain (once), for a total of 22.
- You start with 15HP.
- Additionally, you will receive at least 8 (and usually 9) points of meaningful healing.
A party of Fighter3, Ranger3, Cleric3, Cleric3 has a 98.76% win rate.
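The worst-case arithmetic for the three 'guaranteed' parties can be double-checked with a short script. This is a sketch under simplifying assumptions: it takes for granted that none of these parties ever suffers doubled damage (as argued above), it ignores Cleric healing entirely, and the names `MAX_ROLL`, `REDUCED_BY` and `worst_case` are hypothetical. Encounter lists are taken from the sample test code later in this post.

```python
# Highest possible roll for each encounter in the three dungeons.
MAX_ROLL = {
    "Skeletons": 3, "Zombies": 3, "Ghosts": 4, "Goblins": 3,
    "Goblin Chieftain": 4, "Orcs": 4, "Wolves": 4, "Dragon": 12,
    "Snake Pit": 6, "Poison Needle Trap": 6, "Boulder Trap": 6,
}
# Encounters whose damage is reduced by the level of a given class.
REDUCED_BY = {
    "Snake Pit": "Druid", "Poison Needle Trap": "Rogue",
    "Boulder Trap": "Fighter", "Wolves": "Druid",
}


def worst_case(party, encounters):
    """Return (maximum total damage, starting HP), assuming no doubling and no healing."""
    levels = {}
    for cls, lvl in party:
        levels[cls] = max(lvl, levels.get(cls, 0))
    damage = sum(
        max(0, MAX_ROLL[e] - levels.get(REDUCED_BY.get(e), 0)) for e in encounters
    )
    hp = sum(1 + lvl for _, lvl in party)
    return damage, hp


temple = ["Skeletons", "Skeletons", "Poison Needle Trap", "Zombies",
          "Snake Pit", "Poison Needle Trap", "Ghosts", "Snake Pit"]
den = ["Snake Pit", "Orcs", "Snake Pit", "Wolves", "Dragon"]
warrens = ["Goblins", "Boulder Trap"] * 3 + ["Goblins"] * 3 + ["Goblin Chieftain"]

print(worst_case([("Druid", 5), ("Rogue", 5), ("Mage", 1), ("Cleric", 1)], temple))
print(worst_case([("Druid", 6), ("Fighter", 3), ("Ranger", 2), ("Mage", 2)], den))
print(worst_case([("Fighter", 6), ("Ranger", 3), ("Cleric", 1), ("Cleric", 1)], warrens))
```

Where the worst-case damage exceeds starting HP (the Temple and the Warrens), the gap is what the Clerics' healing has to cover; the Den party has no Cleric and needs none.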

LEADERBOARD

Note: win rates below were Monte-Carlo derived rather than explicitly calculated.

Current leaderboard:

Submitter	Lost Temple of Lemarchand: party	Win rate	Infernal Den of Cheliax: party	Win rate	Goblin Warrens of Khaz-Gorond: party	Win rate	Total win rate
Fully Optimal Play	D5, Ro5, M1, C1	100.00%	D6, F3, Ra2, M2	100.00%	F6, Ra3, C1, C1	100.00%	100.00%
Optimal Class Allocation	D3, Ro3, M3, C3	98.80%	D3, F3, Ra3, M3	87.88%	F3, Ra3, C3, C3	98.76%	85.75%
abstractapplic	D2, Ro2, M2, C2	50.45%	D3, F5, Ra3, M3	96.43%	F4, Ra4, F3, C3	99.27%	48.29%
simon	D2, Ro2, M2, C3	62.40%	F4, Ra4, C3, D4	71.22%	F4, R3, R3, D3	58.12%	25.83%
Yonge	D2, Ro2, M2, C3	62.40%	F7, Ra3, D2, C2	42.16%	Ra6, F4, C2, Ro1	97.97%	25.77%
Taleuntum	Ra1, Ro1, C4, M4	22.54%	M1, F3, D4, Ra5	98.35%	F3, Ra3, Ro3, C4	90.09%	19.97%
Measure	F1, M1, C3, D1	0.80%	F4, Ra3, M7, C4	88.07%	F3, F2, Ra3, C4	83.57%	0.59%
Adventurer's Guild	4 random Level 3s	27.26%	4 random Level 3s	14.31%	4 random Level 3s	10.20%	0.40%
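Because the win rates above are Monte-Carlo estimates, small differences between them may be noise. A rough way to gauge this is the binomial standard error. The sketch below assumes 50,000 runs per dungeon (the loop count in the test snippet further down, which may not be what was used for every leaderboard entry), and the function name is my own.

```python
import math


def monte_carlo_standard_error(win_rate, runs):
    """Standard error of a win rate estimated from `runs` independent trials."""
    return math.sqrt(win_rate * (1.0 - win_rate) / runs)


ASSUMED_RUNS = 50_000  # assumption, taken from the test loop in this post

for label, rate in [("Lost Temple", 0.9880),
                    ("Infernal Den", 0.8788),
                    ("Goblin Warrens", 0.9876)]:
    se = monte_carlo_standard_error(rate, ASSUMED_RUNS)
    print(f"{label}: {rate:.2%} +/- {1.96 * se:.2%} (approx. 95% interval)")
```

Under that assumption each per-dungeon rate is good to within a few tenths of a percentage point, so the ordering of the leaderboard is not in doubt, but the last decimal place should not be read too closely.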

Congratulations to those who successfully assembled the Crown of Command and became the Tyrant of Calantha, reigning with an iron fist until at last a desperate team of heroes was able to topple you and restore freedom to the nation once more. abstractapplic has the highest win-rate here: if a future scenario needs an evil god-tyrant in it, I'll know who to put there.

Condolences to those who failed at this goal, and were doomed to a life of squalor with only a few thousand servants at your command.

For any future players who want to test their performance, you can edit and run the following lines in the code to include and test your proposed teams:

specific_team_test_runs = True
if( specific_team_test_runs == True):
    my_world = World(log=False)
    runs = 0
    wins = 0
    losses = 0
    while runs < 50000:

        #lots of teams commented out here
        #string_party = [('Mage', 3), ('Cleric', 3), ('Rogue', 3), ('Druid', 3)] # Class based LTL
        #string_party = [('Fighter', 3), ('Ranger', 3), ('Mage', 3), ('Druid', 3)] # Class based ITC
        string_party = [('Ranger', 3), ('Fighter', 3), ('Cleric', 3), ('Cleric', 3)] # Class based GWK
        party = my_world.get_party_by_name_and_levels( [(x[0], x[1]) for x in string_party] )
        dungeon = Dungeon(my_world)
        #dungeon.encounter_names = [ 'Skeletons', 'Skeletons', 'Poison Needle Trap', 'Zombies', 'Snake Pit', 'Poison Needle Trap', 'Ghosts', 'Snake Pit' ]
        #dungeon.encounter_names = [ 'Snake Pit', 'Orcs', 'Snake Pit', 'Wolves', 'Dragon' ]
        dungeon.encounter_names = [ 'Goblins', 'Boulder Trap', 'Goblins', 'Goblins', 'Boulder Trap', 'Goblins', 'Goblins', 'Boulder Trap', 'Goblins', 'Goblin Chieftain' ]
        dungeon.get_encounters_by_name()
        party.run_dungeon(dungeon, log=False)
        if party.current_hp > 0:
            wins = wins + 1
        else:
            losses = losses + 1
        runs = runs + 1
    print('Won {}/{} ({:.2f}%)'.format(wins, runs, wins * 100 / runs))

Or, if you aren't familiar with the code, you can DM me and I can run it for you.

FEEDBACK REQUEST

As usual, I'm interested in feedback. If you played the scenario, what did you like and what did you not like? If you might have played but in the end did not, what scared you away?

Additionally, if anyone who's played a few of these is willing to sign up to take a look at the text & first few rows of the data of future scenarios I write before I post them, I would really appreciate it. (I don't think the LW 'Get Feedback' option is quite meant for this use case). I'm interested to know in advance things like:

The problem statement is confusing and you should reword it.
The data looks mistaken and you should fix it.
The data is formatted inconveniently and you should supply columns XYZ.
Everything looks much too complicated and you should go back to the drawing board. This is meant to be Dungeons and Data Science, not Dark Souls and Data Science.

If you are willing to do this, I can't offer to pay you anything, but I can offer you ~~the rulership of all Asia~~ ~~the love of Helen of Troy~~ your name credited for help in the scenario. If this somehow seems like a good deal to you, let me know and I'll message you when I have something for you to look at.

Thanks for playing!

12 comments

Comments sorted by top scores.

comment by Sir Edmund · 2021-11-16T15:46:48.299Z · LW(p) · GW(p)

Thank you for making this. I have enjoyed following the D&D.Sci series, even though I don't get around to posting full solutions.

One thing that the series has given me is a better awareness of the limits of data science. Given enough time and effort, you can parse a data set along as many dimensions as you please, but the amount of time and effort needed grows exponentially based on the number of possible variables. If this scenario were a video game, I imagine that just controlling a party through a few different dungeons, and thereby seeing which enemies did high damage against which parties, would quickly give an intuitive sense and at minimum would help a strategist to avoid a lot of clear mistakes. The same goes for the League of Defenders of the Storm scenario -- an actual player of the game would quickly learn that level 1's beat their corresponding level 6's in play, whereas that fact wasn't as obvious from observing the overall data set.

So both on-the-ground experience and data science have their uses. It's valuable to practice what can be done with data science alone, but one key takeaway from this series is that if I'm ever making a bet with my own life (or just a lot of money) on the line, I pray I'll get a chance to practice my strategy as well as to observe the relevant data.

Replies from: aphyer

↑ comment by aphyer · 2021-11-16T16:09:15.236Z · LW(p) · GW(p)

+1.

The fact that no-one's gotten the optimal solution is very much intended. (If anyone had, I would be both very impressed with them and somewhat disappointed with myself.) You should not expect to be able to fully model a domain with data science, it's like trying to thread a needle wearing huge thick gloves. But you can expect to figure out something about the domain, and use that to at least substantially outperform randomness. (Our highest scorer this round, abstractapplic, had ~half the optimal winrate, but ~100x the 'random approach' winrate).

comment by simon · 2021-11-16T02:11:57.447Z · LW(p) · GW(p)

Thanks for making the scenario.

I found I was slow to start because it seemed a bit intimidating, as there were so many variables that were obviously going to confuse each other. (This was part of why I looked at Threat Level, since it was something I could actually solve). Once I got going, there was plenty to consider but not enough time, which is OK. Maybe some more obviously low-hanging fruit could help people get into it.

A minor formatting preference: if you would add a unique ID to each row, this would help identify a particular row in a way that is preserved through sorting of the data.

On my particular solution:

I didn't realize I needed all three of Fighter, Ranger and Mage on Dragon. I should have looked at class combos on Dragon in more detail; that should have been discoverable. Meanwhile, the optimal choice on Goblin Warrens was (by sheer coincidence) going to be my second choice, but even if I had chosen it, I would still have been behind abstractapplic who got the proper Dragon combo as well as a good Goblin party. Congrats abstractapplic.

I explicitly contemplated the idea that cleric might restore resources, not from the data but from priors, but didn't look into it (are Clerics good at easy-but-long dungeons would be one possible line of approach). I also would have eventually looked at the relative importance of different classes being high level if I had enough time.

comment by abstractapplic · 2021-11-16T17:15:37.460Z · LW(p) · GW(p)

Reflections on my attempt:

I’m pleasantly surprised by how well I did in both a general and absolute sense (if you asked me yesterday I would not have put my strategy’s odds of overall success above 20%). Of course, ~half the credit for this ~victory goes to Measure, whose inferences about the dungeons’ likely populations I was shameless in making use of.

If I’d had more time and energy to spare, I would have looked into how reliably teams which counter all their encounters win, and how character levels affect this outcome; from what I read here, I think that would have been a good next step.

Reflections on the challenge:

The problem statement was the most fun-to-read D&D.Sci introduction so far, including (imo) all of my own.
I found myself surprisingly uncomfortable playing a villain. (If you don’t get why I’d be bothered by mostly-task-irrelevant skippable flavortext then that makes two of us.)
The mechanic of “you have fungible but limited resources to allocate between multiple tasks, all of which have to be completed for a win” turns out to be an extremely good fit for D&D.Sci, and I look forward to using it in ~~one~~ several of my scenarios.
The difficulty level was in an uncanny valley between “simple task that’s only hard because Inference and Application are inherently hard” and “arbitrary fractal complexity which rivals that of the real world”; I would have liked this game more if it were significantly harder or easier.
I got to play another D&D.Sci scenario! Which I didn’t make! And which probably helped me to get better at making D&D.Sci scenarios!

Regarding future feedback:

If you – I here refer both to the esteemed OP and to anyone else with a complete-but-unreleased D&D.Sci game – want me to proof a future challenge before it’s released to the wider public, dm me and I’d be happy to take a look. I’d also be (reluctantly) willing to give (a small amount of) more general support to people perpetuating my genre (even though this would disqualify me from playing the resulting games, and my advice would mostly be variations on “do the things I did but better”).

comment by Pattern · 2021-11-17T16:11:50.759Z · LW(p) · GW(p)

Clerics have Healing. This restores 0.5HP per cleric to the party after each encounter if the party hasn't already retreated, plus provides defense against Undead (see Enemies below).

No boost based on level? (I know it's not D&D, but is this just to balance things, because they're also helpful against undead?)

comment by Brendan Long (korin43) · 2021-11-18T21:07:34.039Z · LW(p) · GW(p)

I haven't been playing because I don't have enough time (i.e. I have other priorities right now), but they're really interesting and I read all of them. I just wanted to mention that.

comment by Richard_Kennaway · 2021-11-16T20:25:52.250Z · LW(p) · GW(p)

It seems fairly straightforward for someone, given these rules, to calculate the optimal solutions. Does there currently exist any AI software that would be capable of reading the rules as you have set them out, and finding those solutions?

I recall Lenat's Eurisko that won several tournaments of some type of space war game, by playing massive numbers of games with itself and finding strategies that humans would have difficulty finding (e.g. using "lifeboats" as "armour"), but I've never heard of more advanced stuff in that line. There is Cyc, but nothing seems to have come of that.

Replies from: aphyer

↑ comment by aphyer · 2021-11-16T20:45:49.276Z · LW(p) · GW(p)

I imagine that would be primarily a language-processing issue, I'm not super-familiar with the current standard of AI but I don't think it's quite good enough to do that.

With that said, I think you might be misunderstanding the objective of this game. Players aren't actually given the rules here until the game is over. This is the wrapup doc from last week's D&D.Sci scenario, where players were given not these full rules but the records of ~3k dungeon crawls that occurred under these rules. The objective is to use that data to figure out the rules (or at least as much of them as is possible). If you've done that successfully, it is supposed to be pretty straightforward to calculate solutions given the rules.

Replies from: Richard_Kennaway

↑ comment by Richard_Kennaway · 2021-11-16T21:23:19.568Z · LW(p) · GW(p)

I understand the objective and the context. I was just wondering about the current state of getting an AI to output the implications of a piece of text such as these D&D rules, rather than either generating more text like it, or operating on data like the data set you originally provided.

comment by Measure · 2021-11-16T18:10:04.590Z · LW(p) · GW(p)

My biggest mistake was modeling the encounters as independent points of failure. I had considered the possibility of hit points or something similar, but I didn't put in the effort to check e.g. how encounter failure rates varied with depth.

comment by Yonge · 2021-11-16T17:14:01.766Z · LW(p) · GW(p)

Thank you for posting this. Overall I felt the level of complexity was about right for an average DandD problem. I was able to extract some useful information with a moderate amount of effort, but reading through the ruleset I doubt anyone could figure out the perfect team from the dataset without a lot of luck.

comment by Taleuntum · 2021-11-16T10:09:41.725Z · LW(p) · GW(p)

Thanks for the game, I really enjoyed it and finally trying out some things I learned in practice. The solution I submitted was the best possible team according to a XGBClassifier calibrated with sklearn.CalibratedClassifierCV. Calibration performance on a test set and evaluating a solution with a different model (Dense NN) did make me realize that the solution is unlikely to be performant, but it was worth a try.
