D&D.Sci June 2022 Evaluation and Ruleset

post by abstractapplic · 2022-06-13T10:31:25.447Z · LW · GW · 11 comments

Contents

  Ruleset
    Traits
    Aptitude
    Cheat Skills
    Power
    Success and Failure
  Strategy
  Reflections
11 comments

This is a followup to the D&D.Sci post [LW · GW] I made ten days ago; if you haven’t already read it, you should do so now before spoiling yourself.

Here is the web interactive I built to let you evaluate your solution; below is an explanation of the rules used to generate the dataset (my full generation code is available here, in case you’re curious about some detail not explained below). You’ll probably want to test your answer before reading any further.

Ruleset

(Note: to make writing this easier, I’m using standard D&D dice notation, in which “4d8+3” means “roll four eight-sided dice, sum the results, then add three”.)
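For readers who prefer code to notation, here is a minimal sketch of a roller for this dice notation. The function name `roll` and its parsing rules are my own illustration, not taken from the generation code; it only accepts the form "NdS" with an optional trailing "+K" or "-K".

```python
import random
import re

def roll(notation, rng=random):
    """Roll dice written in standard notation such as '4d8+3' or '2d10'."""
    match = re.fullmatch(r"(\d+)d(\d+)([+-]\d+)?", notation.replace(" ", ""))
    if match is None:
        raise ValueError(f"not dice notation: {notation!r}")
    count, sides = int(match.group(1)), int(match.group(2))
    bonus = int(match.group(3) or 0)  # a missing modifier counts as zero
    # Sum `count` independent rolls of a fair `sides`-sided die, then add the bonus.
    return sum(rng.randint(1, sides) for _ in range(count)) + bonus

print(roll("4d8+3"))  # some integer from 7 to 35
```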

Traits

There’s a moderate positive correlation between the Otaku trait and the Nerd trait, and a strong negative correlation between the Office Worker trait and the Hikkikomori trait (it would be stronger still, but remote work from a home office allows these traits to coexist). The Sociopath trait doesn’t correlate with any other.

In addition to the five recorded traits, there’s one the angel and the goddess aren’t equipped to detect: a random 19.2% of heroes are Fated to win regardless of their choices. Being Fated also doesn’t correlate with other traits.

Aptitude

A hero’s Aptitude – that is, how well-adjusted they are to being isekai’d – is (by default) given by 2+1d8. Traits modify this as follows:

Cheat Skills

Choices of Cheat Skills are informed by both Traits and Aptitude. In particular, Fated heroes tend to be at least dimly aware of their good fortune, and are therefore liable to pick cheat skills optimized for smoothing the path to an inevitable victory; when given a choice, they’re much more likely to select Enlightenment, Radiant Splendor, and Uncanny Luck than un-Fated champions. (This causes those skills to appear more useful than they are.)
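To see how this selection effect inflates a skill's apparent value, consider a rough sketch in which a skill has no effect at all on un-Fated heroes. Only the 19.2% Fated rate comes from the ruleset; the pick probabilities and the un-Fated win rate below are made-up numbers for illustration, and `observed_win_rates` is a hypothetical helper.

```python
def observed_win_rates(fated_rate, pick_if_fated, pick_if_unfated, unfated_win_rate):
    """Win rates an observer would see with and without a skill that does nothing.

    Fated heroes always win; un-Fated heroes win at `unfated_win_rate`
    whether or not they hold the skill.
    """
    def win_rate(p_fated_in_group, p_unfated_in_group):
        fated = fated_rate * p_fated_in_group
        unfated = (1 - fated_rate) * p_unfated_in_group
        share_fated = fated / (fated + unfated)  # fraction of the group that is Fated
        return share_fated * 1.0 + (1 - share_fated) * unfated_win_rate

    with_skill = win_rate(pick_if_fated, pick_if_unfated)
    without_skill = win_rate(1 - pick_if_fated, 1 - pick_if_unfated)
    return with_skill, without_skill

# Illustrative assumption: Fated heroes pick the skill 60% of the time,
# un-Fated heroes 20% of the time, and un-Fated heroes win 30% of the time.
with_skill, without_skill = observed_win_rates(0.192, 0.6, 0.2, 0.3)
print(f"with skill: {with_skill:.2f}, without: {without_skill:.2f}")
```

Under these assumed numbers the useless skill appears to lift the win rate from about 0.37 to about 0.59, purely because Fated heroes are over-represented among those who chose it.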

Power

Power is found by taking Aptitude and adding to it based on the cheat skills chosen.

Success and Failure

If you’re Fated, you win automatically. Otherwise, success is decided by rolling [Power]d10 dice; if the total exceeds 90, you win.
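The rule above can be turned into exact win probabilities by building the distribution of the dice total one die at a time. This is a sketch rather than an excerpt from the generation code: the function names are mine, and the second figure assumes a hero who does not know whether they are Fated and treats the 19.2% chance as independent of everything else, as the Traits section states.

```python
def sum_distribution(num_dice, sides=10):
    """Return {total: probability} for the sum of `num_dice` fair dice."""
    dist = {0: 1.0}
    for _ in range(num_dice):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, 0.0) + p / sides
        dist = new
    return dist

def win_chance(power, threshold=90, fated_rate=0.192):
    """Return (chance if un-Fated, chance when Fated status is unknown)."""
    unfated = sum(p for total, p in sum_distribution(power).items()
                  if total > threshold)  # the total must strictly exceed 90
    return unfated, fated_rate + (1 - fated_rate) * unfated

for power in range(9, 21):
    unfated, overall = win_chance(power)
    print(f"Power {power:2d}: un-Fated {unfated:.3f}, overall {overall:.3f}")
```

Two consequences follow directly from the rule: an un-Fated hero with Power 9 or less cannot win, since nine d10s total at most 90, and because the average of a d10 is 5.5, the un-Fated win chance crosses 50% between Power 16 and Power 17.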

Strategy

Your character is a Nerd and an Office Worker. The success-chance-maximizing tactic is therefore to select Temporal Distortion, and combine it with a 5-Power skill: Barrier Conjuration, Anomalous Agility, Monstrous Regeneration and Rapid XP Gain are all equally valid choices.

Reflections

Gameplay-wise, the mission statement for my first challenge this year was “tricky but accessible”. I think I succeeded at this: indeed, I suspect I made it too tricky (I somehow failed to consider that players might put differing success rates down to sabotage on the part of the goddess’ collaborators, and had to awkwardly clarify partway through the challenge that the Chaos Deity is 100% legit) and too accessible (Jay Bailey reached a perfect answer remarkably quickly [LW(p) · GW(p)], and aphyer dissected the world remarkably thoroughly [LW(p) · GW(p)]; this speaks well of them, but perhaps poorly of the puzzle).

Thematically and pedagogically, I’m on more solid ground. The reader will have deduced that the intended takeaways from this game were along the lines of “A smaller unbiased dataset can be much more useful than a larger biased one”, “Randomized Controlled Trials are Randomized for a reason”, and “When reality hands you a relevant natural experiment, don’t ignore it”.

They . . . may also have learned some lessons I didn’t set out to teach. The combination suggested by “filter using literally every explanatory variable, then average” tactics happens to be one of the optimal four: more sophisticated methods do nothing but confirm the solution provided by a sensibly naive approach, and offer alternate points along the efficient frontier. On one hand, this produces an unsightly discontinuity in the gradient of effort vs outcome; on the other, it communicates the correct insight “A smaller dataset specific to your situation can be much more useful than a larger more general one”, and realistically reflects how advanced analysis often just confirms simpler suspicions; as such, I’m torn between embarrassment and wishing I’d done it on purpose. Feedback on this point, and on all other points, would be greatly appreciated.
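For concreteness, the “filter using literally every explanatory variable, then average” tactic might look something like the sketch below. The record layout (`traits`, `skills`, `won`) and the four rows are invented for illustration and do not match the real dataset's columns or contents.

```python
from collections import defaultdict

def win_rate_by_skill_pair(records, my_traits):
    """Average win rate per skill pair among heroes whose traits exactly match mine."""
    tallies = defaultdict(lambda: [0, 0])  # skill pair -> [wins, heroes]
    for rec in records:
        if frozenset(rec["traits"]) != frozenset(my_traits):
            continue  # keep only heroes with exactly the same traits
        pair = frozenset(rec["skills"])  # skill order does not matter
        tallies[pair][0] += int(rec["won"])
        tallies[pair][1] += 1
    return {pair: wins / n for pair, (wins, n) in tallies.items()}

# Made-up rows, for illustration only.
records = [
    {"traits": {"Nerd", "Office Worker"},
     "skills": ("Temporal Distortion", "Anomalous Agility"), "won": True},
    {"traits": {"Nerd", "Office Worker"},
     "skills": ("Anomalous Agility", "Temporal Distortion"), "won": False},
    {"traits": {"Nerd"},
     "skills": ("Temporal Distortion", "Anomalous Agility"), "won": True},
    {"traits": {"Nerd", "Office Worker"},
     "skills": ("Uncanny Luck", "Enlightenment"), "won": True},
]

rates = win_rate_by_skill_pair(records, {"Nerd", "Office Worker"})
for pair, rate in rates.items():
    print(sorted(pair), rate)
```

Note that this filter does nothing about Fated heroes preferring certain skills; on real data it would be more trustworthy when restricted to heroes whose skills were assigned at random.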

11 comments


comment by aphyer · 2022-06-13T14:32:12.024Z · LW(p) · GW(p)

Epistemic status: Trying to make vague rambling gestures at the sort of thing I would ideally like to find inside the data in one of these scenarios.   Probably comes off to some extent as rude backseating of the author, I apologize for that but promise that no rudeness is intended.

The synergies/anti-synergies in this example arose mostly from ad-hoc things: 'Shapeshifting plus Sociopath is an extra +4, while Shapeshifting plus Otaku is an extra +5.'

The way I would ideally like the under-the-hood generation to work would be to have some kind of actual world model from which these synergies arise.

So imagine a world model that looks something like this:

A Hero will face four challenges en route to victory:

  1. The Demon King will try to assassinate them while they are still weak.
  2. They will need to rally followers behind them and build a power base.
  3. The Demon King will try to kill their Fated True Love and drive them to despair.
  4. They will gather their forces and march to siege the Demon Castle and gain victory.

Each challenge is implemented by rolling 2d10, and the challenge is passed if the hero rolls >8.  So a hero with no traits and no cheat power has a ~2/3 chance of winning each challenge and a ~1/4 chance of winning overall.

Monstrous Strength or Anomalous Agility makes you automatically pass Challenge 1 (you are nearly impossible to assassinate).

Barrier Conjuration applies +3 to both Challenge 1 and Challenge 3 (barriers can protect both you and your Fated True Love). 

Radiant Splendor makes you automatically pass Challenge 2 (very easy to convince people).

Rapid XP Gain applies +1 to Challenge 2, +2 to Challenge 3 and +3 to Challenge 4 (you grow stronger over time).

Sociopaths automatically pass Challenge 3 (since they will not be driven to despair even if their True Love dies).

Nerds automatically pass Challenge 4 (since once they are established they can make machine guns and tanks and so on).

Hikkikomori have a -3 penalty to Challenge 2 (they are bad at talking to people).

...and so on...

There would be a lot of interactions in this world model:

  • Monstrous Strength and Anomalous Agility would work very badly together, and quite badly with Barrier Conjuration, as they would point at the same problems.
  • Sociopaths would get less benefit out of Barrier Conjuration, as part of its benefit would be pointed at something they already passed automatically.
  • Similarly, Nerds would get less benefit out of Rapid XP Gain.
  • Radiant Splendor would be very good for Hikkikomori, who need to deal with their problems in Challenge 2.
  • Rapid XP Gain would work well with one of the defense powers that helped in Challenge 1, as it would help with everything else.
  • ...and so on...
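A minimal sketch of this hypothetical world model, covering only the rules spelled out above (2d10 per challenge, pass on a total above 8, flat bonuses, automatic passes). The function names and the way bonuses are encoded are my own assumptions.

```python
from itertools import product

def pass_chance(bonus=0, auto_pass=False, target=8):
    """Exact chance that 2d10 + bonus exceeds `target`, or 1.0 on an automatic pass."""
    if auto_pass:
        return 1.0
    rolls = list(product(range(1, 11), repeat=2))  # all 100 equally likely outcomes
    return sum(a + b + bonus > target for a, b in rolls) / len(rolls)

def overall_chance(bonuses=(0, 0, 0, 0), auto=()):
    """Chance of passing all four challenges; `auto` lists challenge numbers (1 to 4) passed automatically."""
    chance = 1.0
    for number, bonus in enumerate(bonuses, start=1):
        chance *= pass_chance(bonus, auto_pass=number in auto)
    return chance

RAPID_XP = (0, 1, 2, 3)  # +1, +2, +3 to Challenges 2, 3 and 4
plain_gain = overall_chance(RAPID_XP) - overall_chance()
nerd_gain = overall_chance(RAPID_XP, auto={4}) - overall_chance(auto={4})
print(f"baseline: {overall_chance():.3f}")
print(f"Rapid XP Gain helps a traitless hero by {plain_gain:.3f}, a Nerd by {nerd_gain:.3f}")
```

With no traits or skills the exact per-challenge chance is 72/100 and the overall chance is 0.72^4, about 0.27, in line with the rough figures above. The Nerd's smaller gain from Rapid XP Gain is not coded anywhere; it falls out of the Challenge 4 bonus landing on a challenge already passed.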

But these interactions wouldn't arise from explicitly saying 'Sociopaths get less benefit out of Barrier Conjuration'.  Rather they would arise...naturally? organically?...out of a set of general rules.  And I feel in some fuzzy way I can't quite describe that this would be...a better way of modeling things?  A model that fits together better under the hood?  Mumble mumble emergent behavior?  Even if players aren't likely to actually reconstruct the full data model, I feel like this is in some way a more honest way of generating the data than having ad-hoc synergies and antisynergies?

Again, this isn't intended as criticism - I like the scenario, and I don't in fact think that I've done a good job of having a clean underlying data model in scenarios I've written.  I'm just trying to convey this sort-of-hand-wavy concept of 'the synergies exist inside a model' rather than 'the synergies are glued onto the outside'.

comment by Jay Bailey · 2022-06-14T03:13:15.794Z · LW(p) · GW(p)

I think the puzzle was good, but it might have been better if the scenario had more explicitly included "Given the next X heroes, maximise their chance of survival" the way aphyer did. As it is, I was expecting that the perfect solution would have aphyer levels of analysis, which is why I said that I expected my solution was a baseline and others would improve on it, even though I had the right answer.

You did allude to this by asking "What will you tell the Goddess when she returns?" but the overall scenario as presented was "Find a way for you personally to survive" and that's the problem I answered. Considering how much richness was present in the rest of the dataset, I think the puzzle should have explicitly said "Goal 1 is to survive personally. Goal 2, which is harder, is to maximise the survival of everyone who comes after you." This would have made it an excellent puzzle - aphyer did AMAZING work, and I think them being able to solve the puzzle does not speak poorly of it at all.

comment by aphyer · 2022-06-13T11:03:27.945Z · LW(p) · GW(p)

This is mostly a matter of taste, but I was mildly disappointed that there wasn't anywhere to go beyond the basic analysis. There were pretty much only two answers, 'Got Tricked By Fated' and 'Didn't', with no room to improve your solution using further exploration of the dataset.

I realize this is an artefact of making the scenario accessible and easy to play - from personal experience, it is really really hard to make a scenario have a complex ruleset that enables deep analysis and several levels of success without also having a complicated dataset that drives some players off - but want to register that my personal taste tends towards erring in the 'more complicated' direction.

(...anyone who's ever played any of the scenarios I wrote probably could have told you that already)

I did still enjoy playing this a lot, so thank you for writing it!

comment by aphyer · 2022-06-13T12:49:52.996Z · LW(p) · GW(p)

I think the interactive is bugged in a way that makes Enlightenment + Radiant Splendor look worse than it actually is: Radiant Splendor is coded as 'RS' at the top, but as 'RB' when checking for the synergy bonus, and so the bonus is never given?

Replies from: abstractapplic
comment by abstractapplic · 2022-06-13T13:00:01.171Z · LW(p) · GW(p)

Confirmed and corrected; thank you again.

comment by SarahSrinivasan (GuySrinivasan) · 2022-06-13T15:12:42.030Z · LW(p) · GW(p)

I spent all of my time trying to figure out how to figure out how much [the hidden variable causing the correlation between nerd and otaku] affects trait choices and winrates.

Apparently they are correlated without a relevant hidden variable. :D

Replies from: GuySrinivasan
comment by SarahSrinivasan (GuySrinivasan) · 2022-06-13T17:57:38.176Z · LW(p) · GW(p)

But in general I liked the setup a lot!

comment by aphyer · 2022-06-13T10:41:22.443Z · LW(p) · GW(p)

Did you leave Temporal Distortion out of the explanation?

Replies from: abstractapplic
comment by abstractapplic · 2022-06-13T10:46:09.961Z · LW(p) · GW(p)

Yes, good catch, fixed now.

comment by Multicore (KaynanK) · 2022-06-13T11:58:23.344Z · LW(p) · GW(p)

Looking at the generation code, aptitude had interesting effects on our predecessors' choice of cheats.

Good:

-Higher aptitude Hikkikomori and Otaku are less likely to take Hypercompetent Dark Side (which has lower benefits for higher aptitude characters).

Bad:

-Higher aptitude characters across the board are less likely to take Monstrous Regeneration or Anomalous Agility, which were some of the better choices available.

Ugly:

-Higher aptitude Hikkikomori are more likely to take Mind Palace.

comment by Christian Z R · 2024-10-10T10:02:31.854Z · LW(p) · GW(p)

Short and very interesting scenario! The fact that the most useful subset of the data was so small (929 people like us getting truly random skills) made me rather afraid that I was fooling myself with random fluctuations. With some very dirty probability I reasoned that the p-value for our results for Anomalous Agility + Temporal Distortion was a bit less than 1%, so I went for it.