D&D.Sci Long War: Defender of Data-mocracy Evaluation & Ruleset

post by aphyer · 2024-05-14T03:35:10.586Z · LW · GW · 3 comments

Contents

  RULESET
  STRATEGY
  LEADERBOARD
  REFLECTION & FEEDBACK REQUEST

This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.

There is a web interactive here you can use to test your answer, and generation code available here if you're interested, or you can read on for the ruleset and scores.

RULESET

Each alien has a different amount of HP:

| Alien | HP | Threat* |
|---|---|---|
| Swarming Scarab | 1 | 1 |
| Chitinous Crawler | 3 | 2 |
| Voracious Venompede | 5 | 3 |
| Arachnoid Abomination | 9 | 4 |
| Towering Tyrant | 15 | 5 |

*Threat has no effect on combat directly - it's a measure of how threatening Earth considers each alien to be, which scales how many soldiers they send. (The war has been getting worse - early on, Earth sent on average ~1 soldier/4 Threat of aliens, but today it's more like 1 soldier/6 Threat. The wave you're facing has 41 Threat, so Earth would send on average ~7 soldiers to it. Earth doesn't exercise much selection with weapons, but sends soldiers in pairs such that each pair has two different weapons - this is a slight bias towards diversity.)
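As a quick check of that arithmetic, here is a minimal sketch. The function names are illustrative, not taken from the generation code; the Threat values and the soldier ratios are the ones quoted above.

```python
# Threat per alien, copied from the table above.
THREAT = {
    "Swarming Scarab": 1,
    "Chitinous Crawler": 2,
    "Voracious Venompede": 3,
    "Arachnoid Abomination": 4,
    "Towering Tyrant": 5,
}


def wave_threat(wave: dict) -> int:
    """Total Threat of a wave given as {alien name: count}."""
    return sum(THREAT[alien] * count for alien, count in wave.items())


def expected_soldiers(total_threat: float, threat_per_soldier: float) -> float:
    """Average number of soldiers Earth sends at a given Threat-per-soldier ratio."""
    return total_threat / threat_per_soldier


print(round(expected_soldiers(41, 6), 2))  # 6.83 (today's ratio)
print(round(expected_soldiers(41, 4), 2))  # 10.25 (early-war ratio)
```

The same helper applied to a small wave of two Tyrants and two Scarabs gives 12 Threat, i.e. about two soldiers at the current ratio.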

Each weapon has a damage it deals per shot, and a rate of fire that determines how many shots it can get off before the wielder is perforated by venomous spines/dissolved into a puddle of goo/voraciously devoured by a ravenous toothed maw:

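| Weapon | Damage | Min Shots | Max Shots |
|---|---|---|---|
| Macross Minigun | 1 | 5 | 8 |
| Fusion Flamethrower | 1 | 3 | 12 |
| Pulse Phaser | 2 | 4 | 6 |
| Rail Rifle | 3 | 3 | 5 |
| Laser Lance | 5 | 2 | 5 |
| Gluon Grenades | 7 | 2 | 3 |
| Thermo-Torpedos | 13 | 1 | 3 |
| Antimatter Artillery | 20 | 1 | 2 |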

Each soldier will be able to fire a number of shots chosen randomly between Min Shots and Max Shots - for example, a soldier with a Laser Lance will have time to fire 1d4+1 shots, each doing 5 damage.
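The roll can be sketched in a few lines. This assumes every shot count from Min Shots to Max Shots is equally likely, which is what the 1d4+1 example suggests; `roll_shots` and `expected_damage` are illustrative names, not functions from the generation code.

```python
import random

# (damage per shot, min shots, max shots), copied from the weapon table.
WEAPONS = {
    "Macross Minigun": (1, 5, 8),
    "Fusion Flamethrower": (1, 3, 12),
    "Pulse Phaser": (2, 4, 6),
    "Rail Rifle": (3, 3, 5),
    "Laser Lance": (5, 2, 5),
    "Gluon Grenades": (7, 2, 3),
    "Thermo-Torpedos": (13, 1, 3),
    "Antimatter Artillery": (20, 1, 2),
}


def roll_shots(weapon: str, rng=random) -> int:
    """Number of shots one soldier gets off (uniform, both ends inclusive)."""
    _, lo, hi = WEAPONS[weapon]
    return rng.randint(lo, hi)


def expected_damage(weapon: str) -> float:
    """Average total damage per soldier, ignoring overkill."""
    damage, lo, hi = WEAPONS[weapon]
    return damage * (lo + hi) / 2


for name in WEAPONS:
    print(f"{name}: {expected_damage(name):.1f}")
```

Raw expected damage favours the heavy weapons, but it ignores damage wasted on targets with little HP, which is where the matchups start to matter.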

During a battle, humans roll for how many shots each weapon gets, and then attempt to allocate damage from their shots to bring down all aliens.  If they succeed, the humans win - if not, the humans lose.  While doing this optimally is theoretically very difficult, your soldiers are well-trained and the battles are not all that large, so your soldiers will reliably find a solution if one exists.

For example, if you are fighting two Towering Tyrants and two Swarming Scarabs using two soldiers:
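One way to work through a battle like this is a small brute-force search. The sketch below is not the author's generation code. It assumes each shot is spent on exactly one alien, with any excess damage wasted, and that every shot count between Min Shots and Max Shots is equally likely. The names `can_win` and `exact_winrate` are illustrative.

```python
from functools import lru_cache
from itertools import product

ALIEN_HP = {"Swarming Scarab": 1, "Chitinous Crawler": 3, "Voracious Venompede": 5,
            "Arachnoid Abomination": 9, "Towering Tyrant": 15}
# (damage per shot, min shots, max shots)
WEAPONS = {"Macross Minigun": (1, 5, 8), "Fusion Flamethrower": (1, 3, 12),
           "Pulse Phaser": (2, 4, 6), "Rail Rifle": (3, 3, 5),
           "Laser Lance": (5, 2, 5), "Gluon Grenades": (7, 2, 3),
           "Thermo-Torpedos": (13, 1, 3), "Antimatter Artillery": (20, 1, 2)}


def can_win(shots, alien_hps) -> bool:
    """True if the shots can be assigned so that every alien takes >= its HP.

    Assumption: a shot hits one alien and overkill damage is lost.
    """
    shots = sorted(shots, reverse=True)

    @lru_cache(maxsize=None)
    def search(i, remaining):
        if not remaining:
            return True                      # every alien is dead
        if sum(shots[i:]) < sum(remaining):
            return False                     # not enough damage left
        tried = set()
        for j, hp in enumerate(remaining):
            if hp in tried:                  # aliens with equal HP left are interchangeable
                continue
            tried.add(hp)
            rest = remaining[:j] + remaining[j + 1:]
            if hp > shots[i]:                # alien survives this shot
                rest = tuple(sorted(rest + (hp - shots[i],)))
            if search(i + 1, rest):
                return True
        return False

    return search(0, tuple(sorted(alien_hps)))


def exact_winrate(squad, wave) -> float:
    """Winrate found by enumerating every combination of shot counts."""
    hps = [ALIEN_HP[alien] for alien in wave]
    ranges = [range(WEAPONS[w][1], WEAPONS[w][2] + 1) for w in squad]
    wins = total = 0
    for counts in product(*ranges):
        shots = [WEAPONS[w][0] for w, n in zip(squad, counts) for _ in range(n)]
        wins += can_win(shots, hps)
        total += 1
    return wins / total


wave = ["Towering Tyrant"] * 2 + ["Swarming Scarab"] * 2
print(exact_winrate(["Antimatter Artillery", "Macross Minigun"], wave))       # 0.5
print(exact_winrate(["Antimatter Artillery", "Antimatter Artillery"], wave))  # 0.25
print(exact_winrate(["Macross Minigun", "Macross Minigun"], wave))            # 0.0
```

Under these assumptions, Artillery plus Minigun wins exactly when the Artillery gets its second shot (one per Tyrant), while two Artillery need all four shots between them and two Miniguns can never deal 15 damage to a Tyrant.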

STRATEGY

The most important element of strategy was sending the right kind of weapons for each alien: high-health aliens like Tyrants are extremely inefficient to kill with light weapons like Miniguns, while small, numerous aliens like Scarabs are extremely inefficient to kill with heavy weapons like artillery.
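A rough way to see the mismatch is to count shots needed per kill. The sketch below assumes a shot's excess damage is wasted, and uses average shots divided by shots-per-kill as a crude "kills per soldier" figure (it ignores leftover shots and mixing weapons on one target). `shots_to_kill` and `kills_per_soldier` are illustrative names.

```python
import math

ALIEN_HP = {"Swarming Scarab": 1, "Chitinous Crawler": 3, "Voracious Venompede": 5,
            "Arachnoid Abomination": 9, "Towering Tyrant": 15}
# (damage per shot, min shots, max shots)
WEAPONS = {"Macross Minigun": (1, 5, 8), "Fusion Flamethrower": (1, 3, 12),
           "Pulse Phaser": (2, 4, 6), "Rail Rifle": (3, 3, 5),
           "Laser Lance": (5, 2, 5), "Gluon Grenades": (7, 2, 3),
           "Thermo-Torpedos": (13, 1, 3), "Antimatter Artillery": (20, 1, 2)}


def shots_to_kill(weapon: str, alien: str) -> int:
    """Shots of one weapon needed to bring down one alien."""
    return math.ceil(ALIEN_HP[alien] / WEAPONS[weapon][0])


def kills_per_soldier(weapon: str, alien: str) -> float:
    """Crude average: mean shots per soldier / shots needed per kill."""
    _, lo, hi = WEAPONS[weapon]
    return ((lo + hi) / 2) / shots_to_kill(weapon, alien)


for weapon in ("Macross Minigun", "Antimatter Artillery"):
    for alien in ("Swarming Scarab", "Towering Tyrant"):
        print(f"{weapon} vs {alien}: {shots_to_kill(weapon, alien)} shot(s) per kill, "
              f"~{kills_per_soldier(weapon, alien):.2f} kills per soldier")
```

By this measure a Minigun needs 15 shots per Tyrant but averages only 6.5 shots, while Artillery spends a 20-damage shot on each 1-HP Scarab and averages only 1.5 shots.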

There were a few subtler elements of strategy:

Optimal play used different weapons based on how many soldiers you brought:

Random play would result in a lower winrate:

| Soldiers | Random Winrate | Optimal Winrate |
|---|---|---|
| 4 | 0.02% | 2.08% |
| 5 | 2.68% | 27.92% |
| 6 | 21.38% | 69.65% |
| 7 | 54.45% | 95.43% |
| 8 | 80.26% | 100.00% |
| 9* | 92.74% | 100.00% |
| 10* | 97.57% | 100.00% |

*Earth is unlikely to send this many soldiers with the war as it currently stands.

At current troop levels, Earth would usually send 6-8 soldiers to this mission, and almost always send 5-9:

| Soldiers | Chance to Send |
|---|---|
| 5 | 6.5% |
| 6 | 33.0% |
| 7 | 35.8% |
| 8 | 17.4% |
| 9 | 6.3% |
| 10 | <0.1% |

The interactive will be evaluating your submissions accordingly.
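One possible reading of "accordingly" is that a strategy is scored by its winrate at each squad size, weighted by how likely Earth is to send that many soldiers. The sketch below works under that assumption only; it is not the interactive's actual scoring code. It treats the "<0.1%" entry as zero and renormalizes, since the listed chances sum to 99.0%.

```python
# Chance that Earth sends each squad size (the "<0.1%" row for 10 is treated as 0).
CHANCE_TO_SEND = {5: 0.065, 6: 0.330, 7: 0.358, 8: 0.174, 9: 0.063}

# Winrates by squad size, as fractions, copied from the winrate table.
RANDOM_WINRATE = {4: 0.0002, 5: 0.0268, 6: 0.2138, 7: 0.5445,
                  8: 0.8026, 9: 0.9274, 10: 0.9757}
OPTIMAL_WINRATE = {4: 0.0208, 5: 0.2792, 6: 0.6965, 7: 0.9543,
                   8: 1.0, 9: 1.0, 10: 1.0}


def weighted_winrate(winrate_by_size: dict, chance_by_size: dict) -> float:
    """Average winrate over squad sizes, weighted by the chance of each size."""
    total_chance = sum(chance_by_size.values())  # renormalize the listed chances
    weighted = sum(p * winrate_by_size[n] for n, p in chance_by_size.items())
    return weighted / total_chance


print(f"random:  {weighted_winrate(RANDOM_WINRATE, CHANCE_TO_SEND):.1%}")
print(f"optimal: {weighted_winrate(OPTIMAL_WINRATE, CHANCE_TO_SEND):.1%}")
```

With these inputs the weighted figures come out at roughly 47% for random loadouts and roughly 83% for the best loadout at each size.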

LEADERBOARD

Submissions were:

abstractapplic submitted a 7-soldier squad with 3 Artillery, 1 Torpedo, 2 Lances, and 1 Minigun.

This was not quite the optimal submission, but was fairly well-optimized and ended up very close, with a winrate of 94.40% (compared to 95.43% with the optimal 7-soldier squad).

Yonge and Unnamed both used the small battles to figure out what could reliably beat each alien in isolation, and submitted 10-soldier squads with separate soldiers for each alien type:

These teams both manage a 100% winrate (albeit while sending a larger squad than the PGFDA would usually send).

Measure said this:

According to my model, for larger numbers of soldiers, you don't need a specific anti-Scarab weapon. It's slightly more important to make sure you have a good matchup against the Tyrants.

and boldly submitted a 6-soldier squad with 4 Artillery and 2 Lances. 

Unfortunately for Measure, Scarabs were not in fact a non-threat, and while this squad has excellent handling of Tyrants and Abominations, plus some Lances for the Crawlers and Venompedes, it doesn't have anything that can deal efficiently with the Scarabs, and ends up with only a 9.38% winrate.  My condolences to Measure.

qwertyasdef submitted a 7-soldier squad armed with 7 Thermo-Torpedos. (Apparently this was the result of a model that assumed any given weapon added some amount to your winrate based on which aliens were present, but didn't include parameters for other weapons you already had.)

Sadly, this again ends up with a lot of large weapons for the bigger aliens but nothing to handle Scarabs, and has only a 1.65% winrate.

 

REFLECTION & FEEDBACK REQUEST

One thing I was trying to push with this scenario was 'look at the simple cases first to investigate the underlying world/ruleset, and after that try to apply what you've learned to complicated cases like the one at hand'.  I deliberately didn't impose any minimum size on battles, to ensure that there were a lot of extremely simple 'one soldier and one Tyrant'-type battles from which players could try to derive rules, rather than just immediately looking for 'battles that look like the one we're headed into' to try to find what did well there.

I'm reasonably happy with how this part went - I saw several players explicitly analyzing the simple cases, and didn't see anyone doing the 'jump straight to this battle' approach. On the other hand, a couple of players seemed to end up tricked by one thing or another into submitting worse-than-random teams. How did this feel from the player side?

As usual, I'm also interested to hear more general feedback on what people thought of this scenario.  If you played it, what did you like and what did you not like?  If you might have played it but decided not to, what drove you away?  What would you like to see more of/less of in future?  Do you think the scenario was too complicated to decipher?  Too simple to have anything interesting/realistic to uncover?  Or both at once?  Do you have any other feedback?


 

3 comments


comment by abstractapplic · 2024-05-14T08:45:24.072Z · LW(p) · GW(p)

Reflections on my performance:

I'm pleasantly surprised by the effectiveness of my reasoning, and of my meta-reasoning. Not only did my loadout do well, but my calibration was impressively close: the final decision I pegged at a "~95%" success rate got 94.4%, and most of the alternative strategies I mentioned in my post were similarly on-the-nose.

(Unfortunately, my meta-meta-reasoning could still use some work. I figured out that this was a "linear-ish logistic success model with some interactions on top" kind of problem, took this as an opportunity to test that library I made, created a good predictor with a bunch of pretty/informative graphs . . . and then found myself thinking "only need one minigun? doesn't sound right to me", "why would Tyrants/Artillery and Scarabs/Minigun-or-Flamethrowers be so much stronger than every other potential feature interaction?", and "I'm totally gonna turn out to have screwed up and wish I'd handled this with XGBoost, better not even mention how I built my model". If I'd been more calibrated about how calibrated I ended up being, this could have been a really good chance to show off by calling in advance that my unconventional ML approach would succeed here.)

Reflections on the challenge:

This was the 2D performance thing I tried to pull off in Boojumologist, but with better conceptual underpinning and flawless execution. I'm proud, gladdened and envious: can't think of a single way to improve this scenario.

(I, uh, may be biased by how well I happened to do: please take this feedback with a grain of salt.)

comment by qwertyasdef · 2024-05-14T23:20:45.904Z · LW(p) · GW(p)

I'm not surprised my submission did badly since it was the easiest thing I could quickly come up with after seeing that I was already late. I wasn't quite expecting to be unable to come up with anything better though. After looking at other people's comments I'm particularly disappointed that it never once crossed my mind to try analyzing single-soldier combats. I was explicitly trying to figure out the effect of one soldier of each weapon, and I had a histogram of the number of soldiers per combat from which I could have easily gleaned that there were lots of single-soldier combats to investigate had I thought to do so. Instead, I tried to analyze the win rates of (some combination of weapons) vs (some combination of weapons) + (1 more of the weapon I'm trying to investigate), and ran into trouble because that extra soldier is also correlated with increased alien threat; I didn't know how to tease the two effects apart.

comment by MathiasKB (MathiasKirkBonde) · 2024-05-14T20:59:20.891Z · LW(p) · GW(p)

Just played through it tonight. This was my first D&D.Sci; I found it quite difficult and learned a few things while working on it.

Initially I tried to figure out the best counters and found a few patterns (flamethrowers were especially good against certain units). I then tried to look and adjust for any chronology, but after tinkering around for a while without getting anywhere I gave up on that. Eventually I just went with a pretty brainless ML approach.

I ended up sending squads for 5 and 6 soldiers, which managed 13.89% and 53.15% chances of surviving. I think it's good I'm not in charge of any soldiers in real life!

Overall I had good fun, and I'm looking forward to looking at the next one.