D&D.SCP: Anomalous Acquisitions Evaluation & Ruleset

post by aphyer · 2022-02-22T18:19:22.408Z · LW · GW · 9 comments

Contents

  RULESET
  ANOMALIES
  DATASET
  SCP SOURCES
  TAGS
  LOCATIONS
  STRATEGY
  LEADERBOARD
  FEEDBACK REQUEST


This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.

A web interactive to test your solution is available here.  This also presents the thrilling* conclusion to the story, with a wide variety** of skillfully-written* endings depending on your actions***!

*Maybe.    **Five.      ***And on your luck.

Full generation code is also available here if you are interested, or you can read on and find out.

RULESET

An SCP object has a Value and a Danger.

When an object is successfully stolen, Marshall, Carter & Dark security teams attempt to reduce its costs, while MC&D sales teams attempt to maximize revenue gained.

Revenue gained starts at a baseline of $2MM * (Value^2).  This is multiplied by (3+1d6)/6 depending on performance of sales team.

Costs start at a baseline of $2MM * (Danger^2).  This is multiplied by (9-1d6)/6 depending on performance of security team.

Profit is equal to Revenue minus Costs.

Overall, an object whose Value equals its Danger will average a slight profit.  Objects with Value > Danger will tend to generate profits, while those with Danger > Value will tend to generate losses.

The squaring of Value and Danger means that e.g. Value 6 Danger 5 is much more profitable than Value 1 Danger 0 (with a baseline of $72 MM revenue and $50 MM costs being much better than a baseline of $2 MM revenue and $0 costs).
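The averages behind these claims can be checked directly. The following is a minimal sketch, assuming a fair d6; the function and constant names are illustrative and are not taken from the generation code.

```python
from fractions import Fraction


def expected_multiplier(offset, sign):
    """Average of (offset + sign * d6) / 6 over the six faces of a fair die."""
    return sum(Fraction(offset + sign * face, 6) for face in range(1, 7)) / 6


REVENUE_MULT = expected_multiplier(3, +1)  # average of (3 + 1d6) / 6
COST_MULT = expected_multiplier(9, -1)     # average of (9 - 1d6) / 6


def expected_profit_mm(value, danger):
    """Expected profit in $MM for an object with known Value and Danger."""
    revenue = 2 * value**2 * REVENUE_MULT
    cost = 2 * danger**2 * COST_MULT
    return revenue - cost


for value, danger in [(1, 0), (6, 5), (3, 3)]:
    print(f"Value {value}, Danger {danger}: "
          f"{float(expected_profit_mm(value, danger)):.2f} $MM expected")
```

The average sales multiplier comes out to 13/12 and the average security multiplier to 11/12, so an object with Value equal to Danger averages Value²/3 in $MM, which is the "slight profit" described above.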

You have very good insight into the Danger of an object via the SCP Foundation's classifications:

There was low-hanging fruit available in preferentially targeting Safe objects (though this was not always optimal).

The more difficult problem was figuring out how to use tags to identify objects with high Value.

However, there were also two Anomalies hidden in the dataset - SCPs that broke the usual rules.  If you fed the data into a machine learning program without sanity-checking it first (cough), these may have shot you in the foot.

ANOMALIES

Two SCPs in the dataset were anomalous - they did not operate under the same rules as the rest of the data, and instead existed as horrible tricks to sabotage people who did not sanity-check their data.

SCP-1182 is an infohazardous object.  All data pertaining to it is corrupted, taking on false values.  Rows for this object showed up with:

SCP-537 is a Very Loyal Robot Dog.  It imprints on its owner (currently Foundation Senior Researcher Valdez on Site 2).  When its owner calls to it, it returns to them - regardless of its current location, whether it is constrained, or even whether it has been disassembled or destroyed.  The SCP Foundation is aware of this ability, and uses it to their advantage - they conceal its precise ability, but ensure that potential thieves are aware of it.  MC&D has stolen it half a dozen times under different heads of Acquisitions, and every time it has returned itself (leaving MC&D with zero profit).

DATASET

The remainder of the dataset was generated according to a consistent ruleset.  The central theme of this dataset was Bayesian inference.

There are four possible Sources for anomalous objects:

SCP SOURCES

CREATORS (e.g. Dr. Wondertainment) produce anomalous objects to accomplish something.  These objects are designed to be useful, and while they may be dangerous the danger is a side-effect of a desired function.

An SCP object produced by a Creator rolls 1d5 and 1d3.  The 1d5 result is its Danger.  The sum of both results is its Value.

This means that SCP objects produced by Creators will always be valuable, and will be more valuable the more dangerous they are (since Value 7 Danger 5 is better than Value 3 Danger 1).

SPACETIME SHENANIGANS (e.g. objects that have fallen through time portals from future or parallel universes) are valuable as mechanisms that can be worked with and sometimes even reverse-engineered.

An SCP object resulting from Spacetime Shenanigans rolls 1d6 for its Danger, and has a constant Value of 5.

This means that SCP objects resulting from Spacetime Shenanigans will usually be profitable, and more so the less dangerous they are.

ANART OBJECTS are produced by 'anomalous artists' who create curiosities.  In most cases these are neither particularly dangerous nor particularly valuable - they're created by people who are trying to make something artistic, not something useful.

An SCP object resulting from anartists rolls 12d6.  Its Value is equal to the number of 6s rolled and its Danger is equal to the number of 1s rolled.

This means that SCP anart objects are usually not very valuable, though low-danger ones are usually slightly profitable.

VILLAINS (e.g. the Disciples of the Scarlet King, the Church of the Broken God) are trying to use anomalous means to destroy the world/conquer the world/immanentize the eschaton.  These objects are designed to be dangerous.

An SCP object produced by a Villain rolls 1d4 and 1d5.  The 1d4 result is its Value.  The sum of both results is its Danger.

This means that SCP objects produced by Villains are never a good idea to pursue.
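The four generation rules above can be sketched as a small simulation. This is an illustration of the rules as described, not the actual generation code; the function names and the short source labels are assumptions made for the example.

```python
import random


def roll(sides, rng):
    """Roll one fair die with the given number of sides."""
    return rng.randint(1, sides)


def generate_object(source, rng):
    """Return (value, danger) for one object from the given source."""
    if source == "Creator":
        danger = roll(5, rng)                    # 1d5 is the Danger
        return danger + roll(3, rng), danger     # Value = 1d5 + 1d3
    if source == "Spacetime":
        return 5, roll(6, rng)                   # constant Value 5, Danger 1d6
    if source == "Anart":
        dice = [roll(6, rng) for _ in range(12)]  # 12d6
        return dice.count(6), dice.count(1)      # 6s are Value, 1s are Danger
    if source == "Villain":
        value = roll(4, rng)                     # 1d4 is the Value
        return value, value + roll(5, rng)       # Danger = 1d4 + 1d5
    raise ValueError(f"unknown source: {source}")


rng = random.Random(0)
for source in ["Creator", "Spacetime", "Anart", "Villain"]:
    samples = [generate_object(source, rng) for _ in range(10_000)]
    mean_value = sum(v for v, _ in samples) / len(samples)
    mean_danger = sum(d for _, d in samples) / len(samples)
    print(f"{source:10s} mean Value {mean_value:.2f}, mean Danger {mean_danger:.2f}")
```

Written this way, the structural facts are easy to see: a Creator object's Value always exceeds its Danger, and a Villain object's Danger always exceeds its Value.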

Expected profit from an object based on source and classification:

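| Classification | Creator | Spacetime | Anart | Villain |
|---|---|---|---|---|
| Safe | $24.0M | $49.6M | $9.8M | -$5.2M |
| Euclid | $44.4M | $31.3M | -$10.9M | -$15.9M |
| Keter | $61.8M | -$1.7M | -$44.7M | -$58.5M |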

Your overall goal, therefore, is to identify and pursue Creator-made objects (especially high-danger) and Spacetime-made objects (especially low-danger) while trying to avoid Anart and especially Villain-sourced objects.

TAGS

Tags are not directly relevant to Value or Danger.  Instead, Tags are informative about Value and Danger by being informative about Source.  Different sources have different probabilities of exhibiting a given tag:

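| Tag | Creators | Spacetime | Anartists | Villains |
|---|---|---|---|---|
| Humanoid | 40% | 10% | 1% | 10% |
| Infohazardous | 20% | 20% | 40% | 10% |
| Location | 15% | 1% | 50% | 10% |
| Organic | 50% | 1% | 10% | 15% |
| Predatory | 15% | 5% | 5% | 30% |
| Mechanical | 1% | 60% | 10% | 20% |
| Mobile | 50% | 40% | 5% | 40% |
| Replicating | 1% | 20% | 5% | 30% |
| Virtual | 5% | 40% | 30% | 1% |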

For a given source, tags are independent of one another (with the exception of Humanoid and Location, which are mutually exclusive).
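Because tags are conditionally independent given the source, a naive-Bayes update recovers a source posterior from an object's tag set. The sketch below is illustrative only: it assumes a uniform prior over sources (the real prior would also reflect classification and location), ignores the Humanoid/Location exclusivity, and uses a hypothetical function name.

```python
SOURCES = ("Creators", "Spacetime", "Anartists", "Villains")

# P(tag present | source), copied from the table above, in SOURCES order.
TAG_PROBS = {
    "Humanoid":      (0.40, 0.10, 0.01, 0.10),
    "Infohazardous": (0.20, 0.20, 0.40, 0.10),
    "Location":      (0.15, 0.01, 0.50, 0.10),
    "Organic":       (0.50, 0.01, 0.10, 0.15),
    "Predatory":     (0.15, 0.05, 0.05, 0.30),
    "Mechanical":    (0.01, 0.60, 0.10, 0.20),
    "Mobile":        (0.50, 0.40, 0.05, 0.40),
    "Replicating":   (0.01, 0.20, 0.05, 0.30),
    "Virtual":       (0.05, 0.40, 0.30, 0.01),
}


def source_posterior(tags, prior=None):
    """Posterior over sources given the full set of tags an object has."""
    unknown = set(tags) - set(TAG_PROBS)
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    if prior is None:
        # Assumption: uniform prior, chosen only to keep the example small.
        prior = {source: 1 / len(SOURCES) for source in SOURCES}
    weights = {}
    for i, source in enumerate(SOURCES):
        weight = prior[source]
        for tag, probs in TAG_PROBS.items():
            # Absent tags are evidence too, so every tag contributes a factor.
            weight *= probs[i] if tag in tags else 1 - probs[i]
        weights[source] = weight
    total = sum(weights.values())
    return {source: weight / total for source, weight in weights.items()}


for tags in [{"Mechanical", "Virtual"}, {"Humanoid", "Organic"}]:
    posterior = source_posterior(tags)
    print(sorted(tags), {s: round(p, 3) for s, p in posterior.items()})
```

Note that the absence of a tag matters as much as its presence: an object that is not Mechanical is already somewhat less likely to come from Spacetime Shenanigans.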

A full analysis of what tags imply what is quite deep.  A few sample things to point out:

Tags also had a secondary effect on retrieval teams:

Your predecessors knew the first two of these things, and did not send Infiltration teams after Locations or Legal teams after Humanoids.  They did not know the third, and frequently sent Paramilitary teams to fail in retrieving Virtual SCPs.

EDITED TO ADD: Aside from their dependence on tags, your predecessors' actions were almost entirely random, sending 2 teams of each type plus 1-5 additional random teams (1d3 at first, up to 1d4 in 1950 and 1d5 in 2000) to target random SCP objects.
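One reading of that team-count rule can be sketched as follows. This is only an interpretation: it assumes the die size changes exactly at 1950 and 2000, that each extra team's type is chosen uniformly at random, and it leaves out target selection and the tag restrictions; all names are hypothetical.

```python
import random

TEAM_TYPES = ("Infiltration", "Legal", "Paramilitary")


def extra_team_die(year):
    """Size of the die rolled for additional teams (assumed cutoffs)."""
    if year >= 2000:
        return 5
    if year >= 1950:
        return 4
    return 3


def teams_sent(year, rng):
    """Team types dispatched in one period: 2 of each type plus 1dN extras."""
    teams = [team for team in TEAM_TYPES for _ in range(2)]
    extra = rng.randint(1, extra_team_die(year))
    # Assumption: the type of each extra team is uniformly random.
    teams += [rng.choice(TEAM_TYPES) for _ in range(extra)]
    return teams


rng = random.Random(0)
for year in (1930, 1975, 2010):
    print(year, teams_sent(year, rng))
```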

LOCATIONS

There were six sites in-game.  Site 1 is the Foundation's overall administration and headquarters; SCP objects are not stored there.

There is no Site 5.  You are not cleared to know what happened to Site 5.  Do not enquire further.  Be vg jvyy unccra gb lbh.

These had a relationship with Sources:

They also had a modest effect on retrieval teams:

STRATEGY

With a theoretical perfect understanding of how the system works, optimal strategy is to:

The SCP objects you had access to, their source probabilities with evidence taken into account, and the resulting expected profits, were:

| SCP (Classification) | Creators | Villains | Anartists | Spacetime | Expected Profit if Stolen ($MM) |
|---|---|---|---|---|---|
| SCP-2797 (Keter) | 93.44% | 6.39% | 0.02% | 0.14% | 54.0 |
| SCP-3273 (Safe) | 0.04% | 0.05% | 0.70% | 99.21% | 49.3 |
| SCP-4449 (Safe) | 0.03% | 0.04% | 8.21% | 91.72% | 46.3 |
| SCP-537 (Safe) | 1.01% | 5.50% | 3.94% | 89.56% | 44.8 |
| SCP-3936 (Euclid) | 98.54% | 0.92% | 0.49% | 0.06% | 43.5 |
| SCP-3440 (Safe) | 0.32% | 2.61% | 11.87% | 85.20% | 43.4 |
| SCP-4834 (Euclid) | 97.18% | 2.71% | 0.04% | 0.08% | 42.7 |
| SCP-4004 (Keter) | 82.10% | 14.98% | 2.89% | 0.03% | 40.7 |
| SCP-3668 (Keter) | 78.71% | 15.26% | 0.05% | 5.99% | 39.6 |
| SCP-4026 (Euclid) | 89.43% | 7.07% | 0.08% | 3.42% | 39.6 |
| SCP-2720 (Keter) | 80.97% | 18.69% | 0.15% | 0.19% | 39.1 |
| SCP-5117 (Euclid) | 87.74% | 7.74% | 3.98% | 0.54% | 37.4 |
| SCP-2719 (Euclid) | 81.02% | 12.06% | 6.55% | 0.37% | 33.4 |
| SCP-5087 (Euclid) | 76.73% | 4.28% | 18.97% | 0.01% | 31.3 |
| SCP-2325 (Euclid) | 74.73% | 15.76% | 8.75% | 0.76% | 29.9 |
| SCP-4957 (Safe) | 66.79% | 7.34% | 5.92% | 19.94% | 26.1 |
| SCP-1282 (Euclid) | 0.10% | 0.17% | 16.84% | 82.89% | 24.1 |
| SCP-2628 (Keter) | 66.12% | 28.84% | 0.01% | 5.03% | 23.9 |
| SCP-3212 (Safe) | 96.87% | 1.88% | 0.95% | 0.29% | 23.4 |
| SCP-2253 (Safe) | 89.18% | 6.70% | 0.11% | 4.02% | 23.1 |
| SCP-1970 (Euclid) | 61.35% | 18.26% | 19.82% | 0.56% | 22.3 |
| SCP-1720 (Safe) | 15.41% | 5.08% | 51.90% | 27.61% | 22.2 |
| SCP-4931 (Euclid) | 57.88% | 35.13% | 1.76% | 5.23% | 21.5 |
| SCP-4027 (Euclid) | 1.66% | 23.07% | 0.70% | 74.57% | 20.3 |
| SCP-3339 (Safe) | 68.94% | 4.01% | 25.80% | 1.25% | 19.5 |
| SCP-3597 (Safe) | 30.33% | 5.00% | 51.08% | 13.58% | 18.8 |
| SCP-4271 (Safe) | 30.33% | 5.00% | 51.08% | 13.58% | 18.8 |
| SCP-3699 (Safe) | 63.14% | 0.69% | 36.16% | 0.01% | 18.7 |
| SCP-3850 (Safe) | 63.14% | 0.69% | 36.16% | 0.01% | 18.7 |
| SCP-2942 (Euclid) | 0.24% | 3.25% | 29.17% | 67.35% | 17.5 |
| SCP-4390 (Safe) | 1.49% | 0.09% | 81.55% | 16.87% | 16.7 |
| SCP-1466 (Euclid) | 41.30% | 42.79% | 1.69% | 14.23% | 15.8 |
| SCP-4709 (Safe) | 11.16% | 3.68% | 75.17% | 9.99% | 14.8 |
| SCP-5136 (Safe) | 24.60% | 0.24% | 75.15% | 0.02% | 13.3 |
| SCP-4625 (Safe) | 22.76% | 6.84% | 69.99% | 0.41% | 12.2 |
| SCP-2122 (Safe) | 2.32% | 0.03% | 97.21% | 0.44% | 10.3 |
| SCP-4370 (Safe) | 0.22% | <0.01% | 99.69% | 0.09% | 9.9 |
| SCP-3656 (Keter) | 45.94% | 40.07% | 0.02% | 13.97% | 4.7 |
| SCP-2883 (Euclid) | 9.60% | 7.20% | 74.47% | 8.73% | -2.3 |
| SCP-4565 (Keter) | 0.46% | 1.42% | 0.17% | 97.95% | -2.3 |
| SCP-4579 (Keter) | 5.63% | 6.71% | 1.78% | 85.87% | -2.7 |
| SCP-4550 (Euclid) | 14.10% | 1.98% | 83.70% | 0.22% | -3.1 |
| SCP-1785 (Euclid) | 3.99% | 1.12% | 94.76% | 0.12% | -8.7 |
| SCP-4222 (Keter) | 3.16% | 5.64% | 18.99% | 72.21% | -11.1 |
| SCP-2699 (Euclid) | 1.08% | 94.06% | 1.22% | 3.63% | -13.5 |
| SCP-4412 (Keter) | 0.20% | 29.98% | 0.04% | 69.78% | -18.6 |
| SCP-2898 (Keter) | 0.05% | 45.01% | <0.01% | 54.94% | -27.2 |
| SCP-4424 (Keter) | 2.21% | 47.78% | 0.02% | 49.98% | -27.4 |
| SCP-2964 (Keter) | 1.48% | 48.01% | 0.28% | 50.23% | -28.1 |
| SCP-2603 (Keter) | 14.84% | 69.03% | 2.74% | 13.39% | -32.6 |
| SCP-5058 (Keter) | 14.84% | 69.03% | 2.74% | 13.39% | -32.6 |
| SCP-3781 (Keter) | 11.84% | 73.45% | 0.46% | 14.25% | -36.1 |
| SCP-2626 (Keter) | 11.48% | 86.43% | 0.03% | 2.06% | -43.5 |
| SCP-1838 (Keter) | 10.16% | 86.05% | 3.42% | 0.37% | -45.6 |
| SCP-2116 (Keter) | 0.16% | 87.03% | 0.38% | 12.44% | -51.2 |
| SCP-4036 (Keter) | 4.11% | 92.78% | 0.90% | 2.21% | -52.2 |
| SCP-3577 (Keter) | 0.04% | 88.98% | 0.12% | 10.86% | -52.2 |
| SCP-2178 (Keter) | 0.23% | 88.86% | 0.87% | 10.05% | -52.4 |
| SCP-3279 (Keter) | <0.01% | 92.26% | 0.05% | 7.69% | -54.1 |
| SCP-4654 (Keter) | 1.27% | 94.03% | 3.74% | 0.96% | -55.9 |

 

While not all objects are classifiable, in many cases we can be confident about what source an object came from.  The most profitable object in expectation if stolen, SCP-2797, is a Keter-class object: but its tags are innocent enough that we can map it to a >93% chance of coming from a Creator, and only a <7% chance of coming from a Villain.
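The expected-profit column appears to be the source probabilities weighted by the per-source, per-classification expected profits given earlier. The sketch below checks that reading against SCP-2797; it is a reconstruction from the published figures rather than the author's code, and the names are illustrative.

```python
# Expected profit in $MM by classification and source, from the earlier table.
EXPECTED_PROFIT_MM = {
    "Safe":   {"Creator": 24.0, "Spacetime": 49.6, "Anart": 9.8, "Villain": -5.2},
    "Euclid": {"Creator": 44.4, "Spacetime": 31.3, "Anart": -10.9, "Villain": -15.9},
    "Keter":  {"Creator": 61.8, "Spacetime": -1.7, "Anart": -44.7, "Villain": -58.5},
}


def expected_profit(classification, source_probs):
    """Probability-weighted expected profit in $MM if the object is stolen."""
    by_source = EXPECTED_PROFIT_MM[classification]
    return sum(prob * by_source[source] for source, prob in source_probs.items())


# Source probabilities for SCP-2797 (Keter) as listed above; they are rounded,
# so they sum to slightly under 1.
scp_2797 = {"Creator": 0.9344, "Villain": 0.0639, "Anart": 0.0002, "Spacetime": 0.0014}
print(round(expected_profit("Keter", scp_2797), 1))  # 54.0, matching the listed value
```

The same calculation for SCP-3273 (Safe) gives about 49.3, which also agrees with the listed figure.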

Once good objects are identified, we want to send the optimal teams.  The best targets are those where we can get a 90% success rate: SCP-2797 is not actually our best target, as it is located in Shanghai (where we would ordinarily want to send a Legal team for a 90% success rate), but is Humanoid (so we need to send a different team and accept a 60% success rate).  It's still in our top 9 targets, though.

 

One example of an optimal strategy to maximize profit is to send:

Infiltration teams to retrieve SCP-3668, SCP-2719 and SCP-4449

Legal teams to retrieve SCP-4004, SCP-5117 and SCP-3273

Paramilitary teams to retrieve SCP-3440, SCP-2797 and SCP-3936.

 

One example of an optimal strategy to minimize profit is to send:

Infiltration teams to retrieve SCP-3781, SCP-2603 and SCP-4036

Legal teams to retrieve SCP-4654, SCP-3279 and SCP-2178

Paramilitary teams to retrieve SCP-3577, SCP-1838 and SCP-2116.

 

LEADERBOARD

| Player | Expected Profit |
|---|---|
| Optimal Play (max) | $291.0 MM |
| GuySrinivasan (max) | $169.5 MM |
| Measure (max) | $155.9 MM |
| Yonge | $151.4 MM |
| abstractapplic | $146.9 MM |
| Pablo Repetto | $143.2 MM |
| Random Play (Safe SCPs only) | $104.1 MM |
| Entirely Random Play | $15.0 MM |
| Random Play (Keter SCPs only) | -$113.6 MM |
| GuySrinivasan (min) | -$290.3 MM |
| Measure (min) | -$323.9 MM |
| Optimal Play (min) | -$358.0 MM |

If you're interested in looking in more detail, you can add lines like the following into the code and run it:

# myWorld and evaluate_strategy are defined in the full generation code linked above.
print('\nEvaluating max payoff plan:')
evaluate_strategy(myWorld, infil=[3668, 2719, 4449], legal=[4004, 5117, 3273], paramil=[3440, 2797, 3936])

Most players pursuing high profits avoided Keter objects.  GuySrinivasan's max-payoff plan (the most successful one) pursued 5 Safe, 4 Euclid and 0 Keter objects.  At the extreme, abstractapplic and Pablo Repetto each pursued 8 Safe, 1 Euclid and 0 Keter.

While this approach was less risky if you couldn't distinguish good Keter objects from bad ones, it was not the highest-payoff approach: optimal play in fact pursued 3 Safe, 3 Euclid and 3 Keter objects (because the payoff from Keter Creator objects is the highest available, and several Keter SCPs can be fairly reliably identified as coming from Creators). 

Nevertheless, I support players who made this decision.  It's valuable to know what parts of a problem you can optimize at your current level of understanding and what parts to leave alone.   If you don't think you can distinguish good from bad objects at higher danger levels, trying to do that just risks shooting yourself in the foot.

FEEDBACK REQUEST

As usual, I'm interested in feedback.  If you played the scenario, what did you like and what did you not like?  If you might have played but in the end did not, what drove you away?  Is the timeline too long/too short/just right?  Is the underlying data structure too complicated to approach?  Or too simple to feel realistic?  Or both at once?

Thanks again to simon for the scenario idea (although he seems to have missed the scenario itself), and to abstractapplic for feedback on a draft, and thank you all for playing!

9 comments

Comments sorted by top scores.

comment by Randomini · 2022-02-22T21:09:19.287Z · LW(p) · GW(p)

Marshall appears irked that you didn't send any teams, but he is watching your presentation with interest. Carter, sitting next to him, is somewhat more laid back. Both of them seem annoyed at having lost a bet of some kind, forking over wads of... cash? It's hard to focus your eyes on whatever it is, before Darke squirrels them away into the depths of his shabby cloak.

"As you can see, once split out according to SEK classifications, the historical profits are... unnervingly linear. Especially when you consider that these are post-processing, to account for inflation and the like - there should be a much higher signal-to-noise ratio than we're seeing here. To the extent that I understand these things, I suspect that your entire database has been contaminated. We don't get the kind of volatility spikes you'd expect during the depression or the second world war - it seems to completely ignore most world events."

You aren't the kind of data scientist that gives answers to problems. You're the kind that just looks at the data, and figures out the story below the surface. And the story here is just... weird. There's the obvious cognitohazards, the standard messages from beyond hidden in the data (you get that all the time in Kaggle). But the thing that disturbed you the most wasn't those smaller patterns.

It was the almost perfectly straight lines.

Carter adjusted his tie nervously.

"These are definitely factually accurate to our records - excluding the obvious infohazardous corruptions, of course. We have paper trails here, physical paper trails, receipts signed in blood. I can assure you this information hasn't been manipulated."

"Then there's something else doing this, and it's probably beyond my paygrade to figure out exactly what. Since you seem unwilling to give me the supporting documentation for these objects, that really is my best guess. All I can do is bring the issue to your attention."

"Alright. We can... give you some more information. I'll have the documents sent to your desk. Let us know if you have any additional insights."

You nod curtly, and as you close the door behind you, you hear voices raising behind it. You shake your head. You don't want to know what they're discussing.

Marshall and Carter stare expectantly at Darke. Marshall demands answers:

"What did you do, and when did you do it?"

"Oh, I didn't do anything myself. But I could tell something was... off. I could taste it, in the air. Look, our physics isn't quite right, watch -"

Darke grabs two delicate wine glasses from the table by the stems, raises his arms, and lets go. The glasses drop to the floor and shatter.

"Look at the pieces."

Both glasses shattered into three identical pieces along two identical fault lines.

"These are the same glasses, identically. Nothing should break this perfectly. Someone has cheaped out on our universe."

Carter and Marshall glance at each other, confused. Darke turns his head towards an otherwise unremarkable spot on the wall, frowning intently. Iris stares into her monitor, looking into the eyes of the simulation of her distant progenitor.

"Fuck."

She hammers Alt-F4 as quickly as she can and turns to the couch you're lying in. She pulls off the headset - miracle of technology, this thing - and snaps her fingers a few times in your face. Wiping the crud out of your eyes, you reorient and remember how you got here and why.

"Well, that's certainly the weirdest result of that test I've seen so far. And we don't do much other than weird, so I guess you've got the job."

Replies from: Randomini
comment by Randomini · 2022-02-22T21:30:54.954Z · LW(p) · GW(p)

Also

"endings depending on your actions" my actions were 'view source'

also wondering if anyone else found the secret scarlet poem? I will leave its discovery as an exercise to the reader

Replies from: aphyer
comment by aphyer · 2022-02-22T23:06:03.908Z · LW(p) · GW(p)

Hey now, you can't criticize me for cheaping out on the universe and then turn around and cheap out on the analysis part with View Source!

Replies from: Randomini
comment by Randomini · 2022-02-22T23:10:12.247Z · LW(p) · GW(p)

...this is fair

comment by SarahNibs (GuySrinivasan) · 2022-02-22T21:35:21.126Z · LW(p) · GW(p)

My plan of "when out of time, do enough to capture pairwise stuff and combine with an ad hoc heuristic" works surprisingly well in these. :D

Which SCPs are available for capture each quarter, how do we know how many teams of which types were available to capture them, and how were teams allocated in the past?

Feedback: my gut reaction is that it would have been extremely difficult to divine the underlying-source nature of the data. Do you have a suggested method or technique that would have been likely to work? I think that noticing the success rates of teams per locations should have been possible, then ignoring that going forward. But how to get from profits, past the sales/security noise, past the tags noise? Seems like maybe looking at the full matrix of P(tag A | tag B) could have revealed an underlying structure? I'll have to think on that one.

Also, my favorite part of this was tracking down SCP-1182, but I don't think a similar thing should be present in most D&D.Sci challenges. :D

Replies from: aphyer
comment by aphyer · 2022-02-22T23:04:41.503Z · LW(p) · GW(p)

Which SCPs are available for capture each quarter, how do we know how many teams of which types were available to capture them, and how were teams allocated in the past?

 

I've also edited the doc to add this: Aside from their dependence on tags, your predecessors' actions were almost entirely random, sending 2 teams of each type plus 1-5 additional random teams (1d3 at first, up to 1d4 in 1950 and 1d5 in 2000) to target random SCP objects.

Feedback: my gut reaction is that it would have been extremely difficult to divine the underlying-source nature of the data. Do you have a suggested method or technique that would have been likely to work?

I did not intend for it to be realistically possible for players to fully capture the underlying nature of the data.  The goal was to have a tag-to-profit mapping that followed reasonably simple rules, but exhibited complicated behaviors that provided a lot of depth for players to analyze.  My hoped-for pattern of 'how to solve this problem' took the form of:

  1. Sanity-checking the data.  What is up with these weird rows?  Get rid of them!
  2. Safe/Euclid objects are much more profitable on average.  Let's just try a bunch of random Safe objects with no Keter ones.
  3. Chasing down tags to do better than random at identifying profitable Safe objects.  ('Mechanical', 'Mobile' and 'Virtual' are probably your best friends here.)
  4. Keter objects have the highest profit numbers.  Is there a way we can get those?
  5. Chasing down tags to do better than random at identifying profitable Keter objects.  (This would require a lot more effort, including a lot of multi-tag effects like the 'Organic AND Mechanical' one, before it could make your payoff from this beat just pursuing Safe objects.  You do get a bonus ending though.)

Overall most people just did steps #1-3, which is not unexpected.  In your 'escape' plan you actually scored nearly as high as your 'invaluable' plan in expectation, going after SCP-2797, SCP-3668 and SCP-4004 (all three Keter objects that showed up in the max-payoff scenario), but losing some payoffs on Euclid-class objects.

comment by abstractapplic · 2022-02-22T23:28:45.130Z · LW(p) · GW(p)

Reflections on my attempt:

My intuition at the start was that this was a "get a vague sense of how the dataset behaves, account for anomalies and sample biases, then throw ML at what's left of the problem" kind of challenge, and it looks like I was right on the money. It's a pity I didn't have time to do the last step, but still, I'm happy with my "unspectacularly guarantee I don't get fed to a chair" outcome.

Reflections on the challenge:

This is very good. In particular, it's a masterclass in how to do a D&D.Sci game as fanfiction. I'm also pleased to see it has an HTML interactive attached, so future players can easily evaluate their results.

I do think the structure of the challenge made it more complicated than it had to be. Getting a reliable solution requires players to figure out at least four layers: how SCPs behave, how our predecessors acted, what predicts team success, and what predicts profit. Finding some not-entirely-unreasonable contrivance to remove one of those layers ("we removed the last guy when we found out he was picking targets entirely at random"?) could have made the game more approachable, though I admit it would have cost some depth and realism.

comment by Pablo Repetto (pablo-repetto-1) · 2022-02-23T08:59:41.451Z · LW(p) · GW(p)

Hm. I expected to do terribly on this problem since I hardly exhausted my avenues of research (I didn't even clean up the infohazardous object, despite knowing about it from others' comments). I ended up doing the worst out of everybody, which tracks, but the results are still clustered rather close.

I enjoyed the problem a lot, and I'm very grateful to aphyer for pinging me when he made the problem available. Tragically, I was sick at the time. :P

comment by Measure · 2022-02-22T20:21:23.019Z · LW(p) · GW(p)

One example of an optimal strategy to minimize profit is to send:

Infiltration teams to retrieve SCP-3781, SCP-2603 and SCP-4036

Legal teams to retrieve SCP-4654, SCP-3279 and SCP-2178

Paramilitary teams to retrieve SCP-3577, SCP-1838 and SCP-2116.

Looks like I targeted 8/9 of these items but sent the wrong teams for some of them.