Designing serious games - a request for help

post by taryneast · 2011-03-22T11:29:35.803Z · LW · GW · Legacy · 41 comments

We need some ideas for serious games. Games that will help us be better. Games that reward us for improving ourselves (even if just by the satisfaction of seeing our scores improve). Games that will help us in our quest of Tsuyoku Naritai.

We've got an upcoming hackday in London - where we'll have a (small) bunch of people able to code up any good ideas into something usable... but we need **you** to help us come up with a whole bunch of good ideas. 

To start with, they should be simple ideas - not as complex as Rationalist Clue (which is an awesome idea... but we all have day jobs too). I've got in mind something like the kinds of games you see at luminosity.

The ideas should address individual biases - a way of a) training us to recognise when we've accidentally engaged a bias and b) rewarding us when we find a way to get the "right answer" in an unbiased manner.

 

We can do the programming (more help would of course be welcome), we can even come up with some ideas of our own... 

but we are few, and you are many... and the more ideas we get, the better we can choose between them... so let's roll.

41 comments

Comments sorted by top scores.

comment by taryneast · 2011-03-22T12:01:50.025Z · LW(p) · GW(p)

To get the ball rolling, here's an idea based on the "predicting red or blue" card game focusing on calibration, mentioned here: Lawful uncertainty.

The user will be presented with cards one at a time. The cards will be either red or blue. There are X cards in the pack (maybe 50? enough for a good run). At the start of the game, the proportion of red:blue is displayed as percentages.

There will be different types of card-sequences. a) purely random b) an obvious pattern (regularity or increasing numbers of blue etc) c) a combination of the two.

When the game begins, the cards will be "turned over" one at a time. Before the next card is turned over, the user is prompted to "guess the colour" - they have two buttons, red and blue, and clicking on one of those buttons is their guess... and also causes the card to be turned over.

The user starts with a score of 100%. If the user got the card right, the score does not change. Otherwise it is reduced (can't rem the exact function for this - probably need to re-read the article).

A user plays to beat their previous score. High scores are stored for the top ten.
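As a rough illustration of what the purely random version would be training, here is a minimal sketch comparing two guessing strategies. It is not part of the proposal above: the function names are made up for illustration, and it assumes each card is an independent draw with a fixed chance of being blue (a fixed deck dealt without replacement would reward card counting as well).

```python
def accuracy_always_majority(p_blue: float) -> float:
    """Expected hit rate when always guessing the more common colour."""
    return max(p_blue, 1.0 - p_blue)


def accuracy_probability_matching(p_blue: float) -> float:
    """Expected hit rate when guessing blue with probability p_blue.

    A guess is right when guess and card are both blue or both red.
    """
    return p_blue * p_blue + (1.0 - p_blue) * (1.0 - p_blue)
```

With 70% blue cards, always guessing blue is right 70% of the time, while matching your guesses to the displayed proportions is right only about 58% of the time. A score display could show both numbers next to the player's actual hit rate.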

comment by Emile · 2011-03-22T13:48:47.881Z · LW(p) · GW(p)

The user starts with a score of 100%. If the user got the card right, the score does not change. Otherwise it is reduced (can't rem the exact function for this - probably need to re-read the article).

If the player gives probabilities instead of choices, you can have him lose points equal to the log of the probability he assigned to what actually happened (minus the log actually, since it'll be negative). In that case giving honest probability estimates is the optimal strategy.
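A minimal sketch of that scoring rule, assuming the player enters a probability for "red" before each card. The names `log_penalty` and `expected_penalty` are hypothetical, and the natural log is an arbitrary choice of base.

```python
import math


def log_penalty(p_red: float, actual: str) -> float:
    """Points lost: minus the log of the probability given to the actual colour."""
    if not 0.0 < p_red < 1.0:
        raise ValueError("p_red must be strictly between 0 and 1")
    p_actual = p_red if actual == "red" else 1.0 - p_red
    return -math.log(p_actual)


def expected_penalty(reported: float, true_p_red: float) -> float:
    """Average points lost per card if red really comes up with true_p_red."""
    return (true_p_red * log_penalty(reported, "red")
            + (1.0 - true_p_red) * log_penalty(reported, "blue"))
```

Comparing `expected_penalty(0.7, 0.7)` with `expected_penalty(0.9, 0.7)` or `expected_penalty(0.5, 0.7)` shows the honest report losing the fewest points on average. Probabilities of exactly 0 or 1 are rejected here because a wrong answer would then cost infinitely many points.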

comment by taryneast · 2011-03-22T16:54:42.092Z · LW(p) · GW(p)

Sounds like level two. :)

comment by Manfred · 2011-03-22T17:32:51.887Z · LW(p) · GW(p)

You could try to teach pessimism in planning as a subgoal, e.g. by someone asking the player to predict their performance on the next task, as measured by a metric that they weren't just measured on.

comment by taryneast · 2011-03-23T07:39:54.284Z · LW(p) · GW(p)

Can you give me an example of this in game-format? What sorts of tasks would be fun as well as being predictable?

comment by Manfred · 2011-03-23T09:35:00.026Z · LW(p) · GW(p)

For an example, consider a puzzle game where you put together pieces to reach a goal. If well-done, plenty fun. Show the player some sort of level information - how many pieces will this level take you? How much time? How many times will you have to use the "delete a block" ability? How many times will you have to restart the level? What will your score be?

comment by taryneast · 2011-03-23T15:48:55.054Z · LW(p) · GW(p)

Hmmm that could be interesting... and can be attached to existing game-formats.

Predictive sudoku?

comment by Alexei · 2011-03-25T18:14:56.059Z · LW(p) · GW(p)

You can make the game score based. The final score will be inversely proportionate to the time it took to finish the puzzle (which will make the player want to solve it faster) AND proportionate to how closely they've predicted the time it will take them to solve the puzzle. (Solving under the predicted time will give the maximum score for the predicted time, since the player can just wait anyway.)

Example: I guess 10 minutes, solve the puzzle in 8 minutes -> I get score for 10 minutes. Example: I guess 7 minutes, solve the puzzle in 8 minutes -> I get score for 8 minutes - 1 minute penalty.

comment by taryneast · 2011-03-25T21:09:20.432Z · LW(p) · GW(p)

That works - though there should be a penalty for guessing overtime too - otherwise I'd always just guess 9999 minutes :)

comment by Alexei · 2011-03-25T21:41:49.566Z · LW(p) · GW(p)

If you notice, I accounted for that. The higher your guess is, the lower your score can be. If I guess 1 minute and finish in 1 minute, I get (let's say) one million points. If I guess 9999 minutes, then I'll get the score for 9999 minutes even if I finish sooner, for a score of (let's say) 2 points.
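One way to read this scheme in code, as a sketch only. The `scale` constant, the `penalty_per_min` parameter, and the choice to add the overshoot to the effective time are assumptions for illustration; the comments above do not pin down the exact penalty.

```python
def puzzle_score(predicted_min: float, actual_min: float,
                 scale: float = 1000.0, penalty_per_min: float = 1.0) -> float:
    """Score inversely proportional to an 'effective' solving time.

    Finishing at or under the prediction counts as the predicted time
    (the player could have waited anyway). Finishing late counts as the
    actual time plus a penalty for each minute of overshoot.
    """
    if predicted_min <= 0 or actual_min <= 0:
        raise ValueError("times must be positive")
    if actual_min <= predicted_min:
        effective = predicted_min
    else:
        overshoot = actual_min - predicted_min
        effective = actual_min + penalty_per_min * overshoot
    return scale / effective
```

Under these assumptions, guessing 10 and solving in 8 scores as 10 minutes, guessing 7 and solving in 8 scores as 9 minutes, and guessing 9999 scores close to zero however fast you solve. The best score for an 8-minute solve goes to the player who predicted exactly 8.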

comment by taryneast · 2011-03-26T17:51:57.935Z · LW(p) · GW(p)

Ah ok - now I get it.

comment by Oscar_Cunningham · 2011-03-23T10:07:50.506Z · LW(p) · GW(p)

Beware the Ludic Fallacy!

comment by Emile · 2011-03-22T13:17:41.563Z · LW(p) · GW(p)

A few ideas have been kicked around in various threads, that are referenced on the Wiki; that could be a nice background for brainstorming.

comment by taryneast · 2011-03-22T13:46:15.980Z · LW(p) · GW(p)

Ah, thanks for that! Just what we need.

comment by taryneast · 2011-03-22T17:22:07.733Z · LW(p) · GW(p)

Ok, another idea... based on the "sunk cost" bias.

It's a simple investment game. User starts with a set amount of capital.

For the simplest version, you can invest in Stock A or Stock B - whose price fluctuates randomly (we assume we haven't enough money to affect the stock prices ourselves).

At any time, you can see how much your stock is worth now... and how much you have gained/lost so far... but also how much you could have gained if you switched strategy. Thus making it abundantly clear when it's better for you to switch to the other stock -> you eventually get better at noticing when your sunk-cost bias is actually hindering you.

probably needs some more work... eg more detail on how to calculate the various factors...
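As a starting point for those calculations, here is a minimal sketch of the "what you have" versus "what you could have had" display. Everything in it is an assumption for illustration: the random-walk price model, the `volatility` figure, and the names `simulate_prices`, `portfolio_value` and `report`.

```python
import random
from typing import Dict, List


def simulate_prices(start: float, steps: int, rng: random.Random,
                    volatility: float = 0.05) -> List[float]:
    """Random-walk prices: each step changes the price by up to +/- volatility."""
    prices = [start]
    for _ in range(steps):
        change = rng.uniform(-volatility, volatility)
        prices.append(prices[-1] * (1.0 + change))
    return prices


def portfolio_value(capital: float, prices: List[float]) -> float:
    """Current value of capital invested at prices[0] and held until prices[-1]."""
    return capital * prices[-1] / prices[0]


def report(capital: float, held: List[float], other: List[float]) -> Dict[str, float]:
    """Gain so far on the held stock, and how much more the other stock would have made."""
    actual = portfolio_value(capital, held)
    counterfactual = portfolio_value(capital, other)
    return {"gain": actual - capital, "missed": counterfactual - actual}
```

For example, `report(100, [10, 12], [10, 15])` gives a gain of 20 and a missed amount of 30. A real version would also need to track switches part-way through, so that each stretch of holding is valued from the price at which it was bought.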

comment by Manfred · 2011-03-22T19:49:49.663Z · LW(p) · GW(p)

An interesting exercise you might use to attack privileging the hypothesis:

Start with a privileged hypothesis, ask the player to estimate its probability. Then ask the player to generate a bunch of alternate hypotheses. Then ask them to re-assess the probability of the initial hypothesis. Random line idea: "it could either happen or it could not happen - so there's a 50% chance, right?"

comment by taryneast · 2011-03-23T07:41:04.312Z · LW(p) · GW(p)

Can you give me the examples of the hypotheses that can be used to test the player? what is the scoring mechanism?

Remember - we don't have another person doing this - it all has to be coded into the program - so you'll have to spell it out to us so we can spell it out in code.

comment by Manfred · 2011-03-23T09:52:16.690Z · LW(p) · GW(p)

Scoring or checking that would definitely be difficult unless the game had a syntax for hypotheses, something like drawing causal arrows between bubbles. But that would at the same time make it boring and unhelpful to make hypotheses. I guess it's not really game material.

I suppose you could have the hypotheses written by the designers and then have various characters espouse them over the course of the game. But that would make the exercise very minor and not worth it.

Soooo nevermind.

comment by Armok_GoB · 2011-03-22T17:38:26.909Z · LW(p) · GW(p)

One thing that's important to AVOID is the things that come from the world being artificial and made specifically for challenges, things like there always being a way to succeed in the given task even if it seems not to, or the player always turning out to be important without a clear cause, or a multitude of other things, just look through TV tropes for more. In short, most games are stories. If you want the game to train you in thinking like reality you have to make the game more similar to reality (or make it very abstract).

So, some properties a reality like game might have:

  • Procedurally generated environments, with no special checks for solvability or indeed any causal arrows coming in from what the player might do in them.

  • The player is not placed in a special character. Either the player is the only intelligence around, or the player is given control of a random sample from a large population where everyone has an interesting enough life, or the player has to find and choose someone interesting from a randomly generated population. The middle solution is probably the most realistic.

  • Lots of things are Hard Work. No gaining advantages quickly as a side effect of accomplishing your other goals, or by killing random things. You have to earn your skills.

  • Brutal and unforgiving. A single mistake can make everything you have come crashing down, and there is no reloading from a saved game. Even usually harmless things can permanently cripple you, and the only way to avoid this is being observant and trying to understand what makes things tick so you can be out of the blast radius when they finally tock. If there is combat, it is short, and even if you win it can leave you mortally wounded. The game is NOT fair, and things having consequences disproportionate to the crime is the rule rather than the exception.

  • You don't really know what the goal is until you win. The game constantly gives you "intuition" hints when you LOCALLY approach it. The goal is different each time you play.

  • Some NPC characters will try to detect any behaviour that might be smarter than them, and get hostile to anything that seems to be getting an advantage over them. The player can either act dumb and ignore obvious opportunities, avoid interacting with NPCs, or try to grab power too fast for the NPCs to stop them.

Lots of these sound similar to the roguelike genre, I think. Those also tend to require less development effort than most other games.

comment by atucker · 2011-03-23T00:51:52.305Z · LW(p) · GW(p)

This reminds me of a game where you explore a sandbox world, in which most players choose to harvest and rearrange the blocks that make up the terrain in order to build whatever they decide they should build.

At night, monsters come out and attack you, so you should build a protective structure or have weapons and constant vigilance.

Specifically,

  • It has a procedurally generated world.
  • The player is the only intelligence.
  • Things are pretty hard, and there's no leveling up to make it easier. It takes a while to physically reshape the world into what you want.
  • It's super brutal. Like, an improperly controlled fire will burn down a forest, and probably kill you while you're trying to put it out. And there are lots of ways to kill yourself and generally ruin everything.

It's also a timesink that I wasted a week or two playing before quitting, so I'm ROT13ing its name. zvarpensg. In case you're interested. It's also behind a paywall and pretty buggy, so hopefully that will discourage long-term casual play.

EDIT: It's not particularly rational, but it does have a lot of the traits mentioned.

comment by Normal_Anomaly · 2011-03-23T01:15:36.056Z · LW(p) · GW(p)

I know someone who plays this game constantly. It's fun, but it doesn't seem like a rationality builder. Said obsessed person doesn't think it's taught ver anything really useful.

comment by atucker · 2011-03-23T01:19:39.836Z · LW(p) · GW(p)

Agreed. It's not particularly rational, but it just had the qualities described.

comment by kpreid · 2011-03-23T21:12:42.957Z · LW(p) · GW(p)

I think it may have some small benefit in practice-of-thinking, if you get into mechanism building:

  • Solving problems in digital logic and 3-dimensional spatial layout.
  • Planning for sufficient space available for a mechanism, especially if you want it hidden in walls between rooms and no bigger than necessary.
  • Debugging.

But the majority of the time spent is probably no better than any other grinding.

comment by Alexei · 2011-03-25T18:18:47.551Z · LW(p) · GW(p)

A simple version of this would be the GROW flash game series. Here is one

comment by [deleted] · 2011-03-23T02:00:56.714Z · LW(p) · GW(p)

So... Dwarf Fortress?

comment by Armok_GoB · 2011-03-23T11:21:16.572Z · LW(p) · GW(p)

Hey, have a look at my name... :p

comment by taryneast · 2011-03-23T15:55:41.533Z · LW(p) · GW(p)

Hiya, while I greatly appreciate your effort to help us improve the quality of games we build... I don't want to turn any idea away just yet (no matter how terribly tropey).

We're still in the early brainstorming phase of this - and it's much better just to let the ideas pour out - regardless of how bad they are. Engaging the internal editor too early quashes that natural flow. :)

Also - the games you describe above sound really interesting... but probably too big for what we've got in mind to begin with... my little red+blue card game might be built over a weekend, rationalist clue (which would fit your requirements and be totally awesome) - would take months of work.

Let's start with the simple stuff - even if it's tropey.

Besides, someone once said:

"Tropes Are Tools, not clichés. They are plot devices and progressions (similar to but more defined than literary devices) that have been around for a long time because they work, and there's no inherent loss of complexity through the use of them (most of the time). "

comment by Armok_GoB · 2011-03-23T18:47:01.033Z · LW(p) · GW(p)

I never said all tropes are bad. And even saying SOME are bad was only in the context of training rationality. Most tropes are, however, divergences from reality that the human brain is prone to make, and thus looking for them might be useful in making a game more similar to reality. This is different from realism, by the way: realism is about facts inside the work of fiction having the same answers as IRL. The reality-similarity we want for a rationalist game, however, is on the decision-theoretical level, and in fact the first kind of realism might be directly bad, as it allows players to "cheat" and not find the answers out themselves using the in-game tools.

Anyway, I am brainstorming. I am not constraining the configuration space of possible games, I'm just pointing toward a section of it where I think it might be a good idea to start looking.

comment by Cyan · 2011-03-24T21:29:44.960Z · LW(p) · GW(p)

This article analyzing Angry Birds seems like it would be useful to you. (via)

comment by taryneast · 2011-03-25T21:34:27.340Z · LW(p) · GW(p)

Really awesome link, thanks. :)

comment by virtualAdept · 2011-03-22T18:51:52.188Z · LW(p) · GW(p)

Here's an idea for a game to train awareness of/resistance to confirmation bias:

The game would consist of three phases, that could then be repeated for however many iterations (levels!) were desired.
1) Presenting and habituating the "theory." Basically, give a set of rules for making some kind of decision/prediction, and then have the player apply those rules to a series of scenarios that clearly illustrate the supposed predictive (or score-increasing, if you will) properties of the Theory.
2) "In the wild" - Now present a series of scenarios that each either offer evidence that the Theory from phase 1 is useful (+), evidence that the Theory is incorrect(-), or no clear evidence in either direction (null).
3) "Assessment" - Have the player estimate the relative frequencies of (+), (-), and (null) evidence given in phase 2. Player receives an iteration score based on the accuracy of these estimates, and a cumulative score over all iterations completed.

Later iterations (higher levels) could perhaps re-use multiple Theories for the same round, and then in phase 3 ask for evidence estimates for all the Theories at once, possibly even throwing in Theories for which no evidence was presented in the second phase. Higher levels of complexity bring higher stakes (larger increases for accuracy and larger decreases for inaccuracy), so a player who could continue to improve the cumulative score with increases in difficulty would be doing very well indeed.

I've spoken of the Theories and Evidence in the purely abstract here, but I'm picturing either color/shape patterns and movements, or plausibly realistic word problem-type scenarios. The former would be preferable since it would not involve importing the player's biases about situations that might be found in the real world... or actually, come to think of it, it might be interesting and/or useful to make use of realistic-seeming examples precisely for that reason. Huh.

Anyway. The scoring algorithm would reward players who most aggressively sought out (-) evidence for the active Theory or Theories.

comment by Emile · 2011-03-26T10:05:37.723Z · LW(p) · GW(p)

That sounds vaguely similar to the game of Eleusis, which I'm surprised wasn't mentioned yet.

And of course, Zendo.

comment by taryneast · 2011-03-23T15:59:05.299Z · LW(p) · GW(p)

I'm not sure I follow how to turn this into a computer game. Can you give me an in-game example of the "set of rules for making some kind of decision/prediction"? Also the set of scenarios and how to "apply those rules"?

Remember that we have to spell this stuff out for a computer to be able to understand.

comment by virtualAdept · 2011-03-23T19:36:40.171Z · LW(p) · GW(p)

This is the simplest sort of example that I was picturing as I wrote the suggestion - it might not be sophisticated enough as described below to be sufficiently challenging.

I also changed my mind a bit about how phase 1 should be structured, so I'll work that in.

A "scenario" is a box on the screen that is populated by colored shapes that move around like paramecia on a microscope slide, and interact with each other according to the rules for the current round of the game. The scenario ends after a short time period (20-40 seconds) and freezes in its End State. This is what the player will be trying to predict.

Phase 1: Several scenarios are presented in sequence. Each scenario consists of colored shapes interacting with one another - they might bounce off one another and change colors; they might split after a collision or spontaneously; they might eat one another, etc. The interactions show a pattern over the multiple scenarios, such that an observer will eventually start to form predictions about the end state of the system in each scenario. After the pattern has been demonstrated, the player could be asked to code a decision tree for prediction of the end state based on the pattern observed (or this step could actually be skipped, and the Phase 3 predictions just compared to the implicit ruleset used for the pattern without ever making sure the player knows it). Several more scenarios are presented where the player is asked to predict the final state (following the same ruleset as the earlier patterns).

A very simple example of such a ruleset could be as follows:

  • If there are a circle and a square of the same color, they will collide.
  • If a red circle collides with a red square, they will each split into two of themselves.
  • If a blue circle collides with a blue square, the circle will 'eat' the square.
  • If a circle and square of any other color collide, their states will not change after collision.
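The example ruleset above could be written down for the program roughly as follows. This is a sketch under assumptions: the `Shape` class and `collide` function are hypothetical names, "eating" is modelled as the square simply disappearing, and pairs of different colours (which the rules do not cover) are left unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shape:
    kind: str    # "circle" or "square"
    colour: str  # e.g. "red", "blue", "green"


def collide(circle: Shape, square: Shape) -> List[Shape]:
    """Return the shapes that exist after a circle meets a square."""
    if circle.kind != "circle" or square.kind != "square":
        raise ValueError("expected a circle followed by a square")
    if circle.colour != square.colour:
        # Not covered by the example rules; assume nothing happens.
        return [circle, square]
    if circle.colour == "red":
        # Red pair: each shape splits into two of itself.
        return [circle, circle, square, square]
    if circle.colour == "blue":
        # Blue pair: the circle eats the square.
        return [circle]
    # Any other matching colour: no change after the collision.
    return [circle, square]
```

A Phase 2 scenario that violates the ruleset could then be generated by deliberately swapping in a different outcome for some collisions and recording how many were swapped.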

Phase 2: A given number of scenarios are presented (including the end state). This number is available to the player (ie, the player does not have to keep count of the total number). Some follow the rules demonstrated in Phase 1 (+ evidence). Some explicitly violate these rules (with varying degrees of blatancy - using the ruleset above, one scenario might contain only one pair of shapes that did not follow the applicable rule, while another scenario might contain five pairs that misbehaved) (- evidence). Some contain shape/color combos that simply do not contain the right combinations to illustrate the rule (null evidence).

Phase 3: The player is asked to report the relative amounts of (+), (-), and (null) evidence presented in Phase 2.
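A possible way to score the Phase 3 report, sketched under assumptions: each Phase 2 scenario is tagged "+", "-" or "null" when it is generated, the player reports three fractions, and the score is 100 minus a scaled total absolute error. The function names and the 0-100 scale are illustrative choices, not something specified above.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("+", "-", "null")


def evidence_frequencies(shown: Iterable[str]) -> Dict[str, float]:
    """Fraction of Phase 2 scenarios carrying each evidence label."""
    counts = Counter(shown)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no labelled scenarios were shown")
    return {label: counts[label] / total for label in LABELS}


def estimate_score(estimate: Dict[str, float], shown: Iterable[str]) -> float:
    """100 for a perfect report, falling as the reported fractions drift.

    Two sets of fractions that each sum to 1 differ by at most 2 in
    total absolute error, so the error is halved before scaling.
    """
    actual = evidence_frequencies(shown)
    error = sum(abs(estimate.get(label, 0.0) - actual[label]) for label in LABELS)
    return 100.0 * (1.0 - error / 2.0)
```

A player who remembers mostly confirming scenarios and reports 80% (+) when only half were (+) loses points in proportion to the size of the distortion, which is the effect the game is trying to make visible.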

There is one underlying ruleset per round of the game. Rounds can and should sometimes have rules that contradict rules from previous rounds. The rulesets increase in complexity each time a new round is begun.

Difficulty would increase with complexity of rulesets. Requiring the player to explicitly state the ruleset inferred in Phase 1 would probably make it easier. Introducing interacting symbols that have meaning beyond the bounds of the game (words or pictures) instead of the shapes would likely increase difficulty by requiring the player to ignore prior associations and biases attached to the symbols being used.

Does that make the idea a bit clearer?

comment by taryneast · 2011-03-24T19:01:15.080Z · LW(p) · GW(p)

actually yeah - this is a great idea.

We could probably start by coding up a simplified version of this - just to get something done... then add more of the complex features after that.

For example a good starting point would be for phase 1 predictions to just ask a (randomised) set of multi-choice or simple write-in questions for predictions: eg "how many red squares will there be at the end? in which part of the screen will the blue circle end up?" etc.

I reckon that in the first "level" they could start by estimating a probability, rather than jumping straight into weightings of evidence? We could then introduce evidence weighting as a "level 2"? What do you think? Would that totally change the nature of what it's teaching too much?

after we've got that working, we could then figure out how to get the user to describe the ruleset to the computer in a flexible way. That's actually a Tough Problem, BTW. It's basically forming a mini-language... so definitely on the books, but probably not the first iteration. :)

comment by virtualAdept · 2011-03-25T00:15:37.298Z · LW(p) · GW(p)

after we've got that working, we could then figure out how to get the user to describe the ruleset to the computer in a flexible way. That's actually a Tough Problem, BTW. It's basically forming a mini-language... so definitely on the books, but probably not the first iteration. :)

Yeah, I realized that as I was writing the longer example, and also that it wasn't strictly necessary. Interesting, but not necessary. =)

Your description of phase 1 prediction coding is very close to what I was picturing, and having a randomized set of questions rather than just saying "predict the final state" (in entirety) would give more game repeatability for less code if I understand correctly.

I actually really like the idea of having them just give a probability estimate the first time, or the first few times. I'm betting that will make for an increased effect of confirmation bias in those stages, and that their scores will improve when they're forced to itemize evidence weights - which illustrates a point about confirmation bias as well as tying into the kind of thought process needed for Bayesian prediction.

(If you were to get as far as trying to code the user-described ruleset bit... I'd suggest finding someone who's played Dragon Age and asking about the custom tactics options. I think that sort of format would work, as long as the number of total types of game objects and operators stayed relatively small.)

comment by Gray · 2011-03-22T21:47:16.416Z · LW(p) · GW(p)

Do existential graphs count?

comment by taryneast · 2011-03-23T15:59:42.294Z · LW(p) · GW(p)

I don't know - do they? :)

How would you turn that paper into the format of a small game?

comment by Gray · 2011-03-23T22:55:14.130Z · LW(p) · GW(p)

Eh, good point. I'm still learning them myself, but they are sort of gamey in that they are a visual/diagrammatic way of representing predicate and/or modal logic, but I'm not sure what winning or losing would correspond to. Peirce even suggests it as a sort of game between the proposer and the doubter of the proposition, as the two sides take turns trying to prove either the argument is valid or invalid.

comment by taryneast · 2011-03-24T17:54:56.231Z · LW(p) · GW(p)

Sounds interesting. Could indeed be game-potential in there. You'll probably be in a better position to come up with them if there are... if you spot any good ones, let us know :)

comment by Gray · 2011-03-24T18:03:03.703Z · LW(p) · GW(p)

Yeah, this is the paper I've been studying for some time now, and I've been starting to draw my own existential graphs. I'm just not good enough at it yet to say either way on this topic though.