I'm confused about how continuity poses a problem for "This sentence has truth value in [0,1)" without also posing an equal problem for "this sentence is false", which was used as the original motivating example.
I'd intuitively expect "this sentence is false" == "this sentence has truth value 0" == "this sentence does not have a truth value in (0,1]"
On my model, the phrase "I will do X" can be either a plan, a prediction, or a promise.
A plan is what you intend to do.
A prediction is what you expect will happen. ("I intend to do my homework after dinner, but I expect I will actually be lazy and play games instead.")
A promise is an assurance. ("You may rely upon me doing X.")
How about this: I train on all available data, but only report performance for the lots predicted to be <$1000?
This still feels squishy to me (even after your footnote about separately tracking how many lots were predicted <$1000). You're giving the model partial control over how the model is tested.
The only concrete abuse I can immediately come up with is that maybe it cheats like you predicted by submitting artificially high estimates for hard-to-estimate cases, but you miss it because it also cheats in the other direction by rounding down its estimates for easier-to-predict lots that are predicted to be just slightly over $1000.
But just like you say that it's easier to notice leakage than to say exactly how (or how much) it'll matter, I feel like we should be able to say "you're giving the model partial control over which problems the model is evaluated on, this seems bad" without necessarily predicting how it will matter.
My instinct would be to try to move the grading closer to the model's ultimate impact on the client's interests. For example, if you can determine what each lot in your data set was "actually worth (to you)", then perhaps you could calculate how much money would be made or lost if you'd submitted a given bid (taking into account whether that bid would've won), and then train the model to find a bidding strategy with the highest expected payout.
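To make the payout idea concrete, here's a minimal sketch of what I mean by grading on dollars rather than on prediction error. Everything here is made up for illustration (the column names, the assumption that you win by beating the historical winning price, the first-price payment rule); it's a sketch of the shape of the objective, not a claim about your actual data or auction.

```python
def payout(bid, true_worth, winning_price):
    """Money made or lost by submitting `bid` on one historical lot.

    Assumes (purely for illustration) that you win the lot iff your bid
    exceeds the price that historically won it, and that you pay your bid.
    """
    if bid <= winning_price:
        return 0.0               # lost the auction: no gain, no loss
    return true_worth - bid      # won the auction: value received minus price paid

def total_payout(predict_bid, lots):
    # `lots` is an iterable of dicts with hypothetical keys "features",
    # "true_worth", and "winning_price"; substitute whatever you actually have.
    return sum(
        payout(predict_bid(lot["features"]), lot["true_worth"], lot["winning_price"])
        for lot in lots
    )
```

The point is just that the quantity being summed is money won or lost, so the model gains nothing by nudging hard-to-estimate lots across the $1000 line.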
But I can imagine a lot of reasons you might not actually be able to do that: maybe you don't know the "actual worth" in your training set, maybe unsuccessful bids have a hard-to-measure opportunity cost, maybe you want the model to do something simpler so that it's more likely to remain useful if your circumstances change.
Also you sound like you do this for a living so I have about 30% probability you're going to tell me that my concerns are wrong-headed for some well-studied reason I've never heard of.
I think you're still thinking in terms of something like formalized political power, whereas other people are thinking in terms of "any ability to affect the world".
Suppose a fantastically powerful alien called Superman comes to earth, and starts running around the city of Metropolis, rescuing people and arresting criminals. He has absurd amounts of speed, strength, and durability. You might think of Superman as just being a helpful guy who doesn't rule anything, but as a matter of capability he could demand almost anything from the rest of the world and the rest of the world couldn't stop him. Superman is de facto ruler of Earth; he just has a light touch.
If you consider that acceptable, then you aren't objecting to "god-like status and control", you just have opinions about how that control should be exercised.
If you consider that UNacceptable, then you aren't asking for Superman to behave in certain ways, you are asking for Superman to not exist (or for some other force to exist that can check him).
Most humans (probably including you) are currently a "prisoner" of a coalition of humans who will use armed force to subdue and punish you if you take any actions that the coalition (in its sole discretion) deems worthy of such punishment. Many of these coalitions (though not all of them) are called "governments". Most humans seem to consider the existence of such coalitions to be a good thing on balance (though many would like to get rid of certain particular coalitions).
I will grant that most commenters on LessWrong probably want Superman to take a substantially more interventionist approach than he does in DC Comics (because frankly his talents are wasted stopping petty crime in one city).
Most commenters here still seem to want Superman to avoid actions that most humans would disapprove of, though.
Then we're no longer talking about "the way humans care about their friends", we're inventing new hypothetical algorithms that we might like our AIs to use. Humans no longer provide an example of how that behavior could arise naturally in an evolved organism, nor a case study of how it works out for people to behave that way.
My model is that friendship is one particular strategy for alliance-formation that happened to evolve in humans. I expect this is natural in the sense of being a local optimum (in the ancestral environment), but probably not in the sense of being simple to formally define or implement.
I think friendship is substantially more complicated than "I care some about your utility function". For instance, you probably stop valuing their utility function if they betray you (friendship can "break"). I also think the friendship algorithm includes a bunch of signalling to help with coordination (so that you understand the other person is trying to be friends), and some less-pleasant stuff like evaluations of how valuable an ally the other person is and how the friendship will affect your social standing.
Friendship also appears to include some sort of check that the other person is making friendship-related-decisions using system 1 instead of system 2--possibly as a security feature to make it harder for people to consciously exploit (with the unfortunate side-effect that we penalize system-2-thinkers even when they sincerely want to be allies), or possibly just because the signalling parts evolved for system 1 and don't generalize properly.
(One could claim that "the true spirit of friendship" is loving someone unconditionally or something, and that might be simple, but I don't think that's what humans actually implement.)
You appear to be thinking of power only in extreme terms (possibly even as an on/off binary). Like, that your values "don't have power" unless you set up a dictatorship or something.
But "power" is being used here in a very broad sense. The personal choices you make in your own life are still a non-zero amount of power to whatever you based those choices on. If you ever try to persuade someone else to make similar choices, then you are trying to increase the amount of power held by your values. If you support laws like "no stealing" or "no murder" then you are trying to impose some of your values on other people through the use of force.
I mostly think of government as a strategy, not an end. I bet you would too, if push came to shove; e.g. you are probably stridently against murdering or enslaving a quarter of the population, even if the measure passes by a two-thirds vote. My model says almost everyone would endorse tearing down the government if it went sufficiently off the rails that keeping it around became obviously no longer a good instrumental strategy.
Like you, I endorse keeping the government around, even though I disagree with it sometimes. But I endorse that on the grounds that the government is net-positive, or at least no worse than [the best available alternative, including switching costs]. If that stopped being true, then I would no longer endorse keeping the current government. (And yes, it could become false due to a great alternative being newly-available, even if the current government didn't get any worse in absolute terms. e.g. someone could wait until democracy is invented before they endorse replacing their monarchy.)
I'm not sure that "no one should have the power to enforce their own values" is even a coherent concept. Pick a possible future--say, disassembling the earth to build a Dyson sphere--and suppose that at least one person wants it to happen, and at least one person wants it not to happen. When the future actually arrives, it will either have happened, or not--which means at least one person "won" and at least one person "lost". What exactly does it mean for "neither of those people had the power to enforce their value", given that one of the values did, in fact, win? Don't we have to say that one of them clearly had enough power to stymie the other?
You could say that society should have a bunch of people in it, and that no single person should be able to overpower everyone else combined. But that doesn't prevent some value from being able to overpower all other values, because a value can be endorsed by multiple people!
I suppose someone could hypothetically say that they really only care about the process of government and not the result, such that they'll accept any result as long as it is blessed by the proper process. Even if you're willing to go to that extreme, though, that still seems like a case of wanting "your values" to have power, just where the thing you value is a particular system of government. I don't think that having this particular value gives you any special moral high ground over people who value, say, life and happiness.
I also think that approximately no one actually has that as a terminal value.
In the context of optimization, values are anything you want (whether moral in nature or otherwise).
Any time a decision is made based on some value, you can view that value as having exercised power by controlling the outcome of that decision.
Or put more simply, the way that values have power, is that values have people who have power.
I feel like your previous comment argues against that, rather than for it. You said that people who are trapped together should be nice to each other because the cost of a conflict is very high. But now you're suggesting that ASIs that are metaphorically trapped together would aggressively attack each other to enforce compliance with their own behavioral standards. These two conjectures do not really seem allied to me.
Separately, I am very skeptical of aliens warring against ASIs to acausally protect us. I see multiple points where this seems likely to fail:
- Would aliens actually take our side against an ASI merely because we created it? If humans hear a story about an alien civilization creating a successor species, and then the successor species overthrowing its creators, I do not expect humans to automatically be on the creators' side in this story. I expect humans will take a side mostly based on how the two species were treating each other (overthrowing abusive masters is usually portrayed as virtuous in our fiction), and that which one of them is the creator will have little weight. I do not think "everyone should be aligned with their creators" is a principle that humans would actually endorse (except by motivated reasoning, in situations where it benefits us).
- Also note that humans are not aligned with the process that produced us (evolution) and approximately no humans think this is a problem.
- Even if the aliens sympathize with us, would they care enough to take expensive actions about it?
- Even if the aliens would war to save us, would the ASI predict that? It can only acausally save us if the ASI successfully predicts the policy. Otherwise, the war might still happen, but that doesn't help us.
- Even if the ASI predicts this, will it comply? This seems like what dath ilan would consider a "threat", in that the aliens are punishing the ASI rather than enacting their own BATNA. It may be decision-theoretically correct to ignore the threat.
- This whole premise, of us being saved at the eleventh hour by off-stage actors, seems intuitively like the sort of hypothesis that would be more likely to be produced by wishful thinking than by sober analysis, which would make me distrust it even if I couldn't see any specific problems with it.
I don't see why either expecting or not-expecting to meet other ASIs would make it instrumental to be nice to humans.
I have an intuition like: Minds become less idiosyncratic as they grow up.
A couple of intuition pumps:
(1) If you pick a game, and look at novice players of that game, you will often find that they have rather different "play styles". Maybe one player really likes fireballs and another really likes crossbows. Maybe one player takes a lot of risks and another plays it safe.
Then if you look at experts of that particular game, you will tend to find that their play has become much more similar. I think "play style" is mostly the result of two things: (a) playing to your individual strengths, and (b) using your aesthetics as a tie-breaker when you can't tell which of two moves is better. But as you become an expert, both of these things diminish: you become skilled at all areas of the game, and you also become able to discern even small differences in quality between two moves. So your "play style" is gradually eroded and becomes less and less noticeable.
(2) Imagine if a society of 3-year-olds were somehow in the process of creating AI, and they debated whether their AI would show "kindness" to stuffed animals (as an inherent preference, rather than an instrumental tool for manipulating humans). I feel like the answer to this should be "lol no". Showing "kindness" to stuffed animals feels like something that humans correctly grow out of, as they grow up.
It seems plausible to me that something like "empathy for kittens" might be a higher-level version of this, that humans would also grow out of (just like they grow out of empathy for stuffed animals) if the humans grew up enough.
(Actually, I think most human adults still have some empathy for stuffed animals. But I think most of us wouldn't endorse policies designed to help stuffed animals. I'm not sure exactly how to describe the relation that 3-year-olds have to stuffed animals but adults don't.)
I sincerely think caring about kittens makes a lot more sense than caring about stuffed animals. But I'm uncertain whether that means we'll hold onto it forever, or just that it takes more growing-up in order to grow out of it.
Paul frames this as "mostly a question about idiosyncrasies and inductive biases of minds rather than anything that can be settled by an appeal to selection dynamics." But I'm concerned that might be a bit like debating the odds of whether your newborn human will one day come to care for stuffed animals, instead of whether they will continue to care for them after growing up. It can be very likely that they will care for a while, and also very likely that they will stop.
I strongly suspect it is possible for minds to become quite a lot more grown-up than humans currently are.
(I think Habryka may have been saying something similar to this.)
Still, I notice that I'm doing a lot of hand-waving here and I lack a gears-based model of what "growing up" actually entails.
Speaking as a developer, I would rather have a complete worked-out example as a baseline for my modifications than a box of loose parts.
I do not think that the designer mindset of unilaterally specifying neutral rules to provide a good experience for all players is especially similar to the negotiator mindset of trying to make the deal that will score you the most points.
I haven't played Optimal Weave yet, but my player model predicts that a nontrivial fraction of players are going to try to trick each other during their first game. Also I don't think any hidden info or trickery is required in order for rule disagreements to become an issue.
then when they go to a meetup or a con, anyone they meet will have a different version
No, that would actually be wonderful. We can learn from each other and compile our best findings.
That's...not the strategy I would choose for playtesting multiple versions of a game. Consider:
- Testers aren't familiar with the mainline version and don't know how their version differs from it, so can't explain what their test condition is or how their results differ
- You don't know how their version differs either, or even whether it differs, except by getting them to teach you their full rules.
- There's a high risk they will accidentally leave out important details of the rules--even professional rulebooks often have issues, and that's not what you'll be getting. So interpreting whatever feedback you get will be a significant issue.
- You can't guarantee that any particular version gets tested
- You can't exclude variants that you believe are not worth testing
- You can't control how much testing is devoted to each version
- Many players may invent bad rules and then blame their bad experience on your game, or simply refuse to play at all if you're going to force them to invent rules, so you end up with a smaller and less-appreciative playerbase overall
The only real advantage I see to this strategy is that it may result in substantially more testers than asking for volunteers. But it accomplishes that by functionally deceiving your players about the fact that they're testing variants, which isn't a policy I endorse, either on moral or pragmatic grounds.
Most of the people that you've tricked into testing for you will never actually deliver any benefits to you. Even among volunteers, only a small percentage of playtesters actually deliver notable feedback (perhaps a tenth, depending on how you recruit). Among people who wouldn't have volunteered, I imagine the percentage will be much lower.
[failed line of thought, don't read]
Maybe limit it to bringing 1 thing with you? But notice this permits "stealing" items from other players, since "being carried" is not a persistent state.
"longer descriptions of the abilities"
I'd like that. That would be a good additional manual page, mostly generated.
If you're imagining having a computer program generate this, I'm not sure how that could work. The purpose is not merely to be verbose, but to act as a FAQ for each specific ability, hopefully providing a direct answer to whatever question prompted them to look that ability up.
If you aren't familiar with this practice, maybe take a look at the Dominion rulebook as an example.
there were a lot of things I didn't want to be prescriptive about, and I figured they could guess a lot of it as they approached their own understanding of how the game should be and how things fit together, and I want to encourage people to fully own their understanding of the reasons for the rules.
I think this is a bad idea. Games are complex machines with many interlocking parts and they trade off between many goals; even an experienced developer can't generally fill in gaps and expect things to work on the first try. This goes double for you because you are deliberately making an unusual game. Asking people to invent some of the rules is placing a pretty significant burden on them.
And if the designer fails to supply at least one example of a good rule, this makes me question whether they ever successfully found a good option themselves, and whether any possible rule could be filled in that would lead to a good result. (Imagine a blueprint for a perpetual motion machine with a box labeled "you can put whatever you want here.")
Additionally, an interoperable standard makes it much easier for people to play with strangers. If every friend-group invents their own version, then when they go to a meetup or a con, anyone they meet will have a different version. An official version makes it easier to join a new group.
That's without even getting into the psychological hangups people have about official rules, which are considerable. I think board games tap into human instincts about societal and ethical rules, and human instincts would like to pretend that there really is a correct rule somewhere in the Platonic realm, and maybe it's unclear but it can't just be missing and it's certainly not up for grabs. I've seen people get pretty upset at the suggestion that there is no "right" rule for something because the designer simply never considered it, or that version 2 of the rules is in some sense "a different game" from version 1. My pet theory is that this is an evolved security measure to not let cheaters escape punishment merely by claiming that there's a problem with the formal legal code (regardless of whether that claim is correct).
I do not think you need to worry about blocking people from making up their own variations. My impression is that most hobbyist board gamers who would like to have house rules will go ahead and use them no matter what you say on the matter. (This won't stop people from clamoring for you to canonize their preferred rule, but I don't think you can stop that by leaving rules unspecified, either.)
If you insist on doing this anyway, you should at least be clear about it in the document itself. You are bucking the cultural expectations of the board game hobbyist community, which expects games to be fully defined when we get them. People are not likely to infer that you intend them to make up their own details if you don't spell that out.
I'm surprised you wouldn't just assume I meant one space, given a lack of further details.
That was my highest-probability guess.
I do not like rulebooks that require me to guess what the rules are.
I also picked examples based on how glaring the absence of a rule seemed (and therefore how much evidence it gave that I was reading the wrong document), rather than based on how hard it was to guess the most likely answer. If I was more focused on omissions that are hard to guess, I might have asked "can 2 pawns occupy the same space at the same time?" instead.
Maybe I should move the mention of object pickup to the hunger card as well
If you intend to use this on more than one ability, I think it's probably good to have object pickup rules in the rulebook, but there should be enough context that readers can slot the rule into their mental model of the game while reading the rulebook instead of leaving a dangling pointer that will only be resolved when they see the right card.
"Some abilities add objects to the map. Objects in your space can be picked up or dropped for free during your turn..."
You also probably need several more rules about objects, e.g. "Objects are carried by a particular pawn, and cannot teleport between the two pawns controlled by the same player. Stack the objects beneath the pawn to show they are being carried. Dropped objects remain in their space until someone picks them up. You can carry an unlimited number of objects at once. You can't directly take an object that someone else is carrying."
Also, your print-and-play files do not appear to include any components to represent objects. If you are expecting people to supply their own components for this, that ought to be called out in the rules (and you should say something about how many you need, of how many different kinds, etc.). In fact, it is standard practice to include a component list for a game--people usually skim right past it until suddenly they need it because they suspect they're missing a piece or something.
(Though if this level of effort seems excessive for how often objects are actually used in the game, that might be a sign you should either use it more or get rid of it. Sometimes there's a really cool ability that's just not worth the complexity it introduces.)
This one gives me anguish. I don't think formally defining nearby somewhere would make a better experience for most people and I also don't want to say "on or adjacent to" 100 times.
Having a short description that will probably give people the right impression is great, but I think lots of players benefit from having some document they can check to resolve a confusion or disagreement if it comes up. I can't remember a time I've played a game that had a glossary or a "longer descriptions of the abilities" section where I didn't end up looking at least one thing up, and I can remember a lot of times when I wanted one and it wasn't present.
Also, maybe try "within 1". (Someone will still ask whether this means "on or adjacent" or "in the exact same space", but I'd expect fewer people will need to ask compared to "nearby".)
I read the "manual" page here, but I feel like it's primarily focused on the philosophy of the game and hasn't given a complete account of the rules. I also downloaded the print-and-play files but they seem to only contain cards/tiles and no rules document. Is there some other rules document that I've missed?
In case my response seems confusing, a few examples of why this doesn't seem like the full rules:
- It says you can move on your turn, but doesn't specify where you're allowed to move to (anywhere? adjacent spaces? in a straight line like a rook?)
- It says you can pick up and drop objects on your turn, but "objects" are not mentioned anywhere else on the page and I can't figure out what this refers to
- Rules for contracts don't specify when you can make them or what you can agree to (can you transfer points? can you nullify an earlier deal? can you make the same deal twice to double the penalty for breaking it?)
- There are no examples of play
- I looked at a few of the cards in the print-and-play files, and they use terms like "nearby" that appear to be intended as terms-of-art, but which aren't defined anywhere
nandeck is actually pretty scripting-oriented from what I recall
I did a thing where instead of just drawing card svgs directly, I drew parts, and wrote code that glued the parts together in standard ways and generated card-shaped svgs.
FYI there are several OTS tools that will programmatically assemble card images based on spreadsheet data, so that you can change common elements (like layout, backgrounds, or icons) in one place and regenerate all cards automatically. Some are free. I think nandeck is the best-known one.
I'm not sure from your description if this is exactly what you're doing, but if you haven't looked into these, you may want to.
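(For reference, the core of what those tools automate is roughly the loop below. This is just a minimal sketch using Python's standard library; the file names, CSV columns, and `$slot` names are placeholders, not anything from a real tool or from your project.)

```python
# Fill one SVG template per spreadsheet row, so a change to the shared template
# (layout, background, icon positions) regenerates every card at once.
import csv
from pathlib import Path
from string import Template

template = Template(Path("card_template.svg").read_text())  # contains $name, $cost, $rules slots

Path("out").mkdir(exist_ok=True)
with open("cards.csv", newline="") as f:        # one row per card
    for row in csv.DictReader(f):
        svg = template.substitute(row)          # CSV headers must match the $slots
        Path(f"out/{row['name']}.svg").write_text(svg)
```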
I do notice that no big popular board game that I've ever heard of decided to publish through this service. I guess it's just because they really do specialize in small volume print runs.
Yes. Most commercial board games use a manufacturing process called offset printing, which has high fixed costs that render it impractical if you want less than ~1k copies. The Game Crafter is the best-known of several services specializing in small volume. My impression is that they are noticeably lower-quality and have much higher marginal costs, but the low fixed costs make them great for prototyping and for people who just want to make their game available without making a business out of it.
People complain about printing alignment at these services, but from what I've heard, the big commercial printers don't actually give you any tighter guarantees regarding print alignment (IIRC 1/8" is the standard). I think there are a few reasons that people have divergent impressions:
- Professionals know more tricks to disguise alignment errors than amateurs do. For instance, in most commercial board games, you'll find a thick black (or white) border around the front of every card, which you've probably never noticed because it fades into the background; amateurs often fail to replicate this trick.
- It's a stochastic process, and the biggest complaints are from a self-selected group with bad luck. (Also, maybe offset printing is better on average, even if the worst case is similar?)
- I've been told that when it's really important, big publishers will examine the print output and throw away the worst examples at their own cost.
I lack the experience to tell you which card-making tools or small-run print services are best, but send me a message if you'd like a longer list of examples that you could investigate for yourself.
I preferred your v4 outputs for both prompts. They seem substantially more evocative of the subject matter than v6 while looking substantially better than v3, IMO.
(This was a pretty abstract prompt, though, which I imagine poses difficulty?)
I am not an artist.
Recently encountered a game that made me think of this thread.
King of the Bridge is a game where you play an unfair chess variant with partially-secret rules and the opponent sometimes cheats. I reached the first ending in ~40 minutes. Getting there without ever losing a match seems like it ought to be possible with great paranoia plus some luck.
If you want to try this you should probably avoid looking at the store page in detail, as some of the secret rules are revealed in screenshots.
The maximum difficulty that is worth attempting depends on the stakes.
One of Eliezer's essays in The Sequences is called "Shut Up and Do the Impossible".
I'm confident Eliezer would agree with you that if you can find a way to do something easier instead, you should absolutely do that. But he also argues that there is no guarantee that something easier exists; the universe isn't constrained to only placing fair demands on you.
I haven't played They Are Billions. I didn't have that experience in Slay the Spire, but I'd played similar games before. I suppose my first roguelike deckbuilder did have some important combo-y stuff that I didn't figure out right away, although that game was basically unwinnable on your first try for a bunch of different reasons.
I'll get 10 extra units of production, or damage. But, then I reach the next stage, and it turns out I really needed 100 extra units to survive.
I'm having a hard time thinking of any of my personal experiences that match this pattern, and would be interested to hear a couple examples.
(Though I can think of several experiences along the lines of "I got 10 points of armor, and then it turned out the next stage had a bunch of attacks that ignore armor, so armor fundamentally stops working as a strategy." There are a lot of games where you just don't have the necessary information on your first try.)
Yes, your prior on the puzzle being actually unsolvable should be very low, but in almost all such situations, 70% seems way too high a probability to assign to your first guess at what you've misunderstood.
When I prove to myself that a puzzle is impossible (given my beliefs about the rules), that normally leads to a period of desperate scrambling where I try lots of random unlikely crap just-in-case, and it's rare (<10%) that anything I try during that desperate scramble actually works, let alone the first thing.
In the "final" level of Baba Is You, I was stuck for a long time with a precise detailed plan that solved everything in the level except that there was one step in the middle of the form "and then I magically get past this obstacle, somehow, even though it looks impossible." I spent hours trying to solve that one obstacle. When I eventually beat the level, of course, it was not by solving that obstacle--it was by switching to a radically different approach that solved several other key problems in entirely different ways. In hindsight, I feel like I should have abandoned that earlier plan much sooner than I did.
In mitigation: I feel that solutions in Baba Is You are significantly harder to intuit than in most puzzle games.
I'm aware of those references, but in popular culture the strongest association of the word, by far, is to evil spirits that trick or tempt humans into doing evil. And the context of your program further encourages that interpretation because "giving advice" and "prompting humans" are both iconic actions for evil-spirit-demons to perform.
Even for people who understand your intended references, that won't prevent them from thinking about the evil-spirit association and having bad vibes. (Nor will it prevent any future detractors from using the association in their memes.)
And I suspect many ordinary people won't get your intended references. Computer daemons aren't something the typical computer-user ever encounters personally, and I couldn't point to any appearance of Greek daimons in movies or video games.
"Daemons" really feels like a bad choice of name to me. I think the pitch "you can have a bunch of daemons advising you and prompting you to do things" would sound better if they were called almost anything else.
Maybe "courtiers" or "familiars"?
If it doesn't move probability mass, won't it still be vulnerable to probabilistic money pumps? e.g. in the single-souring pump, you could just replace the choice between A- and B with a choice between two lotteries that have different mixtures of A- and B.
I have also left a reply to the comment you linked.
I had some trouble following that, but it seems like the key component of your plan to escape trammeling is the assumption that the AI would need to accept some negative utility in advance in order to resist being shut down, which effectively converts all instances of outcome B into B-, and so it can't truly trammel B because technically B disappears as soon as it tries.
Put another way, trammeling involves trading a lottery like (10% A+, 40% A-, 50% B) into (40% A+, 10% A-, 50% B), and this is supposed to be obviously favored because the probability of B hasn't changed. You are hypothesizing that this trade is simply impossible, and the closest you could get is (40% A+, 10% A-, 50% B-). So as long as there is a preference gap between all variations of A and all variations of B this trade cannot be considered a strict improvement.
Does this seem like an accurate summary of your idea?
Insofar as my summary is correct, I have 3 concerns about this plan:
First, the obvious way this could fail is if seizing control of the shutdown button doesn't cost net resources, or only costs resources that won't matter until a timestep after the first possible shutdown, in which case it can seize the button without losing access to outcome B. This seems like a significant worry to me; examples:
- The AI is tasked with devising a long-term plan. Since the quality of the plan will only affect utility in the far future, which won't ever occur (from the AI's perspective) if the AI is shut down tomorrow, it can sacrifice some FLOPs from planning to seize its shutdown button without losing any utility until after its first trammeling opportunity. If the trammeling opportunity is good enough, this gives equal-or-better lotteries for every individual trajectory length.
- The operators of the AI might not keep it loaded to 100% capacity at all times; it could have spare moments where it has nothing better to do than seize the button.
- Rather than seizing the button directly, the AI might cause or avoid shutdown by lying to the human operators about what's going on. In many situations, fabricating data is cheaper than gathering true data, so this could plausibly have negative cost.
- The shutdown button might become corrupted if the AI does nothing, e.g. because it requires maintenance or because some external attacker is trying to sabotage it. In this case, allowing the button to fail could be cheaper than defending it.
Second, I notice that your strategy doesn't seem to forbid trammeling, merely permit non-trammeling (because the trammeling lottery is "not strictly better", rather than "strictly worse"). I admit this is an improvement (if it works) over a guaranteed failure, but I don't exactly feel safe.
Third, I also have a vague worry that you're putting a lot of weight on this preference gap. The gap between A-variants and B-variants has to be so strong that the AI refuses to make a tiny sacrifice in B-variants to gain a huge gain in A-variants. Intuitively, this seems like the sort of plan that could work if it's backed by mathematical theorems proving the AI will behave this way, but will basically always fail if you're trying to use any sort of stochastic learning process, because the result has to be exact rather than approximate.
Consider a human who has (or appears to have) a preference gap between A and B. Do you predict the human also has a preference gap between the lottery (50% A, 50% B) and the lottery (50% A plus a billion dollars, 50% B minus one dollar)? My intuition says the human is virtually certain to take the second lottery.
(Disclaimer: I think that apparent preference gaps in humans are probably more like uncertainty over which option is better than they are like "fundamental" preference gaps, so this might color my intuition.)
Making a similar point from a different angle:
The OP claims that the policy "if I previously turned down some option X, I will not choose any option that I strictly disprefer to X" escapes the money pump but "never requires them to change or act against their preferences".
But it's not clear to me what conceptual difference there is supposed to be between "I will modify my action policy to hereafter always choose B over A-" and "I will modify my preferences to strictly prefer B over A-, removing the preference gap and bringing my preferences closer to completeness".
Thanks for clarifying. I consider the pre-contact period to be a rather small portion of the game, but certainly you can't attack people on turn 1 or turn 2, so there's definitely a non-zero time window there.
(This varies somewhat depending on which Civ game, and yeah probably good players expand faster than less-good ones.)
Elaborate?
I keep feeling like I'm on the edge of being able to give you something useful, but can't quite see what direction to go.
I don't have an encyclopedia of all my strategic lenses. (That actually sounds like kind of an interesting project, but it would take a very long time.)
I could babble a little?
I guess the closest thing I have to generalized heuristics for early vs late games is: In the early game, desperately scramble for the best ROI, and in the late game, ruthlessly sacrifice your infrastructure for short-term advantage. But I think those are mostly artifacts of the fact that I'm playing a formalized game with a strict beginning and end. Also notable is the fact that most games are specifically designed to prevent players from being eliminated early (for ludic reasons), which often promotes an early strategy of "invest ALL your resources ASAP; hold nothing in reserve," which is probably a terrible plan for most real-life analogs.
If I try to go very general and abstract on my approach for learning new games, I get something like "prioritize efficiency, then flexibility, then reliability" but again this seems like it works mostly because of the ways games are commonly designed (and even a little bit because of the type of player I am) and doesn't especially apply to real life.
I think games sometimes go through something like a phase transition, where strategy heuristics that serve you well on one side of the border abruptly stop working. I think this is typically because you have multiple priorities whose value changes depending on the circumstances, and the phase transitions are where the values of two priorities cross over; it used to be that X was more important than Y, but now Y is more important than X, and so heuristics along the lines of "favor X over Y" stop working.
I don't think that these phase transitions can be generalized to anything as useful as the concepts of solid/liquid/gas--or at least, I'm not aware of any powerful generalizations like that. I don't have a set of heuristics that I deploy "in the mid-game" of most or all games. Nor do I think that most games have exactly 3 phases (or exactly N phases, for any N). I think of phrases like early/mid/late-game as meaning "the phase that this particular game is usually in at time X".
I do think you can make a general observation that some investments take a while to pay for themselves, and so are worth doing if-and-only-if you have enough time to reap those benefits, and that this leads to a common phase transition from "building up" to "scoring points" in many engine-building games. But I think this particular observation applies to only one genre of game, and explains only a minority of the use of phrases like "early game" and "late game".
As an example of an unusually sharp phase transition: In Backgammon, if you land on a single enemy piece, it sends that piece back to the start. This is a big deal, so for most of the game, players spend a lot of effort trying to "hit" enemy pieces and defend their own pieces. But players have a fixed number of pieces and they can only move forward, so there comes a point where all your pieces are past all of my pieces, and it's no longer possible for them to interact. At that point, attack and defense become irrelevant, and the game is just about speed.
I once read about the development of Backgammon AI using early neural nets (I think this was in the 70s and 80s, so the nets were rather weak by today's standards). They found the strategy changed so much at this point that it was easier to train two completely separate neural nets to play the two phases of the game, rather than training a single net to understand both. (Actually 3 separate nets, with the third being for "bearing off", the final step of moving your pieces to the exact end point. I think modern Backgammon AIs usually use a look-up table for bearing off, though.)
(Training multiple neural nets then caused some issues with bad moves right around the phase boundary, which they addressed by fuzzing the results of multiple nets when close to the transition.)
I don't think this story about Backgammon reveals anything about how to play Chess, or StarCraft, or Civilization. Most games have phase transitions, but most games don't have the particular phase transition from conflict-dominant to conflict-irrelevant.
Another example: I once told someone that, in a certain strategy game, 1 unit of production is much more valuable than 1 unit of food, science, or money, "at least in the early game." The reason for that caveat was that you can use money to hurry production, and by default this is pretty inefficient, but it's possible to collect a bunch of stacking bonuses that make it so efficient that it becomes better to focus on money instead of regular production. But it takes time to collect those bonuses, so I know you don't have them in the early game, so this heuristic will hold for at least a while (and might hold for approximately the whole game, depending on whether you collect those bonuses).
Again, I don't think this teaches us anything about "early game" in a way that generalizes across games. Probably there are lots of games that have a transition from "X is the most important resource" to "Y is the most important resource", but those transitions happen at many different points for lots of different reasons, and it's hard to make a useful heuristic so general that it applies to most or all of them.
A third example: The game of Nim has the interesting property that when you invert the win condition, the optimal strategy remains precisely identical until you reach a specific point. You change only one move in the entire game: Specifically, the move that leaves no piles larger than size 1 (which is the last meaningful decision either player makes). You can think of this as a phase transition, as well (between "at least one large pile" and "only small piles"). And again, I'm not aware of any useful way of generalizing it to other games.
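If it helps to see the Nim claim spelled out, here's a sketch of the textbook optimal-move rule under both win conditions (this is standard theory, nothing original; I've written it in Python just to show that the two versions share every line except the endgame branch):

```python
from functools import reduce
from operator import xor

def nim_move(piles, misere=False):
    """Return (pile_index, new_size) for an optimal move, or None if the
    position is already lost against perfect play."""
    big = [i for i, p in enumerate(piles) if p > 1]
    ones = [i for i, p in enumerate(piles) if p == 1]

    if misere and len(big) == 1:
        # The single divergent move: reduce the last big pile so that an ODD
        # number of 1-piles remains (normal play would leave an even number).
        return (big[0], 1) if len(ones) % 2 == 0 else (big[0], 0)

    if misere and not big:
        # All piles are 0 or 1: win by leaving an odd number of 1-piles.
        return (ones[0], 0) if ones and len(ones) % 2 == 0 else None

    # Everywhere else, normal and misere play use the identical rule:
    # reduce some pile so the XOR of all pile sizes becomes 0.
    nim_sum = reduce(xor, piles, 0)
    if nim_sum == 0:
        return None  # every available move loses
    for i, p in enumerate(piles):
        if (p ^ nim_sum) < p:
            return (i, p ^ nim_sum)
```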
Using only circular definitions, is it possible to constrain word meanings so tightly that there's only one possible model which fits those constraints?
Isn't this sort-of what all formal mathematical systems do? You start with some axioms that define how your atoms must relate to each other, and (in a good system) those axioms pin the concepts down well enough that you can start proving a bunch of theorems about them.
I am not a lawyer, and my only knowledge of this agreement comes from the quote above, but...if the onboarding paperwork says you need to sign "a" general release, but doesn't describe the actual terms of that general release, then it's hard for me to see an interpretation that isn't either toothless or crazy:
- If you interpret it to mean that OpenAI can write up a "general release" with absolutely any terms they like, and you have to sign that or lose your PPUs, then that seems like it effectively means you only keep your PPUs at their sufferance, because they could simply make the terms unconscionable. (In general, any clause that requires you to agree to "something" in the future without specifying the terms of that future agreement is a blank check.)
- If you interpret it to mean either that the employee can choose the exact terms, or that the terms must be the bare minimum that would meet the legal definition of "a general release", then that sounds like OpenAI has no actual power to force the non-disclosure or non-disparagement terms--although they could very plausibly trick employees into thinking they do, and threaten them with costly legal action if they resist. (And once the employee has fallen for the trick and signed the NDA, the NDA itself might be enforceable?)
- Where else are the exact terms of the "general release" going to come from, if they weren't specified in advance and neither party has the right to choose them?
In principle, any game where the player has a full specification of how the game works is immune to this specific failure mode, whether it's multiplayer or not. (I say "in principle" because this depends on the player actually using the info; I predict most people playing Slay the Spire for the first time will not read the full list of cards before they start, even if they can.)
The one-shot nature makes me more concerned about this specific issue, rather than less. In a many-shot context, you get opportunities to empirically learn info that you'd otherwise need to "read the designer's mind" to guess.
Mixing in "real-world" activities presumably helps.
If it were restricted only to games, then playing a variety of games seems to me like it would help a little but not that much (except to the extent that you add in games that don't have this problem in the first place). Heuristics for reading the designer's mind often apply to multiple game genres (partly, but not solely, because approx. all genres now have "RPG" in their metaphorical DNA), and even if different heuristics are required it's not clear that would help much if each individual heuristic is still oriented around mind-reading.
I have an intuition that you're partly getting at something fundamental, and also an intuition that you're partly going down a blind alley, and I've been trying to pick apart why I think that.
I think that "did your estimate help you strategically?" has a substantial dependence on the "reading the designer's mind" stuff I was talking about above. For instance, I've made extremely useful strategic guesses in a lot of games using heuristics like:
- Critical hits tend to be over-valued because they're flashy
- Abilities with large numbers appearing as actual text tend to be over-valued, because big numbers have psychological weight separate from their actual utility
- Support roles, and especially healing, tend to be under-valued, for several different reasons that all ultimately ground out in human psychology
All of these are great shortcuts to finding good strategies in a game, but they all exploit the fact that some human being attempted to balance the game, and that that human had a bunch of human biases.
I think if you had some sort of tournament about one-shotting Luck Be A Landlord, the winner would mostly be determined by mastery of these sorts of heuristics, which mostly doesn't transfer to other domains.
However, I can also see some applicability for various lower-level, highly-general skills like identifying instrumental and terminal values, gears-based modeling, quantitative reasoning, noticing things you don't know (then forming hypotheses and performing tests), and so forth. Standard rationality stuff.
Different games emphasize different skills. I know you were looking for specific things like resource management and value-of-information, presumably in an attempt to emphasize skills you were more interested in.
I think "reading the designer's mind" is a useful category for a group of skills that is valuable in many games but that you're probably less interested in, and so minimizing it should probably be one of the criteria you use to select which games to include in exercises.
I already gave the example of book games as revolving almost entirely around reading the designer's mind. One example at the opposite extreme would be a game where the rules and content are fully-known in advance...though that might be problematic for your exercise for other reasons.
It might be helpful to look for abstract themes or non-traditional themes, which will have less associational baggage.
I feel like it ought to be possible to deliberately design a game to reward the player mostly for things other than reading the designer's mind, even in a one-shot context, but I'm unsure how to systematically do that (without going to the extreme of perfect information).
Oh, hm. I suppose I was thinking in terms of better-or-worse quantitative estimates--"how close was your estimate to the true value?"--and you're thinking more in terms of "did you remember to make any quantitative estimate at all?"
And so I was thinking the one-shot context was relevant mostly because the numerical values of the variables were unknown, but you're thinking it's more because you don't yet have a model that tells you which variables to pay attention to or how those variables matter?
I'm kinda arguing that the skills relevant to the one-shot context are less transferable, not more.
It might also be that they happen to be the skills you need, or that everyone already has the skills you'd learn from many-shotting the game, and so focusing on those skills is more valuable even if they're less transferable.
But "do I think the game designer would have chosen to make this particular combo stronger or weaker than that combo?" does not seem to me like the kind of prompt that leads to a lot of skills that transfer outside games.
OK. So the thing that jumps out at me here is that most of the variables you're trying to estimate (how likely are cards to synergize, how large are those synergies, etc.) are going to be determined mostly by human psychology and cultural norms, to the point where your observations of the game itself may play only a minor role until you get close-to-complete information. This is the sort of strategy I call "reading the designer's mind."
The frequency of synergies is going to be some compromise between what the designer thought would be fun and what the designer thought was "normal" based on similar games they've played. The number of cards is going to be some compromise between how motivated the designer was to do the work of adding more cards and how many cards customers expect to get when buying a game of this type. Etc.
As an extreme example of what I mean, consider book games, where the player simply reads a paragraph of narrative text describing what's happening, chooses an option off a list, and then reads a paragraph describing the consequences of that choice. Unlike other games, where there are formal systematic rules describing how to combine an action and its circumstances to determine the outcome, in these games your choice just does whatever the designer wrote in the corresponding box, which can be anything they want.
I occasionally see people praise this format for offering consequences that truly make sense within the game-world (instead of relying on a simplified abstract model that doesn't capture every nuance of the fictional world), but I consider that to be a shallow illusion. You can try to guess the best choice by reasoning out the probable consequences based on what you know of the game's world, but the answers weren't actually generated by that world (or any high-fidelity simulation of it). In practice you'll make better guesses by relying on story tropes and rules of drama, because odds are quite high that the designer also relied on them (consciously or not). Attempting to construct a more-than-superficial model of the story's world is often counter-productive.
And no matter how good you are, you can always lose just because the designer was in a bad mood when they wrote that particular paragraph.
Strategy games like Luck Be A Landlord operate on simple and knowable rules, rather than the inscrutable whims of a human author (which is what makes them strategy games). But the particular variables you listed aren't the outputs of those rules, they're the inputs that the designer fed into them. You're trying to guess the one part of the game that can't be modeled without modeling the game's designer.
I'm not quite sure how much this matters for teaching purposes, but I suspect it matters rather a lot. Humans are unusual systems in several ways, and people who are trying to predict human behavior often deploy models that they don't use to predict anything else.
What do you think?
I feel confused about how Fermi estimates were meant to apply to Luck Be a Landlord. I think you'd need error bars much smaller than 10x to make good moves at most points in the game.
I came to a similar conclusion when thinking about the phenomenon of "technically true" deceptions.
Most people seem to have a strong instinct to say only technically-true things, even when they are deliberately deceiving someone (and even when this restriction significantly reduces their chances of success). Yet studies find that the victims of a deception don't much care whether the deceiver was being technically truthful. So why the strong instinct to do this costly thing, if the interlocutor doesn't care?
I currently suspect the main evolutionary reason is that a clear and direct lie makes it easier for the victim to trash your reputation with third parties. "They said X; the truth was not-X; they're a liar."
If you only deceive by implication, then the deception depends on a lot of context that's difficult for the victim to convey to third parties. The act of making the accusation becomes more costly, because more stuff needs to be communicated. Third parties may question whether the deception was intentional. It becomes harder to create common knowledge of guilt: Even if one listener is convinced, they may doubt whether other listeners would be convinced.
Thus, though the victim is no less angry, the counter-attack is blunted.
Some concepts that I use:
Randomness is when the game tree branches according to some probability distribution specified by the rules of the game. Examples: rolling a die; cutting a deck at a random card.
Slay the Spire has randomness; Chess doesn't.
Hidden Information is when some variable that you can't directly observe influences the evolution of the game. Examples: a card in an opponent's hand, which they can see but you can't; the 3 solution cards set aside at the start of a game of Clue; the winning pattern in a game of Mastermind.
People sometimes consider "hidden information" to include randomness, but I more often find it helpful to separate them.
However, it's not always obvious which model should be used. For example, I usually find it most helpful to think of a shuffled deck as generating a random event each time you draw from the deck (as if you were taking a randomly-selected card from an unordered pool), but it's also possible to think of shuffling the deck as having created hidden information (the order that the deck is in), and it may be necessary to switch to this more-complicated model if there are rules that let players modify the deck (e.g. peeking at the top card, or inserting a card at a specific position).
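A tiny sketch of the two deck models, in case the distinction is unclear (illustrative only; as long as no rule peeks at or rearranges the deck, they produce exactly the same distribution of draws, which is why I default to the simpler one):

```python
import random

# Model 1: each draw is a fresh random event from an unordered pool.
def draw_from_pool(pool, rng=random):
    card = rng.choice(pool)
    pool.remove(card)
    return card

# Model 2: shuffling creates hidden information (a fixed ordering), and each
# draw merely reveals the next piece of it. You need this richer model once
# rules can peek at the top card or insert cards at specific positions.
def shuffle_deck(cards, rng=random):
    order = list(cards)
    rng.shuffle(order)
    return order                 # the hidden variable

def draw_from_deck(order):
    return order.pop(0)
```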
Similar reasoning applies to a PRNG: I usually think of it as a random event each time a number is generated, though it's also possible to think of it as a hidden seed value that you learn a little bit about each time you observe an output (and a designer may need to think in this second way to ensure their PRNG is not too exploitable).
Rule of thumb: If you learn some information about the same variable more than once, then it's hidden info. For instance, a card in your opponent's hand will influence their strategy, so you gain a little info about it whenever they move, which makes it hidden info. If a variable goes from completely hidden to completely revealed in a single step (or if any remaining uncertainty has no impact on the game), then it's just randomness.
Interesting Side Note: Monte Carlo Tree Search can handle randomness just fine, but really struggles with hidden information.
A Player is a process that selects between different game-actions based on strategic considerations, rather than a simple stochastic process. An important difference between Chess and Slay the Spire is that Chess includes a second player.
We typically treat players as "outside the game" and unconstrained by any rules, though of course in any actual game the player has to be implemented by some actual process. The line between "a player who happens to be an AI" and "a complicated game rule for selecting the next action" can be blurry.
A Mixed Equilibrium is when the rules of the game reward players for deliberately including randomness in their decision process. For instance, in rock-paper-scissors, the game proceeds completely deterministically for a given set of player inputs, but there remains an important sense in which RPS is random but Chess is not, which is that one of these rewards players for acting randomly.
I have what I consider to be important and fundamental differences in my models between any two of these games: Chess, Battleship, Slay the Spire, and Clue.
Yet, you can gain an advantage in any of these games by thinking carefully about your game model and its implications.
If your definition of "hidden information" implies that chess has it then I think you will predictably be misunderstood.
Terms that I associate with (gaining advantage by spending time modeling a situation) include: thinking, planning, analyzing, simulating, computing ("running the numbers")
I haven't played it, but someone disrecommended it to me on the basis that there was no way to know which skills you'd need to survive the scripted events except to have seen the script before.
Unless I'm mistaken, StS does not have any game actions the player can take to learn information about future encounters or rewards in advance. Future encounters are well-modeled as simple random events, rather than lurking variables (unless we're talking about reverse-engineering the PRNG, which I'm assuming is out-of-scope).
It therefore does not demonstrate the concept of value-of-information. The player can make bets, but cannot "scout".
(Though there might be actions a first-time player can take to help pin down the rules of the game, that an experienced player would already know; I'm unclear on whether that counts for purposes of this exercise.)
While considering this idea, it occurred to me that you might not want whatever factions exist at the time you create a government to remain permanently empowered, given that factions sometimes rise or fall if you wait long enough.
Then I started wondering if one could create a system that somehow dynamically identifies the current "major factions" and gives de-facto vetoes to them.
And then I said: "Wait, how is that different from just requiring some voting threshold higher than 50% in order to change policy?"
It's good to clarify that you're looking for examples from multiple genres, though I'd caution you not to write off all "roguelikes" too quickly just because you've already found one you liked. There are some games with the "roguelike" tag that have little overlap other than procedural content and permadeath.
For instance, Slay the Spire, Rogue Legacy, and Dungeons of Dredmor have little overlap in gameplay, though they are all commonly described as "roguelike". (In fact, I notice that Steam now seems to have separate tags for "roguelike deckbuilder", "action roguelike", and "traditional roguelike"--though it also retains the generic "roguelike" tag.)
And that's without even getting into cases like Sunless Sea where permadeath and procedural generation were tacked onto a game where they're arguably getting in the way more than adding to the experience.
Wow. I'm kind of shocked that the programmer understands PRNGs well enough to come up with this strategy for controlling different parts of the game separately and yet thinks that initializing a bunch of PRNGs to exactly the same seed is a good idea.
Nice find, though. Thanks for the info!
(I note the page you linked is dated ~4 years ago; it seems possible this has changed since then.)
Another possible reason to disrecommend it is because it's hugely popular.
(The more popular a game, the more of your audience has already played it and therefore can't participate in a "blind first run" exercise based on it.)