Extremely Counterfactual Mugging or: the gist of Transparent Newcomb

post by Bongo · 2011-02-09T15:20:54.505Z · LW · GW · Legacy · 79 comments

Omega will either award you $1000 or ask you to pay him $100. He will award you $1000 if he predicts you would pay him if he asked. He will ask you to pay him $100 if he predicts you wouldn't pay him if he asked. 

Omega asks you to pay him $100. Do you pay?

This problem is roughly isomorphic to the branch of Transparent Newcomb (version 1, version 2) where box B is empty, but it's simpler.

Here's a diagram:

[diagram: the problem's branches, o=AWARD and o=ASK, with the constraint P(o=AWARD) = P(a=PAY)]
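
A minimal sketch of the payoff structure, assuming a perfect predictor and treating the agent as a policy that answers "pay" or "refuse" when asked (the names here are illustrative, not part of the original post):

```python
# Toy model: Omega predicts what the agent's policy would answer if asked,
# then either awards $1000 or asks for $100.

def play(policy):
    """policy() -> "pay" or "refuse": what the agent does *if asked*."""
    prediction = policy()      # a perfect predictor just runs the policy
    if prediction == "pay":
        return 1000            # awarded; never actually asked
    # Omega asks; by assumption the prediction is correct, so the agent
    # refuses and ends up with nothing.
    return 0

print(play(lambda: "pay"))      # 1000
print(play(lambda: "refuse"))   # 0
```

The arguments in the comments below are about the branch where Omega actually asks: by that point a perfect predictor has already, correctly, concluded that you refuse.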

Comments sorted by top scores.

comment by cousin_it · 2011-02-10T11:52:46.694Z · LW(p) · GW(p)

I have sympathy for the commenters who agreed to pay outright (Nesov and ata), but viewed purely logically, this problem is underdetermined, kinda like Transparent Newcomb's (thx Manfred). This is a subtle point, bear with me.

Let's assume you precommit to not pay if asked. Now take an Omega that strictly follows the rules of the problem, but also has one additional axiom: I will award the player $1000 no matter what. This Omega can easily prove that the world in which it asks you to pay is logically inconsistent, and then it concludes that in that world you do agree to pay (because a falsity implies every statement, and this one happened to come first lexicographically or something). So Omega decides to award you $1000, its axiom system stays perfectly consistent, and all the conditions of the problem are fulfilled. I stress that the statement "You would pay if Omega asked you to" is logically true in the axiom system outlined, because its antecedent is false.

In summary, the system of logical statements that specifies the problem does not completely determine what will happen, because we can consistently extend it with another axiom that makes Omega cooperate even if you defect. IOW, you can't go wrong by cooperating, but some correct Omegas will reward defectors as well. It's not clear to me if this problem can be "fixed".

ETA: it seems that several other decision problems have a similar flaw. In Counterfactual Mugging with a logical coin it makes some defectors win, as in our problem, and in Parfit's Hitchhiker it makes some cooperators lose.
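
A toy model-check of the underdetermination claim, reading "would pay if asked" as a material conditional (my encoding, with illustrative labels): enumerate candidate worlds and keep the ones consistent with the problem statement.

```python
from itertools import product

# A world is a pair (Omega's move, whether the agent pays if asked).
# Reading the problem's conditionals materially, a world is admissible iff
# Omega awards exactly when "asked -> pays" holds. When Omega awards,
# "asked" is false, so the conditional is vacuously true either way.

admissible = []
for omega, pays_if_asked in product(["AWARD", "ASK"], [True, False]):
    asked = (omega == "ASK")
    would_pay = (not asked) or pays_if_asked   # material "if asked, pays"
    if (omega == "AWARD") == would_pay:
        admissible.append((omega, pays_if_asked))

print(admissible)
# [('AWARD', True), ('AWARD', False), ('ASK', False)]
```

The world ('AWARD', False), an awarded defector, survives, which is the underdetermination described above.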

Replies from: JGWeissman, Bongo, wedrifid, Vladimir_Nesov, wedrifid, wedrifid, None, Stuart_Armstrong
comment by JGWeissman · 2011-02-10T18:02:40.078Z · LW(p) · GW(p)

This Omega can easily prove that the world in which it asks you to pay is logically inconsistent, and then it concludes that in that world you do agree to pay (because a falsity implies every statement, and this one happened to come first lexicographically or something).

This seems to be confusing "counterfactual::if" with "logical::if". Noting that a world is impossible because the agents will not make the decisions that lead to that world does not mean that you can just make stuff up about that world since "anything is true about a world that doesn't exist".

Replies from: cousin_it, Vladimir_Nesov
comment by cousin_it · 2011-02-10T18:09:19.009Z · LW(p) · GW(p)

Your objection would be valid if we had a formalized concept of "counterfactual if" distinct from "logical if", but we don't. When looking at the behavior of deterministic programs, I have no idea how to make counterfactual statements that aren't logical statements.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T18:24:51.673Z · LW(p) · GW(p)

When a program takes explicit input, you can look at what the program does if you pass this or that input, even if some inputs will in fact never be passed.

comment by Vladimir_Nesov · 2011-02-10T18:21:53.844Z · LW(p) · GW(p)

Noting that a world is impossible because the agents will not make the decisions that lead to that world does not mean that you can just make stuff up about that world since "anything is true about a world that doesn't exist".

If event S is empty, then for any Q you make up, it's true that [for all s in S, Q]. This statement also holds if S was defined to be empty if [Not Q], or if Q follows from S being non-empty.
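
In code form, trivially: universal quantification over an empty collection holds for any predicate you make up.

```python
S = []                          # the empty event
Q = lambda s: False             # an arbitrary made-up predicate
print(all(Q(s) for s in S))     # True: vacuously, "for all s in S, Q(s)"
```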

Replies from: JGWeissman
comment by JGWeissman · 2011-02-10T18:30:25.332Z · LW(p) · GW(p)

Yes, you can make logical deductions of that form, but my point was that you can't feed those conclusions back into the decision-making process without invalidating the assumptions that went into those conclusions.

comment by Bongo · 2011-02-10T12:54:28.841Z · LW(p) · GW(p)
  • I will award the player $1000 iff the player would pay
  • I will award the player $1000 no matter what

How are these consistent??

Replies from: cousin_it
comment by cousin_it · 2011-02-10T12:58:54.732Z · LW(p) · GW(p)

Both these statements are true, so I'd say they are consistent :-)

In particular, the first one is true because "The player would pay if asked" is true.

"The player would pay if asked" is true because "The player will be asked" is false and implies anything.

"The player will be asked" is false by the extra axiom.

Note I'm using ordinary propositional logic here, not some sort of weird "counterfactual logic" that people have in mind and which isn't formalizable anyway. Hence the lack of distinction between "will" and "would".

Replies from: Bongo
comment by Bongo · 2011-02-10T14:15:13.015Z · LW(p) · GW(p)

Are you sure you're not confusing the propositions

o=ASK => a=PAY

and

a=PAY

?

If not, could you present your argument formally?

Replies from: cousin_it
comment by cousin_it · 2011-02-10T14:22:32.817Z · LW(p) · GW(p)

I thought your post asked about the proposition "o=ASK => a=PAY", and didn't mention the other one at all. You asked this:

Omega asks you to pay him $100. Do you pay?

not this:

Do you precommit to pay?

So I just don't use the naked proposition "a=PAY" anywhere. In fact I don't even understand how to define its truth value for all agents, because it may so happen that the agent gets $1000 and walks away without being asked anything.

Replies from: Bongo
comment by Bongo · 2011-02-10T14:43:42.711Z · LW(p) · GW(p)

I don't even understand how to define its truth value for all agents

Seems to me that for all agents there is a fact of the matter about whether they would pay if asked. Even for agents that are never in fact asked.

So I do interpret a=PAY as "would pay". But maybe there are other legitimate interpretations.

Replies from: cousin_it
comment by cousin_it · 2011-02-10T14:52:15.433Z · LW(p) · GW(p)

If both the agent and Omega are deterministic programs, and the agent is never in fact asked, that fact may be converted into a statement about natural numbers. So what you just said is equivalent to this:

Seems to me that for all agents there is a fact of the matter about whether they would pay if 1 were equal to 2.

I don't know, this looks shady.

Replies from: AlephNeil
comment by AlephNeil · 2011-05-06T05:22:36.223Z · LW(p) · GW(p)

I don't know, this looks shady.

Why? Say the world program W includes function f, and it's provable that W could never call f with argument 1. That doesn't mean there's no fact of the matter about what happens when f(1) is computed (though of course it might not halt). (Function f doesn't have to be called from W.)

Even if f can be regarded as a rational agent who 'knows' the source code of W, the worst that could happen is that f 'deduces' a contradiction and goes insane. That's different from the agent itself being in an inconsistent state.

Analogy: We can define the partial derivatives of a Lagrangian with respect to q and q-dot, even though it doesn't make sense for q and q-dot to vary independently of each other.
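
A minimal sketch of the f(1) point (illustrative functions): even if the world program provably never calls f with argument 1, f(1) is still a well-defined computation.

```python
def f(x):
    # The function under discussion: defined on every input.
    return x * x + 1

def W():
    # The "world program": it only ever calls f on even arguments,
    # so f(1) is never invoked from W.
    return sum(f(x) for x in range(0, 10, 2))

print(W())     # 125 -- the world runs without ever evaluating f(1)
print(f(1))    # 2   -- yet there is a fact of the matter about f(1)
```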

comment by wedrifid · 2011-02-10T14:55:34.339Z · LW(p) · GW(p)

I assume that you would not consider this to be a problem if Omega was replaced with a 99% reliable predictor. Confirm?

Replies from: cousin_it
comment by cousin_it · 2011-02-10T15:14:57.945Z · LW(p) · GW(p)

...Huh? My version of Omega doesn't bother predicting the agent, so you gain nothing by crippling its prediction abilities :-)

ETA: maybe it makes sense to let Omega have a "trembling hand", so it doesn't always do what it resolved to do. In this case I don't know if the problem stays or goes away. Properly interpreting "counterfactual evidence" seems to be tricky.

Replies from: wedrifid
comment by wedrifid · 2011-02-11T04:03:53.463Z · LW(p) · GW(p)

...Huh? My version of Omega doesn't bother predicting the agent, so you gain nothing by crippling its prediction abilities :-)

I would consider an Omega that didn't bother predicting even in that case to be 'broken'. Omega is supposed to be good at implementing the natural-language problem statement in good faith. Perhaps I would consider it one of Omega's many siblings, one that requires more formal shackles.

comment by Vladimir_Nesov · 2011-02-10T13:23:22.043Z · LW(p) · GW(p)

This takes the decision out of Omega's hands and collapses Omega's agent-provability by letting it know its decision. We already know that in ADT-style decision-making, all theories of consequences of actions other than the actual one are inconsistent, that they are merely agent-consistent, and adding an axiom specifying which action is actual won't disturb consistency of the theory of consequences of the actual action. But there's no guarantee that Omega's decision procedure would behave nicely when faced with knowledge of inconsistency. For example, instead of concluding that you do agree to pay, it could just as well conclude that you don't, which would be a moral argument to not award you the $1000, and then Omega just goes crazy. One isn't meant to know one's own decisions; that's bad for sanity.

Replies from: cousin_it
comment by cousin_it · 2011-02-10T13:32:17.347Z · LW(p) · GW(p)

Yes, you got it right. I love your use of the word "collapse" :-)

My argument seems to indicate that there's no easy way for UDT agents to solve such situations, because the problem statements really are incomplete. Do you see any way to fix that, e.g. in Parfit's Hitchhiker? Because this is quite disconcerting. Eliezer thought he'd solved that one.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T13:36:53.062Z · LW(p) · GW(p)

I don't understand your argument. You've just broken Omega for some reason (by letting it know something true which it's not meant to know at that point), and as a result it fails in its role in the thought experiment. Don't break Omega.

Replies from: cousin_it
comment by cousin_it · 2011-02-10T13:38:14.717Z · LW(p) · GW(p)

My implementation of Omega isn't broken and doesn't fail. Could you show precisely where it fails? As far as I can see, all the conditions in Bongo's post still hold for it, therefore all possible logical implications of Bongo's post should hold for it too, and so should all possible "solutions".

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T13:50:07.554Z · LW(p) · GW(p)

It doesn't implement the counterfactual where, depending on what response the agent assumes it gives on observing a request to pay, the agent can agent-consistently conclude that Omega will either award or not award the $1000. Even if we don't require that Omega is a decision-theoretic agent with known architecture, the decision problem must make the intended sense.

In more detail. Agent's decision is a strategy that specifies, for each possible observation (we have two: Omega rewards it, or Omega asks for money), a response. If Omega gives a reward, there is no response, and if it asks for money, there are two responses. So overall, we have two strategies to consider. The agent should be able to contemplate the consequences of adopting each of these strategies, without running into inconsistencies (observation is an external parameter, so even if in a given environment, there is no agent-with-that-observation, decision algorithm can still specify a response to that observation, it would just completely fail to control the outcome). Now, take your Omega implementation, and consider the strategy of not paying from agent's perspective. What would the agent conclude about expected utility? By problem specification, it should (in the external sense, that is not necessarily according to its own decision theory, if that decision theory happens to fail this particular thought experiment) conclude that Omega doesn't give it an award. But your Omega does knowably (agent-provably) give it an award, hence it doesn't play the intended role, doesn't implement the thought experiment.
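
A sketch of this distinction (my own encoding, with toy payoffs): a strategy is a map from observations to responses, and the intended Omega is the one under which the non-paying strategy does not get the award.

```python
# A strategy answers the one non-trivial observation: being asked to pay.
pay_strategy    = {"asked": "pay"}
refuse_strategy = {"asked": "refuse"}

def intended_omega(strategy):
    # The thought experiment as specified: award iff the strategy would pay.
    return "award" if strategy["asked"] == "pay" else "ask"

def always_award_omega(strategy):
    # cousin_it's extended Omega: "award the player no matter what".
    return "award"

def outcome(omega, strategy):
    if omega(strategy) == "award":
        return 1000
    return -100 if strategy["asked"] == "pay" else 0

for omega in (intended_omega, always_award_omega):
    print(omega.__name__,
          outcome(omega, pay_strategy),
          outcome(omega, refuse_strategy))
# intended_omega 1000 0
# always_award_omega 1000 1000
```

Under the second Omega the refusing strategy is rewarded too, which is why it fails to implement the thought experiment in the sense described above.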

Replies from: wedrifid, cousin_it
comment by wedrifid · 2011-02-10T15:20:54.470Z · LW(p) · GW(p)

But your Omega does knowably (agent-provably) give it an award, hence it doesn't play the intended role, doesn't implement the thought experiment.

I think it would be fair to say that cousin_it's (ha! Take that English grammar!) description of Omega's behaviour does fit the problem specification we have given but certainly doesn't match the problem we intended. That leaves us to fix the wording without making it look too obfuscated.

Taking another look at the actual problem specification it actually doesn't look all that bad. The translation into logical propositions didn't really do it justice. We have...

He will award you $1000 if he predicts you would pay him if he asked.

cousin_it allows "if" to resolve to "iff", but translates "The player would pay if asked" into A -> B; !A, therefore 'whatever'. Which is not quite what we mean when we use the phrase in English. We are trying to refer to the predicted outcome in a "possibly counterfactual but possibly real" reality.

Can you think of a way to say what we mean without any ambiguity and without changing the problem itself too much?

comment by cousin_it · 2011-02-10T14:41:20.224Z · LW(p) · GW(p)

I believe you haven't yet realized the extent of the damage :-)

It's very unclear to me what it means for Omega to "implement the counterfactual" in situations where it gives the agent information about which way the counterfactual came out. After all, the agent knows its own source code A and Omega's source code O. What sense does it make to inquire about the agent's actions in the "possible world" where it's passed a value of O(A) different from its true value? That "possible world" is logically inconsistent! And unlike the situation where the agent is reasoning about its own actions, in our case the inconsistency is actually exploitable. If a counterfactual version of A is told outright that O(A)==1, and yet sees a provable way to make O(A)==2, how do you justify not going crazy?

The alternative is to let the agent tacitly assume that it does not necessarily receive the true value of O(A), i.e. that the causality has been surgically tweaked at some point - so the agent ought to respond to any values of O(A) mechanically by using a "strategy", while taking care not to think too much about where they came from and what they mean. But: a) this doesn't seem to accord with the spirit of Bongo's original problem, which explicitly asked "you're told this statement about yourself, now what do you do?"; b) this idea is not present in UDT yet, and I guess you will have many unexpected problems making it work.

Replies from: Vladimir_Nesov, Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T20:56:03.094Z · LW(p) · GW(p)

If a counterfactual version of A is told outright that O(A)==1, and yet sees a provable way to make O(A)==2, how do you justify not going crazy?

By the way, this bears an interesting similarity to the question of how would you explain the event of your left arm being replaced by a blue tentacle. The answer that you wouldn't is perfectly reasonable, since you don't need to be able to adequately respond to that observation, you can self-improve in a way that has a side effect of making you crazy once you observe your left arm being transformed into a blue tentacle, and that wouldn't matter, since this event is of sufficiently low measure and has sufficiently insignificant contribution to overall expected utility to not be worth worrying about.

So in our case, the question should be, is it desirable to not go crazy when presented with this observation and respond in some other way instead, perhaps to win the Omega Award? If so, how should you think about the situation?

comment by Vladimir_Nesov · 2011-02-10T15:49:46.285Z · LW(p) · GW(p)

If a counterfactual version of A is told outright that O(A)==1, and yet sees a provable way to make O(A)==2, how do you justify not going crazy?

It's not the correct way of interpreting observations, you shouldn't let observations drive you crazy. Here, we have A's action-definition that is given in factorized form: action=A(O("A")). Normally, you'd treat such decompositions as explicit dependence bias, and try substituting everything in before starting to reason about what would happen if. But if O("A") is an observation, then you're not deciding action, that is A(O("A")). Instead, you're deciding just A(-), an Observations -> Actions map. So being told that you've observed "no award" doesn't mean that you now know that O("A")="no award". It just means that you're the subagent responsible for deciding a response to parameter "no award" in the strategy for A(-). You might also want to acausally coordinate with the subagent that is deciding the other part of that same strategy, a response to "award".

And this all holds even if the agent knows what O("A") means, it would just be a bad idea to not include O("A") as part of the agent in that case, and so optimize the overall A(O("A")) instead of the smaller A(-).

Replies from: cousin_it
comment by cousin_it · 2011-02-10T16:10:20.919Z · LW(p) · GW(p)

At this point it seems we're arguing over how to better formalize the original problem. The post asked what you should reply to Omega. Your reformulation asks what counterfactual-you should reply to counterfactual-Omega that doesn't even have to say the same thing as the original Omega, and whose judgment of you came from the counterfactual void rather than from looking at you. I'm not sure this constitutes a fair translation. Some of the commenters here (e.g. prase) seem to intuitively lean toward my interpretation - I agree it's not UDT-like, but think it might turn out useful.

Replies from: Vladimir_Nesov, Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T17:28:01.632Z · LW(p) · GW(p)

At this point it seems we're arguing over how to better formalize the original problem.

It's more about making more explicit the question of what observations are, and what the boundaries of the agent are (Which parts of the past lightcone are part of you? Just the cells in the brain? Why is that?), in deterministic decision problems. These were never explicitly considered before in the context of UDT. The problem statement says that something is an "observation", but we lack a technical counterpart of that notion. Your questions resulted from treating something that's said to be an "observation" as epistemically relevant, writing knowledge about the state of the territory, which shouldn't be logically transparent, right into the agent's mind.

(Observations, possible worlds, etc. will very likely be the topic of my next post on ADT, once I resolve the mystery of observational knowledge to my satisfaction.)

Replies from: cousin_it
comment by cousin_it · 2011-02-10T18:51:26.978Z · LW(p) · GW(p)

Thanks, this looks like a fair summary (though a couple levels too abstract for my liking, as usual).

A note on epistemic relevance. Long ago, when we were just starting to discuss Newcomblike problems, the preamble usually went something like this: "Omega appears and somehow convinces you that it's trustworthy". So I'm supposed to listen to Omega's words and somehow split them into an "epistemically relevant" part and an "observation" part, which should never mix? This sounds very shady. I hope we can disentangle this someday.

comment by Vladimir_Nesov · 2011-02-10T16:15:29.762Z · LW(p) · GW(p)

Your reformulation asks what counterfactual-you should reply to counterfactual-Omega that doesn't even have to say the same thing as the original Omega.

Yes. If the agent doesn't know what Omega actually says, this can be an important consideration (decisions are made by considering agent-provable properties of counterfactuals, all of which except the actual one are inconsistent, but not agent-inconsistent). If Omega's decision is known (and not just observed), it just means that counterfactual-you's response to counterfactual-Omega doesn't control utility and could well be anything. But at this point I'm not sure in what sense anything can actually be logically known, and not in some sense just observed.

comment by wedrifid · 2011-02-10T14:57:16.644Z · LW(p) · GW(p)

in Parfit's Hitchhiker it makes some cooperators lose

Now that is a real concern!

comment by wedrifid · 2011-02-10T14:54:05.472Z · LW(p) · GW(p)

In summary, the system of logical statements that specifies the problem does not completely determine what will happen, because we can consistently extend it with another axiom that makes Omega cooperate even if you defect. IOW, you can't go wrong by cooperating, but some correct Omegas will reward defectors as well.

I am another person who pays outright. While I acknowledge the "could even reward defectors" logical difficulty I am also comfortable asserting that not paying is an outright wrong choice. A payoff of "$1,000" is to be preferred to a payoff of "either $1,000 or $0".

It's not clear to me if this problem can be "fixed".

It would seem to merely require more precise wording in the problem statement. At the crudest level you simply add the clause "if it is logically coherent to so refrain Omega will not give you $1,000".

comment by [deleted] · 2013-01-15T19:30:08.144Z · LW(p) · GW(p)

The solution has nothing to do with hacking the counterfactual; the reflectively consistent (and winning) move is to pay the $100, as precommitting to do so nets you a guaranteed $1000 (unless Omega can be wrong). It is true that "The player will pay iff asked" implies "The player will not be asked" and therefore "The player will not pay", but this does not cause Omega to predict the player to not pay when asked.

comment by Stuart_Armstrong · 2011-02-10T14:54:01.769Z · LW(p) · GW(p)

You've added an extra axiom to Omega, noted that this resulted in a consistent result, and concluded that therefore the original axioms are incomplete (because the result is changed).

But that does not follow. This would only be true if the axiom was added secretly, and the result was still consistent. But because I know about this extra axiom, you've changed the problem; I behave differently, so the whole setup is different.

Or consider a variant: I have the numbers sqrt[2], e and pi. I am required to output the first number that I can prove is irrational, using the shortest proof I can find. This will be sqrt[2] (or maybe e), but not pi. Now add the axiom "pi is irrational". Now I will output pi first, as the proof is one line long. This does not mean that the original axiomatic system was incorrect or under-specified...

Replies from: cousin_it
comment by cousin_it · 2011-02-10T15:05:26.511Z · LW(p) · GW(p)

I'm not completely sure what your comment means. The result hasn't "changed", it has appeared. Without the extra axiom there's not enough axioms to nail down a single result (and even with it I had to resort to lexicographic chance at one point). That's what incompleteness means here.

If you think that's wrong, try to prove the "correct" result, e.g. that any agent who precommits to not paying won't get the $1000, using only the original axioms and nothing else. Once you write out the proof, we will know for certain that one of us is wrong or the original axioms are inconsistent, which would be even better :-)

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-02-10T17:19:29.225Z · LW(p) · GW(p)

The result hasn't "changed", it has appeared.

I was also previously suspicious of the word "change", but lately made my peace with it. Saying that there's change is just a way of comparing objects of the same category. So if you look at an apple and a grape, what changes from apple to grape is, for example, color. A change is simultaneously what's different, and a method of producing one from the other. Applying change to time, or to the process of decision-making, is merely a special case. Particular ways of parsing change in descriptions of decision problems can be incorrect because of explicit dependence bias: those changes as methods of determining one from the other are not ambient dependencies. But other usages of "change" still apply. For example, your decision to take one box in Newcomb's instead of two changes the content of the box.

comment by nshepperd · 2011-02-09T15:52:48.499Z · LW(p) · GW(p)

This is also isomorphic to the absent-minded driver problem with different utilities (and mixed strategies*), it seems. Specifically, if you consider the abstract idealized decision theory you implement to be "you", you make the same decision in two places, once in Omega's brain while he predicts you and again if he asks you to pay up. Therefore the graph can be transformed from this

[diagram]

into this

[diagram]

which looks awfully like the absent-minded driver. Interesting.

Additionally, modifying the utilities involved ($1000 -> death; swap -$100 and $0) gives Parfit's Hitchhiker.

Looks like this isn't really a new decision theory problem at all.

*ETA: Of course mixed strategies are allowed, if Omega is allowed to be an imperfect predictor. Duh. Clearly I wasn't paying proper attention...

Replies from: SilasBarta, Bongo
comment by SilasBarta · 2011-02-09T17:51:40.447Z · LW(p) · GW(p)

I contend it's also isomorphic to the very real-world problems of hazing, abuse cycles, and akrasia.

The common dynamic across all these problems is that "You could have been in a winning or losing branch, but you've learned that you're in a losing branch, and your decision to scrape out a little more utility within that branch takes away more utility from (symmetric) versions of yourself in (potentially) winning branches."

Replies from: WrongBot, TheOtherDave, Matt_Simpson, Bongo
comment by WrongBot · 2011-02-10T08:13:38.809Z · LW(p) · GW(p)

Disagree. In e.g. the case of hazing, the person who has hazed me is not a counterfactual me, and his decision is not sufficiently correlated with my own for this approach to apply.

Replies from: SilasBarta
comment by SilasBarta · 2011-02-10T13:37:00.300Z · LW(p) · GW(p)

Whether it's a counterfactual you is less important than whether it's a symmetric version of you with the same incentives and preferences. And the level of correlation is not independent of whether you believe there's a correlation (like on Newcomb's problem and PD).

Replies from: WrongBot
comment by WrongBot · 2011-02-10T16:00:30.222Z · LW(p) · GW(p)

Incentives, preferences, and decision procedure. Mine are not likely to be highly correlated with a random hazer's.

Replies from: SilasBarta
comment by SilasBarta · 2011-02-10T16:52:30.556Z · LW(p) · GW(p)

Yes, depending on the situation, there may be an intractable discorrelation as you move from the idealization to real-world hazing.

But keep in mind, even if the agents actually were fully correlated (as specified in my phrasing of the Hazing Problem), they could still condemn themselves to perpetual hazing as a result of using a decision theory that returns a different output depending on what branch you have learned you are in, and it is this failure that you want to avoid.

There's a difference between believing that a particular correlation is poor, vs. believing that only outcomes within the current period matter for your decision.

(Side note: this relates to the discussion of the CDT blind spot on page 51 of EY's TDT paper.)

comment by TheOtherDave · 2011-02-09T17:57:52.458Z · LW(p) · GW(p)

This is very nicely put.

comment by Matt_Simpson · 2011-02-09T18:11:54.254Z · LW(p) · GW(p)

Does this depend on many worlds as talking about "branches" seems to suggest? Consider, e.g.

You could have won or lost this time, but you've learned that you've lost, and your decision to scrape out a little more utility in this case takes away more utility by increasing the chance of losing in future similar situations.

Replies from: Bongo, SilasBarta
comment by Bongo · 2011-02-09T18:14:59.948Z · LW(p) · GW(p)

No. These branches correspond to the branches in the diagrams.

Replies from: Matt_Simpson
comment by Matt_Simpson · 2011-02-09T18:16:25.643Z · LW(p) · GW(p)

Ah, i see. That makes much more sense. Thanks.

comment by SilasBarta · 2011-02-09T18:32:33.949Z · LW(p) · GW(p)

The problems are set up as one-shot so you can't appeal to a future chance of (yourself experiencing) losing that is caused by this decision. By design, the problems probe your theory of identity and what you should count as relevant for purposes of decision-making.

Also, what Bongo said.

comment by Bongo · 2011-02-09T18:11:29.453Z · LW(p) · GW(p)

"You could have been in a winning or losing branch, but you've learned that you're in a losing branch, and your decision to scrape out a little more utility within that branch takes away more utility from (symmetric) versions of yourself in (potentially) winning branches."

In the Hitchhiker you're scraping in the winning branch though.

Replies from: SilasBarta
comment by SilasBarta · 2011-02-09T18:28:27.461Z · LW(p) · GW(p)

True, I didn't mean the isomorphism to include that problem, but rather, just the ones I mentioned plus counterfactual mugging and (if I understand the referent correctly) the transparent box newcomb's. Sorry if I wasn't clear.

comment by Bongo · 2011-02-09T16:44:59.686Z · LW(p) · GW(p)

Looks like this isn't really a new decision theory problem at all.

Sort of. The shape is old, the payoffs are new. In Parfit's Hitchhiker, you pay for not being counterfactually cast into the left branch. In Extremely Counterfactual Mugging, you pay for counterfactually gaining access to the left branch.

comment by Vladimir_Nesov · 2011-02-10T02:15:45.168Z · LW(p) · GW(p)

By paying, you reduce probability of the low-utility situation you're experiencing, and correspondingly increase the probability of the counterfactual with Omega Award, thus increasing overall expected utility. Reality is so much worse than its alternatives that you're willing to pay to make it less real.

comment by ata · 2011-02-10T00:25:22.101Z · LW(p) · GW(p)

Of course I'd pay.

Replies from: NihilCredo
comment by NihilCredo · 2011-02-10T14:20:05.027Z · LW(p) · GW(p)

Downvoted for the obnoxiousness of saying "of course" and not giving even the vaguest of explanations. This comment read to me like this: "It's stupid that you would even ask such a question, and I can't be bothered to say why it's stupid, but I can be bothered to proclaim my superior intelligence".

Replies from: ata
comment by ata · 2011-02-10T18:18:41.684Z · LW(p) · GW(p)

I'm sorry it came off that way; I just found it overly similar to the other various Newcomblike problems, and couldn't see how it was supposed to reveal anything new about optimal decision strategies; paying is the TDT answer and the UDT answer; it's the choice you'd wish you could have precommitted to if you could have precommitted, it's the decision-type that will cause you to actually get the $1,000, etc. If I'm not mistaken, this problem doesn't address any decision situations not already covered by standard Counterfactual Mugging and Parfit's Hitchhiker.

(Admittedly, I should have just said all that in the first place.)

comment by Perplexed · 2011-02-10T16:44:20.598Z · LW(p) · GW(p)

'a' should use a randomizing device so that he pays 51% of the time and refuses 49% of the time. Omega, aware of this strategy, but presumably unable to hack the randomizing device, achieves the best score by predicting 'pay' 100% of the time.

I am making an assumption here about Omega's cost function - i.e. that Type 1 and Type 2 errors are equally undesirable. So, I agree with cousin_it that the problem is underspecified.

The constraint P(o=AWARD) = P(a=PAY) that appears in the diagram does not seem to match the problem statement. It is also ambiguous. Are those subjective probabilities? If so, which agent forms those probabilities? And, as cousin_it points out, we also need to know the joint probability P(o=AWARD & a=PAY) or a conditional probability P(o=AWARD | a=PAY).
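
A small illustration of that last point (the numbers are mine): two joint distributions over (o, a) can both satisfy the marginal constraint P(o=AWARD) = P(a=PAY) = 0.5 and yet give different expected payoffs.

```python
# Two joint distributions over (Omega's move, the agent's answer-if-asked),
# both with P(AWARD) = P(PAY) = 0.5, but with opposite correlation.
correlated      = {("AWARD", "PAY"): 0.5, ("ASK", "REFUSE"): 0.5}
anti_correlated = {("AWARD", "REFUSE"): 0.5, ("ASK", "PAY"): 0.5}

def payoff(omega, answer):
    if omega == "AWARD":
        return 1000
    return -100 if answer == "PAY" else 0

for name, joint in [("correlated", correlated),
                    ("anti-correlated", anti_correlated)]:
    ev = sum(p * payoff(o, a) for (o, a), p in joint.items())
    print(name, ev)
# correlated 500.0
# anti-correlated 450.0
```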

Replies from: wedrifid, Bongo
comment by wedrifid · 2011-02-10T17:24:40.856Z · LW(p) · GW(p)

'a' should use a randomizing device so that he pays 51% of the time and refuses 49% of the time. Omega, aware of this strategy, but presumably unable to hack the randomizing device, achieves the best score by predicting 'pay' 100% of the time.

Apply any of the standard fine print for Omega-based counterfactuals with respect to people who try to game the system with randomization. Depending on the version that means a payoff of $0, a payoff of 0.51 * $1,000 or an outright punishment for being a nuisance.

comment by Bongo · 2011-02-10T17:23:48.252Z · LW(p) · GW(p)

I prefer this interpretation: P(a=X) means how sure the agent is it will X. If it flips a coin to decide whether X or Y, P(a=X)=P(a=Y)~=0.5. If it's chosen to "just X", P(a=X) ~= 1. Omega for his part knows the agent's surety and uses a randomizing device to match his actions with it.

ETA: if interpreted naively, this leads to Omega rewarding agents with deluded beliefs about what they're going to do. Maybe Omega shouldn't look at the agent's surety but the surety of "a perfectly rational agent" in the same situation. I don't have a real solution to this right now.

comment by Nisan · 2011-02-09T17:41:36.750Z · LW(p) · GW(p)

Nice diagram. By the way, the assertion "Omega asks you to pay him $100" doesn't make sense unless your decision is required to be a mixed strategy. I.e., P(a = PAY) < 1. In fact, P(a = PAY) must be significantly less than the strength of your beliefs about Omega.

comment by prase · 2011-02-09T17:28:52.927Z · LW(p) · GW(p)

Of course I don't pay. Omega has predicted that I won't pay if he asked, and Omega's predictions are by definition correct. I don't see how this is a decision problem at all.

Replies from: NihilCredo, Lightwave, wedrifid
comment by NihilCredo · 2011-02-10T14:17:12.728Z · LW(p) · GW(p)

Omega problems do not (normally) require Omega to be assumed a perfect predictor, just a sufficiently good one.

Replies from: prase
comment by prase · 2011-02-10T14:47:25.006Z · LW(p) · GW(p)

Well, fine, but then the correct strategy depends on Omega's success rate (and the payoffs). If the reward given to those willing to pay is r, and Omega's demand is d, and Omega's prediction success rate is s, the expected payoff for those who agree to pay is s r + (s - 1) d, which may be either positive or negative. (Refusers trivially get 0.)
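
A quick check of this formula (a sketch in prase's variables): the payer's expected payoff s r + (s - 1) d changes sign at s = d / (r + d), which for r = $1000 and d = $100 is about 0.09.

```python
def payer_expected_payoff(s, r=1000, d=100):
    # prase's formula: with probability s you are (correctly) awarded r,
    # with probability 1 - s you are (wrongly) asked and pay d.
    return s * r + (s - 1) * d

print(round(100 / (1000 + 100), 3))   # 0.091  -- break-even value of s
print(payer_expected_payoff(0.05))    # -45.0  (negative below the threshold)
print(payer_expected_payoff(0.8))     # 780.0  (positive above it)
```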

comment by Lightwave · 2011-02-09T20:05:38.468Z · LW(p) · GW(p)

What if the person being asked for the $100 is a simulation of you which Omega is using to check whether you'll pay if he asked you? You won't know whether you're the simulation or not.

Replies from: prase, TheOtherDave
comment by prase · 2011-02-10T09:06:46.022Z · LW(p) · GW(p)

To predict, Omega doesn't need to simulate. You can predict that water will boil when put on fire without simulating the movement of 10^23 molecules.

Omega even can't use simulation to arrive at his prediction in this scenario. If Omega demands money from simulated agents who then agree to pay, the simulation violates the formulation of the problem, according to which Omega should reward those agents.

If the problem is reformulated as "Omega demands payment only if the agent would counterfactually disagree to pay, OR in a simulation", then we have a completely different problem. For example, if the agent is sufficiently confident about his own decision algorithm, then after Omega's demand he could assign high probability to being in a simulation. The analysis would be more complicated there.

In short, I am only saying that

  1. Omega is trustworthy.
  2. Omega can predict the agents behaviour with certainty.
  3. Omega states that it demands money only from agents whom it has predicted will reject the demand.
  4. Omega demands the money.
  5. The agent pays.

are together incompatible statements.

Replies from: Skatche
comment by Skatche · 2011-02-10T19:14:19.985Z · LW(p) · GW(p)

You can predict that water will boil when put on fire without simulating the movement of 10^23 molecules.

True but irrelevant. In order to make an accurate prediction, Omega needs, at the very least, to simulate my decision-making faculty in all significant aspects. If my decision-making process decides to recall some particular memory, then Omega needs to simulate that memory in all significant aspects. If my decision-making process decides to wander around the room conducting physics experiments, just to be a jackass, and to peg my decision to the results of those experiments - well, then Omega will need to convincingly simulate the results of those experiments. The anticipated experience will be identical for my actual decision-making process as for my simulated decision-making process.

Mind you, based on what I know of the brain, I think you'd actually need to run a pretty convincing, if somewhat coarse-grained, simulation of a good chunk of my light cone in order to predict my decision with any kind of certainty, but I'm being charitable here.

And yes, this seems to render the original formulation of the problem paradoxical. I'm trying to think of ways to suitably reformulate it without altering the decision theoretics, but I'm not sure it's possible.

Replies from: Nornagest
comment by Nornagest · 2011-02-10T20:35:57.147Z · LW(p) · GW(p)

True but irrelevant. In order to make an accurate prediction, Omega needs, at the very least, to simulate my decision-making faculty in all significant aspects. If my decision-making process decides to recall some particular memory, then Omega needs to simulate that memory in all significant aspects. If my decision-making process decides to wander around the room conducting physics experiments, just to be a jackass, and to peg my decision to the results of those experiments - well, then Omega will need to convincingly simulate the results of those experiments.

I'm not convinced that all that actually follows from the premises. One of the features of Newcomblike problems is that they tend to appear intuitively obvious to the people exposed to them, which suggests rather strongly to me that the intuitive answer is linked to hidden variables in personality or experience, and in most cases isn't sensitively dependent on initial conditions.

People don't always choose the intuitive answer, of course, but augmenting that with information about the decision-theoretic literature you've been exposed to, any contrarian tendencies you might have, etc. seems like it might be sufficient to achieve fine-grained predictive power without actually running a full simulation of you. The better the predictive power, of course, the more powerful the model of your decision-making process has to be, but Omega doesn't actually have to have perfect predictive power for Newcomblike conditions to hold. It doesn't even have to have particularly good predictive power, given the size of the payoff.

Replies from: Skatche
comment by Skatche · 2011-02-11T04:27:56.527Z · LW(p) · GW(p)

Er, I think we're talking about two different formulations of the problem (both of which are floating around on this page, so this isn't too surprising). In the original post, the constraint is given by P(o=award)=P(a=pay), rather than P(o=award)=qP(a=pay)+(1-q)P(a=refuse), which implies that Omega's prediction is nearly infallible, as it usually is in problems starring Omega: any deviation from P(o=award)=0 or 1 will be due to "truly random" influences on my decision (e.g. quantum coin tosses). Also, I think the question is not "what are your intuitions?" but "what is the optimal decision for a rationalist in these circumstances?"

You seem to be suggesting that most of what determines my decision to pay or refuse could be boiled down to a few factors. I think the evidence weighs heavily against this: effect sizes in psychological studies tend to be very weak. Evidence also suggests that these kinds of cognitive processes are indeed sensitively dependent on initial conditions. Differences in the way questions are phrased, and what you've had on your mind lately, can have a significant impact, just to name a couple of examples.

comment by TheOtherDave · 2011-02-09T20:34:07.260Z · LW(p) · GW(p)

Doesn't that contradict the original assertion?

That is, at that point it sounds like it's no longer "Omega will ask me to pay him $100 if he predicts I wouldn't pay him if he asked," it's "Omega will ask me to pay him $100 if he predicts I wouldn't pay him if he asked OR if I'm a simulation."

Not that any of this is necessary. Willingness to pay Omega depends on having arbitrarily high confidence in his predictions; it's not clear that I could ever arrive at such a high level of confidence, but it doesn't matter.

We're just asking, if it's true, what decision on my part maximizes expected results for entities for which I wish to maximize expected results? Perhaps I could never actually be expected to make that decision because I could never be expected to have sufficient confidence that it's true. That changes nothing.

Also worth noting that if I pay him $100 in this scenario I ought to update my confidence level sharply downward. That is, if I've previously seen N predictions and Omega has been successful in each of them, I have now seen (N+1) predictions and he's been successful in N of them.

Of course, by then I've already paid; there's no longer a choice to make.

(Presumably I should believe, prior to agreeing, that my choosing to pay him will not actually result in my paying him, or something like that... I ought not expect to pay him, given that he's offered me $100, regardless of what choice I make. In which case I might as well choose to pay him. This is absurd, of course, but the whole situation is absurd.)

ETA - I am apparently confused on more fundamental levels than I had previously understood, not least of which is what is being presumed about Omega in these cases. Apparently I am not presumed to be as confident of Omega's predictions as I'd thought, which makes the rest of this comment fairly irrelevant. Oops.

Replies from: Bongo
comment by Bongo · 2011-02-10T06:12:55.350Z · LW(p) · GW(p)

Willingness to pay Omega depends on having arbitrarily high confidence in his predictions

No. Paying is also the winning strategy in the version where the predictor is correct only with probability, say, 0.8. I.e.

P(o=AWARD) = 0.8*P(a=PAY)+0.2*P(a=REFUSE)
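
Spelling that out with the post's payoffs, as a sketch: under the 0.8 constraint a committed payer expects $780 and a committed refuser expects $200.

```python
def expected_payoff(pays_when_asked, accuracy=0.8, award=1000, cost=100):
    p_pay = 1.0 if pays_when_asked else 0.0
    # P(o=AWARD) = accuracy * P(a=PAY) + (1 - accuracy) * P(a=REFUSE)
    p_award = accuracy * p_pay + (1 - accuracy) * (1 - p_pay)
    # If not awarded, you are asked and act on your policy.
    return p_award * award - (1 - p_award) * (cost if pays_when_asked else 0)

print(expected_payoff(True))    # 780.0 -- pay when asked
print(expected_payoff(False))   # 200.0 -- refuse when asked
```
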
Replies from: TheOtherDave
comment by TheOtherDave · 2011-02-10T15:37:57.012Z · LW(p) · GW(p)

(blink)

You're right; I'm wrong. Clearly I haven't actually been thinking carefully about the problem.

Thanks.

comment by wedrifid · 2011-02-09T18:22:12.574Z · LW(p) · GW(p)

I don't see how this is a decision problem at all.

And that is where most people make their mistake when encountering this kind of decision problem. It varies somewhat between people whether that error kicks in at standard Newcomb's, Transparent Newcomb's or some other even more abstract variant such as this.

Replies from: prase, TheOtherDave
comment by prase · 2011-02-09T19:34:54.953Z · LW(p) · GW(p)

Could you explain the error, rather than just say that it is a common error? How can I agree to pay in a situation which happens only if I was predicted to disagree?

(I don't object to having precommitted to agree to pay if Omega makes his request; that would indeed be a correct decision. But then, of course, Omega doesn't appear. The given formulation

Omega asks you to pay him $100. Do you pay?

implies that Omega really appears, which logically excludes the variant of paying. Maybe it is only a matter of formulation.)

comment by TheOtherDave · 2011-02-09T20:40:53.925Z · LW(p) · GW(p)

I'll echo prase's request. It seems to me that given that he's made the offer and I am confident of his predictions, I ought not expect to pay him. This is true regardless of what decision I make: if I decide to pay him, I ought to expect to fail.

Perhaps I'm only carrying counterfeit bills, or perhaps a windstorm will come up and blow the money out of my hands, or perhaps my wallet has already been stolen, or perhaps I'm about to have a heart attack, or whatever.

Implausible as these things are, they are far more plausible than Omega being wrong. The last thing I should consider likely is that, having decided to pay, I actually will pay.

ETA - I am apparently confused on more fundamental levels than I had previously understood, not least of which is what is being presumed about Omega in these cases. Apparently I am not presumed to be as confident of Omega's predictions as I'd thought, which makes the rest of this comment fairly irrelevant. Oops.

Replies from: wedrifid
comment by wedrifid · 2011-02-10T01:10:32.880Z · LW(p) · GW(p)

You just described the reasoning you would go through when making a decision. That would seem to be answer enough to demonstrate that this is a decision problem.

I don't see how this is a decision problem at all.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-02-10T01:18:20.036Z · LW(p) · GW(p)

Interesting.

It seems to me that if my reasoning tells me that no matter what decision I make, the same thing happens, that isn't evidence that I have a decision problem.

But perhaps I just don't understand what a decision problem is.

comment by Manfred · 2011-02-10T00:24:31.215Z · LW(p) · GW(p)

Would you agree that, given that Omega asks you, you are guaranteed by the rules of the problem to not pay him?

If you are inclined to take the (I would say) useless way out and claim it could be a simulation, consider the case where Omega makes sure the Omega in its simulation is also always right - creating an infinite tower of recursion such that the density of Omega being wrong in all simulations is 0.

Replies from: AlephNeil
comment by AlephNeil · 2011-02-10T17:55:39.800Z · LW(p) · GW(p)

If you are inclined to take the (I would say) useless way out and claim it could be a simulation,

Leaving open the question of whether Omega must work by simulating the Player, I don't understand why you say this is a 'useless way out'. So for now let's suppose Omega does simulate the Player.

consider the case where Omega makes sure the Omega in its simulation is also always right

Why would Omega choose to, or need to, ensure that in its simulation, the data received by the Player equals Omega's actual output?

There must be an answer to the question of what the Player would do if asked, by a being that it believes is Omega, to pay $100. Even if (as cousin_it may argue) the answer is "go insane after deducing a contradiction", and then perhaps fail to halt. To get around the issue of not halting, we can either stipulate that if the Player doesn't halt after a given length of time then it refuses to pay by default, or else that Omega is an oracle machine which can determine whether the Player halts (and interprets not halting as refusal to pay).

Having done the calculation, Omega acts accordingly. None of this requires Omega to simulate itself.

Replies from: Manfred
comment by Manfred · 2011-02-11T03:04:22.320Z · LW(p) · GW(p)

It's "useless" in part because, as you note, it assumes Omega works by simulating the player. But mostly it's just that it subverts the whole point of the problem; Omega is supposed to have your complete trust in its infallibility. To say "maybe it's not real" goes directly against that. The situation in which Omega simulates itself is merely a way of restoring the original intent of infallibility.

This problem is tricky; since the decision-type "pay" is associated with higher rewards, you should pay, but if you are a person Omega asks to pay, you will not pay, as a simple matter of fact. So the wording of the question has to be careful - there is a distinction between counterfactual and reality - some of the people Omega counterfactually asks will pay, none of the people Omega really asks will successfully pay. Therefore what might be seen as mere grammatical structure has a huge impact on the answer - "If asked, would you pay?" vs. "Given that Omega has asked you, will you pay?"

Replies from: wedrifid
comment by wedrifid · 2011-02-11T03:28:55.485Z · LW(p) · GW(p)

It's "useless" in part because, as you note, it assumes Omega works by simulating the player

Or, if you are thinking about it more precisely, it observes that however Omega works, it will be equivalent to Omega simulating the player. It just gives us something our intuitions can grasp at a little easier.

Replies from: Manfred
comment by Manfred · 2011-02-11T03:53:05.810Z · LW(p) · GW(p)

That's a fairly good argument - simulation or something equivalent is the most realistic thing to expect. But since Omega is already several kinds of impossible, if Omega didn't work in a way equivalent to simulating the player it would add minimally to the suspended disbelief. Heck, it might make it easier to believe, depending on the picture - "The impossible often has a kind of integrity to it which the merely improbable lacks."

Replies from: wedrifid
comment by wedrifid · 2011-02-11T04:09:38.233Z · LW(p) · GW(p)

Heck, it might make it easier to believe, depending on the picture - "The impossible often has a kind of integrity to it which the merely improbable lacks."

On the other hand sometimes the impossible is simply incomprehensible and the brain doesn't even understand what 'believing' it would mean. (Which is what my brain is doing here.) Perhaps this is because it is related behind the scenes to certain brands of 'anthropic' reasoning that I tend to reject.