# Formalizing Newcomb's

post by cousin_it · 2009-04-05T15:39:03.228Z · score: 18 (37 votes) · LW · GW · Legacy · 117 commentsThis post was inspired by taw urging us to mathematize Newcomb's problem and Eliezer telling me to post stuff I like instead of complaining.

To make Newcomb's problem more concrete we need a workable model of Omega. Let me count the ways:

1) Omega reads your decision from the future using a time loop. In this case the contents of the boxes are directly causally determined by your actions via the loop, and it's logical to one-box.

2) Omega simulates your decision algorithm. In this case the decision algorithm has indexical uncertainty on whether it's being run inside Omega or in the real world, and it's logical to one-box thus making Omega give the "real you" the million.

3) Omega "scans your brain and predicts your decision" without simulating you: calculates the FFT of your brainwaves or whatever. In this case you can intend to build an identical scanner, use it on yourself to determine what Omega predicted, and then do what you please. Hilarity ensues.

(NB: if Omega prohibits agents from using mechanical aids for self-introspection, this is in effect a restriction on how rational you're allowed to be. If so, all bets are off - this wasn't the deal.)

(Another NB: this case is distinct from 2 because it requires Omega, and thus your own scanner too, to terminate without simulating everything. A simulator Omega would go into infinite recursion if treated like this.)

4) Same as 3, but the universe only has room for one Omega, e.g. the God Almighty. Then ipso facto it cannot ever be modelled mathematically, and let's talk no more.

I guess this one is settled, folks. Any questions?

## 117 comments

Comments sorted by top scores.

Well... for whatever it's worth, the case I assume is (3).

"Rice's Theorem" prohibits Omega from doing this with all possible computations, but not with *humans*. It's probably not even all that difficult: people seem strongly attached to their opinions about Newcomb's Problem, so their actual move might not be too difficult to predict. *Any* mind that has an understandable reason for the move it finally makes, is not all that difficult to simulate at a high-level; *you* are doing it every time you *imagine* what it would do!

Omega is assumed to be in a superior position, but doesn't really need to be. I mean, I have no trouble *imagining* Omega as described - Omega figures out the decision I come to, then acts accordingly. Until I actually come to a decision, I don't know what Omega has already done - but of course my decision is simple: I take only box B. End of scenario.

If you're trying to figure out what Omega will do first - well, you're just doing that so that you can take both boxes, right? You just want to figure out what Omega does "first", and then take both boxes anyway. So Omega knows that, regardless of how much you insist that you want to compute Omega "first", and Omega leaves box B empty. You realize this and take both boxes. End of scenario again.

You may have some odd ideas left about free will. Omega can not only predict you, but probably do it without much trouble. Some humans might be able to take a pretty good guess too. Re: free will, see relevant posts, e.g. this.

But this is an ancient dilemma in decision theory (much like free will in philosophy), of which one should Google "causal decision theory", "evidential decision theory", and "Newcomblike" for enlightenment.

My strategy. I build a machine learning program that takes in half the data available about Omega and how well it predicts people who are likely to perform complex strategies, and data mines on that. If the computer program manages a high accuracy on the predicting the test set, and shows a significant chance that it will predict me to one box, then I two box.

Otherwise I one box.

Reasoning, it should be fairly obvious from this strategy that I am likely to one box, predicting Omega being hard. So if I can tell Omega is likely to predict this and I can predict Omega accurately, I'll then two box.

The goal is to try to force Omega into predicting that I will one box, while being more powerful than Omega in predictive power.

Not sure this will work, I'd like to try to do the math at some point.

My strategy. I build a machine learning program that takes in half the data available about Omega and how well it predicts people who are likely to perform complex strategies, and data mines on that. If the computer program manages a high accuracy on the predicting the test set, and shows a significant chance that it will predict me to one box, then I two box. ... The goal is to try to force Omega into predicting that I will one box, while being more powerful than Omega in predictive power.

Dunno, you'd have to pay me a lot more than $1000 to go to all that trouble. Doesn't seem rational to do all that work just to get an extra $1000 and a temporary feeling of superiority.

I dunno. I think I could make a 'machine learning program' that can predict a test set of 'every guess out of 1,000,000 was right' pretty quickly.

Aren't these rather ducking the point? The situations all seem to be assuming that we ourselves have Omega-level information and resources, in which case why do we care about the money anyway? I'd say the relevant cases are:

3b) Omega uses a scanner, *but we don't know how the scanner works* (or we'd be Omega-level entities ourselves).

5) Omega is using one of the above methods, or one we haven't thought of, but we don't know which. For all we know he could be reading the answers we gave on this blog post, and is just really good at guessing who will stick by what they say, and who won't. Unless we actually know the method with sufficient confidence to risk losing the million, we should one-box. ([Edit]: Originally wrote two-box here - I meant to say one-box)

3b) Our ignorance doesn't change the fact that, if the scanner is *in principle* repeatable, reality contains a contradiction. Type 3 is just impossible.

5) If I were in this situation, I'd assume a prior over possible Omegas that gave large weight to types 1 and 2, which means I would one-box. My prior is justified because a workable Omega of type 3 or 4 is harder for me to imagine than 1 or 2. Disagree? What would you do as a good Bayesian?

Type 3 is just impossible.

No - it just means it can't be perfect. A scanner that works 99.9999999% of the time is effectively indistinguishable from a 100% for the purpose of the problem. One that is 100% except in the presence of recursion is completely identical if we can't construct such a scanner.

My prior is justified because a workable Omega of type 3 or 4 is harder for me to imagine than 1 or 2. Disagree? What would you do as a good Bayesian?

I would one-box, but I'd do so regardless of the method being used, unless I was confident I could bluff Omega (which would generally require Omega-level resources on my part). It's just that I don't think the exact implementation Omega uses (or even whether we know the method) actually matter.

This is a good post. It explains that "given any concrete implementation of Omega, the paradox utterly disappears."

I'm quite bothered by Eliezer's lack of input to this thread. To me this seems like the most valuable thread of Newcomb's we had on OB/LW, and he's the biggest fan of the problem here, so I would have guessed he thought about it a lot, and tried some models even if they failed. Yet he didn't write anything here. Why is it so?

Because the discussion here didn't seem interesting relative to the discussions I've already read in philosophy; see the edited volume *Paradoxes of Rationality and Cooperation* or start googling on "evidential decision theory" and "causal decision theory".

I've never launched into a full-fledged discussion of Newcomb's Problem because that would quickly degenerate into a full-blown sequence in which I presented the general solution (tentatively labeled "timeless decision theory").

From my perspective this is a big, difficult, complicated, long-standing, controversial, overdetermined, elegant, solved problem, like the interpretation of quantum mechanics. Though in both cases there's a couple of leftover problems, the Born statistics for QM and some matters of mathematical representation for Newcomb, which may or may not represent a gateway to other mysteries after the original main problem has been solved.

I'll repeat yet again my standing offer to do my PhD thesis on Newcomblike problems if anyone will let me come in and *just* do a PhD thesis rather than demanding 8 years of class attendance.

Eliezer,

If what you have is good enough for a PhD thesis, you should just publish the thing as a book and then apply for a PhD based on prior work. On the other hand, there are plenty of schools with pure research degrees that will let you write a PhD without coursework (mostly in UK) but they won't likely let you in without a degree or some really impressive alternative credentials. But then, you probably have the latter.

All universities that I know of only grant PhDs based on prior work to their own previous students who've already taken a Masters there. If there is any university that just grants PhDs for sufficiently good prior work, do let me know.

For a certain definition of sufficiently good prior work, universities will grant PhDs. When I was in high school, I took a summer program at CMU and the professor Steven Rudich said that if we were to prove P=NP or P!=NP or prove it undecidable or whatever, that would be good for an instant PhD from CMU. I'm pretty sure the problem he referred to was P/NP, but it's been a while and it may have been another Millennium Problem.

So if you happen to have a proof for P/NP sitting around, let me know and I'll introduce you to Dr. Rudich.

Indeed. I'd thought De Montfort offered a PhD based on prior work, but can't seem to find a reference for it. I've also heard that the University of Luton (which would now be the University of Bedfordshire) would do them. However in either case, you'd likely need at least a bachelor's degree, so that seems like a dead end.

But maybe you can do something really impressive and get one of those 'honorary' doctorates. I hear they're as good as real ones.

Presumably the last line is sarcasm, but it's hard to tell over the Internet.

No, I was being serious. I'm pretty sure if you, say, do something Nobel Prize-worthy, someone will hop to and give you an honorary doctorate, and nobody will deny you've earned it.

Honorary doctorates are routinely handed out to random foreign dignitaries or people who donate money to colleges, and do not entitle the bearer to be called "Dr."

Kurzweil has 16 honorary doctorates plus the National Medal of Technology and he still gets written up as "Mr. Kurzweil".

Honorary doctorates are routinely handed out to random foreign dignitaries or people who donate money to colleges, and do not entitle the bearer to be called "Dr."

I wish. I'm thinking of a friend's boss, a private school headmaster, who insists on waving around his honorary doctorate as "Dr. [name]". The friend, who was teaching there, has an actual proper sweat of the brain Ph.D, and he insisted she should be addressed as "Mrs. [name]". WHAT.

Good point. At any rate, I'll keep an eye out for any doctorates by prior work from accredited schools and drop you a line.

thom: you're just wasting time suggesting this. It's been brought up on SL4 multiple times, and the people arguing like you have been ineffective each time.

I'd appreciate a short extended abstract of what you've collected (on related technical topics), without explanations, just outlining what it's about and linking to the keywords. I'm currently going through the stage of formalizing the earlier intuitions, and it looks like a huge synthesis, lots of stuff yet to learn, so some focus may be useful.

Sorry, too huge. There's a nice dissertation on the subject here: http://kops.ub.uni-konstanz.de/volltexte/2000/524/pdf/ledwig.pdf

I think I grasp this problem well enough, I'm not sure it's useful to plough through the existing philosophy at this point (am I wrong, is there something technically useful in e.g. that thesis?).

The examples of problems I was trying to figure out these last weeks is e.g. representation of preference order (lattices vs. probabilities vs. graphical models vs. other mathematical structures), relation and conversions between different representations of the state space (variables/predicates/etc.), representation of one agent by another, "agents" as efficient abstractions of regularities in the preference order, compound preferences and more global optimization resulting from cooperation of multiple agents, including the counterfactual agents and agents acting at different local areas in time/space/representation of state space, etc.

representation of preference order (lattices vs. probabilities vs. graphical models vs. other mathematical structures), relation and conversions between different representations of the state space (variables/predicates/etc.)

There's actually quite a lot of this in James Joyce's *The foundations of causal decision theory*, at what appears to me to be a gratuitiously high math level.

(5) Omega uses ordinary conjuring, or heretofore-unknown powers to put the million in the box after you make your decision. Solution: one-box for sure, no decision theory trickery needed. This would be in practice the conclusion we would come to if we encountered a being that appeared to behave like Omega, and therefore is also the answer in any scenario where we don't know the true implementation of Omega (ie any real scenario).

If the boxes are transparent, resolve to one-box iff the big box is empty.

Good! Now we have some terminology for future generations:

1) Temporal Omega 2) Simulator Omega 3) Terminating Omega 4) Singleton Omega 5) Cheating Omega

Great point about the prior, thanks.

I outlined a few more possibilities on Overcoming Bias last year:

There are many ways Omega could be doing the prediction/placement and it may well matter exactly how the problem is set up. For example, you might be deterministic and he is precalculating your choice (much like we might be able to do with an insect or computer program), or he might be using a quantum suicide method, (quantum) randomizing whether the million goes in and then destroying the world iff you pick the wrong option (This will lead to us observing him being correct 100/100 times assuming a many worlds interpretation of QM). Or he could have just got lucky with the last 100 people he tried it on.

If it is the deterministic option, then what do the counterfactuals about choosing the other box even mean? My approach is to say that 'You could choose X' means that if you had desired to choose X, then you would have. This is a standard way of understanding 'could' in a deterministic universe. Then the answer depends on how we suppose the world to be different to give you counterfactual desires. If we do it with a miracle near the moment of choice (history is the same, but then your desires change non-physically), then you ought two-box as Omega can't have predicted this. If we do it with an earlier miracle, or with a change to the initial conditions of the universe (the Tannsjo interpretation of counterfactuals) then you ought one-box as Omega would have predicted your choice. Thus, if we are understanding Omega as extrapolating your deterministic thinking, then the answer will depend on how we understand the counterfactuals. One-boxers and Two-boxers would be people who interpret the natural counterfactual in the example in different (and equally valid) ways.

If we understand it as Omega using a quantum suicide method, then the objectively right choice depends on his initial probabilities of putting the million in the box. If he does it with a 50% chance, then take just one box. There is a 50% chance the world will end either choice, but this way, in the case where it doesn't, you will have a million rather than a thousand. If, however, he uses a 99% chance of putting nothing in the box, then one-boxing has a 99% chance of destroying the world which dominates the value of the extra money, so instead two-box, take the thousand and live.

If he just got lucky a hundred times, then you are best off two-boxing.

If he time travels, then it depends on the nature of time-travel...

Thus the answer depends on key details not told to us at the outset. Some people accuse all philosophical examples (like the trolley problems) of not giving enough information, but in those cases it is fairly obvious how we are expected to fill in the details. This is not true here. I don't think the Newcomb problem has a single correct answer. The value of it is to show us the different possibilities that could lead to the situation as specified and to see how they give different answers, hopefully illuminating the topic of free-will, counterfactuals and prediction.

There's a (6) which you might consider a variant of (5): having made his best guess on whether you're going to going to one-box or two-box, Omega enforces that guess with orbital mind control lasers.

That's a creative attempt to avoid really considering Newcomb's problem; but as I suggested earlier, the noisy real-world applications are real enough to make this a question worth confronting on its own terms.

Least Convenient Possible World: Omega is type (3), and does not offer the game at all if it calculates that its answers turn out to be contradictions (as in your example above). At any rate, you're not capable of building or obtaining an accurate Omega' for your private use.

Aside: If Omega sees probability *p* that you one-box, it puts the million dollars in with probability *p*, and in either case writes *p* on a slip of paper in that box. Omega has been shown to be extremely well-calibrated, and its *p* only differs substantially from 0 or 1 in the case of the jokers who've tried using a random process to outwit it. (I always thought this would be an elegant solution to that problem; and note that the expected value of 1-boxing with probability *p* should then be 1000000*p*+1000(1-*p*).)

Yes, these are extra rules of the game. But if these restrictions make rationality impossible, then it doesn't seem human beings can be rational by your standards (as we're already being modeled fairly often in social life)— in which case, we'll take whatever Art is our best hope instead, and call *that* rationality.

So what do you do in this situation?

Eliezer has repeatedly stated in discussions of NP that Omega only cares about the outcome, not any particular "ritual of cognition". This is an essential part of the puzzle because once you start punishing agents for their reasoning you might as well go all the way: reward only irrational agents and say nyah nyah puny rationalists. Your Omega bounds how rational I can be and outright forbids thinking certain thoughts. In other words, the original raison d'etre was refining the notion of perfect rationality, whereas your formulation is about *approximations* to rationality. Well, who defines what is a good approximation and what isn't? I'm gonna one-box without explanation and call this rationality. Is this bad? By what metric?

Believe or not, I have considered the most inconvenient worlds repeatedly while writing this, or I would have had just one or two cases instead of four.

A strategy Omega uses to avoid paradox which has the effect of punishing certain rituals of cognition because they lead to paradox is different than Omega deliberately handicapping your thought process. It is not a winning strategy to pursue a line of thought that produces a paradox instead of a winning decision. I would wait until Omega forbids strategies that would otherwise win before complaining that he "bounds how rational I can be".

Maybe see it as a competition of wits. Between two agents whose personal goal is or isn't compatible. If they are not of similar capability, the one with more computational resources, and how well those resources are being used, is the one which will get its way, against the other's will if necessary. If you were "bigger" than omega, then you'd be the one to win, no matter which weird rules omega would wish to use. But omega is bigger ... by definition.

In this case, the only way for the smaller agent to succeeds is to embed his own goals into the other agent's. In practice agents aren't omniscient or omnipotent, so even an agent orders of magnitude more powerful than another, may still fail against the latter. That would become increasingly unlikely, but not totally impossible (as in, playing lotteries).

If the difference in power is even small enough, then both agents ought to cooperate and compromise, both, since in most cases that's how they can maximize their gains.

But in the end, once again, rationality is about reliably winning in as many cases as possible. In some cases, however unlikely and unnatural they may seem, it just can't be achieved. That's what optimization processes, and how powerful they are, are about. They steer the universe into very unlikely states. Including states where "rationality" is counterproductive.

Maybe see it as a competition of wits.

Yes! Where is the money? A battle of wits has begun! It ends when a box is opened.

Of course, it's so *simple*. All I have to do is divine from what I know of Omega: is it the sort of agent who would put the money in one box, or both? Now, a clever agent would put little money into only one box, because it would know that only a great fool would not reach for both. I am not a great fool, so I can *clearly* not take only one box. *But* Omega must have known I was not a great fool, and would have counted on it, so I can *clearly* not choose both boxes.

Truly, Omega must admit that I have a dizzying intellect.

On the other hand, perhaps I have confused this with something else.

My version of Omega still only cares about its prediction of your decision; it just so happens that it doesn't offer the game if it predicts "you will 2-box if and only if I predict you will 1-box", and it plays probabilistically when it predicts you decide probabilistically. It doesn't reward you for your decision algorithm, only for its outcome— even in the above cases.

Yes, I agree this is about approximations to rationality, just like Bayescraft is about approximating the ideal of Bayesian updating (impossible for us to achieve since computation is costly, among other things). I tend to think such approximations should be robust even as our limitations diminish, but that's not something I'm confident in.

Well, who defines what is a good approximation and what isn't?

A cluster in conceptspace. Better approximations should have more, not less, accurate maps of the territory and should steer higher proportions of the future into more desirable regions (with respect to our preferences).

I'm gonna one-box without explanation and call this rationality. Is this bad? By what metric?

I think "without explanation" is bad in that it fails to generalize to similar situations, which I think is the whole point. In dealing with agents who model your own decisions in advance, it's good to have a general theory of action that systematically wins against other theories.

Your fix is a kludge. I could randomize: use the detector to determine Omega's p and then use 1-p, or something like that. Give me a general description of what your Omega does, and I'll give you a contradiction in the spirit of my original post. Patch the holes all you want. Predicting the future *always* involves a contradiction, it's just more or less hard to tease out. You can't predict the future and outlaw contradictions by fiat; it is *logically impossible*. This was one of the points of my post.

Your fix is a bit of a kludge. I could randomize: use my detector to determine p, and then use 1-p. So for total consistency you should amend Omega to "protect" the value of p, and ban the agent if p is tampered with. Now it sounds bulletproof, right?

But here's the rub: the agent doesn't need a perfect replica of Omega. A half-assed one will do fine. In fact, if a certain method of introspection into your initial state allowed Omega to determine the value of p, then *any* weak attempt at introspection will give you some small but non-zero information about what p Omega detected. So every living person will fail your Omega's test. My idea with the scanner was just a way to "externalize" the introspection, making the contradiction stark and evident.

Any other ideas on how Omega should behave?

I could randomize: use my detector to determine p, and then use 1-p.

In this case, Omega figures out you would use that detector and predicts you will use 1-p. If your detector is effective, it will take into account that Omega knows about it, and will figure that Omega predicted 1-(1-p) = p. But Omega would have realized that the detector could do that. This is the beginning of an infinite recursion attempting to resolve a paradox, no different because we are using probabilities instead of Booleans. Omega recognizes this and concludes the game is not worth playing. If you and your detector are rational, you should too, and find a different strategy. (Well, Omega could predict a probability of .5 which is stable, but a strategy to take advantage of this would lead to paradox.)

Omegas of type 3 don't use simulations. If Omega is a simulator, see case 2.

...why is everybody latching on to 3? A brainwave-reading Omega is a pathetic joke that took no effort to kill. Any realistic Omega would have to be type 2 anyway.

Paradoxes show that your model is bad. My post was about defining non-contradictory models of Newcomb's problem and seeing what we can do with them.

Could you taboo "simulation" and explain what you are prohibiting Omega from doing by specifying that Omega does not use simulations? Presumably this still allows Omega to make predictions.

That one's simple: prohibit indexical uncertainty. I must be able to assume that I am in the real world, not inside Omega. So should my scanner's internal computation - if I anticipate it will be run inside Omega, I will change it accordingly.

Edit: sorry, now I see why exactly you're asked. No, I have no proof that my list of Omega types is exhaustive. There could be a middle ground between types 2 and 3: an Omega that doesn't simulate you, but still somehow prohibits you from using another Omega to cheat. But, as orthonormal's examples show, such a machine doesn't readily spring to mind.

Indexical uncertainty is a property of you, not Omega.

Saying Omega cannot create a situation in which you have indexical uncertainty is too vague. What process of cognition is prohibited to Omega that prevents producing indexical uncertainty, but still allows for making calibrated, discriminating predictions?

You're digging deep. I already admitted that my list of Omegas isn't proven to be exhaustive and probably can never be, given how crazy the individual cases sound. The thing I call a type 3 Omega should better be called a Terminating Omega, a device that outputs one bit in bounded time given any input situation. If Omega is non-terminating - e.g. it throws me out of the game on predicting certain behavior, or hangs forever on some inputs - of course such an Omega doesn't necessarily have to be a simulation. But then you need a halfway credible account of what it *does*, because otherwise the problem is unformulated and incomplete.

The process you've described (Omega realizes this, then realizes that...) sounded like a simulation - that's why I referred you to case 2. Of *course* you might have meant something I hadn't anticipated.

Part of my motivation for digging deep on this issue is that, although I did not intend for my description of Omega and the detector reasoning about each other to be based on a simulation, I could see after you brought it up that it might be interpreted that way. I thought if I knew on a more detailed level what we mean by "simulation", I would be able to tell if I had implicitly assumed that Omega was using one. However, any strategy I come up with for making predictions seems like something I could consider a simulation, though it might lack detail, and through omitting important details, be inaccurate. Even just guessing could be considered a very undetailed, very inaccurate simulation.

I would like a definition of simulation that doesn't lead to this conclusion, but in case there isn't one, suppose the restriction against simulation really means that Omega does not use a perfect simulation, and you have a chance to resolve the indexical uncertainty.

I can imagine situations in which an incomplete, though still highly accurate, simulation provides information to the simulated subject to resolve the indexical uncertainty, but this information is difficult or even impossible to interpret.

For example, suppose Omega does use a perfect simulation, except that he flips a coin. In the real world, Omega shows you the true result of the coin toss, but he simulates your response as if he shows you the opposite result. Now you still don't know if you are in a simulation or reality, but you are no longer guaranteed by determinism to make the same decision in each case. You could one box if you see heads and two box if you see tails. If you did this, you have a 50% probability that the true flip was heads, so you gain nothing, and a 50% probability that the true flip was tails and you gain $1,001,000, for an expected gain of $500,500. This is not as good as if you just one box either way and gain $1,000,000. If Omega instead flips a biased coin that shows tails 60% of the time, and tells you this, then the same strategy has an expected gain of $600,600, still not as good as complete one-boxing. But if the coin was biased to show tails 1000 times out of 1001, then the strategy expects to equal one-boxing, and it will do better for a more extreme bias.

So, if you suppose that Omega uses an imperfect simulation (without the coin), you can gather evidence about if you are in reality or the simulation. You would need to achieve a probability of greater than 1000/1001 that you are in reality before it is a good strategy to two box. I would be impressed with a strategy that could accomplish that.

As for terminating, if Omega detects a paradox, Omega puts money in box 1 with 50% probability. It is not a winning strategy to force this outcome.

It seems your probabilistic simulator Omega is amenable to rational analysis just like my case 2. In good implementations we can't cheat, in bad ones we can; it all sounds quite normal and reassuring, no trace of a paradox. Just what I aimed for.

As for terminating, we need to demystify what it means by "detecting a paradox". Does it somehow compute the actual probabilities of me choosing one or two boxes? Then what part of the world is assumed to be "random" and what part is evaluated exactly? An answer to this question might clear things up.

One way Omega might prevent paradox is by adding an arbitrary time limit, say one hour, for you to choose whether to one box or two box. Omega could then run the simulation, however accurate, up to the limit of simulated time, or when you actually make a decision, whichever comes first. Exceeding the time limit could be treated as identical to two boxing. A more sophisticated Omega that can search for a time in the simulation when you have made a decision in constant time, perhaps by having the simulation state described by a closed form function with nice algebraic properties, could simply require that you eventually make a decision. This essentially puts the burden on the subject not to create a paradox, or anything that might be mistaken for a paradox, or just take too long to decide.

Then what part of the world is assumed to be "random" and what part is evaluated exactly?

Well Omega could give you a pseudo random number generator, and agree to treat it as a probabilistic black box when making predictions. It might make sense to treat quantum decoherence as giving probabilities to observe the different macroscopic outcomes, unless something like world mangling is true and Omega can predict deterministically which worlds get mangled. Less accurate Omegas could use probability to account for their own inaccuracy.

In good implementations we can't cheat, in bad ones we can

Even better, in principal, though it would be computationally difficult, describe different simulations with different complexities and associated Occam priors, and with different probabilities of Omega making correct predictions. From this we could determine how much of a track record Omega needs before we consider one boxing a good strategy. Though I suspect actually doing this would be harder than making Omega's predictions.

I find Newcomb's problem interesting. Omega predicts accurately. This is impossible in my experience. We are not discussing a problem any of us is likely to face. However I still find discussing counter-factuals interesting.

To make Newcomb's problem more concrete we need a workable model of Omega

I do not think that is the case. Whether Omega predicts by time travel, mind-reading, or even removes money from the box by teleportation when it observes the subject taking two boxes is a separate discussion, considering laws of physics, SF, whatever. This might be quite fun, but is wholly separate from discussing Newcomb's problem itself.

I think an ability to discuss a counter-factual without having some way of relating it to Reality is a useful skill. Playing around with the problem, I think, has increased my understanding of the real World. Then the "need" to explain how a real Omega might do what Omega is described as being able to do just gets in the way.

Playing around with the problem, I think, has increased my understanding of the real World.

In what ways?

Most insights that arise from Newcomb's problem seem to me to be either phony or derivable from simpler problems that don't feature omniscient entities. Admittedly you can meditate on the logical loop forever in the illusion that it increases your understanding. Maybe the unexpected hanging paradox will help snap you out? That paradox also allows perpetual meditation until we sit down and demystify the word "surprise" into mathematical logic, exposing the problem statement as self-referential and self-contradictory. In Newcomb's problem we might just need to similarly demystify the word "predict", as I've been trying to.

All right, I found another nice illustration. Some philosophers today think that Newcomb's problem is a model of certain real-world situations. Here's a typical specimen of this idiocy, retyped verbatim from here:

*Let me describe a typical medical Newcomb problem. It has long been recognized that in people susceptible to migraine, the onset of an attack tends to follow the consumption of certain foods, including chocolate and red wine. It has usually been assumed that these foods are causal factors, in some way triggering attacks. This belief has been the source of much mental and physical anguish for those susceptible both to migraines and to the attractions of these substances. Recently however an alternative theory has come to light. It has been discovered that eating chocolate is not a cause of migraine, but a joint effect of some pre-migrainous state (or 'PMS', as we doctors say). The physiological changes that comprise PMS thus typically increase a subject's desire for chocolate, as well as leading, later, to the usual physical symptoms of migraine.*

The article goes on to suggest that, in a sufficiently freaky decision theory, abstaining from chocolate can still help. Yes, folks, *this* is the best real-world scenario they could come up with. I rest my case .

Newcomb-like problems arise when there is a causal thread passing *through your cognitive algorithm* which produces the correlation. There is no causality going through your cognitive algorithm to the migraine here. The author doesn't know what a newcomb-like problem is.

Some authors define "Newcomblike problem" as one that brings evidential and decision theory into conflict, which this does.

So... in Newcomb's problem, evidential says one-box, causal says two-box, causal clearly fails.

In Chocolate problem, evidential says avoid chocolate, causal says eat the chocolate, evidential clearly fails.

Thus neither theory is adequate.

Is that right?

I assume it's a typo: evidential vs. *causal* decision theories.

Evidential decision theory wins for the wrong reasons, and causal decision theory just fails.

But evidential actually tells you not to eat the chocolate? That's a pretty spectacular failure mode -- it seems like it could be extended to not taking your loved ones to the hospital because people tend to die there.

Yeah, that was awkwardly worded, I was only referring to Newcomb.

I assume it's a typo: evidential vs. *causal* decision theories.

In the standard Newcomb's, is the deal Omega is making explained to you before Omega makes its decision; and does the answer to my question matter?

Wikipedia says the deal is explained beforehand. It doesn't seem to matter in any of the models proposed in the post and comments, but it could conceivably matter in some other model.

NB: if Omega prohibits agents from using mechanical aids for self-introspection, this is in effect a restriction on how rational you're allowed to be. If so, all bets are off - this wasn't the deal.

Already answered above. If agents' rationality is restricted, the problem loses its original point of refining "perfect rationality" and becomes a question of approximations. Okay, my approximation: when confronted with a huge powerful agent that has a track record of 100% truth, *believe it*. I one-box and win. Who are you to tell me my approximation is bad?

Okay, my approximation: when confronted with a huge powerful agent that has a track record of 100% truth,

believe it. I one-box and win. Who are you to tell me my approximation is bad?

I don't have problems with that. But Omega doesn't tell you "take one box to win". It only tells that if you'll take one box, it placed a million in it, and if you'll take two boxes, it didn't. It doesn't tell which decision you must take, the decision is yours.

The whole thing is a test ground for decision theories. If your decision theory outputs a decision that *you* think is not the right one, then you need to work some more on that decision theory, finding a way for it to compute the decisions you approve of.

Annoyance has it right but too cryptic: it's the other way around. If your decision theory fails on this test ground but works perfectly well in the real world, maybe you need to work some more on the test ground. For now it seems I've adequately demonstrated how your available options depend on the implementation of Omega, and look not at all like the decision theories that we find effective in reality. Good sign?

Annoyance has it right but too cryptic: it's the other way around. If your decision theory fails on this test ground but works perfectly well in the real world, maybe you need to work some more on the test ground.

Not quite. The failure of a strong decision theory on a test is a reason *for you* to start doubting the adequacy of both the test problem and the decision theory. The decision to amend one or the other must always come through you, unless you already trust something else more than you trust yourself. The paradox doesn't care what you do, it is merely a building block towards better explication of what kinds of decisions you consider correct.

Woah, let's have some common sense here instead of preaching. I have good reasons to trust accepted decision theories. What reason do I have to trust Newcomb's problem? Given how much in my analysis turned out to depend on the implementation of Omega, I don't trust the thing at all anymore. Do you? Why?

You are not asked to trust anything. You have a paradox; resolve it, understand it. What do you refer to, when using the word "trust" above?

Uh, didn't I convince you that, *given any concrete implementation of Omega*, the paradox utterly disappears? Let's go at it again. What kind of Omega do you offer me?

The usual setting, you being a sufficiently simple mere human, not building your own Omegas in the process, going through the procedure in a controlled environment if that helps to get the case stronger, and Omega being able to predict your actual final decision, by whatever means it pleases. What the Omega does to predict your decision doesn't affect you, shouldn't concern you, it looks like only that it's usually right is relevant.

"What the Omega does to predict your decision doesn't affect you, shouldn't concern you, it looks like only that it's usually right is relevant."

Is this the least convenient world? What Omega does to predict my decision does concern me, because it determines whether I should one-box or two-box. However, I'm willing to allow that in a LCW, I'm not given enough information. Is this the Newcomb "problem", then -- how to make rational decision when you're not given enough information?

No perfectly rational decision theory can be applied in this case, just like you can't play chess perfectly rationally with a desktop PC. Several comments above I outlined a good approximation that I would use and recommend a computer to use. This case is just... uninteresting. It doesn't raise any question marks in my mind. It should?

Can you please explain why a rational decision theory cannot be applied?

As I understand it, perfect rationality in this scenario requires we assume some Bayesian prior over all possible implementations of Omega and do a ton of computation for each case. For example, some Omegas could be type 3 and deceivable with non-zero probability; we have to determine how. If we know which implementation we're up against, the calculations are a little easier, e.g. in the "simulating Omega" case we just one-box without thinking.

By that definition of "perfect rationality" no two perfect rationalists can exist in the same universe, or any material universe in which the amount of elapsed time before a decision is always finite.

Some assumptions allow you to play some games rationally with finite resources, like in the last sentence of my previous comment. Unfortunately we aren't given any such assumptions in Newcomb's, so I fell back to the decision procedure recommended by you: Solomonoff induction. Don't like it? Give me a workable model of Omega.

Yes, it's true. Perfectly playing any non-mathematical "real world" game (the formulation Vladimir Nesov insists on) requires great powers. If you can translate the game into maths to make it solvable, please do.

The decision theory must allow approximations, a ranking allowing to find (recognize) as good a solution as possible, given the practical limitations.

You are reasoning from the faulty assumption that "surely it's possible to formalize the problem *somehow* and do *something*". The problem statement is *self-contradictory*. We need to resolve the contradiction. It's only possible by making some part of the problem statement *false*. That's what the prior over Omegas is *for*. We've been told some bullshit, and need to determine which parts are true. Note how my Omegas of type 1 and 2 banish the paradox: in case 1 "the money is already there anyway" has become a plain simple lie, and in case 2 "Omega has already predicted your choice" becomes a lie when you're inside Omega. I say the real world doesn't have contradictions. Don't ask me to reason approximately from contradictory assumptions.

You gotta decide something, faced with the situation. It doesn't look like you argue that Newcomb's test itself literally can't be set up. So what do you mean by contradictions? The physical system itself can't be false, only its description. Whatever contradictions you perceive in the test, they come from the problems of interpretation; the only relevant part of this whole endeavor is computing the decision.

The physical system can't be false, but Omega seems to be lying to us. How do you, as a rationalist, deal when people contradict themselves verbally? You build models, like I did in the original post.

Omega doesn't lie by the statement of the problem. It doesn't even assert anything, it just places the money in the box or doesn't.

What's *wrong* with you? If Omega tells us the conditions of the experiment (about "foretelling" and stuff), then Omega is lying. If someone else, then someone else. Let's wrap this up, I'm sick.

As was pointed out numerous times, it well may be possible to foretell your actions, even by some variation on just reading this forum and looking what people claim to choose in the given situation. That you came up with specific examples that ridicule the claim of being able to predict your decision, doesn't mean that there literally is no way to do that. Another, more detailed example, is what you listed as (2) simulation approach.

some variation on just reading this forum and looking what people claim to choose in the given situation

Case 3, "terminating Omega", demonstrable contradiction.

Another, more detailed example, is what you listed as (2) simulation approach.

I already explained where a "simulator Omega" has to lie to you.

Sorry, I don't want to spend any more time on this discussion. Goodbye.

FWIW, I understand your frustration, but just as a data point I don't think this reaction is warranted, and I say that as someone who likes most of your comments. I know you made this post in order to escape the rabbit hole, but you must have expected to spend a little time there digging when you made it!

The problem setting itself shouldn't raise many questions. If you agree that the right answer in this setting is to one-box, you probably understand the test. Next, look at the popular decision theories that calculate that the "correct" answer is to two-box. Find what's wrong with those theories, or with the ways of applying them, and find a way to generalize them to handle this case and other cases correctly.

There's nothing wrong with those theories. They are wrongly applied, selectively ignoring the part of the problem statement that explicitly says you *can't* two-box if Omega decided you would one-box. Any naive application will do that because all standard theories assume causality, which is broken in this problem. Before applying decision theories we must work out what causes what. My original post was an attempt to do just that.

What other cases?

There's nothing wrong with those theories. They are wrongly applied, selectively ignoring the part of the problem statement that explicitly says you can't two-box if Omega decided you would one-box.

The decision *is* yours, Omega only foresees it. See also: Thou Art Physics.

Any naive application will do that because the problem statement is contradictory on the surface. Before applying decision theories, the contradiction has to be resolved somehow as we work out what causes what. My original post was an attempt to do just that.

Do that for the standard setting that I outlined above, instead of constructing its broken variations. What it means for something to cause something else, and how one should go about describing the situations in that model should arguably be a part of any decision theory.

the problem statement ... explicitly says you can't two-box if Omega decided you would one-box.

The decision is yours, Omega only foresees it.

These stop contradicting each other if you rephrase a little more precisely. It's not that you *can't* two-box if Omega decided you would one-box--you just *don't*, because in order for Omega to have decided that, you must have also decided that. Or rather, been going to decide that--and if I understand the post you linked correctly, its point is that the difference between "my decision" and "the predetermination of my decision" is not meaningful.

As far as I can tell--and I'm new to this topic, so please forgive me if this is a juvenile observation--the flaw in the problem is that it cannot be true both that the contents of the boxes are determined by your choice (via Omega's prediction), and that the contents have already been determined when you are making your choice. The argument for one-boxing assumes that, of those contradictory premises, the first one is true. The argument for two-boxing assumes that the second one is true.

The potential flaw in my description, in turn, is whether my simplification just now ("determined by your choice via Omega") is actually equivalent to the way it's put in the problem ("determined by Omega based on a prediction of you"). *I* think it is, for the reasons given above, but what do I know?

(I feel comfortable enough with this explanation that I'm quite confident I must be missing something.)

An aspiring Bayesian rationalist would behave like me in the original post: assume some prior over the possible implementations of Omega and work out what to do. So taboo "foresee" and propose some mechanisms as I, ciphergoth and Toby Ord did.

Why shouldn't you adjust your criteria for approval until they fit the decision theory?

Why not adjust both until you get a million dollars?

I'm liking this preference for (Zen|Socratic) responses.

Thank you. Hopefully this will be the last post about Newcomb's problem for a long time.

Even disregarding uncertainty whether you're running inside Omega or in the real world, assuming Omega is perfect #2 effectively reverses the order of decisions just like #1 - and you decide first (via simulation), omega decides second. So it collapses to a trivial one-box.

taw, I was kinda hoping you'd have some alternative formulations, having thought of it longer than me. What do you think? Is it still possible to rescue the problem?

I was mostly trying to approach it from classical decision theory side, but the results are still the same. There are three levels in the decision tree here:

- You precommit to one-box / two-box
- Omega decides 1000000 / 0. Omega is allowed to look at your precommitment
- You do one-box / two-box

If we consider precommitment to be binding, we collapse it to "you decide first, omega second, so trivial one-box" . If we consider precommitment non-binding, we collapse it to "you make throwaway decision to one-box, omage does 1000000, you two-box and get 1001000", and this "omega" has zero knowledge.

In classical decision theory you are not allowed to look at other people's precommitments, so the game with decisions taking place at any point (between start and the action) and people changing their minds on every step is mathematically equivalent to one where precommitments are binding and decided before anybody acts.

This equivalency is broken by Newcomb's problem, so precommitments and being able to break them now do matter, and people who try to use classical decision theory ignoring this will fail. Axiom broken, everybody dies.

Omega simulates your decision algorithm. In this case the decision algorithm has indexical uncertainty on whether it's being run inside Omega or in the real world, and it's logical to one-box thus making Omega give the "real you" the million.

I never thought of that!

Can you formalize "hilarity ensues" a bit more precisely?

I'd love to claim credit, but the head-slapping idea was mentioned on OB more than once, and also in the Wikipedia entry on Newcomb's Paradox.

Hilarity means we know what Omega predicted but are free to do what we like. For example, you could learn that Omega considers you a two-boxer and then one-box, earning zero money - an impressive feat considering the circumstances.

It's like a Mastercard commercial. Losing the opportunity to get a stack of money: costly. Blowing Omega's mind: priceless.

I love how the discussion here is turning out. The post had karma 1, then 0, then 1 again and there it stays; but the chat is quite lively. Maybe I shouldn't obsess over karma.

Sadly, it's impossible to distinguish a comment no one votes on from one that has equal positive and negative votes. The 'most controversial' category option helps a little bit, but not much.

My advice: don't sweat the small stuff, and remember that votes are small stuff.

Sadly, it's impossible to distinguish a comment no one votes on from one that has equal positive and negative votes.

This may get fixed later.

Omega knows that I have no patience for logical paradoxes, and will delegate my decision to a quantum coin-flipper exploiting the Conway-Kochen theorem. Hilarity ensues.

I would one-box in Newcomb's problem, but I'm not sure why Omega is more plausible than a being that rewards people that it predicts would be two-boxers. And yet it is more plausible to me.

When I associate one-boxing with cooperation, that makes it more attractive. The anti-Omega would be someone who was afraid cooperators would conspire against it, and so it rewards the opposite.

In the case of the pre-migraine state below, refraining from chocolate seems much less compelling.

4) Same as 3, but the universe only has room for one Omega, e.g. the God Almighty. Then ipso facto it cannot ever be modelled mathematically, and let's talk no more.

Why can't God Almighty be modelled mathematically?

Omega/God is running the universe on his computer. He can pause any time he wants (for example to run some calculations), and modify the "universe state" to communicate (or just put his boxes in).

That seems to be close enough to 4). Unlike with 3), you can't use the same process as Omega (pause the universe and run arbitrary calculations that could consider the state of every quark).

No God Almighty needed for your example, just an intelligence that's defined to be more powerful than you. If your computational capacity is bounded and the other player has much more, you certainly can't apply *any* perfectly rational decision concept. The problem is now about *approximation*. One approximation I've mentioned several times already is *believing* powerful agents with a 100% track record of truth. Sound reasonable? That's the level of discussion you get when you introduce bounds.

Your Omega isn't a type 3 or 4 at all, it's a type 2 with *really big* computational capacity.

What does Newcomb's Problem has to do with reality as we know it anyway? I mean, imagine that I've solved it (whatever that means). Where in my everyday life can I apply it?

Parfit's Hitchhiker, colliding futuristic civilizations, AIs with knowledge of each other's source code, whether rationalists can *in principle* cooperate on the true Prisoner's Dilemma.

Oh, hello.

Parfit's Hitchhiker

Purely about precommitment, not prediction. Precommitment has been analyzed to death by Schelling, no paradoxes there.

colliding futuristic civilizations

Pass.

AIs with knowledge of each other's source code

Rice's theorem.

whether rationalists can in principle cooperate on the true Prisoner's Dilemma

PD doesn't have mystical omniscient entities. If we try to eliminate them from Newcomb's as well, the problem evaporates. So no relation.

Rice's theorem.

You keep using that word. I do not think it means what you think it does.

Rice's theorem is evidence that Omega is likely to be type 1 or 2 rather than 3, and thus in favor of one-boxing.

This was kinda the point of the post: demonstrate the craziness and irrelevance of the problem. I just got sick of people here citing it as an important example. The easiest way to dismiss a problem like that from our collective mind is to "solve" it.

Parfit's Hitchhiker, colliding futuristic civilizations, AIs with knowledge of each other's source code, whether rationalists can *in principle* cooperate on the true Prisoner's Dilemma.

I have a very strong feeling that way 3 is not possible. It seems that any scanning/analysis procedure detailed enough to predict your actions constitutes simulating you.

I have a very strong feeling that way 3 is not possible. It seems that any scanning/analysis procedure detailed enough to predict your actions constitutes simulating you.

I predict that you will not, in the next 24 hours, choose to commit suicide.

Am I simulating you?

To complete the picture you should give smoofra adequate incentive to falsify your prediction, and then see how it goes.

You can always change the problem so that it stops making sense, or that the answer gets reversed. But this is not the point, you should seek to understand what the *intent* was as clearly as possible.

If an argument attacks your long-held belief, make the argument stronger, help it to get through. If you were right, the argument will fail, but you ought to give it the best chance you can.

Not necessarily. It could be purely empirical in nature. No insight into how the detected signals causally relate to the output is required.

I feel the same, but would have been dishonest to omit it. Even 4 sounds more likely to me than 3.