"In your problem description you said you receive the letter"
True, but the problem description also specifies subjunctive dependence between the agent and the predictor. When the predictor makes her prediction, the letter hasn't been sent yet. So the agent's decision influences whether or not she gets the letter.
"This intuition is actually false for perfect predictors."
I agree (and have written extensively on the subject). But it's the prediction the agent influences, not the presence of the termite infestation.
Given that you receive the letter, paying is indeed evidence for not having termites and winning $999,000. EDT is elegant, but still can't be correct in my view. I wish it were, and have attempted to "fix" it.
My take is this. Either you have the termite infestation, or you don't.
Say you do. Then
- being a "payer" means you never receive the letter, as both conditions are false. As you don't receive the letter, you don't actually pay, and lose the $1,000,000 in damages.
- being a "non-payer" means you get the letter, and you don't pay. You lose $1,000,000.
Say you don't. Then
- payer: you get the letter, pay $1,000. You lose $1,000.
- non-payer: you don't get the letter, and don't pay $1,000. You lose nothing.
Being a payer has the same result when you do have the termites, but is worse when you don't. So overall, it's worse. Being a payer or a non-payer only influences whether or not you get the letter, and this view is more coherent with the intuition that you can't possibly influence whether or not you have a termite infestation.
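The four cases above can be checked mechanically. This is only a minimal sketch: it assumes the predictor is perfectly accurate, takes the dollar amounts from the problem statement, and uses function names (`letter_sent`, `loss`) that are my own illustration rather than part of the original problem.

```python
DAMAGE = 1_000_000  # termite damage in dollars, from the problem statement
FEE = 1_000         # payment the letter asks for, from the problem statement


def letter_sent(termites: bool, pays_on_letter: bool) -> bool:
    """The predictor sends the letter iff exactly one of (i), (ii) holds.

    Assumption: the predictor knows the agent's policy with certainty.
    """
    cond_i = (not termites) and pays_on_letter    # rumor false, agent pays
    cond_ii = termites and (not pays_on_letter)   # rumor true, agent refuses
    return cond_i != cond_ii  # exclusive or


def loss(termites: bool, pays_on_letter: bool) -> int:
    """Total dollars lost by an agent with the given policy."""
    total = DAMAGE if termites else 0
    # The fee is only paid if the letter actually arrives.
    if letter_sent(termites, pays_on_letter) and pays_on_letter:
        total += FEE
    return total


for termites in (True, False):
    for pays in (True, False):
        print(f"termites={termites}, payer={pays}: "
              f"letter={letter_sent(termites, pays)}, loss=${loss(termites, pays):,}")
```

The policy only changes which worlds the letter shows up in; in neither termite state does the payer lose less than the non-payer.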
XOR Blackmail is (in my view) perhaps the clearest counterexample to EDT:
An agent has been alerted to a rumor that her house has a terrible termite infestation that would cost her $1,000,000 in damages. She doesn’t know whether this rumor is true. A greedy predictor with a strong reputation for honesty learns whether or not it’s true, and drafts a letter:
I know whether or not you have termites, and I have sent you this letter iff exactly one of the following is true: (i) the rumor is false, and you are going to pay me $1,000 upon receiving this letter; or (ii) the rumor is true, and you will not pay me upon receiving this letter.
The predictor then predicts what the agent would do upon receiving the letter, and sends the agent the letter iff exactly one of (i) or (ii) is true. Thus, the claim made by the letter is true. Assume the agent receives the letter. Should she pay up?
(Styling mine, not original.) EDT pays the $1,000 for nothing: it has absolutely no influence on whether or not the agent's house is infested with termites.
Alright, thanks for your answer!
Is it necessary to be able to work during all MIRI office hours, or is it enough if my hours are partially compatible? My time difference with MIRI is 9 hours, but I could work in the evening (my time) every now and then.
Btw, thanks for your comment! I edited my post with respect to fair problems.
I think this is potentially an overly strong criterion for decision theories - we should probably restrict the problems to something like a fair problem class, else we end up with no decision theory receiving any credence.
Good point, I should have mentioned that in my article. (Note that XOR Blackmail is definitely a fair problem (not that you are claiming otherwise)).
I also think "wrong answer" is doing a lot of work here.
I at least in part agree here. This is why I picked XOR Blackmail, because it has such an obvious right answer. That's an intuition, but that's also true for some of the points made in favor of The Evidentialist's Wager to begin with.
Thanks for the comment!
There is a false dichotomy in the argument basing the conclusion only on the options CDT or EDT, when in fact both are wrong.
I wouldn't say there's a false dichotomy: the argument works fine if you also have credence in e.g. FDT. It just says that altruistic, morally motivated agents should favor EDT over CDT. (However, as I have attempted to demonstrate, two premises of the argument don't hold up.)
Suppose you know instead that Omega was miserly and almost all of the people who one-box don't get offered the opportunity to play - let's say every two-boxer gets to play but only 0.01% of one-boxers. Should you still choose to 1-box if presented with the opportunity of playing?
Interesting. No, because a 0.01% probability of winning $1,000,000 gives me an expected $100, whereas two-boxing gives me $1,000.
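The arithmetic behind that comparison, as a rough sketch. It assumes the standard Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a reliable Omega; the variable names are hypothetical.

```python
from fractions import Fraction

MILLION = 1_000_000  # opaque box, standard Newcomb payoff (assumed)
THOUSAND = 1_000     # transparent box, standard Newcomb payoff (assumed)

p_play_if_one_boxer = Fraction(1, 10_000)  # 0.01%, from the quoted variant
p_play_if_two_boxer = Fraction(1)          # every two-boxer gets to play

# Expected winnings of each policy, counting "not offered the game" as $0.
ev_one_box = p_play_if_one_boxer * MILLION
ev_two_box = p_play_if_two_boxer * THOUSAND

print(f"one-boxing policy: ${ev_one_box}")
print(f"two-boxing policy: ${ev_two_box}")
```

Under these assumptions the one-boxing policy is worth $100 in expectation and the two-boxing policy $1,000.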
Your original description doesn't specify subjunctive dependence, which is a critical component of the problem.
Heighn’s response to this argument is that this is a perfectly fine prescription.
Note that omnizoid hasn't checked with me whether this is my response, and if he had, I would have asked him to specify the problem more. In my response article, I attempt to specify the problem more, and with that particular specification, I do indeed endorse FDT's decision.
I'm surprised Wei Dai thinks this is a fair point. I disagree entirely with it: FDT is a decision theory and doesn't in and of itself value anything. The values need to be given by a utility function.
Consider the Psychological Twin Prisoner's Dilemma. Given the utility function used there, the agent doesn't value the twin at all: the agent just wants to go home free as soon as possible. FDT doesn't change this: it just recognizes that the twin makes the same decision the agent does, which has bearing on the prison time the agent gets.
...which makes the Procreation case an unfair problem. It punishes FDT'ers specifically for following FDT. If we're going to punish decision theories for their identity, no decision theory is safe. It's pretty wild to me that @WolfgangSchwarz either didn't notice this or doesn't think it's a problem.
A fairer version of Procreation would be what I have called Procreation*, where your father follows the same decision theory as you (be it FDT, CDT or whatever).
Oh wait, of course, in this problem, Omega doesn't simulate the agent. Interesting! I'll have to think about this more :-)
In what ways is FDT broken?
I also wonder whether a different problem was intended.
Thanks for the link!
And hmm, it seems to me FDT one-boxes on ASP, as that gives the most utility from the subjunctive dependence perspective.
Why would Omega put $0 in the second box? The problem statement specifies Omega puts $100 in both boxes if she predicts you will two-box!
If I have a two box policy the simulated me gets $200 before deletion, and the real me gets nothing.
Wait, why does the real you get nothing? It's specified you get $200. What am I missing?
Exploitable? Please explain!
Ah, I just read your substack post on this, and you've referenced two pieces I've already reacted to (and in my view debunked) before. Seems like we could have a good debate on this :)
I would love to debate you on this. My view: there is no single known problem in which FDT makes an incorrect decision. I have thought about FDT a lot and it seems quite obviously correct to me.
Ah, so your complaint is that the author is ignoring evidence pointing to shorter timelines. I understand your position better now :)
"Insofar as your distribution has a faraway median, that means you have close to certainty that it isn't happening soon. And that, I submit, is ridiculously overconfident and epistemically unhumble."
Why? You can say a similar thing about any median anyone ever has. Why is this median in particular overconfident?
"And not only do I not expect the trained agents to not maximize the original “outer” reward signal"
Nitpick: one "not" too many?
I apologize, Said; I misinterpreted your (clearly written) comment.
Reading your newest comment, it seems I actually largely agree with you - the disagreement lies in whether farm animals have sentience.
(No edit was made to the original question.)
Thanks for your answer!
I (strongly) disagree that sentience is uniquely human. It seems to me a priori very unlikely that this would be the case, and evidence does exist to the contrary. I do agree sentience is an important factor (though I'm unsure it's the only one).
"but certainly none of the things that we (legally) do with animals are bad for any of the important reasons why torture of people is bad."
That seems very overconfident to me. What are your reasons for believing this, if I may ask? What quality or qualities do humans have that animals lack that makes you certain of this?
One-boxing on Newcomb's Problem is good news IMO. Why do you believe it's bad?
Of course! Thanks for your time.
I can, although I indeed don't think it is nonsense.
What do you think our (or specifically my) viewpoint is?
Hmm, interesting. I don't know much about UDT. From an FDT perspective, I'd say that if you're in the situation with the bomb, your decision procedure already Right-boxed and therefore you're Right-boxing again, as a logical necessity. (Making the problem very interesting.)
Sorry, I'm having trouble understanding your point here. I understand your analogy (I was a developer), but am not sure what you're drawing the analogy to.
I've been you ten years ago.
Just... no. Don't act like you know me, because you don't. I appreciate you trying to help, but this isn't the way.
Seems to me Yudkowsky was (way) too pessimistic about OpenAI there. They probably knew something like this would happen.
To explain my view more, the question I try to answer in these problems is more or less: if I were to choose a decision theory now to strictly adhere to, knowing I might run into the Bomb problem, which decision theory would I choose?
"But by the time the situation described in the OP happens, it no longer matters whether you optimize expected utility over the whole sample space; that goal is now moot."
This is what we agree on. If you're in the situation with a bomb, all that matters is the bomb.
My stance is that Left-boxers virtually never get into the situation to begin with, because of the prediction Omega makes. So with probability close to 1, they never see a bomb.
Your stance (if I understand correctly) is that the problem statement says there is a bomb, so, that's what's true with probability 1 (or almost 1).
And so I believe that's where our disagreement lies. I think Newcomblike problems are often "trick questions" that can be resolved in two ways, one leaning more towards your interpretation.
In the spirit of Vladimir's points, if I annoyed you, I do apologize. I can get quite intense in such discussions.
I see your point, although I have entertained Said's view as well. But yes, I could have done better. I tend to get like this when my argumentation is being called crazy, and I should have done better.
You could have just told me this instead of complaining about me to Said though.
I don't see how it is misleading. Achmiz asked what actually happens; it is, in virtually all possible worlds, that you live for free.
Note that it's my argumentation that's being called crazy, which is a large factor in the "antagonism" you seem to observe - a word choice I don't agree with, btw.
About the "needlessly upping the heat", I've tried this discussion from multiple different angles, seeing if we can come to a resolution. So far, no, alas, but not for lack of trying. I will admit some of my reactions were short and a bit provocative, but I neither appreciate nor agree with your accusations. I have been honest in my reactions.
Interesting. I'm having the opposite experience (due to timing, apparently), where at least it's making some sense now. I've seen it using tricks only applicable to addition and pulling numbers out of its ass, so I was surprised what it did wasn't completely wrong.
Asking the same question again even gives a completely different (but again wrong) result:
If you ask ChatGPT to multiply two 4-digit numbers it writes out the reasoning process in natural language and comes to the right answer.
People keep saying such things. Am I missing something? I asked it to calculate 1024 * 2047, and the answer isn't even close. (Though to my surprise, the first 2 steps are at least correct steps, and not nonsense. And it is actually adding the right numbers together in step 3, again, to my surprise. I've seen it perform much, much worse.)
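For reference, here is what a correct step-by-step multiplication of those two numbers looks like. It is a small sketch of schoolbook long multiplication; the helper name `long_multiply` is my own.

```python
def long_multiply(a: int, b: int) -> tuple[list[int], int]:
    """Schoolbook long multiplication of non-negative integers.

    Returns one partial product per digit of b (least significant
    digit first) together with their sum.
    """
    partials = []
    for position, digit_char in enumerate(reversed(str(b))):
        # Multiply by a single digit, then shift by its place value.
        partials.append(a * int(digit_char) * 10**position)
    return partials, sum(partials)


partials, product = long_multiply(1024, 2047)
for p in partials:
    print(f"{p:>9,}")
print(f"{product:>9,}  (sum of partial products)")
```

The partial products are 7,168, 40,960, 0 and 2,048,000, which add up to 2,096,128.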
That's what I've been saying to you: a contradiction.
And there are two ways to resolve it.
The scenario also stipulates the bomb isn't there if you Left-box.
What actually happens? Not much. You live. For free.
"So if you take the Left box, what actually, physically happens?"
You live. For free. Because the bomb was never there to begin with.
Yes, the situation does say the bomb is there. But it also says the bomb isn't there if you Left-box.
Agreed, but I think it's important to stress that it's not like you see a bomb, Left-box, and then see it disappear or something. It's just that Left-boxing means the predictor already predicted that, and the bomb was never there to begin with.
Put differently, you can only Left-box in a world where the predictor predicted you would.
I think we agree. My stance: if you Left-box, that just means the predictor predicted that with probability close to 1. From there on, there are a trillion trillion - 1 possible worlds where you live for free, and 1 where you die.
I'm not saying "You die, but that's fine, because there are possible worlds where you live". I'm saying that "you die" is a possible world, and there are way more possible worlds where you live.
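To put a number on that ratio of possible worlds, here is a minimal sketch. It reads "a trillion trillion" as 10^24 (short scale) and uses exact fractions, because an ordinary float cannot tell this probability apart from 1. The variable names are illustrative.

```python
from fractions import Fraction

WORLDS = 10**24  # a trillion trillion (short scale)

# Worlds in which a Left-boxer nevertheless faces a bomb: 1 out of WORLDS.
p_die = Fraction(1, WORLDS)
p_live_free = 1 - p_die

print(f"worlds where a Left-boxer lives for free: {p_live_free.numerator:,} of {WORLDS:,}")
print(f"as a float: {float(p_live_free)}")  # rounds to 1.0 in double precision
```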
I'm not going to make you cite anything. I know what you mean. I said Right-boxing is a consequence, given a certain resolution of the problem; I always maintained Left-boxing is the correct decision. Apparently I didn't explain myself well, that's on me. But I'm kinda done, I can't seem to get my point across (not saying it's your fault btw).
By construction it is not, because the scenario is precisely that we find ourselves in one such exceptional case; the posterior probability (having observed that we do so find ourselves) is thus ~1.
Except that we don't find ourselves there if we Left-box. But we seem to be going around in a circle.
… but you have said, in a previous post, that if you find yourself in this scenario, you Right-box. How to reconcile your apparently contradictory statements…?
Right-boxing is the necessary consequence if we assume the predictor's Right-box prediction is fixed now. So GIVEN the Right-box prediction, I apparently Right-box.
My entire point is that the prediction is NOT a given. I Left-box, and thus change the prediction to Left-box.
I have made no contradictory statements. I am and always have been saying that Left-boxing is the correct decision to resolve this dilemma.