Anthropic reasoning and correlated decision making
post by Stuart_Armstrong · 2009-09-23T15:10:41.114Z · LW · GW · Legacy · 14 comments
There seems to be some confusion on how to deal with correlated decision making - such as with absent-minded drivers and multiple copies of yourself; any situation in which many agents will all reach the same decision. Building on Nick Bostrom's division-of-responsibility principle mentioned in Outlawing Anthropics, I propose the following correlated decision principle:
CDP: If you are part of a group of N individuals whose decision is perfectly correlated, then you should reason as if you had a 1/N chance of being the dictator of the group (in which case your decision is applied to all) and an (N-1)/N chance of being a dictatee (in which case your decision is ignored).
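To make the principle concrete, here is a minimal Python sketch (mine, not part of the original post; the cdp_value helper is purely illustrative). Each possible world contributes its global gain, weighted by your credence in that world and your 1/N chance of being its dictator; the dictatee branch contributes nothing, since your decision is then ignored.

```python
def cdp_value(worlds):
    """Expected gain from a decision under the CDP (illustrative sketch).

    worlds: iterable of (credence, group_size, global_gain) triples, where
    credence is your probability of being in that world (e.g. via the SIA),
    group_size is N, and global_gain is the group's total gain if the
    decision is taken. With probability 1/N you are the dictator and your
    decision produces global_gain; with probability (N-1)/N you are a
    dictatee and your decision changes nothing (contributing 0).
    """
    return sum(credence * gain / n for credence, n, gain in worlds)
```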
What justification could there be for this principle? A simple thought experiment: imagine you are one of N individuals who each have to make a decision in secret. One of the decisions is opened at random, the others are discarded, and each person has their mind modified to believe that the opened decision was in fact the one they made. This process is called a "dictator filter".
If you apply this dictator filter to any situation S, then in "S + dictator filter", you should reason as in the CDP. If you apply it to perfectly correlated decision making, however, then the dictator filter changes no one's decision at all - hence we should treat "perfectly correlated" as isomorphic to "perfectly correlated + dictator filter", which establishes the CDP.
Used alongside the SIA, this solves many puzzles on this blog, without needing advanced decision theory.
For instance, the situation in Outlawing Anthropics is simple: the SIA implies the 90% view, giving you a 90% chance of being in a group of 18 and a 10% chance of being in a group of two. You were then offered a deal in which $3 is taken from each red room and $1 is given to each green room. The initial expected gain from accepting the deal was -$20; the problem was that, once you woke up in a green room, you seemed far more likely to be in the group of 18, giving an expected gain of +$5.60. The CDP cancels out this effect, returning you to an expected individual gain of -$2, and a global expected gain of -$20.
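Plugging in these numbers (a quick sanity check using the illustrative cdp_value sketch above; the figures are those from the post):

```python
# Waking in a green room, SIA: 90% the coin gave 18 green rooms, 10% it gave 2.
# Global gain if the deal is accepted:
#   18 green, 2 red:  18*1 - 2*3  = +12
#    2 green, 18 red:  2*1 - 18*3 = -52
naive_sia  = 0.9 * 12 + 0.1 * (-52)                     # +5.6, the paradoxical post-waking gain
cdp        = cdp_value([(0.9, 18, 12), (0.1, 2, -52)])  # -2.0, CDP individual gain
global_exp = 0.5 * 12 + 0.5 * (-52)                     # -20.0, the pre-waking expected gain
print(naive_sia, cdp, global_exp)
```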
The Absent-Minded Driver problem is even more interesting, and requires subtler reasoning. The SIA implies that if your probability of continuing is p, then the chance that you are at the first intersection is 1/(1+p), while the chance that you are at the second is p/(1+p). Using these numbers, it appears that your expected gain is [p² + 4(1-p)p + p(p + 4(1-p))]/(1+p), which is 2[p² + 4(1-p)p]/(1+p).
If you were the dictator, deciding the behaviour at both intersections, your expected gain would be (1+p) times this amount, since the driver at the first intersection exists with probability 1, while the one at the second exists with probability p. Since there are N=2 individuals, the CDP thus cancels both the 2 and the (1+p) factors, returning the situation to the expected gain of p² + 4(1-p)p, maximised at p = 2/3.
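The correction can be checked numerically (a minimal sketch, assuming the standard payoffs: exiting at the first intersection pays 0, exiting at the second pays 4, continuing through both pays 1):

```python
def sia_gain(p):
    # Per-awakening SIA expectation: at X with probability 1/(1+p), at Y with p/(1+p).
    at_x = p * p * 1 + p * (1 - p) * 4   # from X: continue twice, or continue then exit
    at_y = p * 1 + (1 - p) * 4           # from Y: continue, or exit
    return (at_x + p * at_y) / (1 + p)

def cdp_gain(p):
    # Dictator gain is (1+p) * sia_gain(p); dividing by N = 2 recovers the
    # planning-stage objective p^2 + 4p(1-p).
    return sia_gain(p) * (1 + p) / 2

best = max((i / 1000 for i in range(1001)), key=cdp_gain)
print(best)  # ~0.667, i.e. p = 2/3
```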
The CDP also solves the issues in my old Sleeping Beauty problem.
14 comments
Comments sorted by top scores.
comment by Wei Dai (Wei_Dai) · 2009-09-23T20:17:02.977Z · LW(p) · GW(p)
Using these numbers, it appears that your expected gain is [p² + 4(1-p)p + p(p+4(1-p))]/(2-p)
Do you mean "1+p" instead of "2-p" at the end there? If not, where does "2-p" come from?
Since there are N=2 individuals, the CDP thus cancels both the 2 and the (1+p) factors
Why do you say that (N=2), since the number of individuals is actually random? If you EXIT at X, then the individual at Y doesn't exist, right?
Do you think CDP can be formalized sufficiently so that it can be applied mechanically after transforming a decision problem into some formal representation (like a decision tree, or world program as in UDT1)? The way it is stated now, it seems too ambiguous to say what is the solution to a given problem under CDP.
↑ comment by Stuart_Armstrong · 2009-09-24T06:28:40.069Z · LW(p) · GW(p)
Do you mean "1+p" instead of "2-p" at the end there? If not, where does "2-p" come from?
Thanks, error corrected (I mixed up p with 1-p).
Why do you say that (N=2), since the number of individuals is actually random? If you EXIT at X, then the individual at Y doesn't exist, right?
Because the number of individuals is exactly two - in that you have a certain probability of being either individual. The second may not exist, but the probability of being the second is non-zero.
But I admit this is not fully rigorous; more examples are needed.
Do you think CDP can be formalized sufficiently so that it can be applied mechanically after transforming a decision problem into some formal representation (like a decision tree, or world program as in UDT1)?
I believe it can be formalized sufficiently; so far, every seeming paradox I've met has eventually fallen to this type of reasoning. However, more work needs to be done; in particular, one puzzle remains: why does the CDP give you the total expectation for the absent-minded driver, while for Eliezer's problem it gives you the individual expectation?
↑ comment by Stuart_Armstrong · 2009-09-24T20:11:34.074Z · LW(p) · GW(p)
I now think I've got a formalisation that works; I'll put it up in a subsequent post.
↑ comment by casebash · 2016-04-16T08:12:14.175Z · LW(p) · GW(p)
Was this ever written up?
↑ comment by Stuart_Armstrong · 2016-04-18T14:02:27.789Z · LW(p) · GW(p)
This is the most current version: https://www.youtube.com/watch?v=aiGOGkBiWEo
comment by casebash · 2016-04-16T08:52:42.835Z · LW(p) · GW(p)
I was recently reading Outlawing Anthropics and I thought of a very similar technique (that one random person would be given a button that would change what everyone did). I think that it is a shame that this post didn't receive much attention given that it seems to resolve these problems rather effectively.
There probably could have been a bit more justification for this argument, beyond the fact that it works. I think a reasonable argument is to note that we can either hold the group's choices fixed and ask whether an individual would want to change their choice given those fixed choices, or give an individual the ability to change everyone's choice (i.e. be the dictator) and ask whether they'd want to change their choice then. The problem in Outlawing Anthropics is that it mixes and matches - it gives the decision to multiple individuals, but calculates the individual benefit of the decision as though each were solely responsible for the choice, and so it double-counts.
comment by wedrifid · 2009-09-24T03:26:13.521Z · LW(p) · GW(p)
There seems to be some confusion on how to deal with correlated decision making - such as with absent-minded drivers and multiple copies of yourself; any situation in which many agents will all reach the same decision.
I found this description misleading. It immediately brought to mind examples of games where a correlation of another agent's decision to my own provides me with the opportunity to exploit them.
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-09-23T18:13:46.972Z · LW(p) · GW(p)
Sleeping Beauty link is malformed.
↑ comment by Stuart_Armstrong · 2009-09-24T06:30:46.892Z · LW(p) · GW(p)
Fixed
comment by SilasBarta · 2009-09-23T17:38:35.128Z · LW(p) · GW(p)
I didn't check your application of CDP to the problems, but I think you erred at the beginning:
If you are part of a group of N individuals whose decision is perfectly correlated, then you should reason as if you had a 1/N chance of being the dictator of the group (in which case your decision is applied to all) and a (N-1)/N chance of being a dictatee (in which case your decision is ignored).
If your decisions are perfectly correlated (I assume that means they're all the same), then you are deciding for the group, because you make the same decision as everyone else. So you should treat it as a 100% chance of being the dictator of the group.
If you apply this dictator filter to any situation S, then in "S + dictator filter", you should reason as in the CDP. If you apply it to perfectly correlated decision making, however, then the dictator filter changes no one's decision at all - hence we should treat "perfectly correlated" as isomorphic to "perfectly correlated + dictator filter", which establishes the CDP.
Wouldn't it also mean that we should treat "perfectly correlated" as isomorphic to "uncorrelated + dictator filter", since you always believe your vote determined the outcome?
(Again, I don't know how this affects your application of it.)
By the way, how would your application of CDP/SIA to the Absent-minded Driver problem take into account additional evidence fed to you about what intersection you're at? (Say, someone shows you something that amplifies the odds of being at X by a factor of [aka has a Bayes factor/likelihood ratio of] L.)
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-09-23T18:14:44.063Z · LW(p) · GW(p)
If your decisions are perfectly correlated (I assume that means they're all the same), then you are deciding for the group
He's proposing a way of thinking about implementing Bostrom's division-of-responsibility principle.
↑ comment by Stuart_Armstrong · 2009-09-24T20:13:42.514Z · LW(p) · GW(p)
By the way, how would your application of CDP/SIA to the Absent-minded Driver problem take into account additional evidence fed to you about what intersection you're at? (Say, someone shows you something that amplifies the odds of being at X by a factor of [aka has a Bayes factor/likelihood ratio of] L.)
I think my extended set-up can deal with that, too - I'll write it up in a subsequent post.
↑ comment by SilasBarta · 2009-09-24T20:16:34.167Z · LW(p) · GW(p)
Okay. Just so you know, I attacked that problem, though not with your method, and presented the solution here. You will probably need to read this to understand the variables.
↑ comment by Stuart_Armstrong · 2009-09-24T06:37:47.492Z · LW(p) · GW(p)
Wouldn't it also mean that we should treat "perfectly correlated" as isomorphic to "uncorrelated + dictator filter", since you always believe your vote determined the outcome?
Not ahead of time - you know, ahead of time, that your vote has only one chance in N of determining the result (if you want, drop the "rewriting people's memories" part of the dictator filter; it's not really needed).
By the way, how would your application of CDP/SIA to the Absent-minded Driver problem take into account additional evidence fed to you about what intersection you're at?
Let me think about that for a bit.