Comment by mallah on A case study in fooling oneself · 2012-04-18T21:25:57.615Z · LW · GW

Mitchell, you are on to an important point: Observers must be well-defined.

Worlds are not well-defined, and there is no definite number of worlds (given standard physics).

You may be interested in my proposed Many Computations Interpretation, in which observers are identified not with so-called 'worlds' but with implementations of computations:

See my blog for further discussion:

Comment by mallah on The Social Coprocessor Model · 2010-05-19T14:35:13.539Z · LW · GW

I wasn't sneaky about it.

Comment by mallah on The Social Coprocessor Model · 2010-05-18T15:48:57.916Z · LW · GW

I don't think I got visibly hurt or angry. In fact, when I did it, I was feeling more tempted than angry. I was in the middle of a conversation with another guy, and her rear appeared nearby, and I couldn't resist.

It made me seem like a jerk, which is bad, but not necessarily low status. Acting without apparent fear of the consequences, even stupidly, is often respected as long as you get away with it.

Another factor is that this was a 'high status' woman. I'm not sure but she might be related to a celebrity. (I didn't know that at the time.) Hence, any story linking me and her may be 'bad publicity' for me but there is the old saying 'there's no such thing as bad publicity'.

Comment by mallah on The Social Coprocessor Model · 2010-05-17T14:21:04.900Z · LW · GW

It was a single swat to the buttocks, done in full sight of everyone. There was other ass-spanking going on, between people who knew each other - done as a joke - so in context it was not so unusual. I would not have done it outside of that context, nor would I have done it if my inhibitions had not been lowered by alcohol; nor would I do it again even if they are.

Yes, she deserved it!

It was a mistake. Why? It exposed me to more risk than was worthwhile, and while I might have hoped that (aside from simple punishment) it would teach her the lesson that she ought to follow the Golden Rule, or at least should not pull the same tricks on guys, in retrospect it was unlikely to do so.

Other people (that I have talked to) seem to be divided on whether it was a good thing to do or not.

Comment by mallah on The Social Coprocessor Model · 2010-05-15T22:33:01.916Z · LW · GW

Women seem to have a strong urge to check out what shoes a man has on, and judge their quality. Even they can't explain it. Perhaps at some unconscious level, they are guarding against men who 'cheat' by wearing high heels.

Comment by mallah on The Social Coprocessor Model · 2010-05-15T21:38:54.975Z · LW · GW

I can confirm that this does happen at least sometimes (USA). I was at a bar, and I approached a woman who is probably considered attractive by many (skinny, bottle blonde) and started talking to her. She soon asked me to buy her a drink. Being not well versed in such matters, I agreed, and asked her what she wanted. She named an expensive wine, which I agreed to get her a glass of. She largely ignored me thereafter, and didn't even bother taking the drink!

(I did obtain some measure of revenge later that night by spanking her rear end hard, though I do not advise doing such things. She was not amused and her brother threatened me, though as I had apologized, that was the end of it. She did tell some other lies so I don't know if she is neurotypical; my impression was that she was well below average in morality, being a spoiled brat.)

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-18T16:35:54.756Z · LW · GW

But Stuart_Armstrong's description is asking us to condition on the camera showing 'you' surviving.

That condition imposes post-selection.

I guess it doesn't matter much if we agree on what the probabilities are for the pre-selection v. the post-selection case.

Wrong - it matters a lot because you are using the wrong probabilities for the survivor (in practice this affects things like belief in the Doomsday argument).

I believe the strong law of large numbers implies that the relative frequency converges almost surely to p as the number of Bernoulli trials becomes arbitrarily large. As p represents the 'one-shot probability,' this justifies interpreting the relative frequency in the infinite limit as the 'one-shot probability.'

You have things backwards. The "relative frequency in the infinite limit" can be defined that way (sort of, as the infinite limit is not actually doable) and is then equal to the pre-defined probability p for each shot if they are independent trials. You can't go the other way; we don't have any infinite sequences to examine, so we can't get p from them, we have to start out with it. It's true that if we have a large but finite sequence, we can guess that p is "probably" close to our ratio of finite outcomes, but that's just Bayesian updating given our prior distribution on likely values of p. Also, in the 1-shot case at hand, it is crucial that there is only the 1 shot.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-18T16:22:02.045Z · LW · GW

It is only possible to fairly "test" beliefs when a related objective probability is agreed upon

That's wrong; behavioral tests (properly set up) can reveal what people really believe, bypassing talk of probabilities.

Would you really guess "red", or do we agree?

Under the strict conditions above and the other conditions I have outlined (long-time-after, no other observers in the multiverse besides the prisoners), then sure, I'd be a fool not to guess red.

But I wouldn't recommend it to others, because if there are more people, that would only happen in the blue case. This is a case in which the number of observers depends on the unknown, so maximizing expected average utility (which is appropriate for decision theory for a given observer) is not the same as maximizing expected total utility (appropriate for a class of observers).

More tellingly, once I find out the result (and obviously the result becomes known when I get paid or punished), if it is red, I would not be surprised. (Could be either, 50% chance.)

Now that I've answered your question, it's time for you to answer mine: What would you vote, given that the majority of votes determines what SB gets? If you really believe you are probably in a blue room, it seems to me that you should vote blue; and it seems obvious that would be irrational.

Then if you find out it was red, would you be surprised?

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-16T16:06:44.887Z · LW · GW

The way you set up the decision is not a fair test of belief, because the stakes are more like $1.50 to $99.

To fix that, we need to make 2 changes:

1) Let us give any reward/punishment to a third party we care about, e.g. SB.

2) The total reward/punishment she gets won't depend on the number of people who make the decision. Instead, we will poll all of the survivors from all trials and pool the results (or we can pick 1 survivor at random, but let's do it the first way).

The majority decides what guess to use, on the principle of one man, one vote. That is surely what we want from our theory - for the majority of observers to guess optimally.

Under these rules, if I know it's the 1-shot case, I should guess red, since the chance is 50% and the payoff to SB is larger. Surely you see that SB would prefer us to guess red in this case.

OTOH if I know it's the multi-shot case, the majority will probably be blue, so I should guess blue.

In practice, of course, it will be the multi-shot case. The universe (and even the population of Earth) is large; besides, I believe in the MWI of QM.

The practical significance of the distinction has nothing to do with casino-style gambling. It is more that 1) it shows that the MWI can give different predictions from a single-world theory, and 2) it disproves the SIA.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-16T15:46:25.697Z · LW · GW

If that were the case, the camera might show the person being killed; indeed, that is 50% likely.

Pre-selection is not the same as our case of post-selection. My calculation shows the difference it makes.

Now, if the fraction of observers of each type that are killed is the same, the difference between the two selections cancels out. That is what tends to happen in the many-shot case, and we can then replace probabilities with relative frequencies. One-shot probability is not relative frequency.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-15T20:39:59.262Z · LW · GW

No, it shouldn't - that's the point. Why would you think it should?

Note that I am already taking observer-counting into account - among observers that actually exist in each coin-outcome-scenario. Hence the fact that P(heads) approaches 1/3 in the many-shot case.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-15T20:38:42.166Z · LW · GW

Adding that condition is post-selection.

Note that "If you (being asked before the killing) will survive, what color is your door likely to be?" is very different from "Given that you did already survive, ...?". A member of the population to which the first of these applies might not survive. This changes the result. It's the difference between pre-selection and post-selection.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-15T18:39:48.754Z · LW · GW

This subtly differs from Bostrom's description, which says 'When she awakes on Monday', rather than 'Monday or Tuesday.'

He makes clear though that she doesn't know which day it is, so his description is equivalent. He should have written it more clearly, since it can be misleading on the first pass through his paper, but if you read it carefully you should be OK.

So on average ...

'On average' gives you the many-shot case, by definition.

In the 1-shot case, there is a 50% chance she wakes up once (heads), and a 50% chance she wakes up twice (tails). They don't both happen.

In the 2-shot case, the four possibilities are as I listed. Now there is both uncertainty in what really happens objectively (the four possible coin results), and then given the real situation, relevant uncertainty about which of the real person-wakeups is the one she's experiencing (upon which her coin result can depend).

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-15T18:07:54.621Z · LW · GW

The 'selection' I have in mind is the selection, at the beginning of the scenario, of the person designated by 'you' and 'your' in the scenario's description.

If 'you' were selected at the beginning, then you might not have survived.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-15T18:00:50.565Z · LW · GW

There are always 2 coin flips, and the results are not known to SB. I can't guess what you mean, but I think you need to reread Bostrom's paper.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-14T16:00:22.656Z · LW · GW

Under a frequentist interpretation

In the 1-shot case, the whole concept of a frequentist interpretation makes no sense. Frequentist thinking invokes the many-shot case.

Reading Bostrom's explanation of the SB problem, and interpreting 'what should her credence be that the coin will fall heads?' as a question asking the relative frequency of the coin coming up heads, it seems to me that the answer is 1/2 however many times Sleeping Beauty's later woken up: the fair coin will always be tossed after she awakes on Monday, and a fair coin's probability of coming up heads is 1/2.

I am surprised you think so because you seem stuck in many-shot thinking, which gives 1/3.

Maybe you are asking the wrong question. The question is, given that she wakes up on Monday or Tuesday and doesn't know which, what is her credence that the coin actually fell heads? Obviously in the many-shot case, she will be woken up twice as often during experiments where it fell tails, so in 2/3 of her wakeups the coin will be tails.

In the 1-shot case that is not true, either she wakes up once (heads) or twice (tails) with 50% chance of either.

Consider the 2-shot case. Then we have 4 possibilities:

  • coins , days , fraction of actual wakeups where it's heads
  • HH , M M , 1
  • HT , M M T , 1/3
  • TH , M T M , 1/3
  • TT , M T M T , 0

Now P(heads) = (1 + 1/3 + 1/3 + 0) / 4 = 5/12 = 0.417

Obviously as the number of trials increases, P(heads) will approach 1/3.
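The N-shot version of this calculation can be sketched in a few lines of Python. This is only an illustration of the averaging used above, under the same assumption that she is the only observer: each of the 2^N coin sequences is equally likely, heads gives 1 wakeup, tails gives 2, and we average the fraction of wakeups at which the coin is heads. The function name `p_heads` is my own label, not anything from Bostrom's paper.

```python
from fractions import Fraction
from math import comb


def p_heads(n_trials: int) -> Fraction:
    """Average, over all equally likely coin sequences of length n_trials,
    of the fraction of actual wakeups at which the coin is heads."""
    total = Fraction(0)
    for heads in range(n_trials + 1):
        tails = n_trials - heads
        wakeups = heads + 2 * tails  # heads: 1 wakeup, tails: 2 wakeups
        fraction_heads = Fraction(heads, wakeups)
        # comb(n_trials, heads) sequences share this number of heads
        total += comb(n_trials, heads) * fraction_heads
    return total / 2 ** n_trials


for n in (1, 2, 10, 100):
    print(n, float(p_heads(n)))
```

For N = 1 this gives 1/2 and for N = 2 it gives 5/12, matching the table above; for larger N the value drifts down towards 1/3.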

This is assuming that she is the only observer and that the experiments are her whole life, BTW.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-14T15:28:20.436Z · LW · GW

A few minutes later, it is announced that whoever was to be killed has been killed. What are your odds of being blue-doored now?

Presumably you heard the announcement.

This is post-selection, because pre-selection would have been "Either you are dead, or you hear that whoever was to be killed has been killed. What are your odds of being blue-doored now?"

The 1-shot case (which I think you are using to refer to situation B in Stuart_Armstrong's top-level post...?) describes a situation defined to have multiple possible outcomes, but there's only one outcome to the question 'what is pi's millionth bit?'

There's only one outcome in the 1-shot case.

The fact that there are multiple "possible" outcomes is irrelevant - all that means is that, like in the math case, you don't have knowledge of which outcome it is.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-13T04:04:50.357Z · LW · GW

I think talking about 'observers' might be muddling the issue here.

That's probably why you don't understand the result; it is an anthropic selection effect. See my reply to Academician above.

We could talk instead about creatures that don't understand the experiment, and the result would be the same. Say we have two Petri dishes, one dish containing a single bacterium, and the other containing a trillion. We randomly select one of the bacteria (representing me in the original door experiment) to stain with a dye. We flip a coin: if it's heads, we kill the lone bacterium, otherwise we put the trillion-bacteria dish into an autoclave and kill all of those bacteria. Given that the stained bacterium survives the process, it is far more likely that it was in the trillion-bacteria dish, so it is far more likely that the coin came up heads.

That is not an analogous experiment. Typical survivors are not pre-selected individuals; they are post-selected, from the pool of survivors only. The analogous experiment would be to choose one of the surviving bacteria after the killing and then stain it. To stain it before the killing risks it not being a survivor, and that can't happen in the case of anthropic selection among survivors.

I don't think of the pi digit process as equivalent.

That's because you erroneously believe that your frequency interpretation works. The math problem has only one answer, which makes it a perfect analogy for the 1-shot case.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-13T03:31:42.576Z · LW · GW

Given that others seem to be using it to get the right answer, consider that you may rightfully believe SIA is wrong because you have a different interpretation of it, which happens to be wrong.

Huh? I haven't been using the SIA, I have been attacking it by deriving the right answer from general considerations (that is, P(tails) = 1/2 for the 1-shot case in the long-time-after limit) and noting that the SIA is inconsistent with it. The result of the SIA is well known - in this case, 0.01; I don't think anyone disputes that.

P(R|KS) = P(R|K)·P(S|RK)/P(S|K) = 0.01·(0.5)/(0.5) = 0.01

If you still think this is wrong, and you want to be prudent about the truth, try finding which term in the equation (1) is incorrect and which possible-observer count makes it so.

Dead men make no observations. The equation you gave is fine for before the killing (for guessing what color you will be if you survive), not for after (when the set of observers is no longer the same).

So, if you are after the killing, you can only be one of the living observers. This is an anthropic selection effect. If you want to simulate it using an outside 'observer' (who we will have to assume is not in the reference class; perhaps an unconscious computer), the equivalent would be interviewing the survivors.

The computer will interview all of the survivors. So in the 1-shot case, there is a 50% chance it asks the red door survivor, and a 50% chance it talks to the 99 blue door ones. They all get an interview because all survivors make observations and we want to make it an equivalent situation. So if you get interviewed, there is a 50% chance that you are the red door one, and a 50% chance you are one of the blue door ones.

Note that if the computer were to interview just one survivor at random in either case, then being interviewed would be strong evidence of being the red one, because if the 99 blue ones are the survivors you'd just have a 1 in 99 chance of being picked. P(red) > P(blue). This modified case shows the power of selection.

Of course, we can consider intermediate cases in which N of the blue survivors would be interviewed; then P(blue) approaches 50% as N approaches 99.
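The intermediate cases can be sketched as a simple update. This is my own rough formalization of the argument above, not a standard result: it assumes a 50/50 chance for each kind of survivor before the interview, that the red survivor is always interviewed, and that each blue survivor is interviewed with chance N/99. The function name is hypothetical.

```python
from fractions import Fraction


def p_red_given_interviewed(n_blue_interviewed: int, n_blue: int = 99) -> Fraction:
    """Chance of being the red-door survivor, given that you were interviewed,
    when n_blue_interviewed of the n_blue blue survivors would be interviewed."""
    prior_red = Fraction(1, 2)   # 1-shot case: the blue observers were killed
    prior_blue = Fraction(1, 2)  # 1-shot case: the red observer was killed
    likelihood_red = Fraction(1)  # the lone red survivor is always interviewed
    likelihood_blue = Fraction(n_blue_interviewed, n_blue)
    evidence = prior_red * likelihood_red + prior_blue * likelihood_blue
    return prior_red * likelihood_red / evidence


for n in (1, 10, 50, 99):
    p_red = p_red_given_interviewed(n)
    print(n, float(p_red), float(1 - p_red))
```

With N = 1 being interviewed makes red 99% likely, and as N approaches 99 both colors come out at 50%.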

The analogous modified MWI case would be for it to interview both the red survivor and one of the blue ones; of course, each survivor has half the original measure. In this case, being interviewed would provide no evidence of being the red one, because now you'd have a 1% chance of being the red one and the same chance of being the blue interviewee. The MWI version (or equivalently, many runs of the experiment, which may be anywhere in the multiverse) negates the selection effect.

If you are having trouble following my explanations, maybe you'd prefer to see what Nick Bostrom has to say. This paper talks about the equivalent Sleeping Beauty problem. The main interesting part is near the end where he talks about his own take on it. He correctly deduces that the probability for the 1-shot case is 1/2, and for the many-shot case it approaches 1/3 (for the SB problem). I disagree with his 'hybrid model' but it is pretty easy to ignore that part for now.

Also of interest is this paper which correctly discusses the difference between single-world and MWI interpretations of QM in terms of anthropic selection effects.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-07T17:52:39.409Z · LW · GW

BTW, whoever is knocking down my karma, knock it off. I don't downvote anything I disagree with, just ones I judge to be of low quality. By chasing me off you are degrading the Less Wrong site as well as hiding below threshold the comments of those arguing with me who you presumably agree with. If you have something to say then say it, don't downvote.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-07T17:50:08.041Z · LW · GW

Actually, if we consider that you could have been an observer-moment either before or after the killing, finding yourself to be after it does increase your subjective probability that fewer observers were killed. However, this effect goes away if the amount of time before the killing was very short compared to the time afterwards, since you'd probably find yourself afterwards in either case; and the case we're really interested in, the SIA, is the limit when the time before goes to 0.

I just wanted to follow up on this remark I made. There is a subtle anthropic selection effect that I didn't include in my original analysis. As we will see, the result I derived applies if the time after is long enough, as in the SIA limit.

Let the amount of time before the killing be T1, and after (until all observers die), T2. So if there were no killing, P(after) = T2/(T2+T1). It is the ratio of the total measure of observer-moments after the killing divided by the total (after + before).

If the 1 red observer is killed (heads), then P(after|heads) = 99 T2 / (99 T2 + 100 T1)

If the 99 blue observers are killed (tails), then P(after|tails) = 1 T2 / (1 T2 + 100 T1)

P(after) = P(after|heads) P(heads) + P(after|tails) P(tails)

For example, if T1 = T2, we get P(after|heads) = 0.497, P(after|tails) = 0.0099, and P(after) = 0.497 (0.5) + 0.0099 (0.5) = 0.254

So here P(tails|after) = P(after|tails) P(tails) / P(after) = 0.0099 (.5) / (0.254) = 0.0195, or about 2%. So here we can be 98% confident to be blue observers if we are after the killing. Note, it is not 99%.

Now, in the relevant-to-SIA limit T2 >> T1, we get P(after|heads) ~ 1, P(after|tails) ~1, and P(after) ~1.

In this limit P(tails|after) = P(after|tails) P(tails) / P(after) ~ P(tails) = 0.5
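For anyone who wants to check the numbers, here is a minimal Python sketch of the formulas above. The function name `p_tails_given_after` and the default fair-coin prior are my own choices for illustration; the counts (1 red, 99 blue, 100 observers before the killing) are the ones from the scenario.

```python
def p_tails_given_after(t1: float, t2: float, p_heads: float = 0.5) -> float:
    """P(tails | you find yourself after the killing), where t1 is the time
    before the killing and t2 the time after it."""
    p_tails = 1.0 - p_heads
    # heads: the red observer is killed, 99 blue observers live on for t2
    p_after_given_heads = 99 * t2 / (99 * t2 + 100 * t1)
    # tails: the 99 blue observers are killed, 1 red observer lives on for t2
    p_after_given_tails = 1 * t2 / (1 * t2 + 100 * t1)
    p_after = p_after_given_heads * p_heads + p_after_given_tails * p_tails
    return p_after_given_tails * p_tails / p_after


print(p_tails_given_after(1.0, 1.0))   # T1 = T2
print(p_tails_given_after(1.0, 1e9))   # T2 >> T1
```

With T1 = T2 this reproduces the roughly 2% figure, and as T2 grows relative to T1 the value climbs towards 0.5.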

So the SIA is false.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-07T16:10:36.205Z · LW · GW

I omitted the "|before" for brevity, as is customary in Bayes' theorem.

That is not correct. The prior that is customary in using Bayes' theorem is the one which applies in the absence of additional information, not before an event that changes the numbers of observers.

For example, suppose we know that x=1,2,or 3. Our prior assigns 1/3 probability to each, so P(1) = 1/3. Then we find out "x is odd", so we update, getting P(1|odd) = 1/2. That is the standard use of Bayes' theorem, in which only our information changes.
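As a sketch of that standard kind of update, where only our information changes, the x = 1, 2, 3 example looks like this in Python. The helper name `update` is just an illustrative label.

```python
from fractions import Fraction


def update(prior, is_consistent):
    """Drop the hypotheses ruled out by the new information, then renormalize."""
    kept = {x: p for x, p in prior.items() if is_consistent(x)}
    total = sum(kept.values())
    return {x: p / total for x, p in kept.items()}


prior = {x: Fraction(1, 3) for x in (1, 2, 3)}
posterior = update(prior, lambda x: x % 2 == 1)  # we learn "x is odd"
print(posterior)
```

The set of hypotheses shrinks, but nothing about who is doing the observing changes.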

OTOH, suppose that before time T there are 99 red door observers and 1 blue door one, and after time T, there is 1 red door one and 99 blue door ones. Suppose also that there is the same amount of lifetime before and after T. If we don't know what time it is, clearly P(red) = 1/2. That's what P(red) means. If we know that it's before T, then update on that info, we get P(red|before)=0.99.
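A minimal sketch of that observer-moment count, assuming each observer contributes measure in proportion to how long they exist. The function names and the unit durations are illustrative assumptions of mine.

```python
from fractions import Fraction


def p_red(red_before, blue_before, red_after, blue_after, t_before=1, t_after=1):
    """Fraction of all observer-moments that belong to red door observers."""
    red_moments = red_before * t_before + red_after * t_after
    all_moments = ((red_before + blue_before) * t_before
                   + (red_after + blue_after) * t_after)
    return Fraction(red_moments, all_moments)


def p_red_given_before(red_before, blue_before):
    """Same count restricted to observer-moments before time T."""
    return Fraction(red_before, red_before + blue_before)


print(p_red(99, 1, 1, 99))         # no information about the time
print(p_red_given_before(99, 1))   # known to be before T
```

Changing `t_before` and `t_after` shows directly how P(red) depends on the amount of time before T as compared to after.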

Note the distinction: "before an event" is not the same thing as "in the absence of information". In practice, often it is equivalent because we only learn info about the outcome after the event and because the number of observers stays constant. That makes it easy for people to get confused in cases where that no longer applies.

Now, suppose we ask a different question. Like in the case we were considering, the coin will be flipped and red or blue door observers will be killed; and it's a one-shot deal. But now, there will be a time delay after the coin has been flipped but before any observers are killed. Suppose we know that we are such observers after the flip but before the killing.

During this time, what is P(red|after flip & before killing)? In this case, all 100 observers are still alive, so there are 99 blue door ones and 1 red door one, so it is 0.01. That case presents no problems for your intuition, because it doesn't involve changes in the #'s of observers. It's what you get with just an info update.

Then the killing occurs. Either 1 red observer is killed, or 99 blue observers are killed. Either outcome is equally likely.

In the actual resulting world, there is only one kind of observer left, so we can't do an observer count to find the probabilities like we could in the many-worlds case (and as cupholder's diagram would suggest). Whichever kind of observer is left, you can only be that kind, so you learn nothing about what the coin result was.

Actually, if we consider that you could have been an observer-moment either before or after the killing, finding yourself to be after it does increase your subjective probability that fewer observers were killed. However, this effect goes away if the amount of time before the killing was very short compared to the time afterwards, since you'd probably find yourself afterwards in either case; and the case we're really interested in, the SIA, is the limit when the time before goes to 0.

See here

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-07T13:35:55.120Z · LW · GW


That is an excellent illustration ... of the many-worlds (or many-trials) case. Frequentist counting works fine for repeated situations.

The one-shot case requires Bayesian thinking, not frequentist. The answer I gave is the correct one, because observers do not gain any information about whether the coin was heads or tails. The number of observers that see each result is not the same, but the only observers that actually see any result afterwards are the ones in either heads-world or tails-world; you can't count them all as if they all exist.

It would probably be easier for you to understand an equivalent situation: instead of a coin flip, we will use the 1 millionth digit of pi in binary notation. There is only one actual answer, but assume we don't have the math skills and resources to calculate it, so we use Bayesian subjective probability.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-04-07T00:43:32.495Z · LW · GW


Why do I get the feeling you're shouting, Academician? Let's not get into that kind of contest. Now here's why you're wrong:

P(red|before) =0.01 is not equal to P(red).

P(red) would be the probability of being in a red room given no information about whether the killing has occurred; i.e. no information about what time it is.

The killing is not just an information update; it's a change in the # and proportions of observers.

Since (as I proved) P(red|after) = 0.5, while P(red|before) =0.01, that means that P(red) will depend on how much time there is before as compared to after.

That also means that P(after) depends on the amount of time before as compared to after. That should be fairly clear. Without any killings or change in # of observers, if there is twice as much time after an event X as before, then P(after X) = 2/3. That's the fraction of observer-moments that are after X.

Comment by mallah on Anthropic answers to logical uncertainties? · 2010-04-06T19:33:25.207Z · LW · GW

the justification for reasoning anthropically is that the set Ω of observers in your reference class maximizes its combined winnings on bets if all members of Ω reason anthropically

That is a justification for it, yes.

When most of the members of Ω arise from merely non-actual possible worlds, this reasoning is defensible.

Roko, on what do you base that statement? Non-actual observers do not participate in bets.

The SIA is not an example of anthropic reasoning; anthropic implies observers, not "non-actual observers".

See this post for an example of the difference, showing why the SIA is false.

Comment by mallah on NYC Rationalist Community · 2010-03-31T18:01:33.747Z · LW · GW

Sounds cool. I'm from NYC, but no longer live there. I was a member of atheist clubs in college, but I'd bet that post-college (or any, really) rationalists have a hard time meeting others of similar views.

Comment by mallah on Disambiguating Doom · 2010-03-31T03:59:27.522Z · LW · GW

I am very skeptical about SIA

Rightly so, since the SIA is false.

The Doomsday argument is correct as far as it goes, though my view of the most likely filter is environmental degradation + AI will have problems.

Comment by mallah on It's not like anything to be a bat · 2010-03-30T17:15:31.943Z · LW · GW

Another reason I wouldn't put any stock in the idea that animals aren't conscious is that the complexity cost of a model in which we are and they (other animals with complex brains) are not is many bits of information. 20 bits gives a prior probability factor of 10^-6 (2^-20). I'd say that would outweigh the larger # of animals, even if you were to include the animals in the reference class.

Comment by mallah on It's not like anything to be a bat · 2010-03-30T16:28:01.437Z · LW · GW

That kind of anthropic reasoning is only useful in the context of comparing hypotheses, Bayesian style. Conditional probabilities matter only if they are different given different models.

For most possible models of physics, e.g. X and Y, P(Finn|X) = P(Finn|Y). Thus, that particular piece of info is not very useful for distinguishing models for physics.

OTOH, P(21st century|X) may be >> P(21st century|Y). So anthropic reasoning is useful in that case.

As for the reference class, "people asking these kinds of questions" is probably the best choice. Thus I wouldn't put any stock in the idea that animals aren't conscious.

Comment by mallah on Avoiding doomsday: a "proof" of the self-indication assumption · 2010-03-30T03:34:33.621Z · LW · GW

A - A hundred people are created in a hundred rooms. Room 1 has a red door (on the outside), the outsides of all other doors are blue. You wake up in a room, fully aware of these facts; what probability should you put on being inside a room with a blue door?

Here, the probability is certainly 99%.


B - same as before, but an hour after you wake up, it is announced that a coin will be flipped, and if it comes up heads, the guy behind the red door will be killed, and if it comes up tails, everyone behind a blue door will be killed. A few minutes later, it is announced that whoever was to be killed has been killed. What are your odds of being blue-doored now?

There should be no difference from A; since your odds of dying are exactly fifty-fifty whether you are blue-doored or red-doored, your probability estimate should not change upon being updated.

Wrong. Your epistemic situation is no longer the same after the announcement.

In a single-run (one-small-world) scenario, the coin has a 50% chance to come up tails or heads. (In a MWI or large universe with similar situations, it would come up both, which changes the results. The MWI predictions match yours but don't back the SIA). Here I assume the single-run case.

The prior for the coin result is 0.5 for heads, 0.5 for tails.

Before the killing, P(red|heads) = P(red|tails) = 0.01 and P(blue|heads) = P(blue|tails) = 0.99. So far we agree.

P(red|before) = 0.5 (0.01) + 0.5 (0.01) = 0.01

Afterwards, P'(red|heads) = 0, P'(red|tails) = 1, P'(blue|heads) = 1, P'(blue|tails) = 0.

P(red|after) = 0.5 (0) + 0.5 (1) = 0.5

So after the killing, you should expect either color door to be 50% likely.

This, of course, is exactly what the SIA denies. The SIA is obviously false.

So why does the result seem counterintuitive? Because in practice, and certainly when we evolved and were trained, single-shot situations didn't occur.

So let's look at the MWI case. Heads and tails both occur, but each with 50% of the original measure.

Before the killing, we again have P(heads) =P(tails) = 0.5

and P(red|heads) = P(red|tails) = 0.01 and P(blue|heads) = P(blue|tails) = 0.99.

Afterwards, P'(red|heads) = 0, P'(red|tails) = 1, P'(blue|heads) = 1, P'(blue|tails) = 0.

Huh? Didn't I say it was different? It sure is, because afterwards, we no longer have P(heads) = P(tails) = 0.5. On the contrary, most of the conscious measure (# of people) now resides behind the blue doors. We now have for the effective probabilities P(heads) = 0.99, P(tails) = 0.01.

P(red|after) = 0.99 (0) + 0.01 (1) = 0.01
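The two calculations can be put side by side in a short Python sketch. This only restates the arithmetic above under the same assumptions (1 red observer, 99 blue, fair coin, heads kills red, tails kills blue); the function names and the `branch_weight` parameter are my own labels.

```python
def p_red_single_run(p_heads: float = 0.5) -> float:
    """Single-run case: only one outcome is real, so weight by the coin prior."""
    p_red_given_heads = 0.0  # heads: the red observer was killed
    p_red_given_tails = 1.0  # tails: only the red observer is left
    return p_heads * p_red_given_heads + (1.0 - p_heads) * p_red_given_tails


def p_red_mwi(n_red: int = 1, n_blue: int = 99, branch_weight: float = 0.5) -> float:
    """MWI case: both branches exist, so weight by surviving conscious measure."""
    measure_blue = branch_weight * n_blue  # survivors in the heads branch
    measure_red = branch_weight * n_red    # survivor in the tails branch
    return measure_red / (measure_red + measure_blue)


print(p_red_single_run())  # 0.5
print(p_red_mwi())
```

The only difference between the two functions is what gets normalized: the coin prior in the single-run case, the surviving measure in the MWI case.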

Comment by mallah on The I-Less Eye · 2010-03-30T02:46:05.526Z · LW · GW

rwallace, nice reductio ad absurdum of what I will call the Subjective Probability Anticipation Fallacy (SPAF). It is somewhat important because the SPAF seems much like, and may be the cause of, the Quantum Immortality Fallacy (QIF).

You are on the right track. What you are missing though is an account of how to deal properly with anthropic reasoning, probability, and decisions. For that see my paper on the 'Quantum Immortality' fallacy. I also explain it concisely on my blog on Meaning of Probability in an MWI.

Basically, personal identity is not fundamental. For practical purposes, there are various kinds of effective probabilities. There is no actual randomness involved.

It is a mistake to work with 'probabilities' directly. Because the sum is always normalized to 1, 'probabilities' deal (in part) with global information, but people easily forget that and think of them as local. The proper quantity to use is measure, which is the amount of consciousness that each type of observer has, such that effective probability is proportional to measure (by summing over the branches and normalizing). It is important to remember that total measure need not be conserved as a function of time.

As for the bottom line: If there are 100 copies, they all have equal measure, and for all practical purposes have equal effective probability.

Comment by mallah on The mathematical universe: the map that is the territory · 2010-03-30T00:31:26.289Z · LW · GW

Interesting. Do you know of a place on the net where I can see what other (independent, mathematically knowledgeable) people have to say about its implications? It's asking for a lot maybe, but I think that would be the most efficient way for me to gain info about it, if there is one.

Comment by mallah on The mathematical universe: the map that is the territory · 2010-03-30T00:19:38.101Z · LW · GW

Your first argument seems to say that if someone simulated universe A a thousand times and then simulated universe B once, and you knew only that you were in one of those simulations, then you'd expect to be in universe A.

That's right, Nisan (all else being equal, such as A and B having the same # of observers).
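
As a rough sketch of the counting involved (the function name and the observers-per-run values are hypothetical placeholders, standing in for "all else being equal"):

```python
from fractions import Fraction

def p_in_universe(runs, observers_per_run, target):
    # Weight each universe by (independent runs) * (observers per run),
    # then normalize over all universes.
    weights = {u: runs[u] * observers_per_run[u] for u in runs}
    return Fraction(weights[target], sum(weights.values()))

runs = {"A": 1000, "B": 1}
observers = {"A": 1, "B": 1}  # placeholder: same number of observers in each

print(p_in_universe(runs, observers, "A"))  # 1000/1001
```

Only independent runs are meant to count here; mirrored copies of one run would not add to the weight.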

I don't see why your prior should assign equal probabilities to all instances of simulation rather than assigning equal probabilities to all computationally distinct simulations.

In the latter case, at least in a large enough universe (or quantum MWI, or the Everything), the prior probability of being a Boltzmann brain (not a product of Darwinian evolution) would be nearly 1, since most distinct brain types are. We are not BBs (perhaps not prior info, but certainly info we have) so we must reject that method.

What if you run a simulation of universe A on a computer whose memory is mirrored a thousand times on back-up hard disks? ... Does this count as a thousand copies of you?

No. That is not a case of independent implementations, so it just has the measure of a single A.

As for wavefunction amplitudes, I don't see why that should have anything to do with the number of instantiations of a simulation.

A similar argument applies - more amplitude means more measure, or we would probably be BB's. Also, in the Turing machine version of the Tegmarkian everything, that could only be explained by more copies.

For an argument that even in the regular MWI, more amplitude means more implementations (copies), as well as discussion of what exactly counts as an implementation of a computation, see my paper


Comment by mallah on Newcomb's problem happened to me · 2010-03-26T19:49:04.454Z · LW · GW

It's not a Newcomb problem. It's a problem of how much his promises mean.

Either he created a large enough cost to leaving if he is unhappy, in that he would have to break his promise, to justify his belief that he won't leave; or, he did not. If he did, he doesn't have the option to "take both" and get the utility from both because that would incur the cost. (Breaking his promise would have negative utility to him in and of itself.) It sounds like that's what ended up happening. If he did not, he doesn't have the option to propose sincerely, since he knows it's not true that he will surely not leave.

Comment by mallah on The mathematical universe: the map that is the territory · 2010-03-26T15:17:34.938Z · LW · GW

Ata, there are many things wrong with your ideas. (Hopefully saying that doesn't put you off - you want to become less wrong, I assume.)

it is more difficult to get to the point where it actually seems convincing and intuitively correct, until you independently invent it for yourself

I have indeed independently invented the "all math exists" idea myself, years ago. I used to believe it was almost certainly true. I have since downgraded its likelihood of being true to more like 50% as it has intractable problems.

If it saved a copy of the universe at the beginning of your life and repeatedly ran the simulation from there until your death (if any), would it mean anything to say that you are experiencing your life multiple times?

Of course. (Well, it might be better to say that multiple guys like you are experiencing their own lives.)

Otherwise, it would mean that all types of people have the same measure of consciousness. Thus, for example, the fact that people who seem to be products of Darwinian evolution are more numerous would mean nothing - they are more numerous in terms of copies, not in terms of types, so the typical observer would not be one. So more copies = more measure. A similar argument applies to high measure terms in the quantum wavefunction. None of these considerations change if we assume that all math structures exist.

how about if we’re being simulated by zero computers?

You assume that this would make no difference to our consciousness, but you don't actually present any argument for that. You just assert it in the post. So I would have to say that your argument - being nonexistent - has zero credibility. That doesn't mean that your conclusion must be false, just that your argument provides no evidence in favor of it. The measure argument shows that your conclusion is false - though with the caveat that Platonic computers might count as real enough to simulate us. So let's continue.

By Occam’s Razor, I conclude that if a universe can exist in this way — as one giant subjunctive — then we must accept that that is how and why our universe does exist

So you are abandoning the question of "Why does anything exist?" in favor of just accepting that it does, which is what you warned against doing in the first place.

If all math must exist in a strong Platonic sense, then obviously, it does. If it merely can so exist as far as we know, or OTOH might not, then we have no answer as to why anything exists. "Nothing exists" would seem to be the simplest thing that might have been true, if we had no evidence otherwise.

That said, "everything exists" is prima facie simpler than "something exists" so, given that at least something exists, Occam's Razor suggests that everything exists. Hence my interest in it.

There's a problem, though.

If every possible mathematical structure is real in the same way that this universe is, then isn’t there only an infinitesimal probability that this universe will turn out to be ruled entirely by simple regularities?

Good question. There is an argument based on Turing machines that the simplest programs (i.e. laws of physics) have more measure, because a random string is more likely to have a short segment at the beginning that works well and then a random section of 'don't care' bits, as opposed to needing a long string that all works as part of the program. So if we run all TM programs Platonically, simpler "laws of physics" have more measure, possibly resulting in universes like ours being typical. Great, right?
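
A small sketch of the counting behind that argument, assuming uniformly random bit strings and treating a program as a fixed prefix followed by 'don't care' bits (the program lengths below are made up for illustration):

```python
from fractions import Fraction

def prefix_measure(program_bits: int) -> Fraction:
    # Fraction of uniformly random bit strings that begin with one
    # specific program of this length; all later bits are "don't care".
    return Fraction(1, 2 ** program_bits)

short_bits, long_bits = 10, 30  # hypothetical program lengths in bits

# How much more measure the shorter program gets than the longer one.
ratio = prefix_measure(short_bits) / prefix_measure(long_bits)
print(ratio)  # 1048576
```

Each extra bit a program needs halves its measure, so a 20-bit difference in length is a factor of about a million.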

But there are problems with this. First, there are many possible TMs that could run such programs. We need to choose one - but such a choice contradicts the "inevitable" nature that Platonism is supposed to have. So why not just use all of them? There are infinitely many, so there is no unique measure to use for them. Any choice we can make of how to run them all is inevitably arbitrary, and thus, we are back to "something" rather than "everything". We can have a very "big" something, since all programs do run, but it's still something - some nonzero information that pure math doesn't know anything about.

That's just TMs, but there's no reason other types of math structures such as continuous functions shouldn't exist, and we don't even have the equivalent of a TM to put a measure distribution on them.

I don't know for sure that there isn't some natural measure, but if there is I don't think we can know about it. Maybe I'm overlooking some selection effect that makes things work without arbitrariness.

Ok, so suppose we ignore the arbitrariness problem. The resulting 'everything' might not be Platonism, but at least it would be a high level and fairly simple theory of physics. Does the TM measure in fact predict a universe like ours?

I don't know. If we select a fairly simple TM, then in practice the differences resulting from the choice of TM are negligible. But we still have the Boltzmann brain question. I don't know if a BB is typical in such an ensemble or not. At least that is a question that can be studied mathematically.

Comment by mallah on The "show, don't tell" nature of argument · 2010-03-25T16:02:31.329Z · LW · GW

I agree that a claim of sound reasoning methodology is easy to fake, and the writer could easily be mistaken. So it's very weak evidence. However, it's not no evidence, because if the writer had said "my belief in X is based on faith" that would probably decrease your trust in his conclusions compared to those of someone who didn't make any claims about their methods.

Comment by mallah on The two insights of materialism · 2010-03-24T16:11:07.460Z · LW · GW

Academician, what you are explicitly not saying is that the aspects of reality that give rise to consciousness can be described mathematically. Well, parts of your post seem to imply that the mathematically describable functions are what matter, but other parts deny it. So it's confusing, rather than enlightening. But I'll take you at your word that you are not just a reductionist.

So you are a "monist" but, as David Chalmers has described such positions, in the spirit of dualism. As far as I am concerned, you are a dualist, because the only interesting distinction I see is between mathematically describable reality vs. non-MD reality - and your "monism" has aspects of both.

Your argument seems to be that monism is simpler than dualism, so Occam's Razor prefers it, so we should believe it. Hence, you define the stuff the world is made of as "whatever I am" and call it one kind of stuff.

I don't see that as a useful approach, because what I want to know is whether MD stuff is enough, or whether we need something more, where 'something more' is explicitly mental-related. Remember, we want the simplest explanation that fits the evidence. So the question reduces to "Does an MD-only world fit the evidence from subjective experience?" That's a hard question.

I am planning to write a post on the hard problem at some point, which I'll post on my blog and here.

Comment by mallah on Scott Aaronson on Born Probabilities · 2010-03-15T04:06:53.213Z · LW · GW

Wei, the relationship between computing power and the probability rule is interesting, but doesn't do much to explain Born's rule.

In the context of a many worlds interpretation, which I have to assume you are using since you write of splitting, it is a mistake to work with probabilities directly. Because the sum is always normalized to 1, probabilities deal (in part) with global information about the multiverse, but people easily forget that and think of them as local. The proper quantity to use is measure, which is the amount of consciousness that each type of observer has, such that effective probability is proportional to measure (by summing over the branches and normalizing). It is important to remember that total measure need not be conserved as a function of time.

So for the Ebborian example, if measure is proportional to the thickness squared, the fact that the probability of a slice can go up or down, depending purely on what happens to other slices that it otherwise would have nothing to do with, is neither surprising nor counterintuitive. The measure, of course, would not be affected by what the other slices do. It is just like saying that if the population of China were to increase, and other countries had constant population, then the effective probability that a typical person is American would decrease.
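
A minimal sketch of that point, assuming measure is thickness squared and using made-up integer thicknesses for two slices:

```python
from fractions import Fraction

def measures(thickness):
    # Assumed rule from the example: measure = thickness squared.
    return {name: t * t for name, t in thickness.items()}

def effective_probability(thickness, name):
    # Effective probability = this slice's measure / total measure.
    m = measures(thickness)
    return Fraction(m[name], sum(m.values()))

before = {"slice_1": 1, "slice_2": 2}  # hypothetical thicknesses
after = {"slice_1": 1, "slice_2": 3}   # only slice_2 has changed

# The measure of slice_1 is untouched by what slice_2 does...
print(measures(before)["slice_1"], measures(after)["slice_1"])  # 1 1
# ...but its effective probability drops, purely through normalization.
print(effective_probability(before, "slice_1"))  # 1/5
print(effective_probability(after, "slice_1"))   # 1/10
```

The measure is the local quantity; the probability changes only because the normalizing total is global.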

The second point is that, even supposing that quantum computers could solve hard math problems in polynomial time, your claim that intelligence would have little evolutionary value is both utterly far-fetched (quantum computers are hard to make, and nonlinear ones could be even harder) and irrelevant if we believe - as typical Everettians do - that the Born rule is not a separate rule but must follow from the wave equation. Even supposing intelligence required the Born rule, that would just tell us that the Born rule is true - but we already know that. The question is, why would it follow from the wave equation? If the Born rule is a separate rule, that suggests dualism or hidden variables, which bring in other possibilities for probability rules.

Actually there are already many other possibilities for probability rules. A lot of people, when trying to derive the Born rule, start out assuming that probabilities depend only on branch amplitudes. We know that seems true, but not why, so we can't start out assuming it. For example, probabilities could have been proportional to brain size.

These issues are discussed in my eprints, e.g. Decision Theory is a Red Herring for the Many Worlds Interpretation

Comment by mallah on The Moral Status of Independent Identical Copies · 2009-12-01T19:38:55.453Z · LW · GW

"our intuition of identical copy immortality"

Speak for yourself - I have no such intuition.

Comment by mallah on MWI, weird quantum experiments and future-directed continuity of conscious experience · 2009-12-01T19:32:48.144Z · LW · GW

Supposedly "we get the intuition that in a copying scenario, killing all but one of the copies simply shifts the route that my worldline of conscious experience takes from one copy to another"? That, of course, is a completely wrong intuition which I feel no attraction to whatsoever. Killing one does nothing to increase consciousness in the others.

See "Many-Worlds Interpretations Can Not Imply 'Quantum Immortality'"