Bayesian Doomsday Argument

post by DanielLC · 2010-10-17T22:14:17.440Z · LW · GW · Legacy · 16 comments


First, in case you don't already know it, the frequentist Doomsday Argument:

There's some number of total humans. There's a 95% chance that you come after the first 5% of them. There have been about 60 to 120 billion people so far, so there's a 95% chance that the total will be less than 1.2 to 2.4 trillion.

I've modified it to be Bayesian.

First, find the priors:

Do you think it's possible that the total number of sentients who have ever lived or will ever live is less than a googolplex? I'm not asking whether you're certain, or even whether you think it's likely. Is it more likely than one in infinity? I think it is, too. If so, the prior must be normalizable: an improper prior would assign zero probability to any finite range like that one.

If we take P(T=n) ∝ 1/n, where T is the total number of people, it can't be normalized: 1/1 + 1/2 + 1/3 + ... diverges. If the prior falls off faster than that (say, like 1/n^(1+ε)), it can at least be normalized. So we can treat 1/n as a limiting case, an upper bound on how slowly the prior can fall off.

Of course, that only constrains the upper tail, so maybe it's not a very good argument on its own. Here's another one:

We're not so much dealing with lives as life-years. A year is a pretty arbitrary unit, so we'd expect the distribution to look much the same if we measured in, say, days instead. The 1/n distribution is the one with that property: rescaling the unit doesn't change it.
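
Here is a small numeric illustration of both points, not a proof: the sense in which 1/n treats every order of magnitude alike (which is what the change-of-units argument is gesturing at), and the fact that it can't be normalized while anything falling faster can. The decade boundaries below are arbitrary choices of mine.

```python
import math

def mass_1_over_n(lo, hi):
    # Unnormalized probability mass of [lo, hi] under p(n) proportional to 1/n.
    return math.log(hi / lo)

def mass_1_over_n2(lo, hi):
    # Unnormalized probability mass of [lo, hi] under p(n) proportional to 1/n^2.
    return 1.0 / lo - 1.0 / hi

for decade in range(11, 16):  # totals from 10^11 up to 10^16 people
    lo, hi = 10.0 ** decade, 10.0 ** (decade + 1)
    print(decade, round(mass_1_over_n(lo, hi), 3), mass_1_over_n2(lo, hi))
```

Under 1/n every decade carries the same weight (ln 10), no matter what unit n is counted in, so an infinite run of decades diverges; under 1/n^2 the weights shrink by a factor of ten per decade, so the total converges.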

After that, the setup:

T = total number of people
U = your number (you are person number U)

Prior: P(T=n) ∝ 1/n
Likelihood: P(U=m|T=n) ∝ 1/n for m ≤ n, and 0 otherwise

By Bayes' theorem,

P(T=n|U=m) = P(U=m|T=n) * P(T=n) / P(U=m) ∝ 1/n^2

Treating n as continuous and integrating the tail,

P(T>n|U=m) = ∫_n^∞ P(T=x|U=m) dx ∝ 1/n

The constant is fixed by requiring P(T>m|U=m) = 1 (the total is at least your own number), which gives

P(T>n|U=m) = m/n

So, the probability that there will be more than a trillion people in total, given that there have been 100 billion so far, is 1/10.
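
A minimal numeric check of that result. The cutoff N_MAX below is my own assumption, introduced only so the improper 1/n prior becomes a proper finite distribution; the tail probability should approach m/n as the cutoff grows.

```python
N_MAX = 10**6          # hypothetical cap on the total number of people
m = 100                # suppose you are person number m

# Unnormalized posterior over T = n: prior (1/n) times likelihood (1/n), for n >= m.
posterior = {n: 1.0 / n**2 for n in range(m, N_MAX + 1)}
Z = sum(posterior.values())

def p_total_exceeds(n):
    """P(T > n | U = m), from the discrete posterior tail."""
    return sum(p for total, p in posterior.items() if total > n) / Z

for n in [2 * m, 10 * m, 100 * m]:
    print(n, round(p_total_exceeds(n), 4), "vs m/n =", m / n)
```

With N_MAX = 10^6 the printed values come out within a couple of percent of m/n.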

There are still a few issues with this. It assumes P(U=m|T=n) ∝ 1/n, which seems to make sense: if there are a million people, there's a one-in-a-million chance of being the 268,547th. But if there are also a trillion sentient animals, the chance of being person number m won't change much whether there are a million or a billion people in total. There are a few ways I can amend this.

First: let a be the number of sentient animals, and take P(U=m|T=n) ∝ 1/(a+n). This would make the end result P(T>n|U=m) = (m+a)/(n+a). (A quick numeric comparison of this with the plain m/n result follows the third option below.)

Second: Just replace every mention of people with sentients.

Third: Take this as a prediction of the number of sentients other than humans who have lived so far.
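
To make the first option concrete, here is the formula with some made-up numbers; the animal count a is pure assumption, chosen only to show the size of the effect.

```python
# Comparing the plain m/n result with the first option's (m+a)/(n+a).
m = 100e9    # humans so far
n = 1e12     # candidate total number of humans
a = 10e12    # hypothetical number of sentient animals (assumed, not from the post)

print("people only:  P(T > n) =", m / n)              # 0.1
print("with animals: P(T > n) =", (m + a) / (n + a))  # about 0.92
```

With a large enough animal population the correction dominates: the chance of there being more than a trillion humans jumps from 10% to about 92%.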

The first would work well if we could estimate the number of sentient animals without knowing how many humans there will be. Assuming we don't take the time to terraform every planet we come across, this should work okay.

The second would work well if we did terraform every planet we came across.

The third seems a bit weird. It gives a smaller answer than the other two, smaller even than what you'd expect for animals alone, because it folds in a Doomsday Argument against animals being sentient at all. You can work that part out separately: just let T be the total number of humans and U the total number of animals. Unfortunately, you then have to know the total number of humans to work out how many animals are sentient, and vice versa. As such, the combined argument may be more useful. It won't tell you how many of the denizens of the planets we colonise will be animals, but I don't think it's actually possible to tell that.

One more thing: you have more information. You have a lifetime of evidence, some of which bears on these predictions. The lifetime of humanity isn't obvious. We might make it to the heat death of the universe, or we might just kill each other off in a nuclear or biological war in a few decades. We also might be annihilated by a paperclipper somewhere in between. As such, I don't think the evidence on that front is very strong.

The evidence about animals is stronger. Emotions aren't exclusive to intelligent beings, and it doesn't seem animals would have to be all that intelligent to be sentient. Even so, how sure can you really be? This is much more subjective than the doomsday part, and the evidence against their sentience is staggering; I think so, anyway. How many animals are there at each level of intelligence?

Also, there's the prior for the total human population so far. I've read estimates varying between 60 and 120 billion. I don't think a factor of two really matters much for this discussion.

So, what can we use for these priors?

Another issue is that this is for all of space and time, not just Earth.

Consider that you're the mth person (or sentient) in the lineage of a given planet. Let l(m) be the number of planets with a lineage of at least m people, N the total number of people ever, n the number on the average planet, and p the number of planets.

The chance of being the mth person of some planet is then

l(m)/N = l(m)/(n*p) = (l(m)/p) / n

l(m)/p is the portion of planets whose lineage made it this far. It increases with n, so this weakens my argument, but only to a limited extent; I'm not sure how much, though. My instinct is that l(m)/p would be 50% when m = n, but the mean is not the median: I'd expect most planets to fall well below the mean, with a long upper tail, which would make l(m)/p much lower than that. Even so, if you placed it at 0.01%, that would mean it's about a thousand times less likely at that value. The argument still takes the estimate down by many more orders of magnitude than that, so the correction isn't really that significant.
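
Here is a toy simulation of the "mean is not the median" point. The lognormal lineage-size distribution is entirely my own assumption; it's only meant to show that, for a heavily skewed distribution, far fewer than half the planets reach a lineage as long as the average one.

```python
import random
import statistics

random.seed(0)
P = 100_000                                              # hypothetical number of planets
lineages = [random.lognormvariate(0, 2) for _ in range(P)]  # assumed skewed distribution

n = statistics.mean(lineages)                            # average lineage size
l_n = sum(1 for size in lineages if size >= n)           # planets reaching at least n

print("mean lineage n =", round(n, 1))
print("l(n)/p =", round(l_n / P, 3))                     # well under 0.5
```

With these made-up numbers l(n)/p comes out around 0.16 rather than 0.5; a more extreme skew would push it lower still.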

Also, a back-of-the-envelope calculation:

Assume, against all odds, that there are a trillion times as many sentient animals as humans, and that we happen to be the humans. Also assume humans only increase their own numbers, and that they end up at the top percentile of the population you'd expect (hence the factor of 100 below). Finally, assume 100 billion humans so far.

n = 1,000,000,000,000 * 100,000,000,000 * 100

n = 10^12 * 10^11 * 10^2

n = 10^25

Here's more what I'd expect:

Humanity eventually puts up a satellite to collect solar energy. Once they do one, they might as well do another, until they have a Dyson swarm. Assume 1% efficiency. Also, assume humans still use their whole bodies instead of being brains in vats. Finally, assume they get fed with 0.1% efficiency. And assume an 80-year lifetime.

n = (solar luminosity * 1%) * (0.1% / power of a human) * (lifetime of Sun / lifetime of human)

n = (4 * 10^26 W * 0.01) * (0.001 / 100 W) * (5,000,000,000 years / 80 years)

n = 2.5 * 10^27

By the way, the value I used for power of a human is after the inefficiencies of digesting.
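
The same estimate in code, with the assumed inputs spelled out as named constants:

```python
# Restating the Dyson-swarm estimate above with its stated assumptions:
# 1% collection efficiency, 100 W per human after digestion, 0.1% food
# efficiency, 80-year lives, and 5 billion years of remaining Sun.

SOLAR_LUMINOSITY_W   = 4e26
COLLECTION_EFF       = 0.01
HUMAN_POWER_W        = 100
FOOD_EFF             = 0.001
SUN_LIFETIME_YEARS   = 5e9
HUMAN_LIFETIME_YEARS = 80

power_per_person = HUMAN_POWER_W / FOOD_EFF                              # 1e5 W each
alive_at_once = SOLAR_LUMINOSITY_W * COLLECTION_EFF / power_per_person   # 4e19 people
total_people = alive_at_once * SUN_LIFETIME_YEARS / HUMAN_LIFETIME_YEARS

print(f"{alive_at_once:.1e} alive at once, {total_people:.1e} total")
# roughly 4e19 alive at once and 2.5e27 total
```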

Even with assumptions that extreme, we couldn't use this planet to its full potential. Granted, that requires mining pretty much the whole planet, but with a Dyson sphere you could do that in a week, or two years at the efficiency I gave.

It actually works out to about 150 tons of Earth per person. How much do you need to get the elements to make a person?
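
For the 150-ton figure, the arithmetic that seems to be intended is Earth's mass divided among the people alive at any one time in the estimate above; the Earth-mass value below is a standard figure, not something from the post.

```python
EARTH_MASS_KG = 5.97e24          # standard value for Earth's mass
people_alive_at_once = 4e19      # from the Dyson-swarm sketch above

tons_per_person = EARTH_MASS_KG / 1000 / people_alive_at_once
print(round(tons_per_person), "tons of Earth per person")   # about 150
```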

Incidentally, I rewrote the article, so don't be surprised if some of the comments don't make sense.

16 comments

Comments sorted by top scores.

comment by Emile · 2010-10-18T18:50:47.025Z · LW(p) · GW(p)

Edit: I think the only way this is useful is to blablabla ...

Putting this edit at the top like that makes the post less readable ... the way to read it now is to look for the "original start", read from there on, and then read the edit at the top (which doesn't make much sense if you have no idea what the post is talking about). Most awkward.

Replies from: DanielLC
comment by DanielLC · 2010-10-19T00:44:46.516Z · LW(p) · GW(p)

Fixed.

comment by Manfred · 2010-10-18T02:12:09.184Z · LW(p) · GW(p)

"In frequentist statistics, they'd tell you that, no other information being given, the probability of each possibility is equal."

Many bayesians would call it the principle of indifference. That's not why your reasoning goes off the rails.

It goes off the rails soon after because you try to reason from a non-integrable distribution. Bad plan. E.T. Jaynes would tell you to construct the infinite distribution as the limit of a finite distribution. A non-straw frequentist would jump to the end and tell you that the Poisson distribution should be a reasonable first approximation to when humanity will end, since the chances of extinction shouldn't vary too much with time.

A frequentist perspective on the doomsday argument, that maybe someone can translate into bayesian thinking: While technically valid, it's not useful. This is because the variance of the mean is equal to the variance of the distribution of humans. If humanity ends tomorrow your variance turns out to have been small, but if humanity lives forever your variance turns out to have been infinite. This makes it a prediction of everything, and therefore nothing.

Replies from: DanielLC
comment by DanielLC · 2010-10-18T03:04:00.208Z · LW(p) · GW(p)

Jaynes would tell you to construct the infinite distribution as the limit of a finite distribution.

What do you mean? Suppose I said that n people was just as likely for all n. The limit would show that the probability of being before person number googolplex decreases without limit, so if I think that there's a one-in-a-million chance of being that early, that distribution would underestimate it.

the Poisson distribution should be a reasonable first approximation to when humanity will end, since the chances of extinction shouldn't vary too much with time.

I'm not sure when humanity will end. Some say humanity will last until the heat death of the universe; others say we were lucky to survive the Cold War, and we're going to die out when hostilities rise again.

As for your last part, why is variance important?

Replies from: Manfred
comment by Manfred · 2010-10-18T05:32:00.496Z · LW(p) · GW(p)

Oops, it looks like I messed up. I misinterpreted you as talking about a distribution of when humanity would go extinct. But you weren't talking about a distribution - you just said " if humanity lives forever, that contradicts intuition about our chances of biting it in finite time."

Maybe it's because I think a distribution is the correct way to think about it. If the chances of humanity going extinct are a Poisson distribution normalized to 0.75, we have a 25% chance of living forever AND a finite chance of biting it in finite time - the apparent exclusion is resolved.

Of course, that still doesn't answer the question "why were we born now and not infinitely later?" To do that, you have to invoke the axiom of choice, which is, in layman's terms, "impossible things happen all the time, because it makes more sense than them not happening."

-

"As for your last part, why is variance important?"

It's because a prediction is only powerful if it excludes some possibilities. If any real measurement is within a standard deviation of the prediction, it's a pretty boring prediction - it's not enough to tell me that I will die in 50 years if the error bars are +/- 100.

Of course, the more I think about it the more I question how I got my result. It was pretty dependent on thinking of the mean as something drawn from a distribution rather than just plugging and chugging the obvious way.

Replies from: DanielLC
comment by DanielLC · 2010-10-18T05:57:42.013Z · LW(p) · GW(p)

It's not about disproving certain possibilities; it's about finding them increasingly unlikely. I never said that humanity will die out. The intuition at the beginning was that it might. Of course, if there's any chance it will, the fact that we're alive now instead of unimaginably far into the future shows that it almost definitely will.

"impossible things happen all the time, because it makes more sense than them not happening."

But they'd be more likely to happen if there were fewer other possibilities. For example, if one dart thrower is only guaranteed to hit the dartboard somewhere, and another only ever hits the bullseye, then a dart landing at a given spot on the bullseye is fairly good evidence that the second guy threw it.

Also, I don't see what that has to with the Axiom of Choice. That's an axiom of set theory, not statistics.

Replies from: Manfred
comment by Manfred · 2010-10-18T22:57:38.254Z · LW(p) · GW(p)

The axiom of choice was just to justify how it's possible to be born now even if there are infinite people, which is an objection I have seen before, though you didn't make it. Also it's a fun reference to make.

It's possible I'm wrong, but I'll try to walk you through my logic as to why it isn't informative that we're born now. Imagine the "real" set of people who ever get born. When choosing from this bunch o' people, the variance is E[x^2] - (E[x])^2 = N^2/3 - N^2/4 = N^2/12 (for large N).

So the standard deviation is proportional not to the square root of N, as you'd expect if you're used to normal distributions, but is instead proportional to N. This means that no matter how big N is, the beginning is always the same number of standard deviations from the mean. Therefore, to a decent approximation, it's not more surprising to be born now if there are 10^20 total people born than if there are 10^100.
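
A quick check of that claim, using the exact variance of a uniform distribution over birth ranks 1..N:

```python
# Standard deviation of a uniform distribution over ranks 1..N is
# sqrt((N^2 - 1) / 12), so sd/N settles at 1/sqrt(12), about 0.289, for any N.
for N in [10**4, 10**6, 10**8]:
    sd = ((N**2 - 1) / 12) ** 0.5
    print(N, round(sd / N, 4))
```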

Of course, writing this out highlights a problem with my logic (drat!) which is that standard deviations aren't the most informative things when your distribution doesn't look anything like a normal distribution - I really should have thought in two-tailed integrals or something. But then, once you start integrating things, you really do need a prior distribution.

So dang, I guess my argument was at least partly an artifact of applying standard distributions incorrectly.

comment by Vladimir_Nesov · 2010-10-18T01:11:26.995Z · LW(p) · GW(p)

First off, you have to assume that, as a prior, you're just as likely to be anyone.

A meaningless question. If you can be anyone, then you are everyone at the same time, in the sense of controlling everyone's decisions (with some probability). It's not mutually exclusive for your decision to control one person and another person. So even though the prior probability of controlling any given person is low (that is, given what your decisions would be in some collection of situations, we can expect a low probability of them all determining that person's decisions), and the total number of people is high, the probabilities of controlling all available people don't have to add up to 1. The sum could be far below 1, if you are in fact neither of these people (maybe you don't exist in this world), or far above 1, if all people are in fact your near-copies.

See

Replies from: DanielLC
comment by DanielLC · 2010-10-18T01:55:26.934Z · LW(p) · GW(p)

This isn't about your decision controlling them. It's about the information gained by knowing you're the nth person. The fact that you might not be a person doesn't really matter, since you have to be a person for any of the possibilities mentioned.

the sum could be far below 1, if you are in fact neither of these people

P(U=m|T=n)∝1/n

That should be

a = number of nonhuman sentients.

P(U=m|T=n)∝1/(a+n)

which approaches a constant as a increases without limit.

Oops. I've checked this several times, and hadn't seen that.

Let this be a lesson. When quadrillions of lives are counting on it, make sure someone double-checks your math.

I'm going to consider the ramifications of this for a while. This argument might still apply significantly. It might not.

comment by Vladimir_M · 2010-10-17T23:31:35.395Z · LW(p) · GW(p)

In frequentist statistics, they'd tell you that, no other information being given, the probability of each possibility is equal.

I don't think that's a correct summary, certainly not in the context of this discussion. You might be confusing frequentist probability with classical probability.

Andrew Gelman wrote a critique of the Doomsday Argument, in which he dismisses it as a bad frequentist argument misrepresented as a Bayesian one. (He elaborates on the same idea in this paper.) If I understand correctly what Gelman says (which may not be the case), I agree with him. DA is nothing more than a mathematical sleight of hand in which a trivial mathematical tautology is misleadingly presented as a non-trivial claim about the real world.

Replies from: DanielLC, steven0461
comment by DanielLC · 2010-10-18T01:20:50.030Z · LW(p) · GW(p)

I don't think that's a correct summary, certainly not in the context of this discussion. You might be confusing frequentist probability with classical probability.

Fixed.

This is a variation on the argument I've heard. I can assure you it's Bayesian. I looked at the Bayesian one linked to in your link. I didn't see anything about justifying the prior, but I might have missed it. How do they make an explanation for such a simple argument so long?

I doubt I can focus long enough to understand what he's saying in that, also very long, rebuttal. If you understand it, can you tell me where you think the error is? At the very least, please narrow it down to one of these four:

  1. My prior is unreasonable
  2. My evidence is faulty
  3. I'm underestimating the importance of other evidence
  4. Something else I haven't thought of
comment by steven0461 · 2010-10-17T23:50:11.686Z · LW(p) · GW(p)

There's a Bayesian Doomsday Argument associated with Carter and Leslie as well as a frequentist version associated with Gott.

comment by Mitchell_Porter · 2010-10-18T07:15:35.349Z · LW(p) · GW(p)

This is indeed a confusing post. Why are we talking about utility at all, when this is just a question of fact? Shouldn't we be concerned with the number of sentients that our Earth-derived lineage will contain, not the total number of sentients that will ever exist anywhere? How does "believing that Xs matter less" imply "it is less likely that you are an X"? How can the Doomsday Argument become an "argument that we're not going to die"?

Replies from: DanielLC
comment by DanielLC · 2010-10-19T04:44:14.899Z · LW(p) · GW(p)

Shouldn't we be concerned with the number of sentients that our Earth-derived lineage will contain, not the total number of sentients that will ever exist anywhere?

I thought it would end up working out the same. I just added that part, and I was wrong. It's still somewhat close.

How does "believing that Xs matter less" imply "it is less likely that you are an X"?

It's not proof, but it's evidence. I guess it's pretty weak, so I deleted that part.

As a completely unrelated justification for that, if you ran a human brain slower, you'd be less likely to be that person at a given time. For instance, if it was half as fast, you'd be half as likely to be it at 5:34, since, to it, 5:34 only lasts 30 seconds. Animals are less intelligent than humans, so they might act similar to a slowed human brain.

comment by jimrandomh · 2010-10-17T23:57:00.196Z · LW(p) · GW(p)

First off, you have to assume that, as a prior, you're just as likely to be anyone.

There is a decision-theoretic caveat here. While you may be "just as likely" to be anyone, the amount of influence you have over utility, for most utility functions involving the universe as a whole, varies depending when you were born. People born earlier in the history of humanity have a greater influence, and therefore, you should weight the possibility that you were born early more highly.

Replies from: DanielLC
comment by DanielLC · 2010-10-18T01:10:43.811Z · LW(p) · GW(p)

You should weigh the importance of your choices more highly.

This doesn't mean future stuff doesn't matter; but it makes it so it's not an obvious choice.

Suppose you do something that has a chance of saving the world, and suppose there have been 100 billion people so far. The expected amount of good you'd do is ∫ k/n dn = k ln(n_2/n_1). If there are fewer than 200 billion people, that's k ln 2. If there are fewer than 2*10^40, that's k ln(2*10^29). It works out to being about 100 times as important. That seems like a lot, but charity tends to work in orders-of-magnitude differences.

I'm not sure how good a value 10^40 is, but I think the order of magnitude is within a factor of two, so the predicted value would be within that.
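
The ratio above can be checked directly:

```python
import math

# Expected impact scales like k * ln(n2 / n1), so compare "fewer than
# 200 billion people" with "fewer than 2e40", given 100 billion so far.
n1 = 100e9
small_world = math.log(200e9 / n1)   # ln 2
large_world = math.log(2e40 / n1)    # ln(2e29)

print(large_world / small_world)     # roughly 100
```

This comes out to about 97, in line with the "about 100 times" figure.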