Experimental testing: can I treat myself as a random sample?

post by avturchin · 2025-04-22T12:34:18.495Z · LW · GW · 17 comments

Contents

  1.    How many months are in the year?
  2.    Earth size
  3.    Predicting typical human life expectancy based on my age
  But can it be applied to the real Doomsday Argument?
17 comments

TL;DR: Several experiments show that I can extract useful information just by treating myself as a random sample, so the view that I can't use myself as a random sample is false. But it's still not clear whether this can be used to prove the Doomsday Argument.

There are two views. The first is that I can use my random location to predict the total size of the set from which I am selected, and, moreover, that this applies to predicting the future population of Earth and thus the timing of Doomsday, as well as to other anthropic questions such as the Simulation Argument.

And the second view is that there will always be a person at the beginning of any large ordered set of observers who will be surprised by their early location (in the case of DA). Thus, the fact of the surprise is non-informative. Or, as a variant, it should be ignored based on Updateless Decision Theory considerations.

Here I will not argue about the theoretical validity of these two views. Instead, I will perform a series of practical experiments to test the central claim.

Let's start with a simple experiment: please check the time of day now and use it as a random sample to predict the typical duration of a day in hours. When I first did this, my clock showed 15:14. Doubling that gives a median estimate of about 30 hours for the length of a day (that is, a 50 percent chance that the true total is below it), which is reasonably close to 24.
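The doubling step above can be checked with a small simulation. This is a minimal sketch, assuming the moment of observation is uniformly distributed over the true total duration; the function name `doubling_rule_hit_rate` is hypothetical and not part of the original post.

```python
import random

def doubling_rule_hit_rate(total: float, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of trials in which doubling a uniform sample reaches or exceeds the true total."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        observed = rng.uniform(0.0, total)  # assumption: a uniformly random position in [0, total]
        if 2.0 * observed >= total:
            hits += 1
    return hits / trials

# For a 24-hour day the doubled value should overshoot about half of the time.
print(doubling_rule_hit_rate(24.0))
```

If the uniform-sampling assumption holds, the doubled value is a median estimate: it overshoots the true total in about half of the trials and undershoots in the other half.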

The history of anthropic thought, at least in one of its lines, started from a practical experiment: J. Richard Gott jokingly predicted in 1969 that the Berlin Wall would last for roughly as long again as its age at the time of the prediction (it was 8 years old then). When it fell in 1989, Gott was surprised and wrote a theoretical underpinning for this prediction method. In some sense, Laplace's Sunrise Problem is also based on the experimental observation that the Sun rises every day.
Gott continued to insist that his prediction method could be experimentally tested and used it to predict the durations of Broadway shows, and it worked.
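Gott's method is usually stated as an interval rather than a single number. The sketch below assumes only that the fraction of the total lifetime that has already elapsed is uniform on (0, 1); the function name `gott_interval` and the 10-year example are illustrative choices, not taken from the post.

```python
def gott_interval(past: float, confidence: float = 0.5) -> tuple[float, float]:
    """Bounds on the remaining duration, given the past duration.

    Assumes the elapsed fraction r = past / total is uniform on (0, 1), so that
    (1 - c) / 2 < r < (1 + c) / 2 holds with probability c.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    low = past * (1.0 - confidence) / (1.0 + confidence)
    high = past * (1.0 + confidence) / (1.0 - confidence)
    return low, high

# Hypothetical example: something that has existed for 10 years.
print(gott_interval(10.0, 0.5))
print(gott_interval(10.0, 0.95))
```

At 50 percent confidence the remaining duration lies between one third of and three times the past duration. The "doubling" estimate used in this post is the median of the same distribution.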

However, Gott's method is not explicitly based on some observer sampling assumption, but on the observation of the duration of the existence of external things where the observer continues to exist after they disappear. Therefore, the application of it to the future duration of humanity's existence – the Doomsday Argument – is questionable.
Here I will perform several practical experiments. I will test the central claim: that some useful information can be extracted just from my random location, that is, that I can treat myself as a random sample. This idea is often attacked from different angles (perspective-based reasoning, linear progression argument, full conditioning requirement, and the idea that future observers are not determined yet and can't be sampled).

1.    How many months are in the year?

Given my date of birth, can I estimate the total number of months in a year, as if I did not know it? My birth month is September, the 9th month. By me-as-a-random-sample logic, the median estimate of the total number of months in a year is therefore 18 (a 50 percent chance that the true number is below it). This is close to the real number, 12.

2.    Earth size

I will try another, more difficult one: measuring the size of the Earth. As input I take my birth location, and from it only the surface distance to the nearest point on the equator (similar to latitude, but in km). For me, it is 6190 km. I then assume that this is a random sample and ignore everything about spherical geometry and population distribution. In that case, I assume that I was born at a random place between the equator and the pole. Therefore, according to random sampling of me, the surface distance from my birthplace to the pole should also be 6190 km, and the total surface distance from pole to equator should be 12380 km. The real distance is about 10000 km. So here again I get a good approximation of real data by treating myself as a random sample from all observers.

However, if we apply this to time, there is a problem: future observers do not exist yet. Leslie thought that this was the real problem of DA and spent a lot of time proving determinism. If determinism is correct, there is no difference between real and future observers, and random sampling works perfectly; DA will work. In other words, DA will work in a block-time universe. But what about MWI?

One trick to escape this problem is predicting the time of the existence of a typical observer and then applying it to myself as a typical observer.

3.    Predicting typical human life expectancy based on my age

For example, if I take a random alien and learn that his age is 1500 sols, I can't directly predict that this alien will live 3000 sols, but I can say that the average life expectancy of such aliens is 3000 sols, and thus this alien also has this average life expectancy as he likely is an average alien. This works for me too. My current age is 50 (at the time of writing), and this predicts that average human life expectancy is around 100. Not a bad guess, given that real human life expectancy is 70-80 years.
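The estimates from the experiments so far can be put side by side with the true values. This is a small sketch using only the numbers quoted in the text. The true life expectancy of 75 years is an assumed midpoint of the 70-80 range, and the names `experiments` and `doubled_estimate` are hypothetical.

```python
# (observed value, true total) pairs taken from the text.
# Assumption: 75 years is used as the midpoint of the quoted 70-80 range.
experiments = {
    "hours in a day": (15 + 14 / 60, 24.0),
    "months in a year": (9.0, 12.0),
    "equator-to-pole distance, km": (6190.0, 10000.0),
    "human life expectancy, years": (50.0, 75.0),
}

def doubled_estimate(observed: float) -> float:
    """Median estimate of the total when the observation is a uniform random sample."""
    return 2.0 * observed

# Ratio of the doubled estimate to the true value for each experiment.
ratios = {
    name: doubled_estimate(observed) / true_total
    for name, (observed, true_total) in experiments.items()
}

for name, ratio in ratios.items():
    print(f"{name}: estimate / true = {ratio:.2f}")
```

Under the uniform-sampling assumption this ratio is itself uniformly distributed between 0 and 2, so a value between 0.5 and 1.5 is expected about half of the time.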

Here I use a trick with median life expectancy which does not change the prediction. The same trick is used in the so-called universal Doomsday Argument.

But can it be applied to the real Doomsday Argument?

Ape in the coat [LW · GW] suggested that we can't think of ourselves as random samples from human history, because observers appear sequentially. Any civilization discovers the Doomsday Argument in its 20th-century analogue and is surprised. But that surprise is unwarranted: I am not randomly located in the history of human civilization; I am located near the date when it first discovered the Doomsday Argument. In other words, thinking about DA pinpoints a specific moment in history and kills random selection. This looks like a refutation of the DA at first glance.

However, I can also say that I am randomly selected from all observers in Earth history who will ever think about DA. These observers appeared in the 1970s and the number of them has been growing at least until 2010. This suggests that such observers will disappear in a few decades from now.
 

17 comments

Comments sorted by top scores.

comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T16:54:20.569Z · LW(p) · GW(p)

Note that the actual doomsday argument, properly applied, predicts that humanity is most likely to end right now, with the probability dropping in inverse proportion to the hypothesized total number of humans who will ever have lived.

To give a simple example why: if you go to a city and see a bus with the number 1546, the number of busses that maximises the chance you would have seen that bus is 1546 busses. At 3000 busses the probability you would have seen that exact bus is roughly halved. And at 3,000,000 it's about 2000 times less likely. This gives you a Bayesian update across your original probability distribution for how many busses there are.
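The update described above can be sketched in a few lines. The sketch assumes that busses are numbered 1 to n and that each is equally likely to be the one you see. The three-hypothesis prior is purely illustrative, and `update_on_bus` is a hypothetical name.

```python
def update_on_bus(prior: dict[int, float], observed: int) -> dict[int, float]:
    """Posterior over the total number of busses n after seeing bus number `observed`.

    Assumption: busses are numbered 1..n and the observed one is drawn uniformly,
    so the likelihood is 1/n for n >= observed and 0 otherwise.
    """
    unnormalized = {
        n: p * (1.0 / n if n >= observed else 0.0)
        for n, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("the observation is impossible under every hypothesis")
    return {n: weight / total for n, weight in unnormalized.items()}

# Illustrative prior: three hypotheses with equal weight.
prior = {1546: 1 / 3, 3000: 1 / 3, 3_000_000: 1 / 3}
posterior = update_on_bus(prior, observed=1546)

print(posterior[1546] / posterior[3000])       # relative support, 1546 vs 3000 busses
print(posterior[1546] / posterior[3_000_000])  # relative support, 1546 vs 3,000,000 busses
```

With equal prior weights the posterior ratios equal the likelihood ratios: 3000/1546 and 3,000,000/1546.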

Replies from: jack-edwards, avturchin
comment by purple fire (jack-edwards) · 2025-04-22T17:15:32.405Z · LW(p) · GW(p)

Just to clarify, guessing that there are 1546 buses maximizes the probability that you are exactly correct, but it does not minimize your expected error, since you are guessing close to many numbers (everything below 1546) that are impossible. This is known in statistics as the "German tank problem"[1] and the posterior distribution is actually not well-defined in many setups.

  1. ^

    From WW2 soldiers trying to estimate enemies' manufacturing capacity based on tank serial numbers

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T17:29:42.251Z · LW(p) · GW(p)

I'm sorry, I'm not sure what you mean. Under Bayesianism this is straightforward.

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T17:30:35.464Z · LW(p) · GW(p)

Oh I see. I'm not trying to guess a specific number, I'm trying to update my distribution.

Replies from: jack-edwards
comment by purple fire (jack-edwards) · 2025-04-22T17:45:51.805Z · LW(p) · GW(p)

The intuition is that if we both saw bus 1546, and you guessed that there were 1546 buses and I guessed that there were 1547, you would be a little more likely to be correct but I would almost certainly be closer to the real number.

The Bayesian update isn't generally well-defined because you get a divergent mean. Your implicit prior is 1/n, which is an improper prior. This is fine for deriving a posterior median, which in this case happens to be about 3,100 buses, and a posterior distribution, which in this case is a truncated zeta distribution with s=2 and k=1546. But the posterior mean does not exist.
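The median quoted above can be reproduced numerically. This is a minimal sketch, assuming the improper 1/n prior and a 1/n likelihood for n ≥ 1546, so that the posterior is proportional to 1/n². The helper names `tail_mass` and `posterior_median` are hypothetical.

```python
import math

def tail_mass(k: int) -> float:
    """Sum of 1/n^2 over all n >= k, computed as zeta(2) minus the first k - 1 terms."""
    return math.pi ** 2 / 6 - sum(1.0 / n ** 2 for n in range(1, k))

def posterior_median(k: int) -> int:
    """Smallest m with P(N <= m) >= 0.5 when P(N = n) is proportional to 1/n^2 for n >= k."""
    half = 0.5 * tail_mass(k)
    cumulative = 0.0
    n = k
    while True:
        cumulative += 1.0 / n ** 2
        if cumulative >= half:
            return n
        n += 1

print(posterior_median(1546))
```

The result should land close to twice the observed bus number, in line with the figure of about 3,100 given above.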

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T18:07:35.806Z · LW(p) · GW(p)

I'm not using this as a prior, I'm using it to update my existing prior (whatever that was). I believe the posterior will be well defined, so long as the prior was.

Replies from: yair-halberstadt, jack-edwards
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T18:25:20.489Z · LW(p) · GW(p)

As a worked example, if I start off assuming that chance of there being n busses is 1/2^n (nice and simple, adds up to 1), then the posterior is 1/(n · ln(2) · 2^n): multiply the two distributions, then divide by the sum (ln(2)) so that it adds up to 1.

Replies from: jack-edwards
comment by purple fire (jack-edwards) · 2025-04-22T18:45:39.251Z · LW(p) · GW(p)

No, that's not the posterior distribution--clearly, the number of buses cannot be lower than 1546, but that distribution has material probability mass on low integers. I'm not quite sure how you got that equation.

But regardless, I think this shows where we disagree. That prior has mean 2... that's a pretty strong assumption about the distribution of n. If you want to avoid that kind of assumption, you can get posterior distributions but not a posterior expectation.

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T19:22:41.754Z · LW(p) · GW(p)

Sorry, I meant to add in an example where for simplicity you saw the bus numbered 1.

 

Agreed it's a terrible prior, it's just an easy one for a worked example.
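With the observed bus taken to be number 1, the worked example from earlier in this thread can be checked numerically. The sketch assumes the prior 1/2^n and a likelihood of 1/n for every n ≥ 1; the function name `posterior` is hypothetical.

```python
import math

def posterior(n: int) -> float:
    """P(n | saw bus number 1) for the prior 2**-n and the likelihood 1/n."""
    return 1.0 / (n * math.log(2) * 2 ** n)

# Terms beyond n = 200 are far below floating-point precision, so truncating is harmless.
total = sum(posterior(n) for n in range(1, 200))
mean = sum(n * posterior(n) for n in range(1, 200))

print(total)  # should be very close to 1
print(mean)   # should be very close to 1 / ln(2)
```

The normalizing constant ln(2) comes from the series sum of (1/2)^n / n over n ≥ 1. Unlike the improper-prior case, this posterior has a finite mean.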

comment by purple fire (jack-edwards) · 2025-04-22T18:39:31.512Z · LW(p) · GW(p)

I'm not disagreeing with that categorically--for many priors the posterior distribution is well defined. But all of those priors carry information (in the information theoretical sense) about the number of buses. If you have an uninformative reference prior, your posterior distribution does not have a mean.

You can see the sketch of this proof if you consider the likelihoods of seeing the bus for any given n. If there are 1546 buses, there was a 1/1546 chance you saw this one. If there were 1547, there was a 1/1547 chance you saw this one. This is the harmonic series, which diverges. That divergence is the fundamental issue that's going to cause the mean to be undefined.

You can't make claims about the posterior without setting at least some conditions on what your prior is--obviously, for some priors the posterior expectation is well-defined. (Trivially, if I already think n=2000 with probability 1, I will still think that after seeing the bus.) But I claim that all such priors make assumptions about the distribution of the possible number of buses. In the uninformative case, your posterior distribution is well-defined (as I said, it's a truncated zeta distribution) but it does not have a finite mean.
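The divergence can be written out explicitly. This derivation is a sketch under the assumptions used in this thread: an improper prior proportional to 1/n, and a likelihood of 1/n for n ≥ k, where k = 1546 is the observed bus number. The symbols N, Z and k are notation introduced here.

```latex
\begin{align*}
P(N = n \mid k) &= \frac{n^{-2}}{Z}, \qquad n \ge k, \qquad
  Z = \sum_{m \ge k} \frac{1}{m^{2}} < \infty, \\
\mathbb{E}[N \mid k] &= \sum_{n \ge k} n \cdot \frac{n^{-2}}{Z}
  = \frac{1}{Z} \sum_{n \ge k} \frac{1}{n} = \infty .
\end{align*}
```

The normalizer Z converges, so the posterior is a proper distribution, but the mean reduces to a tail of the harmonic series and is therefore infinite.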

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T19:25:11.515Z · LW(p) · GW(p)

But I claim that all such priors make assumptions about the distribution of the possible number of buses

I mean, yes, that's the definition of a prior. How to calculate a prior is an old question in Bayesianism, with different approaches, Kolmogorov complexity being one.

Replies from: jack-edwards
comment by purple fire (jack-edwards) · 2025-04-22T19:29:55.689Z · LW(p) · GW(p)

No, that is not the definition of a prior. There are priors which imply an expected number of buses, and priors that don't. If you select a prior that doesn't, you can still get a meaningful posterior distribution even if that posterior distribution doesn't have a real-valued mean.

comment by avturchin · 2025-04-22T17:26:10.316Z · LW(p) · GW(p)

In my view, a proper use here is to compare two hypotheses: that there are 2000 buses and that there are 20,000 buses. Seeing that the bus number is 1546 is an update in the direction of the smaller number of buses.

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T18:04:58.515Z · LW(p) · GW(p)

It would also update you towards 1600 over 2000.

Replies from: avturchin
comment by avturchin · 2025-04-22T18:26:31.194Z · LW(p) · GW(p)

I think you are right that 1546 has the highest probability of any exact number, something like 1:1546. But that doesn't mean it is likely, as this is still a very small number.

In the Doomsday Argument we are interested in comparing not exact dates but periods, as in that case we get significant probabilities for each period and comparing them is meaningful.

Replies from: yair-halberstadt
comment by Yair Halberstadt (yair-halberstadt) · 2025-04-22T18:29:34.303Z · LW(p) · GW(p)

Agreed, I just wanted to clarify that the assumption that it lasts twice as long seems baseless to me. The point is that the end usually comes shortly after.

Replies from: avturchin
comment by avturchin · 2025-04-22T18:43:29.605Z · LW(p) · GW(p)

'double' follows either from Gott's equation or from Laplace's rule.