# Yes, Virginia, You Can Be 99.99% (Or More!) Certain That 53 Is Prime

post by ChrisHallquist · 2013-11-07T07:45:07.565Z · score: 40 (46 votes) · LW · GW · Legacy · 68 comments

**TLDR:** though you can't be 100% certain of anything, a lot of the people who go around talking about how you can't be 100% certain of anything would be surprised at how often you can be 99.99% certain. Indeed, we're often justified in assigning odds ratios well in excess of a million to one to certain claims. Realizing this is important for avoiding certain rookie Bayesian mistakes, as well as for thinking about existential risk.

53 is prime. I'm *very* confident of this. 99.99% confident, at the very least. How can I be so confident? Because of the following argument:

If a number is composite, it must have a prime factor no greater than its square root. Because 53 is less than 64, sqrt(53) is less than 8. So, to find out if 53 is prime or not, we only need to check if it can be divided by primes less than 8 (i.e. 2, 3, 5, and 7). 53's last digit is odd, so it's not divisible by 2. 53's last digit is neither 0 nor 5, so it's not divisible by 5. The nearest multiples of 3 are 51 (=17x3) and 54, so 53 is not divisible by 3. The nearest multiples of 7 are 49 (=7^2) and 56, so 53 is not divisible by 7. Therefore, 53 is prime.

(My confidence in this argument is helped by the fact that I was good at math in high school. Your confidence in your math abilities may vary.)
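For readers who want to check the argument mechanically, it amounts to trial division up to the square root. A minimal sketch (the function name is mine):

```python
def is_prime_by_trial_division(n):
    """Trial division: a composite n must have a divisor no greater than sqrt(n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:      # for 53 this only tries d = 2..7
        if n % d == 0:
            return False
        d += 1
    return True

print(is_prime_by_trial_division(53))  # → True
print(is_prime_by_trial_division(51))  # → False (51 = 3 × 17)
```

(The loop tries composite divisors as well as primes, which is redundant but harmless: if no smaller prime divides n, no composite can.)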

I mention this because in his post Infinite Certainty, Eliezer writes:

> Suppose you say that you're 99.99% confident that 2 + 2 = 4. Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once. Maybe for 2 + 2 = 4 this extraordinary degree of confidence would be possible: "2 + 2 = 4" is extremely simple, and mathematical as well as empirical, and widely believed socially (not with passionate affirmation but just quietly taken for granted). So maybe you really could get up to 99.99% confidence on this one.
>
> I don't think you could get up to 99.99% confidence for assertions like "53 is a prime number". Yes, it seems likely, but by the time you tried to set up protocols that would let you assert 10,000 independent statements of this sort—that is, not just a set of statements about prime numbers, but a new protocol each time—you would fail more than once. Peter de Blanc has an amusing anecdote on this point, which he is welcome to retell in the comments.

I think this argument that you can't be 99.99% certain that 53 is prime is fallacious. Stuart Armstrong explains why in the comments:

> > If you say 99.9999% confidence, you're implying that you could make one million equally fraught statements, one after the other, and be wrong, on average, about once.
>
> Excellent post overall, but that part seems weakest - we suffer from an unavailability problem, in that we can't just think up random statements with those properties. When I said I agreed 99.9999% with "P(P is never equal to 1)" it doesn't mean that I feel I could produce such a list - just that I have a very high belief that such a list could exist.

In other words, it's true that:

- If a well-calibrated person claims to be 99.99% certain of 10,000 independent statements, on average one of those statements should be false.

But it doesn't follow that:

- If a well-calibrated person claims to be 99.99% certain of one statement, they should be able to produce 9,999 other independent statements of equal certainty and be wrong on average once.

If it's not clear why this doesn't follow, consider the anecdote Eliezer references in the quote above, which runs as follows: A gets B to agree that if 7 is not prime, B will give A $100. B then makes the same agreement for 11, 13, 17, 19, and 23. Then A asks about 27. B refuses. What about 29? Sure. 31? Yes. 33? No. 37? Yes. 39? No. 41? Yes. 43? Yes. 47? Yes. 49? No. 51? Yes. And suddenly B is $100 poorer.

Now, B claimed to be 100% sure about 7 being prime, which I don't agree with. But that's not what lost him his $100. What lost him his $100 is that, as the game went on, he got careless. If he'd taken the time to ask himself, "am I really as sure about 51 as I am about 7?" he'd probably have realized the answer was "no." He probably didn't check the primality of 51 as carefully as I checked the primality of 53 at the beginning of this post. (From the chat transcript, sleep deprivation may also have had something to do with it.)

If you tried to make 10,000 statements with 99.99% certainty, sooner or later you would get careless. Heck, before I started writing this post, I tried typing up a list of statements I was sure of, and it wasn't long before I'd typed 1 + 0 = 10 (I'd *meant* to type 1 + 9 = 10. Oops.) But the fact that, as the exercise went on, you'd start including statements that weren't really as certain as the first statement doesn't mean you couldn't be justified in being 99.99% certain of that first statement.

I almost feel like I should apologize for nitpicking this, because I agree with the main point of the "Infinite Certainty" post, that you should never assign a proposition probability 1. Assigning a proposition a probability of 1 implies that no evidence could ever convince you otherwise, and I agree that that's bad. But I think it's important to say that you're often justified in putting *a lot* of 9s after the decimal point in your probability assignments, for a few reasons.

One reason is that arguments in the style of Eliezer's "10,000 independent statements" argument lead to inconsistencies. From another post of Eliezer's:

> I would be substantially more alarmed about a lottery device with a well-defined chance of 1 in 1,000,000 of destroying the world, than I am about the Large Hadron Collider being switched on.
>
> On the other hand, if you asked me whether I could make one million statements of authority equal to "The Large Hadron Collider will not destroy the world", and be wrong, on average, around once, then I would have to say no.
>
> What should I do about this inconsistency? I'm not sure, but I'm certainly not going to wave a magic wand to make it go away. That's like finding an inconsistency in a pair of maps you own, and quickly scribbling some alterations to make sure they're consistent.
>
> I would also, by the way, be substantially more worried about a lottery device with a 1 in 1,000,000,000 chance of destroying the world, than a device which destroyed the world if the Judeo-Christian God existed. But I would not suppose that I could make one billion statements, one after the other, fully independent and equally fraught as "There is no God", and be wrong on average around once.

Okay, so that's just Eliezer. But in a way, it's just a sophisticated version of a mistake a lot of novice students of probability make. Many people, when you tell them they can never be 100% certain of anything, respond by switching to saying 99% or 99.9% whenever they previously would have said 100%.

In a sense they have the right idea—there are lots of situations where, while the appropriate probability is not 0, it's still negligible. But 1% or even 0.1% isn't negligible enough in many contexts. Generally, you should not be in the habit of doing things that have a 0.1% chance of killing you. Do so on a daily basis, and on average you will be dead in less than three years. Conversely, if you mistakenly assign a 0.1% chance that you will die each time you leave the house, you may never leave the house.
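The "dead in less than three years" figure is straightforward compounding, assuming the daily risks are independent:

```python
# An independent 0.1% chance of death each day, compounded over three years.
daily_survival = 1 - 0.001
days = round(3 * 365.25)          # 1096 days
survival = daily_survival ** days
print(round(survival, 3))         # → 0.334, i.e. about a 2-in-3 chance of being dead
```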

Furthermore, the ways this can trip people up aren't just hypothetical. Christian apologist William Lane Craig claims the skeptical slogan "extraordinary claims require extraordinary evidence" is contradicted by probability theory, because it actually wouldn't take all that much evidence to convince us that, for example, "the numbers chosen in last night's lottery were 4, 2, 9, 7, 8 and 3." The correct response to this argument is to say that the prior probability of a miracle occurring is orders of magnitude smaller than mere one in a million odds.

I suspect many novice students of probability will be uncomfortable with that response. They shouldn't be, though. After all, if you tried to convince the average Christian of Joseph Smith's story with the golden plates, they'd require *much* more evidence than they'd need to be convinced that last night's lottery numbers were 4, 2, 9, 7, 8 and 3. That suggests their prior for Mormonism is much less than one in a million.

This also matters a lot for thinking about futurism and existential risk. If someone is in the habit of using "99%" as shorthand for "basically 100%," they will have trouble grasping the thought "I am 99% certain this futuristic scenario will not happen, but the stakes are high enough that I need to take the 1% chance into account in my decision making." Actually, I suspect that problems in this vicinity explain many of the problems ordinary people (read: including average scientists) have thinking about existential risk.

I agree with what Eliezer has said about being wary of picking numbers out of thin air and trying to do math with them. (Or, if you are going to pick numbers out of thin air, at least be ready to abandon your numbers at the drop of a hat.) Such advice goes double for dealing with very small probabilities, which humans seem to be *especially* bad at thinking about.

But it's worth trying to internalize a sense that there are several very different categories of improbable claims, along the lines of:

- Things that have a probability of something like 1%. These are things you really don't want to bet your life on if you can help it.
- Things that have a probability of something like one in a million. Includes many common ways to die that don't involve doing anything most people would regard as especially risky. For example, these stats suggest the odds of a 100-mile car trip killing you are somewhere on the order of one in a million.
- Things whose probability is truly negligible outside alternate universes where your evidence is radically different than what it actually is. For example, the risk of the Earth being overrun by demons.

Furthermore, it's worth trying to learn to think coherently about which claims belong in which category. That includes not being afraid to assign claims to the third category when necessary.

**Added:** I also recommend the links in this comment by komponisto.

## 68 comments


> Generally, you should not be in the habit of doing things that have a 0.1% chance of killing you. Do so on a daily basis, and on average you will be dead in less than three years.

Indeed!

It's even worse than that might suggest: 0.999^(3*365.25) ≈ 0.334, so after three years you are almost exactly twice as likely to be dead as alive.

To get to 50%, you only need 693 days, or about 1.9 years. Conversely, you need a surprisingly long time (about 6,905 days, or 18.9 years) to reduce your survival chances to 0.001.

The field of high-availability computing seems conceptually related. This is often considered in terms of the number of nines - so 'five nines' is 99.999% availability, or <5.3 min downtime a year. It often surprises people that a system can be unavailable for the duration of an entire working day and still hit 99.9% availability over the year. The 'nines' sort-of works conceptually in some situations (e.g. a site that makes money from selling things can't make money for as long as it's unavailable). But it's not so helpful in situations where the cost of an interruption per se is huge, and the length of downtime - if it's over a certain threshold - matters much less than whether it occurs at all. There are all sorts of other problems, on top of the fundamental one that it's very hard to get robust estimates for the chances of failure when you expect it to occur very infrequently. See Feynman's appendix to the report on the Challenger Space Shuttle disaster for amusing/horrifying stuff in this vein.
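The "nines" arithmetic in this comparison is easy to check:

```python
# Downtime budget per year implied by each availability level.
MINUTES_PER_YEAR = 365.25 * 24 * 60      # 525,960 minutes

for nines in (3, 4, 5):
    unavailability = 10 ** -nines
    minutes = MINUTES_PER_YEAR * unavailability
    print(f"{nines} nines: {minutes:.1f} min/year of allowed downtime")
# 3 nines allow ~526 minutes (roughly a full working day);
# 5 nines allow only ~5.3 minutes.
```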

Very big and very small probabilities are very very hard.

> Christian apologist William Lane Craig claims the skeptical slogan "extraordinary claims require extraordinary evidence" is contradicted by probability theory, because it actually wouldn't take all that much evidence to convince us that, for example, "the numbers chosen in last night's lottery were 4, 2, 9, 7, 8 and 3." The correct response to this argument is to say that the prior probability of a miracle occurring is orders of magnitude smaller than mere one in a million odds.

This only talks about the probability of the evidence given the truth of the hypothesis, but ignores the probability of the evidence given its falsity. For a variety of reasons, fake claims of miracles are far more common than fake TV announcements of the lottery numbers, which drastically reduces the likelihood ratio you get from the miracle claim relative to the lotto announcement.

The specific miracle also has lower prior probability (miracles are possible+this specific miracle's details), but that's not the only issue.

Even if true announcements are just 9 times more likely than false announcements, a true announcement should raise your confidence that the lottery numbers were 4 2 9 7 8 3 to 90%. This is because the probability P(429783 announced | 429783 is the number) is just the probability of a true announcement, but the probability P(429783 announced | 429783 is not the number) is the probability of a false announcement, divided by a million.
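That update can be written out with Bayes' theorem. A sketch with illustrative numbers (the 90%/10% split is assumed, and a fabricated announcement is assumed to name any particular wrong number with probability 1/(N-1)):

```python
N = 1_000_000            # equally likely lottery outcomes
prior = 1 / N            # P(429783 is the number)
p_true = 0.9             # P(an announcement is truthful)   -- illustrative
p_false = 0.1            # P(an announcement is fabricated) -- illustrative

# Likelihoods of hearing "429783 announced" under each hypothesis.
like_if_won = p_true
like_if_lost = p_false / (N - 1)

posterior = prior * like_if_won / (
    prior * like_if_won + (1 - prior) * like_if_lost
)
print(round(posterior, 2))   # → 0.9
```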

A false announcer would have little reason to fake the specific number 429783. This already completely annihilates the prior probability.

> The specific miracle also has lower prior probability (miracles are possible + this specific miracle's details), but that's not the core issue.

Actually, I'd consider it fairly important. It's one reason the probabilities *ought* to get very small very fast, but if you're reluctant to assign less than one in a million odds...

I think it's important to grasp the general principle under which a person telling you that this week's winning lotto numbers are some particular sequence is *stronger evidence* than their telling you a miracle took place. It offers a greater odds ratio, because they're much less likely to convey a particular lottery number in the event of it not being the winning one than they are to convey a miracle story in the event that no miracle occurred (even people who believe in miracles should be able to accept that miracles have a very high false positive rate if they believe that miracles only occur within their own religion.)

To illustrate: Suppose you're checking the lottery results online, and you see that you won, and you're on your laptop at the house of a friend who knows what lottery numbers you buy and who has used his wi-fi to play pranks on guests in the past. Suddenly the evidence doesn't fare so well against that million-to-one prior.

This reminds me of reading about the [Miracle of the Sun](http://en.wikipedia.org/wiki/Miracle_of_the_Sun) in The God Delusion and in a theist's response. I found Dawkins fairly unpersuasive; the many agreeing testimonials weren't enough to overcome the enormous prior improbability, but they were still disconcertingly strong evidence. The theist's response cleared this up by giving historical background that Dawkins omitted. Apparently, the miracle was predicted in advance by three children and had become a focal point in the tensions between the devout and the secular. Suddenly, it was not at all surprising that the gathered crowd witnessed a miracle.

So I'd agree that miracles often have probability of under one in a million, but it's also vitally important to understand the effect of motivation on the likelihood of the evidence. If I thought every testimony to every reported miracle was based on unbiased reporting of fact, I'd have to conclude that many of them happened (caused by aliens messing with us or something).

Craig is just purposely conflating the likelihood of a particular result with the likelihood that, given the lottery officials' declaration of a result, that result is true.

If you and I are flipping coins for a million dollars, it's going to take a lot of convincing evidence that I lost the coin flip before I pay up. You can't just flip the coin in another room where I can't even see it, and then expect me to pay up because, well, the probability of heads is 50% and I shouldn't be so surprised to learn that I lost.

Therefore, the actual likelihood of a particular set of lottery numbers is totally irrelevant in this discussion.

In any case, the only kind of "evidence" that we have been presented for miracles has always been of the form "person X says Y happened," which has long been known as hearsay and dealt with without even bothering with probability theory.

> Christian apologist William Lane Craig claims the skeptical slogan "extraordinary claims require extraordinary evidence" is contradicted by probability theory, because it actually wouldn't take all that much evidence to convince us that, for example, "the numbers chosen in last night's lottery were 4, 2, 9, 7, 8 and 3." The correct response to this argument is to say that the prior probability of a miracle occurring is orders of magnitude smaller than mere one in a million odds.

I'm not sure that response works. Flip a fair coin two hundred times, tell me the results, then show me the video and I'll almost certainly believe you. But if the results were H^200, I won't; I'll assume you were wrong or lying about the coin being fair, or something.

H^200 isn't any less likely than any other sequence of two hundred coin flips, but it's still one of the most extraordinary. Extraordinariness just doesn't feel like it's a mere question of prior probability.

H^200 isn't any less likely under the assumption that the coin is fair, and the person reporting the coin is honest. But! H^200—being a particularly simple sequence—is massively more likely than most other sequences under the alternative assumption that the reporter is a liar, or that the coin is biased.

So being told that the outcome was H^200 is at least a lot of evidence that there's something funny going on, for that reason.

This has nothing to do with simplicity. Any other a priori selected sequence, such as the first 200 binary digits of pi, would be just as unlikely. It seems like it is related to simplicity because "non-simple" sequences are usually described in an aggregate way, such as "100 heads and 100 tails", and such descriptions in fact cover a lot of individual sequences, resulting in an aggregate probability much higher than 1/2^200.

> This has nothing to do with simplicity. Any other a priori selected sequence, such as the first 200 binary digits of pi, would be just as unlikely.

Yes, *under the hypothesis that the coin is fair and has been flipped fairly* all sequences are equally unlikely. But under the hypothesis that someone is lying to us or has been messing with the coin, simple sequences are more likely. So (via Bayes) if we hear of a simple sequence we will think it's more likely to have been artificially created than if we hear of a complicated one.
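A sketch of that Bayesian update, with all numbers illustrative (rigging gets a one-in-a-million prior, and a rigger is assumed to produce a "special" sequence like H^200 with probability 1/1000):

```python
from fractions import Fraction  # exact arithmetic; 2**-200 underflows floats' intuition

p_seq_fair = Fraction(1, 2**200)          # any fixed sequence, fair coin
p_seq_rigged = Fraction(1, 1000)          # a cheat plausibly produces H^200
prior_odds_rigged = Fraction(1, 999_999)  # one-in-a-million prior on rigging

likelihood_ratio = p_seq_rigged / p_seq_fair   # ~1.6e57 in favor of rigging
posterior_odds = prior_odds_rigged * likelihood_ratio
print(posterior_odds > 10**50)            # → True: "rigged" wins overwhelmingly
```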

Well, what's most interestingly improbable here is the *prediction* of a 200-coin sequence, not the sequence itself.

I suspect what's going on with such "extraordinary" sequences is a kind of hindsight bias... the sequence seems so simple and easy to understand that, upon revealing it, we feel like "we knew it all along." That is, we feel like we could have predicted it... and since such a prediction is extraordinarily unlikely, we feel like something extraordinarily unlikely just happened.

> Extraordinariness just doesn't feel like it's a mere question of prior probability.

And, that, my friend, is how an algorithm feels from inside. What else could extraordinariness possibly be? It might also help to read "Probability Is Subjectively Objective".

> What else could extraordinariness possibly be?

You seem to be suggesting that if I actually flipped a coin 200 times, the actual result would be just as extraordinary as the hypothetical result H^200, having an equal prior probability. I'm not sure why you'd be suggesting that, so maybe we have crossed wires?

For one thing, it might be that an extraordinary event is one which causes us to make large updates to the probabilities assigned to various hypotheses. H^200 makes hypotheses like "double headed coin" go from "barely worth considering" to "really quite plausible", so is extraordinary. (I think this idea has a bunch of fiddly little bits that I'm not particularly interested in hammering out. But the idea that the extraordinariness of an event is purely a function of its prior probability, just seems plain wrong.)

> You seem to be suggesting that if I actually flipped a coin 200 times, the actual result would be just as extraordinary as the hypothetical result H^200, having an equal prior probability.

I agree that I seemed to suggest that; I do indeed disagree that some arbitrary (expected to come with attached justifications, but those justifications are not present) sequence would be just as "extraordinary". This is where simplicity comes in -- the only reason a sequence of 200 bits would be interesting to humans is if it were simple -- if it had some special property that allowed us to describe it without listing out the results of all 200 flips. Most sequences of 200 flips won't have this property, which makes the sequences that do have it extraordinary. So I'd consider T^200, (HT)^100, and (TH)^100 extraordinary sequences, but not 001001011101111001101011001111100101001110001011101000100011100 000111000101110001011100011010101001000011011001011011010110101 011100011000000101011001000100011000010100001100110000110010101

However, if I were to take out a coin and flip it now, and get those results, I could say "that's the sequence I posted on Less Wrong," and thus it would be extraordinary.

So I agree that extraordinariness has nothing to do with the prior probability of a *particular* sequence of flips, but rather the fact that such a sequence of flips belongs to a privileged reference class (sequences of flips you can easily describe without listing all 200 flips), and getting a sequence from that reference class is an event with a low prior probability. The combination of being in that particular reference class *and* the fact that such an event (being in that reference class, not the individual sequence itself) is unlikely together might provide a sense of extraordinariness.

I've suggested elsewhere that this sense of extraordinariness when faced with a result like H^200 is the result of a kind of hindsight bias.

Roughly speaking, the idea is that certain notions seem simple to our brains... easier to access and understand and express and so forth. When such a notion is suggested to us, we frequently come to believe that "we knew it all along."

A H^200 string of coin-flips is just such a notion; it seems simpler than a (HHTTTHHTHHT)^20 string, for example. So when faced with H^200 we have a stronger sense of having predicted it, or at least that we *would* have been able to predict it if we'd thought to.

But, of course, predicting the result of 200 coin flips is extremely unlikely, and we know that. So when faced with H^200 we have a much stronger sense of having experienced something extremely unlikely (aka extraordinary) than when faced with a more "random-seeming" string.

Getting only heads in 200 coin flips is just as likely as any other particular result. However, it is of incredibly low entropy -- you should not expect to see a pattern of that sort (it takes fewer bits to describe than listing the results). It's also astronomically unlikely as the result of a fair coin, compared to as the result of fraud or an unfair coin.

Isn't it also the case that you are, in that case, receiving extraordinary evidence? If people were as unreliable about lottery numbers as they are about religion you would in fact remain pretty skeptical about the actual number.

First, I really like you pointing out the frequent 99% cop out and your partitioning of low-probability events into meaningful categories.

Second, I am not sure that your example with 53 being prime is convincing. It would be more interesting to ask "what unlikely event would break your confidence in 53 being prime?" and estimate the probability of such an event.

"what unlikely event would break your confidence in 53 being prime?"

The discovery of modular arithmetic and finite fields?

Presumably ChrisHallquist is already familiar with finite fields.

Your list actually doesn't go far enough. There is a fourth, and scarier, category: things which would, if true, render probability useless as a model. "The chance that probabilities don't apply to anything" is in the fourth category. I would also place there anything that violates such basic things as the consistency of physics, or the existence of the external world.

For really small probabilities, we have to take into account some sources of error that just aren't meaningful in more normal odds.

For instance, if I shuffle and draw one card from a new deck what is the chance of drawing the ace of spades? I disregard any chance of the deck being defective, any chance of my model of the universe being wrong, and any chance of laws of identity being violated. Any probabilities are eclipsed by the normal probabilities of drawing cards. (category 1)

If I shuffle and draw two cards without replacement from a new deck, what is the chance of them both being aces of spades? Now I have to consider other sources of error. There could have been a factory error or the deck may have been tampered with. (category 2)

If I shuffle and draw one card from a new deck, what is the chance of it being a live tiger? Now I have to consider my model of the universe being drastically wrong. (category 3)

If I shuffle and draw one card from a new deck, what is the chance of it being both the ace of spades and the two of clubs? Not a misprint, and not two cards, but somehow both at the same time. Now I have to consider the law of identity being violated. (category 4)

Would 53 not being prime break mathematics?

It would more likely be user error. I believe 53 is prime. If it isn't then either mathematics is broken or I have messed up in my reasoning. It is much more likely that I made an error or accepted a bad argument.

53 not being prime while having no integer factors other than 1 and itself would break mathematics.

LNC, not the law of identity, I think.

Oops, right. Non-contradiction.

This is one of the great reasons to do your math with *odds* rather than *probabilities*. (Well, this plus the fact that Bayes' Theorem is especially elegant when formulated in the form of odds ratios.)

There is no reason, save the historical one, that the default mode of thinking is in probabilities (as opposed to odds.) The math works just the same, but for probabilities that are even slightly extreme (even a fair amount less extreme than what is being talked about here), our intuitions about them break down. On the other hand, our intuitions when doing calculations with odds seem to break down a lot less.

When money is on the line, people have more of an incentive to avoid errors. That's why, traditionally, gamblers use odds and not probabilities. (E.g. the chance of a horse winning a race might be written as "two to nine odds" (or 2:9), and not "eighteen percent chance" (or 0.18). And this example isn't even nearly as extreme as the cases ChrisHallquist talked about, yet putting it in odds form still makes it quite a bit easier to deal with.)
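Bayes' theorem in odds form is just multiplication by the likelihood ratio. A minimal sketch (the helper names are mine):

```python
def odds_to_prob(odds):
    """Convert odds in favor (e.g. 2/9 for 'two to nine') to a probability."""
    return odds / (1 + odds)

def update(prior_odds, likelihood_ratio):
    """Odds-form Bayes: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio

prior = 2 / 9                            # "two to nine" on the horse
print(round(odds_to_prob(prior), 2))     # → 0.18

# After seeing evidence that's 3x as likely if the horse wins:
print(round(odds_to_prob(update(prior, 3.0)), 2))   # → 0.4
```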

I'm not sure of the value of odds as opposed to probabilities for extreme values. Million-to-one odds is virtually the same thing as a 1/1,000,000 probability. Log odds, on the other hand, seem like they might have some potential for helping people think clearly about the issues.

I'd also note that probabilities are more useful for doing expected value calculations.

Anyone know whether gambling being expressed as odds is cross-cultural?

I've never been completely happy with the "I could make 1M similar statements and be wrong once" test. It seems, I dunno, kind of a *frequentist* way of thinking about the probability that I'm wrong. I can't imagine making a million statements and have no way of knowing what it's like to feel confidence about a statement to an accuracy of one part per million.

Other ways to think of tiny probabilities:

(1) If probability theory tells me there's a 1 in a billion chance of X happening, then P(X) is somewhere between 1 in a billion and P(I calculated wrong), the latter being much higher.

If I were running on hardware that was better at arithmetic, P(I calculated wrong) could be got down way below 1 in a billion. After all, even today's computers do billions of arithmetic operations per second. If they had anything like a one-in-a-billion failure rate per operation, we'd find them much less useful.

(2) Think of statements like P(7 is prime) = 1 as useful simplifications. If I am examining whether 7 is prime, I wouldn't start with a prior of 1. But if I'm testing a hypothesis about *something else* and it depends on (among other things) whether 7 is prime, I wouldn't assign P(7 is prime) some ridiculously specific just-under-1 probability; I'd call it 1 and simplify the causal network accordingly.

> It seems, I dunno, kind of a *frequentist* way of thinking about the probability that I'm wrong.

There are numerous studies showing that our brain's natural way of thinking about probabilities is in terms of frequencies, and that people show less bias when presented with frequencies than when they are presented with percentages.

> There are numerous studies showing that our brain's natural way of thinking about probabilities is in terms of frequencies

Thinking about **which** probabilities?

Probability is a complex concept. The probability in the sentence "the probability of getting more than 60 heads in 100 fair coin tosses" is a very different beast from the probability in the sentence "the probability of rain tomorrow".

There is a reason that both the frequentist and the Bayesian approaches exist.

You can calculate wrong in a way that overestimates the probability, even if the probability you estimate is small. For some highly improbable events, if you calculate a probability of 10^-9 your best estimate of the probability might be smaller than that.

True. I suppose I was unconsciously thinking (now there's a phrase to fear!) about improbable *dangerous* events, where it is much more important not to underestimate P(X). If I get it wrong such that P(X) is truly only one in a trillion, then I am never going to know the difference and it's not a big deal, but if P(X) is truly on the order of P(I suck at maths) then I am in serious trouble ;)

Especially given the recent evidence you have just provided for that hypothesis.

T-t-t-the Ultimate Insult, aimed at... oh my... */me faints*

> it actually wouldn't take all that much evidence to convince us that, for example, "the numbers chosen in last night's lottery were 4, 2, 9, 7, 8 and 3." The correct response to this argument is to say that the prior probability of a miracle occurring is orders of magnitude smaller than mere one in a million odds.

That doesn't seem right. If somebody tries to convince me that the result of a fair 5-number lottery is *1, 2, 3, 4, 5*, I would have a much harder time believing it, but not because the probability is less than one in a million. I think the correct answer is that if the outcome of the lottery wasn't *4, 2, 9, 7, 8, 3*, it is very unlikely anybody would try to convince me that the result was exactly that one.

(Assume *outcome* is 4, 2, 9, 7, 8, 3.)

Whereas P(*outcome*) is 1/1,000,000, P(*outcome* | they tell you the outcome is *outcome*) is much higher, because P(they tell you the outcome is *outcome* | not *outcome*) is so much lower than P(they tell you the outcome is *outcome* | *outcome*).

Yeah, that's interesting.

I agree with Eliezer's post, but I think that's a good nitpick. Even if I can't be that certain about 10,000 statements *consecutively*, because I get tired, I think it's plausible that there are 10,000 simple arithmetic statements which, if I understand them, check them of my own knowledge, and remember seeing them in a list on Wikipedia (which is what I did for 53), I've only ever been wrong about once. I find it hard to judge the exact amount, but I definitely remember thinking "I thought that was prime but I didn't really check and I was wrong"; I don't remember thinking "I checked that statement and then it turned out I was still wrong" for something that simple.

Of course, it's hard to be much *more* certain. I don't know what the chance is that (eg) mathematicians change the definition of prime -- that's pretty unlikely, but similar things have happened before that I *thought* I was certain of. But rarely.

Of course, it's hard to be much more certain. I don't know what the chance is that (eg) mathematicians change the definition of prime -- that's pretty unlikely, but similar things have happened before that I thought I was certain of. But rarely.

If mathematicians changed the definition of "prime," I wouldn't consider previous beliefs about prime numbers to be wrong, it's just a change in convention. Mathematicians have disagreed about whether 1 was prime in the past, but that wasn't settled through proving a theorem about 1's primality, the way normal questions of mathematical truth are. Rather, it was realized that the convention that 1 is not prime was more useful, so that's what was adopted. But that didn't render the mathematicians who considered 1 prime wrong (at least, not wrong about whether 1 was prime, maybe wrong about the relative usefulness of the two conventions.)

I emphatically agree with that, and I apologise for choosing a less-than-perfect example.

But when I'm thinking of "ways in which an obviously true statement can be wrong", I think one of the prominent ways is "having a different definition than the person you're talking to, but both assuming your definition is universal". That doesn't matter if you're always careful to delineate between "this statement is true according to my internal definition" and "this statement is true according to commonly accepted definitions", but if you're 99.99% sure of your own definition, it's easy NOT to specify which you mean (e.g. in the first sentence of the post).

In cases like this where we want to drive the probability that something is true as high as possible, you are always left with an incomputable bit.

The bit that can't be computed is - am I sane? The fundamental problem is that there are (we presume) two kinds of people, sane people, and mad people who only think that they are sane. Those mad ones of course come up with mad arguments which show that their sanity is just fine. They may even have supporters who tell them they are perfectly normal - or even hallucinatory ones. How can I show which category I am in? Perhaps instead I am mad, and too mad to know it !

Only mad people can prove that they are sane - the rest of us don't know for sure one way or the other, as every argument in the end returns to the problem that I have to decide whether it's a good argument or not, and whether I am in any position to decide that correctly is the point at issue.

It's quite easy, when trying to prove that 53 must be prime, to get to the position where this problem is the largest remaining issue, but I don't think it's possible to put a number on it. In practice of course I discount the problem entirely as there's nothing I can do about it. I assume I'm fallibly sane rather than barking crazy, and carry on regardless.

I agree that you can be 99.99% (or more) certain that 53 is prime, but I don't think you can be that confident based *only* on the argument you gave.

If a number is composite, it must have a prime factor no greater than its square root. Because 53 is less than 64, sqrt(53) is less than 8. So, to find out if 53 is prime or not, we only need to check if it can be divided by primes less than 8 (i.e. 2, 3, 5, and 7). 53's last digit is odd, so it's not divisible by 2. 53's last digit is neither 0 nor 5, so it's not divisible by 5. The nearest multiples of 3 are 51 (=17x3) and 54, so 53 is not divisible by 3. The nearest multiples of 7 are 49 (=7^2) and 56, so 53 is not divisible by 7. Therefore, 53 is prime.

There are just too many potential errors that could occur in this chain of reasoning. For example, how sure are you that you correctly listed the primes less than 8? Even a mere typo at this stage of the argument could result in an erroneous conclusion.

Anyway, just to be clear, I do think your high confidence that 53 is prime is justified, but the argument you gave for it is insufficient in isolation.
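One way to shrink the "typo in the chain of reasoning" risk is to let a machine do the trial division; a minimal sketch (the helper name is mine):

```python
# A mechanical version of the argument quoted above: trial division by
# every candidate up to sqrt(n) (which covers the primes 2, 3, 5, 7 for
# n = 53), taking the "did I list the primes right?" step out of my head.
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

print(is_prime(53))                              # True
print([p for p in range(2, 8) if is_prime(p)])   # [2, 3, 5, 7]
```

Of course this only relocates the residual doubt from "did I do the arithmetic right?" to "did I write and run the code right?" — which is rather the point of the surrounding discussion.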

I wouldn't call that argument my only reason, but it's my best shot at expressing my main reason in words.

Funny story: when I was typing this post, I almost typed, "If a number is not prime, it must have a prime factor greater than its square root." But that's wrong, counterexamples include pi, i, and integers less than 2. Not that I was confused about that, my *real* reasoning was partly nonverbal and included things like "I'm restricting myself to the domain of integers greater than 1" as unstated assumptions. And I didn't actually have to spell out for myself the reasoning why 2 and 5 aren't factors of 53; that's the sort of thing I'm used to just seeing at a glance.

This left me fearing that someone would point out some other minor error in the argument in spite of the argument's being essentially correct, and I'd have to respond, "Well, I said I was 99.99% sure 53 was prime, I never claimed to be 99.99% sure of that particular argument."

I should perhaps include within the text a more direct link to Peter de Blanc's anecdote here:

http://www.spaceandgames.com/?p=27

I won't say "Thus I refute" but it is certainly a cautionary tale.

It seems to me to be mostly a cautionary tale about the dangers of taking a long series of bets when you're tired.

Definitely agreed. It's basically a variation on the old (very old) "Get a distracted or otherwise impaired person to agree to a bunch of obviously true statements, and then slip in a false one to trip them up" trick. I can't see that it has any relevance to the philosophical issue at hand.

**[deleted]**· 2013-11-15T02:10:01.411Z · score: 1 (1 votes) · LW · GW

Not quite, as SquallMage had correctly answered that 27, 33, 39 and 49 were not prime.

I believe that was part of the mistake, answering whether or not the numbers were prime, when the original question, last repeated several minutes earlier, was whether or not to accept a deal.

The point is, it's fundamentally the same trick, and is just that: a trick.

Except it's not the same trick. What you describe relies on the mark getting into the rhythm of replying "yes" to every question; the actual example described has the mark checking each number, but making a mistake eventually, because the odds they will make a mistake are *not zero*.

Yeah. When I try to do the "can I make a hundred statements yadda yadda" test I typically think in terms of one statement a day for a hundred days. Or more often, "if I make a statement in this class every day, how long do I expect it to take before I get one wrong?"

I won't say "Thus I refute" but it is certainly a cautionary tale.

I think we need to be very careful what system we're actually describing.

If someone asks me, "are you 99.99% sure 3 is prime? what about 5? what about 7? what about 9? what about 11?", my model does not actually consider these to be separate-and-independent facts, each with its own assigned prior. My mind "chunks" them together into a single meta-question: "am I 99.99% sure that, if asked X questions of the nature 'is {N} prime', my answer will conform to their own?"

This question itself depends on many sub-systems, each with its own probability:

- P(1). How likely is my prime number detection heuristic to return a false positive?
- P(2). How prone is my ability to utilize my prime number detection heuristic to error?
- P(3). How lossy is the channel by which I communicate the results of my utilization of my prime number detection heuristic?
- P(4). How likely is it that the apparently-communicated question 'is {N} prime?' actually refers to a different thing that I mean when I utilize my prime number detection heuristic?

So the meta-question "am I 99.99% sure that, if asked X questions of the nature 'is {N} prime', my answer will conform to their own?" is bounded above by ([1 - P(1)] * [1 - P(2)] * [1 - P(3)] * [1 - P(4)])^X.

Why I feel this is important:

When first asked a series of "Is {N} prime?" questions, my mind will immediately recognize the meta-question represented by P(1). It will NOT intuitively consider P(2), P(3) or P(4) relevant to the final bounds, so it will compute those bounds as ([1 - P(1)] * 1 * 1 * 1)^X.

Then, later, when P(2) turns out to be non-zero due to mental fatigue, I will explain away my failure as "I was tired" without REALLY recognizing that the original failure was in recognizing P(2) as a confounding input in the first place. (I.e.: in my personal case, especially if I was tired, I feel that I'd be likely to ACTUALLY use the heuristic "does my sense of memory recognition ping off when I see these numbers in the context of 'numbers with factorizations'" rather than the heuristic "perform Eratosthenes' sieve rigorously and check all potential factors", and not even realize that I was not performing the heuristic that I was originally claiming 99.99% confidence in.)

I think a lot of people are making this mistake, since they seem to phrase their objections as "how could I have a prior that math is false?" - when the actual composite prior is "could math be false OR my ability to properly perform that math be compromised OR my ability to properly communicate my answer be compromised OR my understanding of what question is actually being asked be faulty"?, which is what your example actually illustrates.
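A quick numeric sketch of how these sub-system error rates compound (all the rates below are invented for illustration):

```python
# Sketch of how independent sub-system error rates compound over many
# questions. All the rates below are invented for illustration.
p_heuristic_err = 1e-4   # P(1): primality heuristic gives a wrong answer
p_execution_err = 1e-3   # P(2): I misapply the heuristic (fatigue, etc.)
p_channel_err   = 1e-4   # P(3): I miscommunicate my answer
p_meaning_err   = 1e-5   # P(4): we mean different things by "prime"

per_question = ((1 - p_heuristic_err) * (1 - p_execution_err)
                * (1 - p_channel_err) * (1 - p_meaning_err))
X = 10_000
print(per_question)       # just under 0.999 per question
print(per_question ** X)  # chance of a clean run of 10,000 answers
```

Even with every individual failure mode below one in a thousand, the chance of getting through 10,000 answers without a single slip is tiny — which is exactly why "make 10,000 statements" is a bad operationalization of per-statement confidence.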

EDIT: did I explain this poorly? Or is it just really stupid?

**[deleted]**· 2013-11-15T02:07:41.935Z · score: 2 (2 votes) · LW · GW

Sure, P(I'm mistaken about whether 53 is prime) is non-negligible (I've had far worse brain farts myself).

But P(I'm mistaken about whether 53 is prime|I'm not sleep-deprived, I haven't answered a dozen similar questions in the last five minutes, and I've spent more than ten seconds thinking about this) is several orders of magnitude smaller.

And P(I'm mistaken about whether 53 is prime|[the above], and I'm looking at a printed list of prime numbers and at the output of `factor 53`) is almost at the blue tentacle level.

Related: Advancing Certainty, and these comments.

Thank you. I'd seen "Advancing Certainty" before I wrote my post, but the comments are really good too.

**[deleted]**· 2013-11-07T15:20:57.190Z · score: 2 (4 votes) · LW · GW

Things that have a probability of something like one in a million. Includes many common ways to die that don't involve doing anything most people would regard as especially risky. For example, these stats suggest the odds of a 100 mile car trip killing you are somewhere on the order of one in a million.

I am not entirely sure about this, since I have made a similar mistake in the past, but if I am applying my relatively recent learning of this correctly, I think it technically suggests that if 1 million random people drive 100 miles, one of them will probably die, and based on a quick check of top causes of car accidents, we would also expect that person to be at least one of: distracted, speeding, drunk, reckless, driving in the rain, drowsy, etc.

And if you are currently an undistracted, alert, sober driver who is following all traffic rules in dry weather, your chances of an accident during this particular drive are notably lower. (Although I don't know if they are low enough for driving to be safer than a plane, because I don't have sufficient statistics on that.)

Edit: Actually I think I'm still insufficiently clear. Technically, we wouldn't be able to expect that they had at least one risk factor without knowing the overall prevalence of those risk factors in the overall population. Admittedly, I did include 'Speeding' as a risk factor and my understanding is that speeding is quite common, (and I did include an etc. to include other non listed risk factors, which would also increase the chance of at least one of them being true) but I haven't actually run the numbers on my expectation above either.

This fits with what I've read, though I'd point out that while we get our share of anti-drunk driving and now anti-texting-while-driving messages, most people don't seem to think driving in the rain, driving when they're a bit tired, or being a bit over the speed limit are particularly dangerous activities.

(Also, even if you're an exceptionally careful driver, you can still be killed by someone else's carelessness.)

I don't think most people believe that driving when they're very tired is especially dangerous.

And if you are currently an undistracted, alert, sober driver [...], your chances of an accident during this particular drive are notably lower.

Lower, maybe. But they are still on the order of 1:10^6.

The boundary between the categories 1:100, 1:10^6 and 1:10^10 is not really a boundary at all; it's continuous. Categorizing into three rough areas expresses insufficient experience with all the shades in between. I don't mean that offensively. Dealing with risks appears to be normally done by the subconscious. Lifting it into the conscious is sensible, but just assigning three categories will not do. Neither will assigning words to more differentiated categories, as in Lojban ( http://lesswrong.com/lw/9jv/thinking_bayesianically_with_lojban/ ).

Real insight comes from training. Training with a suitable didactic strategy. One strategy obviously being to read the sequences as that forces you to consider lots of different more or less unlikely situations.

What I am missing is a structured way to decompose these odds. 1:10^6 for a car accident in a 100 mile drive is arbitrary in so far as you can decompose it into either a 10 meter drive (say out of the parking lot) which then immediately moves the risk formally into the latter category. Or alternatively dying in a car accident in your life time which moves it into the first category.

So why is it that a 100 mile drive was chosen?

I think part of what's troubling you about the test is this: the claim that X has a probability of 10^-30, despite a prior of 50%, is roughly equivalent to saying "I have information whose net result is 100 bits of evidence that X is false." That is certainly a difficult feat, but not really that hard if you put some effort into it (especially when you choose X). The proposed test to verify such a claim, i.e. making 10^30 similar statements and being wrong only once, would not only be impossible in your lifetime, but would be equivalent to saying "I have 10^32 net bits of information concerning 10^30 questions." Not only will you get defeated by exhaustion and carelessness (including careless choice of X), but your brain just won't hold that much information, which means that such a test would amount to predicting that, in the future, I can acquire that many bits about that many things.
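The "100 bits" figure can be checked directly, treating the 50% prior as 1:1 odds:

```python
import math

# The claim "P(X) = 1e-30, down from a 50% prior" expressed in bits
# of evidence. A 50% prior is 1:1 odds; the posterior odds are tiny.
posterior_p = 1e-30
posterior_odds = posterior_p / (1 - posterior_p)
bits = math.log2(1.0 / posterior_odds)  # size of the odds shift, in bits
print(bits)  # roughly 99.7 — the "100 bits" in the comment
```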

However flawed the test might be, the conclusion that you're probably being extremely overconfident is probably still true. I recommend playing one of the calibration games before you go trusting yourself too much on estimates above 90%.

I don't think your comment with the lottery is a good example. If there was a lottery last night, then it was going to be some combination of random numbers, with no combination more or less likely than any other. If you come up and tell me "the winning lottery combination last night was X", the odds of you being correct are pretty high; there's really nothing unlikely in that scenario at all. Taking a look at some random number in the real world and thinking about the probability of it is meaningless, since you could be sitting there having the exact same thought no matter what the lottery had ended up being.

If you want to get a better feeling for what a 1 in a million chance feels like, let's say that I come up to you and try to convince you "The numbers that will be chosen in *tomorrow night's* lottery will be 4, 2, 9, 7, 8 and 3." How easy would it be to convince you of that, before the drawing actually happens?

I don't like this analysis much; in particular "the odds of you being correct are pretty high; there's really nothing unlikely in that scenario at all" seems unclear. Here's what I consider a better one. (It's similar to what fela says in a comment from last month but with more detail.)

There are multiple different mechanisms that would lead to you saying "Last night's lottery numbers were [whatever]". You might have just made them up at random. You might have tried to determine them by magic, prayer, ESP, etc., and not checked against reality. You might have done one of those things but *actually* overheard the numbers without noticing, and been influenced by that. You might have read a report of what they were. Etc.

It seems likely (though it might be hard to check) that most of the time when someone reports a set of lottery numbers from the past it's because they looked up what they were. In that case they're probably right, and (if wrong) probably close to right. So when someone tells me the numbers were 4, 2, 9, 7, 8, 3, there's an excellent chance that those really were the numbers.

If they tell me the numbers were 1, 2, 3, 4, 5, 6, some quite different mechanisms become more probable -- they might be making them up for fun, have been misled by a hoax, etc. In this scenario I'd still reckon a posterior probability well over 10^-6 for the numbers actually being 1,2,3,4,5,6, but probably not over 10^-1 until I'd got some more evidence.

Similarly, if the person were clutching a lottery ticket with the numbers 4, 2, 9, 7, 8, 3 then I would be less inclined to believe that those really were the numbers -- again, because of the increased probability that the person in question is lying, is self-deceiving, etc. (Especially if they had something to gain from convincing me that they held a winning lottery ticket.)

Now, make it tomorrow's lottery instead of tonight's. What's different? The most important difference is that the formerly most likely mechanism (they read the numbers in the newspaper or whatever) is no longer possible. So now we're left with things like ESP, cheating, divine inspiration, etc., all of which are (in my judgement and probably yours) extremely unlikely to lead them to give numbers that correlate in any way with the real winning numbers. And also just-making-it-up, also unlikely to correlate with the real numbers.

OK. Now, finally, what about William Lane Craig's argument and Chris's assessment of it?

I think WLC is pretty much correct that the prior probability of the numbers being 4,2,9,7,8,3 is 10^-6, and that if someone tells you those were the numbers this is approximately enough evidence to bring that up to (say) 10^-1 or better. We can extend this further -- suppose someone tells me what purport to be digits 999000..999999 of pi; I'd expect there to be a very good chance that they're all correct. (And if you're uncomfortable with probabilities over the digits of pi, which are after all necessarily whatever they are, we could make it "the result of 3000 coin-flips" or something.) So "extraordinary claims require extraordinary evidence", if that's taken to mean "very low-prior-probability claims require earth-shattering evidence", is wrong. And it's possible for "ordinary" (i.e., not earth-shattering) evidence to give a reasonable person a pretty big posterior probability for something whose prior probability is less than that of (say) "something at least kinda-sorta resembling Christianity is correct".

Accordingly, I think Chris's assessment is too simple.

As with the lottery example, we need to look at the possible ways in which the testimony that reaches us might come to be what it is. What makes me pretty comfortable about disbelieving (say) the alleged resurrection of Jesus despite the testimony in its favour isn't the mere fact that the resurrection is awfully improbable prior to that testimony (though it is), it's the fact that, of the mechanisms that would lead to the existence of such testimony, the ones that don't involve an actual resurrection are much more probable than the ones that do. Nothing comparable to this is true in the case of the lottery example -- because the testimony we've got in that case really is very low-probability if those aren't the right lottery numbers. It's "extraordinary" in the sense of "very low probability if the claim is false", just not in the sense of "clearly the kind of thing that we never hear in the ordinary course of events".

The maxim that "extraordinary claims require extraordinary evidence" is still OK, if it's expanded as follows. To support a low-prior-probability claim C, you need evidence that's *sufficiently more probable on C than on not-C* and, in particular, evidence that's *very improbable conditional on not-C*. If C is a specification of the results of 3000 coin-flips, then someone's report of those results can easily be such evidence -- but, e.g., it wouldn't be if what they report is that the results were perfectly alternating H and T even though the flips were carried out "fairly" by someone not intending to cheat. The improbability of C is simply a matter of its specificity, and if the evidence has the same specificity then (in many cases) that's enough to make C quite probable a posteriori. But if C is, say, "Jesus was genuinely dead for at least a day and then alive again" then the fact that some people report this having happened *isn't particularly improbable conditional on not-C* and that's what makes it not "extraordinary" enough.
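The specificity point can be made concrete with the 3000-coin-flip case:

```python
import math

# A particular sequence of 3000 fair coin flips has prior 2^-3000,
# but a faithful flip-by-flip report carries 3000 bits of evidence,
# exactly enough to match that prior's specificity.
n_flips = 3000
prior_denominator = 2 ** n_flips        # prior = 1 / 2**3000 (exact int)
surprise_bits = math.log2(prior_denominator)  # CPython log2 takes big ints
report_bits = n_flips                   # one bit per reported flip
print(surprise_bits == report_bits)     # True
```

The report is "extraordinary" in exactly the sense required: it would be astronomically improbable for that particular report to occur if the claim were false, so ordinary-seeming testimony can carry enough bits to overcome the prior.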

I feel that the sentence

Suppose you say that you're 99.99% confident that 2 + 2 = 4. Then you have just asserted that you could make 10,000 independent statements, in which you repose equal confidence, and be wrong, on average, around once.

is a little questionable to begin with. What exactly is an "independent" statement in this context? The only way to produce a statement about whether 2 + 2 = 4 holds is to write a proof that it holds (or doesn't hold). But in a meaningful mathematical system you can't have two independent proofs of the same statement. Two proofs of the same thing are either both right or both wrong, or they aren't proofs in the first place.

I always thought this must be the case from plain observation of thinking; much thinking is "logical", and pure logic is not a suitable model under significant uncertainty. There must be many situations where you're 99.99%+ certain in order to make logical thinking useful.

This is false modesty. This is assuming the virtue of doubt when none ought exist. Mathematics is one of the few (if not the only) worthwhile thing(s) we have in life that is entirely a priori. We can genuinely achieve 100% certainty. Anything less is to suggest the impossible, or to redefine the world in a way that has no meaning or usefulness.

I could say that I'm not really sure 2+2=4, but it would not make me more intelligent for the doubt, but more foolish. I could say that I'm not sure that 5 is really prime, but it would hinge on redefining '5' or 'prime'. I could posit that if 2+3 reproducibly equaled 4, I would have to change my view of the universe and mathematics, but were I to suggest that argument held any weight, I might as well start believing in God. Define any paradox you like and there will never be a correct answer. The solution is not to accept doubt, but rather to ignore truly unsolvable paradoxes as foolish and useless.

The problem in creating the parallel probability statements is not in the surety, for they would all almost certainly be mathematical as well, but in the daunting task of finding and stating them. This is not reason, this is a threat! "If you assign X probability, are you willing to spend X hours finding parallels?" We react in the negative not due to the reasonability of the rebuttal but rather the daunting task saying yes would hypothetically place upon us. Our chance to perform the task correctly is likely significantly less than that of the probability we have assigned.

If humans are bad at mental arithmetic, but good at, say, not dying - doesn't that suggest that, as a practical matter, humans should try to rephrase mathematical questions into questions about danger?

E.g. Imagine stepping into a field crisscrossed by dangerous laser beams in a prime-numbers manner to get something valuable. I think someone who had a realistic fear of the laser beams, and a realistic understanding of the benefit of that valuable thing would slow down and/or stop stepping out into suspicious spots.

Quantifying is ONE technique, and it's been used very effectively in recent centuries - but those successes were inside a laboratory / factory / automation structure, not in an individual-rationality context.

If humans are bad at mental arithmetic, but good at, say, not dying - doesn't that suggest that, as a practical matter, humans should try to rephrase mathematical questions into questions about danger?

I don't think this would help at all. Humans have some built-in systems to respond to danger that is shaped like a tiger or a snake or other learned stimuli, like when I see a patient go into a lethal arrhythmia on the heart monitor. This programmed response to danger pumps you full of adrenaline and makes you very motivated to run *very fast*, or work *very hard* at some skill that you've practiced over and over. Elite athletes perform better under the pressure of competition; beginners perform worse.

An elite mathematician might do math faster if they felt they were in danger, but an elite mathematician is probably motivated to do mental arithmetic in the first place. I place around 95% confidence that a generic bad-at-mental-arithmetic human would perform worse if they felt they were in danger than if they were in a safe classroom environment. If a patient is in cardiac arrest, I'm incredibly motivated to do something about it, but I don't trust my brain with even the simplest mental arithmetic. (Which is irritating, actually.)

This doesn't address the reward part of your situation, the "something valuable" at the end of the road. Without the danger, or with some mild thrill-adding danger, this might be a workable idea.

Elite athletes perform better under the pressure of competition; beginners perform worse.

Can you rule out the selection effect? (Which would be that people who *don't* happen to perform better under the pressure of competition, don't *become* elite athletes.)