Draft: Reasons to Use Informal Probabilities

post by jimrandomh · 2010-10-11T22:50:35.892Z · LW · GW · Legacy · 13 comments

If I roll 15 fair 6-sided dice, take the ones that rolled 4 or higher, roll them again, and sum up all the die rolls... what is the probability that I drop at least one die on the floor?

There are two different ways of using probability. When we think of probability, we normally think of neat statistics problems where you start with numbers, do some math, and end with a number. After all, if we don't have any numbers to start with, we can't use a proven formula from a textbook; and if we don't use a proven formula from a textbook, our answer can't be right, can it? But there's another way of using probability that's more general: a probability is just an estimate, produced by the best means available, even if that's a guess produced by mere intuition. To distinguish these two types, let's call the former kind *formal probabilities*, and the latter kind *informal probabilities*.

An informal probability summarizes your state of knowledge, no matter how much or how little knowledge that is. You can make an informal probability in a second, based on your present level of confidence, or spend time making it more precise by looking for details, anchors, and reference classes. It is perfectly valid to assign probabilities to things you don't have numbers for, to things you're completely ignorant about, to things that are too complex for you to model, and to things that are poorly defined or underspecified. Giving a probability estimate does not require *any* minimum amount of thought, evidence, or calculation. Giving an informal probability is not a claim that any relevant mathematical calculation has been done, nor that any calculation is even possible.

I present here the case for assigning informal probabilities, as often as is practical. If any statement crosses your mind that seems especially important, you should put a number on it. Routinely putting probabilities on things has significant benefits, even if they aren't very accurate, even if you don't use them in calculations, and even if you don't share them. The process of assigning probabilities to things tends to prompt useful observations and clarify thinking; it eases the transition into formal calculation when you discover you need it, and provides a sanity check on formal probabilities; having used probabilities makes it easier to diagnose mistakes later; and using probabilities lets you quantify, not just confidence, but also the strength and usefulness of pieces of evidence, and the expected value of avenues of investigation. Finally, practice at generating probabilities makes you better at it.

The first thing to notice is that informal probabilities are much more broadly applicable than formal probabilities are. A formal probability requires more information and more work; in particular, you need to start with relevant numbers. For most routine questions, you just don't have that data, and it wouldn't be worth gathering anyway. For example, it's worth estimating the informal probability that you'll like a dish before ordering it at a restaurant, but producing a formal probability would require a taste test, which is far outside the realm of practicality.

Assigning informal probabilities clarifies thinking, by forcing you to ask [the fundamental question](http://lesswrong.com/lw/24c/the_fundamental_question/): What do I believe, and why do I believe it? Sometimes, the reason turns out not to be very good, and you ought to assign a low probability. That's important to notice. Sometimes the reason is solid, but tracking it down leads you to something else that's important. That's good, too. Coming up with probabilities also pushes you to look for reference classes and examples. You can still ask these things without using probability, but trying to produce a probability gives guidance and motivation that greatly increases the chance that you'll actually remember to ask these questions when you need to. Informal probabilities also ease the transition into formal calculation when you need it; you can fill in an expected-utility calculation or other formula with estimates, then look for better numbers if the decision is close enough.
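The last point, filling a formula with estimates and refining only if the decision is close, can be sketched roughly as follows. This is a minimal illustration, not a method from the post: the function names, the utilities, the probabilities, and the `margin` threshold are all hypothetical values chosen for the restaurant example.

```python
def expected_utility(p_good: float, u_good: float, u_bad: float) -> float:
    """Expected utility of an option with two outcomes (good / bad)."""
    return p_good * u_good + (1.0 - p_good) * u_bad


def is_close_call(eu_a: float, eu_b: float, margin: float) -> bool:
    """True when two options are close enough that better numbers might flip the choice."""
    return abs(eu_a - eu_b) < margin


# Hypothetical informal estimates: an unfamiliar dish vs. a familiar one.
eu_new_dish = expected_utility(p_good=0.6, u_good=10.0, u_bad=-5.0)
eu_usual_dish = expected_utility(p_good=0.95, u_good=6.0, u_bad=-5.0)

# Only spend effort on better estimates if the gap is small (margin is an assumption).
worth_refining = is_close_call(eu_new_dish, eu_usual_dish, margin=2.0)
print(eu_new_dish, eu_usual_dish, worth_refining)
```

With these made-up inputs the two options land within the margin of each other, which is the situation where looking for better numbers is worth the effort.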

Probabilities are easier to remember than informal notions of confidence. This is important if you catch a mistake and need to go back and figure out where you went wrong; you want to be able to point to a specific thought you had and say, "this was wrong in light of the evidence I had at the time", or "I should've updated this when I found out X". Unfortunately, memories of degrees of confidence tend to come back badly distorted, unless they're crystallized somehow. Worse, they tend to come back consistently biased towards whatever would be judged correct now, which makes them useless or worse. Numbers crystallize those memories, making them usable and enabling you to retrace your steps.

Quantifying confidence also enables us to quantify the strength of evidence - that is, how much a piece of information *changes* our confidence. For example, a piece of evidence that changes our probability estimate from 0.2 to 0.8 moves the odds from 1:4 to 4:1, which is a likelihood ratio of 16:1, or 4 bits of evidence. Assigning before-and-after-evidence probabilities to a statement forces you to consider just how good a piece of evidence it is; and this makes certain mistakes less likely. It's less tempting to round weak arguments off to zero, or to respond emotionally to an argument without judging its actual significance, if you're in the habit of putting numbers on that significance. But keep in mind that there is not one true value for the strength of a piece of evidence; it depends on what you already know. For example, an argument that's a duplicate of one you've already updated on has no value at all.
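The arithmetic behind a before-and-after update can be worked in odds form. The sketch below assumes the usual definitions (odds = p / (1 - p), likelihood ratio = posterior odds / prior odds, bits = log2 of that ratio); the function names are illustrative only.

```python
import math


def odds(p: float) -> float:
    """Convert a probability into odds in favor, p : (1 - p)."""
    return p / (1.0 - p)


def likelihood_ratio(prior: float, posterior: float) -> float:
    """Ratio by which the evidence multiplied the odds."""
    return odds(posterior) / odds(prior)


def evidence_bits(prior: float, posterior: float) -> float:
    """Strength of evidence in bits: log2 of the likelihood ratio."""
    return math.log2(likelihood_ratio(prior, posterior))


# Moving from 0.2 to 0.8 takes the odds from 1:4 to 4:1.
print(likelihood_ratio(0.2, 0.8))  # approximately 16
print(evidence_bits(0.2, 0.8))     # approximately 4
```

Working in odds rather than raw probabilities is what makes the update multiplicative, so the same 16:1 evidence would move a 0.5 estimate to roughly 0.94 rather than adding a fixed 0.6.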

Finally, assigning probabilities to things is a skill like any other, which means it improves with practice. Estimating probabilities and writing them down enables us to calibrate our intuitions. Even if you don't write anything down, just noticing every time you put a 0.99 on something that turns out to be false is a big improvement over no calibration at all.
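One way to check calibration from a written log is to group predictions by stated probability and compare each group with how often it came true. The sketch below is an assumption about how such a log might be kept, not something the post specifies: the log format, the rounding to one decimal place, and the use of the Brier score as a summary are all choices made for illustration, and the sample data is invented.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Map each stated probability (rounded to 0.1) to (count, observed frequency)."""
    buckets = defaultdict(list)
    for stated_p, came_true in predictions:
        buckets[round(stated_p, 1)].append(came_true)
    return {
        p: (len(outcomes), sum(outcomes) / len(outcomes))
        for p, outcomes in sorted(buckets.items())
    }


def brier_score(predictions):
    """Mean squared gap between stated probability and outcome (0 is perfect)."""
    return sum((p - float(hit)) ** 2 for p, hit in predictions) / len(predictions)


# Hypothetical log of (stated probability, did it happen?) pairs.
log = [(0.9, True), (0.9, True), (0.9, False), (0.6, True), (0.6, False)]

for stated_p, (count, frequency) in calibration_table(log).items():
    print(f"said {stated_p:.1f}: {count} predictions, {frequency:.2f} came true")
print("Brier score:", brier_score(log))
```

A well-calibrated estimator would see the observed frequency in each row approach the stated probability as the log grows; a handful of entries, as here, says very little.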

I know of only one caveat: You shouldn't share every probability you produce, unless you're very clear about where it came from. People who're used to only seeing formal probabilities may assume that you have more information than you really do, or that you're trying to misrepresent the information you have.

To help overcome any internal resistance to giving informal probabilities, I have here a list of probability Fermi problems. A Fermi problem asks for only a rough estimate - an order of magnitude - and it does not include enough information for a precise answer. So too with these problems, which contain just enough information for an estimate. Answer quickly (ten seconds per question at most). Don't do any calculations except very simple ones in your head. Don't worry about all the missing details that could affect the answer. The goal is to be quick, since slowness is the main obstacle to using probability routinely.

1. A car is white.
2. A car is a white, ten year old Ford with a dent on the rear right door.
3. A ten-mile car trip will involve a collision.
4. A building is residential.
5. A person is below the age of 20.
6. A word in a book contains a typo.
7. Your arm will spontaneously transform into a blue tentacle today.
8. A purse contains exactly 71 coins.
9. 76297 is a prime number.

I also suggest making some predictions on PredictionBook and taking a calibration quiz.

13 comments

Comments sorted by top scores.

comment by Vladimir_Nesov · 2010-10-11T22:59:31.119Z · LW(p) · GW(p)

This post should review the arguments in When (Not) To Use Probabilities.

Sometimes the conclusion you can derive from the made-up numbers is worse than a directly intuited conclusion, since the latter is one step closer to the native form of the request for answers from the brain.

Replies from: komponisto
comment by komponisto · 2010-10-13T03:16:21.299Z · LW(p) · GW(p)

> This post should review the arguments in When (Not) To Use Probabilities.

The lesson of that post is basically "don't let yourself be deceived into thinking your calibration is better than it is". But if you're poorly calibrated, better to know this, and giving explicit probability estimates may help you find this out.

Hiding your judgements doesn't make them better.

comment by Vladimir_Nesov · 2010-10-11T23:09:14.260Z · LW(p) · GW(p)

> Probabilities are easier to remember than informal notions of confidence.

You don't usually remember informal notions of confidence either, you regenerate them on request from the global model of the world in your mind.

comment by Vladimir_M · 2010-10-12T03:57:53.291Z · LW(p) · GW(p)

Here's one challenge for your position. Take, for example, your first question. I don't think it makes any sense to talk about any probabilities there, since the question is incomplete to the point of meaninglessness. What sample of cars are we talking about, and under what exact circumstances? To which, I assume, you would answer that for everything unspecified, you should somehow make assumptions that are true with some probabilities and then use that to calculate the final probability of your answer, or estimate it just by feeling in some such way.

But how far would you take this principle? Suppose you receive this question in bad handwriting, with one word totally smudged, so that it reads like "a [...] is white," or "a car is [...]." Would you be willing to assign a probability nevertheless, based on probabilistic guesses about the missing word? If yes, what about the case where two words are smudged, so the claim is "a [...] is [...]"? What about the ultimate case where the text is completely unreadable, so you have to guess what the question is?

(Note that we can arrive at your original question by starting with a well-defined problem with a computable exact answer, and then smudging parts of it so that we're left with "a car is white.")

Replies from: jimrandomh
comment by jimrandomh · 2010-10-12T12:11:09.528Z · LW(p) · GW(p)

The way to deal with underspecified questions is to note the ambiguity, seek clarification if possible, and then if you still need an answer and can't get clarification, assume a probability distribution for each missing detail. Producing an answer is always possible, but the more ambiguities you had to do this for, the less useful the answer will be.

a [...] is white: 0.1

a car is [...]: 0.1

a [...] is [...]: 0.05

I wouldn't be willing to actually use those probabilities for much of anything, because as soon as I had a use for the answer, I'd surely also have found out what the actual question was, and be able to produce a much better answer.

comment by RobinZ · 2010-10-11T23:46:54.383Z · LW(p) · GW(p)
  1. A car is white. Sampling from the domains of cars I have seen on the road: 3%.
  2. A car is a white, ten year old Ford with a dent on the rear right door. Ditto: 10^-9.
  3. A ten-mile car trip will involve a collision. 10^-7 or thereabouts.
  4. A building is residential. Off the cuff, close to even odds.
  5. A person is below the age of 20. 5%.
  6. A word in a book contains a typo. Any given word in a published book: 10^-8. Any given book: 10%.
  7. Your arm will spontaneously transform into a blue tentacle today. Negligible, dominated by fundamental errors in understanding a la you-are-in-the-Matrix scenarios - 10^-20 is almost certainly too high, 1/10^^100 might be too low.
  8. A purse contains exactly 71 coins. 0.1%.
  9. 76297 is a prime number. 10%.

> Unfortunately, memories of degrees of confidence tend to come back badly distorted, unless they're crystallized somehow. Worse, they tend to come back consistently biased towards whatever would be judged correct now, which makes them useless or worse. Numbers crystallize those memories, making them usable and enabling you to retrace steps.

Really? Mythbusters fans might disagree. :P

comment by rwallace · 2010-10-12T19:28:25.829Z · LW(p) · GW(p)

Aside from the other problems that have been pointed out, I will also take exception to calling an order of magnitude a rough estimate. An order of magnitude would be a rough estimate where you have actual numeric data to work with. In cases where you have to just make up the numbers, an order of magnitude is high precision -- in some of these cases, extraordinarily high precision, far greater than you have any reason for claiming.

Replies from: khafra
comment by khafra · 2010-10-13T20:01:21.462Z · LW(p) · GW(p)

I'd say "an order of magnitude is a rough estimate" is a rough estimate. Remember, this is epistemic probability, so whether you

  • just think 76297 looks prime-ish and guess 9/10

  • mentally estimate the natural logarithm, quickly check whether 76297 is divisible by 2 or 3, and call it a 1/2 chance

  • can actually compute the Sieve of Eratosthenes with five nines of accuracy for it in ten seconds and call it a 1/10000 chance

you're correct, as long as you're not misreading your own degree of belief. To get into confidence about your degree of belief, I think we'd have to get into something like informal Dempster-Shafer theory, which, incidentally, I'd love to do.

comment by sixes_and_sevens · 2010-10-12T09:34:33.527Z · LW(p) · GW(p)

My answers and the logic and assumptions behind them. I assumed an implicit "in the UK" after every question, because this is where all my knowledge comes from.

1) 1/22

Assuming roadworthy cars. Anyone who's played Motorway Snooker will know how tricky it is to get started. I estimate I'd need to see 22 cars before a white one came up.

2) 1/(1×10^7)

Also assuming roadworthy cars. If we included cars sitting in scrapheaps, it becomes significantly more probable.

3) 1/1000

I would assume most collisions happen within the first or last few miles of a journey, so this estimate is effectively the same as "a car trip of at least ten miles involves a collision", which is easier to work with.

4) 9/14

I'd guess nine out of every fourteen buildings is residential. Commercial and industrial buildings tend to be larger and less numerous than residential dwellings.

5) 1/7

Based on my knowledge of population distribution by age, which I'll admit isn't that great.

6) 1/90000

Assuming an average of 250,000 words a book, and two to three typos a book.

7) 1/(1×10^56)

A silly number for a silly event.

8) 1/27000000

71 is a very strange number of coins to have in any object you might call a purse. This was a bit of a (number of purses) (probability of having a weird-ass number of coins) (spread of weird-ass coins) job.

9) 25%.

It fails divisibility tests for 2, 3, 5 and 11. Divisibility by 7 isn't something I can reliably test in under ten seconds, but it doesn't look divisible by 7. That still leaves a lot of other potential prime factors, but not nearly as many.
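The reasoning used for the prime question in this thread (start from the density of primes, then adjust for the small factors already ruled out) can be sketched as follows. The density 1/ln(n) comes from the prime number theorem; multiplying by p/(p - 1) for each excluded prime p is a standard heuristic, and the function names are illustrative. An exact trial-division check is included for comparison, though it takes more than the ten seconds the exercise allows by hand.

```python
import math


def prime_density(n: int) -> float:
    """Rough chance that a random integer near n is prime (prime number theorem)."""
    return 1.0 / math.log(n)


def adjusted_estimate(n: int, ruled_out: list[int]) -> float:
    """Heuristic update after learning n is not divisible by the given primes."""
    estimate = prime_density(n)
    for p in ruled_out:
        # Only (p - 1)/p of integers survive the test, and all larger primes do.
        estimate *= p / (p - 1)
    return min(estimate, 1.0)


def is_prime(n: int) -> bool:
    """Exact check by trial division up to the integer square root."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


n = 76297
print(prime_density(n))                     # base rate before any tests
print(adjusted_estimate(n, [2, 3, 5, 11]))  # after the quick divisibility tests
print(is_prime(n))                          # the formal answer
```

The gap between the quick estimate and the exact answer is the point of the exercise: the estimate is a statement about what you know after ten seconds, not about the number itself.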

comment by orangecat · 2010-10-12T06:08:12.427Z · LW(p) · GW(p)
  1. 0.2 (I recall reading that white is the most common color, and I do see a bunch).
  2. 0.2 (white) × ~0.001 (p(10 year old Ford)) × ~0.01 (p(dent on rear right | 10 year old Ford)) ≈ 2e-6, or 1 in 500,000.
  3. Average person averages one 10-mile trip per day and gets into an accident once every 10-20 years. ~1 in 5000.
  4. 2/3, heavily dependent on definition of building
  5. 0.2
  6. Average 1 typo per 10 books, 100k words/book, so 1 in a million.
  7. Probability that I'll perceive it, 10^-20. Probability of it actually happening, around 10^-(10^100)
  8. Seems like several standard deviations above average, maybe 1 in 1,000.
  9. Not divisible by 2 or 3, if I had written this post I'd flip a coin to decide whether to use a prime or plausible imposter, so 0.5.
Replies from: CarlShulman, sixes_and_sevens
comment by CarlShulman · 2010-10-12T14:23:02.359Z · LW(p) · GW(p)

Re #7, its past use as a discussion tool makes it more likely that people will create/simulate such situations as a joke in the future. The probability of "actually happening" thus seems far too low.

comment by sixes_and_sevens · 2010-10-12T10:03:08.119Z · LW(p) · GW(p)

> 6. Average 1 typo per 10 books, 100k words/book, so 1 in a million.

You have a very high opinion of proof readers :-)

comment by NancyLebovitz · 2010-10-12T01:58:27.114Z · LW(p) · GW(p)
  1. 10%
  2. .001 %
  3. .0001%
  4. 60%
  5. 30%
  6. .5%
  7. epsilon
  8. 1%
  9. .01%