Bead Jar Guesses

post by Alicorn · 2009-05-04T18:59:56.768Z · LW · GW · Legacy · 134 comments

Let's say Omega turns up and sets you a puzzle, since this seems to be what Omega does in his spare time.  He has with him an opaque jar, which he says contains some solid-colored beads, and he's going to draw one bead out of the jar.  He would like to know what your probability is that the bead will be red.

Well, now there is an interesting question.  We'll bypass the novice mistake of calling it .5, of course; just because the options are binary (red or non-red) doesn't make them equally likely.  It's not like you have any information.  Assuming you don't think Omega is out to deliberately screw with you, you could say that the probability is .083 based on the fact that "red" is one of twelve basic color words in English.  (If he had asked for the probability that the bead would be lilac, you'd be in a bit more trouble.)  If you were obliged to make a bet that the bead is red, you would probably take the most conservative bet available (even if you're still assuming Omega isn't deliberately screwing with you), but .083 sounds okay.

But because you start with no information, it's very hard to gather more.  Suppose Omega reaches into the jar and pulls out a red bead.  Does your probability that the second bead will be red go up (obviously the beads come in red)?  Does it go down (that might have been the only one, and however many red beads there were before, there are fewer now)?  Does it stay the same (the beads are all - as far as you know - independent of one another; removing this one bead has an effect on the actual probabilities of what the next one will be, but it can't affect your epistemic probability)?  What if he pulled out a gray bead first, instead of a red one?  How many beads would he have to pull, and in what colors, for you to start making confident predictions?

So that's one kind of probability: the bead jar guess.  It has a basis, but it's a terribly flimsy one, and guessing right (or wrong) doesn't help much to confirm or disconfirm the guess.  Even if Omega had asked about the bead being lilac, and you'd dutifully given a tiny probability, it would not have surprised you to see a lilac bead emerge from the jar.

A non-bead-jar-guess probability of the same size yields surprise when the unlikely event occurs.  Say your probability for lilac was .003.  That's tiny.  If you had a probability of .003 that it would rain on a particular day, you would be right to be astonished if you turned out to need the umbrella you left at home.

Bead jar guesses vacillate more easily.  Although in the case of the bead jar, you are in an extremely disadvantageous position when it comes to getting more information, we can fix that: somebody who says she's peeked into the jar says all the beads in the jar are red.  Just like that, you'll discard the .083 and swap it for a solid .99 (adjusted as you like for the possibility that she is lying or can't see well).  If the probability were not a wild guess, it would take considerable evidence, not just a single person's say-so, to move it that far; but her say-so is all you've got.  Then Omega pulling out a bead can give you information: the minute he pulls out the gray bead you know you can't rely on your informant, at least not completely.  You can start making decent inferences.

I think more of our beliefs are bead jar guesses than we realize, but because of assorted insidious psychological tendencies, we don't recognize that and we hold onto them tighter than baseless suppositions deserve.

134 comments

Comments sorted by top scores.

comment by jimmy · 2009-05-04T19:31:46.860Z · LW(p) · GW(p)

"We'll bypass the novice mistake of calling it .5, of course; just because the options are binary (red or non-red) doesn't make them equally likely. It's not like you have any information."

Well, if you truly had no information, 0.5 would be the correct (entropy-maximizing given constraints) bet. If you have no information, you can call it "A or !A" or "!B or B" and it sounds the same; you can't say one is more likely.

By assigning a different probability, you're saying that you have information that makes the word "red" mean something to you, and that it's less likely than half (say, because there are 11 other "colors").

Likewise, if I ask how likely A is and how likely !A is, you have to say 0.5. If A turns out to be "I'm gonna win the lottery tomorrow" then you can update and P goes to near zero. You didn't screw up, though, since it could have just as easily been "I won't win the lottery tomorrow". If you don't think that it's just as likely, then that is information.

When you hear people saying "winning the lottery is 50/50 because either you win or you don't", their error isn't that they "naively" predict 0.5 in total absence of information. Their problem is that they don't update on the information that they do have.
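
A quick sketch of the entropy claim in Python (the 0.083 here is just the post's 1/12 figure): binary entropy peaks at 0.5, so any other assignment implicitly claims some information.

    import math

    # Entropy of a binary distribution, maximized at p = 0.5.
    def binary_entropy(p):
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    print(binary_entropy(0.5))    # 1.0 bit, the maximum
    print(binary_entropy(0.083))  # ~0.41 bits; defensible only given information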

Replies from: Alicorn
comment by Alicorn · 2009-05-04T19:44:45.132Z · LW(p) · GW(p)

Well, I suppose you do have information inasmuch as you know what colors are. But if your probability for red is .5, on the basis of knowing that it's a color alone, then you have to have the same probabilities for blue and yellow and green and brown and so forth if Omega asks for those too, and you can be Dutch booked like crazy.

Replies from: Cameron_Taylor, MichaelBishop, mitechka
comment by Cameron_Taylor · 2009-05-05T02:23:44.935Z · LW(p) · GW(p)

But if your probability for red is .5, on the basis of knowing that it's a color alone, then you have to have the same probabilities for blue and yellow and green and brown and so forth if Omega asks for those too, and you can be Dutch booked like crazy.

If Omega asks "What is p(red)?" then I may well consider that I have no information and reply 0.5.

If Omega then asks me "What are p(blue) and p(yellow)?" then I have new information. I would update p(red), p(blue) and p(yellow) to new numbers. By induction I would probably assign each a somewhat lower probability than 0.33 by this stage since p(jar contains all basic colors) has increased.

The most important thing is that I would never have exclusive probabilities that simultaneously sum to greater than 1. I may, however, declare (or even bet on) a probability that I update downwards when given new information.

I may end up in a situation where all the bets I have laid sequentially have an expected loss when considered together. This is unfortunate but does not indicate an error of judgement. It simply suggests that at the time of my bet on red I did not expect the new bets or the information contained therein. In later bets I reject consistency bias.

Replies from: Alicorn, saturn
comment by Alicorn · 2009-05-05T02:30:08.093Z · LW(p) · GW(p)

If you can be Dutch booked about probabilities asked in close sequence, where you gain no new information except that a question was asked, I'd think that reflects a considerable failure of rationality. There are grounds to reject "temporal Dutch book arguments", but this isn't one; the time and the "new information" should both be negligible.

To put it differently, if you have no information about what beads are in the jar, then you have even less information about why Omega wants to know your probabilities for the sorts of beads in the jar. Omega is a weird dude. Omega asking you a question does not mean what it means when a human asks you the same question.

Replies from: Cameron_Taylor
comment by Cameron_Taylor · 2009-05-05T03:30:04.186Z · LW(p) · GW(p)

I reject the proposition "Omega asking a question supplies negligible new information". What information you glean from Omega depends entirely on what your prior beliefs about likely Omega behavior are. It is not at all absurd for a rational entity to have beliefs representing the reasoning "if Omega asks about multiple basic colors then there is a higher probability that his bead jar contains beads selected from the basic colors than if he only asks about red beads".

For my part I would definitely not assign p(red) = 0.5 on the first question and then update to p(red) = 1/n(basic colors). I would, however, lower p(red) by more than 0.

Omega asking you a question does not mean what it means when a human asks you the same question.

That is true, it doesn't. However, this is an argument for a zero information "0.5" probability, not against it. We (that is, yourself in your initial post and me in my replies here) are using inductive reasoning to assign probabilities based on our knowledge of human color labels. You have extracted information from "What is p(red)?" and used it to update from 0.5 in the direction of 1/12. The same process must also be allowed to apply when further questions are asked.

If "What is p(red)?" provokes me to even consider the number 0.083 in my reasoning then "What is p(red)?... What are p(blue) and p(yellow)?" will provoke me to consider 0.083 with greater weight. The question "What is p(darkturquoise)?" must also provoke me to consider a significantly lower figure.

Either questions give information or they don't.

comment by saturn · 2009-05-05T06:36:24.258Z · LW(p) · GW(p)

I may, however, declare (or even bet on) a probability that I update downwards when given new information.

And it's always .5, I hope.

Replies from: Unknowns, Vladimir_Nesov
comment by Unknowns · 2010-06-02T12:16:21.180Z · LW(p) · GW(p)

Your probability of updating downwards should be (more or less; not exactly) equal to one minus your original probability, i.e. if your original probability is .25, your probability of updating downwards should be around .75. This is obvious, since if there is a one in four chance that the thing is so, there is a three out of four chance that you will find out that it is not so, when you find out whether it is so or not.

Conservation of expected evidence doesn't mean that the chance of updating upwards is equal to the chance of updating downwards. It also takes into account the quantity of the change; i.e., if my probability is .25 and I update upwards, I will have to update three times as much as if I had updated downwards.
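
A minimal sketch of this arithmetic in Python, using the comment's own .25 example:

    # Prior of .25; the truth will be revealed either way.
    p = 0.25
    up_move = 1.0 - p   # updating upwards, to certainty
    down_move = p       # updating downwards, to impossibility

    print(p * up_move - (1 - p) * down_move)  # 0.0: expected change is zero
    print(up_move / down_move)                # 3.0: the upward move is 3x as large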

Replies from: saturn, sparkles
comment by saturn · 2010-06-04T21:46:01.349Z · LW(p) · GW(p)

You're right. Thanks.

comment by sparkles · 2013-04-21T10:06:39.929Z · LW(p) · GW(p)

What if you know jar A is 80% red and jar B is 0% red, and you know you're looking at one of them, and your confidence that it's A is 0.625? Then you have probability 0.5 that a bead chosen from the jar in front of you is red, but will update upwards with probability 0.625 if you're given the information of which jar you're looking at.
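
The arithmetic behind this counterexample, using the comment's own numbers (a sketch):

    p_A = 0.625                  # credence that the jar in front of you is A
    p_red_A, p_red_B = 0.8, 0.0
    p_red = p_A * p_red_A + (1 - p_A) * p_red_B
    print(p_red)                 # 0.5, yet you update upwards with probability 0.625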

Replies from: Unknowns
comment by Unknowns · 2013-05-30T18:47:52.124Z · LW(p) · GW(p)

My comment assigns a probability to updating upwards or downwards in a generic way when new information is given; your comment calculates based on "if you're given the information of which jar you're looking at", which is more concrete. You could also be given other information which would make it more likely you're looking at B.

comment by Vladimir_Nesov · 2009-05-05T10:10:05.196Z · LW(p) · GW(p)

No, it's not. (You either win the lottery, or you don't.)

comment by Mike Bishop (MichaelBishop) · 2009-05-04T23:16:59.690Z · LW(p) · GW(p)

Excuse me for making such a minor point, but I don't think we have to give the same probability for each color. We have to guess at Omega's motivation before we can guess at the distribution of bead colors in the jar. Do we have previous knowledge of Omegas? How about Omegas bearing bead-filled jars?

Replies from: Alicorn
comment by Alicorn · 2009-05-05T00:38:28.442Z · LW(p) · GW(p)

I was assuming that you have never met an Omega, much less one bearing a bead jar, and that you know all the standard facts about Omega (e.g. what he says is true, etc.)

comment by mitechka · 2009-05-04T20:05:49.875Z · LW(p) · GW(p)

I think I would agree partially with both of you. If I assume that there is no information at all, .5 is a good choice. Once a bead of any color is pulled out, I can start making guesses on the potential number of beads in the jar from the relative volumes of the jar and the bead, so if I know that there is a finite number of potential colors, I might take a guess as to what the probability of any particular color distribution is. Once a red bead is pulled, I might adjust the probability that Omega is not screwing with me, etc.

comment by Mike Bishop (MichaelBishop) · 2009-05-05T17:57:07.323Z · LW(p) · GW(p)

I think more of our beliefs are bead jar guesses than we realize, but because of assorted insidious psychological tendencies, we don't recognize that and we hold onto them tighter than baseless suppositions deserve. [said Alicorn in the original]

If someone wants to do the work of linking the fairly abstract discussion here to how we think about making decisions in the real world, I think we would all benefit greatly.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-05-04T23:04:19.631Z · LW(p) · GW(p)

We're to assume here that Omega, being a strange sort of deity, has selected the question "Is the bead red?" via some process that has no expected correlation whatsoever to the actual color. Correct?

Replies from: Alicorn
comment by Alicorn · 2009-05-04T23:06:00.064Z · LW(p) · GW(p)

Except inasmuch as red is in fact a color (not a euphemism for communist beads or something), yes, that's right.

Replies from: Emile, Cameron_Taylor
comment by Emile · 2009-05-05T10:16:13.973Z · LW(p) · GW(p)

That's important information, and it wasn't included in the specification of the problem - hence some people in the comments arguing that 0.5 may not be that stupid.

Replies from: timtyler, Alicorn
comment by timtyler · 2009-05-06T01:15:55.751Z · LW(p) · GW(p)

Would you increase or decrease the probability on the basis of this information?

Replies from: Emile
comment by Emile · 2009-05-06T06:07:50.745Z · LW(p) · GW(p)

Decrease - wouldn't you?

Replies from: timtyler
comment by timtyler · 2009-05-08T20:09:03.323Z · LW(p) · GW(p)

I'm not sure what I would do - probably ask for more information about this Omega fellow.

comment by Alicorn · 2009-05-05T17:22:54.974Z · LW(p) · GW(p)

Okay, I'll bite - did anybody really think Omega might have been asking for the probability that the first bead would be a Communist?

Replies from: MrHen, Emile
comment by MrHen · 2009-05-05T17:27:10.413Z · LW(p) · GW(p)

No, but if the first bead was communist I sure am going to start thinking the rest of the beads were communist. Those silly communes always stick together.

With more seriousness, I am guessing Emile was talking about the revelation that the contents of the jar were restricted by "color" as in one of twelve colors.

comment by Emile · 2009-05-05T20:14:24.054Z · LW(p) · GW(p)

I wasn't talking about communism, I don't know where you picked that up. I was just saying that the assumption "Omega's choice of question is uncorrelated to the actual bead color" is missing, and should be explicitly stated.

Otherwise, it's reasonable to assign nonzero probabilities to the propositions "There are beads of only N colors, and red is one of them", for various values of N (2? 12? Whatever).

Cameron Taylor makes the same point in another comment.

comment by Cameron_Taylor · 2009-05-05T02:28:34.848Z · LW(p) · GW(p)

That's a fairly significant exception.

comment by badger · 2009-05-04T20:30:56.241Z · LW(p) · GW(p)

I've just started reading Jaynes on prior formation, and I'd love to see more posts here on the topic. Maybe I'll write one if I ever have the chance to get some reading done.

As far as this problem goes, I agree we have some information about other colors. I want to know what Omega counts as "red" though, because that will go a long way in determining what sort of prior we'd assign.

Based on my limited understanding of physics, if we assume the bead only reflects a single wavelength, then it would be red if the wavelength were between 620 and 750nm. The visible spectrum goes from 380 to 750nm, so a uniform distribution over wavelength gives a probability of about 0.35.
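
Spelling out that calculation (a sketch; the 620-750nm red band is the comment's assumption):

    red_band = 750 - 620                 # nm
    visible = 750 - 380                  # nm
    print(round(red_band / visible, 2))  # 0.35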

Most objects reflect multiple wavelengths though. In this case, the color of an object could be characterized as a distribution over the visible spectrum of the reflected light. To count as red, the distribution would need a mean between 620 and 750nm. We might need an additional constraint on the shape of the distribution so it isn't too spread out. I don't know how you'd begin calculating the measure of distributions that meet these constraints though.

I think drawing a red bead will marginally increase our probability of red.

In the end, I think our estimate is much more likely to depend on our expectation of Omega's motives than knowledge about colors, though.

comment by loqi · 2009-05-05T23:36:10.366Z · LW(p) · GW(p)

It's not like you have any information. Assuming you don't think Omega is out to deliberately screw with you

These are contradictory assumptions. I have no information. I have no idea what Omega is out to do. The whole point of invoking Omega is to obliterate meaningful priors. I'm with byrnema here: probabilities are a tool for making maximally effective use of information. Without any such information, the only correct answer to "what is your probability" is "I don't have one".

comment by Vladimir_Nesov · 2009-05-04T21:12:12.062Z · LW(p) · GW(p)

Interestingly, in figuring out the answers to questions of this kind, basically about priors, we are dealing with issues similar to those in the elicitation of human values. In both cases, the answer is hidden in our minds, never in explicit and consistent form, with no hope of constructing a precise model that will give the answer. The only way to approximate the solution is to consider arguments for and against, consider related situations, think, and listen to your inner voice, to the intuitive response that says that it's proper to save a child, and you agree, that it's proper to assign a probability of no less than .03 to seeing red, and you agree, and that voice can lie to you, beware.

P.S. Keywords for Google scholar: 'utility elicitation', 'prior elicitation', 'proper scoring rules'.

comment by Vladimir_Nesov · 2009-05-04T20:26:45.419Z · LW(p) · GW(p)

Alicorn, I think it'd be appropriate to add the following link at the beginning of the article:

Related to: Priors as Mathematical Objects.

It also kinda answers your questions.

Even if Omega had asked about the bead being lilac, and you'd dutifully given a tiny probability, it would not have surprised you to see a lilac bead emerge from the jar.

I see this conclusion as a mistake: being surprised is a way of translating between intuition and explicit probability estimates. If you are not surprised, you should assign high enough probability, and otherwise if you assign tiny probability, you should be surprised (modulo known mistakes in either representation).

Predicting the second bead given the color of the first one can also be expressed as probability estimates for joint observations, made before you observe the color of the first bead. What is the probability that you'll see two reds? That you'll see a red followed by a non-red? A non-red followed by a red? Two non-reds? Then crunch the numbers through the definition of conditional probability/Bayes' theorem.
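
For instance, a sketch with made-up joint probabilities (the four numbers are placeholders, not anything from the thread):

    # Joint distribution over the first two draws; the four cases sum to 1.
    p_rr, p_rn, p_nr, p_nn = 0.10, 0.15, 0.15, 0.60

    p_red1 = p_rr + p_rn                # marginal: first bead is red
    p_red2_given_red1 = p_rr / p_red1   # definition of conditional probability
    print(p_red1, p_red2_given_red1)    # 0.25 0.4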

Replies from: Simetrical, Cyan
comment by Simetrical · 2009-05-04T23:20:43.179Z · LW(p) · GW(p)

I see this conclusion as a mistake: being surprised is a way of translating between intuition and explicit probability estimates. If you are not surprised, you should assign high enough probability, and otherwise if you assign tiny probability, you should be surprised (modulo known mistakes in either representation).

That's not true at all. Before I'm dealt a bridge hand, my probability assignment for getting the hand J♠, 8♣, 6♠, Q♡, 5♣, Q♢, Q♣, 5♡, 3♡, J♣, J♡, 2♡, 7♢ in that order would be one in 3,954,242,643,911,239,680,000. But I wouldn't be the least bit surprised to get it.
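
That figure checks out; it is the number of ordered 13-card deals from a 52-card deck, 52!/39! (a quick sketch):

    import math
    print(math.factorial(52) // math.factorial(39))  # 3954242643911239680000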

In the terminology of statistical mechanics, I guess surprise isn't caused by low-probability microstates ― it's caused by low-probability macrostates. (I'd have been very surprised if that were a full suit in order, despite the fact that a priori that has the same probability.) What you define as a macrostate is to some extent arbitrary. In the case of bridge, you'd probably divide up hands into classes based on their utility in bridge, and be surprised only if you get an unlikely type of hand.

In this case, I'd probably divide the outcomes up into macrostates like "red", "some other bright color like green or blue", "some other common color like brown", "a weird color like grayish-pink", and "something other than a solid-colored ball, or something I failed to even think of". Each macrostate would have a pretty high probability (including the last: who knows what Omega's up to?), so I wouldn't be surprised at any outcome.

This is an off-the-cuff analysis, and maybe I'm missing something, but the idea that any low-probability event should be surprising certainly can't be correct.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2009-05-04T23:26:22.210Z · LW(p) · GW(p)

Thank you, my mistake. I don't understand 'surprise'.

Let's see... It looks like 'surprise' is something about promoting a new theory about the structure of environment that was previously dormant, forcing you to drop many cached assumptions. For example, if (surprise, surprise...) you win a lottery, you may promote a previously dormant theory that you are on a holodeck. If you are surprised by observing 1000 equal quantum coinflips (replicated under some conditions, with apparatus not to blame), you may need to reconsider the theory of physics. If you experience surprising luck in a game of dice, you start considering the possibility that dice are weighted.

comment by Cyan · 2009-05-04T20:43:30.500Z · LW(p) · GW(p)

I see this conclusion as a mistake...

... but it isn't, because the degree of surprise doesn't just depend on the raw probability, but also on the number of other possible outcomes under consideration. That Omega uses the term "lilac" may reasonably be taken as evidence that the space of color outcomes should be treated as finely divided.

ETA: I guess the mistake is in comparing feelings of surprise across outcomes with the same probability embedded in event spaces with different cardinalities.

Replies from: JGWeissman, billswift, Vladimir_Nesov
comment by JGWeissman · 2009-05-04T21:09:05.595Z · LW(p) · GW(p)

If Omega asked me the probability of the next bead being lilac, I would be surprised if the next bead actually was lilac, in a way I would not be surprised to find the bead is turquoise, an event to which I assign equal probability but was not specifically considering prior to the draw, as any higher-probability set of events which excludes drawing a turquoise bead would seem artificial. If the first two beads are the colors Omega asks me about, my leading theory would be that Omega will draw out a bead of whichever color he just brought up. (The first draw would cause me to consider this with roughly equal probability as maximum entropy.)

comment by billswift · 2009-05-04T22:05:58.782Z · LW(p) · GW(p)

"doesn't just depend on the raw probability" - Correct. It also depends strongly on how reliable you think your estimate of the probability is. That is, your confidence interval.

comment by Vladimir_Nesov · 2009-05-04T21:32:00.126Z · LW(p) · GW(p)

the degree of surprise doesn't just depend on the raw probability, but also on the number of other possible outcomes under consideration.

Well, maybe it isn't, but it should.

comment by conchis · 2009-05-04T19:34:58.471Z · LW(p) · GW(p)

But because you start with no information, it's very hard to gather more. Suppose Omega reaches into the jar and pulls out a red bead. Does your probability that the second bead will be red go up... down... [or] stay the same...?

My intuition here is to start with an uninformative prior over possible bead-generating mechanisms. (You still have the problem of how to divide up the state space, but that's nothing new.) If a red bead comes out first, I update the probabilities that I assign to each mechanism and proceed from there.

Where exactly that leads seems likely to depend on a bunch of assumptions that I'm too lazy to think through right now (e.g. whether the urn has a finite number of beads in it) but it seems a reasonable way of proceeding in principle.
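
One much-simplified concrete version (a sketch; reducing the "mechanism" to a single unknown red-fraction with i.i.d. draws is this sketch's assumption, not the comment's):

    # Uniform prior over eleven candidate red-fractions.
    fractions = [i / 10 for i in range(11)]
    prior = [1 / len(fractions)] * len(fractions)

    # Bayes update on drawing one red bead: likelihood of red given f is f.
    posterior = [p * f for p, f in zip(prior, fractions)]
    total = sum(posterior)
    posterior = [p / total for p in posterior]

    # Predictive probability that the next bead is red.
    print(sum(p * f for p, f in zip(posterior, fractions)))  # 0.7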

comment by byrnema · 2009-05-05T11:21:34.671Z · LW(p) · GW(p)

For me, the Omega problem described in the post presents the following conundrum: what is a probability in the limit of no information?

Suppose we employ a pragmatic perspective: the "probability of an event", as a mathematical object, is a tool that is used to summarize information about the past and/or future occurrence of that event. In the limit of no information, using the pragmatic view, there is no justification for assigning a probability not because we don't know what it is, but because it has no use in summarizing information.

If you don't care about being pragmatic, if you want to define a probability because everything must have a probability between 0 and 1, then I don't see how you would be justified in eliminating information (constraining the space of possible probabilities) or making up information (arbitrarily picking a probability) by specifying a value for the probability.

So finally, if Omega asked me what is the probability of him choosing a red ball, I would be forced to say something like, "To the extent that I can guess what you might be meaning by probability -- respectfully, Omega, it's not entirely well-defined -- the probability is p where 0<p<1."

How could I truthfully say more?

I find this problem interesting because it's a gray area in the correct approach to working with definitions. One of the perspectives I've encountered on Less Wrong is that in "real world" situations you need to let definitions be defined as you go. Finding the solution to a problem involves finding the right/good/proper definition, so that the definition you end up with tells you which world you're in. (See Disputing Definitions.)

This problem represents a context where you cannot find a good workable definition for probability (other than as the original abstract mathematical object), because you're not allowed to constrain which world you're in, because that would require making up information.

comment by talisman · 2009-05-05T03:33:32.614Z · LW(p) · GW(p)

This post confused me enormously. I thought I must be missing something, but reading over the comments, this seems to be true for virtually all readers.

What exactly do you mean by "bead jar guess"? "Surprise"? "Actual probability"? Are you making a new point or explaining something existing? Are you purposely being obscure "to make us think"?

I propose replacing this entire post with the following text:

Hey everybody! Read E.T. Jaynes's Probability Theory: The Logic Of Science!

Replies from: Alicorn, talisman
comment by Alicorn · 2009-05-05T04:02:58.605Z · LW(p) · GW(p)

By "bead jar guess" I mean a wild, nearly-groundless assignment of a probability to a proposition. This is as opposed to a solidly backed up estimate based on something like well-controlled sample data, or a guess made with an appeal to an inelegant but often-effective hack like the availability heuristic.

Replies from: talisman
comment by talisman · 2009-05-05T04:11:46.824Z · LW(p) · GW(p)

Groundless or not, if you propose to run two experiments X and Y, and select outcomes x of experiment X and y of experiment Y before running the experiments, and assign x and y the same probabilities, you have to be equally surprised by x occurring as you are by y occurring, or I'm missing something deep about what you're saying about probabilities. Are you using the word "probability" in a different sense than Jaynes?

Replies from: Alicorn
comment by Alicorn · 2009-05-05T04:15:17.340Z · LW(p) · GW(p)

I haven't read Jaynes's work on the subject, so I couldn't say. However, if he thinks that equal probabilities mean equal obligation to be surprised, I disagree with him. It's easy to do things that are spectacularly unlikely - flip through a shuffled deck of cards to see a given sequence, for instance - that do not, and should not, surprise you at all.

Replies from: steven0461, andrewc, Cameron_Taylor, talisman
comment by steven0461 · 2009-05-07T16:57:30.806Z · LW(p) · GW(p)

"Surprise", as I understand it, is something rational agents experience when an observation disconfirms the hypothesis they currently believe in relative to the hypothesis that "something is going on", or the set of unknown unknowns.

If you generate ten numbers 1-10 from a process you think is random, and it comes up 5285590861, that is no reason to be surprised, because the sequence is algorithmically complex, and the hypothesis that "something is going on" assigns it a conditional probability no higher than the hypothesis that the process is random. But if it comes up 1212121212, that is reason to be surprised, because the sequence is algorithmically simple, so the hypothesis that "something is going on" assigns it higher conditional probability than the hypothesis that the process is random. The surprised agent is then justified in sitting up and expending resources trying to gather more info.
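
A crude illustration of the asymmetry (a sketch; using a general-purpose compressor as a stand-in for algorithmic simplicity is this sketch's choice, since Kolmogorov complexity itself is uncomputable):

    import zlib

    # The more regular sequence compresses to fewer bytes.
    for s in ("5285590861", "1212121212"):
        print(s, len(zlib.compress((s * 10).encode())))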

comment by andrewc · 2009-05-05T07:04:08.552Z · LW(p) · GW(p)

I haven't read Jaynes's work on the subject, so I couldn't say.

  1. Point your browser at amazon
  2. Order ETJ's book.
  3. Wait approx one week for delivery
  4. Read it.

I don't mean to sound gushing but Jaynes's writing on probability theory is the clearest, most grounded, and most entertaining material you will ever read on the subject. Even better than that weird AI dude. Seriously, it's like trying to discuss the apocalypse without reading Revelation...

comment by Cameron_Taylor · 2009-05-05T04:31:31.531Z · LW(p) · GW(p)

I haven't read Jaynes's work on the subject, so I couldn't say. However, if he thinks that equal probabilities mean equal obligation to be surprised, I disagree with him.

I tend to agree. If I discovered that Jaynes had said such a thing I would be very surprised indeed. I'll be surprised when the probability of seeing something with that probability or less occurring is low.

comment by talisman · 2009-05-05T04:17:08.362Z · LW(p) · GW(p)

That's because you didn't specify the sequence ahead of time, right?

Replies from: Alicorn
comment by Alicorn · 2009-05-05T04:27:26.857Z · LW(p) · GW(p)

Writing down a sequence ahead of time makes it more interesting when it turns up, not more unlikely. Given the possibility of cheating, it might make it more likely.

comment by talisman · 2009-05-12T16:34:38.480Z · LW(p) · GW(p)

Belated apologies for cranky tone on this comment.

comment by JulianMorrison · 2009-05-05T02:41:09.187Z · LW(p) · GW(p)

Thinking about how an Occamian learner like AIXI would approach the problem, it would probably start from the simplest domain theory "beads have a color, red is a color I've heard mentioned, therefore all beads are red", p=1. If the first bead was grey, it would switch to "all beads are grey", p=0. The second bead is red, "half and half", p = 0.5, and so on, ratcheting up theories from the simplest first.

Replies from: Peter_de_Blanc, Vladimir_Nesov, MBlume
comment by Peter_de_Blanc · 2009-05-05T13:59:00.908Z · LW(p) · GW(p)

FYI, AIXI does not work like this; it uses a probability distribution over all Turing machines.

comment by Vladimir_Nesov · 2009-05-05T09:32:31.825Z · LW(p) · GW(p)

This is not how AIXI [*] works. It considers all possible programs at the start, with some probability. The simplest program that fits the data is not the only one it considers; it just gets most of the probability mass. So, from the start, it will give some tiny probability to the hypothesis that the beads will spell War and Peace in Morse code. Only when this hypothesis is falsified by the data will it drop out of the race.

[*] M. Hutter (2003). `A Gentle Introduction to The Universal Algorithmic Agent AIXI'. Tech. rep. [abstract/download]

comment by MBlume · 2009-05-05T02:44:22.398Z · LW(p) · GW(p)

red is a color I've heard mentioned, therefore all beads are red

and thus his bayes-score drops to -Infinity

Replies from: JulianMorrison
comment by JulianMorrison · 2009-05-05T06:25:22.516Z · LW(p) · GW(p)

I don't think AIXI tries to maximize its Bayes score in one round - it tries to minimize the number of rounds until it converges on a good-enough model.

comment by Simetrical · 2009-05-04T23:32:17.958Z · LW(p) · GW(p)

I think this post could have been more formally worded. It draws a distinction between two types of probability assignment, but the only practical difference given is that you'd be surprised if you're wrong in one case but not the other. My initial thought was just that surprise is an irrational thing that should be disregarded ― there's no term for "how surprised I was" in Bayes' Theorem.

But let's rephrase the problem a bit. You've made your probability assignments based on Omega's question: say 1/12 for each color. Now consider another situation where you'd give an identical probability assignment. Say I'm going to roll a demonstrated-fair twelve-sided die, and ask you the probability that it lands on one. Again, you assign 1/12 probability to each possibility.

(Actually, these assignments are spectacularly wrong, since they give a zero probability to all other colors/numbers. Nothing deserves a zero probability. But let's assume you gave a negligible but nonzero probability to everything else, and 1/12 is just shorthand for "slightly less than 1/12, but not enough to bother specifying".)

So as far as everything goes, your probability assignments for the two cases look identical up to this point. Now let's say I offer you a bet: we'll go through both events (drawing a bead and putting it back, or rolling the die) a million times. If your estimate of the probability of red/one was within 1% of correct in that sample, I give you $1000. Otherwise, you give me $1000.

In the case of the die, we would all take the bet in a heartbeat. We're very sure that our figures are correct, since the die is demonstrated to be fair, and 1% is a lot of wiggle room for the law of large numbers. But you'd have to be crazy to take the same bet on the jar, despite having assigned a precisely identical chance of winning.

So what's the difference? Isn't all the information you care about supposed to be encapsulated in your probability distribution? What is the mathematical distinction between these two cases that causes such a clear difference in whether a given bet is rational? Are we supposed to not only assign probabilities to which events will occur, but also to our probabilities themselves, ad infinitum?

Replies from: saturn, GuySrinivasan, Cameron_Taylor
comment by saturn · 2009-05-05T04:58:04.706Z · LW(p) · GW(p)

there's no term for "how surprised I was" in Bayes' Theorem.

Not quite. The intuitive notion of "how surprised you were" maps closely to Bayesian likelihood ratios.

Regarding your die/beads scenarios:

In your die scenario, you have one highly favored model that assigns equal probability to each possible number. In the beads scenario you have many possible models, all with low probability; averaging their predictions gives equal probability to each possible color.

To simplify things, let's say our only models are M, which predicts the outcomes are random and equally likely (i.e. a fair die or jar filled with an even ratio of 12 colors of beads), and not-M (i.e. a weighted die or jar filled with all the same color beads). In the beads scenario we might guess that P(M)=.1; in the die scenario P(M)=.99. In both cases, our probability of red/one is 1/12, because neither of our models tell us which color/number to expect. But our probability of winning the bet is different -- we only win if M is correct.
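
In numbers (a sketch; the .99 and .1 weights are this comment's own, and by symmetry not-M also averages out to 1/12 per outcome):

    # Predictive probability of red/one is 1/12 either way, but the chance of
    # winning Simetrical's long-run frequency bet is roughly P(M).
    for label, p_M in (("die", 0.99), ("jar", 0.10)):
        p_red = p_M * (1 / 12) + (1 - p_M) * (1 / 12)
        print(label, round(p_red, 4), "P(win bet) ~", p_M)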

Replies from: Simetrical
comment by Simetrical · 2009-05-05T23:38:43.131Z · LW(p) · GW(p)

That clears things up a lot. I hadn't really thought about the multiple-models take on it (despite having read the "prior probabilities as mathematical objects" post). Thanks.

comment by GuySrinivasan · 2009-05-05T01:19:39.155Z · LW(p) · GW(p)

Isn't all the information you care about supposed to be encapsulated in your probability distribution?

No. As another (yours is one) simple counterexample, if I flip a fair coin 100 times you expect around 50 heads, but if I either choose a double-head or double-tail coin and flip that 100 times, you expect either 100 heads or 100 tails - and yet the probability of the first flip is still 50/50.

A distribution over models solves this problem. IIRC you don't have to regress further, but I don't remember where (or even if) I saw that result.
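
The coin counterexample, simulated (a minimal sketch):

    import random

    def fair_coin_heads(n):
        return sum(random.random() < 0.5 for _ in range(n))

    def mystery_coin_heads(n):
        p = random.choice([0.0, 1.0])  # double-tail or double-head, unknown which
        return sum(random.random() < p for _ in range(n))

    # The first flip is 50/50 in both cases, but the 100-flip totals differ:
    print(fair_coin_heads(100))     # ~50 heads
    print(mystery_coin_heads(100))  # exactly 0 or exactly 100 heads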

Replies from: orthonormal
comment by orthonormal · 2009-05-05T20:58:42.465Z · LW(p) · GW(p)

but if I either choose a double-head or double-tail coin and flip that 100 times,

To clarify: if you know Guy chose either a double-head or double-tail coin, but you have no idea which, then you should assign 50% to heads on the first flip, then either 0% or 100% to heads after, since you'll then know which one it was.

It's been linked too often already in this thread, but the example in Priors as Mathematical Objects neatly demonstrates how a prior is more than just a probability distribution, and how Simetrical's question doesn't lead to paradox.

comment by Cameron_Taylor · 2009-05-05T01:30:23.604Z · LW(p) · GW(p)

(Actually, these assignments are spectacularly wrong, since they give a zero probability to all other colors/numbers. Nothing deserves a zero probability. But let's assume you gave a negligible but nonzero probability to everything else, and 1/12 is just shorthand for "slightly less than 1/12, but not enough to bother specifying".)

The justification given in the original post was spectacularly wrong. The assignments themselves may not be. One could just as easily be using the shorthand for "slightly more than 1/12 because I now know that red is a color Omega considers 'color-worthy', he can see that I've got red receptive cones in my eyes and this influences my probability a little more than the possibility that he has obscure color beads. And screw it. Lilac is freaking purple anyway. And he asked for my probability, not that of some pedantic ponce!"

comment by nazgulnarsil · 2009-05-05T12:45:08.809Z · LW(p) · GW(p)

I'd say that, like the second example, most of our beliefs are bead jar guesses informed by untrustworthy informants - namely, our parents and other adults around when we were young.

Replies from: MrHen
comment by MrHen · 2009-05-05T14:08:05.300Z · LW(p) · GW(p)

FYI, italics can be entered as such: *this will be italic*. There are more formatting tips available by clicking the "help" link under the comments box.

comment by Scott Alexander (Yvain) · 2009-05-04T21:09:49.062Z · LW(p) · GW(p)

Do you see this as being sort of like Jimmy's metauncertainty?

Also, if Omega pulled out a bead and then asked you about the next one, the Rule of Succession would be a good place to start making guesses.
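
For reference, the Rule of Succession estimates the next draw at (s+1)/(n+2) after seeing s reds in n draws (a sketch):

    def rule_of_succession(s, n):
        return (s + 1) / (n + 2)

    print(rule_of_succession(1, 1))  # 2/3 after Omega pulls one red bead
    print(rule_of_succession(0, 1))  # 1/3 after one non-red bead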

Replies from: Alicorn
comment by Alicorn · 2009-05-04T21:14:49.031Z · LW(p) · GW(p)

It's a little like metauncertainty, except that my post has much less frightening math.

comment by Emile · 2009-05-04T20:51:37.820Z · LW(p) · GW(p)

Omega asking you that question, "What's the probability that the bead will be red" is itself information about the beads - Omega is more likely to ask that question in cases where the color is relevant.

To do things properly, you could suppose that Omega is taking inspiration from existing bead-color-guessing logic puzzles (there are plenty of examples of those in human history), and you could assign probabilities to each type of guessing game, noting how many types of beads there are etc. You can also collect statistics about which combinations of colors are often used for puzzles (red-and-blue and black-and-white seem common), and from all that, have a probability to assign to each color coming out, knowing that "will the bead be red" is a reasonable question.

That kind of thinking seems closer to what you can come up with with limited computational resources, and it might give you a probability of 0.5 for red.

Replies from: Alicorn
comment by Alicorn · 2009-05-04T20:59:02.265Z · LW(p) · GW(p)

Omega asking you that question, "What's the probability that the bead will be red" is itself information about the beads - Omega is more likely to ask that question in cases where the color is relevant.

I'm not sure what you mean by this. Obviously the color is relevant, in that it's what he wants you to guess at - but the fact that he asked about red and not about brown is not suggestive in any way. This is Omega we're talking about, not someone with normal psychology.

comment by Drahflow · 2009-05-04T20:39:05.329Z · LW(p) · GW(p)

Apparently, the term you are searching for is "Second Order Probability".

See here for a paper: www.dodccrp.org/events/2000_CCRTS/html/pdf_papers/Track_4/124.pdf

Replies from: Vladimir_Nesov, Alicorn
comment by Vladimir_Nesov · 2009-05-04T21:21:02.016Z · LW(p) · GW(p)

See here for a paper: www.dodccrp.org/events/2000_CCRTS/html/pdf_papers/Track_4/124.pdf

D. Bamber and I.R. Goodman, (2000). New uses of second order probability techniques in estimating critical probabilities in command & control decision-making. [abstract] [pdf].

Please cite at least the name of the paper, preferably add a link to its abstract. It's not apparent that second order probability is directly related to this article.

comment by Alicorn · 2009-05-04T20:47:56.067Z · LW(p) · GW(p)

The link gives me a 404.

comment by MrHen · 2009-05-04T20:21:56.846Z · LW(p) · GW(p)

I'm not sure I understand the point of this post. Are you saying that guesses without any information are inherently unfounded?

How would guessing 50% on the first pull be any worse since, by definition of the problem, you have no information? As soon as you have seen one bead, however, you have perfect historical information which is better than none.

Assuming that Omega is picking at random, it makes sense to me to simply pick a random percentage on the first pull and then swing to 0% or 100% once you see a bead. Update again on the second bead to 0%, 50%, or 100%. Continuing the process should eventually converge on the correct answer.

(Note) If "0" and "100" make you uncomfortable, switch to the appropriate very close value of your choice.

Replies from: Alicorn, Cameron_Taylor
comment by Alicorn · 2009-05-04T20:55:47.662Z · LW(p) · GW(p)

You should not guess that the first bead has a 50% chance of being red, because if you do, you can have this conversation:

Omega: What is the probability of the first bead being red as opposed to non-red?

You: Fifty-fifty.

Omega: So you would consider it more than fair if I offered you three dollars if the bead is red, and you paid me a dollar if it was non-red?

You: Sure, I'll take that bet.

Omega: What is the probability of the first bead being blue as opposed to non-blue?

You: Fifty-fifty.

Omega: So you would consider it more than fair if I offered you three dollars if the bead is blue, and you paid me a dollar if it was non-blue?

You: Sure, I'll take that bet.

(...and so on for ten more colors.)

Omega pulls out a red bead. He owes you three dollars, but you owe him eleven dollars. He wins.
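
The tally from the dialogue above (a sketch of the arithmetic):

    # Twelve bets, each paying $3 if its color is drawn and costing $1 otherwise.
    # Exactly one color can win.
    n_colors = 12
    print(3 - (n_colors - 1))  # -8: a guaranteed $8 loss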

Replies from: Zvi, Vladimir_Nesov, JGWeissman, cousin_it, Cyan
comment by Zvi · 2009-05-04T22:40:24.944Z · LW(p) · GW(p)

You could have that conversation, but you don't have to. The argument for assigning 50% to red is that it's the only question Omega has asked you. There are several ways out of that. The first one is that the moment he offers a bet at 3:1 odds, I would update to presume that it is not a positive-e.v. bet, moving to a new number of perhaps 12.5%, ranging from 0% to 25% with a symmetric distribution. Similarly, if he offered me three to one that it wasn't red, I would presume that it probably will be. On a similar note, when he asks about blue (even without any bets involved) I can't see answering higher than 33.3%.

Contrast this with Alicorn watching this incident and offering me 3:1 after Omega asks my probability for red and I say 50%. I still have to update for Alicorn's opinion, but I might or might not accept that bet.

comment by Vladimir_Nesov · 2009-05-04T21:35:35.342Z · LW(p) · GW(p)

The estimate should take into account the expectation of being asked further questions. The ignorance prior is applied to a model of observation. The model of observation expresses which questions you may be asked, and what structure the dependencies between these possible observations will have.

Replies from: MrHen
comment by MrHen · 2009-05-04T22:41:43.192Z · LW(p) · GW(p)

The estimate should take into account the expectation of being asked further questions.

I do not know how related this is to your comment, but it made me think of another response to the Dutch book objection. (Am I using that term correctly?)

If Omega asks me about a red bead I can say 100%. If he then asks about a blue bead I can adjust my original estimate so that both red and blue are the same at 50/50. Every question asked is adding more information. If Omega asks about green beads all three answers get shifted to 1/3.

This translates into an example with numbered balls just fine. Every additional color or number Omega asks about decreases the expected probability that any particular one of them will come out of the jar, simply because the known space of colors and numbers is growing. Until Omega acknowledges that there could be a bead of a given color or number, there is no particular reason to assume that such a bead exists.

If the example was rewritten to simply say any type of object could be in the jar, this still makes sense. If Omega asks about a red bead, we say 100%. If Omega asks about a blue chair, both become 50%. The restriction of colors and numbers is our assumed knowledge and has nothing to do with the problem at hand. We can meta-game all we want, but it has nothing to do with what could be in the jar.

The state of the initial problem is this:

  • A red bead could be in the jar

After the second question:

  • A red bead could be in the jar
  • A green bead could be in the jar

I suppose it makes some sense to include an "other" category, but there is no knowledge of anything other than red and green beads. The question of probability implies that another may exist, but is that enough to assign it a probability?

Replies from: JamesAndrix, Vladimir_Nesov, Alicorn
comment by JamesAndrix · 2009-05-05T20:49:16.277Z · LW(p) · GW(p)

Every question asked is adding more information. If Omega asks about green beads all three answers get shifted to 1/3.

I don't think we should treat Omega as adding (much) new information with each question. Omega is superintelligent; we should assume that he's already gone all the way down the rabbit hole of possible colors, including ones that our brains could process but our eyes don't see. We can't infer anything about his state of mind from the fact that he's only asking questions about red, green, and blue. A sequence of lilac turquoise turquoise lilac lilac says very much more about what's in the jar than the two hundred color questions Omega asked you beforehand.

Replies from: jimrandomh, MrHen
comment by jimrandomh · 2009-05-05T21:03:37.643Z · LW(p) · GW(p)

Not every question Omega could ask would provide new information, but some certainly do. Suppose his follow-up questions were "What is the probability that the bead is transparent?", "What is the probability that the bead is made of wood?" and "What is the probability that the bead is striped?". It is very likely that your original probability distribution over colors implicitly set at least one of these answers to zero, but the fact that Omega has mentioned it as a possibility makes it considerably more likely.

Replies from: JamesAndrix
comment by JamesAndrix · 2009-05-06T00:46:58.352Z · LW(p) · GW(p)

If Omega asking whether the bead could be striped changes your probability estimates, then you were either wrong before or wrong after (or likely both).

If omega tells you at the outset that the beads are all solid colors, then you should maintain your zero estimate that any are striped. If not, then you never should have had a zero estimate. He's not giving you new information, he's highlighting information you already had (or didn't have.)

I don't see any way to establish a reliable (non-anthropomorphic) chain of causality that connects there being red beads in the jars with Omega asking about red beads. He can ask about beads that aren't there, and that couldn't be there given the information he's given you. When Omega offered to save x+1 billion people if the earth was less than 1 million years old, I don't think anyone argued that his suggesting it should change our estimates.

Replies from: conchis
comment by conchis · 2009-05-06T20:24:57.056Z · LW(p) · GW(p)

I don't see any way to establish a reliable (non-anthropomorphic) chain of causality that connects there being red beads in the jars with Omega asking about red beads.

There's no need to, because probability is in the mind.

Replies from: JamesAndrix
comment by JamesAndrix · 2009-05-07T15:09:54.965Z · LW(p) · GW(p)

If you're going to update based on what Omega asks you, then you must believe there is a connection that you have some information about.

If we don't know anything about Omega's thought process or goals, then his questions tell us nothing.

Replies from: conchis
comment by conchis · 2009-05-07T15:35:03.059Z · LW(p) · GW(p)

I think our only disagreement is semantic.

If I initially divide the state space into solid colours, and then Omega asks if the bead could be striped, then I would say that's a form of new information - specifically, information that my initial assumption about the nature of the state space was wrong. (It's not information I can update on; I have to retrospectively change my priors.)

Apologies for the pointless diversion.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2009-05-07T16:07:06.002Z · LW(p) · GW(p)

An ideal model of the real world must allow any miracle to happen, nothing should be logically prohibited.

comment by MrHen · 2009-05-05T21:18:39.346Z · LW(p) · GW(p)

Of note, I was operating under a bad assumption with regards to the original example. I assumed that the set was a finite but unknown set of colors or an infinite set of colors. In the former case, every question is giving a little information about the possible set. In the latter it really does not matter much.

A sequence of lilac turquoise turquoise lilac lilac says very much more about what's in the jar than the two hundred color questions omega asked you beforehand.

Yes, this is true. Personally, I am still curious about what to do with the two hundred color questions.

comment by Vladimir_Nesov · 2009-05-04T22:58:02.064Z · LW(p) · GW(p)

Don't think of probability as being mutable, as getting updated. Instead, consider a fixed comprehensive state space, that has a place on it for every possible future behavior, including the possible questions asked, possible pieces of evidence presented, possible actions you make. Assign a fixed probability measure to this state space.

Now, when you do observe something, this is information, an event, a subset of the global state space. This event selects an area on it, and encompasses some of the probability mass. The statements, or beliefs (such as "ball #2 will be red"), that you update on this info, are probabilistic variables. A probabilistic variable is a function that maps the state space onto a simpler domain; for example, a binary discrete probabilistic variable is basically an event, a subset of the state space (that is, in some states ball #2 is indeed defined to be red, and these states belong to the event of ball #2 being red).

Your info about the world retains only part of the state space, and within that part, some portion of the probability mass goes to the event defining your statement, and some portion remains outside of it. The "updating" only happens when you focus on this info, as opposed to the whole state space.

If that picture is clear, you can try to step back to consider what kind of probability measure you'd assign to your state space, when its structure already encodes all possible future observations. If you are indifferent to a model, the assignment is going to be some kind of division into equal parts, according to the structure of state space.

Replies from: orthonormal
comment by orthonormal · 2009-05-05T20:43:03.540Z · LW(p) · GW(p)

IAWYC, but as pedagogy it's about on the level of "How should you imagine a 7-dimensional torus? Just imagine an n-dimensional torus and let n go to 7."

Eliezer's post on priors explains the same idea more accessibly.

EDIT: Sorry, I didn't notice you already linked it below.

comment by Alicorn · 2009-05-04T22:44:37.194Z · LW(p) · GW(p)

What if Omega wants you to commit to a bet based on your probabilities at every step?

Or what if he just straight up asks you what color you want to guess the bead will be, without asking about any individual colors? (Then you'd probably be best served by switching to a language with fewer basic color words, but that aside...)

Replies from: MrHen
comment by MrHen · 2009-05-04T22:58:21.625Z · LW(p) · GW(p)

What if Omega wants you to commit to a bet at every step?

Then you are forced to bid 0, because you have to account for any further questions, which sounds similar to what Vladimir_Nesov said.

By the way, I think adding another restriction to your example to force it back into your specific response is not particularly meaningful. In the case where you do not have to commit to a bet at every step, does what I say make sense? If so, then what Vladimir_Nesov suggested seems to be on the right path with regards to your restrictions.

Or what if he just straight up asks you what color you want to guess the bead will be, without asking about any individual colors? (Then you'd probably be best served by switching to a language with fewer basic color words, but that aside...)

Switching languages is a semantic trick. If we are allowed to use any words to describe the bead we can just say "not-clear" because the space of "not-clear" covers what we generally mean by "color". We may as well say "the bead will be a colored bead." All of this breaks the assumed principle of no information.

If Omega wanted a particular color and forced us into actually answering the annoying question, we are completely off the path of probabilities and it does not matter what you answer as long as you pick a color. If Omega then asked us what the probability of that particular color coming out of the jar would be, the answer should be the same as if you picked any other color. This drops to zero unless you self-restrict by the number of colors you can personally remember.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-05-04T23:02:31.481Z · LW(p) · GW(p)

MrHen, whatever strategy you're employing here, it doesn't sound like a strategy for arriving at the really truly correct answer, but some sort of clever set of verbal responses with a different purpose entirely. In real life, just because Omega asked if the bead is red simply does not mean there is probability 0 of it being green.

Replies from: MrHen
comment by MrHen · 2009-05-04T23:23:29.640Z · LW(p) · GW(p)

MrHen, whatever strategy you're employing here, it doesn't sound like a strategy for arriving at the really truly correct answer, but some sort of clever set of verbal responses with a different purpose entirely.

Mmm... I was not trying to employ a strategy with clever verbal responses. I thought I was arguing against that, actually, so I must be far from where I think I am.

I feel like I am trying to answer a completely different question than the one originally asked. Is the question:

  1. Knowing nothing about what is in the jar except that its contents are divided by color as per our definition of "color", what is the probability of a red bead being pulled?
  2. Knowing nothing about what is in the jar, what is the probability of a red bead being pulled?

I admittedly assumed the latter even though the article used words closer to the former. Perhaps this was my mistake?

In real life, just because Omega asked if the bead is red simply does not mean there is probability 0 of it being green.

I would agree. I do think that Omega asking about a red bead implies nothing about the probability of it being green. What I am currently wondering is if the question implies anything about the probability of the bead being red. If Omega acknowledges that the bead could be red, does that give red a higher probability than green?

I suppose I instinctively would answer affirmatively. The reasoning is that "red" is now included in the jar's potential outcomes, while green has not been acknowledged yet. In other words, green doesn't even have a probability. Strictly speaking, this makes little sense, so I must be misstepping somewhere. My hunch is that the misstep is my disallowing green from the potential outcomes.

This does not mean that I refuse to think of green as a color, but that green is not automatically included in the jar's potential outcomes just because Omega used the word "color". Is this the verbal cleverness you were referring to?

(Switching thoughts) In terms of arriving at the really truly correct answer, it seems that what is desired is a strategy that gets closer to the truth as more beads are revealed. If no beads are revealed, what sort of strategy is possible? I think the answer to this revolves around my potential confusion about the original question.

I apologize if I am muddying things up and am way off base.

Replies from: soreff
comment by soreff · 2010-08-08T21:08:16.456Z · LW(p) · GW(p)

Is Omega privileging the hypothesis that the bead is red? :-)

comment by JGWeissman · 2009-05-04T21:12:51.944Z · LW(p) · GW(p)

Omega: So you would consider it more than fair if I offered you three dollars if the bead is red, and you paid me a dollar if it was non-red?

Me: No, because you have more information than I do, and the fact that you would offer this bet is evidence that I should use to update my epistemic probabilities.

Replies from: Alicorn
comment by Alicorn · 2009-05-04T21:16:51.806Z · LW(p) · GW(p)

Well, Omega doesn't really need the money. There's no reason to believe he would balk at offering you a more-than-fair bet.

Replies from: JGWeissman
comment by JGWeissman · 2009-05-04T21:33:44.348Z · LW(p) · GW(p)

Well, in the case of Omega, I would at least suspect that he intends to demonstrate that I am vulnerable to a Dutch book, even though he doesn't need the money.

Replies from: Nominull
comment by Nominull · 2009-05-04T23:28:26.173Z · LW(p) · GW(p)

If you meet an Omega, that is pretty good evidence that you are living in a simulation: specifically, you are being simulated inside a philosopher's brain as a thought experiment.

comment by cousin_it · 2009-05-04T22:31:11.120Z · LW(p) · GW(p)

Sorry, you haven't convincingly demonstrated the wrongness of 50%. MrHen's position seems to me quite natural and defensible, provided he picks a consistent prior. For example, I'd talk with Omega exactly as you described up to this point:

...Omega: What is the probability of the first bead being blue as opposed to non-blue?

Me: 25%.

You ask why 25%? My left foot said so... or maybe because Omega mentioned red first and blue second. C'mon, Dutch-book me.

Replies from: Alicorn
comment by Alicorn · 2009-05-04T22:37:58.624Z · LW(p) · GW(p)

I think that doing it this way assumes that Omega is deliberately screwing with you and will ask about colors in a way that is somehow germane to the likelihood. Assume he picked "red" to ask about first at random out of whatever colors the beads come in.

Replies from: cousin_it, Peter_de_Blanc
comment by cousin_it · 2009-05-04T22:54:16.099Z · LW(p) · GW(p)

This new information gives me grounds to revise my estimates as Omega asks further questions, but I still don't see how it demonstrates the wrongness of initially answering 50%.

Replies from: MrHen
comment by MrHen · 2009-05-05T14:05:54.989Z · LW(p) · GW(p)

The reason 50/50 is bad is that the beads in the jar come in no more than 12 colors, and we have no reason to favor red over the other 11 colors.

Knowing there is a cap of 12 possible options, it makes intuitive sense to start by giving each color equal weight until more information appears (namely, whenever Omega starts pulling beads).
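A quick sketch of that starting point, assuming the twelve categories are fixed; the pseudocount of one per color (Laplace smoothing) is an illustrative choice for how the equal weights wash out as beads appear:

```python
# "red" plus eleven stand-in names for the other basic color words.
colors = ["red"] + [f"color_{i}" for i in range(1, 12)]  # 12 categories
counts = {c: 1 for c in colors}   # pseudocount of 1 each: p(red) = 1/12

def update(observed_color):
    # Each observed bead adds one count to its color.
    counts[observed_color] += 1

def p(color):
    return counts[color] / sum(counts.values())

print(p("red"))     # 1/12 ~ 0.083 before any beads are drawn
update("color_1")   # Omega pulls, say, a gray bead (stand-in name)
print(p("red"))     # 1/13 ~ 0.077: red's share dips slightly
```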

Replies from: cousin_it
comment by cousin_it · 2009-05-05T14:47:31.051Z · LW(p) · GW(p)

we have no reason to favor red over the other 11 colors

We have a reason: Omega mentioned red.

Replies from: MrHen
comment by MrHen · 2009-05-05T15:11:38.989Z · LW(p) · GW(p)

I suppose the relevant question is now, "Does Omega mentioning red tell us anything about what is in the jar?" When we know the set of possible objects in the jar, it really tells us nothing new. If the set of possible objects is unknown, now we know red is a possibility and we can adjust accordingly.

The assumption here is that Omega is just randomly asking about something from the possible set of objects. Essentially, since Omega is admitting that red could be in the jar, we know red could be in the jar. In the 12-color scenario, we already know this. I do not think that Omega mentioning red should affect our guess.

Replies from: cousin_it
comment by cousin_it · 2009-05-05T15:35:45.408Z · LW(p) · GW(p)

All this arguing about priors eerily resembles scholasticism, balancing angels on the head of a pin. Okay, I get it, we read Omega's Bible differently: unlike me, you see no symbolic significance in the mention of red. Riiiiight. Now how about an experiment?

Replies from: MrHen
comment by MrHen · 2009-05-05T15:42:50.444Z · LW(p) · GW(p)

All this arguing about priors eerily resembles scholasticism, balancing angels on the head of a pin. Okay, I get it, we read Omega's Bible differently: unlike me, you see no symbolic significance in the mention of red. Riiiiight. Now how about an experiment?

Agreed. For what it is worth, I do see some significance in the mention of red, but cannot figure out why and do not see the significance in the 12 color example. This keeps setting off a red flag in my head because it seems inconsistent. Any help in figuring out why would be nifty.

In terms of an experiment, I would not bet at all if given the option. If I had to choose, I would choose whichever option costs less and write it off as a forced expense.

In English: if Omega put up a dollar claiming the next bead would be red and asked me what I would stake, I would bet nothing. If I had to pick a non-zero number, I would pick the smallest available.

But that doesn't seem very interesting at all.

comment by Peter_de_Blanc · 2009-05-05T13:38:29.030Z · LW(p) · GW(p)

Then each time Omega mentions another color, it increases the expected number of colors the beads come in.

Replies from: MrHen
comment by MrHen · 2009-05-05T13:57:37.511Z · LW(p) · GW(p)

I think Alicorn is operating under a strict "12 colors of beads" idea based on what a color is or is not. As best as I can tell, the problem is essentially, "Given a finite set of bead colors in a jar, what is the probability of getting any particular color from a hidden mixture of beads?" The trickiness is that each color could be present in a different quantity, not that the number of colors is unknown.

Alicorn answered elsewhere that when the jar has an infinite set of possible options the probability of any particular option would be infinitesimal.

Replies from: orthonormal
comment by orthonormal · 2009-05-05T20:50:05.868Z · LW(p) · GW(p)

If the number of possible outcomes is finite, fixed and known, but no other information is given, then there's a unique correct prior: the maxentropy prior that gives equal weight to each possibility.

(Again, though, this is your prior before Omega says anything; you then have to update it as soon as ve speaks, given your prior on ver motivations in bringing up a particular color first. That part is trickier.)

Replies from: MrHen
comment by MrHen · 2009-05-05T21:12:27.965Z · LW(p) · GW(p)

(Again, though, this is your prior before Omega says anything; you then have to update it as soon as ve speaks, given your prior on ver motivations in bringing up a particular color first. That part is trickier.)

How would you update given the following scenarios (this is assuming finite, fixed, known possible outcomes)?

  1. Omega asks you for the probability of a red bead being chosen from the jar
  2. Omega asks you for the probability of "any particular object" being chosen
  3. Omega asks you to name an object from the set and then asks you for the probability of that object being chosen

Replies from: orthonormal
comment by orthonormal · 2009-05-06T21:56:10.918Z · LW(p) · GW(p)

I don't think #2 or #3 give me any new relevant information, so I wouldn't update. (Omega could be "messing with me" by incorporating my sense of salience of certain colors into the game, but this suspicion would be information for my prior, and I don't think I learn anything new by being asked #3.)

I would incrementally increase my probability of red in case #1, and decrease the others evenly, but I can't satisfy myself with the justification for this at the moment. The space of all minds is vast, and while it would make sense for several instrumental reasons to question first about a more common color, we're assuming that Omega doesn't need or want anything from this encounter.

In the real-life cases this is meant to model, though, like having a psychologist doing a study in place of Omega, I can model their mind by mine and realize that there are more studies in which I'd ask about a color I know is likely to come up than studies in which I'd pick a specific less-likely color, and so I should update p(red) positively.

But probably not all the way to 1/2.
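One toy way to cash that out, where every number (the three candidate red fractions, the 50/50 weight on a prevalence-driven asker) is an illustrative assumption:

```python
# theta = true fraction of red beads; three toy hypotheses, equally likely.
thetas = {0.05: 1/3, 1/12: 1/3, 0.5: 1/3}
w = 0.5  # assumed weight on "asker picks a color proportional to prevalence"

def p_ask_red(theta):
    # Mixture of a random asker (1/12) and a prevalence-driven asker (theta).
    return (1 - w) * (1 / 12) + w * theta

evidence = sum(pr * p_ask_red(t) for t, pr in thetas.items())
posterior = {t: pr * p_ask_red(t) / evidence for t, pr in thetas.items()}

prior_red = sum(t * pr for t, pr in thetas.items())
post_red = sum(t * pr for t, pr in posterior.items())
print(round(prior_red, 3), round(post_red, 3))  # ~0.211 -> ~0.353
```

Being asked about red raises p(red) under this model, but, as above, nowhere near 1/2.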

comment by Cyan · 2009-05-04T21:06:56.960Z · LW(p) · GW(p)

To make consistent bets, we need a prior on the number of possible outcomes.

Replies from: Alicorn
comment by Alicorn · 2009-05-04T21:12:46.540Z · LW(p) · GW(p)

The fact that Omega is speaking English and uses the word "red" as opposed to "scarlet" or something is decent evidence that there are twelve colors in beadspace.

Replies from: Cyan
comment by Cyan · 2009-05-04T21:20:30.158Z · LW(p) · GW(p)

What happens if you've taken bets on twelve colors [ETA: that eat up all your probability] and then Omega asks you to name odds on a transparent bead?

Replies from: Alicorn
comment by Alicorn · 2009-05-04T21:34:35.895Z · LW(p) · GW(p)

I was operating under the assumption that clear is not a "solid color".

Replies from: Cyan, MrHen
comment by Cyan · 2009-05-04T23:19:52.904Z · LW(p) · GW(p)

Right you are. I didn't read the original problem carefully enough...

Nevertheless, you can replace "transparent" with a surprising color like lilac, fuchsia, or, um, cyan to restore the effect. The point is that even decent evidence that there are twelve colors in beadspace doesn't justify a probability distribution on the number of colors that places all of its mass at twelve.

Replies from: Alicorn
comment by Alicorn · 2009-05-05T00:35:25.487Z · LW(p) · GW(p)

The twelve basic colors are so called because they are not kinds of other colors. Lilac and fuchsia are kinds of purple (I guess you could argue that fuchsia is a kind of red, instead, but pretend you couldn't), and cyan is a kind of blue. Even if you pull out a navy bead and then a cyan bead, they are both kinds of blue in English; in Russian, they would be different colors as unalike as pink and red.

Replies from: Cyan
comment by Cyan · 2009-05-05T00:55:09.375Z · LW(p) · GW(p)

So you're arguing that by definition, the basic color words define a mutually exclusive and exhaustive set. But there are colors near cyan which are not easy to categorize -- the fairest description would be blue-green. In the least convenient world, when Omega asks you for odds on blue-green, you ask it if that color counts as blue and/or green, and it replies, "Neither; I treat blue-green as distinct from blue and green." Then what do you do?

Replies from: Alicorn
comment by Alicorn · 2009-05-05T01:07:27.189Z · LW(p) · GW(p)

I was mentally categorizing that as "Omega deliberately screwing with you" by using English strangely, but perhaps that was unmotivated of me. But this gets into a grand metaphysical discussion about where colors begin and end, and whether there is real vagueness around their borders, and a whole messy philosophy of language hissy fit about universals and tropes and subjectivity and other things that make you sound awfully silly if you argue about them in public. I ignored it because the idea of the post wasn't about colors, it was about probabilities.

Replies from: Cyan, JamesAndrix, MrHen
comment by Cyan · 2009-05-05T02:18:34.225Z · LW(p) · GW(p)

That's a shame, because uncertainty about the number of possible outcomes is a real and challenging statistical problem. See for example Inference for the binomial N parameter: A hierarchical Bayes approach (abstract)(full paper pdf) by Adrian Raftery. Raftery's prior for the number of outcomes is 1/N, but you can't use that for coherent betting.
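A sketch of why the choice of prior on the number of outcomes matters for betting, using a geometric prior over N as a stand-in assumption (Raftery's 1/N prior itself is improper: the sum over all N diverges, which is why it can't back coherent odds), and assuming for simplicity that red is always among the N colors:

```python
from math import isclose

def p(N):
    # Geometric prior over the number of outcomes N = 1, 2, 3, ...
    return 0.5 ** N

total = sum(p(N) for N in range(1, 200))   # tail beyond 200 is negligible
assert isclose(total, 1.0)                 # proper: sums to 1

# Induced p(bead is red), assuming each of the N colors is equally likely
# and red is one of them (both toy assumptions):
p_red = sum(p(N) * (1 / N) for N in range(1, 200))
print(p_red)   # ~0.693 (= ln 2): heavily weighted toward small N
```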

comment by JamesAndrix · 2009-05-05T21:05:33.277Z · LW(p) · GW(p)

I think there's also the question of inferring the included name space and possibility space from the questions asked.

If he asks you about HTML color #FF0000 (which is red) after asking you about red, do you change your probability? Assuming he's using 12 color words just because he used "red" is arbitrary.

Even with defined and distinct color terms, the question is which of those colors are actual possibilities (colors in the jar), as opposed to logical possibilities (colors Omega can name), and I think THAT ties back to Eliezer's article about Job vs. Frodo.

comment by MrHen · 2009-05-05T04:37:24.727Z · LW(p) · GW(p)

I was mentally categorizing that as "Omega deliberately screwing with you" by using English strangely, but perhaps that was unmotivated of me. But this gets into a grand metaphysical discussion about where colors begin and end, and whether there is real vagueness around their borders, and a whole messy philosophy of language hissy fit about universals and tropes and subjectivity and other things that make you sound awfully silly if you argue about them in public. I ignored it because the idea of the post wasn't about colors, it was about probabilities.

Personally, I think the intent has less to do with classifying colors strangely and more to do with finding a broader example where even less information is known. The misstep I think I took earlier had to do with assuming that the colors were just part of an example and that the jar could theoretically hold items from an infinite set.

I get that when picking beads from the set of 12 colors it makes sense to guess that red will appear with a probability near 1/12. An infinite set, instead of 12, is also interesting as a no-information case. As far as I can tell, there is no good argument for any particular member of the set. So, asking the question directly: what if the beads have integers printed on them? What am I supposed to do when Omega asks me about a particular number?

Replies from: Alicorn
comment by Alicorn · 2009-05-05T04:56:02.691Z · LW(p) · GW(p)

Unless you have a reason to believe that there is some constraint on what numbers could be used - if only a limited number of digits will fit on the bead, for example - your probability for each integer has to be infinitesimal.

Replies from: orthonormal, Peter_de_Blanc
comment by orthonormal · 2009-05-05T20:36:31.513Z · LW(p) · GW(p)

You're not allowed to do that. With a countably infinite set, your only option for priors that assign everything a number between 0 and 1 is to take a summable infinite series. (Exponential distributions, like that proposed by Peter above, are the most elegant for certain questions, but you can do p(n)=cn^{-2} or something else if you prefer to have slower decay of probabilities.)

In the case with colors rather than integers, a good prior on "first bead color, named in a form acceptable to Omega" would correspond to this: take this sort of distribution, starting with the most salient color names and working out from there, but being sure not to exceed 1 in total.

Of course, this is before Omega asks you anything. You then have to have some prior on Omega's motivations, with respect to which you can update your initial prior when ve asks "Is it red?" And yes, you'll be very metauncertain about both these priors... but you've got to pick something.
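For concreteness, a quick numerical check of one such summable series, Peter's p(n) = (1/3) * 2^(-|n|):

```python
def p(n):
    # Summable prior over all integers: total mass is exactly 1.
    return (1 / 3) * 2 ** (-abs(n))

total = sum(p(n) for n in range(-60, 61))  # tails beyond 60 are negligible
print(total)          # ~1.0: the series sums to 1

# Every integer gets a strictly positive, non-infinitesimal probability,
# but the probabilities must decay as |n| grows.
print(p(0), p(10))    # 1/3 vs ~3.26e-4
```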

Replies from: MrHen
comment by MrHen · 2009-05-06T19:48:20.996Z · LW(p) · GW(p)

I am happy with that explanation. Thanks.

comment by Peter_de_Blanc · 2009-05-05T16:55:13.575Z · LW(p) · GW(p)

Why not, say, p(n) = (1/3) * 2^(-|n|)?

Replies from: MrHen
comment by MrHen · 2009-05-05T17:04:57.873Z · LW(p) · GW(p)

If p(n) = (1/3) * 2^(-|n|), then:

  • p(1) = (1/3) * 2^(-1) = 0.166666667
  • p(86) = (1/3) * 2^(-86) = 4.30823236 × 10^(-27)
  • p(1 000 000) = (1/3) * 2^(-1 000 000) = Lower than Google's calculator lets me go

Are you willing to bet that 1 is going to happen that much more often than 1,000,000?

Replies from: GuySrinivasan, Peter_de_Blanc
comment by GuySrinivasan · 2009-05-05T17:37:32.508Z · LW(p) · GW(p)

The point is that your probability for the "first" integers will not be infinitesimal. If you think that drops off too quickly, then instead of 2 use 1+e for some small e > 0: p(n) = e/(e+2) * (1+e)^(-|n|). And replace n with s(n) if you don't like that ordering of integers. But regardless, there's some N for which there is an n with |n| < N and an m with |m| > N such that p(n)/p(m) >> 1.

comment by Peter_de_Blanc · 2009-05-05T21:09:21.954Z · LW(p) · GW(p)

I wasn't talking about limiting frequencies, so don't ask me "how often?"

Would you bet $1 billion against my $1 that no number with absolute value smaller than 3^^^3 will come up? If not, then you shouldn't be assigning infinitesimal probability to those numbers.

Replies from: MrHen
comment by MrHen · 2009-05-05T22:14:23.826Z · LW(p) · GW(p)

I get the feeling that I am thinking about this incorrectly but am missing a key point. If someone out there can see it, please let me know.

I wasn't talking about limiting frequencies, so don't ask me "how often?"

Sorry.

If the set of possible options is all integers and Omega asks about a particular integer, why would the probability go up the smaller the number gets?

Would you bet $1 billion against my $1 that no number with absolute value smaller than 3^^^3 will come up? If not then you shouldn't be assigning infinitesimal probability to those numbers.

Betting on ranges seems like a no-brainer to me. If Omega comes and asks you to pick an integer and then asks me to bet on whether an object pulled from the jar will have an absolute value over or under that integer, I should always bet that the number will be higher than yours.

If I had a random number generator that could theoretically pull a random number from all integers, it seems weird to assume it will be small. As far as I know, such a random number generator is impossible. Assuming it is impossible, there must be a cap somewhere in the set of all integers. The catch is that we have no idea where this cap is. If you can write 3^^^3 I can write 3^^^3 + 1 which leads me to believe that no matter what number you pick, the cap will be significantly higher. As long as I can cover the costs of the bet, I should bet against you.

The math works like this:

  • Given the option to place $X against your $Y
  • That when you pick an integer Z,
  • Omega will pull a number out of the jar that is greater than the absolute value of Z,
  • There is always an imaginable cap M large enough that Y * (M - Z) > X * (Z + 1), and,
  • Assuming I am placing equal probabilities on each possible integer between 0 and M,
  • I should always take the bet

A trivial example: If I bet $5 against your $1 and you pick the integer 100, any cap above 605 makes the bet favorable, and I can easily imagine the cap being 1000. I should take the bet.
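A sketch of that computation under the assumed uniform-up-to-a-cap picture; the candidate caps are arbitrary, and the sign of the expected value flips exactly at M = 605 for these stakes:

```python
X, Y, Z = 5, 1, 100   # risk $5 to win $1 that the drawn number exceeds 100

def expected_value(M):
    # Uniform prior on {0, 1, ..., M}: M - Z of the M + 1 integers exceed Z.
    p_win = (M - Z) / (M + 1)
    return Y * p_win - X * (1 - p_win)

for M in (501, 605, 606, 1000):
    print(M, round(expected_value(M), 4))
# 501 -> -0.2072, 605 -> 0.0, 606 -> +0.0016, 1000 -> +0.3946
```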

The problem seems to be that I am placing equal probabilities on each possible integer while you favor numbers closer to 0. Favoring numbers like 1 or 2 makes a lot of sense if someone came up to me on the street with a bucket of balls with numbers printed on them. I would also consider the chances of pulling 1 to be much higher than 3^^^3.

So, perhaps, my misstep is thinking of Omega's challenge as a purely theoretical puzzle and not associating it with the real world. In any case, I certainly do not want to give the impression that I think 3^^^3 is just as likely to appear as 42 in the real world. Of course, in the real world I wouldn't bet on anything at all because I do not consider the information available to be useful in determining the correct action and I am ridiculously risk averse.

Replies from: Vladimir_Nesov, GuySrinivasan
comment by Vladimir_Nesov · 2009-05-06T00:28:32.361Z · LW(p) · GW(p)

To dispel this confusion, you should read up on algorithmic information theory.

Replies from: MrHen
comment by MrHen · 2009-05-06T03:15:46.485Z · LW(p) · GW(p)

Is there a good place to start online? Can I just Google "algorithmic information theory"?

Replies from: Cyan
comment by Cyan · 2009-05-06T03:37:57.497Z · LW(p) · GW(p)

Google first and ask questions later. ;-)

comment by GuySrinivasan · 2009-05-05T23:08:48.200Z · LW(p) · GW(p)

The problem seems to be that I am placing equal probabilities on each possible integer while you favor numbers closer to 0.

You are not doing so, since it is impossible: no such probability distribution exists. In fact, you recognize this by saying there's a cap somewhere out there; you just don't know where. Well, this cap means that small numbers (smaller than the cap) have much, much higher probability (i.e., nonzero) than large numbers (those higher than the cap have zero probability).

Maybe this will serve as an intuition pump: suppose you've narrowed down your cap to just a few numbers. In fact, just N and 2N. You've given them each equal weight. Well, now p(1) = (1/N + 1/(2N))/2 = 3/(4N), but p(N+k) = (0 + 1/(2N))/2 = 1/(4N), and p(2N+k) = 0. The probability goes down as numbers get larger. Determine your prior over all the caps, compute the resulting distribution, and you'll find p(n) eventually starts to decrease.
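The same intuition pump with concrete numbers (N = 10 is an arbitrary choice):

```python
caps = {10: 0.5, 20: 0.5}   # toy assumption: cap is 10 or 20, equally likely

def p(n):
    # Average over caps of "uniform on {1, ..., cap}".
    return sum(w * (1 / cap if 1 <= n <= cap else 0)
               for cap, w in caps.items())

print(p(1))    # (1/10 + 1/20) / 2 = 3/40: below both caps
print(p(15))   # (0 + 1/20) / 2 = 1/40: between the caps
print(p(25))   # 0: beyond the larger cap
```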

comment by MrHen · 2009-05-04T22:28:18.251Z · LW(p) · GW(p)

(Edit) After rereading my own comment, I do not think much of anything in here makes sense. Feel free to ignore it completely. I know what I was trying to say but failed miserably. Sorry.

But now you are playing semantics and making artificial restrictions on the types of beads in the jar. This is definitely not no information, and it somewhat undermines the original example. If we switched the example to balls with integers printed on them, you would have no linguistic basis to say there are only twelve options. I am just assuming that this is a better example than the colored beads. If you specifically meant the article to use "no information" to exclude "linguistic hints", then I would be forced to agree with your conclusion. Relevant quotes from the original post:

But because you start with no information, it's very hard to gather more.

Assuming you don't think Omega is out to deliberately screw with you, you could say that the probability is .083 based on the fact that "red" is one of twelve basic color words in English.

But this .083 guess is as wrong as .5 in the numbered balls example. The 50/50 guess has nothing to do with "red" and everything to do with guessing correctly. I could translate it into the following statement with no qualms:

"Omega will pull a bead in the color of his choosing."

If "color of his choosing" means red, okay. If it means blue, okay. I am not going to take one bet for each color because the color is unimportant until we see a bead come out of the jar.

Realistically, I would start at 0 because a bet with no information scares me, but the probability of "0" is no more wrong than ".5". It just carries less risk.

You should not guess that the first bead has a 50% chance of being red, because if you do, you can have this conversation: [snip]

With the numbered balls example, anything but 0 is a foolish response, because instead of red, blue, green ... yellow it would be 1, 2, 3, 4 ... NaN. But even so, "0" is as wrong as ".5" because we have no information.

(Off-topic) This conversation strangely reminds me of talking about Pascal's Wager...

comment by Cameron_Taylor · 2009-05-05T14:16:35.321Z · LW(p) · GW(p)

Assuming that Omega is picking at random, it makes sense to me to simply pick a random percentage on the first pull and then swing to 0% or 100% once you see a bead. Update again on the second bead to 0%, 50%, or 100%. Continuing the process should eventually converge on the correct answer.

Picking a random percentage is not a particularly good idea. It will lead you to make absolutely insane bets for no good reason.

comment by John_Maxwell (John_Maxwell_IV) · 2009-05-04T20:19:28.258Z · LW(p) · GW(p)

In my opinion, a more interesting question is what game Omega can devise in which revealing your probability estimate is part of the winning strategy. If he asks you to name even odds and bet with him, for example, you could name ridiculous odds if you wanted more money. The only thing I can think of is for Omega to pour out part of the jar and reward you depending on your deviation from the correct percentage.

Replies from: Peter_de_Blanc
comment by Peter_de_Blanc · 2009-05-04T21:10:27.800Z · LW(p) · GW(p)

You could use a proper scoring rule.
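A minimal sketch of how that would work, using the logarithmic score as one example of a strictly proper rule; the grid search is just an illustration that honest reporting maximizes your expected score:

```python
from math import log

def expected_log_score(reported, true_p):
    # Expected score if the bead is red with probability true_p
    # and you report probability `reported`.
    return true_p * log(reported) + (1 - true_p) * log(1 - reported)

true_p = 1 / 12
best_score, best_report = max(
    (expected_log_score(r / 100, true_p), r / 100) for r in range(1, 100)
)
print(best_report)   # 0.08: the grid point closest to your true belief
```

Because the rule is strictly proper, naming ridiculous odds can only lower your expected reward, so Omega learns your actual probability estimate.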

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2009-05-04T21:43:55.638Z · LW(p) · GW(p)

A more comprehensive description:

T. Gneiting & A. E. Raftery (2007). "Strictly Proper Scoring Rules, Prediction, and Estimation". Journal of the American Statistical Association 102(477):359-378. [pdf]