↑ comment by alex_zag_al ·
2012-09-06T15:54:24.648Z · LW(p) · GW(p)
Okay. I'm assuming everyone has the same prior. I'm going to start by comparing the case where C talks to A and learns everything A knows, to the case where C talks to B and learns everything B knows; that is, when C ends up conditioning on all the same things. If you already see why those two cases are very different, you can skip down to the second section, where I talk about what this implies about how C updates when just hearing that A knows a lot and what Pa(X) is, compared to how he updates when learning what B thinks.
It's the same scenario as you described: knowlegable A, ignorant B, Pa(X) = Pb(X).
What happens when C learns everything B knows depends on what evidence C already has. If C knows nothing, then after talking to B, Pc(X) = Pb(X), because he'll be conditioning on exactly the same things.
In other words, if C knows nothing, then C is even more ignorant than B is. When he talks to B, he becomes exactly as ignorant as B is, and assigns the probability that you have in that state of ignorance.
It's only if C already has some evidence that talking to A and talking to B becomes different. As Kindly said, Pa(X) is very stable. So once C learns everything that A knows, C ends up with the probability Pa(X|whatever C knew), which is probably a lot like Pa(X). To take an extreme case, if A is well-informed enough, then she already knows everything C knows, and Pa(X|whatever C knew) is equal to Pa(X), and C comes out with exactly the same probability as A. But if C's info is new to A, then it's probably a lot like telling your biochemistry professor about a study that you read weighing in on one side of a debate: she's seen plenty of evidence for both sides, and unless this new study is particularly conclusive, it's not going to change her mind a whole lot.
However, B's probability is not stable. That biochemistry study might change B's mind a lot, because for all she knows, there isn't even a debate, and she has this pretty good evidence for one side of it.
So, once C talks to B and learns everything B knows, C will be using the probability that incorporates all of B's knowledge, plus his own: Pb(X|whatever C knew). This is probably farther from Pb(X) aka Pa(X) than Pa(X|whatever C knew).
This is just how it would typically go. I say A's probability is more "stable", but there's actually some evidence that A would recognize as extremely significant that would mean nothing to B. In this case, one C has learned everything A knows, he would also recognize the significance of the little bit of knowledge that he came in with, and end up with a probability far different from Pa(X).
So that's how it would probably go if C actually sits down and learns everything they know.
So, what if C just knows that A is knowledgable, and Pa(X)? Well, suppose that C is convinced by my reasoning, that if he sat down with A and learned everything she knew, then her probability of X would end up pretty close to Pa(X).
Here's the key thing: If C expects that, then his probability is already pretty close to Pa(X). All C knows is that A is knowledgable and has Pa(X), but if he expects to be convinced after learning everything A knows, then he already is convinced.
For any event Q, P(X) is equal to the expected value of P(X|the outcome of Q). That is, you don't know the outcome of Q, but if there's N mutually exclusive possible outcomes O_1... O_N, then P(X) = P(X|O_1)P(O_1) + ... + P(X|O_N)P(O_N). This is one way of stating Conservation of Probability. If the expected value of Pc(X|the outcome of learning everything A knows) is pretty close to Pa(X), then, well, Pc(X) must be pretty close too, because the expected value of Pc(X|the outcome of learning everything A knows) is equal to Pc(X).
Likewise, if C learns about B's knowledge and Pb(X), and he doesn't think that learning everything B knows would make much of a difference, then he also doesn't end up matching Pb(X) unless he started out matching before he even learned B's testimony.
I've been assuming that A's knowledge makes her probability more "stable"; Pa(X|one more piece of evidence) is close to Pa(X). What if A is knowledgable but unstable? I think it still works out the same way but I haven't worked it out and I have to go.
PS: This is a first attempt on my part. Hopefully it's overcomplicated and overspecific, so we can work out/receive a more general/simple answer. But I saw that nobody else had replied so here ya go.