Right for the Wrong Reasons
post by katydee · 2013-01-24T00:02:50.232Z · LW · GW · Legacy
One of the few things that I really appreciate having encountered during my study of philosophy is the Gettier problem. Paper after paper has been published on this subject, starting with Gettier's original "Is Justified True Belief Knowledge?" In brief, Gettier argues that knowledge cannot be defined as "justified true belief" because there are cases where people hold a justified true belief, but the belief turns out to be true for reasons unconnected to its justification.
For instance, Gettier cites the example of two men, Smith and Jones, who are applying for a job. Smith believes that Jones will get the job, because the president of the company told him that Jones would be hired. He also believes that Jones has ten coins in his pocket, because he counted the coins in Jones's pocket ten minutes ago (Gettier does not explain this behavior). Thus, he forms the belief "the person who will get the job has ten coins in his pocket."
Unbeknownst to Smith, though, he himself will get the job, and further he himself has ten coins in his pocket that he was not aware of-- perhaps he put someone else's jacket on by mistake. As a result, Smith's belief that "the person who will get the job has ten coins in his pocket" was correct, but only by luck.
While I don't find the primary purpose of Gettier's argument particularly interesting or meaningful (much less the debate it spawned), I do think Gettier's paper does a very good job of illustrating the situation that I refer to as "being right for the wrong reasons." This situation has important implications for prediction-making and hence for the art of rationality as a whole.
Simply put, a prediction that is right for the wrong reasons isn't actually right from an epistemic perspective.
If I predict, for instance, that I will win a 15-touch fencing bout, implicitly believing this will occur when I strike my opponent 15 times before he strikes me 15 times, and I in fact lose fourteen touches in a row, only to win by forfeit when my opponent intentionally strikes me many times in the final touch and is disqualified for brutality, my prediction cannot be said to have been accurate.
Where this gets more complicated is with predictions that are right for the wrong reasons, but the right reasons still apply. Imagine the previous example of a fencing bout, except this time I score 14 touches in a row and then win by forfeit when my opponent flings his mask across the hall in frustration and is disqualified for an offense against sportsmanship. Technically, my prediction is again right for the wrong reasons-- my victory was not thanks to scoring 15 touches, but thanks to my opponent's poor sportsmanship and subsequent disqualification. However, I likely would have scored 15 touches given the opportunity.
In cases like this, it may seem appealing to credit my prediction as successful, as it would be successful under normal conditions. However, I think we perhaps have to resist this impulse and instead simply work on making more precise predictions. If we start crediting predictions that are right for the wrong reasons, even if it seems like the "spirit" of the prediction is right, this seems to open the door for relying on intuition and falling into the traps that contaminate much of modern philosophy.
What we really need to do in such cases seems to be to break down our claims into more specific predictions, splitting them into multiple sub-predictions if necessary. My prediction about the outcome of the fencing bout could better be expressed as multiple predictions, for instance "I will score more points than my opponent" and "I will win the bout." Some may notice that this is similar to the implicit justification being made in the original prediction. This is fitting-- drawing out such implicit details is key to making accurate predictions. In fact, this example itself was improved by tabooing[1] "better" in the vague initial sentence "I will fence better than my opponent."
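A minimal editorial sketch of scoring the split-out sub-predictions separately; the 0.8 forecasts and the use of the Brier score are illustrative assumptions, not part of the post:

```python
def brier(prob, happened):
    """Squared error between the forecast probability and the 0/1 outcome."""
    return (prob - (1.0 if happened else 0.0)) ** 2

# The forfeit scenario: I lose 14 touches in a row, then win when my
# opponent is disqualified. The vague prediction comes true; the
# sub-prediction carrying the implicit justification does not.
sub_predictions = [
    ("I will win the bout", 0.8, True),
    ("I will score more points than my opponent", 0.8, False),
]

for claim, prob, happened in sub_predictions:
    print(f"{claim}: Brier score {brier(prob, happened):.2f}")
# 0.04 for the first claim, 0.64 for the second: the second score records
# the epistemic failure that crediting the vague prediction alone would hide.
```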
In order to make better predictions, we must cast out those predictions that are right for the wrong reasons. While it may be tempting to award such efforts partial credit, this flies in the face of the spirit of the truth. The true skill of cartography requires forming maps that are both accurate and reproducible; lucking into accuracy may be nice, but it speaks ill of the reproducibility of your methods.
[1] I strongly suggest that you make tabooing a five-second skill, and, better still, that you learn to recognize when you need to apply it to your own processes. It pays great dividends in terms of precise thought.
67 comments
comment by royf · 2013-01-24T01:22:54.251Z · LW(p) · GW(p)
Predictions are justified not by becoming a reality, but by the likelihood of their becoming a reality [1]. When this likelihood is hard to estimate, we can take their becoming a reality as weak evidence that the likelihood is high. But in the end, after counting all the evidence, it's really only the likelihood itself that matters.
If I predict [...] that I will win [...] and I in fact lose fourteen touches in a row, only to win by forfeit
If I place a bet on you to win and this happens, I'll happily collect my prize, but still feel that I put my money on the wrong athlete. My prior and the signal are rich enough for me to deduce that your victory, although factual, was unlikely. If I believed that you're likely to win, then my belief wasn't "true for the wrong reasons", it was simply false. If I believed that "you will win" (no probability qualifier), then in the many universes where you didn't I'm in Bayes Hell.
Conversely in the other example, your winning itself is again not the best evidence for its own likelihood. Your scoring 14 touches is. My belief that you're likely to win is true and justified for the right reasons: you're clearly the better athlete.
[1] Where likelihood is measured either given what I know, or what I could know, or what anybody could know - depending on why we're asking the question in the first place.
↑ comment by AlexSchell · 2013-02-01T03:38:13.516Z · LW(p) · GW(p)
I notice that I am confused. What you say seems plausible but also in conflict with the (also plausible) Yudkowskian creed that probability is in the map.
↑ comment by TheOtherDave · 2013-02-01T04:40:35.445Z · LW(p) · GW(p)
Can you clarify the conflict? It seems to me that when I treat observations as evidence with which to update my estimate of the likelihood of a prediction, as the OP describes, I'm doing a bunch of "map-level" operations.
↑ comment by AlexSchell · 2013-02-01T04:56:52.266Z · LW(p) · GW(p)
I would like to be able to clarify it; as of now it's only my own confusion. In my confusion, "estimate of the likelihood of a prediction" sounds like assigning probabilities to probability statements, which feels like a map-territory reversal of some sort.
↑ comment by TheOtherDave · 2013-02-01T05:58:43.073Z · LW(p) · GW(p)
Does it help to reread royf's footnote?
[1] Where likelihood is measured either given what I know, or what I could know, or what anybody could know - depending on why we're asking the question in the first place.
That is, he's not talking about some thing out there in the world which is independent of our minds, nor am I when I adopt his terminology. The likelihood of a prediction, like all probability judgments, exists in the mind and is a function of how evidence is being evaluated. Indeed, any relationship between a prediction and a state of events in the world exists solely in the mind to begin with.
↑ comment by royf · 2013-02-01T18:04:13.392Z · LW(p) · GW(p)
To clarify further: likelihood is a relative quantity, like speed - it only has meaning relative to a specific frame of reference.
If you're judging my calibration, the proper frame of reference is what I knew at the time of prediction. I didn't know what the result of the fencing match would be, but I had some evidence for who is more likely to win. The (objective) probability distribution given that (subjective) information state is what I should've used for prediction.
If you're judging my diligence as an evidence seeker, the proper frame of reference is what I would've known after reasonable information gathering. I could've taken some actions to put myself in a different information state, and then my prediction could have been better.
But it's unreasonable to expect me to know the result beyond any doubt. Even if Omega is in an information state of perfectly predicting the future, this is never a proper frame of reference by which to judge bounded agents.
And this is the major point on which I'm non-Yudkowskian: since Omega is never a useful frame of reference, I'm not constraining reality to be consistent with it. In this sense, some probabilities are in the territory.
↑ comment by TheOtherDave · 2013-02-01T20:01:23.694Z · LW(p) · GW(p)
since Omega is never a useful frame of reference, I'm not constraining reality to be consistent with it. In this sense, some probabilities are in the territory.
I thought I was following you, but you lost me there.
I certainly agree that if I want to evaluate various aspects of your cognitive abilities based on your predictions, I should look at different aspects of your predictions depending on what abilities I care about, as you describe, and that often the accuracy of your prediction is not the most useful aspect to look at. And of course I agree that expecting perfect knowledge is unreasonable.
But what that has to do with Omega, and what the uselessness of Omega as a frame of reference has to do with constraints on reality, I don't follow.
↑ comment by royf · 2013-02-01T21:22:13.766Z · LW(p) · GW(p)
I probably need to write a top-level post to explain this adequately, but in a nutshell:
I've tossed a coin. Now we can say that the world is in one of two states: "heads" and "tails". This view is consistent with any information state. The information state (A) of maximal ignorance is a uniform distribution over the two states. The information state (B) where heads is twice as likely as tails is the distribution p("heads") = 2/3, p("tails") = 1/3. The information state (C) of knowing for sure that the result is heads is the distribution p("heads") = 1, p("tails") = 0.
Alternatively, we can say that the world is in one of these two states: "almost surely heads" and "almost surely tails". Now information state (A) is a uniform distribution over these states; (B) is perhaps the distribution p("ASH") = 0.668, p("AST") = 0.332; but (C) is impossible, and so is any information state that is more certain than reality in this strange model.
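An editorial sketch of the two models just described, assuming "almost surely heads" means P(heads) = 0.999 (a number taken from this commenter's later reply); the point is that the second model admits no information state that assigns the outcome probability 1:

```python
# Model 1: world states are the outcomes themselves; an information
# state is a distribution over {"heads", "tails"}.
def p_heads_model1(info):
    return info.get("heads", 0.0)

A1 = {"heads": 0.5, "tails": 0.5}   # maximal ignorance
B1 = {"heads": 2/3, "tails": 1/3}   # heads twice as likely
C1 = {"heads": 1.0, "tails": 0.0}   # complete certainty -- representable here

# Model 2: world states are "almost surely heads" / "almost surely tails",
# assumed here to mean P(heads) = 0.999 and 0.001 respectively.
def p_heads_model2(info):
    return info.get("ASH", 0.0) * 0.999 + info.get("AST", 0.0) * 0.001

A2 = {"ASH": 0.5, "AST": 0.5}
B2 = {"ASH": 0.668, "AST": 0.332}

print(p_heads_model1(C1))                        # 1.0
print(p_heads_model2({"ASH": 1.0, "AST": 0.0}))  # 0.999 -- certainty is unreachable
print(p_heads_model2(B2))                        # ~0.667, close to model 1's state B
```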
Now, in many cases we can theoretically have information states arbitrarily close to complete certainty. In such cases we must use the first kind of model. So we can agree to just always use the first kind of model, and avoid all this silly complication.
But then there are cases where there are real (physical) reasons why not every information state is possible. In these cases reality is not constrained to be of the first kind, and it could be of the second kind. As a matter of fact, to say that reality is of the first kind - and that probability is only in the mind - is to say more about reality than can possibly be known. This goes against Jaynesianism.
So I completely agree that not knowing something is a property of the map rather than the territory. But an impossibility of any map to know something is a property of the territory.
↑ comment by TheOtherDave · 2013-02-01T21:45:16.401Z · LW(p) · GW(p)
the world is in one of two states: "heads" and "tails". [..] The information state (C) of knowing for sure that the result is heads is the distribution p("heads") = 1, p("tails") = 0.
Sure. And (C) is unachievable in practice if one is updating one's information state sensibly from sensible priors.
Alternatively, we can say that the world is in one of these two states: "almost surely heads" and "almost surely tails". Now information state (A) is a uniform distribution over these states
I am uncertain what you mean to convey in this example by the difference between a "world state" (e.g., ASH or AST) and an "information state" (e.g. p("ASH")=0.668).
The "world state" of ASH is in fact an "information state" of p("heads")>SOME_THRESHOLD, which is fine if you mean those terms to be denotatively synonymous but connotatively different, but problematic if you mean them to be denotatively different.
...but (C) is impossible.
(C), if I'm following you, maps roughly to the English phrase "I know for absolutely certain that the coin is almost surely heads".
Yes, agreed that this is strictly speaking unachievable, just as "I know for absolutely certain that the coin is heads" was.
That said, I'm not sure what it means for a human brain to have "I know for absolutely certain that the coin is almost surely heads" as a distinct state from "I am almost sure the coin is heads," and the latter is achievable.
So we can agree to just always use the first kind of model, and avoid all this silly complication.
Works for me.
But then there are cases where there are real (physical) reasons why not every information state is possible.
And now you've lost me again. Of course there are real physical reasons why certain information states are not possible... e.g., my brain is incapable of representing certain thoughts. But I suspect that's not what you mean here.
Can you give me some examples of the kinds of cases you have in mind?
↑ comment by royf · 2013-02-01T22:55:43.945Z · LW(p) · GW(p)
The "world state" of ASH is in fact an "information state" of p("heads")>SOME_THRESHOLD
Actually, I meant p("heads") = 0.999 or something.
(C), if I'm following you, maps roughly to the English phrase "I know for absolutely certain that the coin is almost surely heads".
No, I meant: "I know for absolutely certain that the coin is heads". We agree that this much you can never know. As for getting close to this, for example having the information state (D) where p("heads") = 0.999999: if the world is in the state "heads", (D) is (theoretically) possible; if the world is in the state "ASH", (D) is impossible.
Can you give me some examples of the kinds of cases you have in mind?
Mundane examples may not be as clear, so: suppose we send a coin-flipping machine deep into intergalactic space. After a few billion years it flies permanently beyond our light cone, and then flips the coin.
Now any information state about the coin, other than complete ignorance, is physically impossible. We can still say that the coin is in one of the two states "heads" and "tails", only unknown to us. Alternatively we can say that the coin is in a state of superposition. These two models are epistemologically equivalent.
I prefer the latter, and think many people in this community should agree, based on the spirit of other things they believe: the former model is ontologically more complicated. It's saying more about reality than can be known. It sets the state of the coin as a free-floating property of the world, with nothing to entangle with.
↑ comment by TheOtherDave · 2013-02-02T02:17:35.665Z · LW(p) · GW(p)
OK. Thanks for clarifying.
↑ comment by CronoDAS · 2013-01-24T02:20:45.458Z · LW(p) · GW(p)
That suggests a question.
If I flip a fair coin, and it comes up heads, what is the probability of that coin flip, which I already made, having instead been tails? (Approximately) 0, because we've already seen that the coin didn't come up tails, or (approximately) 50%, because it's a fair coin and we have no way of knowing the outcome in advance?
↑ comment by dspeyer · 2013-01-24T04:52:42.703Z · LW(p) · GW(p)
If you define "probability" as something that exists in a mind then it's perfectly reasonable that you_then.prob != you_now.prob.
If you're defining "probability" in some other way, please explain what you mean.
↑ comment by A1987dM (army1987) · 2013-01-25T03:27:04.267Z · LW(p) · GW(p)
As Jaynes suggested, it's best to view all probabilities as conditional. P(coin came up heads | what I know now) = 1, P(coin came up heads | what I knew before flipping it) = 0.5.
↑ comment by Elithrion · 2013-01-24T05:39:51.844Z · LW(p) · GW(p)
The way I look at it is that before the coin flip you obtain a probability from information you have at the time and you predict ~50%. After the coin flip, you have obtained new information, so the probability that the last coin flip came up tails becomes ~100% (because it did), and the new information also gives you a tiny bit of data that says "maybe the coin comes up heads more often", so you also update to ~50.005% heads for the next one (or whatever). So, yes, the probability that the coin came up tails last try becomes ~100%, you just couldn't estimate it from the information you had and with just your brain beforehand (an AGI would've probably immediately seen how much force is going into the flip and calculated it all out and seen ~100% probability).
Although if you have an event that's heavily influenced by quantum magic, which a coin flip is not, you might need to consider that maybe it did have true 50% probability (that is, no amount of information and processing power would improve the prediction), and you just lost half your world's measure.
↑ comment by nshepperd · 2013-01-24T03:01:54.050Z · LW(p) · GW(p)
The relevant number for the purpose of judging the "correctness" of predictions is the probability you should have had at the time of the prediction (i.e. the epistemically correct prior). Whether the outcome of the coin flip is heads or tails, the correct prior odds are 1:1, because you had no evidence either way.
If royf bets that katydee will win a fencing bout, and in fact they lose fourteen touches in a row, only to win by forfeit, we should update towards suspecting that royf miscounted the evidence, since there should not have been much strong evidence predicting a win by such a weak fencer. We should believe that the correct prior probability of katydee winning (from royf's point of view) is lower than royf thought it was.
↑ comment by royf · 2013-01-24T02:52:33.745Z · LW(p) · GW(p)
we've already seen [...] or [...] in advance
Does this answer your question?
↑ comment by CronoDAS · 2013-01-24T04:06:17.322Z · LW(p) · GW(p)
Not really.
Let me elaborate:
In a book of his, Daniel Dennett appropriates the word "actualism" to mean "the belief that only things that have actually happened, or will happen, are possible." In other words, all statements that are false are not only false, but also impossible: If the coin flip comes up heads, it was never possible for the coin flip to have come up tails. He considers this rather silly, says there are good reasons for dismissing it that aren't relevant to the current discussion, and proceeds as though the matter is solved. This strikes me as one of those philosophical positions that seem obviously absurd but very difficult to refute in practice. (It also strikes me as splitting hairs over words, so maybe it's just a wrong question in the first place?)
↑ comment by simplicio · 2013-01-24T23:54:18.141Z · LW(p) · GW(p)
In a book of his, Daniel Dennett appropriates the word "actualism" to mean "the belief that only things that have actually happened, or will happen, are possible." In other words, all statements that are false are not only false, but also impossible: If the coin flip comes up heads, it was never possible for the coin flip to have come up tails.
Taboo "possible." My take: in the absence of real physical indeterminism (which I doubt exists), "possible" is basically an epistemic term meaning "my model does not rule this out." So actualism is wrong, on my view, because it projects the limitations of my mind onto the future causal evolution of the universe.
↑ comment by whowhowho · 2013-01-25T03:10:42.450Z · LW(p) · GW(p)
My take: in the absence of real physical indeterminism (which I doubt exists), "possible" is basically an epistemic term meaning "my model does not rule this out."
You may doubt that real physical indeterminism exists; others do not. The problem is that communication hinges on shared meanings, so if you change your meanings to reflect beliefs you have and others don't, confusion may ensue.
↑ comment by simplicio · 2013-01-25T04:01:31.649Z · LW(p) · GW(p)
True; however, even granting physical indeterminism, in most cases we can say that what possibility there is is epistemic. For example, whether katydee wins her fencing match probably does not depend closely on the result of some quantum event. (Although there is an interesting resonance between the probability she assigns to winning, and her actual likelihood of winning - but that's a whole other kettle of worms.)
↑ comment by Rob Bensinger (RobbBB) · 2013-01-25T01:50:59.844Z · LW(p) · GW(p)
On LessWrong, we generally use 'possible,' 'necessary,' 'probable,' etc. epistemically. Epistemic actualism, the doctrine that all events that occur have epistemic probability 1 (or approaching 1), is clearly absurd, since it requires that I be mistaken about nothing, and have perfect epistemic access to all facts in the universe. (But, of course, by 'actualism' no one ever means 'epistemic actualism'.)
On the other hand, metaphysical actualism seems quite reasonable; indeed, the metaphysical non-actualist has a lot of ground to cover in establishing what s/he even means by 'metaphysically non-actual events'. Are non-actual 'worlds' abstract, for instance? Concrete? Neither? Both? Actually existent as non-actuals? Actually non-existent as non-actuals? Meinongian? And how do we gain any epistemic access to these mysterious possibilia? Even if you aren't a Lewisian modal realist, asserting anything but actualism (or, equivalently, necessitarianism) with respect to metaphysical modality seems... spooky.
↑ comment by whowhowho · 2013-01-25T02:33:22.125Z · LW(p) · GW(p)
We can epistemically access possible but non actual worlds by noting that they are not against known laws of nature...what is not impossible is possible.
↑ comment by Rob Bensinger (RobbBB) · 2013-01-25T03:32:00.691Z · LW(p) · GW(p)
We can epistemically access possible but non actual worlds by noting that they are not against known laws of nature
There are two options for what you're trying to do here:
(1) You're trying to analyze away metaphysical-possibilityspeak in terms of metaphysical-lawspeak. I.e., there's nothing we could discover or learn that would disassociate these two concepts; one is simply a definitional analysis of the other. In which case, we can simply discard the idea of metaphysical possibility, to avoid miscommunication (since most people do not understand it in this way), and speak only of the laws of nature.
(2) You're leaving the concepts distinct, but explaining that it just is the case that 'what is lawful is possible, and what is "against the (natural) law" is impossible', even though this is not synonymous with saying 'what is possible is possible, and what is impossible is impossible'. That is, this is a substantive metaphysical thesis.
If you mean to be asserting (2), the metaphysical rather than semantic thesis (i.e., the non-trivial and interesting one), then I ask: What is your basis for this claim? What is your prior grasp on metaphysical possibility, such that you can be confident of its relationship to natural law? Are the laws of nature themselves contingent, or necessary? What evidence could we use to decide the matter one way or the other?
↑ comment by whowhowho · 2013-01-25T17:50:13.289Z · LW(p) · GW(p)
we can simply discard the idea of metaphysical possibility, to avoid miscommunication (since most people do not understand it in this way),
Because most people do understand it epistemically/subjectively? I think there are many kinds of possibility and many kinds of laws for the different kinds of possibility, and we make judgements about possibility based on laws. The nomologically possible is that which is allowed by the laws of nature, the logically possible is that which is not contradictory, i.e. which follows the law of non-contradiction, and the epistemically possible is that which does not contradict anything I already know. So I think the kinds of possibility have a family resemblance, and there is no issue of discarding the other kinds in favour of epistemic possibility. (I am, however, happy to deflate a "possible world" into a "hypothetical state of affairs that is allowed by such-and-such laws".)
↑ comment by Rob Bensinger (RobbBB) · 2013-01-25T19:44:05.268Z · LW(p) · GW(p)
Because most people do understand it epistemically/subjectively?
No. Most English language speakers use modal terms both epistemically and metaphysically. My point was that most people, both lay- and academic, do not use 'p is (metaphysically) possible' to mean 'p is not ruled out by the laws of physics'. If they did, then they wouldn't understand anthropic arguments that presuppose the contingency of the physical laws themselves.
I think there are many kinds of possibility and many kinds of laws
Then I don't know what claim you're making anymore. Taboo 'law'; what is it you're actually including in this 'law' category, potentially?
I think the kinds of possibility have a family resemblance, and there is no issue of discarding the other kinds in favour of epistemic possibility.
But you still haven't explained what a 'merely possible' thing is. If logical and nomological possibility are metaphysical, then you owe us an account of what kinds of beings or thingies these possibilia are. On the other hand, if you reduce logical and nomological possibility to epistemic possibility -- logical necessity is what I can infer from a certain set of logical axioms alone, logical possibility is what I can't infer the negation of from some set of axioms, nomological necessity is what I know given only a certain set of 'natural laws' -- then we collapse everything into the epistemic, and no longer owe any account of mysterious 'possible worlds' floating out there in the aether.
↑ comment by whowhowho · 2013-01-25T20:16:27.963Z · LW(p) · GW(p)
do not use 'p is (metaphysically) possible' to mean 'p is not ruled out by the laws of physics'. If they did, then they wouldn't understand anthropic arguments that presuppose the contingency of the physical laws themselves.
If that is meant to indicate there is some specific sense of possible that is used instead, I doubt that. Consider the following:
A: "Are perpertual motions machines possible?"
B: "I don;t see why not"
A: "Ah, but theyre against the laws of thermodynamics "
B: "Ok, they.re impossible".
A: "But could the laws of phsyics have been different..?"
B: "I suppse so. I don't know what makes them thew way they are".
AFAICS, B has gone through as many of 3 different notions of possibility there.
But you still haven't explained what a 'merely possible' thing is.
I don't think there is "mere" possibility, if it means subtracting the X from "something is X-ly possible if it is allowed by X-ical laws".
If logical and nomological possibility are metaphysical, then you owe us an account of what kinds of beings or thingies these possibilia are. On the other hand, if you reduce logical and nomological possibility to epistemic possibility
What they are would depend on the value of X. Family resemblance.
↑ comment by royf · 2013-01-24T13:06:15.613Z · LW(p) · GW(p)
This is perhaps not the best description of actualism, but I see your point. Actualists would disagree with this part of my comment:
If I believed that "you will win" (no probability qualifier), then in the many universes where you didn't I'm in Bayes Hell.
on the grounds that those other universes don't exist.
But that was just a figure of speech. I don't actually need those other universes to argue against 0 and 1 as probabilities. And if Frequentists disbelieve in that, there's no place in Bayes Heaven for them.
↑ comment by MugaSofer · 2013-01-25T11:01:33.225Z · LW(p) · GW(p)
Well, assuming a strict definition of "possible", it's just determinism; if God's playing dice then "actualism" is false, and if he's not then it's true.
Assuming a useful definition of possible, then it's trivially false.
Looks like yet another argument over definitions.
comment by Qiaochu_Yuan · 2013-01-24T00:18:41.849Z · LW(p) · GW(p)
I'm confused about what your goal is in making the kind of predictions you're talking about. If, for example, you're betting money on the outcome of your upcoming fencing bout, your estimate of how likely you are to win should include all hypotheses about possible paths to victory, and any of those paths should qualify as a validation of your prediction. If the decision you want to make is about something other than the outcome of the bout, e.g. how well you fence, then you should make a prediction about that instead.
In other words, the fencing bout looks like a toy example and I don't understand what non-toy examples it's supposed to be illustrating.
↑ comment by Will_Newsome · 2013-01-24T02:03:35.848Z · LW(p) · GW(p)
The placebo effect is perhaps exemplary? Many interventions do work, but their mechanism isn't the expected one, which can lead to poor decisions.
comment by wedrifid · 2013-01-24T09:59:56.306Z · LW(p) · GW(p)
In order to make better predictions, we must cast out those predictions that are right for the wrong reasons. While it may be tempting to award such efforts partial credit, this flies against the spirit of the truth.
Are you going to reward me for being wrong for the right reasons? If not, I want to know who is skimming 'credit' off the top.
↑ comment by katydee · 2013-01-24T10:45:32.310Z · LW(p) · GW(p)
Are you going to reward me for being wrong for the right reasons?
It's very difficult for humans to actually isolate such cases, but in principle yes. After all, one in twenty predictions that you make with 95% confidence should turn out to be wrong. Just as lucking into accuracy doesn't mean you're right, lucking into inaccuracy doesn't mean you're wrong.
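A quick editorial simulation of the "one in twenty" point: even a perfectly calibrated forecaster misses about 5% of their 95%-confidence predictions, so a single miss is weak evidence of a bad prediction process. The numbers below are illustrative assumptions.

```python
import random

random.seed(0)
N = 10_000                                               # hypothetical 95%-confidence predictions
misses = sum(random.random() > 0.95 for _ in range(N))   # each comes true with p = 0.95
print(f"{misses} wrong out of {N} ({misses / N:.1%})")   # roughly 5%
```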
↑ comment by epigeios · 2013-02-07T03:22:39.110Z · LW(p) · GW(p)
You're the one skimming credit off the top.
My interpretation of this point is that the person doing the rewarding and punishing is the person doing the predicting.
This hints at the deeper problem, too: that the subconscious reinforcement of these predictions is causing them to continue. The most common reward is that a wrong prediction seems right. For most people, that is a reward in and of itself.
So the real question is: are you going to reward yourself for being wrong for the right reasons? How about being right for the wrong reasons?
comment by Omegaile · 2013-01-24T07:42:04.573Z · LW(p) · GW(p)
Let's consider an abstract version of this:
There are 2 unfair coins. One has P(heads)=1/3 and the other P(heads)=2/3. I take one of them, flip it twice, and it comes up heads twice. Now I believe that the coin chosen was the one with P(heads)=2/3. In fact, there is a 4/5 probability that this is so. I also believe that flipping again will turn up heads again, mostly because I think that I chose the 2/3-heads coin (p=8/15). I also admit the possibility of getting heads but being wrong about the chosen coin, but this is much less likely (p=1/15). So I bet on heads. I flip it again and it turns up heads. I was right. But it turns out that the coin was the other one, the one with P(heads)=1/3 (which I found out after a few hundred flips). Would you say I was right for the wrong reasons? Well, I was certainly surprised to find out I had the wrong coin. Does this apply to the Gettier problem?
Let's go back to the original problem to see that this abstraction is similar. Smith believes "the person who will get the job has ten coins in his pocket". And he does so mostly because he thinks Jones will get it and has ten coins. But if he is reasonable, he will also admit the possibility of his getting the job himself and also having ten coins, although with lower probability.
My point here is: at what probability does the Gettier problem arise? Would it arise if, in the coin problem, P(heads) were different?
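An editorial check of the arithmetic in the comment above, using exact fractions and assuming the coin is picked uniformly at random:

```python
from fractions import Fraction as F

p_good, p_bad = F(2, 3), F(1, 3)     # per-flip P(heads) for the two coins
prior = F(1, 2)                      # each coin equally likely to be picked

# Posterior that the chosen coin is the 2/3-heads one, after seeing two heads:
post_good = prior * p_good**2 / (prior * p_good**2 + prior * p_bad**2)
print(post_good)                                     # 4/5

# Joint probabilities for the next flip coming up heads:
print(post_good * p_good)                            # with the 2/3 coin: 8/15
print((1 - post_good) * p_bad)                       # with the 1/3 coin: 1/15
print(post_good * p_good + (1 - post_good) * p_bad)  # P(heads) overall: 3/5
```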
↑ comment by mfb · 2013-01-25T18:09:08.382Z · LW(p) · GW(p)
I think it arises at the point where you did not even consider the alternative. This is a very subjective thing, of course.
If the probability of the actual outcome was really negligible (with a perfect evaluation by the prediction-maker), this should not influence the evaluation of predictions in a significant way. If the probability was significant, it is likely that the prediction-maker considered it. If not, count it as false.
comment by Vladimir_Nesov · 2013-01-24T01:58:03.960Z · LW(p) · GW(p)
A simpler maxim is to pay attention to (fixing) cognitive errors at all times, without excuses. Correctness of a prediction is potentially useful data, but it's also an excuse for overlooking flaws in the prediction procedure.
Fixing a flaw in a procedure (behavior, skill, intuition) is a task that's separate from updating the prediction. Predictions need updating simply as a matter of recycling cached thoughts, which might be useful even if you are not fixing any particular reasoning error.
There is also a more subtle kind of cognitive error, where you can correctly solve a problem using the right systematic/conscious/formal method, but your intuition (maybe just one of the relevant intuitive mental models) is out of tune with this process. It's useful to have at your disposal a somewhat reliable intuitive guidance in thinking about a problem (it can be crucial in forming a plan for solving it), so when a discrepancy appears, it means that there is a bug either in the intuition or in the more formal procedure, which motivates looking into the discrepancy in more detail, and fixing the bug.
↑ comment by ygert · 2013-01-24T16:18:49.915Z · LW(p) · GW(p)
A simpler maxim is to pay attention to (fixing) cognitive errors at all times, without excuses. Correctness of a prediction is potentially useful data, but it's also an excuse for overlooking flaws in the prediction procedure.
But what are the (potential) flaws in the prediction procedure? The only way to figure that out is to see which cognitive behaviors lead to accuracy, and which lead to error. It is all very well to say that we should not perform cognitive errors, but that does not help us with the problem, because what a cognitive error is, is defined by it leading us away from the truth.
comment by A1987dM (army1987) · 2013-01-24T15:39:46.729Z · LW(p) · GW(p)
(The first thing I thought when I saw the title of this post in the sidebar, before even following the link, was “Gettier, schmettier, won't you people taboo ‘know’ already?” But anyway...)
(Gettier does not explain this behavior)
I assume he was going to write that Smith believed that he would get the job himself and had counted the coins in his own pocket but Jones got the job instead and happened to also have ten coins, and then some massive brain fart occurred.
comment by grouchymusicologist · 2013-01-24T05:21:14.494Z · LW(p) · GW(p)
Thanks for this post. Whatever problems the JTB definition of knowledge may have—the most obvious one of those to LWers probably being the treatment of "knowledge" as a binary condition—the Gettier problem has always struck me as being a truly ridiculous critique, for just the reasons you put forward here.
comment by Manfred · 2013-01-24T17:18:28.619Z · LW(p) · GW(p)
Seeing something like someone with 14 points lose on the last point should actually drive your probability closer to 1/2 of winning a 1v1. That is, you can't just discount it. You just have to take into account all the stuff you learned when, er, learning things.
comment by khafra · 2013-01-31T20:04:35.516Z · LW(p) · GW(p)
Events are subsets of outcome space, not just single outcomes. If your prediction is merely "I will be declared the winner of this fencing match," and space aliens mind-control the judges so that they declare you the winner before you even arrive, you were correct, because that outcome is within the subset you predicted. If you didn't predict a specific mechanism, go ahead and laud yourself for predicting the correct event.
However, my own predictions (and, I suspect, those of most humans) are usually of a different character. They include some causal narrative. I think the tricky part is either removing that narrative from your prediction, or explicitly including what does and does not fall inside that narrative as separate predictions.
comment by Vaniver · 2013-01-25T02:46:36.867Z · LW(p) · GW(p)
I strongly recommend taking a look at Epistemology and the Psychology of Human Judgment by Bishop and Trout (badger's review).
comment by AlexSchell · 2013-02-10T06:06:50.817Z · LW(p) · GW(p)
Here's a simple analysis of what's going on when we feel a prediction is right for the wrong reasons:
One function of the habit of making firm predictions is to test our models of the world and be able to give others strong evidence about how good our models are. What's happening when we are right for the wrong reasons is that our model is confirmed by the explicitly predicted outcome but then disconfirmed by some detail that wasn't described in the prediction. The subsequent disconfirmation might be strong enough to cancel out the initial confirmation, or stronger. Even if it only partially cancels out the initial confirmation, focusing only on the evidence given by the explicitly predicted outcome exaggerates the extent to which our model is confirmed by the evidence.
Your fencing example: based solely on disagreement about the fencers' relative ability, two predictors make different predictions about whether I win or my opponent wins. I lose 14 touches, but my opponent is disqualified, etc. "My" predictor collects his money, but his model is not clearly vindicated. If we focus only on the explicitly predicted event, his model assigned it a greater probability than the other predictor's model, and so is confirmed by the explicitly predicted outcome. But conditional on me winning, "my" predictor's model assigns a (much?) lower probability (compared to the other predictor's model) to the total evidence that I won in the stated circumstances. So the rest of the total evidence disconfirms "my" predictor's model, likely strongly enough that his model is disconfirmed on net.
The problem seems to be that coarse-grained predictions only provide coarse-grained information about the accuracy of our models of the world. Since we (qua rationalists) are interested in predictions because they are potentially strong evidence about our models, we should try to make our explicit predictions more fine-grained, as you recommend (while considering the obvious costs).
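An editorial numerical illustration of the point above; all of the probabilities are invented, and only the direction of the comparison matters.

```python
# Two hypothetical models of the bout: "my_model" thinks I'm the stronger fencer.
p_win = {"my_model": 0.8, "rival_model": 0.3}              # P(I win the bout)

# Conditional on my winning at all, how likely is winning *this* way --
# losing 14 touches and then having the opponent disqualified? The model
# that thinks I'm weak puts far more of its win-probability on flukes.
p_fluke_given_win = {"my_model": 0.01, "rival_model": 0.30}

for name in p_win:
    headline = p_win[name]                          # evidence counted: "I won"
    total = p_win[name] * p_fluke_given_win[name]   # evidence counted: how I won
    print(f"{name}: P(win) = {headline:.2f}, P(win in this exact way) = {total:.3f}")

# The headline outcome favours my_model (0.8 vs 0.3); the total evidence favours
# rival_model (0.090 vs 0.008), so my_model is disconfirmed on net.
```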
comment by bryjnar · 2013-01-24T03:05:09.764Z · LW(p) · GW(p)
I agree that "right for the wrong reasons" is an indictment of your epsitemic process: it says that you made a prediction that turned out correctly, but that actually you just got lucky. What is important for making future predictions is being able to pick the option that is most likely, since "being lucky" is not a repeatable strategy.
The moral for making better decisions is that we should not praise people who predict prima facie unlikely outcomes -- without presenting a strong rationale for doing so -- but who then happen to be correct. Amongst those who have made unusual but successful predictions we have to distinguish people who are reliably capable of insight from those who were just lucky. Pick your contrarians carefully.
There's a more complex case where your predictions are made for the "wrong" reasons, but they are still reliably correct. Say you have a disorder that makes you feel nauseous in proportion to the unlikeliness of an option, and you habitually avoid options that make you nauseous. In that case, it seems more that you've hit upon a useful heuristic than anything else. Gettier cases aren't really like this, because they are usually more about luck than about reliable heuristics that aren't explicitly "rational".
comment by ksagan · 2014-05-08T17:59:00.199Z · LW(p) · GW(p)
I've been posting this around a lot lately (including on "Gettier in Zombie World"), still looking for a solid response.
I think Bayesian probability actually resolves Gettier problems, completely and (ironically, because Bayesian probability doesn't concern itself with this in the slightest) satisfyingly. Understanding that we only know likelihoods, not facts, is enough.
Situation: I know Jones had 10 coins in his pocket. I think he got the job. I don't know that Smith had 10 coins in his pocket. Do I "know" that the person who got the job had 10 coins in their pocket?
Classic Gettier Interpretation:
- Belief: Jones had 10 coins in his pocket
- Belief: Jones got the job
- Conclusion: The person who got the job had 10 coins in their pocket
Bayesian Gettier Interpretation (example numbers used for ease of intuition; minimal significant digits used for ease of calculation):
- Belief: It's likely (90%) Jones got the job
- Belief: It's likely (90%) Jones had 10 coins in his pocket
- Conclusion: It's likely (81%) Jones got the job with 10 coins in his pocket
...and...
- Belief: It's unlikely (10%) someone other than Jones got the job
- Belief: It's possible (50%) a given person had 10 coins in their pocket
- Conclusion: It's unlikely (5%) that someone other than Jones got the job with 10 coins in their pocket
...thus...
- Belief: It's likely (81%) Jones got the job with 10 coins in his pocket
- Belief: It's unlikely (5%) that someone other than Jones got the job with 10 coins in their pocket
- Conclusion: It's likely (86%) that someone got the job with 10 coins in their pocket
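An editorial check of the example numbers above, with the same implicit independence assumptions the comment makes:

```python
p_jones_got_job = 0.9        # belief: it's likely Jones got the job
p_jones_ten_coins = 0.9      # belief: it's likely Jones had 10 coins
p_other_ten_coins = 0.5      # belief: a given other person has 10 coins

p_jones_case = p_jones_got_job * p_jones_ten_coins          # Jones got it and had 10 coins
p_other_case = (1 - p_jones_got_job) * p_other_ten_coins    # someone else did and had 10 coins

print(f"{p_jones_case:.2f}  {p_other_case:.2f}  {p_jones_case + p_other_case:.2f}")
# 0.81  0.05  0.86 -- so "the person who got the job had 10 coins" is likely
```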
comment by Ichoran · 2013-02-13T21:03:02.508Z · LW(p) · GW(p)
The appropriate thing to do is apply (an estimate of) Bayes rule. You don't need to try to specify every possible outcome in advance; that is hopeless and a waste of effort. Rather, you extract the information that you got about what happened to create an improved prediction of what would have happened, and assign credit appropriately.
First, let's look at what we're trying to do. If you're trying to make good predictions, you want p(X | "X") to be as close to 1 as possible, where X is what happens, and "X" is what you say will happen.
If an unbiased observer initially would have predicted, say, p(you win at fencing) = 0.5, then initially the estimate of your accuracy for that statement would be 0.5; and after winning 14 touches in a row it would probably be somewhere around 0.999, which is nearly as good as it having been true (unless your accuracy is already in the 99.9%+ range, at which point this doesn't help refine the estimate of your accuracy).
So, you don't need to ask more precise questions. You do need to honestly evaluate in aborted trials whether there were dramatic shifts in the apparent probability of the outcome. When doing these things in real life, actually going through Bayesian mathematics is probably not worthwhile, but keeping the gist of it certainly is.
comment by DavidTC · 2013-02-01T20:27:18.467Z · LW(p) · GW(p)
My prediction about the outcome of the fencing bout could better be expressed as multiple predictions, for instance "I will score more points than my opponent" and "I will win the bout."
What if lightning strikes the building a day earlier and the match is called off? Or if you get arrested on the way to the match and thus forfeit by failing to show up? Or if your opponent forfeits, which makes the first prediction wrong?
In those circumstances, there's no reason to modify your prediction for the next time. While they were wrong, none of the wrongness had anything to do with your fencing skill.
A better set of statements might be "I will not score fewer points than my opponent" and "I will not lose the bout on points," although those logically reduce to almost the same prediction.
In fact, that seems like a reasonable general principle. Predictions often should state what won't happen rather than what will. Stating what won't happen allows you to automatically exclude many random outliers that are outside the scope of the prediction you're trying to make.
If you state the prediction in such a way that an 'act of God' means you are right, not wrong, then when one occurs you don't have to try to justify why your wrongness doesn't really count. Granted, it doesn't mean you were right, either, but, mentally, it's probably safer to say 'I technically got that right, but it doesn't really count' than 'I technically got that wrong, but it doesn't really count.'
comment by KnaveOfAllTrades · 2013-01-25T20:32:58.004Z · LW(p) · GW(p)
Upvoted. This article makes an extremely important point.
I use correct/incorrect to refer to 'prediction and outcome coincided'/'made what turned out to be the most favourable choice'/etc.
I use right/wrong to refer to 'made the best prediction'/'chose the outcome that, as far as could be told, would be most favourable'/etc.
Smith was correct and wrong.
Although one might expect ambiguity to be a problem with these terms (e.g. 'right' becoming overloaded to the point of equivocation), in my experience it hasn't been one once they have been explained.
The thesis of the right/correct distinction is defying the data.
The antithesis is regret of rationality, i.e. predictably losing due to a flaw in a model. This is a hazard that arises from devotion to a theory or from undervaluing data, which leads to insisting one is still right even as the defeats pile up.
As for the No True Fencing Victory thingy: that's simply insufficient correspondence between one's internal understanding and the prediction one actually specifies (e.g. one visualises winning in a fifteen-point whitewash, specifies only 'winning', and is saved from the jaws of defeat by a technicality). Such cases are instances of being too imprecise or inaccurate. I generally lean very heavily towards 'a win is a win', because anything else often seems to stem from an unrealistic expectation of perfect specification.
comment by Armok_GoB · 2013-01-24T00:25:38.051Z · LW(p) · GW(p)
To me it seems the problem here is simply trying to treat natural language sentences as real things when they are only an approximate abstraction that breaks down in these kinds of edge cases.
There are no discrete "beliefs" with "justifications"; there is only a probability distribution over the configuration space of all possible histories of sensory input. And that's just another layer of abstraction really, but it's enough for now.
↑ comment by Peterdjones · 2013-01-24T00:44:19.446Z · LW(p) · GW(p)
How does a belief in invisible pixies relate to possible experience?
↑ comment by dspeyer · 2013-01-24T01:06:04.758Z · LW(p) · GW(p)
If you are picnicking in the woods and a loaf of bread you're looking at just vanishes, you have possibly experienced theft by an invisible pixie.
↑ comment by Peterdjones · 2013-01-24T01:09:37.458Z · LW(p) · GW(p)
People can believe in unempirical things. That's a fact -- an empirical one, if you like. The standard theory that treats beliefs as linguistic, sentence-like things can handle such beliefs. The question is whether the Possible Experience theory can.
↑ comment by dspeyer · 2013-01-24T04:48:53.694Z · LW(p) · GW(p)
Ah, I think I see what you're saying. We're getting tangled up about the meaning of the word "belief" (also about faelore, but we'll ignore that for now).
If someone speaks of a flour-permeable dragon, would you say they "believe" the dragon exists? I think most people here wouldn't. But it's just a disagreement about a word.
↑ comment by CronoDAS · 2013-01-24T02:13:07.516Z · LW(p) · GW(p)
People can believe in unempirical things. That's a fact -- an empirical one, if you like. The standard theory that treats beliefs as linguistic, sentence-like things can handle such beliefs.
People can get wrong answers when they do math problems, but that doesn't make two plus two equal five.
↑ comment by Peterdjones · 2013-01-24T02:14:50.336Z · LW(p) · GW(p)
What is that relevant to?
↑ comment by CronoDAS · 2013-01-24T03:04:44.550Z · LW(p) · GW(p)
Sorry, I should have made my metaphor clearer. According to what you've termed Possible Experience theory, if a person claims to have a "belief" about something unempirical, they're wrong. A theory that explains why people sometimes say "two plus two equals five" is a theory of psychology, not a theory of arithmetic; similarly, a theory that explains why people sometimes say things like "there is an invisible dragon in my garage" is a theory of psychology and not a theory of epistemology.
If you can think of a better word than "belief" for me to use when I mean something like "portion of a human's model of the universe", fine, we can use that word instead, but I'd rather not argue about the meanings of words right now.
↑ comment by Peterdjones · 2013-01-24T03:26:49.866Z · LW(p) · GW(p)
if a person claims to have a "belief" about something unempirical, they're wrong.
Why is that relevant? The question is how to explain belief:
"To me it seem the problem here is simply trying to treat natural language sentences as real things when they are only an approximate abstraction, that breaks down in these kinds of edge cases.
There are no discrete "belief's" with "justifications", there are only a probability distribution over the configuration space of all possible histories of sensory input. And that's just another layer of abstraction really, but it's enough for now."
Wrong beliefs exist.
A theory that explains why people sometimes say "two plus two equals five" is a theory of psychology
exactly.
similarly, a theory that explains why people sometimes say things like "there is an invisible dragon in my garage" is a theory of psychology and not a theory of epistemology.
Epistemology can't ignore falsehood.
comment by Sengachi · 2013-01-24T04:36:27.756Z · LW(p) · GW(p)
Here is what I think is a better example of the Gettier problem, and a subsequent reason the Gettier problem is flawed in its definitions of truths.
You are driving down the highway, passing what appear to be several dozen barns. Unknown to you, all but one of these barns is a stage-prop cutout. You decide to stop at one of these barns, and by luck it is the only real one. You now have a belief (that the barns you see are real), which is justified and, in this case, true. But it cannot be called knowledge. Why? Because the belief is imprecise and leaves room for vagaries. A belief should describe the fundamental mechanisms of the universe, i.e. the presence of light patterns in format X indicates structure Y, because light interacts in ways Z. In this case the belief about the barns is unjustified and untrue, because there is an additional way format X could be created: by structure Y2 and light interaction Z2 (the cutout). Discovery of the real barn is only weak evidence for the belief that format X indicates a real barn, as the discovery proves the possibility thereof but does not eliminate the alternative (cutouts). Under this new definition of belief -- a conception of the universe's fundamental mechanisms, as opposed to informal correlations -- only accurate and precise beliefs that allow prediction generation constitute knowledge.
But this is not how we think, and for very good reason. Typically, in a scenario in which all options appear identical to cursory examination and detailed examination provides some conclusion about one option, it can be a huge waste of time and effort to generate all theoretically possible contradictory scenarios and test them, not to mention that you may not think of, or be able to test, all such options. So our brain takes a mental shortcut. Barns appear to be the same? Check. Barn 10 is a 3D barn? Check. Therefore all barns are 3D. Though nothing was falsified, it is a useful informal deduction which only fails us in extreme circumstances such as the problem described above. But there is a very good reason that we don't use such logic in scientific experimentation. When we have not repeatedly experienced a phenomenon and have no hard-set reason to believe a correlation indicates causation, falsification is all we can trust. We don't have the huge backdrop of everyday data to fall back upon. Oftentimes we have a hard time realizing this, though, and make assumptions as if we had such a backdrop when we don't.
Scientific Method: Don't do that.