## Posts

Polarization is Not (Standard) Bayesian 2023-09-16T16:31:14.227Z
ChatGPT challenges the case for human irrationality 2023-08-22T12:46:22.085Z
Rationalization Maximizes Expected Value 2023-07-30T20:11:26.377Z
Is the Endowment Effect Due to Incomparability? 2023-07-10T16:26:06.508Z
A Subtle Selection Effect in Overconfidence Studies 2023-07-03T14:43:11.287Z
The Case for Overconfidence is Overstated 2023-06-28T17:21:06.160Z

Comment by Kevin Dorst on Polarization is Not (Standard) Bayesian · 2023-09-18T12:33:56.231Z · LW · GW

Nope, it's the same thing!  Had meant to link to that post but forgot to when cross-posting quickly.  Thanks for pointing that out—will add a link.

Comment by Kevin Dorst on Polarization is Not (Standard) Bayesian · 2023-09-18T12:33:00.413Z · LW · GW

I agree you could imagine someone who didn't know the factions' positions.  But of course any real-world person who's about to become politically opinionated DOES know the factions' positions.

More generally, the proof is valid in the sense that if P1 and P2 are true (and the person's degrees of belief are representable by a probability function), then Martingale fails.  So you'd have to somehow say how adding that factor would lead one of P1 or P2 to be false.  (I think if you were to press on this you should say P1 fails, since not knowing what the positions are still lets you know that people's opinions (whatever they are) are correlated.)

Comment by Kevin Dorst on ChatGPT challenges the case for human irrationality · 2023-08-24T23:21:13.146Z · LW · GW

Nice point! Thanks.  Hadn't thought about that properly, so let's see.  Three relevant thoughts:

1) For any probabilistic but non-omniscient agent, you can design tests on which it's poorly calibrated.  (Let its probability function be P, and let W = {q: P(q) > 0.5 & ¬q} be the set of things it's more than 50% confident in that are false.  If your test is {{q,¬q}: q ∈ W}, then the agent will have probability above 50% in all its answers, but its hit rate will be 0%.)  So it doesn't really make sense to say that a system is calibrated or not FULL STOP, but rather that it is (or is not) on a given set of questions.

What they showed in that document is that for the target test, calibration gets worse after RLHF, but that doesn't imply that calibration is worse on other questions.  So I think we should have some caution in generalizing.
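The construction in point (1) is easy to see in simulation. Here's a minimal sketch (my own toy numbers, not from the comment): an agent whose credences track the truth reasonably well overall can still be made to look maximally miscalibrated by testing it only on the propositions in W.

```python
import random

# Toy illustration: an agent with truth-tracking credences looks well
# calibrated on a random test, but 0% accurate on an adversarial one.
random.seed(0)

# Simulate 1000 propositions; credence is noisy but correlated with truth.
questions = []
for _ in range(1000):
    truth = random.random() < 0.5
    credence = min(max(random.gauss(0.7 if truth else 0.3, 0.15), 0.01), 0.99)
    questions.append((credence, truth))

def hit_rate(qs):
    """Among propositions the agent is >50% confident in, how many are true?"""
    confident = [(p, t) for p, t in qs if p > 0.5]
    return sum(t for _, t in confident) / len(confident)

# On the full set, the agent's >50% answers are mostly true:
print(f"hit rate on all questions: {hit_rate(questions):.2f}")

# Adversarial test W: only propositions it's >50% confident in but false.
W = [(p, t) for p, t in questions if p > 0.5 and not t]
print(f"hit rate on W: {hit_rate(W):.2f}")  # 0.00 by construction
```

The same agent, the same credences — only the question set changed, which is the point: calibration is a property of an agent *relative to a test*.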

2) If I'm reading it right, it looks like on the exact same test, RLHF significantly improved GPT4's accuracy (Figure 7, just above).  So that complicates that "merely introducing human biases" interpretation.

3) Presumably GPT4 after RLHF is a more useful system than GPT4 without it, otherwise they would have released a different version.  That's consistent with the picture that lots of fallacies (like the conjunction fallacy) arise out of useful and efficient ways of communicating (I'm thinking of Gricean/pragmatic explanations of the CF).

Comment by Kevin Dorst on ChatGPT challenges the case for human irrationality · 2023-08-24T22:58:40.695Z · LW · GW

How does that argument go?  The same is true of a person doing (say) the cognitive reflection task.

"A bat and a ball together cost \$1.10; the bat costs \$1 more than the ball; how much does the ball cost?"

Standard answer: "\$0.10".  But also standardly, if you say "That's not correct", the person will quickly realize their mistake.
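For completeness, the algebra behind the correct answer can be spelled out in two lines (plain Python, just illustrating the arithmetic): if the ball costs b, the bat costs b + 1.00, and together they cost 1.10, so 2b + 1.00 = 1.10.

```python
# Bat-and-ball problem: ball costs b, bat costs b + 1.00, total is 1.10.
# Solving 2b + 1.00 = 1.10 gives b = 0.05, not the intuitive 0.10.
b = (1.10 - 1.00) / 2
print(f"ball: ${b:.2f}, bat: ${b + 1.00:.2f}")  # ball: $0.05, bat: $1.05
```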

Comment by Kevin Dorst on ChatGPT challenges the case for human irrationality · 2023-08-24T22:56:51.068Z · LW · GW

Hm, I'm not sure I follow how this is an objection to the quoted text.  Agreed, it'll use bits of the context to modify its predictions. But when the context is minimal (as it was in all of my prompts, and in many other examples where it's smart), it clearly has a default, and the question is what we can learn from that default.

Clearly that default behaves as if it is much smarter and clearer than the median internet user. Ask it to draw a tikz diagram, and it'll perform better than 99% of humans. Ask it about the Linda problem, and it'll perform the conjunction fallacy. I was arguing that that is mildly surprising, if you think that the conjunction fallacy is something that 80% of humans get "wrong" (and, remember, 20% get "right").

Where does the fact that it can be primed to speak differently disrupt that reasoning?

Comment by Kevin Dorst on ChatGPT challenges the case for human irrationality · 2023-08-24T22:43:41.633Z · LW · GW

Thanks for the thoughtful reply! Two points.

1) First, I don't think anything you've said is a critique of the "cautious conclusion", which is that the appearance of the conjunction fallacy (etc) is not good evidence that the underlying process is a probabilistic one.  That's still interesting, I'd say, since most JDM psychologists circa 1990 would've confidently told you that the conjunction fallacy + gambler's fallacy + belief inertia show that the brain doesn't work probabilistically. Since a vocal plurality of cognitive scientists now think they're wrong, this is still an argument for the latter, "resource-rational" folks.

Am I missing something, or do you agree that your points don't speak against the "cautious conclusion"?

2) Second, I of course agree that "it's just a text-predictor" is one interpretation of ChatGPT.  But of course it's not the only interpretation, nor the most exciting one that lots of people are talking about.  Obviously it was optimized for next-word prediction; what's exciting about it is that it SEEMS like by doing so, it managed to display a bunch of emergent behavior.

For example, if you had asked people 10 years ago whether a neural net optimized for next-word prediction would ace the LSAT, I bet most people would've said "no" (since most people don't).  If you had asked people whether it would perform the conjunction fallacy, I'd guess most people would say "yes" (since most people do).

Now tell that past-person that it DOES ace the LSAT.  They'll find this surprising.  Ask them how confident they are that it performs the conjunction fallacy.  I'm guessing they'll be unsure. After all, one natural theory of why it aces the LSAT is that it gets smart and somehow picks up on the examples of correct answers in its training set, ignoring/swamping the incorrect ones.  But, of course, it ALSO has plenty of examples of the "correct" answer to the conjunction fallacy in its dataset.  So if indeed "bank teller" is the correct answer to the Linda problem in the same sense that "Answer B" is the correct answer to LSAT question 34, then why is it picking up on the latter but not the former?

I obviously agree that none of this is definitive.  But I do think that insofar as your theory of GPT4 is that it exhibits emergent intelligence, you owe us some explanation for why it seems to treat correct-LSAT-answers differently from "correct"-Linda-problem-answers.

Comment by Kevin Dorst on Rationalization Maximizes Expected Value · 2023-08-13T18:47:05.344Z · LW · GW

Yeah, that looks right! Nice. Thanks!

Comment by Kevin Dorst on Rationalization Maximizes Expected Value · 2023-08-01T21:02:35.468Z · LW · GW

Fair! I didn't work out the details of the particular case, partly for space and partly from my own limited bandwidth in writing the post.  I'm actually having more trouble writing it out now that I sit down with it, in part because of the choice-dependent nature of how your values change.

Here's how we'd normally money-pump you when you have a predictable change in values.  Suppose at t1 you value X at \$1 and at t2 you predictably will come to value it at \$2.  Suppose at t1 you have X; since you value it at \$1, you'll trade it to me for \$1, so now you're at +\$1; then I wait for t2 to come around, and now you value X more, so I offer to sell X back to you for \$2, so you happily trade and end up with X − \$1. Which is worse than where you started.
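The money-pump arithmetic described above can be tracked explicitly (a toy sketch with dollar values as floats, nothing more):

```python
# Money-pump against an agent with a predictable change in values.
# Assumed setup: at t1 the agent values X at $1.00; at t2, at $2.00.
cash, has_x = 0.0, True

# t1: agent values X at $1.00, so it willingly sells X for $1.00.
cash += 1.00
has_x = False

# t2: agent now values X at $2.00, so it willingly buys X back for $2.00.
cash -= 2.00
has_x = True

# Net result: the agent holds X again, but is down $1.00.
print(has_x, cash)  # True -1.0
```

Every individual trade looks rational by the agent's values at the time, yet the sequence is a sure loss — which is exactly what the money-pump argument exploits.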

The trouble is that this seems like precisely a case where you WOULDN'T rationalize, since you traded X away.  I think there will still be some way to do it, but haven't figured it out yet.  It'd be interesting if not (it's MAYBE possible that the money-pump argument for fixed preferences I had in mind presupposed that how they change wouldn't be sensitive to which trades you make. But I kinda doubt it).

Let me know if you have thoughts!  I'll write back if I have the chance to sit down and figure it out properly.

Comment by Kevin Dorst on Rationalization Maximizes Expected Value · 2023-08-01T20:47:19.046Z · LW · GW

Nice point. Yeah, that sounds right to me—I definitely think there are things in the vicinity and types of "rationalization" that are NOT rational.  The class of cases you're pointing to seems like a common type, and I think you're right that I should just restrict attention accordingly. "Preference rationalization" sounds like it might get the scope right.

Sometimes people use "rationalization" for something that's by definition irrational—as in "that's not a real reason, that's just a rationalization".  And it sounds like the cases you have in mind fit that mold.

I hadn't thought as much about the cross of this with the ethical version of the case.  Of course, something can be (practically or epistemically) rational without being moral, so there are some versions of those cases that I'd still insist ARE rational even if we don't like how the agent acts.

Comment by Kevin Dorst on Is the Endowment Effect Due to Incomparability? · 2023-07-30T20:05:46.281Z · LW · GW

Ah, sorry!  Yes, they're exchanging with the experimenters, who have an excess of both mugs and pens.  That's important, sorry to be unclear!

Comment by Kevin Dorst on Is the Endowment Effect Due to Incomparability? · 2023-07-20T12:50:12.815Z · LW · GW

Yeah, I think it's a good question how much of a role some sort of salient default is playing. In general ("status quo bias") people do have a preference for default choices, and this is of course generally reasonable, since "X is the default option" is generally evidence that most people prefer X. (If they didn't, the people setting the defaults should change them!)  So that phenomenon clearly exists, and seems like it'd help explain the effect.

I don't know much empirical literature off-hand looking at variants like you're thinking of, but I imagine some similar things exist.  People who trade more regularly definitely exhibit the endowment effect less.  Likewise, if you manipulate whether you tell the person they're receiving a "gift" vs less-intentionally winding up with a mug, that affects how many people trade.  So that fits with your picture.

In general, I don't think the explanations here are really competing. There obviously are all sorts of factors that go into the endowment effect—it's clearly not some fundamental feature of decision making (especially when you notice all the modulators of it that have been found), but rather something that comes out of particular contexts.  Even with salience and default effects driving some of it, incomparability will exacerbate it—certainly in the valuation paradigm, for the reasons I mentioned in the post, and even in the exchange paradigm, because it will widen the set of people for whom other features (like defaults, aversion to trade, etc.) kick in to prevent them from trading.

Comment by Kevin Dorst on Is the Endowment Effect Due to Incomparability? · 2023-07-20T12:43:54.301Z · LW · GW

Not sure I totally follow, but does this help?  Suppose it's true that 10 of 50 people who got mugs prefer the pen, so 20% of them prefer the pen. Since assignments were randomized, we should also expect 10 of 50 (20% of) people who got pens to prefer the pens. That means that the other 40 pen-receivers prefer mugs, so those 40 will trade too.  Then we have 10 mugs-to-pens trades + 40 pens-to-mugs trades, for a total of 50 of 100 trades.
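Working through that arithmetic explicitly (toy numbers matching the example above; integer math to keep the counts exact):

```python
# 100 participants: 50 get mugs, 50 get pens, assigned at random.
mug_holders, pen_holders = 50, 50
pct_prefer_pen = 20  # suppose 20% of everyone prefers the pen

# Mug-holders who prefer the pen will trade:
mug_to_pen = mug_holders * pct_prefer_pen // 100          # 10
# By randomization, 20% of pen-holders also prefer pens, so the other 80% trade:
pen_to_mug = pen_holders * (100 - pct_prefer_pen) // 100  # 40

total_trades = mug_to_pen + pen_to_mug
print(f"{total_trades} of {mug_holders + pen_holders} trade")  # 50 of 100 trade
```

The 50% trading prediction falls out of randomization alone, whatever the underlying preference split — which is why observed trading rates far below 50% call for explanation.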

Comment by Kevin Dorst on Is the Endowment Effect Due to Incomparability? · 2023-07-12T13:13:45.780Z · LW · GW

Thanks, yeah I agree that this is a good place to press.  A few thoughts:

1. I agree with what Herb said below, especially about default aversion to trading especially in contexts where you have uncertainty
2. I think you're totally right that those other explanations could play a role. I doubt the endowment effect has a single explanation, especially since manipulations of the experimental setup can induce big changes in effect sizes. So presumably the effect is the combination of a lot of factors—I didn't mean incomparability to be the only one, just one contributing factor.
3. I think one way to make the point that we should expect less trading under incomparability is just to go back to the unspoken assumption economists were making about indifference. As far as I can tell (like I said in footnote 1), the argument that standard-econ models entail 50% trading ignores the possibility that people are indifferent, or assumes that if they are then still 50% of those who are indifferent will trade. The latter seems like a really bad assumption—presumably if you're indifferent, you'll stick with a salient default (why bother; you risk feeling foolish and more regret, etc).  So I assume the reply is going to instead be that very few people will be PRECISELY indifferent—not nearly enough to lower the trading volume from 50% to 10%.  And that might be right, if the only options are strict preference or indifference. But once we recognize incomparability, it seems pretty obvious to me that most people will look at the two options and sorta shrug their shoulders; "I could go either way, I don't really care". That is, most will treat the two as incomparable, and if in cases like this incomparability skews toward the default of not trading (which seems quite plausible to me in this case, irrespective of wheeling in some general decision theory for imprecise values), then we should expect way less than 50% trading.

What do you think?

Comment by Kevin Dorst on The Case for Overconfidence is Overstated · 2023-07-03T14:21:17.152Z · LW · GW

Yeah that's a reasonable way to look at it. I'm not sure how much the two approaches really disagree: both are saying that the actual intervals people are giving are narrower than their genuine 90% intervals, and both presumably say that this is modulated by the fact that in everyday life, 50% intervals tend to be better. Right?

I take the point that the bit at the end might misrepresent what the irrationality interpretation is saying, though!

I haven't come across any interval-estimation studies that ask for intervals narrower than 20%, though Don Moore (probably THE expert on this stuff) told me that people have told him about unpublished findings where yes, when they ask for 20% intervals people are underprecise.

There definitely are situations with estimation (variants on the two-point method) where people look over-confident in estimates >50% and underconfident in estimates <50%, though you don't always get that.

Comment by Kevin Dorst on The Case for Overconfidence is Overstated · 2023-07-01T13:27:21.595Z · LW · GW

Oops, must've gotten my references crossed!  Thanks.

This Wikipedia page says the height of a "Gerald R. Ford-class" aircraft carrier is 250 feet; so, close.

https://en.wikipedia.org/wiki/USS_Gerald_R._Ford

Comment by Kevin Dorst on The Case for Overconfidence is Overstated · 2023-07-01T13:24:26.940Z · LW · GW

Crossposting from Substack:

Super interesting!

I like the strategy, though (from my experience) I do think it might be a big ask for at least online experimental subjects to track what's going on. But there are also ways in which that's a virtue—if you just tell them that there are no (good) ways to game the system, they'll probably mostly trust you and not bother to try to figure it out. So something like that might indeed work! I don't know exactly what calibration folks have tried in this domain, so will have to dig into it more. But it definitely seems like there should be SOME sensible way (along these lines, or otherwise) of incentivizing giving their true 90% intervals—and a theory like the one we sketched would predict that that should make a difference (or: if it doesn't, it's definitely a failure of at least local rationality).

On the second point, I think we're agreed! I'd definitely like to work out more of a theory for when we should expect rational people to switch from guessing to other forms of estimates. We definitely don't have that yet, so it's a good challenge. I'll take that as motivation for developing that more!

Comment by Kevin Dorst on The Case for Overconfidence is Overstated · 2023-06-29T12:37:41.749Z · LW · GW

Thanks!

Comment by Kevin Dorst on The Case for Overconfidence is Overstated · 2023-06-29T12:37:29.091Z · LW · GW

Thanks for the thoughtful reply!  Cross-posting the reply I wrote on Substack as well:

I like the objection, and am generally very sympathetic to the "rationality ≈ doing the best you can, given your values/beliefs/constraints" idea, so I see where you're coming from.  I think there are two places I'd push back on in this particular case.

1) To my knowledge, most of these studies don't use incentive-compatible mechanisms for eliciting intervals. This is something authors of the studies sometimes worry about—Don Moore et al talk about it as a concern in the summary piece I linked to.  I think this MAY link to general theoretical difficulties with getting incentive-compatible scoring rules for interval-valued estimates (this is a known problem for imprecise probabilities, eg https://www.cmu.edu/dietrich/philosophy/docs/seidenfeld/Forecasting%20with%20Imprecise%20Probabilities.pdf .  I'm not totally sure, but I think it might also apply in this case). The challenge they run into for eliciting particular intervals is that if they reward accuracy, that'll just incentivize people to widen their intervals.  If they reward narrower intervals, great—but how much to incentivize? (Too much, and they'll narrow their intervals more than they would otherwise.)  We could try to reward people for being calibrated OVERALL—so that they get rewarded the closer they are to having 90% of their intervals contain the true value. But the best strategy in response to that is (if you're giving 10 total intervals) to give 9 trivial intervals ("between 0 and ∞") that'll definitely contain the true value, and 1 ridiculous one ("the population of the UK is between 1–2 million") that definitely won't.
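The gaming strategy at the end of point (1) is easy to make concrete (a toy sketch; the reward scheme and numbers are my assumptions, not from any actual study):

```python
import math

# If the reward is just "have ~90% of your intervals contain the truth",
# you can hit 90% exactly with zero real knowledge:
# 9 trivial intervals that can't miss, 1 absurd one that can't hit.
true_values = [67_000_000] * 10  # ten questions, say all "population of the UK"

intervals = [(0, math.inf)] * 9 + [(1_000_000, 2_000_000)]

hits = sum(lo <= v <= hi for (lo, hi), v in zip(intervals, true_values))
print(f"calibration: {hits}/{len(intervals)}")  # calibration: 9/10
```

Perfectly "calibrated" by the overall-hit-rate criterion, yet maximally uninformative — which is why rewarding overall calibration alone isn't incentive-compatible for interval elicitation.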

Maybe there's another way to incentivize interval-estimation correctly (in which case we should definitely run studies with that method!), but as far as I know this hasn't been done. So at least in most of the studies that are finding "overprecision", it's really not clear that it's in the participants' interest to give properly calibrated intervals.

2) Suppose we fix that, and still find that people are overprecise.  While I agree that that would be evidence that people are locally being irrational (they're not best-responding to their situation), there's still a sense in which the explanation could be *rationalizing*, in the sense of making sense of why they make this mistake.  This is sort of a generic point that it's hard (and often suboptimal to try!) to fine-tune your behavior to every specific circumstance. If you have a pre-loaded (and largely unconscious) strategy for giving intervals that trades off accuracy and informativity, then it may not be worth the cognitive cost to adapt it to this circumstance because of the (small!) incentives the experimenters are giving you.

An analogy I sometimes use is the Stroop task: you're told to name the color of the word, not read it, as fast as possible. Of course, when "red" appears in black letters, there's clearly a mistake made when you say 'red', but at the same time we can't infer from this any broader story about irrationality, since it's overall good for you to be disposed to automatically and quickly read words when you see them.

Of course, then we get into hard questions about whether it's suboptimal that in everyday life people automatically do this accuracy/informativity thing, rather than consciously separate out the task of (1) forming a confidence interval/probability, and then (2) forming a guess on that basis.  And I agree it's a challenge for any account on these lines to explain how it could make sense for a cognitive system to smoosh these things together, rather than keeping them consciously separable.  We're actually working on a project along these lines for when cognitively limited agents might be more advantaged by guessing in this way, but I agree it's a good challenge!

What do you think?