Statistical models & the irrelevance of rare exceptions
post by patrissimo · 2023-05-07T15:59:10.677Z · LW · GW · 6 comments
“I don’t care about this instance - I don’t care about any instances! Life is too short to care about anything but the general case!”
Yes, the general case is drawn from instances. I’m saying that we shouldn't get caught up in the details unless they really matter, and when there is a clear statistical generalization, the details matter very little.
A common model might be: “Well, the pattern is that it’s some mix of A & B, with more A as factors X & Y are higher, and more B if they are lower”.
Replying "Z fits better than Y in the mix of variables to map onto the A & B mix" is a correction within the general-case frame. By contrast, pointing out "Here are cases F & G that are high A with low X & Y, contradicting your model" is irrelevant when such cases are very rare. I find it's usually obvious from the description of F & G that they're extremely rare.
Rare exceptions are irrelevant because almost all models of the real world (not physics) are statistical claims about what’s usually true, not absolute claims about what’s always true. So a rare data point that doesn’t fit is actually *not* a contradiction! Pointing such cases out is just reiterating the tautological fact that statistical models are not absolute, which seems like a total waste of time to me. (Especially if the speaker agrees with the model!)
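To make the "not a contradiction" point concrete, here is a minimal sketch. All of the data, the threshold, and the names `fit_rate` and `model` are hypothetical and chosen only for illustration. It measures how often a simple "more A when X & Y are high" rule matches the observed outcome, with and without two rare exceptions like F & G.

```python
def fit_rate(cases, model):
    """Fraction of cases where the model's prediction matches the observed outcome."""
    hits = sum(1 for features, outcome in cases if model(features) == outcome)
    return hits / len(cases)


def model(features):
    # Hypothetical rule: predict "A" when X + Y is high, "B" otherwise.
    x, y = features
    return "A" if x + y >= 1.0 else "B"


# 998 made-up cases that follow the pattern...
typical = [((0.9, 0.8), "A")] * 499 + [((0.1, 0.2), "B")] * 499
# ...plus two rare exceptions (F and G): outcome A despite low X and Y.
exceptions = [((0.1, 0.1), "A"), ((0.2, 0.0), "A")]

rate_without = fit_rate(typical, model)
rate_with = fit_rate(typical + exceptions, model)
print(f"fit without exceptions: {rate_without:.3f}")
print(f"fit with exceptions:    {rate_with:.3f}")
```

Under these assumed numbers the fit moves from 100% to 99.8%. The claim "this pattern usually holds" is as true after seeing F & G as it was before.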
I’ve noticed this happens more often with careful, intellectually humble thinkers, who often include caveats of the form “But here’s an exception to this strong model I’ve just presented.” I think often they're trying to proactively defend against others pointing out this case. But to me, this is wrongly falling into the frame that rare exceptions are relevant criticisms or corrections of a statistical model.
So, rather than defending by acknowledging the rare case, I think it’s far better to break the false frame that rare exceptions are counter-examples that need to be addressed. Move into the correct frame that this is a statistical model, say by asking whether they actually disagree that the suggested pattern fits the vast majority of cases and is thus statistically true.
I’ve also lately noticed that while many people are abstract systematizers, I find far fewer who are relentlessly meta-seeking, constantly seeking to expand to more general theories over broader domains. E.g., I’ve noticed myself saying regularly: “Hey, your model for domain S is actually a model for domain Q, where S is a subdomain of Q. Notice that everything we said about S applies equally to Q - we didn’t actually use any special features of S in our reasoning!”
I also recently noticed a weird but cool thing - my brain automatically generates statistical models without details ever coming into my mind. Like it has a "model this data set" function, where it retrieves and analyzes the data set without me having to consciously consider cases. Of course, I use cases (common ones!) to check the model afterwards. One thing I love about the LW community is that these cognitive modes I describe here are much more common than in most other circles.
Finally, I do find rare exceptions relevant when there is a pattern to the exceptions. So, rather than just pointing out rare exceptions F&G, the responder then generalizes them into subclass H. Now we can make the general model more accurate by adding that it doesn't apply to H. This "move" still falls within the true statistical frame.
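The move of generalizing exceptions into a subclass H can be sketched as follows. This is only an illustration under made-up assumptions: the field `z`, the thresholds, and the function names are hypothetical, and the refined model simply declares itself not applicable to H rather than predicting anything there.

```python
def base_model(case):
    # Hypothetical rule: predict "A" when X + Y is high, "B" otherwise.
    return "A" if case["x"] + case["y"] >= 1.0 else "B"


def in_subclass_h(case):
    # Hypothetical pattern shared by the exceptions: a very high third factor z.
    return case["z"] > 0.9


def refined_model(case):
    # Returns None for members of H, meaning "the model does not apply here".
    return None if in_subclass_h(case) else base_model(case)


def fit_rate(cases, model):
    """Fit over the cases the model claims to cover (prediction is not None)."""
    scored = [(model(c), c["outcome"]) for c in cases]
    covered = [(pred, out) for pred, out in scored if pred is not None]
    return sum(pred == out for pred, out in covered) / len(covered)


# Made-up data: 990 cases that follow the pattern, 10 exceptions that all fall in H.
typical = (
    [{"x": 0.9, "y": 0.8, "z": 0.1, "outcome": "A"}] * 495
    + [{"x": 0.1, "y": 0.2, "z": 0.1, "outcome": "B"}] * 495
)
exceptions = [{"x": 0.1, "y": 0.1, "z": 0.95, "outcome": "A"}] * 10
cases = typical + exceptions

base_fit = fit_rate(cases, base_model)
refined_fit = fit_rate(cases, refined_model)
print(f"base model fit:    {base_fit:.3f}")
print(f"refined model fit: {refined_fit:.3f}")
```

With these assumed numbers the base model fits 99% of cases, and the refined model fits every case it still covers. The exceptions became useful only once they were described by a rule (`in_subclass_h`) rather than listed one by one.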
As an example for this topic, note that in extreme distributions like power laws, a “rare exception” that happens at 1% frequency could have 100x the intensity of the other 99%, and so needs to be weighted equally in a model. The generalization of this exception is that the more extreme the distribution, the rarer a rare case has to be to be irrelevant. This generalized exception now improves our model of statistical models.
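The arithmetic behind the 1% / 100x example can be checked with a short sketch. The helper name `contribution_shares` is hypothetical, and the baseline intensity of 1.0 for the common cases is an assumption made only to fix units.

```python
def contribution_shares(groups):
    """Given (frequency, intensity) pairs, return each group's share of total expected intensity."""
    weighted = [freq * intensity for freq, intensity in groups]
    total = sum(weighted)
    return [w / total for w in weighted]


# 99% of cases at baseline intensity 1, 1% of cases at 100x that intensity.
common_share, rare_share = contribution_shares([(0.99, 1.0), (0.01, 100.0)])
print(f"common cases: {common_share:.4f} of total impact")
print(f"rare cases:   {rare_share:.4f} of total impact")
```

The rare group contributes 0.01 × 100 = 1.0 and the common group 0.99 × 1 = 0.99, so each carries roughly half of the total. Frequency alone is therefore not the right test of irrelevance; frequency times intensity is closer.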
6 comments
Comments sorted by top scores.
comment by tailcalled · 2023-05-07T17:55:00.587Z · LW(p) · GW(p)
Could you give an example of a dispute that follows the structure you are talking about?
Also, as a counterpoint, there are the arguments made for determinism in Science in a High-Dimensional World [LW · GW].
comment by Dagon · 2023-05-07T16:35:45.506Z · LW(p) · GW(p)
Upvoted, but I suspect this is a response to misunderstood or incorrect criticism, rather than a good meta-generalization.
All models are wrong, some models are useful. Outliers and exceptions are a reminder that the model is wrong - it's not truth, it's not reality, and it is not the ONLY thing available to make decisions on. AND an exception is not a knockout blow that invalidates all uses of the model. There may be MANY uses for that model, as long as it's known not to be complete.
comment by Unnamed · 2023-05-07T19:48:04.876Z · LW(p) · GW(p)
Contrast with Zvi's The One Mistake Rule [? · GW].
comment by Ben Pace (Benito) · 2023-05-07T21:55:16.260Z · LW(p) · GW(p)
This is a good point (and I think I occasionally make this mistake of giving far too unrealistic or rare counterexamples), but I do want to say that sometimes there is substantive disagreement about whether the counter-example is an extreme case or a relatively central one.
I think of people who are willing to accept very non-central counterexamples as relevant as being very conservative on the dimension of trusting their own taste, in that they are trying to avoid using their own judgment about what counts as central. (Mostly this seems good to me in cases of strong philosophical uncertainty — I would accept a bizarre counterexample from Nick Bostrom in one of his papers more so than I would about (say) whether a particular salary policy is good for my company.)
Overall I am reminded of a quote by Daniel Dennett that I am failing to find, about hypothetical examples being useful to think about in proportion to how far away they are from the real world.
Replies from: Jayson_Virissimo↑ comment by Jayson_Virissimo · 2023-05-07T22:03:40.995Z · LW(p) · GW(p)
IIRC, he says that in Intuition Pumps and Other Tools for Thinking.
Replies from: Benito↑ comment by Ben Pace (Benito) · 2023-05-07T22:33:35.983Z · LW(p) · GW(p)
You're right. Thanks!
“No,” says the philosopher. “It’s not a false dichotomy! For the sake of argument we’re suspending the laws of physics. Didn’t Galileo do the same when he banished friction from his thought experiment?” Yes, but a general rule of thumb emerges from the comparison: the utility of a thought experiment is inversely proportional to the size of its departures from reality.