# Are explanations that explain more phenomena always more unlikely than narrower versions?

post by FinalFormal2 · 2021-12-01T18:34:34.219Z · LW · GW · 4 comments

This is a question post.

The classic example of a hypothesis explaining more being less likely would of course be conspiracy theories, where adherents add more and more details under the false assumption that this makes the theory more likely rather than less likely.
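For reference, the rule being violated here is the conjunction rule. A short statement of it, where $A$ stands for the core claim and $B$ for any added detail (both symbols are illustrative and not from the post):

```latex
P(A \wedge B) = P(A)\,P(B \mid A) \le P(A)
```

Each extra detail multiplies in a factor no greater than 1, so the more specific story can never be more probable than the story without that detail.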

However, when we have multiple phenomena that follow a similar pattern, isn't it simpler and more likely that there's only one cause for both situations?

Is it possible that in some circumstances it could be more unlikely that the pattern is completely coincidental?
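One way to make this question concrete is a toy Bayesian comparison between "one shared cause" and "independent coincidence". This is only a sketch under stated assumptions: the function name, the prior, and both likelihoods are made up for illustration, and the phenomena are assumed to be conditionally independent given each hypothesis.

```python
def posterior_common_cause(prior_common, p_pattern_given_cause,
                           p_pattern_by_chance, n_phenomena):
    """Posterior probability of a single shared cause after the same
    pattern shows up in n_phenomena separate phenomena.

    All inputs are hypothetical numbers chosen for illustration.
    """
    # Likelihood of seeing the pattern every time under each hypothesis,
    # assuming the phenomena are independent given the hypothesis.
    like_common = p_pattern_given_cause ** n_phenomena
    like_chance = p_pattern_by_chance ** n_phenomena

    # Bayes' rule over the two competing hypotheses.
    numerator = prior_common * like_common
    denominator = numerator + (1 - prior_common) * like_chance
    return numerator / denominator


if __name__ == "__main__":
    # Start skeptical of a shared cause (prior 0.1) and watch the
    # posterior move as the pattern repeats across more phenomena.
    for n in (1, 2, 3):
        print(n, posterior_common_cause(0.1, 0.9, 0.2, n))
```

With these made-up numbers the shared-cause hypothesis starts out less likely than coincidence and overtakes it as the pattern repeats. The calculation only works if the phenomena were not cherry-picked after the fact, since selection changes what "by chance" should mean.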

It seems like the problem with conspiracy theories isn't that they explain more with less, but that their adherents can selectively pull facts from a wide range of fact space. Similar to how you can take advantage of people's tribe-brain and narrative thinking to make them think that surgeons are evil, if you want to tell a story about how sugar companies are taking over the world, you can probably find some number of world leaders with ties to Big Glucose.

answer by JBlack · 2021-12-02T04:10:59.344Z · LW(p) · GW(p)

The classical rule of thumb here is Occam's Razor, in which simpler models that fit the known facts are preferred.

A more modern (but less practical) take is Solomonoff induction, in which models have prior weightings that exponentially decay based upon the length of their description in some suitable language.

A model constructed specifically to fit a large number of facts must be at least as long as the fully specified description of those facts, and is exponentially penalized for that length. What's more, every new fact that has to be explained generates a new, larger model with a worse penalty. Smaller alternative models that don't predict those exact facts may still end up preferred due to the exponential penalty of the more complex model.
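The exponential penalty can be illustrated with a toy calculation. This is a rough sketch and not Solomonoff induction itself, which is uncomputable and sums over all programs. The bit counts, likelihoods, and model names below are hypothetical, and the prior is simply taken to be 2 to the power of minus the description length in bits.

```python
def length_prior(description_bits):
    """Toy prior weight: halves for every extra bit of description."""
    return 2.0 ** -description_bits


def posterior_weights(models):
    """Normalize prior * likelihood across a small set of models.

    models maps a name to (description_bits, likelihood_of_observed_facts).
    """
    unnormalized = {
        name: length_prior(bits) * likelihood
        for name, (bits, likelihood) in models.items()
    }
    total = sum(unnormalized.values())
    return {name: weight / total for name, weight in unnormalized.items()}


if __name__ == "__main__":
    n_facts = 10
    models = {
        # Hard-codes every fact: assumed 20 bits of base model plus
        # 8 bits per fact, and predicts the facts with certainty.
        "lookup_table": (20 + 8 * n_facts, 1.0),
        # Short rule: assumed 20 bits, but only gives each fact a
        # 50% chance, so it does not predict the exact facts.
        "short_rule": (20, 0.5 ** n_facts),
    }
    for name, weight in posterior_weights(models).items():
        print(name, weight)
```

Under these assumed numbers the short rule ends up with nearly all of the weight even though it fits the facts worse, because the lookup table pays 8 bits of prior penalty for every fact it encodes while the short rule loses only 1 bit of likelihood per fact.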

This is rather mathematical, and we almost never explicitly reason exactly according to such a rule, but we often reason like this in a qualitative sort of way.

The main problem is that people mostly don't bother to keep in mind multiple models at all. The closest we get most of the time is in moments of confusion when our current default model fails, and we need to search for a new one. This is why I value being consciously aware that "I notice I am confused". Such moments are some of the very few times that we compare alternative models, and paying attention to that process is very important for rationality.

answer by tomcatfish · 2021-12-02T00:40:51.883Z · LW(p) · GW(p)

I've written up several responses to this, but don't think any of them succeeded in dissolving the question.

I think the confusion I notice could be summarized with "There isn't really a null hypothesis, just a prior". Longer: the alternative to "Soup costs $2.00 on Tuesdays" is not "Soup costs $2.00 this Tuesday", it is "Soup costs $2.00 this Tuesday and some amount other Tuesdays", which has strictly less explanatory power for the observation soup_prices = {Tuesday --> $2.00}.

I recommend reading/reviewing Solomonoff Induction [LW · GW] and the computations done in the post for clarity.

(If anyone has the description of theory formation involving soup prices, I would have loved to cite that instead)

comment by tomcatfish · 2021-12-02T00:44:24.191Z · LW(p) · GW(p)

I cannot post a comment containing the sequence "dollar backslash a r r o w dollar". I think any invalid LaTeX leads to an "Unrecognized LW server error".