K-types vs T-types — what priors do you have?
post by Cleo Nardo (strawberry calm) · 2022-11-03T11:29:00.809Z · LW · GW · 25 comments
K-types vs T-types What makes a good theory? K-types and T-types have different priors. Algorithmic characterisation Proof-theoretic characterisation. Occam's Razor characterisation. Examples Case study: Quantum Mechanics Case study 2: Statistics Other examples So who's correct? Solomonoff Induction Correlation with other classifications Personality traits Bullet-dodgers vs bullet-swallowers Correct Contrarian Cluster Bouba-kiki Effect Practical advice K-targeted rhetoric vs T-targeted Is this classification any good?
Summary: There is a spectrum between two types of people, K-types and T-types. K-types want theories with low Kolmogorov complexity and T-types want theories with low time-complexity. This classification correlates with other classifications and with certain personality traits.
Epistemic status: I'm somewhat confident that this classification is real and that it will help you understand why people believe the things they do. If there are major flaws in my understanding then hopefully someone will point that out.
Edits: Various clarifying remarks.
K-types vs T-types
What makes a good theory?
There's broad consensus that good theories should fit our observations. Unfortunately there's less consensus about how to compare the different theories that fit our observations. If we have two theories which both predict our observations to the exact same extent, then how do we decide which to endorse?
We can't shrug our shoulders and say "let's treat them all equally" because then we won't be able to predict anything at all about future observations. This is a consequence of the No Free Lunch Theorem: there are exactly as many theories which fit the seen observations and predict the future will look like X as there are which fit the seen observations and predict the future will look like not-X. So we can't predict anything unless we can say "these theories fitting the observations are better than these other theories which fit the observations".
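The counting argument can be made concrete with a toy sketch. It assumes, purely for illustration, that a "theory" is a complete binary sequence of fixed length and that it "fits" when it agrees with the observed prefix. The function name and the data are hypothetical.

```python
from itertools import product

def count_predictions(observed, total_length):
    """Count the theories (full bit sequences) that fit `observed`,
    grouped by the bit they predict right after the observations.

    Assumes total_length > len(observed).
    """
    counts = {0: 0, 1: 0}
    n = len(observed)
    for theory in product((0, 1), repeat=total_length):
        if theory[:n] == tuple(observed):
            counts[theory[n]] += 1
    return counts

# Four observed 1s, theories of length 8.
print(count_predictions(observed=[1, 1, 1, 1], total_length=8))  # {0: 8, 1: 8}
```

If every fitting theory is weighted equally, the next bit is a coin flip no matter what has been observed, which is why some theories have to count for more than others.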
There are two types of people, which I'm calling "K-types" and "T-types", who differ in which theories they pick among those that fit the observations.
K-types and T-types have different priors.
K-types prefer theories which are short over theories which are long. They want theories you can describe in very few words. But they don't care how many inferential steps it takes to derive our observations within the theory.
In contrast, T-types prefer theories which are quick over theories which are slow. They care how many inferential steps it takes to derive our observations within the theory, and are willing to accept longer theories if doing so greatly speeds up the derivation.
Algorithmic characterisation
In computer science terminology, we can think of a theory as a computer program which outputs predictions. K-types penalise the Kolmogorov complexity of the program (also called the description complexity), whereas T-types penalise the time-complexity (also called the computational complexity).
The T-types might still be doing perfect Bayesian reasoning even if their prior credences depend on time-complexity. Bayesian reasoning is agnostic about the prior, so there's nothing defective about assigning a low prior to programs with high time-complexity. However, T-types will deviate from Solomonoff inductors, who use a prior which exponentially decays in Kolmogorov complexity.
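Here is a minimal sketch of the same Bayesian update run under two different priors. The theories, their lengths, step counts and likelihoods are made-up numbers. The time-penalising prior `1 / steps` is only one possible choice, assumed here for illustration; the post does not specify one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Theory:
    name: str
    length_bits: int   # description length (made-up number)
    steps: int         # derivation steps (made-up number)
    likelihood: float  # P(observations | theory)

def k_prior(theory):
    # Decays exponentially in description length, as in Solomonoff induction.
    return 2.0 ** -theory.length_bits

def t_prior(theory):
    # Assumption: a hypothetical prior that penalises running time instead.
    return 1.0 / theory.steps

def posterior(theories, prior):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    weights = {t.name: prior(t) * t.likelihood for t in theories}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

theories = [
    Theory("short_but_slow", length_bits=10, steps=1_000_000, likelihood=0.9),
    Theory("long_but_quick", length_bits=20, steps=1_000, likelihood=0.9),
]

print(posterior(theories, k_prior))  # favours short_but_slow
print(posterior(theories, t_prior))  # favours long_but_quick
```

Both runs apply Bayes' rule in exactly the same way. Only the prior differs, and that alone is enough to flip which theory ends up with most of the posterior mass.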
Proof-theoretic characterisation.
When translating between proof theory and computer science, (computer program, computational steps, output) is mapped to (axioms, deductive steps, theorems) respectively. Kolmogorov complexity maps to "total length of the axioms" and time-complexity maps to "number of deductive steps".
K-types don't care how many steps there are in the proof; they only care about how many axioms the proof uses (more precisely, their total length). T-types do care how many steps there are in the proof, whether those steps are axioms or inferences.
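This toy example shows the trade-off between axioms and deductive steps. It assumes Peano-style addition defined by the two axioms `a + 0 = a` and `a + S(b) = S(a + b)`, and counts one deductive step per axiom application. The function name and the "extra axiom" mechanism are hypothetical.

```python
def derive_sum(a, b, extra_axioms=None):
    """Return (a + b, number of deductive steps used).

    Base theory: two axioms,  a + 0 = a  and  a + S(b) = S(a + b).
    `extra_axioms` maps (a, b) pairs to sums that are simply assumed.
    """
    extra_axioms = extra_axioms or {}
    steps = 0
    pending = 0  # successor applications still owed to the final result
    while True:
        steps += 1
        if (a, b) in extra_axioms:   # apply an extra axiom directly
            return extra_axioms[(a, b)] + pending, steps
        if b == 0:                   # axiom: a + 0 = a
            return a + pending, steps
        b -= 1                       # axiom: a + S(b) = S(a + b)
        pending += 1

print(derive_sum(2, 2))               # (4, 3) using 2 axioms
print(derive_sum(2, 2, {(2, 2): 4}))  # (4, 1) using 3 axioms
```

The first theory is shorter but needs more steps. The second buys a one-step proof by adding an axiom, which is the kind of trade a T-type accepts and a K-type resists.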
Occam's Razor characterisation.
Both K-types and T-types can claim to be inheritors of Occam's Razor, in that both types prefer simple theories. But they interpret "simplicity" in two different ways. K-types consider the simplicity of the assumptions alone, whereas T-types consider the simplicity of the assumptions plus the derivation. This is the key idea.
Both can accuse the other of "being needlessly convoluted", "playing mental gymnastics", or "making ad-hoc assumptions".
Examples
Case study: Quantum Mechanics
Is Hugh Everett's many-worlds interpretation simpler than Roger Penrose's dynamical collapse theory? Well, in one sense, Everett's is "simpler" because it only makes one assumption (Schrödinger's equation), whereas Penrose posits additional physical laws. But in another sense, Everett's is "more complicated" because he has to derive non-trivially a bunch of stuff that you get for free in Penrose's dynamical collapse theory, such as stochasticity. ("Stochasticity" is the observation that we can attach probabilities to events.)
Everett can say to Penrose: "Look, my theory is simpler. I didn't need to assume stochasticity. I derived it instead."
Penrose can say to Everett: "Look, my theory is simpler. I didn't need to derive stochasticity. I assumed it instead."
Everett's theory is shorter but slower, and Penrose's is longer but quicker.
A K-type will (all else being equal) be keener on Everett's theory (relative to Penrose's theory) than a T-type. Both might accuse the rival theory of being needlessly complicated. The difference is that the K-type only cares about how convoluted the assumptions are, but the T-type also cares about how convoluted the derivation of the observations from those assumptions is.
Case study 2: Statistics
Is Bayesian statistics simpler than frequentist statistics? Well, in one sense, Bayesian statistics is "simpler" because it only has a single rule (Bayes' rule), whereas frequentist statistics posits numerous ad-hoc methods for building models. But in another sense, Bayesian statistics is "more complicated" because you need to do a lot more calculations to get the results than if you used the frequentist methods.
Bayesian statistics is simpler to describe.
Frequentist statistics is simpler to use.
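A rough illustration of the contrast is estimating a coin's bias from a number of flips. The sketch assumes a uniform prior and a simple grid approximation on the Bayesian side, and uses the maximum-likelihood estimate as a stand-in for "a frequentist method". Real practice on both sides is richer than this, and the function names are hypothetical.

```python
def frequentist_estimate(heads, flips):
    """Maximum-likelihood estimate of the bias: a single division."""
    return heads / flips

def bayesian_estimate(heads, flips, grid_size=1001):
    """Posterior mean of the bias under a uniform prior,
    obtained by applying Bayes' rule at every point of a grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalised posterior = constant prior * likelihood at each grid point.
    weights = [p ** heads * (1 - p) ** (flips - heads) for p in grid]
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total

print(frequentist_estimate(7, 10))
print(bayesian_estimate(7, 10))
```

The Bayesian routine is one rule applied uniformly, but it does on the order of a thousand likelihood evaluations where the frequentist shortcut does one division.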
Other examples
Massively uncertain about all of this. I've tried to include views I endorse in both columns; however, most of my own views are in the first (K-type) column because I am more K-type than T-type.
| | K-types | T-types |
|---|---|---|
| Ontology | There are few types of things. | There are many types of things. |
| Solving problems | One Big Rule. | Many ad-hoc tools. |
| Symmetries | Reality has lots of symmetries. | Reality has fewer symmetries. |
| Analogies | Different systems will follow the same rules. | Different systems will follow different rules. |
| Trusting experts | Domain-general knowledge is more important. | Domain-specific knowledge is more important. |
| Exceptions | Rules have no exceptions. | Rules have many exceptions. |
| Testing a theory | Hypothetical thought-experiments. | Real life examples. |
| Models | Toy models can be good. | Good models must include all the real-life details. |
| Interpretation of Quantum Mechanics | Many-worlds. | Copenhagen. Dynamical Collapse. |
| Mind-body problem of consciousness | Physical reductionism. Eliminativism. | Dualist. |
| Ethics | Utilitarianism. Kantian. | Virtue ethics. Natural law. |
| Politics | A few short laws (e.g. libertarianism). | Numerous long laws (e.g. neoliberalism). |
| Future technology | Near the limit of physical impossibility. | Near existing technology. |
| Moral circle | Wide moral circle. | Narrow moral circle. |
So who's correct?
Should we be K-types or T-types? What's better — a short slow theory or a long quick theory? Well, here's a Bayesianesque framework.
If a theory has $N$ assumptions, and each assumption has likelihood $\epsilon$ of error, then the likelihood that all the assumptions in the theory are sound is $(1-\epsilon)^N$. If a derivation has $M$ steps, and each step has likelihood $\delta$ of error, then the likelihood that all the steps in the derivation are sound is $(1-\delta)^M$. So the prior likelihood that the argument is sound is $(1-\epsilon)^N (1-\delta)^M$.
If we were performing a Bayesian update from observations to theories, then we should minimise $-\log P + \alpha N + \beta M$ over all theories, where...
- $-\log P$ is a penalty for falsified theories ($P$ is the likelihood the theory assigns to the observations we made)
- $\alpha = -\log(1-\epsilon)$ is the weight for the penalty on long theories
- $\beta = -\log(1-\delta)$ is the weight for the penalty on slow theories.
The ratio $\alpha/\beta$ characterises whether you're a K-type or a T-type. For K-types, this ratio is high, and for T-types this ratio is small.
If you're confident in your assumptions ($\epsilon$ is small), or if you're unconfident in your inferences ($\delta$ is big), then you should penalise slow theories moreso than long theories, i.e. you should be a T-type.
If you're confident in your inferences ($\delta$ is small), or if you're unconfident in your assumptions ($\epsilon$ is big), then you should penalise long theories moreso than slow theories, i.e. you should be a K-type.
Solomonoff Induction
We recover Solomonoff induction when $\alpha = \log 2$ and $\beta = 0$. This occurs when $\epsilon = 1/2$ and $\delta = 0$, i.e. each assumption has 50% chance of error and each derivation step has 0% chance of error.
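The framework above can be sketched numerically. The sketch takes the score of a theory to be the negative log of (likelihood of the observations) × (chance all assumptions are sound) × (chance all derivation steps are sound). The names `eps` and `delta` for the per-assumption and per-step error rates, and the two example theories, are assumptions made for illustration.

```python
import math

def penalty_weights(eps, delta):
    """Per-assumption and per-step penalties: minus the log of the
    chance that a single assumption / a single step is sound."""
    return -math.log(1 - eps), -math.log(1 - delta)

def score(likelihood, n_assumptions, n_steps, eps, delta):
    """Negative log of likelihood * P(assumptions sound) * P(steps sound).
    Lower is better."""
    alpha, beta = penalty_weights(eps, delta)
    return -math.log(likelihood) + alpha * n_assumptions + beta * n_steps

# Made-up theories that fit the observations equally well.
short_slow = dict(likelihood=0.9, n_assumptions=2, n_steps=200)
long_quick = dict(likelihood=0.9, n_assumptions=10, n_steps=20)

def preferred(eps, delta):
    a = score(**short_slow, eps=eps, delta=delta)
    b = score(**long_quick, eps=eps, delta=delta)
    return "short_slow" if a < b else "long_quick"

print(preferred(eps=0.2, delta=0.001))   # distrusts assumptions: short_slow
print(preferred(eps=0.001, delta=0.05))  # distrusts inferences: long_quick
print(penalty_weights(0.5, 0.0))         # the Solomonoff-like limit
```

In the last call the per-assumption weight is log 2 and the per-step weight is zero, so only the length of the theory is penalised.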
Correlation with other classifications
Personality traits
I think the K-T distinction will correlate with certain personality traits.
We can see from the error-rate characterisation above that K-types are more confident in their ability to soundly derive consequences within a theory than T-types. So K-types will tend to be less humble about their own intelligence than T-types.
We can also see from that characterisation that K-types are less confident in "common-sense" assumptions. They are distrustful of conventional wisdom, and more willing to accept counterintuitive results. K-types tend to be more disagreeable than T-types.
Bullet-dodgers vs bullet-swallowers
Scott Aaronson writes about two types of people on his blog.
My own hypothesis has to do with bullet-dodgers versus bullet-swallowers. A bullet-dodger is a person who says things like:
Sure, obviously if you pursued that particular line of reasoning to an extreme, then you’d get such-and-such an absurd-seeming conclusion. But that very fact suggests that other forces might come into play that we don’t understand yet or haven’t accounted for. So let’s just make a mental note of it and move on
Faced with exactly the same situation, a bullet-swallower will exclaim:
The entire world should follow the line of reasoning to precisely this extreme, and this is the conclusion, and if a ‘consensus of educated opinion’ finds it disagreeable or absurd, then so much the worse for educated opinion! Those who accept this are intellectual heroes; those who don’t are cowards.
How does Aaronson's distinction map onto K-types and T-types? Well, K-types are the bullet-swallowers and T-types are the bullet-dodgers. This is because bullet-dodging is normally done by sprinkling the following if-statements through the theory.
if Troublesome_Example(): return Intuitive_Answer()
These if-statements reduce the time-complexity of the theory (pleasing the T-types) but increase the Kolmogorov complexity (annoying the K-types).
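Here is a toy version of that trade-off. It assumes a stand-in "line of reasoning" that takes one step per unit of the input, and it uses "one general rule plus one unit per exception" as a crude proxy for description length. That proxy is not true Kolmogorov complexity, and every name here is hypothetical.

```python
class Theory:
    def __init__(self, exceptions=None):
        # Each entry plays the role of one
        # `if Troublesome_Example(): return Intuitive_Answer()` patch.
        self.exceptions = dict(exceptions or {})

    def description_length(self):
        # Crude proxy (an assumption): the general rule plus one unit per patch.
        return 1 + len(self.exceptions)

    def predict(self, case):
        """Return (answer, steps taken to reach it)."""
        if case in self.exceptions:
            return self.exceptions[case], 1  # patched: answer in one step
        return self._follow_reasoning(case)

    @staticmethod
    def _follow_reasoning(case):
        # Stand-in for pursuing the general rule to its conclusion.
        total, steps = 0, 0
        for i in range(1, case + 1):
            total += i
            steps += 1
        return total, steps

bullet_swallower = Theory()
bullet_dodger = Theory(exceptions={1000: 0})  # 0 is the "intuitive answer"

print(bullet_swallower.description_length(), bullet_swallower.predict(1000))
print(bullet_dodger.description_length(), bullet_dodger.predict(1000))
```

The patched theory answers the troublesome case in one step and avoids the extreme conclusion, at the cost of a longer description.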
Correct Contrarian Cluster
Eliezer Yudkowsky writes about The Correct Contrarian Cluster. This is a collection of contrarian opinions, diverging from mainstream establishment science, which Yudkowsky thinks are correct, such that if someone has one correct-contrarian opinion they are likely to have others. He suggests you could use this collection to identify correct-contrarians, and then pay extra attention to their other contrarian opinions, which are more likely to be correct than the baseline rate of contrarian opinions. (Note that almost all contrarian beliefs are incorrect.)
Here are some opinions Yudkowsky lists in the "correct contrarian cluster":
- Atheism: Yes.
- Many-worlds: Yes.
- "P-zombies": No.
- Natural selection: Yes.
- World Trade Centre rigged with explosives: No.
- Rorschach ink blots: No.
How does Yudkowsky's distinction map onto K-types and T-types? Well, Yudkowsky himself is on the extreme K-side of the spectrum, so I'd expect that K-types are his "correct contrarians" and T-types are the mainstream establishment.
Bouba-kiki Effect
K-types are kiki and T-types are bouba. Don't ask me why.
Practical advice
K-targeted rhetoric vs T-targeted
Do you want to convince a K-type of some conclusion? Find an argument with really simple assumptions. Don't worry about whether those assumptions are common-sense or whether the derivation is long. Explicitly show each derivation step.
Do you want to convince a T-type of some conclusion? Find an argument with very few steps. You can assume far more "common-sense" knowledge. Skip over derivation steps if you can.
For example, here's how to convince a K-type to donate to AMF: "We ought to maximise expected utility, right? Well, here's a long derivation for why your donation would do that..."
But here's how to convince a T-type to donate to AMF: "We ought to donate our money to charities who can save loads of lives with that donation, right? Well, at the moment that's AMF."
Is this classification any good?
I claim that the classification is both explanatory and predictive.
- The classification is explanatory. Why is there a correlation between utilitarian ethics, Bayesian statistics, and Everettian quantum mechanics? Why do those beliefs also correlate with a bunch of personality traits? I explain this in terms of the assumption-error rate and the derivation-error rate.
- The classification is predictive. We can infer which theories will appeal to someone based on their commitments to completely unrelated theories.