What's next for instrumental rationality?
post by Andrew_Critch · 2022-07-23T22:55:06.185Z · LW · GW · 7 comments
Preceded by: Curating "The Epistemic Sequences" (list v.0.1) [LW · GW]
Epistemic status: speculations and ideations from me about the potential for further progress on broadly accessible instrumental rationality content.
In Curating "The Epistemic Sequences" [LW · GW], I explained how the epistemic content of the LessWrong sequences has a different epistemic status than the instrumental content, i.e., content on how to behave. So what's next for instrumental rationality? It would be great if there were a how-to-behave version of the sequences, built on foundations as strong as logic, probability, statistics, and causal inference.
Unfortunately, those foundations don't yet exist. There aren't formal foundations for decision theory, game theory, ethics, meta-ethics, and political theory that are "tried and true" the way logic, probability, statistics, and causal inference ("L+P+S+C") are.
Many people argue that universally-useful how-to-behave instructions can't exist, on the grounds that "philosophers have been trying to solve these issues for millennia", but I'm not sure that's a strong case. After all, philosophers had been trying to develop truth-seeking techniques for millennia prior to the 20th century, and then along came a bunch of progress in L+P+S+C with widespread applications, enabling what might be called an "epistemic enlightenment (for individuals)", which arguably culminated in the epistemic content of the LessWrong sequences. And perhaps, in the next decade, there could be progress in the theory of embedded agency [? · GW] and multi-agent rationality ("E+M") leading to real-world applications as robustly useful and well-vetted as L+P+S+C are today. If breakthroughs in embedded and multi-agent rationality then remained in practical use for something like 30 years of applications in broad-sweeping domains (or an AI-augmented equivalent of 30 human-civilization-years), the way L+P+S+C have, perhaps that will be a good time for someone to write "The Instrumental Sequences", and a new generation of instrumentally enlightened people will look back and wonder why it was considered impossible to derive a principled account of how individuals should behave.
7 comments
comment by David Gross (David_Gross) · 2022-07-25T13:48:08.202Z · LW(p) · GW(p)
FWIW, I'm trying to create something of a bridge between "the ancient wisdom of people who thought deeply about this sort of thing a long time ago" and "modern social science which with all its limitations at least attempts to test hypotheses with some rigor sometimes" in my sequence on virtues [? · GW]. That might serve as a useful platform from which to launch this new rigorous instrumental rationality guide.
comment by Scott Garrabrant · 2022-07-27T22:51:21.353Z · LW(p) · GW(p)
I love the "E+M" name. It reminds me of electricity and magnetism, and IMO embedded agency and multi-agent rationality will eventually be seen as two sides of the same coin, about as much as electricity and magnetism are.
I think our current best theories of the two don't look much like each other, and I predict that as we make progress on each, they will slowly look more and more like one field.
comment by cubefox · 2022-07-24T21:43:33.424Z · LW(p) · GW(p)
A minor point, but I think decision theory and game theory are actually more tried and true than causal inference. At least they are older than the theory of causal graphs. In particular Richard Jeffrey's decision theory can be seen as an extension of probability theory.
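For reference, the core of that extension is Jeffrey's desirability axiom: alongside the probability measure P, he posits a desirability function V on the same algebra of propositions, such that for mutually exclusive A and B with P(A ∨ B) > 0,

$$V(A \lor B) = \frac{V(A)\,P(A) + V(B)\,P(B)}{P(A) + P(B)},$$

i.e., desirability is a probability-weighted average, built directly on top of the probability calculus.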
Replies from: Andrew_Critch
↑ comment by Andrew_Critch · 2022-07-25T21:23:00.262Z · LW(p) · GW(p)
Re: Jeffrey's decision theory, it's not multi-agent, which is a huge limitation. Otherwise I'd agree with you.
Re: game theory, you're right that it's been around for a while, but it's more "tried and false" than "tried and true". Basically, people in geopolitics (both the study, and the activity) know by now that Nash equilibria and even correlated equilibria are not good models of how powerful entities interact, and psychologists know they're not good models of how individuals interact. (In fact, as early as 1967, Aumann, Harsanyi, Stearns, and others all warned that mathematical models of games were a poor fit for real-world applications, e.g., in their report to government, "Models of Gradual Reduction of Arms".) I believe a contributor to the disconnect between game theory and reality is that real-world states and even individual humans are somewhat translucent to each other, while game theoretic agents aren't. See Halpern & Pass (2018) Game Theory for Translucent Players for a solid attempt at fixing this.
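As a concrete minimal illustration of the mismatch (a toy sketch with standard, assumed Prisoner's Dilemma payoffs): the unique Nash equilibrium of the one-shot Prisoner's Dilemma is mutual defection, while human subjects cooperate a substantial fraction of the time.

```python
# Toy sketch: enumerate Nash equilibria of the one-shot Prisoner's Dilemma.
# Payoff values are the standard textbook ones, assumed for illustration.
from itertools import product

ACTIONS = ["C", "D"]  # Cooperate, Defect
# payoffs[(row_action, col_action)] = (row_payoff, col_payoff)
payoffs = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def is_nash(a, b):
    """True iff neither player gains by unilaterally switching actions."""
    row_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in ACTIONS)
    col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok

print([(a, b) for a, b in product(ACTIONS, ACTIONS) if is_nash(a, b)])
# prints [('D', 'D')], the model's prediction; observed play is often ('C', 'C')
```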
Re: causal inference, it's true that Pearl's causal-graph framing of the topic is relatively new (the late '80s / early '90s), as are some of his theorems (e.g., the v-structure theorem of the '90s), but much of this work just codifies and organizes practices that date back to the 1930s, with structural equation models. This isn't to disparage the value of organizing that stuff, since it paves the way for further theoretical advancement, and it's a big part of what earned Pearl the Turing Award, which I think was very well deserved.
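To illustrate the structural-equation lineage concretely (a hedged sketch with made-up coefficients): in a linear SEM with a confounder, conditioning on X and intervening on X, Pearl's do(X = x), give different answers, and the interventional one recovers the structural coefficient.

```python
# Hedged sketch (coefficients assumed for illustration): a linear structural
# equation model Z -> X, Z -> Y, X -> Y, contrasting conditioning on X with
# intervening on X (Pearl's do-operator).
import numpy as np

rng = np.random.default_rng(0)
N = 200_000

def simulate(do_x=None):
    z = rng.normal(size=N)                    # Z := noise  (confounder)
    x = 2.0 * z + rng.normal(size=N)          # X := 2Z + noise
    if do_x is not None:
        x = np.full(N, do_x)                  # do(X = x): sever the Z -> X edge
    y = x + 3.0 * z + rng.normal(size=N)      # Y := X + 3Z + noise
    return x, y

x, y = simulate()
print(np.polyfit(x, y, 1)[0])      # observational slope, ~2.2 (confounded)

_, y1 = simulate(do_x=1.0)
_, y0 = simulate(do_x=0.0)
print(y1.mean() - y0.mean())       # interventional effect, ~1.0 (structural coefficient)
```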
Replies from: cubefox
↑ comment by cubefox · 2022-07-30T03:41:13.291Z · LW(p) · GW(p)
Regarding game theory: The examples you give are about game theory not describing actual behavior very well. But I assume we want to use game theory here as a theory of (multi-agent instrumental) rationality. So in our case it has to describe how people should interact, not necessarily how they do interact. Right?
Of course, if people do presumably interact rationally in certain cases, while game theory describes something else, then it is both normatively and descriptively inadequate. I'm not sure whether your examples are such cases. But there are others. For example, both game theory and decision theory seem to recommend not voting in a democracy: the former because voting seems to be a prisoner's dilemma, the latter because the expected utility of voting is very low. Voting being irrational seems highly counterintuitive, especially if you haven't already been "brainwashed" with those theories. They seem to miss some sort of Kantian "but what if everyone didn't vote" reasoning. That seems somewhat more excusable for decision theory, since it is not multi-agentic in the first place. But game theory does indeed also seem more "tried and false" to me. Though some would bite the bullet and say voting is in fact irrational.
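To spell out the decision-theoretic half with a toy calculation (all numbers assumed for illustration): if your vote matters only when it breaks an exact tie among N other 50/50 voters, the probability of being pivotal falls off like 1/√N, and the expected utility of voting goes negative for any modest personal stake.

```python
# Toy model (all numbers assumed): expected utility of voting when you are
# pivotal only on an exact tie among N other independent 50/50 voters.
import math

def p_pivotal(n):
    """P(exact tie among n 50/50 voters), via the Stirling approximation."""
    return math.sqrt(2.0 / (math.pi * n))

N = 10_000_000      # other voters
benefit = 1_000.0   # personal value of your side winning
cost = 1.0          # personal cost of voting

print(p_pivotal(N) * benefit - cost)   # ~ -0.75, i.e. "don't vote" on this model
```

Notably, if the benefit is valued altruistically so that it scales with the population, then p_pivotal × benefit grows like √N instead of vanishing, which is one formal gloss on the missing "what if everyone reasoned this way" consideration.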
comment by yrimon (yehuda-rimon) · 2022-07-24T14:31:16.468Z · LW(p) · GW(p)
No Free Lunch means that optimization requires taking advantage of underlying structure in the set of possible environments. In the case of epistemics, we all share close to the same environment (including having similar minds), so there are a lot of universally-useful optimizations for learning about it.
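To make the NFL point concrete, here is a minimal exhaustive check (toy domain made up for illustration): with no restriction on the set of possible functions, no fixed search order finds the maximum faster than any other on average.

```python
# Toy No Free Lunch check: averaged over ALL 16 boolean functions on a
# 4-point domain, every fixed search order takes the same mean number of
# evaluations to find the maximum.
from itertools import product

DOMAIN = (0, 1, 2, 3)

def evals_to_find_max(f, order):
    """Evaluations a fixed search order spends before first hitting f's maximum."""
    best = max(f.values())
    for i, x in enumerate(order, start=1):
        if f[x] == best:
            return i

functions = [dict(zip(DOMAIN, vals)) for vals in product((0, 1), repeat=4)]
for order in [(0, 1, 2, 3), (3, 1, 0, 2)]:   # two arbitrary search strategies
    mean = sum(evals_to_find_max(f, order) for f in functions) / len(functions)
    print(order, mean)                        # 1.6875 for every order
```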
Optimizations over the space of "how-to-behave instructions" require some similar underlying structure. Such structure can emerge for two reasons: (1) because of the shared environment, or (2) because of shared goals. (Yeah, I'm thinking about agents as Cartesian, in the sense of separating the goals and the environment, but to be fair, so do L+P+S+C.)
On the environment side, this leads to convergent behaviours (which can also be thought of as behaviours resulting from selection theorems), like good epistemics, or gaining power over resources.
When it comes to goals, on the other hand, it is both possible (by the orthogonality thesis) and actually the case that different people have vastly different goals (e.g., some people want to live forever, some want to commit suicide, and these two groups probably require mostly different strategies). Less in common between different people's goals means fewer universally-useful how-to-behave instructions. Nonetheless, optimizing behaviours that are commonly prioritized is close enough to universally useful, e.g., doing relationships well.
Perhaps an "Instrumental Sequences" would include the above categories as major chapters. In such a case, as indicated in the post, current research being posted on LessWrong gives an approximate idea of what such sequences could look like.
comment by JBlack · 2022-07-24T07:38:23.881Z · LW(p) · GW(p)
I would be enormously interested in any sort of "universal" multi-agent rationality model, but it seems to me that there is an enormous amount of work that would need to be done to get there. All the existing work that I've read has been extremely narrow, not at all well founded, or both.
Barring superhuman advances in math, I suspect that any such thing is more than a hundred years of widespread concerted research away. I would love to be proven wrong on this!