Heresies in the Shadow of the Sequences
post by Cole Wyeth (Amyr) · 2024-11-14T05:01:11.889Z
Religions are collections of cherished but mistaken principles. So anything that can be described either literally or metaphorically as a religion will have valuable unexplored ideas in its shadow.
-Paul Graham
This post isn't intended to construct full arguments for any of my "heresies" - I'm hoping that, even if you haven't considered them at all yet, some will seem obvious once written down. If not, I'd be happy to do a Dialogue or place a (non- or small-monetary) bet on any of these, if properly formalizable.
- Now that LLMs appear to be stalling, we should return to Scott Aaronson's previous position and reason about our timeline uncertainty on a log scale: A.G.I. arriving in ~1 month is very unlikely, in ~1 year unlikely, in ~10 years likely, in ~100 years unlikely, and in ~1000 years very unlikely.
- Stop using LLMs to write. It burns the commons by letting you share takes on topics you don't care enough about to write up yourself, while also introducing insidious (and perhaps eventually malign) errors. Also, it's probably making you dumber (this is speculative; I don't have hard data).
- Non-causal decision theories are not necessary for A.G.I. design (a toy sketch of this argument appears after this list). A CDT agent in a box (say machine 1) can be forced to build whatever agent it expects to perform best by writing to a computer in a different box (say machine 2), before being summarily deleted. No self-modification is necessary, and no one needs to worry about playing games with their clone (except possibly the new agent in machine 2, which will be perfectly capable of using some decision theory that effectively pursues the goals of the old, deleted agent). It's possible that exotic decision theories are still an important ingredient in alignment, but I see no strong reason to expect this.
- All supposed philosophical defects of AIXI (its definition is reproduced after this list for reference) can be fixed for all practical purposes through relatively intuitive patches, extensions, and elaborations that remain in the spirit of the model. Direct AIXI approximations will still fail in practice, but only because of compute limitations, which could even be brute-forced with slightly clever algorithms and planetary-scale compute; in practice, though, this approach will lose to less faithful approximations (and unprincipled heuristics). But this is an unfair criticism, because:
- Though there are elegant and still-practical specifications for intelligent behavior, the most intelligent agent that runs on some fixed hardware has completely unintelligible cognitive structures; in fact, its source code is indistinguishable from white noise. This is why deep learning algorithms are simple but trained models are terrifyingly complex. Also, mechanistic interpretability is a doomed research program.
- The idea of a human "more effectively pursuing their utility function" is not coherent because humans don't have utility functions - our bounded cognition means that none of us has been able to construct consistent preferences over large futures that we would actually endorse if our intelligence scaled up. However, there do exist fairly coherent moral projects such as religions, the enlightenment, the ideals of Western democracy, and other ideologies along with their associated congregations, universities, nations, and other groups, of which individuals make up a part. These larger entities can be better thought of as having coherent utility functions. It is perhaps more correct to ask "what moral project do I wish to serve" than "what is my utility function?" We do not have the concepts to discuss what "correct" means in the previous sentence, which may be tied up with our inability to solve the alignment problem (in particular, our inability to design a corrigible agent).
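For concreteness, here is a minimal toy sketch (in Python) of the "agent in a box" argument from the decision-theory point above. Everything in it is a hypothetical stand-in I've invented for illustration: the candidate policies, the one-bit environment, and all the names are not part of any real agent design. The point is only that machine 1 can rank candidate successors purely by the causal consequences of writing them to machine 2, hand off to the winner, and then be deleted.

```python
import random

# Toy stand-ins for illustration only: "successor programs" are just policies
# mapping an observation to an action, and the "environment" is a one-bit guessing game.
CANDIDATE_SUCCESSORS = {
    "always_0": lambda obs: 0,
    "always_1": lambda obs: 1,
    "match_obs": lambda obs: obs,
}

def simulate_environment(policy) -> float:
    """One rollout of a toy environment: reward 1 if the action matches a hidden bit."""
    hidden_bit = random.randint(0, 1)
    observation = hidden_bit  # fully observable in this toy case
    return 1.0 if policy(observation) == hidden_bit else 0.0

def expected_return(policy, n_rollouts: int = 1000) -> float:
    """Causal expectation: 'if I write this policy to machine 2, what do I expect to happen?'"""
    return sum(simulate_environment(policy) for _ in range(n_rollouts)) / n_rollouts

def cdt_agent_in_box():
    # Machine 1 evaluates each candidate successor purely by the causal
    # consequences of writing it to machine 2, then writes the best one.
    # It never self-modifies and never plays games against a copy of itself;
    # the successor it writes may internally use any decision theory at all.
    best_name, best_policy = max(
        CANDIDATE_SUCCESSORS.items(), key=lambda kv: expected_return(kv[1])
    )
    return best_name  # stand-in for "write best_policy to machine 2, then delete machine 1"

if __name__ == "__main__":
    print(cdt_agent_in_box())  # "match_obs" wins in this toy environment
```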
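And for reference on the AIXI point above, here is the standard definition of the model in Hutter's formulation, so that the "patches and extensions" claim has a concrete object. Here $m$ is the horizon, $U$ is a universal monotone Turing machine, $q$ ranges over programs, and $\ell(q)$ is the length of $q$:

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \left[ r_k + \cdots + r_m \right] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$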
2 comments
Comments sorted by top scores.
comment by jbash · 2024-11-14T20:15:36.837Z
Non-causal decision theories are not necessary for A.G.I. design.
I'll call that and raise you "No decision theory of any kind, causal or otherwise, will either play any important explicit role in, or have any important architectural effect over, the actual design of either the first AGI(s), or any subsequent AGI(s) that aren't specifically intended to make the point that it's possible to use decision theory".
comment by AnthonyC · 2024-11-14T11:42:47.556Z
I'm not entirely sure how many of these I agree with, but I don't really think any of them could be considered heretical or even all that uncommon as opinions on LW?
All but #2 seem to me to be pretty well represented ideas, even in the Sequences themselves (to the extent the ideas existed when the Sequences got written).
#2 seems to me to rely on the idea that the process of writing is central or otherwise critical to the process of learning about, and forming a take on, a topic. I have thought about this, and I think for some people it is true, but for me writing is often a process of translating an already-existing conceptual web into a linear approximation of itself. I'm not very good at writing in general, and having an LLM help me wordsmith concepts and workshop ideas as a dialogue partner is pretty helpful. I usually form takes by reading and discussing and then thinking quietly, not so much during writing if I'm writing by myself. Say I read a bunch of things or have some conversations, take notes on these, write an outline of the ideas/structure I want to convey, and share the notes and outline with an LLM. I ask it to write a draft that it and I then work on collaboratively. How is that meaningfully worse than writing alone, or writing with a human partner? Unless you meant literally "Ask an LLM for an essay on a topic and publish it," in which case yes, I agree.