Notes on Superwisdom & Moral RSI

post by welfvh · 2025-02-28T10:34:54.767Z

These are very preliminary notes, to get the rough ideas out. There's lots of research lying around, a paper in the works, and I'm happy to answer any and all questions.

The North Star of AI alignment, as well as of alignment at large, should be Superwisdom and Moral RSI (Recursive Self-Improvement). Our current notion of human values is too shallow, too static, too corrupted.

Coherent Extrapolated Volition was directionally right: a method for continually extrapolating what we'd want to want if we were wiser, had grown up further, etc. However, this requires a non-arbitrary concept of wisdom and moral progress. I believe a developmentally informed Moral Realism can serve as the foundation for this:

It's not just intelligence that's required for moral convergence; it's human development across the (at least) cognitive, psychological, existential, cultural, and societal dimensions of human life. Developmental psychology and political metamodernism (Hanzi Freinacht et al.) show that this development is not arbitrary, but unfolds in recognizable patterns.

This makes a powerful argument for Moral Realism: there is "goodness" and moral significance baked into reality, and moral competence is a question of optometry, of seeing clearly. This gives significant hope for AI alignment and should inform research agendas: if there is goodness to be seen, we had better prioritize the requirements for seeing, and start the process of training moral agents.

Effectively, any real alignment that deserves the word must include a strong attunement to the good. Moral realism is not at all popular in the Yudkowskian, deep-atheist alignment discourse, for what I think are explainable reasons (developmental imbalances, autism, etc.).

Really, what’s needed is a metamodern axiology built around these insights. Given the explanatory power of developmental psychology, much of philosophy needs to be refactored. Metamodern thought is somewhat recent and fringe. Much more work is waiting to be done.

Superwisdom should be the North Star of alignment, and Moral RSI should be a near-term priority for frontier labs. The conception of wisdom gestured at here offers the seed of an axiological basis for that work.

Relatedly, Chris Leong has written about a "wisdom explosion", and Oliver Klingefjord coined the term "Artificial Super Wisdom".

2 comments

comment by Chris_Leong · 2025-02-28T11:41:03.380Z

I suspect that your post probably isn't going to be very legible to the majority of folks on Less Wrong, since you're assuming familiarity with metamodernism. To be honest, I suspect this post would have been more persuasive if you had avoided mentioning it, since the majority of folks here are likely skeptical of it and it hardly seems essential for making what seems to be the core point of your post[1]. Sometimes less is more. Things cut out can always be explored in the future, when you have the time to explain them in a way that will be legible to readers.

I see the core point that your post is arguing for as the following:

If moral realism is true[2], then this suggests that incorporating it within our attempt at alignment may be easier than avoiding making any assumptions about morality, since understanding morality then becomes about trying to see reality more clearly.

I think this is quite an interesting and reasonable argument and I'd like to see you sketch out in more detail how you think we might be able to leverage it.

  1. ^

    As far as I can tell, but I could be wrong.

  2. ^

    Which it certainly may not be, and indeed there are some quite strong arguments against it.

comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-02-28T11:06:08.209Z

I don't think the idea of Superwisdom / Moral RSI requires Moral Realism. Personally, I am a big fan of research being put into a Superwisdom Agenda, but I don't believe in Moral Realism. In fact, I'd be against a project which had (in my view, harmful and incorrect) assumptions about Moral Realism as a core part of its aims.

So I think you should ask yourself whether this is necessarily part of the Superwisdom Agenda, or if you could envision the agenda being at least agnostic about Moral Realism.