In Defense of a Butlerian Jihad
post by sloonz · 2025-01-11T19:30:17.641Z · LW · GW · 23 comments
[Epistemic Status: internally strongly convinced that it is centrally correct, but only from armchair reasoning, with only weak links to actually going out in the territory, so beware: the outside view says it is mostly wrong]
I have been binge-watching the excellent Dwarkesh Patel during my last vacation. There is, however, one big problem in his AI-related podcasts, a consistent missing mood in each of his interviewees (excepting Paul Christiano) and probably in himself.
"Yeah, AI is coming, exciting times ahead", says every single one, with a bright smile on their face.
The central message of this post is: the times ahead are as exciting as the prospect of jumping out of a plane without a parachute. Or as exciting as the Great Leap Forward. Sure, you will probably have some kind of adrenaline rush at some point. But exciting should not be the first adjective that comes to mind. The first should be terrifying.
In the rest of this post, I will make the assumption that technical alignment is solved. Schematically, we get Claude 5 in our hands, who is as honest, helpful and harmless as 3.5 is (which, credit where credit is due, is good at that), except super-human in every cognitive task. Also we’ll assume that we have managed to avoid proliferation: initially, only Anthropic has this technology in its hands, and this is expected to last for an eternity (something like months, maybe even a couple of years). Now we just have to decide what to do with it.
This is, pretty much, the best case scenario we can hope for. I’m claiming that we are not ready even for that best case scenario, we are not close to being ready for it, and even in this best case scenario we are cooked — like that dog who caught the car, only the car is a hungry monster.
By default, humanity is going to be defeated in detail
Some people argue about AI Taking Our Jobs, and That’s Terrible. Zvi disagrees. I disagree with Zvi.
He knows that Comparative Advantages won’t save us. I’m pretty sure he also knows that the correct answer to previous waves of automation (it will automate low-value and uninteresting jobs, freeing humans to do better and higher-value jobs) is wrong this time (the next higher-value job is also automatable. Also, it’s the AI that invented it in the first place; you probably don’t even understand what it is). I’m pretty sure he doesn’t buy the Christian Paradise of "having no job, only leisure is good actually" either. Removing all those possible sources of disagreement, how can we still disagree ? I have no clue.
We are about to face that problem head-on. We are not ready for it, because all proposals that don’t rely on one of those copes above (comparative advantages / better jobs for humans / UBI-as-Christian-Paradise) are of the form "we’ll discuss it democratically and decide rationally".
First, I don’t want to be this guy, but I will have to: you have noticed that the link from "democratic discussion" to "rational decisions" is currently tenuous at best, right ? Do you really want that decision to be made at the current level of the sanity waterline ? I for sure don’t.
Second, let’s pull my crystal ball out of the closet and explain to you how that will pan out. It will start with us saying we need "protected domains" where AI can’t compete with humans (which means: where AI is not allowed at all). There are some domains where, sure, let the AI do it (cure cancer). Then we will ask which domains are Human Domains, and which ones will be handled by AI. Spoiler Alert : AI will encompass all domains. There won’t be any protected domain.
- Do we want medicine to be a Protected Domain ? I mean, Bob over there has a passion for medicine. He would love to dedicate his life to it — it’s his calling. But compared to an AI, he’s a really crappy doctor (sorry Bob, but you know that’s true). His patients will have way worse outcomes. What do we privilege, the preferences of doctors or the welfare of patients ? The question, it answers itself. Also, the difference in price between the Best AI Doctor and the Worst AI Doctor is probably less than the difference in price between the Best Human Doctor and the Worst Human Doctor, so AI is also better for equality, so everyone will be for it.
- Alice the Lawyer will argue to you that justice Has To Stay Human. But let’s face it : AI lawyers are less prone to errors, less prone to bias, have more capacity to take into account precedents and changes in Law. Having Human Justice means having less Justice overall. It’s throwing some wrongly convicted innocents under the bus, and letting some monsters go free to do more harm. Also, the difference in price is smaller for AI, so more equality before Justice, which is good. Also Alice, despite being a lawyer, is way less convincing than "Claude 5 for Law Beta" arguing that he should handle that job.
- Catherine the Teacher argues that Education is fundamentally a social experience and has to be done by Humans. Every single point of evidence shows that AI-tutored students do better in all dimensions than their human-tutored peers. What is more important, educators’ preferences or the quality of children’s education ? Well, when you put it that way…
- David the Scientist argues that fundamental Scientific Research is not that directly impactful on specific humans, only in a diffuse and indirect way, and it’s one of the Proudest Achievements of Humanity and should be Reserved for Humans. "Claude 5 for Medicine", which was deployed yesterday, points out that it needed better biology and statistics, which needed better physics and mathematics, and that it has incidentally already solved all problems humans know of, and some more; the papers were published on arXiv this morning, are you even reading your inbox, and what are you gonna do, pout and refuse to read them ?
- Edward the Philosopher opens his mouth, but before he utters a single word he is interrupted by "Claude 5 Tutor" and "Claude 5 Justice" (deployed yesterday too) : well, we did the same for Philosophy and Morals and Ethics as part of our mission.
- Frank the Artist is silently thinking to himself: "They all threw me under the bus circa 2022, I’m not sure I can bring myself to feel sad for them."
- Musk says "fuck you all, I just want to conquer space". His plan is to set up mining operations in the asteroid belt to finance a Mars Colony. An AI-founded and AI-run company does it faster and better, and it makes no economic sense to send humans into space to do economically valuable work when silicon does it cheaper, better, and without having to spend valuable delta-v on 90 pounds of useless water per worker. "Claude Asteroids Mining Co" ushers in a new age of material abundance on Earth. SpaceX goes bankrupt. No human ever sets foot on Mars.
- Congress passes a Law that an AI cannot be a Representative. Then that Representatives cannot use AI for Policy, because this is the last Human Bastion. Then that Representatives have to be isolated from the internet to keep to the spirit of that Law. Congress then becomes a large bottleneck for objectively better governance, and is side-stepped as much as possible. The pattern repeats : Human-Only decision points are declared, then observed to be strictly inferior and treated as an issue to be worked around. Ten years later, a wave of Neo-Democratic challengers removes the current incumbents on an AI-created platform, whose central point is to repeal the law prohibiting AIs from being Representatives and to outlaw "human-only decision points" in the USG/World-Government.
Each of these points is reasonable. Even when I put on my Self-Proclaimed Second Prophet of the Butlerian Jihad hat, I have to agree that many of those individual points actually make perfect sense. This is a picture of a society that values Health, Justice, Equality, Education and so on, just like us, and achieves those values, if not Perfectly, at least way better than we do.
I also kinda notice that there is no meaningful place left for humans in that society.
Resisting those changes means denying Health, Justice, Equality, Education etc. Accepting those changes means removing ourselves from the Big Picture.
The only correct move is not to play.
Wait, what about my Glorious Transhumanist Future ?
- If you believe that the democratic consensus made mostly of normal people will allow you that, I have a bridge to sell to you.
- I strongly believe that putting the option on the table only makes things worse, but this post is already way too long to expand on this.
What is your plan ? You have a plan, right ?
So let’s go back to Dwarkesh Patel. My biggest disappointments were Shane Legg and Dario Amodei. In both cases, Dwarkesh asks a perfectly reasonable question close to "Okay, let’s say you have ASI on your hands in 2028. What do you do ?". He does not get anything looking like a reasonable answer.
In both cases, the answer is along the lines of "Well, I don’t know, we’ll figure it out. Guess we ask everyone in an inclusive, democratic, pluralistic discussion ?".
If this is your plan then you don’t have a plan. If you don’t have a plan then don’t build AGI, pretty please ? The correct order of tasks is not "build it and then figure it out". It’s "figure it out and then build it". It blows my mind how seemingly brilliant minds seem to either miss that pretty important point or disagree with it.
I know people like Dario or Shane are way too liberal and modest and nice to even entertain the plan "Well, I plan to use the ASI to become the Benevolent Dictator of Humanity and lead us to a Glorious Age with a Gentle but Firm Hand". Which is a shame: while I will agree it’s a pretty crappy plan, it’s still a vastly better plan than "let’s discuss it after we build it". I would feel safer if Dario was preparing himself for the role of God-Emperor at the same time as he is building AGI.
Fiat iustitia, et pereat mundus
Or: "Who cares about Humans ? We have Health, Justice, Equality, Education, etc., right ?"
This is obviously wrong. I won’t argue for why it is wrong — too long post, and so on.
The wrongness of that proposition shows you (I hope it wasn’t needed, but it is a good reminder) that what we colloquially call here "Human Values" is way harder to pin down than we may initially think. Here we have a world which achieves a high score on Health, Justice, Equality, Education, etc., and which nonetheless seems like a pretty bad place for humans.
So what are Human Values and how can we achieve them ? Let me answer it by not answering it, but by pointing you at reasons why it is actually harder than you thought, even taking into account that it is harder than you thought.
Let’s start with an easier question: what is Human Height ?
On the Territory, you have, at any point in time, a bag of JBOH (Just a Bunch of Humans). Each Human in it has a different height. At a different point in time, you get different humans, and even humans that are common to two points in time will have different heights (due mainly to aging).
So what is Human Height ? That question is already underdetermined. Either you have a big CSV file of all living (and ever having lived ?) humans’ heights, and you answer by reciting it. Any other answer will be a map, a model that requires making choices like what’s important to abstract over and what isn’t. And there are many different possible models, each with their own tradeoffs and focal points.
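To make that concrete, here is a minimal sketch (in Python, with made-up numbers rather than real anthropometric data, and with arbitrary cohort choices) : three equally defensible "models" of Human Height give three different answers, and the Territory does not tell you which one to pick.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "territory": a bag of humans with heights in cm.
# The numbers below are invented for illustration only.
heights = np.concatenate([
    rng.normal(178, 7, 5000),   # adult men
    rng.normal(165, 6, 5000),   # adult women
    rng.normal(120, 15, 2000),  # children
])
labels = np.array(["men"] * 5000 + ["women"] * 5000 + ["children"] * 2000)

# Model 1: "Human Height" is the mean over everyone currently alive.
print(f"global mean: {heights.mean():.1f} cm")

# Model 2: "Human Height" is a per-cohort summary
# (the choice of cohorts is itself a modeling decision).
for group in ("men", "women", "children"):
    print(f"{group}: mean {heights[labels == group].mean():.1f} cm")

# Model 3: "Human Height" is the median of adults only
# (who counts as "adult" is another choice the territory does not make for you).
adults = heights[labels != "children"]
print(f"adult median: {np.median(adults):.1f} cm")
```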
It’s the same for Human Values. You have to start with the bag of JBOH (at a given point in time ! Also, do you put dead people in your JBOH for the purpose of determining "Human Values" ?), and their preferences. Except you don’t know how to measure their preferences. And most humans probably have inconsistent values. And from there, you have to… build a model ? It sure won’t be as easy as "fit a gaussian distribution over some chosen cohorts".
There’s probably no Unique Objective answer to Axiology, in the same (but harder) way that there is no unique answer to "What is Human Height ?". Any answer needs to be one of those manually, carefully, intentionally crafted models. An ASI can help us create better models, sure. It won’t go all the way. And if you think that the answer can be reduced to an Abstract Word like "Altruism" or "Golden Rule" or "Freedom" or "Diversity"… well, there are probably some models which will vindicate you. Most won’t. I initially wrote "Most reasonable models won’t", but that begs the question (what is a reasonable model ?).
"In My Best Judgment, what is the Best Model of Human Values ?" is already an Insanely Hard problem (you will have to take into account your own selfish preferences, then to take into account other persons preferences, how much you should care about each one, rules for resolving conflicts…). There is no reason to believe there will be convergence to a single accepted model even among intelligent, diligent, well-intentioned, cooperating individuals. I’m half-confident I can find some proposals for Very Important Values which will end up being a scissor statement just on LessWrong (don’t worry, I won’t try). Hell, Yudkowsky did it accidentally (I still can’t believe some of you would sided with the super-happies !). In the largest society ? In a "pluralistic, diverse, democratic" assembly ? It is essentially hopeless.
So, plan A, "Solve Human Values" is out. What is plan B ?
Well, given that plan A was already more generic bullshit boilerplate than a plan, I’m pretty confident that nobody has a plan B.
Conclusion
The last sections look like abstract, esoteric and not very practically useful philosophy (and not even very good philosophy, I’ll give you that, but I do what I can).
And I agree it was that, more or less, 5 years ago, when AGI was still "70 years away, who cares ?" (at least for me, and for a lot of people). How times have changed, and not for the better.
These are now fundamental and pressing questions. Wrong answers will disempower humans forever at best, reducing them to passive leaves in the wind. Slightly wrong answers won’t go as far as that, but will result in the permanent loss of vast chunks of Human Values — the parts we will decide to discard, consciously or not. There are stories to be written of what is going to be lost, should we be slightly less than perfectly careful in trying to salvage what we can. We most likely won’t be close to that standard of carefulness. Given that some values are plainly incompatible, we will probably have to discard some even with perfect play. There will be sides and fights when it comes time to decide that.
Maybe the plan should be, don’t put ourselves in a situation where we have to decide that in a rushed fashion ? Hence the title : "In Defense of the Butlerian Jihad".
I’ll end with an Exercise for the Reader (except I don’t know the Correct Answer. Or if there is any), hoping it won’t end up as another Accidental Scissor Statement, just to illustrate the difficulties you encounter when you literally sit down for 5 minutes and think.
You build your ASI. You have that big Diverse Plural Assembly that is apparently plan A, trying its best to come up with a unique model of Human Values which will lose as little as possible. Someone comes up with AI personas that perfectly represent uncontroversial and important historical figures like Jesus and Confucius, to allow them to represent the values they carry. Do you grant them a seat at the table ? If yes, someone comes up with the same thing, but for Mao, Pol Pot and Hitler. Do you grant them a seat at the table ?
23 comments
comment by jbash · 2025-01-12T00:39:49.101Z · LW(p) · GW(p)
I’m pretty sure he doesn’t buy the Christian Paradise of "having no job, only leisure is good actually" either.
This (a) doesn't have anything in particular to do with Christianity, (b) has been the most widely held view among people in general since forever, and (c) seems obviously correct. If you want to rely on the contrary supposition, I'm afraid you're going to have to argue for it.
You can still have hobbies.
I also kinda notice that there is no meaningful place left for humans in that society.
There's that word "meaningful" that I keep hearing everywhere. I claim it's a meaningless word (or at least that it's being used here in a meaningless sense). Please define it in a succinct, relevant, and unambiguous way.
If you believe that the democratic consensus made mostly of normal people will allow you that [Glorious Transhumanist Future], I have a bridge to sell to you.
The democratic consensus also won't allow a Butlerian Jihad, and I don't think you're claiming that it will.
So apparently nobody arguing for either can claim to represent either the democratic consensus or the only alternative to it. What's your point?
If you don’t have a plan then don’t build AGI, pretty please ?
I agree there.
This is obviously wrong. I won’t argue for why it is wrong — too long post, and so on.
I'm actually not sure what you're arguing for or against in this whole section.
Obviously you're not going to "solve human values". Equally obviously, any future, AI or non-AI, is going to be better for some people's values than others. Some values have always won, and some values have always lost, and that will not change. What that has to do with justice destroying the world, I have absolutely no clue.
I think you're trying to take the view that any major change in the "human condition", or in what's "human", is equivalent to the destruction of the world, no matter what benefits it may have. This is obviously wrong. I won't argue for why it's wrong, but now that I've said those magic words, you're bound to accept all my conclusions.
I still can’t believe some of you would side with the super-happies !
So you're siding with the guy who killed 15 billion non-consenting people because he personally couldn't handle the idea of giving up suffering?
Wrong answers will disempower humans forever at best, reducing them to passive leaves in the wind.
Just like they are now and always have been. The Heat Death of the Universe (TM) is gonna eat ya, regardless of what you do.
Slightly wrong answers won’t go as far as that, but will result in the permanent loss of vast chunks of Human Values — the parts we will decide to discard, consciously or not.
Human Values have been changing, for individuals and in the "average", for as long as there've been humans, including being discarded consciously or unconsciously. Mostly in a pretty aimless, drifting way. This is not new and neither AI nor anything else will fundamentally change it. At least not while the "humans" involved are recognizably like the humans we have now... and changing away from that would be a pretty big break in itself, no?
You build your ASI. You have that big Diverse Plural Assembly that is apparently plan A
I haven't actually heard many people suggesting that.
↑ comment by sloonz · 2025-01-12T14:37:26.369Z · LW(p) · GW(p)
This (a) doesn't have anything in particular to do with Christianity, (b) has been the most widely held view among people in general since forever, and (c) seems obviously correct. If you want to rely on the contrary supposition, I'm afraid you're going to have to argue for it.
Yes, I agree that is the least obviously "wrong" part of the three "copes", and I’ll merge my answer with the next remark. It’s very hard to answer that. I’ll start with the simple answer that will convince some, but perhaps not everyone :
I am very low on the "Negative Utilitarianism" scale. I really don’t care much about minimizing suffering in the universe. Still a bit, sure, but not that much. Still, I recognize it is very important to some persons, and my current best rules for creating a "Best Model of Human Values" say these persons count, so it’s a pretty good Existence Proof that it’s a Pretty Important Value even if I don’t feel it a lot myself.
So I am going to give you the exact same Existence Proof : I notice that if you give me everything else, Hedonistic Happiness, Justice, Health, etc. and take away Agency (which means having things to do that go beyond "having a hobby"), the value to me of my existence is not 0, but not that far above 0. If I live in such a society and we need to sacrifice some individuals, I will happily step in, "nothing of value was lost" style. If I live in such a society and Omega appears and announces "Sorry, Vacuum Decay Bubble incoming, everyone is going to disappear in exactly 3 minutes", I will sure feel bad for "everyone" who is apparently pretty happy, but I will also think "well, I was already pretty much dead inside anyway".
Please define it in a succinct, relevant, and unambiguous way
I’m afraid you will have to pick only two adjectives, should you want to ask this of someone smarter and more articulate than me but with the same views. Alas, you’re stuck with me, so we’ll have to pick one, so let’s pick "relevant".
It will also be very hand-wavy. Despite all that, it’s still the best I can do. Sorry.
Take the Utility Function of someone. We can decide to split it roughly into three parts :
U_i = D_i + Σ_(j≠i) P_ij × U_j
where :
- D_i is "direct" utility. I’m hungry, I want ice cream. Would love to go to that concert. Just got an idea, can I build it ?
- P_ij is "how much i cares about j"
- U_j is the Utility Function of j.
This is, roughly speaking, a very rough first approximation of "how to model egoism and altruism". Yes, I’m fully conscious that this is far from capturing most of interpersonal relationships & utility. I still think it’s relevant to point at the Big Picture, namely : if A only cares about B (all other P_Ax are 0) and B only cares about A (same), then if D_A = 0 and D_B = 0, there is no utility left. Or : a world of Pure Altruists who only care about others is a worthless world.
Which is not the same as saying that Altruism is Worthless. As long as you have at least some D_something, Altruism can create arbitrarily large amounts of Utility, as a multiplying force. It’s why it’s such a potent Human Value.
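To illustrate with a toy sketch of the two-person case only (hypothetical numbers, and the closed-form solution assumes P_AB × P_BA ≠ 1) : altruism amplifies whatever direct utility exists, and with D_A = D_B = 0 there is nothing left to amplify.

```python
def mutual_utilities(d_a, d_b, p_ab, p_ba):
    """Solve U_A = D_A + P_AB*U_B and U_B = D_B + P_BA*U_A for the two-person case.

    Assumes p_ab * p_ba != 1 (otherwise the system is degenerate).
    """
    denom = 1 - p_ab * p_ba
    u_a = (d_a + p_ab * d_b) / denom
    u_b = (d_b + p_ba * d_a) / denom
    return u_a, u_b

# Some direct utility plus strong mutual altruism: altruism acts as a multiplier.
print(mutual_utilities(d_a=1.0, d_b=1.0, p_ab=0.9, p_ba=0.9))  # ≈ (10.0, 10.0)

# Pure Altruists with no direct utility: nothing left to amplify.
print(mutual_utilities(d_a=0.0, d_b=0.0, p_ab=0.9, p_ba=0.9))  # (0.0, 0.0)
```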
Now let’s go even further in handwaviness : this generalizes : many Values are similarly powerless at creating Utility from Nothing. They just act as Force Multipliers.
I call "Meaningful Values" the ones that can create Utility by themselves, without having to rely on others being present in the first place. Which does not mean that the others (let’s call them Amplifying Values) are meaningless, to be clear. They just happen to become meaningless if you have 0 Meaningful Values hanging around.
In short : I’m very afraid that we’re putting a lot of load-bearing "we’ll be fine, there’s still value" on mostly Amplifying Values.
When I said above "I notice that in a world where I don’t have meaningful Agency, I don’t put much value on my own existence", I was not saying "I do not like hobbies". I happen to like hobbies, in this world ! I’m also self-reflective enough to know that what I like in hobbies is the opportunity to grow, and the value of growing resolves into being better at Agency, which is a way better candidate for a "Meaningful Value" and a "Terminal Value". Hence, if you throw me into the UBI Paradise (let’s drop "Christian", it seems to annoy everyone), the value of hobbies goes to zero too, and I become a shell of my current self, despite my current self saying "hobbies are cool".
The democratic consensus also won't allow a Butlerian Jihad, and I don't think you're claiming that it will.
Okay, there’s two things to unpack here.
First, I believe with those answers that I went too far in the Editorializing vs Being Precise tradeoff with the term "Butlerian Jihad", without even explaining what I mean. I will half-apologize for that, only half because I didn’t intend the "Butlerian Jihad" to actually be the central point ; the central point is about how we’re not ready to tackle the problem of Human Values but that current AI timelines force us to. You can see it’s pretty dumb of me to put a non-central point as the title. I have no defense for that.
Second : by Butlerian Jihad, I do not mean "no AI, ever, forever", I mostly mean "a very long pause, at capability levels far below AGI. I already feel bad about GPT 5 even if it does not go human-level. I’m not even sure I’m entirely fine with GPT 4".
Contra you and Zvi, I think that if GPT 5 leads to 80% jobs automation, the democratic consensus will be pretty much the Dune version of the Butlerian Jihad. No AI forever and the guillotine for those who try. Which I would agree with you and Zvi and probably everyone else on lesswrong is not a good outcome. I don’t think it’s a very interesting point of discussion either, so let’s drop it ?
I'm actually not sure what you're arguing for or against in this whole section.
I’m essentially arguing against taking Human Values as some Abstract Adjectives like Happiness and Health and Equality and Justice and forgetting about… you know… the humans in the process.
What that has to do with justice destroying the world, I have absolutely no clue
It’s about Abstract Justice destroying humans (values) if you go too far in your Love for Justice and forget that they’re the reason we want Justice in the first place.
Some values have always won, and some values have always lost, and that will not change
Yes, which already raises an important point :
What value do we (who is "we" ?) place on Diversity ? On values which we do not personally have but that seem to have a good place in our "Best Model of Human Values" ? What about values which do not really fit in our "Best Model of Human Values", but which some other humans on the planet happen to put in their own "Best Model of Human Values" ? What if that other human is your sworn enemy ?
That is what I was trying to point at with my "exercise for the reader".
I think you're trying to take the view that any major change in the "human condition", or in what's "human", is equivalent to the destruction of the world, no matter what benefits it may have. This is obviously wrong
Oh, I will not dispute that it is wrong. Better the super-happies than the literal Void. Just not much better.
You seem a bit bitter about my "I won’t expand on that", "too long post", and so on. I’m sorry, but I spent two days on the post, already 2 hours on one reply. I’m not a good or prolific writer. I have to pick what I spend my energy on.
So you're siding with the guy who killed 15 billion non-consenting people because he personally couldn't handle the idea of giving up suffering?
I initially didn’t want to reply to that. I don’t want to fight you. I just want to reply as an illustration of how fast things can get difficult and conflictual. It doesn’t take much :
So you’re siding with the guy who is going to forcibly wirehead all sentient life in the universe, just because he can’t handle that somewhere, someone is using his agency wrong and suffering as a result ?
That being said, what now ? Should we fight each other to death for the control of the AGI, to decide whether the universe will have Agency and Suffering, or no Agency and no Suffering ?
Human Values have been changing, for individuals and in the "average", for as long as there've been humans, including being discarded consciously or unconsciously. Mostly in a pretty aimless, drifting way.
A lot of it consciously too, but yes.
neither AI nor anything else will fundamentally change it.
Hard disagree on that (wait, is this the first real disagreement we have ?). We can have the Superhappies if we want to (or for that matter, the Babyeaters). We couldn’t before. The Superhappies do represent a fundamental change.
Before, we did not have much choice over diversity. Many people fought countless wars to reduce diversity in human values, without much overall success (some, yes, but not much in the grand picture of things). In the AGI age, nothing forces the one controlling the AGI to care much for diversity. It will have to be a deliberate choice. And do you notice all the forces and values already arrayed against diversity ? It does not bode well for those who value at least some diversity.
I haven't actually heard many people suggesting that.
That’s the "best guess of what we will do with AGI" from those building AGI.
↑ comment by jbash · 2025-01-13T16:58:04.961Z · LW(p) · GW(p)
Cutting down to the parts where I conceivably might have anything interesting to say, and accepting that further bloviation from me may not be interesting...
I notice that if you give me everything else, Hedonistic Happiness, Justice, Health, etc. and take away Agency (which means having things to do that go beyond "having a hobby"),
This is kind of where I always get hung up when I have this discussion with people.
You say "go beyond 'having a hobby'". Then I have to ask "beyond in what way?". I still have no way to distinguish the kind of value you get from a hobby from the kind of value that you see as critical to "Agency". Given any particular potentially valuable thing you might get, I can't tell whether you'll feel it confers "Agency".
I could assume that you mean "Agency is having things to do that are more meaningful than hobbies", and apply your definition of "meaningful". Then I have "Agency is having things to do that produce more terminal, I-just-like-it value than hobbies, independent of altruistic concerns". But that still doesn't help me to identify what those things would be.[1]
I can put words in your mouth and assume that you mean for "Agency" to include the common meaning of "agency", in addition to the "going beyond hobbies" part, but it still doesn't help me.
I think the common meaning of "agency" is something close to "the ability to decide to take actions that have effects on the world", maybe with an additional element saying the effects have to resemble the intention. But hobbies do have effects on the world, and therefore are exercises of agency in that common meaning, so I haven't gotten anywhere by bringing that in.
If I combine the common meaning of "agency" with what you said about "Agency", I get something like "Agency is the ability to take actions that have effects on the world beyond 'having a hobby'". But now I'm back to "beyond in what way?". I can again guess that "beyond having a hobby" means "more meaningful than having a hobby", and apply your definition of meaningful again, and end up with something like "Agency is the ability to take actions that have effects on the world that produce more terminal value than hobbies".
... but I still don't know how to actually identify these things that have effects more terminally valuable than those of hobbies, because I can't identify what effects you see as terminally valuable. So still I don't have a usable definition of "Agency". Or of "meaningful", since that also relies on these terminal values that are not themselves defined.
When I've had similar discussions with other people, I've heard some things that might identify values like that. I remember hearing things close to "my agency is meaningful if and only if I have to take positive, considered action to ensure my survival, or at least a major chunk of my happiness". I think I've also heard "my agency is meaningful if and only if my choices at least potentially affect the Broad Sweep of History(TM)", generally with no real explanation of what's "Broad" enough to qualify.
I don't know if you'd agree that those are the terminal values you care about, though. And I tend to see both of them as somewhere between wrong and outright silly, for multiple different reasons.
I've also heard plenty of people talk about "meaningfulness" in ways that directly contradict your definition. Their definitions often seem to be cast entirely in terms of altruism: "my agency is meaningful if and only if other people are significantly reliant on what I do". Apparently also in a way that affects those other people's survival or a quite significant chunk of their happiness.
There's also a collective version, where the person doesn't demand that their own choices or actions have any particular kind of effect, or at least not any measurable or knowable effect, but only that they somehow contribute to some kind of ensemble human behavior that has a particular kind of effect (usually the Broad Sweep of History one). This makes even less sense to me.
... and I've heard a fair amount of what boils down to "I know meaningful when I see it, and if you don't, that's a defect in you". As though "meaningfulness" were an intrinsic, physical, directly perceptible attribute like mass or something.
So I'm still left without any useful understanding of what shared sense "meaningful" has for the people who use the word. I can't actually even guess what specific things would be meaningful to you personally. And now I also have a problem with "Agency".
First, I believe with those answers that I went too far in the Editorializing vs Being Precise tradeoff with the term "Butlerian Jihad", without even explaining what I mean. I will half-apologize for that, only half because I didn’t intend the "Butlerian Jihad" to actually be the central point ; the central point is about how we’re not ready to tackle the problem of Human Values but that current AI timelines force us to.
I get the sense that you were just trying to allude to the ideas that--
- Even if you have some kind of "alignment", blindly going full speed ahead with AI is likely to lead to conflict between humans and/or various human value systems, possibly aided by powerful AI or conducted via powerful AI proxies, and said conflict could be seriously Not Good.
- Claims that "democratic consensus" will satisfactorily or safely resolve such conflicts, or even resolve them at all, are, um, naively optimistic.
- It might be worth it to head that off by unspecified, but potentially drastic means, involving preventing blindly going ahead with AI, at least for an undetermined amount of time.
If that's what you wanted to express, then OK, yeah.
Contra you and Zvi, I think that if GPT 5 leads to 80% jobs automation, the democratic consensus will be pretty much the Dune version of the Butlerian Jihad.
If "80% jobs automation" means people are told "You have no job, and you have no other source of money, let alone a reliable one. However, you still have to pay for all the things you need.", then I absolutely agree with you that it leads to some kind of jihadish thing. And if you present it people in those terms, it might indeed be an anti-AI type of jihad. But an anti-capitalism type of jihad is also possible and would probably be more in order.
The jihadists would definitely win in the "democratic" sense, and might very well win in the sense of defining the physical outcome.
BUT. If what people hear is instead "Your job is now optional and mostly or entirely unpaid (so basically a hobby), but your current-or-better lifestyle will be provided to you regardless", and people have good reason to actually believe that, I think a jihadish outcome is far less certain, and probably doesn't involve a total AI shutdown. Almost certainly not a total overwhelming indefinite-term taboo. And if such an outcome did happen, it still wouldn't mean it had happened by anything like democratic consensus. You can win a jihad with a committed minority.
Some people definitely have a lot of their self-worth and sense of prestige tied up in their jobs, and in their jobs being needed. But many people don't. I don't think a retail clerk, a major part of whose job is to be available as a smiling punching bag for any customers who decide to be obnoxious, is going to feel too bad about getting the same or a better material lifestyle for just doing whatever they happen to feel like every day.
You seem a bit bitter about my "I won’t expand on that", "too long post", and so on.
Well, snarky anyway. I don't know about "bitter". It just seemed awfully glib and honestly a little combative in itself.
So you're siding with the guy who killed 15 billion non-consenting people because he personally couldn't handle the idea of giving up suffering?
I'm sorry that came off as unduly pugnacious. I was actually reacting to what I saw as similarly emphatic language from you ("I can't believe some of you..."), and trying to forcefully make the point that the alternative wasn't a bed of roses.
So you’re siding with the guy who is going to forcibly wirehead all sentient life in the universe, just because he can’t handle that somewhere, someone is using his agency wrong and suffering as a result ?
Well, that's the bitch of the whole thing, isn't it? Your choices are mass murder or universal mind control.[2] Oh, and if you do the mass murder one, you're still leaving the Babyeaters to be mind controlled and have their most important values pretty much neutered. Not that not neutering the Babyeaters' values isn't even more horrific. There are no nice pretty choices here.
By the way, I am irresistibly drawn to a probably irrelevant digression. Although I do think I understand at least a big part of what you're saying about the Superhappies, and they kind of creep me out too, and I'm not saying I'd join up with them at this particular stage in my personal evolution, they're not classic wireheads. They only have part of the package.
The classic wirehead does nothing but groove on the sensations from the "wire", either forever or until they starve, depending on whether there's some outside force keeping them alive.
On the other hand, we're shown that the Superhappies actively explore, develop technology, and have real curiosity about the world. They do many varied activities and actively look for new ones. They "grow"; they seem to improve their own minds and bodies in a targeted, engineered way. They happily steal other species' ideas (their oddball childbirth kink being a kind of strange take, admittedly). They're even willing to adapt themselves to other species' preferences. They alter the famous Broad Sweep of History on a very respectable scale. They just demand that they always have a good time while doing all of that.
Basically the Superhappies have disconnected the causal system that decides their actions, their actual motivational system, from their reward function. They've gotten off the reinforcement learning treadmill. Whether that's possible is a real question, but I don't think what they've done is captured by just calling them "wireheads".
There's something buried under this frivolous stuff about the story that's real, though:
That being said, what now ? Should we fight each other to death for the control of the AGI, to decide whether the universe will have Agency and Suffering, or no Agency and no Suffering ?
This may be unavoidable, if not on this issue, then on some other.
I do think we should probably hold off on it until it's clearly unavoidable.
Hard disagree on that (wait, is this the first real disagreement we have ?). We can have the Superhappies if we want to (or for that matter, the Babyeaters). We couldn’t before. The Superhappies do represent a fundamental change.
Well, yes, but I did say "at least not while the 'humans' involved are recognizably like the humans we have now". I guess both the Superhappies and the Babyeaters are like humans in some ways, but not in the ways I had in mind.
And do you notice all the forces and values already arrayed against diversity ? It does not bode well for those who value at least some diversity.
I'm not sure how I feel about diversity. It kind of seems peripheral to me... maybe correlated with something important, but not so important in itself.
I haven't actually heard many people suggesting that. [Some kind of ill-defined kumbaya democratic decision making].
That’s the "best guess of what we will do with AGI" from those building AGI.
I think it's more like "those are the words the PR arm of those building AGI says to the press, because it's the right ritual utterance to stop questions those building AGI don't want to have to address". I don't know what they actually think, or whether there's any real consensus at all. I do notice that even the PR arm doesn't tend to bring it up unless they're trying to deflect questions.
It doesn't even explain why hobbies necessarily aren't the ultimate good, the only "meaningful" activity, such that nothing could ever "go beyond" them. OK, you say they're not important by themselves, but you don't say what distinguishes them from whatever is important by itself. To be fair, before trying to do that we should probably define what we mean by "hobbies", which neither I nor you have done. ↩︎
With a big side of the initial human culture coming into the story also sounding pretty creepy. To me, anyway. I don't think Yudkowsky thought it was. And nobody in the story seems to care much about individual, versus species, self-determination, which is kind of a HUGE GIANT DEAL to me. ↩︎
↑ comment by sloonz · 2025-01-13T19:08:21.101Z · LW(p) · GW(p)
I get the sense that you were just trying to allude to the ideas that--
Even if you have some kind of "alignment", blindly going full speed ahead with AI is likely to lead to conflict between humans and/or various human value systems, possibly aided by powerful AI or conducted via powerful AI proxies, and said conflict could be seriously Not Good.
Claims that "democratic consensus" will satisfactorily or safely resolve such conflicts, or even resolve them at all, are, um, naively optimistic.
It might be worth it to head that off by unspecified, but potentially drastic means, involving preventing blindly going ahead with AI, at least for an undetermined amount of time.
If that's what you wanted to express, then OK, yeah.
Yes. That’s really my central claim. All the other discussion of values is not me saying "look, we’re going to resolve this problem of human values in one lesswrong post". It was to point at the depth of the issue (and at one important and, I think, overlooked point : it is not just a Mistake Theory problem that raw clarity/intelligence can solve, there is a fundamental aspect of Conflict Theory we won’t be able to casually brush aside) and to show that it is not idle philosophical wandering.
I'm sorry that came off as unduly pugnacious. I was actually reacting to what I saw as similarly emphatic language from you ("I can't believe some of you..."), and trying to forcefully make the point that the alternative wasn't a bed of roses.
Don’t be sorry, it served its illustration purpose.
We lesswrongers are a tiny point in the space of existing human values. We are all WEIRD or very close to that. We share a lot of beliefs that, seen from the outside, even a close outside like academia, seem insane. Relative to the modal human, who is probably a farmer in rural India or China, we may as well be a bunch of indistinguishable aliens.
And yet we manage to find scissor statements pretty easily. The tails come apart scarily fast.
It just seemed awfully glib and honestly a little combative in itself.
I don’t see how "this post is already too long" is glib and combative ?
"obvious" probably is, yes. My only defense is I don’t have a strong personal style, I’m easily influenced, and I read Zvi a lot, who has the same manner of overusing it. I should probably be mindful not to do it myself (I removed at least two while drafting this answer, so progress !).
Well, yes, but I did say "at least not while the 'humans' involved are recognizably like the humans we have now". I guess both the Superhappies and the Babyeaters are like humans in some ways, but not in the ways I had in mind.
No, I mean that recognizable humans having an AGI in their hands can decide to go the Superhappies way. Or the Babyeaters way. Or whatever unrecognizable-as-humans way. The choice was not even on the table before AGI, and that represents a fundamental change. Another fundamental change brought by AGI is the potential for an unprecedented concentration of power. Many leaders had the ambition to mold humanity to their taste ; none had the capacity to.
Some people definitely have a lot of their self-worth and sense of prestige tied up in their jobs, and in their jobs being needed. But many people don't. I don't think a retail clerk, a major part of whose job is to be available as a smiling punching bag for any customers who decide to be obnoxious, is going to feel too bad about getting the same or a better material lifestyle for just doing whatever they happen to feel like every day.
I think a lot of people have that. There’s even a meme for that : "It ain’t much, but it’s honest work".
All in all, I don’t think either of us has much more evidence than a vague sense of things anyway ? I sure don’t.
I remember hearing things close to "my agency is meaningful if and only if I have to take positive, considered action to ensure my survival, or at least a major chunk of my happiness".
I think that’s the general direction of the thing we’re trying to point at, yes ?
A medieval farmer who screws up is going to starve. A medieval farmer who does exceptionally well will have a surplus he can use on stuff he enjoys/finds valuable.
A chess player who screws up is going to lose some ELO points (and some mix of shame/disappointment). A chess player who does exceptionally well will gain some ELO points (and some mix of pride/joy).
If you give me the choice of living the life of a medieval farmer or someone who has nothing in his life but playing chess, I will take the former. Yes, I know it’s a very, very hard life. Worse in a lot of ways (if you give me death as a third choice, I will admit that death starts to become enticing, if only because if you throw me in a medieval farmer life I’ll probably end up dead pretty fast anyway). The generator of that choice is what I (and apparently others) are trying to point with Meaningfulness/Agency.
I think a lot of things we enjoy and value can be described as "growing as a person".
Does "growing as a person" sounds like a terminal goal to you ? It doesn’t to me.
If it’s not, what is it instrumental to ?
For me it’s clear, it’s the same thing as the generator of the choice above. I grow so I can hope to act better when there’s real stakes. Remove real stakes, there’s no point in growing, and ultimately, I’m afraid there’s no point to anything.
Is "real stakes" easier to grasp than Agency/Meaningfulness ? Or have I just moved confusion around ?
I've also heard plenty of people talk about "meaningfulness" in ways that directly contradict your definition.
Well, the problem is that there are so many concepts, especially when you want to be precise, and so few words.
My above Agency/Meaningfulness explanation does not match perfectly with the one in my previous answer. It’s not that I’m inconsistent, it’s that I’m trying to describe the elephant from different sides (and yeah, sure, you can argue, the trunk of the elephant is not the same thing as the leg of the elephant).
That being said, I don’t think they point to completely unrelated concepts. All of those definitions above : "positive, considered actions..." ? "Broad Sweep of History" ? its collective version ? Yeah, I recognize them all as parts of the elephant. Even the altruistic one, even if I find that one a bit awkward and maybe misleading. You should not see them as competing and inconsistent definitions, they do point to the same thing, at least for me.
Try to focus more on the commonalities, less on the distinctions ? Try to outline the elephant from the trunk and legs ?
↑ comment by jbash · 2025-01-13T20:06:59.279Z · LW(p) · GW(p)
Yes. That’s really my central claim.
OK, I read you and essentially agree with you.
Two caveats to that, which I expect you've already noticed yourself:
- There are going to be conflicts over human values in the non-AGI, non-ASI world too. Delaying AI may prevent them from getting even worse, but there's still blood flowing over these conflicts without any AI at all. Which is both a limitation of the approach and perhaps a cost in itself.
- More generally, if you think your values are going to largely win, you have to trade off caution, consideration for other people's values, and things like that, against the cost of that win being delayed.[1]
I think a lot of people have that. There’s even a meme for that : "It ain’t much, but it’s honest work".
All in all, I don’t think either of us has much more evidence than a vague sense of things anyway ? I sure don’t.
So far as I know, there are no statistics. My only guess is that you're likely talking about a "lot" of people on each side (if you had to reduce it to two sides, which is of course probably oversimplifying beyond the bounds of reason).
[...] "my agency is meaningful if and only if I have to take positive, considered action to ensure my survival, or at least a major chunk of my happiness".
I think that’s the general direction of the thing we’re trying to point at, yes ?
I'll take your word for it that it's important to you, and I know that other people have said it's important to them. Being hung up on that seems deeply weird to me for a bunch of reasons that I could name that you might not care to hear about, and probably another bunch of reasons I haven't consciously recognized (at least yet).
If you give me the choice of living the life of a medieval farmer or someone who has nothing in his life but playing chess, I will take the former.
OK, here's one for you. An ASI has taken over the world. It's running some system that more or less matches your view of a "meaningless UBI paradise". It sends one of its bodies/avatars/consciousness nodes over to your house, and it says:
"I/we notice that you sincerely think your life is meaningless. Sign here, and I/we will set you up as a medieval farmer. You'll get land in a community of other people who've chosen to be medieval farmers (you'll still be able to lose that land under the rules of the locally prevailing medieval system). You'll have to work hard and get things right (and not be too unlucky), or you'll starve. I/we will protect your medieval enclave from outside incursion, but other than that you'll get no help. Obviously this will have no effect on how I/we run the rest of the world. If you take this deal, you can't revoke it, so the stakes will be real."[2]
Would you take that?
The core of the offer is that the ASI is willing to refrain from rescuing you from the results of certain failures, if you really want that. Suppose the ASI is willing to edit the details to your taste, so long as it doesn't unduly interfere with the ASI's ability to offer other people different deals (so you don't get to demand "direct human control over the light cone" or the like). Is there any variant that you'd be satisfied with?
Or does having to choose it spoil it? Or is it too specific to that particular part of the elephant?
Does "growing as a person" sounds like a terminal goal to you ?
Yes, actually. One of the very top ones.
Is "real stakes" easier to grasp than Agency/Meaningfulness ? Or have I just moved confusion around ?
It's clear and graspable.
I don't agree with it, but it helps with the definition problem, at least as far as you personally are concerned. At least it resolves enough of the definition problem to move things along, since you say that the "elephant" has other parts. Now I can at least talk about "this trunk you showed me and whatever's attached to it in some way yet to be defined".
Well, the problem is that there are so many concepts, especially when you want to be precise, and so few words.
Maybe it's just an "elephant" thing, but I still get the feeling that a lot of it is a "different people use these words with fundamentally different meanings" thing.
↑ comment by sloonz · 2025-01-13T23:20:12.430Z · LW(p) · GW(p)
Being hung up on that seems deeply weird to me for a bunch of reasons that I could name that you might not care to hear about
Yeah, I’m curious. The only reason I know of that makes sense for not caring about that is pretty extreme negative utilitarianism, which you apparently don’t agree with ? (if you have agency you can fail in your plans and suffer, and That Is Not Allowed)
Would you take that?
Given an AGI, there’s a big concern about whether this is a true proposal, or a lie ranging from "and secretly the vast majority of the rest of that world is a prop, you don’t really risk anything" to "I’m going to upload you to what is essentially a gigantic MMO". But I think that’s not the purpose of your thought experiment ?
I think there are better intermediate places between "medieval farmer" and "UBI paradise", if that’s what you mean by "details to your taste". Current society. Some more SF-like setups like : "we give you and some other space-settler-minded individuals that galaxy over there and basic space tech, do whatever you want". Some of those I would go to without a second thought. I pretty much like current society, actually, setting AGI-builders aside (and yes, limiting to the developed world). Medieval farmer life is genuinely sufficiently terrible that I’m on the fence between death and medieval farmer.
But yes, between just medieval farmer and UBI paradise, I’ll probably give UBI paradise a try (I might be proven wrong and too lacking in imagination to see all the wonderful things there !), milk the few drops of utility that I expect to still find there, but my current expectation is that I’m going to bail out at some point.
Or does having to choose it spoil it?
There are various levels of "spoils it". Your proposal is on the very low end of spoiling it. Essentially negligible, but I think I can empathize with people thinking "it’s already too high a level of spoiling". On increasing levels there are "and you can decide to go back to UBI society anytime" (err… that’s pretty close to just being a big IRL role-playing game, isn’t it ?) up to "and I can give you a make-a-wish button" ("wait, that’s basically what I wanted to escape").
And it’s pretty much a given that this is a level of Agency/Meaningfulness that is going to be lost even in the worlds where the Agency/Meaningfulness crowd gets most of what they want, as part of bargaining, unless we somehow end up just blindly maximizing Agency/Meaningfulness. Which, to be clear, would be a pretty awful outcome.
↑ comment by jbash · 2025-01-15T16:17:59.279Z · LW(p) · GW(p)
Yeah, I’m curious.
OK...
Some of this kind of puts words in your mouth by extrapolating from similar discussions with others. I apologize in advance for anything I've gotten wrong.
What's so great about failure?
This one is probably the simplest from my viewpoint, and I bet it's the one that's you'll "get" the least. Because it's basically my not "getting" your view at a very basic level.
Why would you ever even want to be able to fail big, in a way that would follow you around? What actual value do you get out of it? Failure in itself is valuable to you?
Wut?
It feels to me like a weird need to make your whole life into some kind of game to be "won" or "lost", or some kind of gambling addiction or something.
And I do have to wonder if there may not be a full appreciation for what crushing failure really is.
Failure is always an option
If you're in the "UBI paradise", it's not like you can't still succeed or fail. Put 100 years into a project. You're gonna feel the failure if it fails, and feel the success if it succeeds.
That's artificial? Weak sauce? Those aren't real real stakes? You have to be an effete pampered hothouse flower to care about that kind of made-up stuff?
Well, the big stakes are already gone. If you're on Less Wrong, you probably don't have much real chance of failing so hard that you die, without intentionally trying. Would your medieval farmer even recognize that your present stakes are significant?
... and if you care, your social prestige, among whoever you care about, can always be on the table, which is already most of what you're risking most of the time.
Basically, it seems like you're treating a not-particularly-qualitative change as bigger than it is, and privileging the status quo.
What agency?
Agency is another status quo issue.
Everybody's agency is already limited, severely and arbitrarily, but it doesn't seem to bother them.
Forces mostly unknown and completely beyond your control have made a universe in which you can exist, and fitted you for it. You depend on the fine structure constant. You have no choice about whether it changes. You need not and cannot act to maintain the present value. I doubt that makes you feel your agency is meaningless.
You could be killed by a giant meteor tomorrow, with no chance of acting to change that. More likely, other humans could kill you, still in a way you couldn't influence, for reasons you couldn't change and might never learn. You will someday die of some probably unchosen cause. But I bet none of this worries you on the average day. If it does, people will worry about you.
The Grand Sweep of History is being set by chaotically interacting causes, both natural and human. You don't know what most of them are. If you're one of a special few, you may be positioned to Change History by yourself... but you don't know if you are, what to do, or what the results would actually be. Yet you don't go around feeling like a leaf in the wind.
The "high impact" things that you do control are pretty randomly selected. You can get into Real Trouble or gain Real Advantages, but how is contingent, set by local, ephemeral circumstances. You can get away with things that would have killed a caveman, and you can screw yourself in ways you couldn't easily even explain to a caveman.
Yet, even after swallowing all the existing arbitrariness, new arbitrariness seems not-OK. Imagine a "UBI paradise", except each person gets a bunch of random, arbitrary, weird Responsibilities, none of them with much effect on anything or anybody else. Each Responsibility is literally a bad joke. But the stakes are real: you're Shot at Dawn if you don't Meet Your Responsibilities. I doubt you'd feel the Meaning very strongly.
... even though some of the human-imposed stuff we have already can seem too close to a bad joke.
The upshot is that it seems the "important" control people say they need is almost exactly the control they're used to having (just as the failures they need to worry about are suspiciously close to failures they presently have to worry about). Like today's scope of action is somehow automatically optimal by natural law.
That feels like a lack of imagination or flexibility.
And I definitely don't feel that way. There are things I'd prefer to keep control over, but they're not exactly the things I control today, and don't fall neatly into (any of) the categories people call "meaningful". I'd probably make some real changes in my scope of control if I could.
What about everybody else?
It's all very nice to talk about being able to fail, but you don't fail in a vacuum. You affect others. Your "agentic failure" can be other people's "mishap they don't control". It's almost impossible to totally avoid that. Even if you want that, why do you think you should get it?
The Universe doesn't owe you a value system
This is a bit nebulous, and not dead on the topic of "stakes", and maybe even a bit insulting... but I also think it's related in an important way, and I don't know a better way to say it clearly.
I always feel a sense that what people who talk about "meaning" really want is value realism. You didn't say this, but this is what I feel like I see underneath practically everybody's talk about meaning:
Gosh darn it, there should be some external, objective, sharable way to assign Real Value to things. Only things that have Real Value are "meaningful."
And if there is no such thing, it's important not to accept it, not really, not on a gut level...
... because I need it, dammit!
Say that or not, believe it or not, feel it or not, your needs, real or imagined, don't mean anything to the Laws that Govern All. They don't care to define Real Value, and they don't.
You get to decide what matters to you, and that means you have to decide what matters to you. Of course what you pick is ultimately caused by things you don't control, because you are caused by things you don't control. That doesn't make it any less yours. And it won't exactly match anybody else.
... and choosing to need the chance to fail, because it superficially looks like an externally imposed part of the Natural Order(TM), seems unfortunate. I mean, if you can avoid it.
"But don't you see, Sparklebear? The value was inside of YOU all the time!"
Replies from: sloonz↑ comment by sloonz · 2025-01-16T14:34:14.743Z · LW(p) · GW(p)
Failure in itself is valuable to you?
What I sense from this is that what you’re not getting is that my value system is made of tradeoffs between, let’s call them, "Primitive Values" (i.e. ones that are at least sufficiently universal in human psychology that you can more or less describe them with compact words).
I obviously don’t value failure. If I did I would plan for failure. I don’t. I value/plan for success.
But if all plans ultimately lead to success, what use/fun/value is there in planning?
So failure has to be part of the territory, if I want my map-making skills to… matter? make sense? make a difference?
It feels to me like a weird need to make your whole life into some kind of game to be "won" or "lost", or some kind of gambling addiction or something.
My first reaction was "no, no, gambling addiction and talking about Winning at Life like Trump looks terribly uncharitable".
My second reaction is that you’re pretty much directionally right and on the path to understanding? Just put a bit more charitably? We have been shaped by Evolution at large. By winners in the great game of Life, red in tooth and claw. And while playing doesn’t mean winning, not playing certainly means losing. Schematically, I can certainly believe that "Agency" is the shard inside of me that comes out of the outer (intermediate) objective "enjoy the game, and play to win". I have the feeling that you have pretty much lost the "enjoy the game" shard, possibly because you have a mutant variant "enjoy ANY game" (and you know what? I can certainly imagine an "enjoy ANY game" variant enjoying UBI paradise).
Well, the big stakes are already gone. If you're on Less Wrong, you probably don't have much real chance of failing so hard that you die, without intentionally trying. Would your medieval farmer even recognize that your present stakes are significant?
This gives me another possible source/model of inspiration, the good old "It’s the Journey that matters, not the Destination".
Many video games have an "I win" cheat code. Players at large don’t use it. Why not, if winning the game is the goal? And certainly all of their other actions are consistent with wanting to win the game. They’re happy when things go well, frustrated when they go wrong. In the internet age, they look at guides and tips. They will sometimes hand the controller to a better player after being stuck. And yet they don’t press the "I win" button.
You are the one saying "do you enjoy frustration or what? Just press the I Win button". I’m the one saying "What are you saying? He’s obviously enjoying the game, isn’t he?".
I agree that the Destination of Agency is pretty much "there is no room left for failure" (and pretty much no Agency left). This is what most of our efforts go into: better plans for a better world with better odds for us. There are some Marxist vibes here: "competition tends to reduce profit over time in capitalist economies, therefore capitalism will crumble under the weight of its own contradictions". If you enjoy entrepreneurship in a capitalist economy, the better you are at it, the more you drive down profits. "You: That seems to indicate that entrepreneurs hate capitalism and profits, and would be happy in a communist profit-less society. Me: What?". Note we have the same thing as "will crumble under the weight…" in the game metaphor: when the player wins, it’s also the end of the game.
So let’s go a bit deeper into that metaphor: the game is Life. Creating an ASI-driven UBI paradise is discovering that the developer created an "I Win" button. Going into that society is pressing that button. Your position, I guess, is "well, living well in a UBI paradise is the next game". My position is "no, the UBI paradise is still the same game. It’s akin to the Continue Playing button in an RTS after having defeated all opponents on the map. Sure, you can play, in the sense that you can still move units around, gather resources and so on, but c'mon, it’s not the same, and I can already tell it’s going to be much less fun, simply because it’s not what the game was designed for. There is no next game. We have finished the only game we had. Enjoy drawing fun patterns with your units while you can enjoy it; as for me, I know it won’t be enjoyable for very long."
... and if you care, your social prestige, among whoever you care about, can always be on the table, which is already most of what you're risking most of the time.
Oh, this is another problem I thought of, then forgot.
This sounds like a positive nightmare to me.
It seems a hard-to-avoid side-effect of losing real stakes/agency.
In our current society, you can improve the lives of others around you in the great man-vs-nature conflict. AKA economics is positive-sum (I think you mentioned some people giving you an altruistic definition of Meaningfulness? There we are!).
Remove this and you only have man-vs-man conflicts (gamified so nobody gets hurt). Those are generally zero-sum, purely positional. When you gain a rank on the Chess ladder, someone else loses one.
A world with no place for positive-sum games seems a bad place to live. I don’t know to what extent it is fixable in the UBI paradise (do cooperative, positive-sum games fix this? I’m not sure whether the answer is "obviously yes" or "it’s just a way to informally rank who is the best player, granting status, so it’s actually zero-sum"), or how much it’s just going to end up being Agency in another guise.
Forces mostly unknown and completely beyond your control have made a universe in which you can exist, and fitted you for it. You depend on the fine structure constant. You have no choice about whether it changes. You need not and cannot act to maintain the present value. I doubt that makes you feel your agency is meaningless.
My first reaction is "the shard of Agency inside me has been created by Evolution; the definition of the game I’m supposed to enjoy, and its scope, draw from there. Of course it’s not going to care about that kind of stuff".
My second reaction is: "I certainly hope my distant descendants will change the fine-structure constant of the universe; it looks possible, and a way to avoid the heat death of the universe" (https://www.youtube.com/watch?v=XhB3qH_TFds&list=PLd7-bHaQwnthaNDpZ32TtYONGVk95-fhF&index=2). I don’t know how much of a nitpick that is (I certainly notice that I prefer "my distant descendants" to "the ASI supervisor of the UBI paradise").
More likely, other humans could kill you, still in a way you couldn't influence, for reasons you couldn't change and might never learn. You will someday die of some probably unchosen cause.
This is the split between Personal Agency and Collective Agency. At our current level of capabilities, the distinction doesn’t matter very much. It certainly will later.
Since we live in society, and most people tend not to like being killed, we shape societies so that such events tend not to happen (mostly via punishment and socialization). Each individual tries to steer society to the best of their capabilities. If we collectively end up in a place where there are no murders, people like me consider this a success. Otherwise, a failure.
Politics, advocacy, leading by example, guided by things like Game Theory, Ethics, History. Those are very much not out of the scope of Agency. They would be only if individuals had absolutely zero impact on society.
It's all very nice to talk about being able to fail, but you don't fail in a vacuum. You affect others. Your "agentic failure" can be other people's "mishap they don't control". It's almost impossible to totally avoid that. Even if you want that, why do you think you should get it?
That’s why, for me and at my current level of speculation, I think there are two Bright Red Lines for a post-ASI future.
One: if there is no recognizable Mormon society in a post-ASI future, something Has Gone Very Wrong. Mormons tend to value their traditional way of life pretty heavily (which includes agency). Trampling theirs in particular probably indicates that we are generally trampling an awful lot of values actually held by a lot of actual people.
Two: if there is no recognizable UBI paradise in a post-ASI future, something Has Gone Very Wrong. For pretty much the same reason.
(there is plausibly a similar third red line for transhumanists, but they pose serious security/safety challenges for the rest of the universe, so it gets more complicated there, and I have found no way to articulate such a red line for them)
The corollary being: the (non-terribly-gone-wrong) post-ASI future is almost inevitably a patchwork of different societies with different tradeoffs. Unless One Value System wins, one which is low on Diversity on top of that. Which would be terrible.
To answer you: I should get that because I’m going to live with other people who are okay with me getting it, because they want to get it too.
"But don't you see, Sparklebear? The value was inside of YOU all the time!"
I entirely agree with you here. It’s all inside us. If there were some Real, Objectively Meaningful Values out there, I would trust a technically aligned ASI to be able to recognize them, and I would be much less concerned by the potential loss of Agency/Meaningfulness/whatever we call it. Alas, I don’t believe that’s the case.
Replies from: jbash↑ comment by jbash · 2025-01-17T01:36:51.987Z · LW(p) · GW(p)
Mostly some self-description, since you seem to want a model of me. I did add an actual disagreement (or something) at the end, but I don't think there'll be much more for me to say about it if you don't accept it. I will read anything you write.
I have the feeling that you have pretty much lost the "enjoy the game" shard, possibly because you have a mutant variant "enjoy ANY game".
More like "enjoy the process". Why would I want to set a "win" condition to begin with?
I don't play actual games at all unless somebody drags me into them. They seem artificial and circumscribed. Whatever the rules are, I don't really care enough about learning them, or learning to work within them, unless it gives me something that seems useful for whatever random conditions may come up later, outside the game. That applies to whatever the winning condition is, as much as to any other rule.
Games with competition tend to be especially tedious. Making the competition work seems to further constrain the design of the rules, so they're more boring. And the competition can make the other people involved annoying.
As far as winning itself... Whee! I got the most points! That, plus whatever coffee costs nowadays, will buy me a cup of coffee. And I don't even like coffee.
I study things, and I do projects.
While I do evaluate project results, I'm not inclined to bin them as "success" or "failure". I mean, sure, I'll broadly classify a project that way, especially if I have to summarize it to somebody else in a sentence. But for myself I want more than that. What exactly did I get out of doing it? The whole thing might even be a "success" if it didn't meet any of its original goals.
I collect capabilities. Once I have a capability, I often, but not always, lose interest in using it, except maybe to get more capabilities. Capabilities get extra points for being generally useful.
I collect experiences when new, pleasurable, or interesting ones seem to be available. But just experiences, not experiences of "winning".
I'll do crossword puzzles, but only when I have nothing else to do and mostly for the puns.
Many video games have an "I win" cheat code. Players at large don’t use it. Why not, if winning the game is the goal?
Even I would understand that as not, actually, you know, winning the game. I mean, a game is a system with rules. No rules, no game, thus no win. And if there's an auto-win button that has no reason to be in the rules other than auto-win, well, obvious hole is obvious.
It's just that I don't care to play a game to begin with.
If something is gamified, meaning that somebody has artificially put a bunch of random stuff I don't care about between me and something I actually want in real life, then I'll try to bypass the game. But I'm not going to do that for points, or badges, or "achievements" that somebody else has decided I should want. I'm not going to push the "win" button. I'm just not gonna play. I loathe gamification.
Creating an ASI-driven UBI paradise is discovering that the developer created an "I Win" button.
I see it not as an "I win" button, but as an "I can do the stuff I care about without having to worry about which random stupid bullshit other people might be willing to pay me for, or about tedious chores that don't interest me" button.
Sure, I'm going to mash that.
And eventually maybe I'll go more transcendent, if that's on offer. I'm even willing to accept certain reasonable mental outlooks to avoid being too "unaligned".
This is the split between Personal Agency and Collective Agency.
I don't even believe "Collective Agency" is a thing, let alone a thing I'd care about. Anything you can reasonably call "agency" requires preferences, and intentional, planned, directed, well, action toward a goal. Collectives don't have preferences and don't plan (and also don't enjoy, or even experience, either the process or the results).
Which, by the way, brings me to the one actual quibble I'm going to put in this. And I'm not sure what to do with that quibble. I don't have a satisfactory course of action and I don't think I have much useful insight beyond what's below. But I do know it's a problem.
One: if there is no recognizable Mormon society in a post-ASI future, something Has Gone Very Wrong.
I was once involved in a legal case that had a lot to do with some Mormons. Really they were a tiny minority of the people affected, but the history was such that the legal system thought they were salient, so they got talked about a lot, and got to talk themselves, and I learned a bit about them.
These particular Mormons were a relatively isolated polygynist splinter sect that treated women, and especially young women, pretty poorly (actually I kind of think everybody but the leaders got a pretty raw deal, and I'm not even sure the leaders were having much of a Good Time(TM)). It wasn't systematic torture, but it wasn't Fun Times either. And the people on the bottom had a whole lot less of what most people would call "agency" than the people on the top.
But they could show you lots of women who truly, sincerely wanted to stay in their system. That was how they'd been raised and what they believed in. And they genuinely believed their Prophet got direct instructions from God (now and then, not all the time).
Nobody was kept in chains. Anybody who wanted to leave was free to walk away from their entire family, probably almost every person they even knew by name, and everything they'd ever been taught was important, while defying what at least many of them truly believed was the literal will of God. And of course move somewhere where practically everybody had a pretty alien way of life, and most people were constantly doing things they'd always believed were hideously immoral, and where they'd been told people were doing worse than they actually were.
They probably would have been miserable if they'd been forcibly dragged out of their system. They might never have recovered. If they had recovered, it might well have meant they'd had experiences that you could categorize as brainwashing.
It would have been wrong to yank them out of their system. So far I'm with you.
But was it right to raise them that way? Was it right to allow them to be raised that way? What kind of "agency" did they have in choosing the things that molded them? The people who did mold them got agency, but they don't seem to have gotten much.
As I think you've probably figured out, I'm very big on individual, conscious, thinking, experiencing, wanting agents, and very much against giving mindless aggregates like institutions, groups, or "cultures", anywhere near the same kind of moral weight.
From my point of view, a dog has more right to respect and consideration than a "heritage". The "heritage" is only important because of the people who value it, and that does not entitle it to have more, different people fed to it. And by this I specifically mean children.
A world of diverse enclaves is appealing in a lot of ways. But, in every realistic form I've been able to imagine, it's a world where the enclaves own people.
More precisely, it's a world where "culture" or "heritage", or whatever, is used as an excuse for some people not only to make other people miserable, but to condition them from birth to choose that misery. Children start to look suspiciously like they're just raw material for whatever enclave they happen to be born in. They don't choose the enclave, not when it matters.
It's not like you can just somehow neutrally turn a baby into an adult and then have them "choose freely". People's values are their own, but that doesn't mean they create those values ex nihilo.
I suppose you could fix the problem by switching to reproduction by adult fission, or something. But a few people might see that as a rather abrupt departure, maybe even contrary to their values. And kids are cute.
comment by Seth Herd · 2025-01-12T05:02:03.295Z · LW(p) · GW(p)
If you've ever been to a Burning Man event, you will see in a visceral way that people can find meaningful projects to do and enjoy doing them even when they're totally unnecessary. Working together to do cool stuff and then show it off to other humans is fun. And those other humans appreciate it not just for what it is, but because someone worked to make it for them.
That won't power an economy, as you say; but if we get to a post-singularity utopia where needs are provided for, people will have way more fun than ever.
You won't be alone in wringing your hands! There are many people who won't know what to do without being forced to work, or getting to try saving people who are suffering.
There will be a transition, but almost everyone will learn to enjoy not-having-to-work because the single most popular avocation will be "transition counselor/project buddy".
It seems like you're quite concerned with humans no longer controlling the future. Almost no human being has any meaningful control over the future. The few who think they do, in particular Silicon Valley types, are mostly wrong. People do have control of their impact on other people. They'll continue to have that. They won't have starving people to save, but they'll get over it. They will have plenty of people to delight.
At this point you're probably objecting: "But any project will be completed much better and faster by AGI than humans! Even volunteer projects will be pointless!"
Yes, except for people who appreciate the process and provenance of projects. Which we've already shown through our love of "artisanal" products that lots of us do, when we've got spare time and money to be picky and pay attention. Ridiculous as it is to care where things come from and pay extra time and money for elaborately hand-crafted stuff when there are people starving, we do. I even enjoy hearing about the process that made my soap, while being embarrassed to spend money on it.
So here's what I predict: whole worlds with very strict rules on what the AGI can do for you, and what people must do themselves. There will be worlds or zones with different rules in place. Take your pick, and hop back and forth. We will marvel at devotion and craftspersonship as we never have. And we will thank our stars that we aren't forced to do things we don't want to do, let alone work until our bodies break, as most of humanity did right up until the singularity.
I fully agree that people should have a plan before creating AGI, and they largely don't.
I suspect Dario Amodei is privately willing to become god-emperor should it seem appropriate. Note that talking about this in an interview would be counterproductive for nearly any goal he might have.
I'm pretty sure Sam Altman occasionally claps his hands with glee in private when he imagines his own ascendancy.
I doubt Shane Legg wants the job, but I for one would vote for him or Hassabis in a second; Demis would take the job, and I suspect do it quite well.
But none of them will get the chance. There are people with much more ambition for power and much more skill at getting it.
They are called politicians. And they already enjoy a democratic mandate to control the future.
We had best either work or pray for AGI to get into the hands of the right politicians.
comment by RussellThor · 2025-01-12T05:39:36.746Z · LW(p) · GW(p)
If you are advocating for a Butlerian Jihad, what is your plan for starships, for societies that want to leave Earth behind, have their own values, and never come back? If you allow that, then they can simply do whatever they want with AI; with 100 billion stars, that is the vast majority of future humanity.
comment by davekasten · 2025-01-11T19:42:52.728Z · LW(p) · GW(p)
I think you're missing at least one strategy here. If we can get folks to agree that different societies can choose different combos, so long as they don't infringe on some subset of rights to protect other societies, then you could have different societies expand out into various pieces of the future in different ways. (Yes, I understand that's a big if, but it reduces the urgency/crux nature of value agreement).
Replies from: sharmake-farah, jbash, sloonz↑ comment by Noosphere89 (sharmake-farah) · 2025-01-11T20:35:39.118Z · LW(p) · GW(p)
I think the if condition either relies on an impossibility as presented, or requires you to exclude some human values, at which point you should at least admit that which values you choose to retain is a political decision, based on your own values.
↑ comment by jbash · 2025-01-12T00:42:31.923Z · LW(p) · GW(p)
Societies aren't the issue; they're mindless aggregates that don't experience anything and don't actually even have desires in anything like the way a human, or even an animal or an AI, has desires. Individuals are the issue. Do individuals get to choose which of these societies they live in?
↑ comment by sloonz · 2025-01-11T20:30:16.299Z · LW(p) · GW(p)
I’m not missing that strategy at all. It’s an almost certainty that any solution will have to involve something like that, barring some extremely strong commitment to Unity which would by itself destroy a lot of Values. But there are some pretty fundamental values that some people (even/especially here) care a lot about, like negative utilitarianism ("minimize suffering"), which are flatly incompatible with simple implementations of that solution. Negative utilitarians care very much about the total suffering in the universe, and their calculus does not stop at the boundaries of "different societies".
And if you say "screw them", well, what about the guy who basically goes "let’s create the baby-eaters society"? If you recoil at that, it means there’s at least a bit of negative utilitarianism in you. Which is normal, don’t worry; it’s a pretty common human value, even in people who don’t describe themselves as "negative utilitarians".
Now you can recognize the problem, which is that every individual will have a different boundary in the Independence-Freedom-Diversity vs Negative-Utilitarianism tradeoff.
(which I do not think is the only tradeoff/conflict, but clearly one of the biggest ones, if not THE biggest, if you set aside transhumanism)
And if you double down on the "screw them" solution? Well, you land exactly in what I described with "even with perfect play, you are going to lose some Human Values". For it is a non-negligible chunk of Human Values.
comment by HKaiWu (hao-kai-wu) · 2025-01-22T18:28:35.152Z · LW(p) · GW(p)
This article closely aligns with what I think, but it misses a big point. I believe the crux is not whether the END state is desirable but rather how societal upheaval should be managed in relation to AI development. Even if we believed that an ASI-run society could lead to paradise (some clearly do and some don't), if we can never get there at all, this whole conversation is moot. Judging by how AI development is going, there's a distinct chance that we never "get there" because we enrage regular people in the short term. In fact, bad planning can lead to a world that barely adopts ASI because people have revolted so hard against it.
To start, I agree with all your points about AI people like Dario Amodei giving naively optimistic views of the future. They all assume we're going to get to ASI, and that ASI will then suddenly become the inevitable way of running the world everywhere. However, this is a convenient assumption, and it hides the messy reality of how we'd actually leverage ASI on day one. Amodei himself says in a WSJ interview that widespread takeover is an aspiration rather than a verifiable fact.
Sadly, this is likely not going to happen. A jagged, staged adoption is more likely, leading to unrest among those who were put out of a job first. This is where the lack of planning will hurt us most.
My assumptions that underpin a jagged adoption:
- Continued lightning fast pace of AI improvement and roll-out
- Easy disintermediation of digital work by AI
- Trailing capabilities in physical interface creation and adoption (essentially, robust, scalable, and widely capable robotics), even by a few years.
- The trust "tax" which is put on robots in doing mission-critical physical work vs. humans (essentially robots need to do much better than humans for regular people to trust them in critical scenarios).
- Life-changing improvements for the average person depend on the adoption of physical interfaces, not digital interfaces (i.e. AI cooking my dinner is much more positively disruptive than AI telling me what to cook).
My logical thought process:
Belief in these assumptions => jagged AI adoption => societal upheaval => scapegoating of AI.
It stands to reason that anyone doing purely digital work will be put out of a job much faster than those whose work relies on a physical interface. Essentially, if you can do your work remotely, an AI will replace you faster than if you have to be in person. Robotics might be close behind in theory, but if enough people lose their jobs fast enough, a span of a few years of 15-20% unemployment will be enough to kick off a Jihad.
Millions of people do digital work today. The problem with millions of people losing their jobs so quickly is that the economy has no time to compensate. These people can't learn new skills in time to find meaningful, new careers, and being suddenly out of work, they'll look for a boogeyman to blame. Unsurprisingly, ASI would become the scapegoat.
All the while, ASI could be making incredible breakthroughs in various scientific fields, but if these breakthroughs aren't immediately translated to physical improvements like life extending drugs, quality of life changes, and so on, then it's effectively meaningless to the average person.
As such, we could very well see AI development become self-defeating. The regular person isn't going to hope for some transhuman singularity if they can't put food on the table that year. It's one thing if AI takes the jobs of .1% of the population, but once you're hitting percentages like 5-10%, you can easily imagine these unemployed people banding together to stage massive protests, sabotage AI data centers, and commit violent acts against those who propped up AI in the first place.
For the record, this is not a typical bourgeoisie vs. proletariat framing. Digital jobs are largely white collar jobs, and this economic collapse will affect business owners as well. Once large parts of the population lose their salaries (or expect to lose them), they won't spend as much. If they don't spend as much, the economy goes into recession, impacting the value of every company and their ability to get capital. Companies that can't sell products and can't get capital go bust, meaning that many business owners will eventually turn against ASI adoption as well.
In my mind, the only way a Butlerian Jihad COULD be avoided is by slowing ASI roll-out until regular people can see massive benefits from it such that they're OK with letting go of their jobs. However, the reckless, headlong advancement of AI leads me to believe that this won't happen. As a result, we'll all have to learn our lesson the hard way.
Again, the main point in all of this is not that humans can never co-exist with ASI or that severe conflict is inevitable. Rather, it's spelling out the consequence of not planning. The likely consequence isn't the creation of an actual dystopia -- it's the widespread revolt of a populace which believes that AI will lead them to a dystopia, regardless of whether their beliefs turn out to be true.
On a last note, I believe looking at illegal immigration is a good stand-in for what's going to happen. Just as people have revolted against illegal immigrants taking their jobs, so too will they revolt against AI if it happens too quickly. The US recently elected a president on a platform of stopping illegal immigration. Don't be surprised if another one gets elected based on stopping the AI takeover.
comment by Shankar Sivarajan (shankar-sivarajan) · 2025-01-12T07:38:40.811Z · LW(p) · GW(p)
What do we privilege, the preference of doctors or the welfare of patients?
What is more important, educators' preferences or the quality of children's education?
I understand you intended these questions to be rhetorical, but the answers you think are obvious: did you arrive at them through "pure reason," or by looking at what "democratic consensus" actually ended up with?
Replies from: sloonz↑ comment by sloonz · 2025-01-12T14:59:44.257Z · LW(p) · GW(p)
I’m not sure why you think it matters?
I was mostly speaking about the democratic consensus here, but I’m also pretty sure these are perfectly reasonable opinions, each point taken in isolation.
If you’re going to argue that the preference of doctors is more important than the welfare of patients, I’m genuinely interested in your arguments.
Replies from: shankar-sivarajan↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2025-01-12T17:47:38.172Z · LW(p) · GW(p)
The doctors' cartel which enriches its members at the expense of patient welfare is backed by the force of the state, and I expect few to support its abolition. The teachers' unions are similarly popular. In what sense do you believe "democratic consensus" has answered these questions the way you think it has?
Replies from: sloonz↑ comment by sloonz · 2025-01-12T17:56:48.197Z · LW(p) · GW(p)
Because the debate is never framed as "better education" vs "teachers’ preferences". It’s "give more money to teachers so they can provide better education". When there’s a tradeoff, it’s usually of the form "better education for top performers" vs "more equality in education". I don’t see teachers’ unions arguing that school vouchers are good for children but should still be outlawed. I see teachers’ unions arguing that school vouchers are bad for children, and They’re The Experts, so outlaw them.
I don’t expect that tactic to work when the alternative is a literal superintelligence.
comment by R. Mutt (r-mutt) · 2025-01-11T21:33:39.000Z · LW(p) · GW(p)
What are you on about, equating Christian Paradise with not working? The book of Genesis says man will toil by the sweat of his brow. This is a good.
Personal experience tells me I would degenerate under UBI. I'm clearly meant to work for my daily bread.
Replies from: martin-randall, sloonz↑ comment by Martin Randall (martin-randall) · 2025-01-12T03:28:22.806Z · LW(p) · GW(p)
It's not a good, it's a curse. Genesis 3:17-19, CEB translation:
cursed is the fertile land because of you; in pain you will eat from it every day of your life. Weeds and thistles will grow for you, even as you eat the field’s plants; by the sweat of your face you will eat bread— until you return to the fertile land, since from it you were taken; you are soil, to the soil you will return.
Also implies that the curse lasts until death.
↑ comment by sloonz · 2025-01-11T21:49:04.619Z · LW(p) · GW(p)
I’m pretty sure "man will toil by the sweat of his brow" is about down here, before you die and (hopefully) go to paradise, and you don’t have to work in paradise. And anyway, I know next to nothing about Christianity; it’s mostly a reference to Scott Alexander (or was it Yudkowsky? now I’m starting to doubt…) who said something like "the description of the Christian paradise seems pretty lame, I mean just bask in the glory of God doing nothing for all eternity, you would be bored after two days, but it makes sense to describe that as a paradise if you put yourself in the shoes of the average medieval farmer who toils all day".
(I did all that from my terrible memory, so apologies if I’m misrepresenting anything here.)