How to generalize to multiple humans is... not an unimportant question, but a question whose salience is far, far out of proportion to its relative importance
I expect it to be the hardest problem, not from a technical point of view, but from a lack of ground truth.
The question "how do I model the values of a human" has a simple ground truth: the human in question.
I doubt there's such a ground truth for "how do I compress the values of all humans into one utility function?". "All models are wrong, some are useful", and all that, except all the different humans have a different opinion on "useful", i.e. their own personal values. There would be a lot of inconsistencies; while I agree with your stance "Approximation is part of the game" for modeling the values of individual persons, people can wildly disagree on which approximations they are okay with, mostly based on how well the outcome agrees with their own values.
In other words: do you believe there exists at least one model where nobody can honestly say "the output of that model approximates away too much of my values"? If yes, what makes you think so?
Failure in itself is valuable to you?
What I sense from this is that what you're not getting is that my value system is made of tradeoffs between what I'll call "Primitive Values" (i.e. ones that are at least sufficiently universal in human psychology that you can more or less describe them with compact words).
I obviously don’t value failure. If I did I would plan for failure. I don’t. I value/plan for success.
But if all plans ultimately lead to success, what use/fun/value is there in planning?
So failure has to be part of the territory, if I want my map-making skills to… matter? Make sense? Make a difference?
It feels to me like a weird need to make your whole life into some kind of game to be "won" or "lost", or some kind of gambling addiction or something.
My first reaction was "no, no, comparing this to gambling addiction or to Trump-style talk about Winning at Life looks terribly uncharitable".
My second reaction is that you're pretty much directionally right and on the path to understanding? Just put a bit more charitably? We have been shaped by Evolution at large. By winners in the great game of Life, red in tooth and claw. And while playing doesn't mean winning, not playing certainly means losing. Schematically, I can certainly believe that "Agency" is the shard inside of me that comes out of that outer (intermediate) objective "enjoy the game, and play to win". I have the feeling that you have pretty much lost the "enjoy the game" shard, possibly because you have a mutant variant "enjoy ANY game" (and you know what? I can certainly imagine an "enjoy ANY game" variant enjoying UBI paradise).
Well, the big stakes are already gone. If you're on Less Wrong, you probably don't have much real chance of failing so hard that you die, without intentionally trying. Would your medieval farmer even recognize that your present stakes are significant?
This gives me another possible source/model of inspiration, the good old "It’s the Journey that matters, not the Destination".
Many video games have an "I win" cheat code. Players at large don't use it. Why not, if winning the game is the goal? And certainly all of their other actions are consistent with the player wanting to win the game. They're happy when things go well, frustrated when they go wrong. In the internet age, they look at guides and tips. They will sometimes hand the controller to a better player after being stuck. And yet they don't press the "I win" button.
You are the one saying "do you enjoy frustration or what? Just press the I Win button". I'm the one saying "What are you saying? He's obviously enjoying the game, isn't he?".
I agree that the Destination of Agency is pretty much "there is no room left for failure" (and pretty much no Agency left). This is what most of our efforts go into: better plans for a better world with better odds for us. There's a Marxist vibe here: "competition tends to reduce profit over time in capitalist economies, therefore capitalism will crumble under the weight of its own contradictions". If you enjoy entrepreneurship in a capitalist economy, the better you are at it, the harder you drive down profits. "You: That seems to indicate that entrepreneurs hate capitalism and profits, and would be happy in a communist profit-less society. Me: What?". Note that we have the same thing as "will crumble under the weight…" in the game metaphor: when the player wins, it's also the end of the game.
So let's go a bit deeper into that metaphor: the game is Life. Creating an ASI-driven UBI paradise is discovering that the developer created an "I Win" button. Going into that society is pressing that button. Your position, I guess, is "well, living well in a UBI paradise is the next game". My position is "no, the UBI paradise is still in the same game. It's akin to the Continue Playing button in an RTS after having defeated all opponents on the map. Sure, you can play in the sense that you can still move units around, gather resources and so on, but c'mon, it's not the same, and I can already tell it's going to be much less fun, simply because it's not what the game was designed for. There is no next game. We have finished the only game we had. Enjoy drawing fun patterns with your units while you can enjoy it; as for me, I know it won't be enjoyable for very long."
... and if you care, your social prestige, among whoever you care about, can always be on the table, which is already most of what you're risking most of the time.
Oh, this is another problem I thought of, then forgot.
This sounds like a positive nightmare to me.
It seems a hard-to-avoid side-effect of losing real stakes/agency.
In our current society, you can improve the lives of others around you in the great man-vs-nature conflict. AKA economics is positive-sum (I think you mentioned something about some people talking about Meaningfulness giving you an altruistic definition? There we are!).
Remove this and you only have man-vs-man conflicts (gamified so nobody gets hurt). Those are generally zero-sum, just positional. When you gain a rank on the Chess ladder, someone else loses one.
A place with no positive-sum games seems a bad place to live. I don't know to what extent it is fixable in the UBI paradise (do cooperative, positive-sum games fix this? I'm not sure whether the answer is "obviously yes" or "it's just a way to informally rank who is the best player, granting status, so it's actually zero-sum"), or how much it's just going to end up being Agency in another guise.
Forces mostly unknown and completely beyond your control have made a universe in which you can exist, and fitted you for it. You depend on the fine structure constant. You have no choice about whether it changes. You need not and cannot act to maintain the present value. I doubt that makes you feel your agency is meaningless.
My first reaction is "the shard of Agency inside me has been created by Evolution; the definition and scope of the game I'm supposed to enjoy draw from there. Of course it's not going to care about that kind of stuff".
My second reaction is: "I certainly hope my distant descendants will change the fine-structure constant of the universe, it looks possible, and like a way to avoid the heat death of the universe" (https://www.youtube.com/watch?v=XhB3qH_TFds&list=PLd7-bHaQwnthaNDpZ32TtYONGVk95-fhF&index=2). I don't know how much of a nitpick that is (I certainly notice that I prefer "my distant descendants" to "the ASI supervisor of UBI-paradise").
More likely, other humans could kill you, still in a way you couldn't influence, for reasons you couldn't change and might never learn. You will someday die of some probably unchosen cause.
This is the split between Personal Agency and Collective Agency. At our current level of capabilities, the two don't come apart very much. They certainly will, later.
Since we live in society, and most people tend to not like being killed, we shape societies such that such events tend not to happen (mostly via punishment and socialization). Each individual tries to steer society to the best of their capabilities. If we collectively end up in a place where there are no murders, people like me consider this a success. Otherwise, a failure.
Politics, advocacy, leading-by-example, guided by things like Game Theory, Ethics, History. Those are very much not out of the scope of Agency. They would be only if individuals had absolutely zero impact on society.
It's all very nice to talk about being able to fail, but you don't fail in a vacuum. You affect others. Your "agentic failure" can be other people's "mishap they don't control". It's almost impossible to totally avoid that. Even if you want that, why do you think you should get it?
That's why, for me and at my current level of speculation, I think there are two Bright Red Lines for a post-ASI future.
One: if there is no recognizable Mormon society in a post-ASI future, something Has Gone Very Wrong. Mormons tend to value their traditional way of life pretty heavily (which includes agency). Trampling them in particular probably indicates that we are generally trampling an awful lot of values actually held by a lot of actual people.
Two: if there is no recognizable UBI paradise in a post-ASI future, something Has Gone Very Wrong. For pretty much the same reason.
(There is plausibly a similar third red line for transhumanists, but they pose serious security/safety challenges for the rest of the universe, so things get more complicated there, and I found no way to articulate such a red line for them.)
The corollary being: the (non-terribly-gone-wrong) post-ASI future is almost inevitably a patchwork of different societies with different tradeoffs. Unless One Value System wins, one which is low on Diversity on top of that. Which would be terrible.
To answer you: I should get it because I'm going to live with other people who are okay with me getting it, because they want to get it too.
"But don't you see, Sparklebear? The value was inside of YOU all the time!"
I entirely agree with you here. It's all inside us. If there were some Real Really Objectively Meaningful Values out there, I would trust a technically aligned ASI to be able to recognize them, and I would be much less concerned by the potential loss of Agency/Meaningfulness/whatever we call it. Alas, I don't believe that's the case.
Being hung up on that seems deeply weird to me for a bunch of reasons that I could name that you might not care to hear about
Yeah, I'm curious. The only reason I know of that makes sense for not caring about that is pretty extreme negative utilitarianism, which you apparently don't agree with? (if you have agency you can fail in your plans and suffer, and That Is Not Allowed)
Would you take that?
Given an AGI, there's a big concern about whether this is a genuine proposal, or a lie ranging from "and secretly the vast majority of the rest of that world is a prop, you don't really risk anything" to "I'm going to upload you to what is essentially a gigantic MMO". But I think that's not the point of your thought experiment?
I think there are better intermediate places between "medieval farmer" and "UBI paradise", if that's what you mean by "details to your tastes". Current society. Some more SF-like setups like: "we give you and some other space-settler-minded individuals that galaxy over there and basic space tech, do whatever you want". Some of those I would go to without a second thought. I pretty much like current society, actually, setting AGI-builders aside (and yes, limiting to the developed world). Medieval farmer life is genuinely sufficiently terrible that I'm on the fence between death and medieval farmer.
But yes, between just medieval farmer and UBI paradise, I'll probably give UBI paradise a try (I might be proven wrong, having been too lacking in imagination to see all the wonderful things there!), milk the few drops of util that I expect to still find there, but my current expectation is that I'm going to bail out at some point.
Or does having to choose it spoil it?
There are various levels of "spoils it". Your proposal is on the very low end of spoiling it. Essentially negligible, but I think I can empathize with people thinking "that's already too high a level of spoiling". At increasing levels there are "and you can decide to go back to UBI society anytime" (err… that's pretty close to just being a big IRL role-playing game, isn't it?) up to "and I can give you a make-a-wish button" ("wait, that's basically what I wanted to escape").
And it's pretty much a given that this is a level of Agency/Meaningfulness that is going to be lost even in the worlds where the Agency/Meaningfulness crowd gets most of what it wants, as part of bargaining, unless we somehow end up just blindly maximizing Agency/Meaningfulness. Which, to be clear, would be a pretty awful outcome.
I get the sense that you were just trying to allude to the ideas that--
Even if you have some kind of "alignment", blindly going full speed ahead with AI is likely to lead to conflict between humans and/or various human value systems, possibly aided by powerful AI or conducted via powerful AI proxies, and said conflict could be seriously Not Good.
Claims that "democratic consensus" will satisfactorily or safely resolve such conflicts, or even resolve them at all, are, um, naively optimistic.
It might be worth it to head that off by unspecified, but potentially drastic means, involving preventing blindly going ahead with AI, at least for an undetermined amount of time.
If that's what you wanted to express, then OK, yeah.
Yes. That's really my central claim. All the other discussion of values is not me saying "look, we're going to resolve this problem of human values in one lesswrong post". It was to point to the depth of the issue (and to one important and, I think, overlooked point: it is not just Mistake Theory that raw clarity/intelligence can solve; there is a fundamental aspect of Conflict Theory we won't be able to casually brush aside), and that it is not idle philosophical wandering.
I'm sorry that came off as unduly pugnacious. I was actually reacting to what I saw as similarly emphatic language from you ("I can't believe some of you..."), and trying to forcefully make the point that the alternative wasn't a bed of roses.
Don't be sorry, it served its illustrative purpose.
We lesswrongers are a tiny point in the space of existing human values. We are all WEIRD or very close to it. We share a lot of beliefs that, seen from the outside, even a close outside like academia, seem insane. Relative to the modal human, who is probably a farmer in rural India or China, we may as well be a bunch of indistinguishable aliens.
And yet we manage to find scissor statements pretty easily. The tails come apart scarily fast.
It just seemed awfully glib and honestly a little combative in itself.
I don't see how "this post is already too long" is glib and combative?
"obvious" probably is, yes. My only defense is that I don't have a strong personal style, I'm easily influenced, and I read Zvi a lot, who has the same habit of overusing it. I should probably be mindful not to do it myself (I removed at least two while drafting this answer, so progress!).
Well, yes, but I did say "as least not while the 'humans' involved are recognizably like the humans we have now". I guess both the Superhappies and the Babyeaters are like humans in some ways, but not in the ways I had in mind.
No, I mean that recognizable humans with an AGI in their hands can decide to go the Superhappies way. Or the Babyeaters way. Or whatever unrecognizable-as-humans way. That choice was not even on the table before AGI, and that represents a fundamental change. Another fundamental change brought by AGI is the potential for an unprecedented concentration of power. Many leaders have had the ambition to mold humanity to their taste; none had the capacity to.
Some people definitely have a lot of their self-worth and sense of prestige tied up in their jobs, and in their jobs being needed. But many people don't. I don't think a retail clerk, a major part of whose job is to be available as a smiling punching bag for any customers who decide to be obnoxious, is going to feel too bad about getting the same or a better material lifestyle for just doing whatever they happen to feel like every day.
I think a lot of people have that. There's even a meme for it: "It ain't much, but it's honest work".
All in all, I don't think either of us has much more evidence than a vague sense of things anyway? I sure don't.
I remember hearing things close to "my agency is meaningful if and only if I have to take positive, considered action to ensure my survival, or at least a major chunk of my happiness".
I think that's the general direction of the thing we're trying to point at, yes?
A medieval farmer who screws up is going to starve. A medieval farmer who does exceptionally well will have a surplus he can use on stuff he enjoys/finds valuable.
A chess player who screws up is going to lose some Elo points (and some mix of shame/disappointment). A chess player who does exceptionally well will gain some Elo points (and some mix of pride/joy).
If you give me the choice between living the life of a medieval farmer and the life of someone who has nothing in his life but playing chess, I will take the former. Yes, I know it's a very, very hard life. Worse in a lot of ways (if you give me death as a third choice, I will admit that death starts to become enticing, if only because if you throw me into a medieval farmer's life I'll probably end up dead pretty fast anyway). The generator of that choice is what I (and apparently others) are trying to point at with Meaningfulness/Agency.
I think a lot of things we enjoy and value can be described as "growing as a person".
Does "growing as a person" sound like a terminal goal to you? It doesn't to me.
If it's not, what is it instrumental to?
For me it's clear: it's the same thing as the generator of the choice above. I grow so I can hope to act better when there are real stakes. Remove real stakes and there's no point in growing, and ultimately, I'm afraid, no point to anything.
Is "real stakes" easier to grasp than Agency/Meaningfulness? Or have I just moved the confusion around?
I've also heard plenty of people talk about "meaningfulness" in ways that directly contradict your definition.
Well, the problem is that there are so many concepts, especially when you want to be precise, and so few words.
My Agency/Meaningfulness explanation above does not match perfectly with the one in my previous answer. It's not that I'm inconsistent, it's that I'm trying to describe the elephant from different sides (and yeah, sure, you can argue that the trunk of the elephant is not the same thing as its leg).
That being said, I don't think they point to completely unrelated concepts. All of those definitions above: "positive, considered actions..."? "Broad Sweep of History"? Its collective version? Yeah, I recognize them all as parts of the elephant. Even the altruistic one, even if I find that one a bit awkward and maybe misleading. You should not see them as competing and inconsistent definitions; they do point to the same thing, at least for me.
Try to focus more on the commonalities, less on the distinctions? Try to outline the elephant from the trunk and the legs?
Because the debate is never set in terms of "better education" vs "teachers' preferences". It's "give more money to teachers so they can provide better education". When there's a tradeoff, it's usually of the form "better education for top performers" vs "more equality in education"? I don't see teachers' unions arguing that school vouchers are good for children but should still be outlawed. I see teachers' unions arguing that school vouchers are bad for children, and They're The Experts, so outlaw them.
I don’t expect that tactic to work when the alternative is a literal superintelligence.
I'm not sure why you think it matters?
I was mostly speaking about the democratic consensus here, but I'm also pretty sure those are perfectly reasonable opinions, each point taken in isolation.
If you're going to argue that the preferences of doctors are more important than the welfare of patients, I'm genuinely interested in your arguments.
This (a) doesn't have anything in particular to do with Christianity, (b) has been the most widely held view among people in general since forever, and (c) seems obviously correct. If you want to rely on the contrary supposition, I'm afraid you're going to have to argue for it.
Yes, I agree that this is the least obviously "wrong" part of the three "copes", and I'll merge it with the next remark. It's very hard to answer. I'll start with the simple answer that will convince some, but perhaps not everyone:
I am very low on the "Negative Utilitarianism" scale. I really don't care much about minimizing suffering in the universe. Still a bit, sure, but not that much. Still, I recognize it is very important to some people; my current best rules for creating a "Best Model of Human Values" say these people count, so it's a pretty good Existence Proof that it's a Pretty Important Value even if I don't feel it much myself.
So I am going to give you the exact same Existence Proof: I notice that if you give me everything else, Hedonistic Happiness, Justice, Health, etc., and take away Agency (which means having things to do that go beyond "having a hobby"), the value to me of my existence is not 0, but not that far above 0. If I live in such a society and we need to sacrifice some individuals, I will happily step in, "nothing of value was lost" style. If I live in such a society and Omega appears and announces "Sorry, Vacuum Decay Bubble incoming, everyone is going to disappear in exactly 3 minutes", I will surely feel bad for "everyone", who is apparently pretty happy, but I will also think "well, I was already pretty much dead inside anyway".
Please define it in a succinct, relevant, and unambiguous way
I'm afraid you will have to pick only two of those adjectives, should you want to ask this of someone smarter and more articulate than me but with the same views. Alas, you're stuck with me, so we'll have to pick one, so let's pick "relevant".
It will also be very hand-wavy. Despite all that, it's still the best I can do. Sorry.
Take the Utility Function of someone. We can decide to split it into roughly three parts:
U_i = D_i + Σ_j P_ij · U_j, where:
D_i is "direct" utility. I'm hungry, I want ice cream. Would love to go to that concert. Just got an idea, can I build it?
P_ij is "how much i cares about j".
U_j is the Utility Function of j.
This is, roughly speaking, a very rough first approximation of "how to model egoism and altruism". Yes, I'm fully conscious that this is far from capturing most of interpersonal relationships & utility. I still think it's relevant for pointing at the Big Picture, namely: if A only cares about B (all other P_Ax are 0) and B only cares about A (same), then if D_A = 0 and D_B = 0, there is no utility left. Or: a world of Pure Altruists who only care about others is a worthless world.
Which is not the same as to say that Altruism is Worthless. As long as you have at least some D_something, Altruism can create arbitrarily large values of Utility, as a multiplying force. It’s why it’s such a potent Human Value.
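To make the arithmetic concrete, here is a minimal sketch of that decomposition (the 0.9 caring coefficients and the numbers are my own illustrative assumptions, chosen below 1 so the linear system has a unique solution):

```python
import numpy as np

# U_i = D_i + sum_j P_ij * U_j  =>  U = (I - P)^(-1) D, when (I - P) is invertible
def utilities(D, P):
    return np.linalg.solve(np.eye(len(D)) - P, D)

# Two pure altruists: no direct utility, each caring only about the other
# (0.9 instead of 1.0 so the system stays solvable).
P = np.array([[0.0, 0.9],
              [0.9, 0.0]])
print(utilities(np.array([0.0, 0.0]), P))  # [0. 0.]: nothing to amplify

# Give A a bit of direct utility and the mutual caring multiplies it for both.
print(utilities(np.array([1.0, 0.0]), P))  # ~[5.26, 4.74]
```

(With P_AB = P_BA = 1 exactly, the system becomes degenerate, which is arguably just another way of seeing the "pure altruists" problem.)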
Now let's go even further into handwaviness: this generalizes. Many Values are similarly powerless at creating Utility from Nothing. They just act as Force Multipliers.
I call "Meaningful Values" the ones that can create Utility by themselves, without having to rely on other values being present in the first place. Which does not mean that the others (let's call them Amplifying Values) are meaningless, to be clear. They just happen to become meaningless if you have 0 Meaningful Values hanging around.
In short: I'm very afraid that we're putting a lot of load-bearing "we'll be fine, there's still value" on mostly Amplifying Values.
When I said above "I notice that in a world where I don't have meaningful Agency, I don't put much value on my own existence", I am not saying "I do not like hobbies". I happen to like hobbies, in this world! I'm also self-reflective enough to see that what I like in hobbies is the opportunity to grow, and the value of growing resolves into being better at Agency, which is a much better candidate for a "Meaningful Value" and a "Terminal Value". Hence, if you throw me into the UBI Paradise (let's drop "Christian", it seems to annoy everyone), the value of hobbies goes to zero too, and I become a shell of my current self, despite my current self saying "hobbies are cool".
The democratic consensus also won't allow a Butlerian Jihad, and I don't think you're claiming that it will.
Okay, there are two things to unpack here.
First, I believe that with those answers I went too far toward Editorializing in the Editorializing vs Being Precise tradeoff with the term "Butlerian Jihad", without even explaining what I mean. I will half-apologize for that, only half because I didn't intend the "Butlerian Jihad" to actually be the central point; the central point is about how we're not ready to tackle the problem of Human Values but current AI timelines force us to. You can see it's pretty dumb of me to put a non-central point in the title. I have no defense for that.
Second: by Butlerian Jihad, I do not mean "no AI, ever, forever", I mostly mean "a very long pause, at capability levels far below AGI". I already feel bad about GPT-5 even if it does not go human-level. I'm not even sure I'm entirely fine with GPT-4.
Contra you and Zvi, I think that if GPT-5 leads to 80% job automation, the democratic consensus will be pretty much the Dune version of the Butlerian Jihad. No AI forever and the guillotine for those who try. Which, I would agree with you and Zvi and probably everyone else on lesswrong, is not a good outcome. I don't think it's a very interesting point of discussion either, so let's drop it?
I'm actually not sure what you're arguing for or against in this whole section.
I’m essentially arguing against taking Human Values as some Abstract Adjectives like Happiness and Health and Equality and Justice and forgetting about… you know… the humans in the process.
What that has to do with justice destroying the world, I have absolutely no clue
It’s about Abstract Justice destroying humans (values) if you go too far in your Love for Justice and forget that they’re the reason we want Justice in the first place.
Some values have always won, and some values have always lost, and that will not change
Yes, which already raises an important point:
What value do we (who is "we"?) place on Diversity? On values which we do not personally hold but which seem to have a good place in our "Best Model of Human Values"? What about values which do not really fit in our "Best Model of Human Values", but which, it turns out, some other humans on the planet happen to put in their model of the "Best Model of Human Values"? What if that other human is your sworn enemy?
That is what I was trying to point at with my "exercise for the reader".
I think you're trying to take the view that any major change in the "human condition", or in what's "human", is equivalent to the destruction of the world, no matter what benefits it may have. This is obviously wrong
Oh, I will not dispute that it is wrong. Better the Superhappies than the literal Void. Just not much better.
You seem a bit bitter about my "I won’t expand on that", "too long post", and so on. I’m sorry, but I spent two days on the post, already 2 hours on one reply. I’m not a good or prolific writer. I have to pick what I spend my energy on.
So you're siding with the guy who killed 15 billion non-consenting people because he personally couldn't handle the idea of giving up suffering?
I initially didn't want to reply to that. I don't want to fight you. I just want to reply as an illustration of how fast things can turn difficult and conflictual. It doesn't take much:
So you're siding with the guy who is going to forcibly wirehead all sentient life in the universe, just because he can't handle that somewhere, someone is using his agency wrong and suffering as a result?
That being said, what now? Should we fight each other to the death for control of the AGI, to decide whether the universe will have Agency and Suffering, or no Agency and no Suffering?
Human Values have been changing, for individuals and in the "average", for as long as there've been humans, including being discarded consciously or unconsciously. Mostly in a pretty aimless, drifting way.
A lot of it consciously too, but yes.
neither AI nor anything else will fundamentally change it.
Hard disagree on that (wait, is this the first real disagreement we have?). We can have the Superhappies if we want to (or, for that matter, the Babyeaters). We couldn't before. The Superhappies do represent a fundamental change.
Before, we also didn't have much choice over diversity. Many people fought countless wars to reduce diversity in human values, without much overall success (some, yes, but not much in the grand picture of things). In the AGI age, nothing forces whoever controls the AGI to care much for diversity. It will have to be a deliberate choice. And do you notice all the forces and values already arrayed against diversity? It does not bode well for those who value at least some diversity.
I haven't actually heard many people suggesting that.
That’s the "best guess of what we will do with AGI" from those building AGI.
I'm pretty sure "man will toil by the sweat of his brow" is about down here, before you die and (hopefully) go to paradise, and you don't have to work in paradise. And anyway, I know next to nothing about Christianity; it's mostly a reference to Scott Alexander (or was it Yudkowsky? Now I'm starting to doubt…) who said something like "the description of the Christian paradise seems pretty lame, I mean, just bask in the glory of God doing nothing for all eternity, you would be bored after two days, but it makes sense to describe that as a paradise if you put yourself in the shoes of the average medieval farmer who toils all day".
(I did all that from my terrible memory, so apologies if I'm depicting anything wrongly here.)
I'm not missing that strategy at all. It's almost a certainty that any solution will have to involve something like that, barring some extremely strong commitment to Unity which would by itself destroy a lot of Values. But there are some pretty fundamental values that some people (even/especially here) care a lot about, like negative utilitarianism ("minimize suffering"), which are flatly incompatible with simple implementations of that solution. Negative utilitarians care very much about the total suffering in the universe, and their calculus does not stop at the boundaries of "different societies".
And if you say "screw them", well, what about the guy who basically goes "let's create the Babyeaters society"? If you recoil at that, it means there's at least a bit of negative utilitarianism in you. Which is normal, don't worry, it's a pretty common human value, even in people who don't describe themselves as "negative utilitarians".
Now you can recognize the problem, which is that every individual will have a different boundary in the Independence-Freedom-Diversity vs Negative-Utilitarianism tradeoff.
(which I do not think is the only tradeoff/conflict, but it is clearly one of the biggest, if not THE biggest, if you set aside transhumanism)
And if you double down on the "screw them" solution? Well, you land exactly in what I described with "even with perfect play, you are going to lose some Human Values". For it is a non-negligible chunk of Human Values.
My headcanon is that there are two levels of alignment:
- Technical alignment: you get an AI that does what you ask it to do, without any shenanigans (a bit more precisely: without any short-term/medium-term side effect that, had you known about it beforehand, would have caused you to refuse to ask for the thing in the first place). Typical misalignment at this level: hidden complexity of wishes (or, you know, no alignment at all, like Clippy).
- Comprehensive alignment: you get an AI that does what the CEV-you wants. Typical misalignment: just ask a technically-aligned AI for some heavily social-desirability-biased outcome, solve for equilibrium, get close to 0 value remaining in the universe.
But yeah, I don’t think that distinction has got enough discussion.
(there's also a third level, where CEV-you's wishes also go to essentially 0 value for current-you, but let's not go there)
Note that a slightly differently worded problem gives the intuitive result:
A_k is the event "I roll a die k times, and it ends with 66, with no earlier 66 sequence".
B_k is the event "I roll a die k times, and it ends with a 6, with one and only one 6 before that (but not necessarily the roll just before the end: 16236 works)".
C_k is the event "I roll a die k times, and I only get even numbers".
In this case we do have the intuitive result (the one I think most mathematicians intuitively translate this problem into):
Σ[k * P(A_k|C_k)] > Σ[k * P(B_k|C_k)]
Now the question is: why are the two formulations not equivalent? How would you write "expected number of rolls" more formally, in a way that would not yield the above formula, and would reproduce the numbers of your Python program?
(this is what I hate in probability theory, where slightly differently worded problems, seemingly equivalent, yield completely different results for no obvious reason).
Also, the difference between the two processes is not small:
Expected rolls until two 6s in a row (given all even): 2.725588
Expected rolls until second 6 (given all even): 2.999517
vs (n = 10 million)
k P(A_k|C_k) P(B_k|C_k)
-------------------------
1 0.000000 0.000000
2 0.111505 0.111505
3 0.074206 0.148227
4 0.074719 0.148097
5 0.066254 0.130536
6 0.060060 0.108174
7 0.053807 0.086706
8 0.049360 0.067133
9 0.046698 0.050944
10 0.040364 0.038915
11 0.038683 0.029835
12 0.030691 0.024297
13 0.034653 0.011551
14 0.036450 0.014263
15 0.024138 0.006897
16 0.007092 0.007092
17 0.012658 0.000000
18 0.043478 0.000000
19 0.000000 0.000000
20 0.000000 0.000000
Expected values:
E[k * P(A_k|C_k)] = 6.259532
E[k * P(B_k|C_k)] = 5.739979
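For reference, a minimal sketch of the kind of simulation I have in mind for the first pair of numbers (my guess at the shape of such a program, not your actual code; n is kept small so it runs quickly):

```python
import random

def until_two_sixes_in_a_row():
    """Roll a die until two consecutive 6s appear; return the whole run."""
    rolls = []
    while len(rolls) < 2 or rolls[-1] != 6 or rolls[-2] != 6:
        rolls.append(random.randint(1, 6))
    return rolls

def until_second_six():
    """Roll a die until the second 6 appears; return the whole run."""
    rolls = []
    while rolls.count(6) < 2:
        rolls.append(random.randint(1, 6))
    return rolls

def expected_length_given_all_even(run, n=100_000):
    """Average run length among runs where every roll was even."""
    total = kept = 0
    for _ in range(n):
        rolls = run()
        if all(r % 2 == 0 for r in rolls):
            total += len(rolls)
            kept += 1
    return total / kept

print(expected_length_given_all_even(until_two_sixes_in_a_row))  # ~2.73
print(expected_length_given_all_even(until_second_six))          # ~3.00
```

As far as I can tell, the divergence is that this program conditions once on the whole run being all even, while Σ[k * P(A_k|C_k)] re-normalizes each length k by its own P(C_k).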
I get a strong "our physical model says that spherical cows can move with way less energy by just rolling, thereby proving that real cows are stupid when deciding to walk" vibe here.
Loss aversion is real, and is not especially irrational. It’s simply that your model is way too simplistic to properly take it into account.
If I have $100 lying around, I am just not going to keep it around "just in case some psychology researcher offers me a bet". I am going to throw it into roughly 3 baskets of money: spending, savings, and emergency fund. The policy of the emergency fund is "as small as possible, but not smaller". In other words: adding to the balance of that emergency fund is low added util, but taking from it is high (negative) util.
The loss from an unexpected bet is going to be taken mostly from the emergency fund (because I can't take back previous spending, and I can't easily take from my savings). On the positive side (gain), any gain will be put into spending or savings.
So the "ratio" you're measuring is not a sample from a smooth, static "utility of global wealth". I am constantly adjusting my wealth allocation such that, by design and by constraint, yes, the disutility of a loss is brutal. If I weren't, I would just be leaving util lying on the ground, so to speak (I could spend or save it).
You want to model this?
Ignore spending. Start with a utility function of the form U(W_savings, W_emergency_fund). Notice that U drops steeply when W_emergency_fund goes below its target (the marginal utility of the emergency fund is very large there). Notice that the expected utility of your bet is 1/2 U(W_savings + 110, W_emergency_fund) + 1/2 U(W_savings, W_emergency_fund - 100).
I have not tested it, but I'm ready to bet (heh!) that it is relatively trivial to construct a reasonable utility function that says no to the first bet and yes to the second if you follow this model and those assumptions about the utility function.
(there is a slight difficulty here: assuming that my current emergency fund is at its target level, revealed preference shows that obviously dU/dW_savings > dU/dW_emergency_fund. An economist would say that obviously, U is maximized where dU/dW_savings = dU/dW_emergency_fund)
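A minimal sketch of that construction, with an entirely made-up utility function and made-up numbers (the target level, the penalty slope and the log terms are my own assumptions; the "second bet" from the original post isn't quoted here, so this only illustrates the rejection of the -$100/+$110 flip):

```python
import math

TARGET = 1_000    # assumed target level for the emergency fund
PENALTY = 0.05    # assumed extra (dis)utility per dollar of shortfall

def U(w_savings, w_emergency):
    """Toy two-bucket utility: smooth in both buckets, plus a steep
    penalty whenever the emergency fund is below its target."""
    base = math.log(1 + w_savings) + math.log(1 + w_emergency)
    shortfall = max(TARGET - w_emergency, 0)
    return base - PENALTY * shortfall

s, e = 10_000, TARGET  # emergency fund sitting exactly at its target

status_quo = U(s, e)
# 50/50 bet: win $110 (goes to savings), lose $100 (comes out of the fund)
bet = 0.5 * U(s + 110, e) + 0.5 * U(s, e - 100)

print(bet < status_quo)  # True: the bet is rejected despite +EV in dollars
```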
From what I can tell, the far right in France supports environmentalism.[1]
Yes (with some minor caveats). It is also pro-choice on abortion (https://www.lemonde.fr/politique/article/2022/11/22/sur-l-ivg-marine-le-pen-change-de-position-et-propose-de-constitutionnaliser-la-loi-veil_6151030_823448.html) (with some minor caveats), and pro-gun-control (can't find a link for that, sorry; the truth is that they are pro-gun-control because there is literally no one arguing the pro-gun-rights side at all, pro-gun-control is an across-the-board consensus).
Environmentalism is not partisan in many other countries, including in highly partisan countries like South Korea or France
French here. I think diving into details will shed some light.
Our mainstream right is roughly around your Joe Biden. Maybe a bit more to the right, but not much more. Our mainstream left is roughly around your Bernie Sanders. We just don't have your Republicans in the mainstream. And it turns out that there's not much partisanship on climate change between Biden and Sanders.
This can be observed on other topics. There is no big ideological gap on gun control or abortion in France, because the pro-gun-rights and pro-life positions are just not represented here at all.
I’m not sure how you measure "highly partisan", but I don't think it captures the correct picture, namely the ideological gap between mainstream right and mainstream left.
I think you're trying to point towards multimodal distributions?
If you can decompose P(X) as P(X) = P(X|H1)P(H1) + ... + P(X|Hn)P(Hn), and the P(X|Hi) are nice unimodal distributions (like a normal distribution), you can end up with a multimodal distribution (provided the components are sufficiently separated).
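A minimal sketch with arbitrary numbers (two equally weighted normal components with well-separated means):

```python
import random

# P(X) = P(X|H1)P(H1) + P(X|H2)P(H2): an even mixture of two unimodal
# (normal) components, with arbitrary, well-separated means -3 and +3.
samples = [
    random.gauss(-3, 1) if random.random() < 0.5 else random.gauss(3, 1)
    for _ in range(50_000)
]

# Crude text histogram: two separate peaks show up, even though each
# component P(X|Hi) is unimodal on its own.
for lo in range(-6, 6):
    count = sum(lo <= x < lo + 1 for x in samples)
    print(f"{lo:+d} {'#' * (count // 250)}")
```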
But what can you do with a singular story? Twain and all the witnesses are long dead, and no new observations can be made. All we have is his anecdote and whatever confirmatory recollections may have been recorded by the others in his story.
It is, in principle, reproducible and testable. Ask every husband, wife, sibling, or parent of a soldier involved in an ongoing conflict (such as the Russia/Ukraine war or the Israel/Palestine war) to record those "sentiments of dread for a loved one". See if they match the recorded casualties.
if you know that the random variable D is Monday
Yes, that's kind of my point. There are two wildly different problems that look the same on the surface, but they are not. One gives the answer in your post, the other gives 1/3. I suspect that your initial confusion is your brain trying to interpret the first problem as an instance of the second. My brain sure did, initially.
In the first one, you go and interview 1000 fathers of two children. You ask them the question "Do you have at least one boy born on a Monday?". If they answer yes, you then ask them "Do you have two boys?". You ask for the probability that the second answer is yes, conditioning on the event that the first one is yes. The answer is the one in your post.
In the second one, you send one survey to 1000 fathers of two children. It reads something like this: "1. Do you have at least one boy? 2. Give the weekday of birth of the boy. If you have two, pick any one. 3. Do you have two boys?". Now the question is: conditioning on the first answer being yes, and on the random variable given by the second answer, what is the probability that the third answer is yes? The answer is 1/3.
My main point is that neither of the answers is counter-intuitive. In the first problem, your conditioning on Monday is like always selecting a specific child, like always picking the youngest one (as in the sentence "I have two children, and the youngest one is a boy", which then gives a probability of 1/2 for two boys). With low n, the specificity is low, you're close to the problem without a specific child selected, and you get 1/3. With large n, the specificity is high, you're close to the problem of selecting a specific child (e.g. the youngest one), and you get 1/2. In the second problem, the "born on a Monday" piece of information is indeed irrelevant and gets factored out.
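A minimal simulation of the two protocols (assuming, in the second one, that a father of two boys picks which boy to report uniformly at random):

```python
import random

N = 1_000_000

def child():
    # sex and weekday of birth; 0 stands for Monday
    return random.choice("BG"), random.randrange(7)

kept1 = both1 = 0   # protocol 1: "at least one boy born on a Monday?"
kept2 = both2 = 0   # protocol 2: "report the weekday of one of your boys"

for _ in range(N):
    c1, c2 = child(), child()
    boys = [d for s, d in (c1, c2) if s == "B"]
    two_boys = len(boys) == 2

    # Protocol 1: the interviewer asks directly about "a boy born on a Monday"
    if any(s == "B" and d == 0 for s, d in (c1, c2)):
        kept1 += 1
        both1 += two_boys

    # Protocol 2: the father reports the weekday of one of his boys,
    # picked uniformly at random if he has two (assumption)
    if boys and random.choice(boys) == 0:
        kept2 += 1
        both2 += two_boys

print("Protocol 1:", both1 / kept1)  # ~13/27 ≈ 0.481
print("Protocol 2:", both2 / kept2)  # ~1/3 ≈ 0.333
```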
I don't think you're modeling your problem correctly, unless I misunderstood the question you're trying to answer. You have the following random variables:
X_1 is Bernoulli: the first child is a boy
X_2 is Bernoulli: the second child is a boy
Y_1 is uniform: the weekday of birth of the first child
Y_2 is uniform: the weekday of birth of the second child
D is a random variable which corresponds to the weekday in the sentence "one of them is a boy, born on a (D)". There are many ways to construct one like this, but we only require that if X_1=1 or X_2=1, then D=Y_1 or D=Y_2, and that D=Y_i implies X_i=1.
Then what you're looking for is not P(X_1=1, X_2=1 | (X_1=1, Y_1=Monday) or (X_2=1, Y_2=Monday)) (which, indeed, is not 1/3), but P(X_1=1, X_2=1 | ((X_1=1, Y_1=D) or (X_2=1, Y_2=D)) and D=Monday). This is still 1/3, as illustrated by this Python snippet (I'm too lazy to properly demonstrate it formally): https://gist.github.com/sloonz/faf3565c3ddf059960807ac0e2223200
There was a similar paradox presented on old lesswrong. If someone can manage to find it (a quick google search returned nothing, but I may have misremembered the exact terms of the problem…), the solution would be much better presented there:
Alice, Bob and Charlie are accused of treason. To make an example, one of them, chosen randomly, will be executed tomorrow. Alice asks for a guard, and gives him a letter with these instructions: "At least one of Bob or Charlie will not be executed. Please give him this letter. If I am to be executed and both live, give the letter to either one of them". The guard leaves, returns and tells Alice: "I gave the letter to Bob".
Alice is unable to sleep the following night: "Before doing this, I had a 1/3 chance of being executed. Now that it's either me or Charlie, I have a 1/2 chance of being executed. I shouldn't have written that letter".
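A minimal simulation of that setup, assuming the guard picks a recipient uniformly at random when both Bob and Charlie live (it suggests Alice's probability stays at 1/3, not 1/2):

```python
import random

N = 1_000_000
letter_to_bob = alice_and_bob = 0

for _ in range(N):
    executed = random.choice(["Alice", "Bob", "Charlie"])
    if executed == "Bob":
        recipient = "Charlie"
    elif executed == "Charlie":
        recipient = "Bob"
    else:  # Alice is executed: both live, the guard picks either one (assumption)
        recipient = random.choice(["Bob", "Charlie"])

    if recipient == "Bob":
        letter_to_bob += 1
        alice_and_bob += (executed == "Alice")

# P(Alice executed | the guard gave the letter to Bob) ~ 1/3
print(alice_and_bob / letter_to_bob)
```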