Value Deathism

post by Vladimir_Nesov · 2010-10-30T18:20:30.796Z · LW · GW · Legacy · 121 comments

Ben Goertzel:

I doubt human value is particularly fragile. Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology. I think it's fairly robust.

Robin Hanson:

Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors. The risks of attempting a world government anytime soon to prevent this outcome seem worse overall.

We all know the problem with deathism: a strong belief that death is almost impossible to avoid, clashing with the undesirability of that outcome, leads people to rationalize either the illusory nature of death (afterlife memes) or the desirability of death (deathism proper). But of course the claims are separate, and shouldn't influence each other.

Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere near as good as it could've been otherwise. It's easier to see a sudden change as morally relevant, and easier to rationalize gradual development as morally "business as usual", but if we look at the end result, the risks of value drift are the same. And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.

Regardless of difficulty of the challenge, it's NOT OK to lose the Future. The loss might prove impossible to avert, but still it's not OK, the value judgment cares not for feasibility of its desire. Let's not succumb to the deathist pattern and lose the battle before it's done. Have the courage and rationality to admit that the loss is real, even if it's too great for mere human emotions to express.

121 comments

Comments sorted by top scores.

comment by ata · 2010-10-31T02:12:33.821Z · LW(p) · GW(p)

Anyone who predicts that some decision may result in the world being optimized according to something other than their own values, and is okay with that, is probably not thinking about terminal values. More likely, they're thinking that humanity (or its successor) will clarify its terminal values and/or get better at reasoning from them to instrumental values to concrete decisions, and that their understanding of their own values will follow that. Of course, when people are considering whether it's a good idea to create a certain kind of mind, that kind of thinking probably means they're presuming that Friendliness comes mostly automatically. It's hard for the idea of an agent with different terminal values to really sink in; I've had a little bit of experience with trying to explain to people the idea of minds with really fundamentally different values, and they still often try to understand it in terms of justifications that are compelling (or at least comprehensible) to them personally. Like, imagining that a paperclip maximizer is just like a quirky highly-intelligent human who happens to love paperclips, or is under the mistaken impression that maximizing paperclips is the right thing to do and could be talked out of it by the right arguments. I think Ben and Robin ("Human value ... will continue to [evolve]", "I think it is ok (if not ideal) if our descendants' values deviate from ours") are thinking as though AI-aided value loss would be similar to the gradual refinement of instrumental values that takes place within societies consisting of largely-similar human brains (the kind of refinement that we can anticipate in advance and expect we'll be okay with), rather than something that could result in powerful minds that actually don't care about morality.

And I feel like anyone who really has internalized the idea that minds are allowed to fundamentally care about completely different things than we do, and still thinks they're okay with that actually happening, probably just hasn't taken five minutes to think creatively about what kinds of terrible worlds or non-worlds could be created as a result of powerfully optimizing for a value system based on our present muddled values plus just a little bit of drift.

I suppose what remains are the people who don't buy the generalized idea of optimization processes as a superset of person-like minds in the first place, with really powerful optimization processes being another subset. Would Ben be in that group? Some of his statements (e.g. "It's possible that with sufficient real-world intelligence tends to come a sense of connectedness with the universe that militates against squashing other sentiences. But I'm not terribly certain of this, any more than I'm terribly certain of its opposite." (still implying a privileged 50-50 credence in this unsupported idea)) do suggest that he is expecting AGIs to automatically be people in some sense.

(I do think that "Value Deathism" gives the wrong impression of what this post is about. Something like "Value Loss Escapism" might be better; the analogy to deathism seems too much like a surface analogy of minor relevance. I'm not convinced that the tendency to believe that value loss is illusory or desirable is caused by the same thought processes that cause those beliefs about death. More likely, most people who try to think about AI ethics are going to be genuinely really confused about it for a while or forever, whereas "is death okay/good?" is not a confusing question.)

Replies from: Perplexed, MichaelVassar, XiXiDu, MichaelVassar
comment by Perplexed · 2010-10-31T03:23:28.441Z · LW(p) · GW(p)

Ok. Well done. You have managed to frighten me. Frightened me enough to make me ask the question: "Just why do we want to build a powerful optimizer, anyways?"

More likely, most people who try to think about AI ethics are going to be genuinely really confused about it for a while or forever, whereas "is death okay/good?" is not a confusing question.

Oh, yeah. Now I remember. The reason we want to build a powerful optimizer is because some people think that "Is death okay/good?" is not a confusing question but that the question "Is it okay/good to risk the future of the Earth by building an amoral agent much more powerful than ourselves?" is confusing.

Replies from: ata
comment by ata · 2010-10-31T03:44:42.928Z · LW(p) · GW(p)

Ok. Well done. You have managed to frighten me. Frightened me enough to make me ask the question: "Just why do we want to build a powerful optimizer, anyways?"

I feel like I remember trying to answer the same question (asked by you) before, but essentially, the answer is that (1) eventually (assuming humanity survives long enough) someone is probably going to build one anyway, probably without being extremely careful about understanding what kind of optimizer it's going to be, and getting FAI before then will probably be the only way to prevent it; (2) there are many reasons why humanity might not survive long enough for that to happen — it's likely that humanity's technological progress over the next century will continuously lower the amount of skill, intelligence, and resources needed to accidentally or intentionally do terrible things — and getting FAI before then may be the best long-term solution to that; (3) given that pursuing FAI is likely necessary to avert other huge risks, and is therefore less risky than doing nothing, it's an especially good cause considering that it subsumes all other humanitarian causes (if executed successfully).

Replies from: Perplexed
comment by Perplexed · 2010-10-31T04:36:59.327Z · LW(p) · GW(p)

I feel like I remember trying to answer the same question (asked by you) before ...

Perhaps you did. This time, my question was mostly rhetorical, but since you gave a thoughtful response, it seems a shame to waste it.

(1) eventually ... someone is probably going to build one anyway, probably without being extremely careful ..., and getting FAI before then will probably be the only way to prevent it;

Uh. Prevent it how. I'm curious how that particular sausage will be made.

(2) ... it's likely that humanity's technological progress over the next century will continuously lower the amount of skill, intelligence, and resources needed to accidentally or intentionally do terrible things — and getting FAI before then may be the best long-term solution to that;

More sausage. How does the FAI solve that problem? It seemed that you said the root cause of the problem was technological progress, but perhaps I misunderstood.

(3) ... it subsumes all other humanitarian causes ...

Hmmm. Amnesty International, Doctors without Borders, and the Humane Society are three humanitarian causes that come to mind. FAI subsumes these ... how, exactly?

Again, my questions are somewhat rhetorical. If I really wanted to engage in this particular dialog, I should probably do so in a top-level posting. So please do not feel obligated to respond.

It is just that if Ben Goertzel is so confused as to hope that any sufficiently intelligent entity will automatically empathize with humans, then how much confusion exists here regarding just how much humans will automatically accept the idea of sharing a planet with an FAI? Smart people can have amazing blind spots.

Replies from: ata, orthonormal
comment by ata · 2010-10-31T05:03:22.524Z · LW(p) · GW(p)

If I knew how that sausage will be made, I'd make it myself. The point of FAI is to do a massive amount of good that we're not smart enough to figure out how to do on our own.

Hmmm. Amnesty International, Doctors without Borders, and the Humane Society are three humanitarian causes that come to mind. FAI subsumes these ... how, exactly?

If humanity's extrapolated volition largely agrees that those causes are working on important problems, problems urgent enough that we're okay with giving up the chance to solve them ourselves if they can be solved faster and better by superintelligence, then it'll do so. Doctors Without Borders? We shouldn't be needing doctors (or borders) anymore. Saying how that happens is explicitly not our job — as I said, that's the whole point of making something massively smarter than we are. Don't underestimate something potentially hundreds or thousands or billions of times smarter than every human put together.

Replies from: MichaelVassar
comment by MichaelVassar · 2010-11-03T20:13:52.814Z · LW(p) · GW(p)

I actually think we know how to do the major 'trauma care for civilization' without FAI at this point. FAI looks much cheaper and possibly faster though, so in the process of doing the "trauma care" we should obviously fund it as a top priority. I basically see it as the largest "victory point" option in a strategy game.

comment by orthonormal · 2010-11-07T18:16:03.069Z · LW(p) · GW(p)

When answering questions like this, it's important to make the following disclaimer: I do not know what the best solution is. If a genuine FAI considers these questions, ve will probably come up with something much better. I'm proposing ideas solely to show that some options exist which are strictly preferable to human extinction, dystopias, and the status quo.

It's pretty clear that (1) we don't want to be exterminated by a rogue AI, or nanotech, or plague, or nukes, (2) we want to have aging and disease fixed for us (at least for long enough to sit back and clearly think about what we want of the future), and (3) we don't want an FAI to strip us of all autonomy and growth in order to protect us. There are plenty of ways to avoid both of these failure modes (extinction and an overprotective FAI). For one, the FAI could basically act as a good Deist god should have: fix the most important aspects of aging, disease and dysfunction, make murder (and construction of superweapons/unsafe AIs) impossible via occasional miraculous interventions, but otherwise hang back and let us do our growing up. (If at some point humanity decides we've outgrown its help, it should fade out at our request.) None of this is technically that difficult, given nanotech.

Personally, I think a FAI could do much better than this scenario, but if I talked about that we'd get lost arguing the weird points. I just want to ask, is there a sense in which this lower bound would really seem like a dystopia to you? (If so, please think for a few minutes about possible fixes first.)

Replies from: Perplexed
comment by Perplexed · 2010-11-07T18:27:15.797Z · LW(p) · GW(p)

I just want to ask, is there a sense in which this lower bound would really seem like a dystopia to you?

No, not at all. It sounds pretty good. However, my opinion of what you describe is not the issue. The issue is what ordinary, average, stupid, paranoid, and conservative people think about the prospect of a powerful AI totally changing their lives when they have only your self-admittedly ill informed assurances regarding how good it is going to be.

Replies from: orthonormal
comment by orthonormal · 2010-11-07T18:45:36.452Z · LW(p) · GW(p)

Please don't move the goalposts. I'd much rather know whether I'm convincing you than whether I'm convincing a hypothetical average Joe. Figuring out a political case for FAI is important, but secondary to figuring out whether it's actually possible and desirable.

Replies from: Perplexed
comment by Perplexed · 2010-11-07T18:56:28.455Z · LW(p) · GW(p)

Ok, I don't mean to be unfairly moving goalposts around. But I will point out that gaining my assent to a hypothetical is not the same as gaining my agreement regarding the course that ought to be followed into an uncertain future.

Replies from: orthonormal
comment by orthonormal · 2010-11-07T19:08:26.023Z · LW(p) · GW(p)

That's fair enough. The choice of course depends on whether FAI is even possible, and whether any group could be trusted to build it. But conditional on those factors, we can at least agree that such a thing is desirable.

comment by MichaelVassar · 2010-11-03T20:11:00.638Z · LW(p) · GW(p)

I'd really appreciate your attempting to write up some SIAI literature to communicate these points to the audiences you are talking about. It is hard.

comment by XiXiDu · 2010-11-01T09:46:53.140Z · LW(p) · GW(p)

And I feel like anyone who really has internalized the idea that minds are allowed to fundamentally care about completely different things...

What is questionable is not the possibility of fundamentally different values but that they could accidentally be implemented. What you are suggesting is that some intelligence is able to evolve a vast repertoire of heuristics, acquire vast amounts of knowledge about the universe, dramatically improve its cognitive flexibility and yet never evolve its values but keep its volition at the level of a washing machine. I think this idea is flawed, or at least not sufficiently backed up to take it seriously right now. I believe that such an incentive, or any incentive, will have to be deliberately and carefully hardcoded or evolved. Otherwise we are merely talking about grey goo scenarios.

comment by MichaelVassar · 2010-11-03T20:10:20.030Z · LW(p) · GW(p)

Whether death is absolutely bad or not is a somewhat confusing question. If, at an emotional level, you can't phrase questions but can only choose between them, then that question can become "Is death okay/good?" by pattern match.

comment by Emile · 2010-10-30T19:21:26.261Z · LW(p) · GW(p)

Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere near as good as it could've been otherwise.

That really depends on what you mean by "our values":

1) The values of modern, western, educated humans? (as opposed to those of the ancient Greek, or of Confucius, or of medieval Islam), or

2) The "core" human values common to all human civilizations so far? ("stabbing someone who just saved your life is a bit of a dick move", "It would be a shame if humanity was exterminated in order to pave the universe with paperclips", etc.)

Both of those are quite fuzzy and I would find it hard to describe either of them precisely enough that even a computer could understand them.

When Eliezer talks of Friendly AI having human value, I think he's mostly talking about the second set (in The Psychological Unity of Mankind). But when Ben or Robin talk about how it isn't such a big deal if values change, because they've already changed in the past, they seem to be referring to the first kind of value.

I would agree with Ben and Robin that it isn't a big deal if our descendants (or Ems or AIs) have values that are at odds with our current, western, values (because they might be "wrong", some might be instrumental values we confuse for terminal values, etc.); but I wouldn't extend that to changes in "fundamental human values".

So I don't think "Ben and Robin are OK with a future without our values" is a good way of phrasing it. The question is more whether there is such a thing as fundamental human values (or is everything cultural?), whether it's easy to hit those in mind-space, etc.

Counterpoints: The Psychological Diversity of Mankind, Human values differ as much as values can differ.

Replies from: AdeleneDawner, Vladimir_Nesov, None, JamesAndrix
comment by AdeleneDawner · 2010-10-30T20:33:44.249Z · LW(p) · GW(p)

I find this distinction useful. According to the OP, I'd be considered a proponent of values-deathism proper, but only in terms of the values you place in the first set; I consider the exploration of values-space to be one of the values in the second set, and a significant part of my objection to the idea of tiling the universe with paperclips is that it would stop that process.

comment by Vladimir_Nesov · 2010-10-30T19:33:11.826Z · LW(p) · GW(p)

That really depends on what you mean by "our values"

Your values are at least something that on reflection you'd be glad happened, which doesn't apply to acting on human explicit beliefs that are often imprecise or wrong. More generally, any heuristic for good decisions you know doesn't qualify. "Don't kill people" doesn't qualify. Values are a single criterion that doesn't tolerate exceptions and status quo assumptions. See magical categories for further discussion.

Replies from: Emile
comment by Emile · 2010-10-30T19:43:35.381Z · LW(p) · GW(p)

But that may not be what Ben implied when saying

I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors.

(I read it as "our ancestors" meaning "the ancient Greeks", not, "early primates" but I may be wrong)

Replies from: wedrifid
comment by wedrifid · 2010-10-30T19:49:04.754Z · LW(p) · GW(p)

I read it as "our ancestors" meaning "the ancient Greeks", not, "early primates" but I may be wrong

In a certain sense "primordial single-celled replicator" may be an even more relevant comparison than either. Left free to deviate, Nash would weed out those pesky 'general primate values'.

comment by [deleted] · 2010-10-31T22:37:09.630Z · LW(p) · GW(p)

Spelling notice (bold added):

When Eliezer talks of Friendly AI having human value, I think he's mostly talking about the second set (in The Psychological Unity of Manking.

Replies from: Emile
comment by Emile · 2010-11-01T08:52:58.850Z · LW(p) · GW(p)

Fixed, thanks.

comment by JamesAndrix · 2010-10-31T03:25:14.464Z · LW(p) · GW(p)

I don't think 2 accurately reflects Eliezer's Preservation target. CEV doesn't ensure beforehand that any of those core values aren't thrown out. What's important is the process by which we decide to adopt or reject values, how that process changes when we learn more, and things like that.

That is also one thing that could now change as the direct result of choices we make, through brain modification, or genetic engineering, or AI's with whole new value-adoption systems. Our intuition tends to treat this as stable even when we know we're dealing with 'different' cultures.

comment by RobinHanson · 2010-10-30T18:36:45.308Z · LW(p) · GW(p)

How about ruler-of-the-universe deathism? Wouldn't it be great if I were sole undisputed ruler of the universe? And yet, thinking that rather unlikely, I don't even try to achieve it. I even think trying to achieve it would be counter-productive. How freakin' defeatist is that?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T18:40:55.653Z · LW(p) · GW(p)

That you won't try incorporates feasibility (and can well be a correct decision, just as expecting defeat may well be correct), but value judgment doesn't, and shouldn't be updated on lack of said feasibility. It's not OK to not take over the world.

I even think trying to achieve it would be counter-productive.

There is no value in trying.

Replies from: DSimon, RobinHanson, wedrifid
comment by DSimon · 2010-10-30T20:04:01.905Z · LW(p) · GW(p)

I think that if I took over the world it might cause me to go Unfriendly; that is, there's a nontrivial chance that the values of a DSimon that rules the world would diverge from my current values sharply and somewhat quickly.

Basically, I just don't think I'm immune to corruption, so I don't personally want to rule the world. However, I do wish that the world had an effective ruler that shared my current values.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T20:08:30.109Z · LW(p) · GW(p)

See this comment. The intended meaning is managing to get your values to successfully optimize the world, not for your fallible human mind to issue orders.

Your actions are pretty "Unfriendly" even now, to the extent they don't further your values because of poor knowledge of what you actually want and poor ability to form efficient plans.

comment by RobinHanson · 2010-10-30T18:46:07.062Z · LW(p) · GW(p)

I don't think you know what "OK" means.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T18:50:31.260Z · LW(p) · GW(p)

Yes, that was some rhetorical applause-lighting on my part with little care about whether you meant what my post seemed to assume you meant. I think the point is worth making (with the deathist interpretation of "OK"), even if it doesn't actually apply to yours or Ben's positions.

comment by wedrifid · 2010-10-30T18:56:09.011Z · LW(p) · GW(p)

It's not OK to not take over the world.

Unless you know you're kind of a git or, more generally, your value system itself doesn't rate 'you taking over the world' highly. I agree with your position though.

It is interesting to note that Robin's comment is all valid when considered independently. The error he makes is that he presents it as a reply to your argument. "Should" is not determined by "probably will".

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T19:23:38.547Z · LW(p) · GW(p)

It's not OK to not take over the world.

Unless you know you're kind of a git or, more generally, your value system itself doesn't rate 'you taking over the world' highly.

It's an instrumental goal, it doesn't have to be valuable in itself. If you don't want your "personal attitude" to apply to the world as a whole, it reflects the fact that your values disagree with your personal attitude, and you prefer for the world to be controlled by your values rather than your personal attitude.

Taking over the world as a human ruler is certainly not what I meant, and I expect is a bad idea with bad expected consequences (apart from independent reasons like being in a position to better manage existential risks).

Replies from: wedrifid
comment by wedrifid · 2010-10-30T19:42:57.486Z · LW(p) · GW(p)

It's an instrumental goal, it doesn't have to be valuable in itself.

The point being that it can be a terminal anti-goal. People could (and some of them probably do) value not-taking-over-the-world very highly. Similarly, there are people who actually do want to die after the normal allotted years, completely independently of sour-grapes updating. I think they are silly, but it is their values that matter to them, not my evaluation thereof.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T19:52:12.150Z · LW(p) · GW(p)

People could (and some of them probably do) value not-taking-over-the-world very highly.

This is a statement about valuation of states of the world, a valuation that is best satisfied by some form of taking over the world (probably much more subtle than what gets classified so by the valuation itself).

I think they are silly, but it is their values that matter to them, not my evaluation thereof.

It's still your evaluation of their situation that says whether you should consider their opinion on the matter of their values, or know what they value better than they do. What is the epistemic content of your thinking they are silly?

Replies from: wedrifid
comment by wedrifid · 2010-10-30T19:53:47.017Z · LW(p) · GW(p)

I do not agree.

comment by MichaelVassar · 2010-11-03T18:23:43.632Z · LW(p) · GW(p)

Voted up. However, I disagree with "it's not OK". Everything is always OK. OK is a feature of the map. From a psychological perspective, that's important. If an OK state of the map can't be generated by changing the territory, it WILL be generated by cheating and directly manipulating the map.

That said, we have preferences, rank orderings of outcomes. The value of futures with our values is high.

comment by Mass_Driver · 2010-10-30T21:33:10.183Z · LW(p) · GW(p)

OK, fine, literally speaking, value drift is bad.

But if I live to see the Future, then my values will predictably be updated based on future events, and it is part of my current value system that they do so. I affirmatively value future decisions being made by entities that have taken a look at what the future is actually like and reflected on the data they gain.

Why should this change if it turns out that I don't live to see the future? I would like future-me to be one of the entities that help make future decisions, but failing that, my second-best option is to have future-others make those decisions. I don't want present-me's long dead hand to go around making stupid decisions for other people.

Even if future people's values seem bizarre to present-me, that just reflects the fact that it takes a while to process the data that leads to shifts in value. Presumably, if you show me the values and the cultural/demographic landscape of 2050 now, then by 2020 I'll endorse many of them and be chewing on most of the rest. Why arbitrarily privilege my initial disgust over my considered reaction?

Replies from: Matt_Simpson, Vladimir_Nesov
comment by Matt_Simpson · 2010-10-30T21:36:44.880Z · LW(p) · GW(p)

But if I live to see the Future, then my values will predictably be updated based on future events, and it is part of my current value system that they do so.

That's not your value system changing, that's your assessment of how to best achieve your values changing edit: or your best guess of what your values actually are changing. end edit You're using the term 'value' in a different sense from the original post.

comment by Vladimir_Nesov · 2010-10-30T21:44:45.184Z · LW(p) · GW(p)

Read the comments ([1], [2]).

Replies from: Mass_Driver
comment by Mass_Driver · 2010-10-31T02:40:25.209Z · LW(p) · GW(p)

OK, fine. That's a perfectly reasonable way to use the word "values." If that's what you mean, though, then I don't think any of us should get worked up about value drift. We can't even specify most of our top-node values with any kind of precision or accuracy -- why should we care if (a) they change or (b) a world that we personally do not live in becomes optimized for other values?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-31T02:55:53.897Z · LW(p) · GW(p)

We can't even specify most of our top-node values with any kind of precision or accuracy -- why should we care if (a) they change or (b) a world that we personally do not live in becomes optimized for other values?

Where you don't have any preference, you have indifference, and you are not indifferent all around. There is plenty of content to your values. Uncertainty and indifference are no foes to accuracy, they can be captured as precisely as any other concept.

Whether "you don't personally live" in the future is one property of the future to consider: would you like that property to hold? An uncaring future won't have you living in it, but a future that holds your values will try to arrange something at least as good, or rather much better.

Also see Belief in the Implied Invisible. What you can't observe is still there, and still has moral weight.

Replies from: PeterS
comment by PeterS · 2010-10-31T08:15:03.699Z · LW(p) · GW(p)

As Poincaré said, "Every definition implies an axiom, since it asserts the existence of the object defined." You can call a value a "single criterion that doesn't tolerate exceptions and status quo assumptions" -- but it's not clear to me that I even have values, in that sense.

Of course, I will believe in the invisible, provided that it is implied. But why is it, in this case?

You also speak of the irrelevance (in this context) of the fact that these values might not even be feasibly computable. Or, even if we can identify them, there may be no feasible way to preserve them. But you're talking about moral significance. Maybe we differ, but to me there is no moral significance attached to the destruction of an uncomputable preference by a course of events that I can't control.

It might be sad/horrible to live to see such days (if only by definition -- as above, if one can't compute their top-node values then it's possible that one can't compute how horrible it would be), as you say. It also might not. Although I can't speak personally for the values of a Stoic, they might be happy to... well, be happy.

comment by cousin_it · 2010-10-31T13:08:33.342Z · LW(p) · GW(p)

Saying a certain future is "not ok", and saying gradual value drift is "business as usual", are both value judgments. I don't understand why you dismiss one of them but not the other, and call it "courageous and rational".

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-31T13:22:19.280Z · LW(p) · GW(p)

I don't understand your comment enough to reply ("business as usual"? - that was more of a fallacy pattern). Maybe if you write in more detail?

Replies from: cousin_it
comment by cousin_it · 2010-10-31T13:41:01.953Z · LW(p) · GW(p)

"Business as usual" can't be a "fallacy pattern" because it's not a statement about facts, it's a value statement that says we're okay with value drift (as long as it's some sort of "honest drift", I assume). As you see from the others' comments, people do really subscribe to this, so you're not allowed to dismiss it the way you do.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-31T13:48:42.882Z · LW(p) · GW(p)

it's a value statement that says we're okay with value drift (as long as it's some sort of "honest drift", I assume). And yeah, people do really subscribe to this, so you're not allowed to dismiss it the way you do.

(Gradual) value drift -> Future not optimized -> Future worse than if it's optimized -> Bad. People subscribe to many crazy things. This one is incorrect pretty much by definition.

(Note that a singleton can retain optimized future while allowing different-preference or changing-preference agents to roam in its domain without disturbing the overall optimality of the setup and indeed contributing to it, if that's preferable. This would not be an example of value drift.)

Replies from: cousin_it, Perplexed
comment by cousin_it · 2010-10-31T16:05:25.881Z · LW(p) · GW(p)

Value drift is a real-world phenomenon, something that happens to humans. No real-world phenomenon can be bad "by definition" - our morality is part of reality and cannot be affected by the sympathetic magic of dictionaries. Maybe you're using "value drift" in a non-traditional sense, as referring to some part of your math formalism, instead of the thing that happens to humans?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-31T16:21:05.916Z · LW(p) · GW(p)

No real-world phenomenon can be bad "by definition"

By definition, as in the property is logically deduced (given some necessary and not unreasonable assumptions). Consider a bucket with 20 apples of at least 100 grams each. Such buckets exist, they are "real-world phenomena". Its weight is at least 2 kg "by definition" in the same sense as misoptimized future is worse than optimized future.

Replies from: cousin_it
comment by cousin_it · 2010-10-31T23:11:41.071Z · LW(p) · GW(p)

I think I figured out a way to incorporate value drift into your framework: "I value the kind of future that is well-liked by the people actually living in it, as long as they arrived at their likes and dislikes by honest value drift starting from us". Do you think anyone making such a statement is wrong?

Replies from: wnoise, Vladimir_Nesov, ata
comment by wnoise · 2010-11-01T01:14:24.096Z · LW(p) · GW(p)

It's a start. But note that it is trivially satisfied by a world that has no one living in it.

comment by Vladimir_Nesov · 2010-11-01T11:42:00.743Z · LW(p) · GW(p)

"I value the kind of future that is well-liked by the people actually living in it..."

If what you value happens to be what's valued by future people, then future people are simultaneously stipulated to have the same values as you do. You don't need the disclaimers about "honest value drift", and there is actually no value drift.

If there is genuine value drift, then after long enough you won't like the same situations as the future people. If you postulate that you only care about the pattern of future people liking their situation, and not other properties of that situation, you are embracing a fake simplified preference, similarly to people who claim that they only value happiness or lack of suffering.

comment by ata · 2010-11-01T00:09:33.380Z · LW(p) · GW(p)

What's "honest value drift" and what's good about it? Normally "value drift" makes me think of our axiology randomly losing information over time; the kind of future-with-different-values I'd value is the kind that has different instrumental values from me (because I'm not omniscient and my terminal values may not be completely consistent) but is more optimized according to my actual terminal values, presumably because that future society will have gotten better at unmuddling its terminal values, knowing enough to agree more on instrumental values, and negotiating any remaining disagreements about terminal values.

comment by Perplexed · 2010-10-31T16:13:43.853Z · LW(p) · GW(p)

(Gradual) value drift -> Future not optimized -> Future worse than if it's optimized -> Bad

value drift -> Future not optimized to original values -> Future more aligned with new values -> Bad from original viewpoint, better from new viewpoint, not optimal from either viewpoint, but what can you do?

Use of word "optimized" without specifying the value system against which the optimization took place --> variant of mind projection fallacy.

ETA: There is a very real sense in which it is axiomatic both that our value system is superior to the value system of our ancestors and that our values are superior to those of our descendants. This is not at all paradoxical - our values are better simply because they are ours, and therefore of course we see them as superior to anyone else's values.

Where the paradox arises is in jumping from this understanding to the mistaken belief that we ought not to ever change our values.

Replies from: Kutta
comment by Kutta · 2010-11-01T12:52:22.204Z · LW(p) · GW(p)

Bad from original viewpoint, better from new viewpoint, not optimal from either viewpoint, but what can you do?

Where the paradox arises is in jumping from this understanding to the mistaken belief that we ought not to ever change our values.

A compelling moral argument may change our values, but not our moral frame of reference.

The moral frame of reference is like a forking bush of possible future value systems stemming from a current human morality; it represents human morality's ability to modify itself upon hearing moral arguments.

The notion of moral argument and moral progress is meaningful within my moral frame of reference, but not meaningful relative to a paperclipper utility function. A paperclipper will not ever switch to stapler maximization on any moral argument; a consistent paperclipper does not think that it will possibly modify its utility function upon acquiring new information. In contrast, I think that I will possibly modify my morality for the better, it's just that I don't yet know the argument that will compel me, because if I knew it I would have already changed my mind.

It is not impossible that paperclipping is the endpoint to all moral progress, and there exists a perfectly compelling chain of reasoning that converts all humans to paperclippers. It is "just" vanishingly unlikely. We cannot, of course, observe our moral frame of reference from an outside omniscient vantage point but we're able to muse about it.

If we do assume omniscience for a second, then there is a space of values that humans would never willingly modify themselves into. Value drift means drifting into such space rather than a modification of values in general.

There is a very real sense in which it is axiomatic both that our value system is superior to the value system of our ancestors and that our values are superior to those of our descendants. This is not at all paradoxical - our values are better simply because they are ours, and therefore of course we see them as superior to anyone else's values.

If our ancestors and our descendants are in the same moral frame of reference then you could possibly convert your ancestors or most of your ancestors to your morality and be converted to future morality by future people. Of course it is not easy to say which means of conversion are valid; on the most basic level I'd say that rearranging your brain's atoms to a paperclipper breaks out of the frame of reference while verbal education and arguments generally don't.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-11-01T13:04:44.919Z · LW(p) · GW(p)

If we do assume omniscience for a second, then there is a space of values that humans would never willingly modify themselves into. Value drift means drifting into such space rather than a modification of values in general.

Rather (in your terminology), value drift is change in the moral frame of reference, even if (current instrumental) morality stays the same.

Replies from: Kutta
comment by Kutta · 2010-11-01T13:22:40.459Z · LW(p) · GW(p)

I agree, it seems a more general way of putting it.

Anyway, now that you mention it I'm intrigued and slightly freaked out by a scenario in which my frame of reference changes without my current values changing. First, is it even knowable when it happens? All our reasoning is based on current values. If an alien race comes and modifies us in a way that our future moral progress changes but not our current values, we could never know the change happened at all. It is a type of value loss that preserves reflective consistency. I mean, we wouldn't agree to be changed to paperclippers but on what basis could we refuse an unspecified change to our moral frame of reference (leaving current values intact)?

Replies from: Perplexed, Vladimir_Nesov
comment by Perplexed · 2010-11-01T15:40:18.733Z · LW(p) · GW(p)

I'm not sure I understand this talk of "moral frames of reference" vs simply "values".

But would an analogy to frame change be theory change? As when we replace Newton's theory of gravity with Einstein's theory, leaving the vast majority of theoretical predictions intact?

In this analogy, we might make the change (in theory or moral frame) because we encounter new information (new astronomical or moral facts) that impel the change. Or, we might change for the same reason we might change from the Copenhagen interpretation to MWI - it seems to work just as well, but has greater elegance.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-11-01T15:51:55.316Z · LW(p) · GW(p)

I'm not sure I understand this talk of "moral frames of reference" vs simply "values".

By analogy, take a complicated program as the "frame of reference", and the state of knowledge about what it outputs as the "current values". As you learn more, "current values" change, but frame of reference, defining the subject matter, stays the same and determines the direction of discovering more precise "current values".

Note that the exact output may well be unknowable in its explicit form, but "frame of reference" says precisely what it is. Compare with infinite mathematical structures that can never be seen "explicitly", but with the laws of correct reasoning about them perfectly defined.
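
A minimal sketch of this analogy in code, in case it helps to make it concrete (the particular program and series are invented purely for illustration):

```python
# The definition below pins a number down exactly -- that is the "frame of
# reference" -- even though at any moment we only hold an approximation of
# it, the "current values".

def current_values(n_terms: int) -> float:
    """Partial sums of 4*(1 - 1/3 + 1/5 - ...), which converge to pi."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(n_terms))

# Computing (reflecting) more sharpens the estimate, but the thing being
# approximated is fixed by the definition and never changes.
for n in (10, 1_000, 100_000):
    print(n, current_values(n))
```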

Replies from: Kutta
comment by Kutta · 2010-11-01T16:24:57.303Z · LW(p) · GW(p)

As you learn more, "current values" change, but frame of reference, defining the subject matter, stays the same and determines the direction of discovering more precise "current values".

Is there potential divergence of "current values" in this analogy (or in your model of morality)?

comment by Vladimir_Nesov · 2010-11-01T13:59:24.704Z · LW(p) · GW(p)

Moral frame of reference determines the direction in exploration of values, but you can't explicitly know this direction, even if you know its definition, because otherwise you'd already be there. It's like with definition of action in ambient control. When definition is changed, you have no reason to expect that the defined thing remains the same, even though at that very moment your state of knowledge about the previous definition might happen to coincide with your state of knowledge about the new definition. And then the states of knowledge go their different ways.

comment by PhilGoetz · 2010-11-07T19:32:02.826Z · LW(p) · GW(p)

And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.

I still find it shocking and terrifying every time someone compares the morphing of human values with the death of the universe. Even though I saw another FAI-inspired person do it yesterday.

If all intelligent life held your view about the importance of their own values, then life in the universe would be doomed. The outcome of that view is that intelligent life greatly increases its acceptable ratio of (risk of destroying all life) / (chance of preserving its value system). (This is especially a problem when there are multiple intelligent beings with value systems that differ, as there already are.) The fragility of life in the long term means we can't afford that luxury. It will be hard enough to avoid the death of the universe even if we all cooperate.

Publicly stating the view that you cannot value the existence of anything but agents implementing your own values (even after your death), makes cooperation very difficult. It's easier to cooperate with someone who is willing to compromise.

Someone will complain that I can't value anything but my own values; that's a logical impossibility. But notice that I never said I value anything but my own values! The trick is that there's a difference between acting to maximize your values, versus going meta, and saying that you must value the presence of your values in others. There is no law of nature saying that a utility function U must place a high value on propagating U to other agents. In fact, there are many known utility functions U that would place a negative value on propagating their values to other agents! And it would be interesting if human utility functions are represented in a manner that is even capable of self-reference.

(The notion that your utility function says you must propagate your utility function only makes sense if you assume you will become a singleton.)
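
A hedged toy sketch of this point (the "World"/"gardens" setup is invented purely for illustration): a coherent utility function can assign negative value to other agents coming to share it, so nothing forces a value system to value its own propagation.

```python
from dataclasses import dataclass

@dataclass
class World:
    gardens: int           # the thing U terminally cares about
    agents_sharing_U: int  # how many other agents also optimize U

def U(world: World) -> float:
    # U likes gardens and dislikes copies of itself -- say it values being
    # the unique gardener.  Propagating U is then actively disvalued by U.
    return world.gardens - 10 * world.agents_sharing_U

print(U(World(gardens=100, agents_sharing_U=0)))  # 100
print(U(World(gardens=100, agents_sharing_U=5)))  # 50: spreading U made things worse by U's own lights
```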

Even if you insist that you must propagate your utility function (or you plan on becoming a singleton), you should be able to reach a reflective equilibrium, and realize that attempting to force your values on the universe will result in a dead universe; and accept a compromise with other value systems. Avoiding that compromise, not saving humanity, is, I think, the most plausible reason for the FAI+CEV program.

But, falling short of reaching that equilibrium, I would like it if all the CEVers would at least stop talking about Human Values as if they were a single atomic package. If you model values as being like genes, and say that you want to transmit your values in the same way that an organism wants to transmit its genes (which it doesn't, by the way; and it is in exactly this way that the Yudkowskian attachment to values makes no sense - insisting that you must propagate your values because that's what your values want is exactly like insisting you must propagate your genes because that's what your genes want; it is forgetting that you are a computational system, and positing a value homunculus who looks at you with your own eyes to figure out what you should do), then that would at least allow for the possibility of value-altruism in a way isomorphic to kin selection (sometimes sacrificing your complete value package in order to preserve a set of related value packages).

I wish I could save up all my downvotes for the year, and apply them all to this post (if I could without thus having to apply them to its author - I don't want to get personal about it); for this is the single most dangerous idea in the LessWrong memespace; the one thing rotten at the core of the entire FAI/CEV project as conceived of here.

Replies from: jacob_cannell
comment by jacob_cannell · 2011-01-30T19:59:05.404Z · LW(p) · GW(p)

I still find it shocking and terrifying every time someone compares the morphing of human values with the death of the universe. Even though I saw another FAI-inspired person do it yesterday.

Agreed.

for this is the single most dangerous idea in the LessWrong memespace; the one thing rotten at the core of the entire FAI/CEV project as conceived of here

Yes, this (value drift -> death of the universe) belief needs to be excised.

comment by Perplexed · 2010-10-31T00:10:42.927Z · LW(p) · GW(p)

Goertzel: Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology.

Agree, but the multiple different current forms of human values are the source of much conflict.

Hanson: Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors.

Agree again. And in honor of Robin's profession, I will point out that the multiple current forms of human values are the driving force causing trade, and almost all other economic activity.

Nesov: Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere near as good as it could've been otherwise. ... Regardless of difficulty of the challenge, it's NOT OK to lose the Future.

Strongly disagree. The future is not ours to lose. A growing population of enfranchised agents is going to be sharing that future with us. We need to discount our own interest in that future for all kinds of reasons in order to achieve some kind of economic sanity. We need to discount because:

  • We really do care more about the short-term future than the distant future.
  • We have better control over the short-term future than the distant future.
  • We expect our values to change. Change can be good. It would be insane to attempt to determine the distant future now. Better to defer decisions about the distant future until later, when that future eventually becomes the short-term future. We will then have a better idea what we want and a better idea how to achieve it.
  • As mentioned, an increasing immortal population means that our "rights" over the distant future must be fairly dilute.
  • If we don't discount the future, we run into mathematical difficulties. The first rule of utilitarianism ought to be KIFS - Keep It Finite, Stupid.
Replies from: timtyler, Vladimir_Nesov, timtyler, lukstafi
comment by timtyler · 2010-10-31T10:22:31.464Z · LW(p) · GW(p)

If we don't discount the future, we run into mathematical difficulties. The first rule of utilitarianism ought to be KIFS - Keep It Finite, Stupid.

http://lesswrong.com/lw/n2/against_discount_rates/

The idea is not really that you care equally about future events - but rather that you don't care about them to the extent that you are uncertain about them; that you are likely to be unable to influence them; that you will be older when they happen - and so on.

It is like in chess: future moves are given less consideration - but only because they are currently indistinct low probability events - and not because of some kind of other intrinsic temporal discounting of value.
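
A rough sketch of that distinction (function names and numbers are invented for illustration): the effective weight on a future event can fall off with time because the event becomes less certain and less controllable, without any intrinsic discounting of value.

```python
def effective_weight(t: int, p_per_step: float = 0.9) -> float:
    # Probability that the anticipated situation still comes about (and is
    # still influenceable) after t steps -- an epistemic fact, not a value.
    return p_per_step ** t

def intrinsic_discount(t: int, gamma: float = 0.9) -> float:
    # Temporal discounting proper: future value counts for less by fiat.
    return gamma ** t

# The two columns coincide here, which is why they are easy to conflate --
# but only the first climbs back toward 1.0 if the uncertainty is removed
# (p_per_step -> 1); gamma is indifferent to evidence.
for t in (0, 5, 20):
    print(t, round(effective_weight(t), 4), round(intrinsic_discount(t), 4))
```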

comment by Vladimir_Nesov · 2010-12-19T00:16:58.100Z · LW(p) · GW(p)

We really do care more about the short-term future than the distant future.

How do you know this? It feels this way, but there is no way to be certain.

We have better control over the short-term future than the distant future.

That we probably can't have something doesn't imply we shouldn't have it.

We expect our values to change. Change can be good.

That we expect something to happen doesn't imply it's desirable that it happens. It's very difficult to arrange so that change in values is good. I expect you'd need oversight from a singleton for that to become possible (and in that case, "changing values" won't adequately describe what happens, as there is probably better stuff to make than different-valued agents).

As mentioned, an increasing immortal population means that our "rights" over the distant future must be fairly dilute.

Preference is not about "rights". It's merely game theory for coordination of satisfaction of preference.

If we don't discount the future, we run into mathematical difficulties. The first rule of utilitarianism ought to be KIFS - Keep It Finite, Stupid.

God does not care about our mathematical difficulties. --Einstein.

Replies from: Perplexed, timtyler
comment by Perplexed · 2010-12-19T05:45:28.353Z · LW(p) · GW(p)

We really do care more about the short-term future than the distant future.

How do you know this? It feels this way, but there is no way to be certain.

Alright. I shouldn't have said "we". I care more about the short term. And I am quite certain. WAY!

We have better control over the short-term future than the distant future.

That we probably can't have something doesn't imply we shouldn't have it.

Huh? What is it that you are not convinced we shouldn't have? Control over the distant future? Well, if that is what you mean, then I have to disagree. We are completely unqualified to exercise that kind of control. We don't know enough. But there is reason to think that our descendants and/or future selves will be better informed.

God does not care about our mathematical difficulties.

Then lets make sure not to hire the guy as an FAI programmer.

Replies from: Vladimir_Nesov, Nick_Tarleton
comment by Vladimir_Nesov · 2010-12-19T13:36:10.644Z · LW(p) · GW(p)

We really do care more about the short-term future than the distant future.

How do you know this? It feels this way, but there is no way to be certain.

Alright. I shouldn't have said "we". I care more about the short term. And I am quite certain. WAY!

I believe you know my answer to that. You are not licensed to have absolute knowledge about yourself. There are no human or property rights on truth. How do you know that you care more about short term? You can have beliefs or emotions that suggest this, but you can't know what all the stuff you believe and all the moral arguments you respond to cash out into on reflection. We only ever know approximate answers, and given the complexity of human decision problem and sheer inadequacy of human brains, any approximate answers we do presume to know are highly suspect.

Huh? What is it that you are not convinced we shouldn't have? Control over the distant future? Well, if that is what you mean, then I have to disagree. We are completely unqualified to exercise that kind of control. We don't know enough. But there is reason to think that our descendants and/or future selves will be better informed.

That we aren't qualified doesn't mean that we shouldn't have that control. Exercising this control through decisions made with human brains is probably not it of course, we'd have to use finer tools, such as FAI or upload bureaucracies.

God does not care about our mathematical difficulties.

Then lets make sure not to hire the guy as an FAI programmer.

Don't joke, it's serious business. What do you believe on the matter?

Replies from: Perplexed
comment by Perplexed · 2010-12-19T15:11:19.522Z · LW(p) · GW(p)

God does not care about our mathematical difficulties.

Then lets make sure not to hire the guy as an FAI programmer.

Don't joke, it's serious business. What do you believe on the matter?

I am not the person who initiated this joke. Why did you mention God? If you don't care for discounting, what is your solution to the very standard puzzles regarding unbounded utilities and infinitely remote planning horizons?
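
For concreteness, the simplest of those standard puzzles, sketched with arbitrary invented numbers: with no discounting, infinite-horizon utility sums grow without bound, so ranking futures by total utility breaks down; geometric discounting restores convergence, at the price of caring less about the far future.

```python
def total_utility(per_period: float, horizon: int, gamma: float = 1.0) -> float:
    """Discounted sum of a constant utility stream over a finite horizon."""
    return sum(per_period * gamma ** t for t in range(horizon))

# Undiscounted (gamma = 1): both streams grow without bound as the horizon
# grows, so in the infinite-horizon limit both are "infinite" and comparison
# by total utility stops being informative.
print(total_utility(1.0, 10**5))              # 100000.0
print(total_utility(2.0, 10**5))              # 200000.0

# With gamma < 1 the sums converge to per_period / (1 - gamma), and the
# comparison is well-defined again.
print(total_utility(1.0, 10**5, gamma=0.99))  # ~100
print(total_utility(2.0, 10**5, gamma=0.99))  # ~200
```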

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-19T15:20:09.050Z · LW(p) · GW(p)

I am not the person who initiated this joke. Why did you mention God?

Einstein mentioned God, as a stand-in for Nature.

If you don't care for discounting, what is your solution to the very standard puzzles regarding unbounded utilities and infinitely remote planning horizons?

I didn't say I don't care for discounting. I said that I believe that we must be uncertain about this question. That I don't have solutions doesn't mean I must discard the questions as answered negatively.

comment by Nick_Tarleton · 2010-12-19T05:50:40.582Z · LW(p) · GW(p)

We are completely unqualified to exercise that kind of control. We don't know enough. But there is reason to think that our descendants and/or future selves will be better informed.

Yes. So, for "our values", read "our extrapolated volition".

It's not clear to me how much you and Nesov actually disagree about "changing" values, vs. you meaning by "change" the sort of reflective refinement that CEV is supposed to incorporate, while Nesov uses it to mean non-reflectively-guided (random, evolutionary, or whatever) change.

Replies from: Perplexed
comment by Perplexed · 2010-12-19T06:26:03.078Z · LW(p) · GW(p)

I do not mean "reflective refinement" if that refinement is expected to take place during a FOOM that happens within the next century or two. I expect values to change after the first superhuman AI comes into existence. They will inevitably change by some small epsilon each time a new physical human is born or an uploaded human is cloned. I want them to change. The "values of mankind" are something like the musical tastes of mankind or the genome of mankind. It is a collage of divergent things, and the set of participants in that collage continues to change.

VN and I are in real disagreement, as far as I can tell.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-19T13:40:17.212Z · LW(p) · GW(p)

This is not a disagreement, but failure of communication. There is no one relevant sentence in this dispute which we both agree that we understand in the same sense, and whose truth value we assign differently.

Replies from: Perplexed
comment by Perplexed · 2010-12-19T15:02:29.193Z · LW(p) · GW(p)

It is a complete failure of communication if you are under the impression that the dispute has anything to do with the truth values of sentences. I am under the impression that we are in dispute because we have different values - different aspirations for the future.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-19T15:16:29.941Z · LW(p) · GW(p)

It is a complete failure of communication if you are under the impression that the dispute has anything to do with the truth values of sentences. I am under the impression that we are in dispute because we have different values - different aspirations for the future.

Any adequate disagreement must be about different assignment of truth values to the same meaning. For example, I disagree with the truth of the statement that we don't converge on agreement because of differences in our values, given both yours and mine preferred interpretation of "values". But explaining the reason for this condition not being the source of our disagreement requires me to explain to you my sense of "values", the normative and not factual one, which I fail to accomplish.

Replies from: Perplexed
comment by Perplexed · 2010-12-19T15:41:55.947Z · LW(p) · GW(p)

Any adequate disagreement must be about different assignment of truth values to the same meaning.

I think we are probably in agreement that we ought to mean the same thing by the words we use before our disagreement has any substance. But your mention of "truth values" here may be driving us into a diversion from the main issue. Because I maintain that simple "ought" sentences do not have truth values. Only "is" sentences can be analyzed as true or false in Tarskian semantics.

But that is a diversion. I look forward to your explanation of your sense of the word "value" - a sense which has the curious property (as I understand it) that it would be a tragedy if mankind does not (with AI assistance) soon choose one point (out of a "value space" of rather high dimensionality) and then fix that point for all time as the one true goal of mankind and its creations.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-19T16:00:03.039Z · LW(p) · GW(p)

But your mention of "truth values" here may be driving us into a diversion from the main issue.

I gave up on the main issue, and so described my understanding of the reasons that justify giving up.

Because I maintain that simple "ought" sentences do not have truth values. Only "is" sentences can be analyzed as true or false in Tarskian semantics.

Yes, and this is the core of our disagreement. Since your position is that something is meaningless, and mine is that there is a sense behind that, this is a failure of communication and not a true disagreement, as I didn't manage to communicate to you the sense I see. At this point, I can only refer you to "metaethics sequence", which I know is not very helpful.

One last attempt, using an intuition/analogy dump not carefully explained.

Where do the objective conclusions about "is" statements come from? Roughly, you encounter new evidence, including logical evidence, and then you look back and decide that your previous understanding could be improved upon. This is the cognitive origin of anything normative: you have a sense of improvement, and expectation of potential improvement. Looking at the same situation from the past, you know that there is a future process that can suggest improvements, you just haven't experienced this process yet. And so you can reason about the truth without having it immediately available.

If you understand the way the previous paragraph explains the truth of "is" questions, you can apply exactly the same explanation to "ought" questions. You can decide in the moment what you prefer, what you choose, which action you perform. But in the future, when you learn more, experience more, you can look back and see that you should've chosen differently, that your decision could've been improved. This anticipation of possible improvement generates a semantics of preference over decisions that is not logically transparent. You don't know what you ought to choose, but you know that there is a sense in which some action is preferable to some other action, and you don't know which is which.

Replies from: Perplexed
comment by Perplexed · 2010-12-19T17:06:48.468Z · LW(p) · GW(p)

I gave up on the main issue, and so described my understanding of the reasons that justify giving up.

Sorry. I missed that subtext. Giving up may well be the best course.

your position is that something is meaningless, and mine is that there is a sense behind that, this is a failure of communication.

But my position is not that something (specifically an 'ought' statement) is meaningless. I only maintain that the meaning is not attained by assigning "truth value conditions".

One last attempt ...

Your attempt was a step in the right direction, but IMO it still leaves a large gap in understanding. You seem to think that anyone who thinks carefully enough will agree with you that there is some set of core meta-ethical principles that acts as an attractor in a dynamic process of reflective updating.

I disagree with this. There is no core attractor, and the dynamic process is not one of better and better thinking as time goes on. Instead, the dynamics I am talking about is the biological evolutionary process which results in a change over time in the typical human brain. That plus the technological change over time which is likely to bring uploaded humans, AIs, aliens, and "uplifted" non-human animals into our collective social contract.

Replies from: timtyler
comment by timtyler · 2010-12-19T17:17:07.334Z · LW(p) · GW(p)

You seem to think that anyone who thinks carefully enough will agree with you that there is some set of core meta-ethical principles that acts as an attractor in a dynamic process of reflective updating. I disagree with this. There is no core attractor, and the dynamic process is not one of better and better thinking as time goes on.

How can we know whether that is true or not? If we had access to multiple mature alien races, and could examine their moral systems, that might be a reasonable conclusion - if they were all very different. However, until then, the moral systems we can see are primitive - and any such conclusions would seem to be premature.

Replies from: Perplexed
comment by Perplexed · 2010-12-19T17:24:17.039Z · LW(p) · GW(p)

How can we know whether that is true or not?

I'm sorry. I don't know which statement you mean to designate by "that".

... any such conclusions would seem to be premature.

Nor do I know which conclusions you worry might be premature.

To the best of my knowledge, I did not draw any conclusions.

comment by timtyler · 2010-12-19T11:42:07.682Z · LW(p) · GW(p)

It's very difficult to arrange for a change in values to be good. I expect you'd need oversight from a singleton for that to become possible (and in that case, "changing values" won't adequately describe what happens, as there is probably better stuff to make than different-valued agents).

We do seem to have an example of systematic positive change in values - the history of the last thousand years. No doubt some will argue that our values only look "good" because they are closest to our current values - but I don't think that is true. Another possible explanation is that material wealth lets us show off our more positive values more frequently. That's a harder charge to defend against, but wealth-driven value changes are surely still value changes.

Systematic, positive changes in values tend to suggest a bright future. Go, cultural evolution!

comment by timtyler · 2010-12-19T17:35:05.007Z · LW(p) · GW(p)

If we don't discount the future, we run into mathematical difficulties. The first rule of utilitarianism ought to be KIFS - Keep It Finite, Stupid.

Too much discounting runs into the problem of screwing up the future in order to enjoy short-term benefits. With 5-year political horizons, that problem seems far more immediate and pressing than the problems posed by discounting too little. From the point of view of those fighting the evils that too much temporal discounting represents, arguments about mathematical infinity seem ridiculous and useless. Since such arguments are so feeble, why even bother mentioning them?
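
For concreteness, here is a minimal sketch of the divergence worry that "keep it finite" points at (the constant utility stream is assumed purely for illustration):

```latex
% Undiscounted: a constant per-period utility u > 0 gives a divergent total,
% so the sum can no longer distinguish policies that differ in only finitely many periods.
\sum_{t=0}^{\infty} u = \infty
% Exponential discounting with \gamma \in (0,1) keeps the same stream finite:
\sum_{t=0}^{\infty} \gamma^t u = \frac{u}{1 - \gamma}
```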

comment by lukstafi · 2010-10-31T10:01:15.049Z · LW(p) · GW(p)

I agree, but be careful with "We expect our values to change. Change can be good." Dutifully explain that you are not talking about value change in the mathematical sense, but about value creation, i.e. extending valuation to novel situations, guided by values at a meta-level with respect to the values casually applied to remotely similar familiar situations.

Replies from: Perplexed
comment by Perplexed · 2010-10-31T13:53:36.500Z · LW(p) · GW(p)

I beseech you, in the bowels of Christ, think it possible your fundamental values may be mistaken.

I think that we need to be able to change our minds about fundamental values, just as we need to be able to change our minds about fundamental beliefs. Even if we don't currently know how to handle this kind of upheaval mathematically.

If that is seen as a problem, then we had better get started on building better mathematics.

Replies from: lukstafi, timtyler
comment by lukstafi · 2010-10-31T20:45:32.540Z · LW(p) · GW(p)

OK. I've been sympathetic with your view from the beginning, but haven't really thought through (so, thanks) the formalization that puts values on the epistemic level: a distribution of beliefs over propositions "my-value (H, X)", where H is my history up to now and X is a preference (an order over world states, which include me and my actions). But note that people here will call the very logic you use to derive such distributions your value system.

ETA: obviously, distribution "my-value (H1, X[H2])", where "X[H2]" is the subset of worlds where my history turns out to be "H2", can differ greatly from "my-value (H2, X[H2])", due to all sorts of things, but primarily due to computational constraints (i.e. I think the formalism would see it as computational constraints).

ETA P.S.: let's say for clarity, that I meant "X[H2]" is the subset of world-histories where my history has prefix "H2".
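
Restating the sketch compactly (any notation beyond the comment's own symbols is only illustrative):

```latex
% Beliefs over value-propositions: given my history H, a distribution over
% preferences X (orderings over world-histories, which include me and my actions):
P\big(\text{my-value}(H, X)\big)
% The ETA's comparison, with X[H_2] the preference over the subset of
% world-histories having prefix H_2:
P\big(\text{my-value}(H_1, X[H_2])\big) \quad \text{vs.} \quad P\big(\text{my-value}(H_2, X[H_2])\big)
% These need not agree, largely because of computational constraints.
```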

comment by timtyler · 2010-10-31T14:53:18.325Z · LW(p) · GW(p)

I think that we need to be able to change our minds about fundamental values, just as we need to be able to change our minds about fundamental beliefs. Even if we don't currently know how to handle this kind of upheaval mathematically.

What we may need more urgently is the maths for agents who have "got religion" - because we may want to build that type of agent - to help to ensure that we continue to receive their prayers and supplications.

comment by timtyler · 2010-10-30T18:55:07.690Z · LW(p) · GW(p)

Hmm. I wonder if it helps with gathering energy to fight the views of others if you label their views as being "deathist".

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T21:54:41.139Z · LW(p) · GW(p)

I wonder if it helps with gathering energy to fight the views of others if you label their views as being "deathist".

Do you really? The choice of the label was certainly not optimized for this purpose; it was a pattern I saw in the two citations and used to communicate the idea of this post (it happens to have an existing label). Fighting the views of others is the wrong attitude: you communicate (if you expect your arguments to be accepted) and let people decide; you don't extinguish what others believe.

Replies from: anonym
comment by anonym · 2010-10-30T23:43:46.399Z · LW(p) · GW(p)

The first quote says human values have changed and that the core of our values is robust to radical, catastrophic change.

The second quote says that human values have changed and that some future changes would be okay, and it states that there are greater risks to human values in accepting a global entity responsible for protecting against value changes due to AGI.

Glossing those two quotes as being analogous to, and as equally irrational as, standard deathism seems like a deliberate misreading, and using the name 'value deathism' seems pretty suspect to me, for whatever that's worth.

comment by Peter_de_Blanc · 2010-11-01T00:06:58.583Z · LW(p) · GW(p)

I agree.

comment by red75 · 2010-12-14T00:08:09.264Z · LW(p) · GW(p)

Direct question. I cannot infer the answer from your posts. If human values do not exist in closed form (i.e. they do include updates on future observations, including observations which in fact aren't possible in our universe), then is it better to have an FAI operating on some closed form of values instead?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-14T00:20:52.919Z · LW(p) · GW(p)

I don't understand the question. Unpack closed form/no closed form, and where updating comes in. (I probably won't be able to answer, since this deals with observations, which I still don't understand.)

Replies from: red75
comment by red75 · 2010-12-14T04:58:30.741Z · LW(p) · GW(p)

Then it seems better to demonstrate it on a toy model, as I've done for the no-closed-form case already.

[...] computer [operating within a Conway's Game of Life universe] is given the goal of tiling the universe with the most common still life in it, and the universe is possibly infinite.

One way I can think of to describe the closed/no-closed distinction is that the latter requires an unknown amount of input to be able to compute a final/unchanging ordering over (internal representations of) world-states, while the former requires no input at all, or only a predictable amount of input, to do the same.

Another way to think about a value with no closed form is that it gradually incorporates terms/algorithms acquired/constructed from the environment.
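
A toy sketch of the distinction in code (the names and structure here are purely illustrative, not part of the Game of Life example above):

```python
from typing import Callable, List

WorldState = str  # stand-in for an internal representation of a world-state

def closed_form_value(state: WorldState) -> float:
    """A 'closed form' value: computable from the state description alone,
    with no further input from the environment needed."""
    return float(state.count("still_life"))

class OpenFormValue:
    """A value with no closed form: the ranking criterion itself keeps
    incorporating terms constructed from incoming observations."""

    def __init__(self) -> None:
        self.learned_terms: List[Callable[[WorldState], float]] = []

    def observe(self, observation: str) -> None:
        # An observation can add a whole new term to the criterion,
        # so the final ordering over world-states is never fixed in advance.
        if "border_might_be_fake" in observation:
            self.learned_terms.append(lambda s: -float(s.count("unexplored")))

    def value(self, state: WorldState) -> float:
        return closed_form_value(state) + sum(t(state) for t in self.learned_terms)
```

The point is that an unbounded stream of observations can keep adding terms to the criterion, so the final ordering over world-states is not computable from any fixed, predictable amount of input.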

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-12-14T10:58:05.730Z · LW(p) · GW(p)

I understand the grandparent comment now. The open/closed distinction can in principle be extracted from values, so that the values of the original agent only specify what kind of program the agent should self-improve into, while that program is left to deal with any potential observations. (It's not better to forget some component of values.)

Replies from: red75
comment by red75 · 2010-12-15T05:27:55.408Z · LW(p) · GW(p)

I'm not sure I understand you. Values of the original agent specify a class of programs it can become. Which program of this class should deal with observations?

It's not better to forget some component of values.

Forget? Is it about "too smart to optimize"? This meaning I didn't intend.

When the computer encounters the borders of the universe, it will have an incentive to explore every possibility that this is not the true border of the universe, such as: active deception by an adversary, different rules of the game's "physics" for the rest of the universe, the possibility that its universe is simulated, and so on. I don't see why it is rational for it to ever stop checking those hypotheses and begin to optimize the universe.

comment by entirelyuseless · 2018-02-21T16:17:53.188Z · LW(p) · GW(p)

"But of course the claims are separate, and shouldn't influence each other."

No, they are not separate, and they should influence each other.

Suppose your terminal value is squaring the circle using Euclidean geometry. When you find out that this is impossible, you should stop trying. You should go and do something else. You should even stop wanting to square the circle with Euclidean geometry.

What is possible, directly influences what you ought to do, and what you ought to desire.

comment by jacob_cannell · 2011-01-30T20:24:35.529Z · LW(p) · GW(p)

Change in values of the future agents, however sudden of gradual, means that the Future (the whole freackin' Future!) won't be optimized according to our values, won't be anywhere as good as it could've been otherwise.

Even at the personal level our values change with time: somewhat dramatically as we grow in maturity from children to adolescents to adults, and more generally as learning modifies our belief networks.

Hanson's analogy to new generations of humans is especially important - value drift occurs naturally. A fully general argument against any form of value drift/evolution is equivalent to a fully general argument against begetting new generations of humans. Thus it is not value drift that equates to deathism, but the lack of value drift.

Out of the space of all trajectories value evolution could take, some are going to be much worse or much better from the perspective of our current value system.

All we have to do (and in fact the best we possibly can do) is guide that trajectory in a favorable direction from our current vantage point.

Ignore AGI for a moment and imagine a much more gradual future where human intelligence enhancement continues but asymptotically reaches a limit while human population somehow continues to explode exponentially (via wormholes for example). Run that millions of years forward and you get a future that is unimaginably different than our present, with future beings that are probably quite unlike humans.

A runaway Singularity is more or less equivalent to a portal into that future. It just compresses or accelerates it.

I would bet that many of those futures, probably even most, will be considerably better overall than our past from the perspective of our current value system, and I would give good odds that they would be better than our present from the perspective of our current value system.

We don't need to suddenly halt all value drift in order to create a good future, and even if we could (questionable), doing so would probably be counterproductive or outright disastrous.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-01-30T23:01:45.213Z · LW(p) · GW(p)

Preserving morality doesn't mean that we have nothing more to decide, that there are no new ideas to create. New ideas (and new normative considerations) can be created while keeping morality. Value drift is destruction of morality, but lack of value drift doesn't preclude development, especially since morality endorses development (in the ways it does).

Every improvement is a change, but not every change is an improvement. The current normative considerations that we have (i.e. morality) describe what kinds of change we should consider improvement. Forgetting what changes we consider to be improvement (i.e. value drift) will result in future change that is not moral (i.e. not improvement from the current standpoint).

Replies from: jacob_cannell
comment by jacob_cannell · 2011-01-31T00:12:57.574Z · LW(p) · GW(p)

I find myself in agreement with all this, except perhaps:

Forgetting what changes we consider to be improvement (i.e. value drift) will result in future change that is not moral (i.e. not improvement from the current standpoint).

Perhaps we are using the term 'value drift' differently.

Would you consider the change from our ancestors' values to modern western values to be 'value drift'? What about the change from the values of the age-10 version of yourself to those of the age-20 or the current version?

The current world is far from an idealization of my ancestors' values, and my current self is far from an idealized extrapolation of its future according to its values at the time.

I don't find this type of 'value drift' to be equivalent to "forgetting what changes we consider to be improvement". At each step we are making changes to ourselves that we believe are improvements according to our values at the time, and over time this process can lead to significant changes in our values. Even so, this does not imply that the future changes are not moral (not improvements from the current standpoint).

Change in values implies that the future will not be ideal from the perspective of current values, but from this it does not follow that all such futures are worse from the perspective of our current values (because all else is not equal).

comment by Eneasz · 2010-11-03T20:56:19.373Z · LW(p) · GW(p)

Azathoth already killed you once, at puberty. Are you significantly worse off now that you value sex? Enough to eliminate that value?

Replies from: Oligopsony
comment by Oligopsony · 2011-01-31T00:08:08.359Z · LW(p) · GW(p)

There are two things wrong with this analogy.

One is that this isn't a real value change. You gained the ability to appreciate sex as a source of pleasures, both lower and higher. As a child you already valued physical pleasure and social connections. Likewise, just as the invention of pianos allowed us to develop appreciations for things our ancestors never could, future technology will allow our descendants, we imagine, to find new sources of pleasure, higher and lower. And few people think this is bad in itself.

The more fundamental problem with the analogy - or rather, the question that followed it - is that it asks the question from the perspective of the adult rather than the child. Of course if our descendants have radically different values, then unless their ability to alter their environment has been drastically reduced, the world they create will be much better, from their perspective, than ours is. But the perspective we care about is our own - in the future generations case, we're the child.

Consider a young law student at a prestigious school who, like many young law students, is passionate about social justice and public service. She looks at the legal profession and notices that a great many lawyers, especially elite lawyers, don't seem to care about this to the extent that her age-mates do - after all, there are just as many of them fighting for the bad guys as the good guys, and so on and so on. Suppose for the sake of argument that she's right to conclude that as they get socialized into the legal profession, and start earning very high incomes, and begin to hobnob with the rich and powerful, their values start changing such that they care about protecting privilege rather than challenging it, and earning gobs of money rather than fighting what she would see as the good fight. And suppose further that she sees, accurately, that these corrupted lawyers are quite happy - they genuinely do enjoy their work, and have changed their politics so that they genuinely do get even moral satisfaction from it. And suppose she sees that she's not, in any measurable respect, different from all those other young idealistic law students that turn into old wicked lawyers - aside, perhaps, from coming to evaluate just these facts.

Is it rational, given her values and assuming her conclusions above are true, for her to take (costly, but not catastrophically costly) steps to not become corrupted in this way? After all, if she is corrupted, her future self won't consider herself worse off - in fact, she'll look back at her youthful naivete and thank her lucky stars that she shed all that!

Replies from: AaronAgassi, Eneasz, Eneasz
comment by AaronAgassi · 2012-02-13T03:40:16.050Z · LW(p) · GW(p)

By even considering how lawyers observably change, it would seem that our idealistic young law student is already infected with the memeplex of perspective. Nevertheless, nigh tautologically, the future offers new perspectives as yet unappreciated. After all, as with any question of memetic self-preservation of integrity, you have it simply given as a premise that the greedy future lawyers are entirely honest with themselves. Actually, memes only exist in the context of their medium, which is culture: an ongoing conscious and social phenomenon that governs even axiology.

comment by Eneasz · 2011-01-31T20:18:53.656Z · LW(p) · GW(p)

I would consider valuing sex a real value change, just as I would consider valuing heroin more than just a new source of pleasure. I used to smoke, and the quitting process convinced me that it wasn't just another source of pleasure; it was a fundamental value shift.

I know this wasn't clear in my original comment, but my question was not entirely rhetorical. It was during the quitting ordeal that this thought first occurred to me, and occasionally I still puzzle over it. It is not obvious to me that I'm better off liking sex than I was back when I was a child and uninterested in it. And arguments to the contrary seem too much like rationalization.

comment by Eneasz · 2011-01-31T20:25:14.458Z · LW(p) · GW(p)

But in answer to your question - it seems rational on the surface. If her present and future selves are in direct competition for existence, she should obviously spend resources in support of her present self. If nothing else, at least her future self doesn't yet have any desires that can be thwarted.

I was more trying to say that I suspect complete value ossification is not a good thing.

comment by teageegeepea · 2010-10-31T04:44:01.103Z · LW(p) · GW(p)

The future is lost when I cease to exist.

Replies from: timtyler, Vladimir_Nesov, PhilGoetz
comment by timtyler · 2010-10-31T10:13:42.682Z · LW(p) · GW(p)

"Now I will destroy the whole world... - What a Bokononist says before committing suicide."

Replies from: MichaelVassar
comment by MichaelVassar · 2010-11-03T20:17:54.815Z · LW(p) · GW(p)

Yep. That's the philosophy that the semi-wise pattern-recognizing pro-death are trying to oppose.

comment by Vladimir_Nesov · 2010-10-31T04:51:49.171Z · LW(p) · GW(p)

Again, see Belief in the Implied Invisible. What you can't observe is still there, and still has moral weight.

Replies from: teageegeepea
comment by teageegeepea · 2010-10-31T19:39:08.070Z · LW(p) · GW(p)

I do not anticipate any experience after I die. It is indistinguishable to me whether anything afterward exists or not. I'm not going to make arguments against any laws of thermodynamics, but if there is nothing to distinguish two states, no fact about them enters into my calculations. They are like fictional characters. And I assign moral weight based on ass-kicking. I suppose there might be an issue if denizens of the future are able to travel back in time.

Replies from: cousin_it
comment by cousin_it · 2010-10-31T20:01:10.461Z · LW(p) · GW(p)

There's no way you'd agree to receive $1000 a year before your death on condition that your family members will be tortured a minute after it. This is an example of what Vladimir means.

Replies from: PhilGoetz
comment by PhilGoetz · 2010-11-07T20:35:40.407Z · LW(p) · GW(p)

We are confused because a human individual does not possess her own morality, but rather the morality of a "virtual agent" comprising that human's genetic lineage.

comment by PhilGoetz · 2010-11-07T20:29:50.400Z · LW(p) · GW(p)

To rephrase this in a less negative-sounding way:

It makes no sense to ask what my utility function says should happen after I die.

comment by Snowyowl · 2010-10-30T23:41:32.542Z · LW(p) · GW(p)

Is it more important to you that people of the future share your values, or that your values are actually fulfilled? Do you want to share your values, so that other (future?) people can make the world better, or are you going to roll up your sleeves and do it yourself? After all, if everyone relies on other people to get work done, nothing will happen. It's not Pareto efficient.

I think your deathism metaphor is flawed, but in your terms: Why do you assume "living for as long as I want" has a positive utility in my values system? It's not Pareto efficient to me: if everyone had the option of dying whenever they wanted, it would have consequences that I consider too negative in comparison to the benefits. (Bear in mind that this is a subjective assessment; even if I gave my reasoning you might still disagree, which is fine by me.)

Have the courage and rationality to admit that the loss is real, even if it's too great for mere human emotions to express.

I'm cowardly, irrational, and shallow, and I'm not afraid to admit it!

Replies from: ata, wedrifid
comment by ata · 2010-10-31T01:18:31.318Z · LW(p) · GW(p)

I'm cowardly, irrational, and shallow, and I'm not afraid to admit it!

Doesn't matter whether you're afraid to admit it, what matters is what you're planning to do about it.

Replies from: Snowyowl
comment by Snowyowl · 2010-10-31T10:38:54.118Z · LW(p) · GW(p)

Sorry, a failed attempt at sarcasm there.

comment by wedrifid · 2010-10-31T00:03:51.333Z · LW(p) · GW(p)

Is it more important to you that people of the future share your values, or that your values are actually fulfilled?

latter >= former by logical deduction.

Replies from: Snowyowl
comment by Snowyowl · 2010-10-31T10:40:26.264Z · LW(p) · GW(p)

Yes. That was my intended point, sorry for being unclear.

Replies from: wedrifid
comment by wedrifid · 2010-10-31T14:38:30.995Z · LW(p) · GW(p)

Are you sure your intended point wasn't "values - values about others' values > values about others' values"? That point is hard to express neatly but it is a more important intuitive point and one that seems to be well supported by your argument.

(By the way. I was the one who had downvoted your earlier comment, but that was actually in response to "I'm cowardly, irrational, and shallow, and I'm not afraid to admit it!" which doesn't fit well as a response to that particular exhortation. But I removed the downvote because I decided there was no point being grumpy if I wasn't going to be grumpy and specific. ;))

Replies from: Snowyowl
comment by Snowyowl · 2010-11-01T02:08:53.232Z · LW(p) · GW(p)

Effort required to achieve your goal directly < effort required to convince others to achieve your goal for you.

...and I've just spotted the glaring hole in my argument, so the reason that it was unclear is probably that it was wrong. I assume that people who share your values will act similarly to you. Before, I only considered the possibility that you would work alone (number of people contributing: 1), or that everyone you convinced would do as you did and convince more people (number of people doing work other than marketing: 0). I concluded incorrectly that the best strategy was to work alone; in fact the best strategy is probably a mixed strategy of some sort.

TLDR: I was wrong and you were right. Ignore my previous posts.

(And I'm fine with being downvoted as long I know why. I can make good use of constructive criticism.)

comment by Sideways · 2010-10-30T18:47:11.894Z · LW(p) · GW(p)

The problem with this logic is that my values are better than those of my ancestors. Of course I would say that, but it's not just a matter of subjective judgment; I have better information on which to base my values. For example, my ancestors disapproved of lending money at interest, but if they could see how well loans work in the modern economy, I believe they'd change their minds.

It's easy to see how concepts like MWI or cognitive computationalism affect one's values when accepted. It's likely bordering on certain that transhumans will have more insights of similar significance, so I hope that human values continue to change.

I suspect that both quoted authors are closer to that position than to endorsing or accepting random value drift.

Replies from: Vladimir_Nesov, None
comment by Vladimir_Nesov · 2010-10-30T18:54:27.323Z · LW(p) · GW(p)

The problem with this logic is that my values are better than those of my ancestors.

Your values are what they are. They talk about how good certain possible future-configurations are, compared to other possible future-configurations. Other concepts that happen to also be termed "values", such as your ancestors' values, don't say anything more about comparative goodness of the future-configurations, and if they do, then that is also part of your values.

If you'd like for future people to be different in given respects from how people exist now, that is also a value judgment. For future people to feel different about their condition than you feel about their condition would make them disagree with your values (and dually).

Replies from: Sideways
comment by Sideways · 2010-10-30T19:22:17.576Z · LW(p) · GW(p)

Other concepts that happen to also be termed "values", such as your ancestors' values, don't say anything more about comparative goodness of the future-configurations, and if they do, then that is also part of your values.

I'm having difficulty understanding the relevance of this sentence. It sounds like you think I'm treating "my ancestors' values" as a term in my own set of values, instead of a separate set of values that overlaps with mine in some respects.

My ancestors tried to steer their future away from economic systems that included money loaned at interest. They were unsuccessful, and that turned out to be fortunate; loaning money turned out to be economically valuable. If they had known in advance that loaning money would work out in everyone's best interest, they would have updated their values (future-configuration preferences).

Of course, you could argue that neither of us really cared about loaning at interest; what we really cared about was a higher-level goal like a healthy economy. It would be convenient if we could restate our values in a well-organized hierarchy, with a node at the top that was invariant on available information. But even if that could be done, which I doubt, it would still leave a role for available information in deciding something as concrete as a preferred future-configuration.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-10-30T19:38:06.759Z · LW(p) · GW(p)

Of course, you could argue that neither of us really cared about loaning at interest; what we really cared about was a higher-level goal like a healthy economy. It would be convenient if we could restate our values in a well-organized hierarchy, with a node at the top that was invariant on available information.

That's closer to the sense I wanted to convey with this word.

But even if that could be done, which I doubt, it would still leave a role for available information in deciding something as concrete as a preferred future-configuration.

The distinction is between a formal criterion of preference and computationally feasible algorithms for estimating preference between specific plans. The concept relevant to this discussion is the former.

comment by [deleted] · 2010-10-31T17:11:56.660Z · LW(p) · GW(p)

This argument hasn't yet convinced me that my values are any better than the values of my ancestors.

Yes, if I look at history, people generally tend to move towards my own current values (with occasional detours). But this would also be true if I looked at my travelled path after doing a random walk.

Sure, there are cases where knowledge changes proxy values (I would, like my ancestors, favour punishing witches if it turned out that they factually do use demonically gifted powers to hurt others), but there has also been just plain old value drift. There are plenty of things our ancestors would never approve of, even if they had all the knowledge we have.