Objections to Coherent Extrapolated Volition
post by XiXiDu · 2011-11-22T10:32:13.175Z · LW · GW · Legacy · 56 comments
In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
— Eliezer Yudkowsky, May 2004, Coherent Extrapolated Volition
Foragers versus industry era folks
Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new tribal chief, and a modern computer scientist who wants to determine whether a “sufficiently large randomized Conway board could turn out to converge to a barren ‘all off’ state.”
The utility of hunting down animals or of proving abstract conjectures about cellular automata is largely determined by factors such as your education, culture and environmental circumstances. The same forager who cared about killing a lot of animals and getting the best ladies in his clan might, under different circumstances, have turned out to be a vegetarian mathematician caring solely about his understanding of the nature of reality. The two sets of values are to some extent mutually exclusive, or at least disjoint. Yet both are what the person wants, given the circumstances. Change the circumstances dramatically and you change the person’s values.
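To make the computer scientist's question concrete, here is a minimal, hypothetical Python sketch that seeds a random Conway board and checks whether that particular finite, wrapped instance reaches the all-off state. The function names and parameters (random_board, step, dies_out, n, p, max_steps) are illustrative, and a small toroidal board is only a crude stand-in for the "sufficiently large" board in the conjecture.

```python
import random

def random_board(n, p=0.5, seed=0):
    """n x n board of 0/1 cells, each alive with probability p."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(n)]

def step(board):
    """One Game of Life step on a toroidal (wrap-around) board."""
    n = len(board)
    nxt = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            neighbours = sum(
                board[(i + di) % n][(j + dj) % n]
                for di in (-1, 0, 1) for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            if board[i][j] == 1:
                nxt[i][j] = 1 if neighbours in (2, 3) else 0
            else:
                nxt[i][j] = 1 if neighbours == 3 else 0
    return nxt

def dies_out(board, max_steps=1000):
    """True if this particular board reaches the all-off state within max_steps."""
    for _ in range(max_steps):
        if not any(any(row) for row in board):
            return True
        board = step(board)
    return False

if __name__ == "__main__":
    # Checks one illustrative instance, nothing more.
    print(dies_out(random_board(32, p=0.5, seed=42)))
```

Whether this one instance dies out says nothing about the general conjecture; the sketch is only meant to make the object of the mathematician's curiosity concrete.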
What do you really want?
You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn’t know it. But there is no set of values that a person “really” wants. Humans are largely defined by the circumstances they reside in.
- If you already knew a movie, you wouldn’t watch it.
- To be able to get your meat from the supermarket changes the value of hunting.
If we “knew more, thought faster, were more the people we wished we were, had grown up farther together”, then we would stop desiring what we had learnt, wish to think faster still, become yet other people, and get bored of and outgrow the people similar to us.
A singleton is an attractor
A singleton will inevitably change everything by causing a feedback loop between itself as an attractor and humans and their values.
Many of our values and goals, much of what we want, are culturally induced or the result of our ignorance. Reduce our ignorance and you change our values. One trivial example is our intellectual curiosity. If we don’t need to figure out what we want on our own, our curiosity is impaired.
A singleton won’t extrapolate human volition but implement an artificial set of values as a result of abstract higher-order contemplations about rational conduct.
With knowledge comes responsibility, with wisdom comes sorrow
Knowledge changes and introduces terminal goals. The toolkit called ‘rationality’, the rules and heuristics developed to help us achieve our terminal goals, also alters and deletes them. A stone-age hunter-gatherer seems to possess very different values than we do. Learning about rationality and various ethical theories such as Utilitarianism would alter those values considerably.
Rationality was meant to help us achieve our goals, e.g. to become a better hunter. Rationality was designed to tell us what we ought to do (instrumental goals) in order to achieve what we want to do (terminal goals). Yet what actually happens is that we are told, and come to learn, what we ought to want.
If an agent becomes more knowledgeable and smarter, this does not leave its goal-reward system intact unless that system is specifically designed to be stable. An agent who originally wanted to become a better hunter and feed his tribe would end up wanting to eliminate poverty in Obscureistan. The question is: how much of this new “wanting” is the result of using rationality to achieve terminal goals, and how much is a side effect of using rationality? How much is left of the original values, versus the values induced by a feedback loop between the toolkit and its user?
Take for example an agent that is facing the Prisoner’s dilemma. Such an agent might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff. Was it rational for the agent to learn about game theory, in the sense that it helped the agent to achieve its goal, or in the sense that it deleted one of its goals in exchange for an allegedly more “valuable” goal?
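For readers who want the defection argument spelled out, here is a minimal sketch with conventional, purely illustrative payoff numbers (not taken from this post), showing why defection dominates in a one-shot Prisoner's dilemma even though mutual cooperation yields a higher joint payoff:

```python
# One-shot Prisoner's Dilemma with conventional, illustrative payoffs.
# Each entry is (row player's payoff, column player's payoff); higher is better.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def best_response(opponent_move):
    """Row player's payoff-maximising reply to a fixed opponent move."""
    return max(("cooperate", "defect"),
               key=lambda m: PAYOFFS[(m, opponent_move)][0])

# Defection is a dominant strategy: it is the best reply to either move...
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"
# ...yet mutual defection (1, 1) is jointly worse than mutual cooperation (3, 3).
```

This is the sense in which game theory "helps" such an agent: it identifies the dominant strategy. The question raised above is whether adopting it serves the agent's original goal or quietly replaces it.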
Beware rationality as a purpose in and of itself
It seems to me that becoming more knowledgeable and smarter is gradually altering our utility functions. But what is it that we are approaching if the extrapolation of our volition becomes a purpose in and of itself? Extrapolating our coherent volition will distort or alter what we really value by installing a new cognitive toolkit designed to achieve an equilibrium between us and other agents with the same toolkit.
Would a singleton be a tool that we can use to get what we want, or would the tool use us to do what it does? Would we be modeled, or would it create models? Would we be extrapolating our volition, or rather following our extrapolations?
(This post is a write-up of a previous comment designated to receive feedback from a larger audience.)
56 comments
Comments sorted by top scores.
comment by DSimon · 2011-11-22T23:17:47.659Z · LW(p) · GW(p)
If you already knew a movie, you wouldn’t watch it.
This example, and a few others in your post that follow the general pattern of "If you already knew X, then you would have no volition to go and learn X", don't apply to CEV as I understand it. The hypothetical extrapolated better-faster-smarter future yous are not being asked "What do you want now?", they're being asked "What should the past primitive-slow-dumb-you want?"
I might well advise someone to see a movie that I enjoyed, even if I have no particular desire to watch the movie again myself.
Replies from: army1987, XiXiDu
↑ comment by A1987dM (army1987) · 2011-11-23T11:28:26.536Z · LW(p) · GW(p)
IOW "I don't regret doing that, but I wouldn't do it again" is a perfectly valid state of mind.
↑ comment by XiXiDu · 2011-11-23T09:26:04.446Z · LW(p) · GW(p)
This example, and a few others in your post that follow the general pattern of "If you already knew X, then you would have no volition to go and learn X", don't apply to CEV as I understand it.
Human values are complex and little perturbations can influence them considerably. Just think about the knowledge that a process like CEV exists in the first place and how it would influence our curiosity, science and status-seeking behavior.
If there were a process that could figure out everything conditional upon the volition of humanity, then I at least would perceive the value of discovery to be considerably diminished. After all, the only reason why I wouldn't already know something would be that humanity had decided to figure it out on its own. But that would be an artificially created barrier. It would never be the same as before.
Take for example an artificial intelligence researcher. It would be absurd for them to try to figure out how general intelligence works, given the existence of CEV, and then to receive praise from their fellow humans. Or how would you feel about a Nobel Prize in economics if humanity had already learnt all there is to learn about economics by creating the CEV process? Would the Nobel Prize have the same value as before?
Another example is science fiction. Could you imagine reading science fiction in a world where a process like CEV makes sure that the whole universe suits the needs of humanity? Reading about aliens or AI would become ridiculous.
Those are just a few quick examples. That there won't be any problems that need to be fixed anymore is another. My point is, are we sure that CEV won't cripple a lot of activities, at least for nerds like me?
Replies from: TheOtherDave, DSimon
↑ comment by TheOtherDave · 2011-11-23T15:07:58.919Z · LW(p) · GW(p)
This line of reasoning is hardly limited to CEV. I'm reminded of Bishop Wright's apocryphal sermon about how we've pretty much discovered everything there is to discover.
Sure, if we progress far enough, fast enough, that all the meaningful problems I can conceive of are solved and all the meaningful goals I can imagine are reached, then there will only be two kinds of people: people solving problems I can't conceive of in order to achieve goals I can't imagine, and people living without meaningful goals and problems. We have the latter group today; I suspect we'll have them tomorrow as well.
The possibility of their continued existence -- even the possibility that everyone will be in that category -- doesn't strike me as a good enough reason to avoid such progress.
I'm also tempted to point out that there's something inherently inconsistent about a future where the absence of meaningful problems to solve is a meaningful problem, although I suspect that's just playing with words.
Replies from: thomblake, XiXiDu
↑ comment by thomblake · 2011-11-23T16:34:09.875Z · LW(p) · GW(p)
I'm also tempted to point out that there's something inherently inconsistent about a future where the absence of meaningful problems to solve is a meaningful problem, although I suspect that's just playing with words.
I don't think that's just playing with words. If we've solved all the problems, then we've solved that problem. We shouldn't assume a priori that solving that problem is impossible.
See also: fun theory.
Replies from: TheOtherDave
↑ comment by TheOtherDave · 2011-11-23T18:04:18.102Z · LW(p) · GW(p)
I agree with you that we shouldn't assume that finding meaningful activities for people to engage in as we progress is impossible. Not least of which because I think it is possible.
Actually, I'd say something stronger: I think right now we suck as a species at understanding what sorts of activities are meaningful and how to build a social infrastructure that creates such activities, and that we are suffering for the lack of it (and have done for millennia), and that we are just starting to develop tools with which to engage with this problem efficiently. In a few generations we might really see some progress in this area.
Nevertheless, I suspect that an argument of the form "lack of meaningful activity due to the solving of all problems is a logical contradiction, because such a lack of meaningful activity would then be an unsolved problem" is just playing with words, because the supposed contradiction is due entirely to the fact that the word "problem" means subtly different things in its two uses in that sentence.
Replies from: thomblake
↑ comment by thomblake · 2011-11-24T04:02:30.647Z · LW(p) · GW(p)
the word "problem" means subtly different things
Can you explain what those two meanings are?
Replies from: TheOtherDave
↑ comment by TheOtherDave · 2011-11-24T04:36:01.399Z · LW(p) · GW(p)
I don't mean anything deep by it, just that for example a system might be able to optimize our environment to .99 human-optimal (which is pretty well approximated by the phrase "solving all problems") and thereby create, say, a pervasive and crippling sense of ennui that it can't yet resolve (which would constitute a "problem"). There's no contradiction in that scenario; the illusion of contradiction is created entirely by the sloppiness of language.
Replies from: DSimon
↑ comment by DSimon · 2011-11-24T04:43:16.356Z · LW(p) · GW(p)
I don't think I follow; if the environment is .99 human-optimal, then that remaining .01 gap implies that there are some problems that remain to be solved, however few or minor, right?
It might simply be impossible to solve all problems, because of conflicting dependencies.
Replies from: TheOtherDave
↑ comment by TheOtherDave · 2011-11-24T05:49:05.310Z · LW(p) · GW(p)
Yes, I agree that the remaining .01 gap represents problems that remain to be solved, which implies that "solving all problems" doesn't literally apply to that scenario. If you're suggesting that therefore such a scenario isn't well-enough approximated by the phrase "solving all problems" to justify the phrase's use, we have different understandings of the level of justification required.
↑ comment by XiXiDu · 2011-11-23T17:30:24.400Z · LW(p) · GW(p)
The problem is not that all problems might be solved at some point, but that as long as we don't turn ourselves into something as capable as the CEV process, there exists an oracle that we could ask if we wanted to. The existence of such an oracle is that which is diminishing the value of research and discovery.
Replies from: TheOtherDave, Bugmaster
↑ comment by TheOtherDave · 2011-11-23T17:52:11.449Z · LW(p) · GW(p)
I agree that the existence of such an oracle closes off certain avenues of research and discovery.
I'm not sure that keeping those avenues of research and discovery open is particularly important, but I agree that it might be.
If it turns out that the availability of such an oracle closing off certain avenues is an important human problem, it seems to follow that any system capable of and motivated to solve human problems will ensure that no such oracle is available.
↑ comment by Bugmaster · 2011-11-23T19:48:37.992Z · LW(p) · GW(p)
The existence of such an oracle is that which is diminishing the value of research and discovery
You seem to be saying that research and discovery has some intrinsic value, in addition to the benefits of actually discovering things and understanding them. If so, what is this value?
The only answer I can think of is something like, "learning about new avenues of research that the oracle had not yet explored", but I'm not sure whether that makes sense or not -- since the perfect oracle would explore every avenue of research, and an imperfect oracle would strive toward perfection (as long as the oracle is rational).
Replies from: Elund, Alethea
↑ comment by Alethea · 2015-02-11T18:35:18.898Z · LW(p) · GW(p)
I would posit that divergent behaviors and approaches to the norms would still occur, despite the existence of such an oracle, just for the sake of imagination, exploration, and the enjoyment of the process itself. Such an oracle would also be aware of the existence of unknown future factors, and of the benefits of diverse approaches to problems whose long-term benefits and viability remain unknown until certain processes have been executed. As you said, such an oracle would then try to explore every avenue of research, while still focusing on the ones deemed most likely to be fruitful. Such an oracle should also be good at self-reflection, and able to question its own approaches and the various perspectives it is able to subsume. After all, aren't introspection and self-reflection part of how one improves oneself?
Then there's the Fun theory sequence that DSimon has posted about.
↑ comment by DSimon · 2011-11-23T17:14:02.056Z · LW(p) · GW(p)
Seems like the fun theory sequence answers this question. In summary: if a particular state of affairs turns out to be so boring that we regret going in that direction, then CEV would prefer another way that builds a more exciting world by e.g. not giving us all the answers or solutions we might momentarily want.
Replies from: XiXiDu
↑ comment by XiXiDu · 2011-11-23T17:35:34.315Z · LW(p) · GW(p)
In summary: if a particular state of affairs turns out to be so boring that we regret going in that direction, then CEV would prefer another way...
It would have to turn itself off to fix the problem I am worried about. The problem is the existence of an oracle. The problem is that the first ultraintelligent machine is the last invention that man need ever make.
To fix that problem we would have to turn ourselves into superintelligences rather than creating a singleton. As long as there is a singleton that does everything that we (humanity) want, as long as we are inferior to it, all possible problems are artificially created problems that we have chosen to solve the slow way.
Replies from: DSimon, hairyfigment, None, TheOtherDave, ArisKatsaris, Giles, lessdazed
↑ comment by DSimon · 2011-11-23T22:24:44.673Z · LW(p) · GW(p)
As long as there is a singleton that does everything that we (humanity) want, as long as we are inferior to it, all possible problems are artificially created problems that we have chosen to solve the slow way.
I am basically alright with that, considering that "artificial problems" would still include social challenge. Much of art and sport and games and competition and the other enjoyable aspects of multi-ape-systems would probably still go on in some form; certainly as I understand the choices available to me I would definitely prefer that they do.
Could you imagine reading science fiction in a world where a process like CEV makes sure that the whole universe suits the needs of humanity?
Yes, absolutely I can.
Right now, we write and read lots and lots of fiction about times from the past that we would not like to live in. Or, about variations on those periods with some extra fun stuff (e.g. magic spells, fire-breathing dragons, benevolent non-figurehead monarchies) that are nonetheless not safe or comfortable places to live. It can be very entertaining and engaging to read about worlds that we ourselves would not want to be stuck in, such as a historical-fantasy novel about the year 2000 when people could still die of natural causes, despite having flying broomsticks.
↑ comment by hairyfigment · 2011-11-23T19:02:41.275Z · LW(p) · GW(p)
To fix that problem we would have to turn ourselves into superintelligences rather than creating a singleton.
I've told you before this seems like a false dichotomy. Did you give a counterargument somewhere that I missed?
Seems to me the situation has an obvious parallel in the world today. And since children would presumably still exist when we start the process, they can serve as more than a metaphor. Now children sometimes want to avoid growing up, but I don't know of any such case we can't explain as simple fear of death. That certainly suffices for my own past behavior. And you assume we've fixed death through CEV.
It therefore seems like you're assuming that we'd desire to stifle our children's curiosity and their desire to grow, rather than letting them become as smart as the FAI and perhaps dragging us along with them. Either that or you have some unstated objection to super-intelligence as a concrete ideal for our future selves to aspire to.
Replies from: Elund
↑ comment by Elund · 2014-10-25T20:57:16.864Z · LW(p) · GW(p)
Now children sometimes want to avoid growing up, but I don't know of any such case we can't explain as simple fear of death.
They can be afraid of having to deal with adult responsibilities, or the physical symptoms of aging after they've reached their prime.
↑ comment by [deleted] · 2011-11-27T21:05:21.470Z · LW(p) · GW(p)
I believe Eliezer's discussed this issue before. If I remember correctly, he suggested that a CEV-built FAI might rearrange things slightly and then disappear rather than automatically solving all of our problems for us. Of course, we might be able to create an FAI that doesn't do that by following a different building plan, but I don't think that says anything about CEV. I'll try to find that link...
Edit: I may be looking for a comment, not a top-level post. Here's some related material, in case I can't find what I'm looking for.
↑ comment by TheOtherDave · 2011-11-23T18:09:50.833Z · LW(p) · GW(p)
It doesn't have to turn itself off, it just has to stop taking requests.
Come to think of it, it doesn't even have to do that. If I were such an oracle and the world were as you describe it, I might well establish the policy that before I solve problem A on humanity's behalf, I require that humanity solve problem B on their own behalf.
Sure, B is an artificially created problem, but it's artificially created by me, and humanity has no choice in the matter.
Replies from: dlthomas
↑ comment by ArisKatsaris · 2011-11-23T17:51:04.309Z · LW(p) · GW(p)
As long as there is a singleton that does everything that we (humanity) want, as long as we are inferior to it, all possible problems are artificially created problems that we have chosen to solve the slow way.
Is it a terminal value for you that you want to have non-artificial problems in your life? Or is it merely an instrumental value on the path to something else (like "purpose", or "emotional fulfilment", or "fun", or "excitement", etc)?
↑ comment by Giles · 2011-11-25T14:13:46.017Z · LW(p) · GW(p)
all possible problems are artificially created problems that we have chosen to solve the slow way.
I agree that if this were to happen, it seems like a bad thing (I'll call this the "keeping it real" preference). But it seems like the point when this happens is the point where humanity has the opportunity to create a value-optimizing singleton, not the point where it actually creates one.
In other words, if we could have built an FAI to solve all of our problems for us but didn't, then any remaining problems are in a sense "artificial".
But they seem less artificial in that case. And if there is a continuum of possibilities between "no FAI" and "FAI that immediately solves all our problems for us", then the FAI may be able to strike a happy balance between "solving problems" and "keeping it real".
That said, I'm not sure how well CEV addresses this. I guess it would treat "keeping it real" to be a human preference and try and satisfy it along with everything else. But it may be that if it even gets to that stage, the ability to "keep it real" has been permanently lost.
comment by Vladimir_Nesov · 2011-11-22T22:47:24.217Z · LW(p) · GW(p)
You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn’t know it. But there is no set of values that a person “really” wants. Humans are largely defined by the circumstances they reside in.
At which point, you should ask, what general principles do (should) they want to use in order to decide what to do depending on the circumstances. Dependence on a lot of parameters is not chaos, not absence of a fact, it's merely a more difficult problem where all the different cases should be addressed, rather than just a single configuration, a blind answer that doesn't distinguish.
comment by Ben_Welchner · 2011-11-23T00:53:18.032Z · LW(p) · GW(p)
Take for example an agent that is facing the Prisoner’s dilemma. Such an agent might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff. Was it rational for the agent to learn about game theory, in the sense that it helped the agent to achieve its goal, or in the sense that it deleted one of its goals in exchange for an allegedly more “valuable” goal?
The agent's goals aren't changing due to increased rationality, but just because the agent confused him/herself. Even if this is a payment-in-utilons and no-secondary-consequences Dilemma, it can still be rational to cooperate if you expect the other agent will be spending the utilons in much the same way. If this is a more down-to-earth Prisoner's Dilemma, shooting for cooperate/cooperate to avoid dicking over the other agent is a perfectly rational solution that no amount of game theory can dissuade you from. Knowledge of game theory here can only change your mind if it shows you a better way to get what you already want, or if you confuse yourself reading it and think defecting is the 'rational' thing to do without entirely understanding why.
You describe a lot of goals as terminal that I would describe as instrumental, even in their limited context. While it's true that our ideals will be susceptible to culture up until (if ever) we can trace and order every evolutionary desire in an objective way, not many mathematicians would say "I want to determine if a sufficiently-large randomized Conway board would converge to an all-off state so I will have determined if a sufficiently-large randomized Conway board would converge to an all-off state". Perhaps they find it an interesting puzzle or want status from publishing it, but there's certainly a higher reason than 'because they feel it's the right thing to do'. No fundamental change in priorities needs occur between feeding one's tribe and solving abstract mathematical problems.
I won't extrapolate my arguments farther than this, since I really don't have the philosophical background it needs.
comment by moridinamael · 2011-11-23T01:47:12.609Z · LW(p) · GW(p)
Instead of thinking about a hypothetical human hunter, I find it useful to think about the CEV for dogs, or for a single dog. (Obviously the CEV for a single dog is wildly different from the CEV for all dogs, but the same types of observations emerge from either.)
I think it would be pretty straightforward to devise a dog utopia. The general features of dog values are pretty apparent and seem very simple to humans. If our technology were a bit more advanced, a study of dog brains and dog behaviors would tell us enough to design a virtual universe of dog bliss.
We are so much smarter than dogs that we could even create ways of intervening into inter-dog conflicts in ways not obvious to the dogs. We could remove the whole mentality of a dominance hierarchy from the dog's psychology and make them totally egalitarian. Since these dogs would be immortal and imbued with greater intelligence, they could be taught to enjoy more complex pleasures and possibly even to generate art.
Of course none of what I just described is actually dog CEV. It is more like what a human thinks human CEV might look like, applied to dogs in a lazy fashion. It is not Coherent in the sense that it is ad hoc, nor is it Extrapolated, in the sense that it essentially disregards what dogs actually want - in this case, to be the alpha dog, and to be perpetually gorging on raw meat.
Still - STILL - the dogs probably wouldn't complain if we imposed our humanized-CEV onto them, after the fact. At least, they wouldn't complain unless we really messed it up. There is probably an infinite space of stable, valid Utopias that human beings would willingly choose and would be perpetually content with. The idea that human CEV is or should be one single convergent future does not seem obviously true to me. Maybe CEV should be a little bit lazy on purpose, just as the human above designed a lazy but effective dog utopia.
My main point here is that this singleton-attractor phenomenon really becomes a problem when the CEV subject and the CEV algorithm become too tightly coupled. It seems to be generally assumed that CEV will be very tightly coupled to human desires. Maybe there should be a bit of wiggle room.
Replies from: TheOtherDave, Vladimir_Nesov, Elund
↑ comment by TheOtherDave · 2011-11-23T02:54:57.830Z · LW(p) · GW(p)
Be careful about equating "would not complain about after the fact" with "would willingly choose."
↑ comment by Vladimir_Nesov · 2011-11-23T01:55:55.781Z · LW(p) · GW(p)
The general features of dog values are pretty apparent and seem very simple to humans.
The reasons for difficulty of specifying human value apply to dogs with similar strength.
Replies from: moridinamael
↑ comment by moridinamael · 2011-11-23T02:23:17.682Z · LW(p) · GW(p)
Respectfully, I do not understand what you mean here. The reason for specifying human values precisely is that our values are assumed to be complex, muddy, inconsistent and paradoxical, and care must be taken to avoid accidental nightmare futures resulting from lazy extrapolation. I picked dogs because at least the dogs I have known have seemed to exhibit vastly simpler value systems than any human. This is probably because dogs do not possess true episodic memory nor are they capable of multiple layers of abstraction. If you object to the example of dogs, simply substitute frogs, or fish, or C. elegans.
That's a fun thought experiment, and maybe more specific to the discussion - describe the CEV for C. elegans. How much information is required to express it? Do all CEV's converge, i.e., does C. elegans become smarter in the course of trying to better satisfy its nutritional needs, inevitably becoming superintelligent?
Replies from: steven0461
↑ comment by steven0461 · 2011-11-23T04:19:04.731Z · LW(p) · GW(p)
From CEV:
our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted
Dogs or C. eleganses don't have an "as we wish that extrapolated", which surely makes the question of what their CEV is a wrong question, like "what is the GDP per capita of the Klein four-group".
Replies from: Manfred
↑ comment by Manfred · 2011-11-23T12:07:05.228Z · LW(p) · GW(p)
The reason we'd have to extrapolate our goal structure in order to get a satisfying future is because human value grows and changes. In C. elegans the extrapolation is dead simple - its goals don't change much at all. So CEV seems possible, it just wouldn't do anything that "CV" wouldn't.
Replies from: steven0461, MaoShan
↑ comment by steven0461 · 2011-11-24T02:21:01.006Z · LW(p) · GW(p)
When something sneezes, is it "trying" to expel germs, "trying" to make achoo noises, or "trying" to get attention? It seems to me that the question simply doesn't make sense unless the thing sneezing is a philosopher, in which case it might just as well decide not to look to its counterfactual behaviors for guidance at all.
Replies from: wedrifid, Manfred
↑ comment by wedrifid · 2011-11-24T02:41:30.898Z · LW(p) · GW(p)
When something sneezes, is it "trying" to expel germs, "trying" to make achoo noises, or "trying" to get attention?
I would have thought trying to ensure that the breathing apparatus was clear enough to work acceptably was a higher priority than anything specific to germs.
↑ comment by Manfred · 2011-11-24T05:34:41.842Z · LW(p) · GW(p)
If a utility maximizer that has its utility function in terms of 'attention gotten' sneezes, is it "trying" to make achoo noises?
It seems like the question we're asking here is "to what extent can we model this animal as if it made choices based on its prediction of future events?" Or a closely related question, "to what extent does it act like a utility maximizer?"
And the answer seems to be "pretty well within a limited domain, not very well at all outside that domain." Small fish in their natural environment do a remarkable utility-maximizer impression, but when kept as pets they have a variety of inventive ways to kill themselves. The fish can act like utility maximizers because evolution stamped it into them, with lots and lots of simplifications to make the program run fast in tiny brains. When those simplifications are valid, the fish acts like the utility maximizer. When they're broken, the ability to act like a utility maximizer evaporates.
The trouble with this context-dependent approach is that it's context-dependent. But for animals that aren't good at learning, that context seems pretty clearly to be the environment they evolved in, since evolution is the causal mechanism for them acting like utility maximizers, and by assumption they won't have learned any new values.
So judging animals by their behavior, when used on dumb animals in ancestral environments, seems to be a decent way of assigning "wants" to animals.
↑ comment by MaoShan · 2011-12-11T19:27:31.981Z · LW(p) · GW(p)
Seeing as C. elegans lacks the neural structure to think objectively or have emotions, existing at all is the utopia for them. If we changed the worms enough to enable them to "enjoy" a utopia, they would no longer be what we started with. Would this be different with humans? To make us able to not get bored with our new utopia, we'd need different neural architecture, as well, which would, again, miss the point of a utopia for "us-now" humans.
If we want a FAI to create CEV, it wouldn't be one for us-now. If we aren't making a CEV for ourselves, why not just make a utopia for the FAI?
Replies from: None
↑ comment by [deleted] · 2011-12-11T19:46:13.105Z · LW(p) · GW(p)
No. The point of utopia is that it is what we would want and not get bored with. CEV attempts to solve the problem of finding out what we would want, and not just what our current incoherent stated values are.
"Enjoy" is a human (or at least higher vertebrate) concept. Trying to squeeze the worm into that mold will of course not work.
Also, it's worth noting that if you take just the brain of a human, you are getting most of the interesting parts of the system. The same cannot be said of the worm, you might as well take the heart and extrapolate its volition if you aren't going to work with the whole thing.
Replies from: MaoShan
↑ comment by MaoShan · 2011-12-28T23:18:30.554Z · LW(p) · GW(p)
I understand the abstract concept that you are endorsing, but the way that human brains work would not allow a utopia in the sense that you describe. Heaven would become boring; Hell would become bearable. If the perfect place for humans is one where wonderful things happen, then horrible things happen, well, coincidentally, that sounds a lot like Earth. Simulated Reality FTW?
↑ comment by Elund · 2014-10-25T22:13:32.921Z · LW(p) · GW(p)
CEV is supposed to aim for the optimal future, not a satisficing future. My guess is that there is only one possible optimal future for any individual, unless there is a theoretical upper limit to individual utility and the FAI has sufficiently vast resources.
Also, if the terminal goals for both humans and dogs are to simply experience maximum subjective well-being for as long as possible, then their personal CEVs at least will be identical. However, since individuals are selfish, there's no reason to expect that the ideal future for one individual will, if enacted by a FAI, lead to ideal futures for the other individuals who are not being extrapolated.
comment by gwern · 2011-11-24T01:35:53.394Z · LW(p) · GW(p)
If you already knew a movie, you wouldn’t watch it.
Indeed? May I suggest reading http://www.wired.com/wiredscience/2011/08/spoilers-dont-spoil-anything/ (PDF)?
Many of our values and goals, much of what we want, are culturally induced or the result of our ignorance. Reduce our ignorance and you change our values. One trivial example is our intellectual curiosity. If we don’t need to figure out what we want on our own, our curiosity is impaired.
I don't follow this one. Is this just making the argument-by-definition that an omniscient being couldn't be curious? The universe seems to place hard limits on how much computation can be done and storage accessed, so there will always be things a FAI will not know. (I can also appeal to more general principles here: Gödel, Turing, Hutter or Legg's no-elegant-predictor results, etc.)
Take for example an agent that is facing the Prisoner’s dilemma. Such an agent might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff. Was it rational for the agent to learn about game theory, in the sense that it helped the agent to achieve its goal, or in the sense that it deleted one of its goals in exchange for an allegedly more “valuable” goal?
Er, what? If the agent isn't reaping greater payoffs then it was simply mistaken (that happens sometimes) and can go back to cooperating. If it had defecting as an intrinsically good thing, then why did it ever start cooperating?
It seems to me that becoming more knowledgeable and smarter is gradually altering our utility functions.
If this is the basic point, you're missing a lot of more germane results than what you put down. Openness and parasite load (or psilocybin), IQ and cooperation and taking normative economics stances (besides the linked cites, Pinker had a ton of relevant stuff in the later chapters of Better Angels), etc.
Replies from: None
↑ comment by [deleted] · 2011-11-27T02:19:56.527Z · LW(p) · GW(p)
I don't follow this one. Is this just making the argument-by-definition that an omniscient being couldn't be curious?
I think XiXiDu is actually saying that if you model a given human, but with changed context that flows from their inferred values (smarter, more the people we wished we were, etc...) you will wind up with a model of a completely different human whose values are not coherent with those of the source human, because our context is extremely important in determining what we think, know, want, and value.
comment by NickH · 2024-11-12T16:25:06.146Z · LW(p) · GW(p)
My counter thought experiment to CEV is to consider our distant ancestors. I mean so far distant that we wouldn't call them human, maybe even as far back as some sort of fish-like creature. Suppose a super AI somehow offered this fish the chance to rapidly "advance" by following its CEV, showed it a vision of the future (us), and asked the fishy thing whether to go ahead. Do you think the fishy thing would say yes?
Similarly, if an AI offered to evolve humankind, in 50 years, into telepathic little green men that it assured us was the result of our CEV, would we not instantly shut it down in horror?
My personal preference, I like to call the GFP - Glorious Five-year Plan: You have the AI offer a range of options for 5 (or 50, but definitely no longer) years in the future, and we pick one. And in 5 years' time we repeat the process. The bottom line is that humans do not want rapid change. Just as we are happier with 2% inflation than with 0% or 100%, we want a moderate rate of change.
At its heart there is a "Ship of Theseus" problem. If the AI replaces every part of the ship overnight so that in the morning we find the QE2 at dock then it is not the ship of Theseus.
comment by [deleted] · 2011-11-27T02:13:40.524Z · LW(p) · GW(p)
Couple minor nitpicks:
Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new tribal chief
Not too many foragers have chiefs, unless they're living in such opulence that they can settle down (see the very Wikipedia article you linked and its comments about social structures).
If you already knew a movie, you wouldn’t watch it.
So why do people re-watch movies? Or watch movies to which they already know the spoiler or premise?
comment by A1987dM (army1987) · 2011-11-23T11:42:12.278Z · LW(p) · GW(p)
Humans are largely defined by the circumstances they reside in.
This reminds me of an objection I once came up with about the idea that there's no reason to expect that your utility function for money should have an inflection point/some other special property at or near your current net worth. I think that's a not-so-unreasonable assumption indeed, if you are used to that level of wealth (i.e., unless you've recently gained or lost a huge lot of money): I think that someone who grew up in a wealthy family and stayed wealthy until recently would find it much harder to live on $1000 a month than someone who has been that poor all along, as the former would find it very hard to imagine life without [insert expensive pastime here], the latter would know more ways to cope with poverty (e.g. where to buy cheap clothes), etc.
comment by Jonathan_Graehl · 2011-11-22T21:47:28.378Z · LW(p) · GW(p)
An awareness of the context-sensitivity of our desires only leads me to value novelty (diversity on a personal-experience level). Not only do I expect some drift, I intend to enjoy it.
I haven't thought about how I'd like protection against black-holes of desire to be implemented. It's easy to imagine some hypothetical future-me resenting a growing novelty-itch I committed to.
Since I expect my preference for novelty to inform CEV, if you're making an argument against CEV that might persuade me, it must be that the optimization problem is too hard - that the result will either deny unexpected pleasures that could have otherwise been reached, or that the pull of expected pleasures will be stronger than expected and lead to a fixed state (violating our present belief in our preference for novelty).
comment by timtyler · 2011-11-26T12:31:46.337Z · LW(p) · GW(p)
It seems to me that becoming more knowledgeable and smarter is gradually altering our utility functions.
That's probably down to memes and memetic hijacking. Advanced civilisation results in a greater density of memes, and they have greater say in civilisation's values. These values usually include reduced reproduction for humans - because of the resource competition between memes and genes - thus the demographic transition.
But what is it that we are approaching if the extrapolation of our volition becomes a purpose in and of itself?
As I say, the actual effect is probably down to more memes. If that continues, we'll probably get this.
comment by A1987dM (army1987) · 2011-11-23T11:25:42.407Z · LW(p) · GW(p)
It reminds me of a story I've read, where someone from an early-21st-century capitalist culture goes back in time and tells a few Cro-Magnon hunters and gatherers what wonders the future will contain, and they (very convincingly) argue that they are no overall improvement at all. (Of course, there are many more people alive today than Earth could support if agriculture hadn't been invented, so a total utilitarian would disagree with them.)
If you already knew a movie, you wouldn’t watch it.
The usual framework seems to not apply here: IIRC there's some theorem showing that the value of information cannot be negative, but that seems obviously false if the information in question is the ending of a film you've already paid a ticket for.
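For reference, the theorem presumably being alluded to is the standard nonnegativity of the expected value of costless information for an expected-utility maximizer. A sketch of the usual statement, under the usual assumption that utility depends only on the action taken and the state of the world, is:

```latex
% Expected value of costless information about an observation x, assuming the
% agent's utility U(a, s) depends only on its action a and the world-state s:
\[
\mathrm{VoI}
  \;=\; \mathbb{E}_{x}\!\left[ \max_{a}\, \mathbb{E}_{s \mid x}\, U(a, s) \right]
        \;-\; \max_{a}\, \mathbb{E}_{s}\, U(a, s)
  \;\ge\; 0,
\]
% because the uninformed optimal action $a^{*}$ is still available after seeing
% $x$: the inner maximum is at least $\mathbb{E}_{s \mid x}\, U(a^{*}, s)$, and
% taking expectations over $x$ recovers $\mathbb{E}_{s}\, U(a^{*}, s)$.
```

The film example arguably falls outside that assumption, since there the utility attaches to the knowledge state itself rather than only to actions and outcomes.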
Replies from: Morendil, gwern, TheOtherDave
↑ comment by Morendil · 2011-11-23T13:56:28.368Z · LW(p) · GW(p)
Obviously false but actually true?
Not to mention, the line "if you already knew a movie, you wouldn’t watch it" is falsified by Star Wars fans too numerous to count, and other cult movies. (RHPS anyone?)
We don't watch movies or read books to know the ending.
Replies from: army1987
↑ comment by A1987dM (army1987) · 2011-11-23T14:44:40.737Z · LW(p) · GW(p)
Not exclusively to know the ending, but at least for me that's one of the reasons, and I would enjoy it much less if I knew it.
(In my case at least, this is more true about books than about films, so I'd rather not see a movie based on a book if I think I might want to read the book, until I've read it.)
↑ comment by gwern · 2011-11-24T01:50:48.609Z · LW(p) · GW(p)
It reminds me of a story I've read, where someone from an early-21st-century capitalist culture goes back in time and tells a few Cro-Magnon hunters and gatherers what wonders the future will contain, and they (very convincingly) argue that they are no overall improvement at all.
It's recent enough that this is probably not it, but Miller has that exact story in the first chapter of his new book Spent.
Replies from: army1987
↑ comment by A1987dM (army1987) · 2011-11-24T12:29:07.526Z · LW(p) · GW(p)
Yes, that was it.
↑ comment by TheOtherDave · 2011-11-23T14:48:48.053Z · LW(p) · GW(p)
You seem to be assuming that all "information" obtainable by viewing the ending of a film is equally available to someone who has a memory of having viewed the ending of the film... confirm? If so, can you expand on why you're assuming that?
Replies from: army1987
↑ comment by A1987dM (army1987) · 2011-11-23T17:20:34.761Z · LW(p) · GW(p)
I'm not sure I understand your question... You mean someone who has already seen the end of the movie will learn (or be reminded of) nothing new if they saw it again? No, I'm not assuming that (there aren't that many people with perfect eidetic memory around), but I can't see the relevance. My point is not that the value of watching a film/reading a book when I already know its ending is zero; it is that the value of watching a film/reading a book when I already know its ending is strictly less than if I didn't know it. (How much less depends on the type of film/book.)
Replies from: TheOtherDave
↑ comment by TheOtherDave · 2011-11-23T17:55:58.665Z · LW(p) · GW(p)
Hm. OK, that's not what I thought you were saying, so thanks for the correction.
FWIW, though, I've certainly reread books that I enjoyed more the second time, knowing the ending, than I did the first time, not knowing the ending.