Rationality Reading Group: Part V: Value Theory

gram_stone

post by Gram_Stone · 2016-03-10T01:11:53.679Z · LW · GW · Legacy · 32 comments

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This fortnight we discuss Part V: Value Theory (pp. 1359-1450). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

V. Value Theory

264. Where Recursive Justification Hits Bottom - Ultimately, when you reflect on how your mind operates, and consider questions like "why does Occam's Razor work?" and "why do I expect the future to be like the past?", you have no other option but to use your own mind. There is no way to jump to an ideal state of pure emptiness and evaluate these claims without using your existing mind.

265. My Kind of Reflection - A few key differences between Eliezer Yudkowsky's ideas on reflection and the ideas of other philosophers.

266. No Universally Compelling Arguments - Because minds are physical processes, it is theoretically possible to specify a mind which draws any conclusion in response to any argument. There is no argument that will convince every possible mind.

267. Created Already in Motion - There is no computer program so persuasive that you can run it on a rock. A mind, in order to be a mind, needs some sort of dynamic rules of inference or action. A mind has to be created already in motion.

268. Sorting Pebbles into Correct Heaps - A parable about an imaginary society that has arbitrary, alien values.

269. 2-Place and 1-Place Words - It is possible to talk about "sexiness" as a property of an observer and a subject. It is also equally possible to talk about "sexiness" as a property of a subject, as long as each observer can have a different process to determine how sexy someone is. Failing to do either of these will cause you trouble.
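The two readings of "sexiness" can be sketched in code. This is only an illustration under made-up assumptions: the function names, the observers, and the numeric scores are all hypothetical and do not come from the original post. `sexiness(observer, subject)` is the 2-place form; fixing the observer with `functools.partial` yields a separate 1-place function for each observer.

```python
from functools import partial

# Hypothetical preference tables, invented purely for illustration.
PREFERENCES = {
    "human": {"alice": 0.9, "bob": 0.2},
    "alien": {"alice": 0.1, "bob": 0.3},
}

def sexiness(observer: str, subject: str) -> float:
    """2-place form: the result depends on both the observer and the subject."""
    return PREFERENCES[observer][subject]

# 1-place forms: one function per observer, made by fixing the first argument.
human_sexiness = partial(sexiness, "human")
alien_sexiness = partial(sexiness, "alien")

print(sexiness("human", "alice"))   # 2-place call
print(human_sexiness("alice"))      # same value through the human's 1-place function
print(alien_sexiness("alice"))      # a different observer gives a different value
```

In this sketch, the trouble the summary warns about corresponds to asking for a bare `sexiness(subject)` without saying which observer's function is meant.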

270. What Would You Do Without Morality? - If your own theory of morality was disproved, and you were persuaded that there was no morality, that everything was permissible and nothing was forbidden, what would you do? Would you still tip cabdrivers?

271. Changing Your Metaethics - Discusses the various lines of retreat that have been set up in the discussion on metaethics.

272. Could Anything Be Right? - You do know quite a bit about morality. It's not perfect information, surely, or absolutely reliable, but you have someplace to start. If you didn't, you'd have a much harder time thinking about morality than you do.

273. Morality as Fixed Computation - A clarification about Yudkowsky's metaethics.

274. Magical Categories - We underestimate the complexity of our own unnatural categories. This doesn't work when you're trying to build a FAI.

275. The True Prisoner's Dilemma - The standard visualization for the Prisoner's Dilemma doesn't really work on humans. We can't pretend we're completely selfish.

276. Sympathetic Minds - Mirror neurons are neurons that fire both when performing an action oneself, and watching someone else perform the same action - for example, a neuron that fires when you raise your hand or watch someone else raise theirs. We predictively model other minds by putting ourselves in their shoes, which is empathy. But some of our desire to help relatives and friends, or be concerned with the feelings of allies, is expressed as sympathy, feeling what (we believe) they feel. Like "boredom", the human form of sympathy would not be expected to arise in an arbitrary expected-utility-maximizing AI. Most such agents would regard any agents in their environment as a special case of complex systems to be modeled or optimized; they would not feel what those agents feel.

277. High Challenge - Life should not always be made easier for the same reason that video games should not always be made easier. Think in terms of eliminating low-quality work to make way for high-quality work, rather than eliminating all challenge. One needs games that are fun to play and not just fun to win. Life's utility function is over 4D trajectories, not just 3D outcomes. Values can legitimately be over the subjective experience, the objective result, and the challenging process by which it is achieved - the traveller, the destination and the journey.

278. Serious Stories - Stories and lives are optimized according to rather different criteria. Advice on how to write fiction will tell you that "stories are about people's pain" and "every scene must end in disaster". I once assumed that it was not possible to write any story about a successful Singularity because the inhabitants would not be in any pain; but something about the final conclusion that the post-Singularity world would contain no stories worth telling seemed alarming. Stories in which nothing ever goes wrong are painful to read; would a life of endless success have the same painful quality? If so, should we simply eliminate that revulsion via neural rewiring? Pleasure probably does retain its meaning in the absence of pain to contrast it; they are different neural systems. The present world has an imbalance between pain and pleasure; it is much easier to produce severe pain than correspondingly intense pleasure. One path would be to address the imbalance and create a world with more pleasures, and free of the more grindingly destructive and pointless sorts of pain. Another approach would be to eliminate pain entirely. I feel like I prefer the former approach, but I don't know if it can last in the long run.

279. Value is Fragile - If things go right, the future will be an interesting universe, one that would be incomprehensible to the universe of today. There are many things humans value such that, if you did everything else right when building an AI but left out that one thing, the future would wind up looking dull, flat, pointless, or empty. Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth.

280. The Gift We Give to Tomorrow - How did love ever come into the universe? How did that happen, and how special was it, really?

This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Part W: Quantified Humanism (pp. 1453-1514) and Interlude: The Twelve Virtues of Rationality (pp. 1516-1521). The discussion will go live on Wednesday, 23 March 2016, right here on the discussion forum of LessWrong.

32 comments

Comments sorted by top scores.

comment by SquirrelInHell · 2016-03-17T13:32:01.894Z · LW(p) · GW(p)

It's a shame this seems not very active. I have the impression that the folks who could benefit from this the most are those who don't regularly visit the forum, and probably don't know about these threads.

I'll recommend this to a "From AI To Zombies" reading group that I started around a year ago at my university; I think they are around part U or V right now.

comment by Gram_Stone · 2016-03-11T00:51:47.441Z · LW(p) · GW(p)

I think this is a good time to talk about an objection that I see sometimes to some approaches to moral philosophy. Something like, "Simple moral theories are too neat to do any real work in moral philosophy; you need theories that account for human messiness if you want to discover anything important." I can think of at least one linkable example, but I think that it would be inappropriate to put someone on the stand like that.

For one, this just sounds like descriptive ethics. It exists and it's a separate problem from normative ethics and metaethics. But that seems uncharitable. It's likely that this argument is made even with this distinction in mind. They aren't secretly trying to answer the question, "How do humans make moral decisions?"; they're saying that moral theories that are not messy like humans will not lead to the general resolution of normative ethics and metaethics. The argument is that simple moral theories fail to account for important facts by the simplicity of their assumptions.

But this seems to me to close off an entire approach to the problem. Why would you not try to consider decision-making in a non-human context? When I see this, I always remember a surprising-at-the-time but immediately obvious-in-hindsight point from lukeprog's The Neuroscience of Desire:

Many of the neurons involved in valuation and choice have stochastic features, meaning that when the subjective utility of two or more options are similar (represented in the brain by neurons with similar firing rates), we sometimes choose to do something other than the action that has the most subjective utility. In other words, we sometimes fail to do what we most want to do, even if standard biases and faults (akrasia, etc.) are considered to be part of the valuation equation. So don't beat yourself up if you have a hard time choosing between options of roughly equal subjective utility, or if you feel you've chosen an option that does not have the greatest subjective utility.

Things like this make it implausible to me that this approach to ethics, of imagining decision-making behavior that doesn't perfectly fit real human behavior, is just stupid on its face. How can you regard it as anything else but a flaw that we sometimes just don't do what we really want to do? It just seems interesting to consider the consequences of the assumption that there is a decision-maker without a trembling hand. And then I would ask myself, "How far can I go with this?"
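To make the contrast concrete, here is a rough sketch comparing a decision-maker without a trembling hand to a noisy one. Everything in it is an assumption chosen for illustration: the softmax rule, the `temperature` parameter, the option names, and the utility numbers are hypothetical, and none of it is a claim about how neurons actually implement choice.

```python
import math
import random

def argmax_choice(utilities):
    """Idealized chooser: always picks the option with the highest utility."""
    return max(utilities, key=utilities.get)

def noisy_choice(utilities, temperature, rng):
    """Noisy chooser: picks each option with probability proportional to
    exp(utility / temperature) (a softmax rule, assumed for illustration)."""
    options = list(utilities)
    weights = [math.exp(utilities[o] / temperature) for o in options]
    return rng.choices(options, weights=weights, k=1)[0]

def suboptimal_rate(utilities, temperature, trials=10_000, seed=0):
    """Fraction of trials in which the noisy chooser misses the best option."""
    rng = random.Random(seed)
    best = argmax_choice(utilities)
    misses = sum(noisy_choice(utilities, temperature, rng) != best
                 for _ in range(trials))
    return misses / trials

# Hypothetical utilities: one pair of options close in value, one far apart.
close = {"tea": 1.0, "coffee": 1.1}
far = {"tea": 1.0, "coffee": 5.0}

print(suboptimal_rate(close, temperature=1.0))  # misses often when options are close
print(suboptimal_rate(far, temperature=1.0))    # misses rarely when options are far apart
```

Under these assumptions the noisy chooser picks the lower-utility option much more often when the utilities are nearly equal, which matches the qualitative pattern in the quoted passage; the idealized chooser never does.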

I also read an article by cousin_it recently, Common mistakes people make when thinking about decision theory, that describes another objection I have to ruling out this approach:

Many assumptions seemed to be divorced from real life at first. People dismissed the study of electromagnetism as an impractical toy, and considered number theory hopelessly abstract until cryptography arrived. The only way to make intellectual progress (either individually or as a group) is to explore the implications of interesting assumptions wherever they might lead. Unfortunately people love to argue about assumptions instead of getting anything done, though they can't really judge before exploring the implications in detail.

I've been reading Bostrom's Anthropic Bias recently, and it's important to note that one of his main motivations for studying anthropic reasoning at the time of writing is expressed in the form of an implication. Bostrom explains that the empirical case for the hypothesis that our universe is fine-tuned seems strong, and goes on to consider the situation if we came to believe with high confidence that our universe was not fine-tuned:

One should not jump from this to the conclusion that our universe is fine-tuned. For it is possible that some future physical theory will be developed that uses fewer free parameters or uses only parameters on which life does not sensitively depend. Even if we knew that our universe were not fine-tuned, the issue of what fine-tuning would have implied could still be philosophically interesting.

I think that this is a productive practice. This is the same thing that cousin_it is talking about. It's useful to lay out your arguments, to consider all of the different ways that things could be different if you assume that different things are true or false. This is exploration; this is how you find solutions in a huge problem space as a group of humans.

And someone might say, "I'm not going to spend all of my time arguing about an edifice of conclusions built on foundations that I already believe to be false," so that this doesn't seem as productive as I claim it is. But given the way that humans are, you probably shouldn't expect those intuitions to be all that reliable in this domain. I would expect a human with the policy of following arguments for a greater period of time than they would naturally like to do better than a human without that policy, given that the arguments aren't wrong on their faces.

And as a social matter, each party seems to have an incentive to participate in this activity. The people who believe the initial assumptions believe that they are making very relevant progress. And the people who object to the assumptions should also expect to have opportunities to discredit any deductions from the assumptions if they really believe that those arguments rest on a confused foundation.

And you might be cynical and say, "People are going to believe whatever they already believe; this is useless," but I write the comment because this is one of the places in the world where that is least likely to be true for any given person.

And there is also the possibility that both approaches are valuable, and to reference The Neuroscience of Desire again, lukeprog suggested that there is an imminent reduction of some of the relations between economics, psychology, and neuroscience. If both approaches arrive at truth, then we should expect these truths to be reducible to one another, at least in principle. This reminds me of the false empiricism-rationalism dichotomy.

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-17T13:19:07.043Z · LW(p) · GW(p)

Something like, "Simple moral theories are too neat to do any real work in moral philosophy; you need theories that account for human messiness if you want to discover anything important."

This is exactly the mistake from http://lesswrong.com/lw/ix/say_not_complexity/ , and (I hope) LW'ers are aware of it. So probably your examples are not from LW?

How can you regard it as anything else but a flaw that we sometimes just don't do what we really want to do?

This adds variety, and is good when your best options are close in utility.

And the people who object to the assumptions should also expect to have opportunities to discredit any deductions from the assumptions if they really believe that those arguments rest on a confused foundation.

This is not how it logically works. If you get a suspicious deduction, you discredit your assumptions, and not the other way round.

It's useful to lay out your arguments, to consider all of the different ways that things could be different if you assume that different things are true or false.

Sure. (Isn't this what we all like to do in any case?)

Replies from: gjm, Gram_Stone, Gram_Stone

↑ comment by gjm · 2016-03-17T13:49:01.324Z · LW(p) · GW(p)

This is exactly the mistake from http://lesswrong.com/lw/ix/say_not_complexity/

I'm not sure it is. That's about claims of the form "Doing X needs complexity, so if we shovel in enough complexity we'll get X", whereas Gram_Stone [EDITED to add: oops, this is another place where I said "Gram_Stone" and should actually have said "the unspecified people Gram_Stone was disagreeing with"] is saying something more like "It looks like no simple model will do X, so any that does X will necessarily turn out to be complex".

I don't know whether that's right -- sometimes complex-looking things turn out to have surprisingly simple explanations -- but it doesn't look either obviously wrong or fallacious. The author of "Say not complexity" also wrote "The hidden complexity of wishes" which is making a point not a million miles away from Gram_Stone's. [EDITED to add: or, more precisely, not-Gram_Stone's.]

Replies from: SquirrelInHell, Gram_Stone

↑ comment by SquirrelInHell · 2016-03-17T14:09:20.686Z · LW(p) · GW(p)

This is exactly the mistake from http://lesswrong.com/lw/ix/say_not_complexity/

I'm not sure it is. That's about claims of the form "Doing X needs complexity, so if we shovel in enough complexity we'll get X", whereas Gram_Stone is saying something more like "It looks like no simple model will do X, so any that does X will necessarily turn out to be complex".

The lesson from "Say not Complexity" is not that it's untrue that complexity is needed. It is that to assume at the beginning we need "complexity" or "messiness", is not a good heuristic to use when looking for solutions. You want to aim at simplicity, and make sure that every step that takes you farther from it is well justified. Debating beforehand whether the solution is going to turn out to be complex or not is not very useful.

In this sense, I see this as the same mistake.

Replies from: gjm, Lumifer

↑ comment by gjm · 2016-03-17T15:26:53.736Z · LW(p) · GW(p)

not a good heuristic

OK, so I agree that that's part of what Eliezer is saying under "Say not 'complexity'". But let's be a bit more precise about it. He makes (at least) two separate claims.

The first is that "complexity should never be a goal in itself". I strongly agree with that, and I bet Gram_Stone does too and isn't proposing to chase after complexity for its own sake.

[EDITED to add: Oops, as SquirrelInHell points out later I actually mean not Gram_Stone but whatever other people Gram_Stone had in mind who hold that theories of ethics should not be very simple. Sorry, Gram_Stone!]

The second is that "saying 'complexity' doesn't concentrate your probability mass". This I think is almost right, but that "almost" is important sometimes. Eliezer's point is that there are vastly many "complex" things, which have nothing much in common besides not being very simple, so that "let's do something complex" doesn't give you any guidance to speak of. All of that is true. But suppose you're trying to solve a problem whose solution you have good reason to think is complex, and suppose that for whatever reason you (or others) have a strong temptation to look for solutions that you're pretty sure are simpler than the simplest actual solution. Then saying "no, that won't do; the solution will not be that simple" does concentrate your probability mass and does guide you -- by steering you away from something specific that won't work and that you'd otherwise have been inclined to try.

Again, this is dependent on your being right when you say "no, the solution will not be that simple". That's often not something you can have any confidence in. But if what you're trying to do is to model something formed by millions of years of arbitrary contingencies in a complicated environment -- like, e.g., human values -- I think you can be quite confident that no really simple model is very accurate. More so, if lots of clever people have looked for simple answers and not found anything good enough.

Here's another of Eliezer's posts that maybe comes closer to agreeing explicitly with Gram_Stone: Value is Fragile. Central thesis: "Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth." Note that if our values could be adequately captured by a genuinely simple model, this would be false.

(I am citing things Eliezer has written not because there's anything wrong with disagreeing with Eliezer, but because your application here of what he wrote in "Say not 'complexity'" seems to lead to conclusions at variance with other things he's written, which suggests that you might be misapplying it.)

Replies from: SquirrelInHell, Gram_Stone

↑ comment by SquirrelInHell · 2016-03-17T23:00:16.611Z · LW(p) · GW(p)

Central thesis: "Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth." Note that if our values could be adequately captured by a genuinely simple model, this would be false.

I think you are not fully accurate in your reasoning here. It is still possible to have a relatively simple and describable transformation that takes "humans" as an input value, see e.g. http://intelligence.org/files/CEV.pdf (Now I'm not saying this is true in this particular case, just noting it for the sake of completeness.)

seems to lead to conclusions at variance with other things he's written, which suggests that you might be misapplying it.

I'd say the message is consistent if you resist dumping the meta-level and object-level together. On meta-level, "we need more complexity/messiness" is still a bad heuristic. On object-level, we have determined that simple solutions don't work, so we are suspicious of them.

Thanks for pointing out the inconsistency, it certainly makes the issue worthwhile to discuss in depth.

Then saying "no, that won't do; the solution will not be that simple" does concentrate your probability mass and does guide you -- by steering you away from something specific that won't work and that you'd otherwise have been inclined to try.

In practice, there's probably more value in confronting your simple solution and finding an error in it than in dismissing it out of hand because it's "too simple". You just repeat this until you stop making errors of this kind, and what you have learned will be useful in finding a real solution. In this sense it might be harmful to use the notion that "complexity" sometimes concentrates your probability mass a little bit.

Meta-note: reading paragraphs 2-3 of your comment gave me a funny impression that you are thinking and writing like you are a copy of me. ???? MYSTERIOUS MAGICAL SOULMATES MAKE RAINBOW CANDY FALL FROM THE SKY ????

Replies from: gjm

↑ comment by gjm · 2016-03-17T23:34:37.291Z · LW(p) · GW(p)

[...] a relatively simple and describable transformation that takes "humans" as an input value [...]

Perhaps I'm misunderstanding you, but it sounds as if you're saying that we might be able to make a simple model of human values by making it say something like "what humans value is valuable". I agree that in some sense that could be a simple model of human values, but it manages to be one only by not actually doing its job.

On meta-level [...] On object-level [...]

Sure, I agree that in general "need more complexity" is a bad heuristic. Again, I don't think Gram_Stone was proposing it as a general-purpose meta-level heuristic -- but as an observation about apparent constraints on models of human values.

[...] more value in confronting your simple solution [...]

Yes, that could well be true. But what if you're just getting started and haven't arrived at a simple candidate solution yet? It might be better to save the wasted effort and just spurn the seduction of super-simple solutions. (Of course doing that means that you'll miss out if in fact there really is a super-simple solution. Unless you get lucky and find it by starting with a hairy complicated solution and simplifying, I guess.)

Meta-note:

I am at least 95% sure I'm not you, and I regret that if I have the ability to make rainbow candy fall from the sky (other than by throwing it in the air and watching it fall) I've not yet discovered it. But, um, hi, pleased to meet you. I hope we'll be friends. (Unless you're secretly trying to convert us all to Raëlianism, of course.) And congratulations on thinking at a high enough level of abstraction to see someone as thinking like a copy of you when they write something disagreeing with you, I guess :-).

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-18T02:03:06.178Z · LW(p) · GW(p)

Perhaps I'm misunderstanding you [..] make a simple model of human values by making it say something like "what humans value is valuable"

Yes you are? It need not necessarily be trivial, e.g. the aforementioned Coherent Extrapolated Volition idea by Eliezer'2004. Considering that "what humans value" is not clear or consistent, any analysis would proceed on two fronts:

1. extracting and defining what humans value,
2. constructing a thinking framework that allows us to put it all together.

So we can have insights about 2 without progress in 1. Makes sense?

Again, I don't think Gram_Stone was proposing

Well, if you actually look at Gram_Stone's post, he was arguing against the "simple doesn't work" heuristic in moral philosophy. I think you might have automatically assumed that each next speaker's position is in opposition to others? It's not important who said what; it's important to sort out all relevant thoughts and come out having more accurate beliefs than before.

However there is an issue of hygiene of discussion. I don't "disagree", I suggest alternative ways of looking at an issue. I tend to do it even when I am less inclined to support the position I seem to be defending, versus the opposing position. In Eliezer's words, you need not only to fight the bad argument, but the most terrible horror that can be constructed from its corpse.

So it makes me sad that you see this as "disagreeing". I do this in my own head, too, and there need not be emotional againstness in pitting various beliefs against each other.

I am at least 95% sure I'm not you

Now if I also say I don't think I'm really you, it will be just another similarity.

pleased to meet you. I hope we'll be friends.

Agreed.

Replies from: gjm

↑ comment by gjm · 2016-03-18T14:14:09.879Z · LW(p) · GW(p)

we can have insights about 2 without progress in 1. Makes sense?

Yes, but I don't think people saying that simple moral theories are too simple are claiming that no theory about any aspect of ethics should be simple. At any rate, in so far as they are I think they're boringly wrong and would prefer to contemplate a less silly version of their position. The more interesting claim is (I think) something more like "no very simple theory can account for all of human values", and I don't see how CEV offers anything like a counterexample to that.

if you actually look at Gram_Stone's post [...]

Ahem. You would appear to be right. Therefore, when I've said "Gram_Stone" in the above, please pretend I said something like "those advocating the position that simple moral theories are too simple". As you say, it doesn't particularly matter who said what, but I regret foisting a position on someone who wasn't actually advocating it.

it makes me sad that you see this as "disagreeing"

I regret making you sad. I wasn't suggesting any sort of "emotional againstness", though. And I think we actually are disagreeing. For instance, you are arguing that saying "we should reject very simple moral theories because they cannot rightly describe human values" is making a mistake, and that it's the mistake Eliezer was arguing against when he wrote "Say not 'Complexity'". I think saying that is probably not a mistake, and certainly can't be determined to be a mistake simply by recapitulating Eliezer's arguments there. Isn't that a disagreement?

But I take your point, and I too am in the habit of defending things I "disagree" with. I would say then, though, that I am disagreeing with bad arguments against those things -- there is still disagreement, and that's not the same as looking at an issue from multiple directions.

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-19T06:41:24.005Z · LW(p) · GW(p)

For instance, you are arguing that saying [...]

I have a very strong impression we disagree only insofar as we interpret each other's words to mean something we can argue with.

Just now, you treated my original remark in this way by changing the quoted phrase, which was (when I wrote my comment) "Simple moral theories are too neat to do any real work in moral philosophy" but became (in your version) "simple moral theories cannot rightly describe human values". Notice the difference?

I'm not defending my original comment, it was pretty stupid the way I had phrased it in any case.

So of course, you were right when you argued and corrected me, and I thank you for that.

But it still is worrying to have this tendency to imagine someone's words being stupider than they really were, and then arguing with them.

That's what I mean when I say I wish we all could give each other more credit, and interpret others' words in the best possible way, not the worst...

Agreeing with me here?

But in any case, I also wanted to note that this discussion did not have enough concreteness from the start.

http://lesswrong.com/lw/ic/the_virtue_of_narrowness/ etc.

Replies from: gjm

↑ comment by gjm · 2016-03-19T20:11:03.252Z · LW(p) · GW(p)

changing the quoted phrase

I plead guilty to changing it, but not guilty to changing it in order to be able to argue. If you look a couple of paragraphs earlier in the comment in question you will see that I argue, explicitly, that surely people saying this kind of thing can't actually mean that no simple theory can be useful in ethics, because that's obviously wrong, and that the interesting claim we should consider is something more like "simple moral theories cannot account for all of human values".

this tendency to imagine someone's words being stupider than they really were, and then arguing with them.

Yup, that's a terrible thing, and I bet I do it sometimes, but on this particular occasion I was attempting to do the exact opposite (not to you but to the unspecified others Gram_Stone wrote about -- though at the time I was under the misapprehension that it was actually Gram_Stone).

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-19T21:28:09.179Z · LW(p) · GW(p)

Hmm. So maybe let's state the issue in a more nuanced way.

We have argument A and counter-argument B.

You adjust argument A in direction X to make it stronger and more valuable to argue against.

But it is not enough to apply the same adjustment to B. To make B stronger in a similar way, it might need adjusting in direction -X, or some other direction Y.

Does it look like it describes a bug that might have happened here? If not, feel free to drop the issue.

Replies from: gjm

↑ comment by gjm · 2016-03-20T01:12:36.218Z · LW(p) · GW(p)

I'm afraid your description here is another thing that may have "not enough concreteness" :-). In your analogy, I take it A is "simple moral theories are too neat to do any real work in moral philosophy" and X is what takes you from there to "simple moral theories can't account for all of human values", but I'm not sure what B is, or what direction Y is, or where I adjusted B in direction X instead of direction Y.

So you may well be right, but I'm not sure I understand what you're saying with enough clarity to tell whether you are.

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-20T03:05:22.686Z · LW(p) · GW(p)

You caught me red-handed at not being concrete! Shame on me!

By B I meant applying the idea from "Say not 'Complexity'".

You adjusting B in direction X is what I pointed out when I accused you of changing my original comment.

By Y I mean something like our later consensus, which boils down to (Y1) "we can use the heuristic of 'simple doesn't work' in this case, because in this case we have pretty high confidence that that's how it really is; which still doesn't make it a method we can use for finding solutions and is dangerous to use without sufficient basis"

Or it could even become (Y2) "we can get something out of considering those simple and wrong solutions" which is close to Gram_Stone's original point.

↑ comment by Gram_Stone · 2016-03-19T23:29:31.594Z · LW(p) · GW(p)

Sorry, Gram_Stone!

Heh, it's okay. I had no idea that the common ancestor comment had generated so much discussion.

Also, I agree that the complex approach doesn't seem obviously wrong to me either, and that until something makes it seem obviously wrong, we might as well let the two research paths thrive.

↑ comment by Lumifer · 2016-03-17T15:06:33.133Z · LW(p) · GW(p)

You want to aim at simplicity, and make sure that every step that takes you farther from it is well justified.

In the specific example of "simple moral theories", I believe they have been shown to have enough problems so that stepping into the complexity morass is well justified.

Replies from: Gram_Stone

↑ comment by Gram_Stone · 2016-03-19T23:36:32.764Z · LW(p) · GW(p)

Yeah, but there's a difference between prospect theory and thinking that prospect theory is 'too neat'. Worth saying.

↑ comment by Gram_Stone · 2016-03-20T04:37:38.134Z · LW(p) · GW(p)

whereas Gram_Stone is saying something more like "It looks like no simple model will do X, so any that does X will necessarily turn out to be complex".

Looks like you forgot to edit this one.

Replies from: gjm

↑ comment by gjm · 2016-03-20T15:01:26.588Z · LW(p) · GW(p)

I did. I have edited it now.

↑ comment by Gram_Stone · 2016-03-19T23:40:48.388Z · LW(p) · GW(p)

This adds variety, and is good when your best options are close in utility.

Just because you might be wrong about utilities (I assume that's a possibility that you're implying) doesn't mean that you should make the process you use to choose outcomes random.

This is not how it logically works. If you get a suspicious deduction, you discredit your assumptions, and not the other way round.

The idea is that different people consider different assumptions suspicious, and that sometimes people change their minds about which assumptions are suspicious. A suspicious deduction can also be the effect of a bad inference rule, not just a bad premise. It seems better to me to run as far as you can with many lines of reasoning unless they're completely obviously wrong at a glance.

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-20T02:10:57.965Z · LW(p) · GW(p)

Just because you might be wrong about utilities (I assume that's a possibility that you're implying) doesn't mean that you should make the process you use to choose outcomes random.

Yes. What I meant is rather something like in this example:

You have 4 options.

Option A: estimated utility = 10 ± 5
Option B: estimated utility = 5 ± 10
Option C: estimated utility = 3 ± 2
Option D: estimated utility = -10 ± 30

It seems reasonable to not always choose A, and sometimes choose B, and from time to time even D. At least until you gather enough data to improve the accuracy of your estimates.

I expect you can arrive at this solution by carefully calculating probabilities of changing your estimates by various amounts and how much more utility you can get if your estimates change.

Replies from: gjm, Gram_Stone

↑ comment by gjm · 2016-03-20T03:03:18.804Z · LW(p) · GW(p)

There's been quite a lot of work on this sort of question, under the title of "Multi-armed bandits". (As opposed to the "one-armed bandits" you find rows and rows of in casinos.)

Replies from: Gram_Stone

↑ comment by Gram_Stone · 2016-03-20T03:44:17.195Z · LW(p) · GW(p)

Your response is very different from mine, so I'm wondering if I'm wrong.

Replies from: gjm

↑ comment by gjm · 2016-03-20T15:00:15.296Z · LW(p) · GW(p)

The multi-armed bandit scenario applies when you are uncertain about the distributions produced by these options, and are going to have lots of interactions with them that you can use to discover more about them while extracting utility.

For a one-shot game, or if those estimated utilities are distributions you know each option will continue to produce every time, you just compute the expected utility and you're done.

But suppose you know that each produces some distribution of utilities, but you don't know what it is yet (but e.g. maybe you know they're all normally distributed and have some guess at the means and variances), and you get to interact with them over and over again. Then you will probably begin by trying them all a few times to get a sense of what they do, and as you learn more you will gradually prioritize maximizing expected-utility-this-turn over knowledge gain (and hence expected utility in the future).
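The repeated-interaction case above can be sketched with Thompson sampling, one standard multi-armed bandit strategy (it is not something the thread itself proposes). This is a minimal sketch under loud assumptions: the four options' "mean ± spread" figures from the parent comment are treated as normal priors over each option's true mean, the payoff noise `NOISE_SD` is known, and the names `PRIORS`, `NOISE_SD`, and `thompson_run` are hypothetical.

```python
import random

# Assumed priors (mean, standard deviation) over each option's true mean utility,
# taken from the "mean ± spread" figures quoted in the parent comment.
PRIORS = {"A": (10.0, 5.0), "B": (5.0, 10.0), "C": (3.0, 2.0), "D": (-10.0, 30.0)}
NOISE_SD = 1.0  # assumed: known payoff noise around each option's true mean


def thompson_run(true_means, rounds, rng):
    """Play `rounds` turns with Thompson sampling; return pull counts per option."""
    # Posterior over each option's mean, stored as [mean, variance].
    post = {name: [m, s * s] for name, (m, s) in PRIORS.items()}
    pulls = {name: 0 for name in PRIORS}
    for _ in range(rounds):
        # Draw one plausible mean per option, then act greedily on the draws.
        draws = {name: rng.gauss(m, v ** 0.5) for name, (m, v) in post.items()}
        choice = max(draws, key=draws.get)
        payoff = rng.gauss(true_means[choice], NOISE_SD)
        # Conjugate normal update for a known noise variance.
        m, v = post[choice]
        new_v = 1.0 / (1.0 / v + 1.0 / NOISE_SD ** 2)
        new_m = new_v * (m / v + payoff / NOISE_SD ** 2)
        post[choice] = [new_m, new_v]
        pulls[choice] += 1
    return pulls


# Hypothetical ground truth in which B is actually better than the estimates suggest.
pulls = thompson_run({"A": 8.0, "B": 12.0, "C": 3.0, "D": -10.0}, 1000, random.Random(0))
```

Early on, the wide priors on B and D make them likely to be tried. As the posteriors narrow, the choices concentrate on whichever option is actually best, which is the shift from knowledge gain to maximizing expected utility this turn.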

↑ comment by Gram_Stone · 2016-03-20T03:37:06.108Z · LW(p) · GW(p)

I assume that when you write '10 +/- 5', you mean that Option A could have a utility on the open interval with 0 and 10 as lower and upper bounds.

You can transform this into a decision problem under risk. Assuming that, say, in option A, you're not reasoning as though 6 is more probable than 10 because 6 is closer to 5 than 10 is (your problem statement did not indicate anything like this), then you can assign an expected utility to each option by making an equiprobable prior over the open interval including the set of possible utilities for each action. For example, since there are 10 members in the set defined as the open interval with 0 and 10 as lower and upper bounds, you would assign a probability of 0.1 to each member of the set. Furthermore, the expected utility for each Option is as follows:

A = (0*0.1) + (1*0.1) + (2*0.1) + (3*0.1) + (4*0.1) + (5*0.1) +(6*0.1) + (7*0.1) + (8*0.1) + (9*0.1) + (10*0.1) = 1.5
B = 0
C = 0.3
D = -61

The expected utility formalism prescribes A. Choosing any other option violates the Von Neumann-Morgenstern axioms.

However, my guess is that your utilities are secretly dollar values and you have an implicit utility function over outcomes. You can represent this by introducing a term u into the expected utility calculations that weights the outcomes by their real utility. This makes sense in the real world because of things like gambler's ruin. In a world of perfect emptiness, you have infinite money to lose, so it makes sense to maximize expected value. In the real world, you can run out of money, so evolution might make you loss averse to compensate. This was the original motivation for formulating the notion of expected utility (some quantitative measure of desirableness weighted by probability), as opposed to the earlier notion of expected value (dollar value weighted by probability).
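The gap between expected value and expected utility can be shown with a small sketch. The logarithmic utility of final wealth, the starting wealth, and the particular bet are all assumptions made for illustration; nothing in this thread specifies them.

```python
import math


def expected_value(outcomes):
    """Probability-weighted dollar change for a list of (probability, change) pairs."""
    return sum(p * delta for p, delta in outcomes)


def expected_log_utility(wealth, outcomes):
    """Probability-weighted log of final wealth (assumed utility function)."""
    return sum(p * math.log(wealth + delta) for p, delta in outcomes)


wealth = 100.0                          # hypothetical starting bankroll
gamble = [(0.5, 120.0), (0.5, -90.0)]   # hypothetical bet: win 120 or lose 90
decline = [(1.0, 0.0)]                  # keep current wealth for certain

takes_bet_by_ev = expected_value(gamble) > expected_value(decline)
takes_bet_by_eu = expected_log_utility(wealth, gamble) > expected_log_utility(wealth, decline)
```

Under these assumptions the bet has a positive expected dollar value, yet an agent with logarithmic utility declines it, because a loss that nearly wipes out the bankroll weighs more than an equally likely gain of similar size.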

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-20T03:46:49.505Z · LW(p) · GW(p)

Your analysis misses the point that you may play the game many times and change your estimates as you go.

For the record, 10 ± 5 means an interval from 5 to 15, not 0 to 10, and in any case I intended it as a shorthand for a bell-like distribution with a mean of 10 and a standard deviation of 5.
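To make the bell-curve reading concrete, here is a minimal sketch of how likely each other option is to be truly better than A. It assumes the four estimates are independent normal beliefs, which the thread does not state; `beliefs` and `prob_beats` are hypothetical names.

```python
from statistics import NormalDist

# Estimates from the thread, read as independent normal beliefs (mean, sd).
beliefs = {
    "A": NormalDist(10, 5),
    "B": NormalDist(5, 10),
    "C": NormalDist(3, 2),
    "D": NormalDist(-10, 30),
}


def prob_beats(challenger, incumbent):
    """P(challenger's true utility > incumbent's), assuming independence."""
    diff = challenger - incumbent  # difference of two independent normals
    return 1.0 - diff.cdf(0.0)


p_beats_a = {
    name: prob_beats(dist, beliefs["A"])
    for name, dist in beliefs.items()
    if name != "A"
}
```

Under these assumptions B turns out to be better than A roughly a third of the time, and even D does so about a quarter of the time. That is why trying them occasionally can pay off when the game is repeated and the estimates can be updated.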

Replies from: Gram_Stone

↑ comment by Gram_Stone · 2016-03-20T03:54:07.182Z · LW(p) · GW(p)

Yeah, I parsed it as 5 +/- 5 somehow. Might have messed up the other ones too.

Wouldn't you just maximize expected utility in each iteration, regardless of what estimates are given in each iteration?

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-20T03:56:13.058Z · LW(p) · GW(p)

You would indeed maximize EV in each iteration, but this EV would also include a factor from value of information.

Replies from: Gram_Stone

↑ comment by Gram_Stone · 2016-03-20T04:11:20.555Z · LW(p) · GW(p)

Ah, okay. I went downstairs for a minute and thought to myself, "Well, the only way I get what he's saying is if we go up a level and assume that the given utilities are not simply changing, but are changing according to some sort of particular rule."

Also, I spent a long time writing my reply to your original problem statement, without refreshing the page, so I only read the original comment, not the edit. That might explain why I didn't immediately notice that you were talking about value of information, if I seemed a little pedantic in my earlier comment with all of the math.

Back to the original point that brought this problem up, what's going on inside the brain is that the brain has assigned utilities to outcomes, but there's a tremble on its actions caused by the stochastic nature of neural networks. The brain isn't so much uncertain about utilities as it is believing that its utility estimates are accurate and randomly not doing what it considers most desirable.

That's why I wrote, in the original comment:

It just seems interesting to consider the consequences of the assumption that there is a decision-maker without a trembling hand.

Does that make sense?

Replies from: SquirrelInHell

↑ comment by SquirrelInHell · 2016-03-20T05:31:57.290Z · LW(p) · GW(p)

Ah, okay. I went downstairs for a minute and thought to myself, "Well, the only way I get what he's saying is if we go up a level and assume that the given utilities are not simply changing, but are changing according to some sort of particular rule."

Congratulations on good thinking and attitude :)

Does that make sense?

Yes, I get that. What I meant to suggest to you, in the broader picture, is that this "tremble" might be evolution's way to crudely approximate a fully rational agent who makes decisions based on VOI.

So it's not necessarily detrimental to us. Sometimes it might well be.

The main takeaway from all that I have said is that replacing your intuition with "let's always take option A because it's the rational thing to do" just doesn't do the trick when you play multiple games (as is often the case in real life).

↑ comment by Gram_Stone · 2016-03-19T23:37:48.071Z · LW(p) · GW(p)

This adds variety, and is good when your best options are close in utility.

Just because you might be wrong about utilities (I assume that's a possibility that you're implying) doesn't mean that you should make the process you use to choose outcomes random.
