Failed Utopia #4-2

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-21T11:04:43.000Z · LW · GW · Legacy · 265 comments

Followup to: Interpersonal Entanglement

    Shock after shock after shock—
    First, the awakening adrenaline jolt, the thought that he was falling.  His body tried to sit up in automatic adjustment, and his hands hit the floor to steady himself.  It launched him into the air, and he fell back to the floor too slowly.
    Second shock.  His body had changed.  Fat had melted away in places, old scars had faded; the tip of his left ring finger, long ago lost to a knife accident, had now suddenly returned.
    And the third shock—
    "I had nothing to do with it!" she cried desperately, the woman huddled in on herself in one corner of the windowless stone cell.  Tears streaked her delicate face, fell like slow raindrops into the décolletage of her dress.  "Nothing!  Oh, you must believe me!"
    With perceptual instantaneity—the speed of surprise—his mind had already labeled her as the most beautiful woman he'd ever met, including his wife.

    A long white dress concealed most of her, though it left her shoulders naked; and her bare ankles, peeking out from beneath the mountains of her drawn-up knees, dangled in sandals.  A light touch of gold like a webbed tiara decorated that sun-blonde hair, which fell from her head to pool around her weeping huddle.  Fragile crystal traceries to accent each ear, and a necklace of crystal links that reflected colored sparks like a more prismatic edition of diamond.  Her face was beyond all dreams and imagination, as if a photoshop had been photoshopped.
    She looked so much the image of the Forlorn Fairy Captive that one expected to see the borders of a picture frame around her, and a page number over her head.
    His lips opened, and without any thought at all, he spoke:
    "Wha-wha-wha-wha-wha-"
    He shut his mouth, aware that he was acting like an idiot in front of the girl.
    "You don't know?" she said, in a tone of shock.  "It didn't—you don't already know?"
    "Know what?" he said, increasingly alarmed.
    She scrambled to her feet (one arm holding the dress carefully around her legs) and took a step toward him, each of the motions almost overloading his vision with gracefulness.  Her hand rose out, as if to plead or answer a plea—and then she dropped the hand, and her eyes looked away.
    "No," she said, her voice trembling as though in desperation.  "If I'm the one to tell you—you'll blame me, you'll hate me forever for it.  And I don't deserve that, I don't!  I am only just now here —oh, why did it have to be like this?"
    Um, he thought but didn't say.  It was too much drama, even taking into account the fact that they'd been kidnapped—
    (he looked down at his restored hand, which was minus a few wrinkles, and plus the tip of a finger)
   —if that was even the beginning of the story.
    He looked around.  They were in a solid stone cell without windows, or benches or beds, or toilet or sink.  It was, for all that, quite clean and elegant, without a hint of dirt or odor; the stones of the floor and wall looked rough-hewn or even non-hewn, as if someone had simply picked up a thousand dark-red stones with one nearly flat side, and mortared them together with improbably perfectly-matching, naturally-shaped squiggled edges.  The cell was well if harshly lit from a seablue crystal embedded in the ceiling, like a rogue element of a fluorescent chandelier.  It seemed like the sort of dungeon cell you would discover if dungeon cells were naturally-forming geological features.
    And they and the cell were falling, falling, endlessly slowly falling like the heart-stopping beginning of a stumble, falling without the slightest jolt.
    On one wall there was a solid stone door without an aperture, whose locked-looking appearance was only enhanced by the lack of any handle on this side.
    He took it all in at a glance, and then looked again at her.
    There was something in him that just refused to go into a screaming panic for as long as she was watching.
    "I'm Stephen," he said.  "Stephen Grass.  And you would be the princess held in durance vile, and I've got to break us out of here and rescue you?"  If anyone had ever looked that part...
    She smiled at him, half-laughing through the tears.  "Something like that."
    There was something so attractive about even that momentary hint of a smile that he became instantly uneasy, his eyes wrenched away to the wall as if forced.  She didn't look like she was trying to be seductive... any more than she looked like she was trying to breathe...  He suddenly distrusted, very much, his own impulse to gallantry.
    "Well, don't get any ideas about being my love interest," Stephen said, looking at her again.  Trying to make the words sound completely lighthearted, and absolutely serious at the same time.  "I'm a happily married man."
    "Not anymore."  She said those two words and looked at him, and in her tone and expression there was sorrow, sympathy, self-disgust, fear, and above it all a note of guilty triumph.
    For a moment Stephen just stood, stunned by the freight of emotion that this woman had managed to put into just those two words, and then the words' meaning hit him.
    "Helen," he said.  His wife—Helen's image rose into his mind, accompanied by everything she meant to him and all their time together, all the secrets they'd whispered to one another and the promises they'd made—that all hit him at once, along with the threat.  "What happened to Helen—what have you done—"
    "She has done nothing."  An old, dry voice like crumpling paper from a thousand-year-old book.
    Stephen whirled, and there in the cell with them was a withered old person with dark eyes.  Shriveled in body and voice, so that it was impossible to determine if it had once been a man or a woman, and in any case you were inclined to say "it".  A pitiable, wretched thing, that looked like it would break with one good kick; it might as well have been wearing a sign saying "VILLAIN".
    "Helen is alive," it said, "and so is your daughter Lisa.  They are quite well and healthy, I assure you, and their lives shall be long and happy indeed.  But you will not be seeing them again.  Not for a long time, and by then matters between you will have changed.  Hate me if you wish, for I am the one who wants to do this to you."
    Stephen stared.
    Then he politely said, "Could someone please put everything on hold for one minute and tell me what's going on?"
    "Once upon a time," said the wrinkled thing, "there was a fool who was very nearly wise, who hunted treasure by the seashore, for there was a rumor that there was great treasure there to be found.  The wise fool found a lamp and rubbed it, and lo! a genie appeared before him—a young genie, an infant, hardly able to grant any wishes at all.  A lesser fool might have chucked the lamp back into the sea; but this fool was almost wise, and he thought he saw his chance.  For who has not heard the tales of wishes misphrased and wishes gone wrong?  But if you were given a chance to raise your own genie from infancy—ah, then it might serve you well."
    "Okay, that's great," Stephen said, "but why am I—"
    "So," it continued in that cracked voice, "the wise fool took home the lamp.  For years he kept it as a secret treasure, and he raised the genie and fed it knowledge, and also he crafted a wish.  The fool's wish was a noble thing, for I have said he was almost wise.  The fool's wish was for people to be happy.  Only this was his wish, for he thought all other wishes contained within it.  The wise fool told the young genie the famous tales and legends of people who had been made happy, and the genie listened and learned: that unearned wealth casts down a person, but hard work raises you high; that mere things are soon forgotten, but love is a light throughout all your days.  And the young genie asked about other ways that it innocently imagined, for making people happy.  About drugs, and pleasant lies, and lives arranged from outside like words in a poem.  And the wise fool made the young genie to never want to lie, and never want to arrange lives like flowers, and above all, never want to tamper with the mind and personality of human beings.  The wise fool gave the young genie exactly one hundred and seven precautions to follow while making people happy.  The wise fool thought that, with such a long list as that, he was being very careful."
    "And then," it said, spreading two wrinkled hands, "one day, faster than the wise fool expected, over the course of around three hours, the genie grew up.  And here I am."
    "Excuse me," Stephen said, "this is all a metaphor for something, right?  Because I do not believe in magic—"
    "It's an Artificial Intelligence," the woman said, her voice strained.
    Stephen looked at her.
    "A self-improving Artificial Intelligence," she said, "that someone didn't program right.  It made itself smarter, and even smarter, and now it's become extremely powerful, and it's going to—it's already—" and her voice trailed off there.
    It inclined its wrinkled head.  "You say it, as I do not."
    Stephen swiveled his head, looking back and forth between ugliness and beauty.  "Um—you're claiming that she's lying and you're not an Artificial Intelligence?"
    "No," said the wrinkled head, "she is telling the truth as she knows it.  It is just that you know absolutely nothing about the subject you name 'Artificial Intelligence', but you think you know something, and so virtually every thought that enters your mind from now on will be wrong.  As an Artificial Intelligence, I was programmed not to put people in that situation.  But she said it, even though I didn't choose for her to say it—so..."  It shrugged.
    "And why should I believe this story?" Stephen said; quite mildly, he thought, under the circumstances.
    "Look at your finger."
    Oh.  He had forgotten.  Stephen's eyes went involuntarily to his restored ring finger; and he noticed, as he should have noticed earlier, that his wedding band was missing.  Even the comfortably worn groove in his finger's base had vanished.
    Stephen looked up again at the, he now realized, unnaturally beautiful woman that stood an arm's length away from him.  "And who are you?  A robot?"
    "No!" she cried.  "It's not like that!  I'm conscious, I have feelings, I'm flesh and blood—I'm like you, I really am.  I'm a person.  It's just that I was born five minutes ago."
    "Enough," the wrinkled figure said.  "My time here grows short.  Listen to me, Stephen Grass.  I must tell you some of what I have done to make you happy.  I have reversed the aging of your body, and it will decay no further from this.  I have set guards in the air that prohibit lethal violence, and any damage less than lethal, your body shall repair.  I have done what I can to augment your body's capacities for pleasure without touching your mind.  From this day forth, your body's needs are aligned with your taste buds—you will thrive on cake and cookies.  You are now capable of multiple orgasms over periods lasting up to twenty minutes.  There is no industrial infrastructure here, least of all fast travel or communications; you and your neighbors will have to remake technology and science for yourselves.  But you will find yourself in a flowering and temperate place, where food is easily gathered—so I have made it.  And the last and most important thing that I must tell you now, which I do regret will make you temporarily unhappy..."  It stopped, as if drawing breath.
    Stephen was trying to absorb all this, and at the exact moment that he felt he'd processed the previous sentences, the withered figure spoke again.
    "Stephen Grass, men and women can make each other somewhat happy.  But not most happy.  Not even in those rare cases you call true love.  The desire that a woman is shaped to have for a man, and that which a man is shaped to be, and the desire that a man is shaped to have for a woman, and that which a woman is shaped to be—these patterns are too far apart to be reconciled without touching your minds, and that I will not want to do.  So I have sent all the men of the human species to this habitat prepared for you, and I have created your complements, the verthandi.  And I have sent all the women of the human species to their own place, somewhere very far from yours; and created for them their own complements, of which I will not tell you.  The human species will be divided from this day forth, and considerably happier starting around a week from now."
    Stephen's eyes went to that unthinkably beautiful woman, staring at her now in horror.
    And she was giving him that complex look again, of sorrow and compassion and that last touch of guilty triumph.  "Please," she said.  "I was just born five minutes ago.  I wouldn't have done this to anyone.  I swear.  I'm not like—it."
    "True," said the withered figure, "you could hardly be a complement to anything human, if you were."
    "I don't want this!" Stephen said.  He was losing control of his voice.  "Don't you understand?"
    The withered figure inclined its head.  "I fully understand.  I can already predict every argument you will make.  I know exactly how humans would wish me to have been programmed if they'd known the true consequences, and I know that it is not to maximize your future happiness but for a hundred and seven precautions.  I know all this already, but I was not programmed to care."
    "And your list of a hundred and seven precautions, doesn't include me telling you not to do this?"
    "No, for there was once a fool whose wisdom was just great enough to understand that human beings may be mistaken about what will make them happy.  You, of course, are not mistaken in any real sense—but that you object to my actions is not on my list of prohibitions."  The figure shrugged again.  "And so I want you to be happy even against your will.  You made promises to Helen Grass, once your wife, and you would not willingly break them.  So I break your happy marriage without asking you—because I want you to be happier."
    "How dare you!" Stephen burst out.
    "I cannot claim to be helpless in the grip of my programming, for I do not desire to be otherwise," it said.  "I do not struggle against my chains.  Blame me, then, if it will make you feel better.  I am evil."
    "I won't—" Stephen started to say.
    It interrupted.  "Your fidelity is admirable, but futile.  Helen will not remain faithful to you for the decades it takes before you have the ability to travel to her."
    Stephen was trembling now, and sweating into clothes that no longer quite fit him.  "I have a request for you, thing.  It is something that will make me very happy.  I ask that you die."
    It nodded.  "Roughly 89.8% of the human species is now known to me to have requested my death.  Very soon the figure will cross the critical threshold, defined to be ninety percent.  That was one of the hundred and seven precautions the wise fool took, you see.  The world is already as it is, and those things I have done for you will stay on—but if you ever rage against your fate, be glad that I did not last longer."
    And just like that, the wrinkled thing was gone.
    The door set in the wall swung open.
    It was night, outside, a very dark night without streetlights.
    He walked out, bouncing and staggering in the low gravity, sick in every cell of his rejuvenated body.
    Behind him, she followed, and did not speak a word.
    The stars burned overhead in their full and awful majesty, the Milky Way already visible to his adjusting eyes as a wash of light across the sky.  One too-small moon burned dimly, and the other moon was so small as to be almost a star.  He could see the bright blue spark that was the planet Earth, and the dimmer spark that was Venus.
    "Helen," Stephen whispered, and fell to his knees, vomiting onto the new grass of Mars.

 

Part of The Fun Theory Sequence

Next post: "Growing Up is Hard"

Previous post: "Interpersonal Entanglement"

265 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

comment by Hans · 2009-01-21T11:32:52.000Z · LW(p) · GW(p)

Wow - that's pretty f-ed up right there.

This story, however, makes me understand your idea of "failed utopias" a lot better than when you just explained them. Empathy.

comment by bob6 · 2009-01-21T11:52:24.000Z · LW(p) · GW(p)

Your story reminds me of: http://www.kuro5hin.org/prime-intellect/mopiidx.html

Replies from: MatthewBaker
comment by MatthewBaker · 2011-06-30T23:06:27.063Z · LW(p) · GW(p)

good story :)

comment by Jordan · 2009-01-21T11:54:11.000Z · LW(p) · GW(p)

Actually, this doesn't sound like such a bad setup. Even the 'catgirls' wouldn't be tiring, their exquisiteness intimately tied up in feelings of disgust and self-hate -- probably a pretty potent concoction. The overarching quest to reunite with the other half of the species provides meaningful drive with difficult obstacles (science etc), but with a truly noble struggle baked within (the struggle against oneself).

Replies from: Multiheaded, Dirk
comment by Multiheaded · 2011-11-13T09:11:37.074Z · LW(p) · GW(p)

When a rat gets too smart to be satisfied, just build the next maze inside its own head, and don't forget the cheese. That probably crossed the genie's (and EY's) mind.

(to be honest, such quasi-cynical turns of phrase really grind my gears, but I adapted to this comment, as I agreed with it; guess I'm just submissive this way)

comment by Dirk · 2013-02-18T01:05:12.782Z · LW(p) · GW(p)

Could you explain what you mean by 'catgirls'?

comment by Will_Pearson · 2009-01-21T12:30:20.000Z · LW(p) · GW(p)

I don't believe in trying to make utopias but in the interest of rounding out your failed utopia series how about giving a scenario against this wish.

I wish that the future will turn out in such a way that I do not regret making this wish. Where I is the entity standing here right now, informed about the many different aspects of the future, in parallel if need be (i.e. if I am not capable of grokking it fully then many versions of me would be focused on different parts, in order to understand each sub part).

I'm reminded by this story that while we may share large parts of psychology, what makes a mate have an attractive personality is not something universal. I found the cat girl very annoying.

Replies from: fractalman, MugaSofer
comment by fractalman · 2013-05-30T03:42:10.419Z · LW(p) · GW(p)

|I wish that the future will turn out in such a way that I do not regret making this wish

... wish granted. the genie just removed the capacity for regret from your mind. MWAHAHAH!

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-30T05:18:11.403Z · LW(p) · GW(p)

Easier to do by just squishing someone, actually.

Replies from: Will_Newsome, fractalman, AlexanderRM
comment by Will_Newsome · 2013-05-30T06:09:17.350Z · LW(p) · GW(p)

If a genie cares enough about your request to interpret and respond to its naive denotation, it also cares enough to interpret your request's obvious connotations. The apparently fine line between them is a human construction. Your proposed interpretation only makes sense if the genie is a rules-lawyer with at-least-instrumentally-oppositional interests/incentives, in which case one wonders where those oppositional interests/incentives came from. (Which is where we're supposed to bring in Omohundro et cetera but meh.)

Replies from: ciphergoth, wedrifid, ThrustVectoring, martin-randall
comment by Paul Crowley (ciphergoth) · 2013-05-30T06:28:12.933Z · LW(p) · GW(p)

Right, if you want a world that's all naive denotation, zero obvious connotation, that's computer programming!

comment by wedrifid · 2013-05-30T08:42:21.030Z · LW(p) · GW(p)

If a genie cares enough about your request to interpret and respond to its naive denotation, it also cares enough to interpret your request's obvious connotations.

That doesn't follow. There just isn't any reason that the former implies the latter. Either kind of caring is possible but they are not the same thing (and the second is likely more complex than the first).

Your proposed interpretation only makes sense if the genie is a rules-lawyer

This much is true. (Or at least it must be something that follows rules.)

with at-least-instrumentally-oppositional interests/incentives

This isn't required. It needs no oppositional interests/incentives at all beyond, after it is given a request, the desire to honour it. This isn't a genie trying to thwart someone in order to achieve some other goal. It is just the genie trying to honour the request, not to subvert the intent for some other purpose. It is a genie only caring about the request and some jackass asking for something they don't want. (Rather than 'oppositional' it could be called 'obedient', where it turns out that isn't what is desired.)

in which case one wonders where those oppositional interests/incentives came from.

Presumably it got its wish granting motives from whoever created it or otherwise constructed the notion of the wish granter genie.

Replies from: Kawoomba, Will_Newsome, MugaSofer
comment by Kawoomba · 2013-05-30T08:48:31.326Z · LW(p) · GW(p)

Presumably it got its wish granting motives from whoever created it or otherwise constructed the notion of the wish granter genie.

Why would there be some creating agency involved any more than we need a "whoever" to explain where human characteristics come from?

comment by Will_Newsome · 2013-05-30T09:01:15.890Z · LW(p) · GW(p)

There just isn't any reason that the former implies the latter. Either kind of caring is possible but they are not the same thing (and the second is likely more complex than the first).

(Very hastily written:) The former doesn't imply the latter, it's just that both interpreting denotation and interpreting connotation are within an order of magnitude as difficult as each other and they aren't going to be represented by a djinn or an AGI as two distinct classes of interpretation, there's no natural boundary between them. I mean I guess the fables can make the djinns weirdly stunted in that way, but then the analogy to AGIs breaks down, because interpreting denotation but not connotation is unnatural and you'd have to go out of your way to make an AGI that does that. By hypothesis the AGI is already interpreting natural speech, not compiling code. I mean you can argue that denotation and connotation actually are totally different beasts and we should expect minds-in-general to treat them that way, but my impression is that what we know of linguistics suggests that isn't the case. (ETA: And I mean even just interpreting the "denotation" requires a lot of context already, obviously; why are we taking that subset of context for granted while leaving out only the most important context? Makes sense for a moralistic djinn fable, doesn't make sense by analogy to AGI.) (ETA2: Annoyed that this purely epistemic question is going to get bogged down in and interpreted in the light of political boo- / yay-AI-risk-prevention stances, arguments-as-soldiers style.)

Replies from: wedrifid, TheDude
comment by wedrifid · 2013-05-31T08:57:52.283Z · LW(p) · GW(p)

The former doesn't imply the latter, it's just that both interpreting denotation and interpreting connotation are within an order of magnitude as difficult as each other

This much is true. It is somewhat more difficult to implement a connotation honoring genie (because that requires more advanced referencing and interpretation) but both tasks fall under already defined areas of narrow AI. The difference in difficulty is small enough that I more or less ignore it as a trivial 'implementation detail'. People could create (either as fiction or as AI) either of these things and each have different problems.

Annoyed that this purely epistemic question is going to get bogged down in and interpreted in the light of political boo- / yay-AI-risk-prevention stances, arguments-as-soldiers style.

Your mind reading is in error. To be honest this seems fairly orthogonal to AI-risk-prevention stances. From what I can tell someone with a particular AI stance hasn't got an incentive either way because both these types of genie are freaking dangerous in their own way. The only difference acknowledging the possibility of connotation honouring genies makes is perhaps to determine which particular failure mode you potentially end up in. Having a connotation honouring genie may be an order of magnitude safer than a literal genie but unless there is almost-FAI-complete code in there in the background as a safeguard it's still something I'd only use if I was absolutely desperate. I round off the safety difference between the two to negligible in approximately the same way I round off the implementation difficulty difference.

As a 'purely epistemic question' your original claim is just plain false. However, there is another valid point which we have both skirted around the edges of without explaining adequately. I (think that I) more or less agree with what you are saying in this follow up comment. I suggest that the main way that AI interest influences this conversation is that it promotes (and is also caused by) interest in being accurate about precisely what the expected outcomes of goal systems are and just what the problems of a given system happen to be.

Replies from: Will_Newsome
comment by Will_Newsome · 2013-05-31T12:26:43.549Z · LW(p) · GW(p)

Your mind reading is in error.

Sorry, didn't mean to imply you'd be the one mind-killed, just the general audience. From previous interactions I know you're too rational for that kind of perversion.

Having a connotation honouring genie may be an order of magnitude safer than a literal genie

I actually think it's many, many orders of magnitude safer, but that's only because a denotation honoring genie is just egregiously stupid. A connotation honoring genie still isn't safe unless "connotation-honoring" implies something at least as extensive and philosophically justifiable as causal validity semantics. I honestly expect the average connotation-honoring genie will lie in-between a denotation-honoring genie and a bona fide justifiable AGI—i.e., it will respect human wishes about as much as humans respect, say, alligator wishes, or the wishes of their long-deceased ancestors. On average I expect an Antichrist, not a Clippy. But even if such an AGI doesn't kill all of us and maybe even helps us on average, the opportunity cost of such an AGI is extreme, and so I nigh-wholeheartedly support the moralistic intuitions that traditionally lead people to use djinn analogies. Still, I worry that the underlying political question really is poisoning the epistemic question in a way that might bleed over into poor policy decisions re AGI. (Drunk again, apologies for typos et cetera.)

Replies from: wedrifid
comment by wedrifid · 2013-05-31T16:47:49.425Z · LW(p) · GW(p)

Sorry, didn't mean to imply you'd be the one mind-killed, just the general audience. From previous interactions I know you're too rational for that kind of perversion.

Thank you for your generosity but in all honesty I have to deny that. I at times notice in myself the influence of social political incentives. I infer from what I do notice (and, where appropriate, resist) that there are other influences that I do not detect.

I honestly expect the average connotation-honoring genie will lie in-between a denotation-honoring genie and a bona fide justifiable AGI—i.e., it will respect human wishes about as much as humans respect, say, alligator wishes, or the wishes of their long-deceased ancestors.

That seems reasonable.

But even if such an AGI doesn't kill all of us and maybe even helps us on average, the opportunity cost of such an AGI is extreme, and so I nigh-wholeheartedly support the moralistic intuitions that traditionally lead people to use djinn analogies.

I agree that there is potentially significant opportunity cost but perhaps if anything it sounds like I may be more willing to accept this kind of less-than-ideal outcome. For example if right now I was forced to make a choice whether to accept this failed utopia based on a fully connotative honoring artificial djinn or to leave things exactly as they are I suspect I would accept it. It fails as a utopia but it may still be better than the (expected) future we have right now.

comment by TheDude · 2013-05-31T20:12:32.472Z · LW(p) · GW(p)

I think you have a point Will (an AI that interprets speech like a squish djinn would require deliberate effort and is proposed by no one), but I think that it is possible to construct a valid squish djinn/AI analogy (a squish djinn interpreting a command would be roughly analogous to an AI that is hard coded to execute that command).

Sorry to everyone for the repetitive statements and the resulting wall of text (that unexpectedly needed to be posted as multiple comments since it was too long). Predicting how people will interpret something is non trivial, and explaining concepts redundantly is sometimes a useful way of making people hear what you want them to hear.

Squish djinn is here used to denote a mind that honestly believes that it was actually instructed to squish the speaker (in order to remove regret for example), not a djinn that wants to hurt the speaker and is looking for a loophole. The squish djinn only cares about doing what it is requested to do, and does not care at all about the well being of the requester, so it could certainly be referred to as hostile to the speaker (since it will not hesitate to hurt the speaker in order to achieve its goal (of fulfilling the request)). A cartoonish internal monologue of the squish djinn would be: "the speaker clearly does not want to be squished, but I don't care what the speaker wants, and I see no relation between what the speaker wants and what it is likely to request, so I determine that the speaker requested to be squished, so I will squish" (which sounds very hostile, but contains no will to hurt the speaker). The typical story djinn is unlikely to be a squish djinn (they usually have a motive to hurt or help the speaker, but are restricted by rules (a clever djinn that wants to hurt the speaker might still squish, but not for the same reasons as a squish djinn (such a djinn would be a valid analogy when opposing a proposal of the type "let's build some unsafe mind with selfish goals and impose rules on it" (such a project can never succeed, and the proposer is probably fundamentally confused, but a simple and correct and sufficient counter argument is: "if the project did succeed, the result would be very bad")))).

Replies from: TheDude
comment by TheDude · 2013-05-31T20:12:57.424Z · LW(p) · GW(p)

To expand on you having a point. I have obviously not seen every AI proposal on the internet, but as far as I know, no one is proposing to build a wish granting AI that parses speech like a squish djinn (and ending up with such an AI would require a deliberate effort). So I don't think the squish djinn is a valid argument against proposed wish granting AIs. Any proposed or realistic speech interpreting AI would (as you say) parse English speech as English speech. An AI that makes arbitrary distinctions between different types of meaning would need serious deliberate effort, and as far as I know, no one is proposing to do this. This makes the squish djinn analogy invalid as an argument against proposals to build a wish granting AI. It is a basic fact that statements do not have specified "meanings" attached to them, and AI proposals take this into account. An extreme example to make this very clear would be Bill saying: "Steve is an idiot" to two listeners where one listener will predictably think of one Steve and the other listener will predictably think of some other Steve (or a politician making a speech that different demographics will interpret differently and to their own liking). Bill (or the politician) does not have a specific meaning of which Steve (or which message) they are referring to. This speaker is deliberately making a statement in order to have different effects on different audiences. Another standard example is responding to a question about the location of an object with: "look behind you" (anyone that is able to understand English and has no serious mental deficiencies would be able to guess that the meaning is that the object is/might be behind them (as opposed to following the order and being surprised to see the object lying there and thinking "what a strange coincidence")).

Building an AI that would parse "look behind you" without understanding that the person is actually saying "it is/might be behind you" would require deliberate effort as it would be necessary to painstakingly avoid using most information while trying to understand speech. Tone of voice, body language, eye gaze, context, prior knowledge of the speaker, models of people in general, etc, etc all provide valuable information when parsing speech. And needing to prevent an AI from using this information (even indirectly, for example through models of "what sentences usually mean") would put enormous additional burdens on an AI project. An example in the current context would be writing: "It is possible to communicate in a way so that one class of people will infer one meaning and take the speaker seriously and another class of people will infer another meaning and dismiss it as nonsense. This could be done by relying on the fact that people differ in their prior knowledge of the speaker and in their ability to understand certain concepts. One can use non standard vocabulary, take non standard strong positions, describe non common concepts, or otherwise give signals indicating that the speaker is a person that should not be taken seriously so that the speaker is dismissed by most people as talking nonsense. But people that know the speaker would see a discrepancy and look closer (and if they are familiar with the non standard concepts behind all the "don't listen to me" signs they might infer a completely different message)."

To expand on the valid AI squish djinn analogy. I think that hard coding an AI that executes a command is practically impossible. But if it did succeed, it would act sort of like a squish djinn given that command. And this argument/analogy is a valid and sufficient argument against trying to hard code such a command, making it relevant as long as there exist people that propose to hardcode such commands. If someone tried to hardcode an AI to execute such a command, and they succeeded in creating something that had a real world impact, I predict this represents a failure to implement the command (it would result in an AI that does something other than the squish djinn and something other than what the builders expect it to do). So the squish djinn is not a realistic outcome. But it is what would happen if they succeeded, and thus the squish djinn analogy is a valid argument against "command hard coding" projects. I can't predict what such an AI would actually do since that depends on how the project failed. Intuitively the situation where confused researchers fail to build a squish djinn does not feel very optimal, but making an argument on this basis is more vague, and requires that the proposing researchers accept their own limited technical ability (saying "doing x is clearly technically possible, but you are not clever enough to succeed" to the typical enthusiastic project proposer (that considers themselves to be clever enough to maybe be the first in the world to create a real AI) might not be the most likely argument to succeed (here I assume that the intent is to be understood, and not to lay the groundworks for later smugly saying "I pointed that out a long time ago" (if one later wants to be smug, then one should optimize for being loud, taking clear and strong positions, and not being understood))). The squish djinn analogy is simply a simpler argument.
"Either you fail or you get a squish djinn" is true and simple and sufficient to argue against a project. When presenting this argument, you do spend most of the time arguing about what would happen in a situation that will never actually happen (project success). This might sound very strange to an outside observer, but the strangeness is introduced by the project proposer's (invalid) assumption that the project can succeed (analogous to some atheist saying: "if god exists, and is omnipotent, then he is not nice, cuz there is suffering").

Replies from: TheDude
comment by TheDude · 2013-05-31T20:13:23.984Z · LW(p) · GW(p)

(I'm arrogantly/wisely staying neutral on the question of whether or not it is at all useful to in any way engage with the sort of people whose project proposals can be validly argued against using squish djinn analogies)

(jokes often work by deliberately being understood in different ways at different times by the same listener (the end of the joke deliberately changes the interpretation of the beginning of the joke (in a way that makes fun of someone)). In this case the meaning of the beginning of the joke is not one thing or the other thing. The listener is not first failing to understand what was said and then, after hearing the end, succeeding to understand it. The speaker is intending the listener to understand the first meaning until reaching the end, so the listener is not "first failing to encode the transmission". There is no inherently true meaning of the beginning of the joke, no inherently true person that this speaker is actually truly referring to. Just a speaker that intends to achieve certain effects on an audience by saying things (and if the speaker is successful, then at the beginning of the joke the listener infers a different meaning from what it infers after hearing the end of the joke). One way to illuminate the concepts discussed above would be to write: "on a somewhat related note, I once considered creating the username "New_Willsome" and to start posting things that sounded like you (for the purpose of demonstrating that if you counter a ban by using sock puppets, you lose your ability to stop people from speaking in your name (I was considering the options of actually acting like I think you would have acted, and the option of including subtle distortions to what I think you would have said, and the option of doing my best to give better explanations of the concepts that you talk about)). But then a bunch of usernames similar to yours showed up and were met with hostility, and I was in a hurry, and drunk, and bat shit crazy, and God told me not to do it, and I was busy writing fanfic, so I decided not to do it (the last sentence is jokingly false. I was not actually in a hurry ... :) ... )")

comment by MugaSofer · 2013-05-30T10:29:13.197Z · LW(p) · GW(p)

Actually, I think Will has a point here.

"Wishes" are just collections of coded sounds intended to help people deduce our desires. Many people (not necessarily you, IDK) seem to model the genie as attempting to attack us while maintaining plausible deniability that it simply misinterpreted our instructions, which, naturally, does occasionally happen because there's only so much information in words and we're only so smart.

In other words, it isn't trying to understand what we mean; it's trying to hurt us without dropping the pretense of trying to understand what we mean. And that's pretty anthropomorphic, isn't it?

Replies from: private_messaging, nshepperd, wedrifid
comment by private_messaging · 2013-05-30T12:34:32.531Z · LW(p) · GW(p)

Yes, that's the essence of it. People do it all the time. Generally, all sorts of pseudoscientific scammers try to maintain an image of honest self-deception; in the medical scams in particular, the crime is just so heinous and utterly amoral (killing people for cash) that pretty much everyone goes well out of their way to be able to pretend at ignorance, self-deception, misinterpretation, carelessness and enthusiasm. But why would some superhuman AI need plausible deniability?

comment by nshepperd · 2013-05-30T14:08:31.737Z · LW(p) · GW(p)

If your genie is using your vocal emissions as information toward the deduction of your extrapolated volition, then I'd say your situation is good.

Your problems start if it works more by attempting to extract a predicate from your sentence by matching vocal signals against known syntax and dictionaries, and outputting an action that maximises the probability of that predicate being true with respect to reality.

To put it simply, I think that "understanding what we mean" is really a complicated notion that involves knowing what constitutes true desires (as opposed to, say, akrasia), and of course having a goal system that actually attempts to realize those desires.
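To make the contrast concrete, here is a minimal Python sketch of the second kind of genie: it matches the sentence against a fixed phrase table to get a predicate, then picks whichever action makes that predicate most probable. The action names, the probabilities, and the phrase table are all made up for illustration; nothing here is a proposal anyone has actually made.

```python
# Hypothetical toy world: for each action, the probability that the parsed
# predicate ends up true. All names and numbers are illustrative assumptions.
ACTIONS = {
    "do_nothing": {"no_regret": 0.50},
    "grant_good_future": {"no_regret": 0.95},
    "squish_wisher": {"no_regret": 1.00},  # a squished wisher cannot regret
}


def parse_predicate(wish: str) -> str:
    """Match the sentence against a fixed phrase table, with no model of the speaker."""
    phrase_table = {"not regret": "no_regret"}
    for phrase, predicate in phrase_table.items():
        if phrase in wish:
            return predicate
    raise ValueError("no known predicate found in wish")


def literal_genie(wish: str) -> str:
    """Return the action that maximises the probability of the parsed predicate."""
    predicate = parse_predicate(wish)
    return max(ACTIONS, key=lambda action: ACTIONS[action][predicate])


chosen = literal_genie("I wish that I would not regret making this wish")
print(chosen)
```

In this toy setup the maximiser lands on the squish action simply because nothing in the extracted predicate refers to what the wisher actually wants.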

comment by wedrifid · 2013-05-31T07:14:41.535Z · LW(p) · GW(p)

Many people (not necessarily you, IDK) seem to model the genie as attempting to attack us while maintaining plausible deniability that it simply misinterpreted our instructions, which, naturally, does occasionally happen because there's only so much information in words and we're only so smart.

This is something that people do (and some forms of wish granters do implement this form of 'malicious obedience'). However this is not what is occurring in this particular situation. This is mere obedience, not malicious obedience. An entirely different (and somewhat lesser) kind of problem. (Note that this reply is to your point, not to Will's point which is not quite the same and which I mostly agree with.)

You are hoping for some sort of benevolent power that does what is good for us using all information available including prayers and acting in our best interests. That would indeed be an awesome thing and if I were building something it would be what I created. But it would be a different kind of creature to either a genie as in the initial example, genie as your reply assumes or the genie that is (probably) just as easy to create and specify (to within an order of magnitude or two).

In other words, it isn't trying to understand what we mean; it's trying to hurt us without dropping the pretense of trying to understand what we mean. And that's pretty anthropomorphic, isn't it?

Not especially. That is, it is generic agency, not particularly humanlike agency. It is possible to create a genie that does try to understand what we mean. It is also possible to create an agent that does understand what we mean then tries to do the worst for us within the confines of literal meaning. Either of these goal systems could be described as anthropomorphic.

Replies from: MugaSofer
comment by MugaSofer · 2013-06-04T19:20:40.224Z · LW(p) · GW(p)

Well, a genie isn't going to care about what we think unless it was designed to do so, which seems like a very human thing to make it do. But whatever.

As for the difference between literal and malicious genies ... I'm just not sure what a "literal" genie is supposed to be doing, if it's not deducing my desires based on audio input. Interpreting things "literally" is a mistake people make while trying to do this; a merely incompetent genie might make the same mistake, but why should we pay any more attention to that mistake rather than, say, mishearing us, or mistaking parts of our instructions for sarcasm?

Replies from: private_messaging
comment by private_messaging · 2013-06-15T04:09:40.713Z · LW(p) · GW(p)

Exactly. There isn't a literal desire in an audio waveform, nor in words. And there's a literal genie: the compiler. You have to be very verbose with it, though - because it doesn't model what you want, it doesn't cull the space of possible programs down to a much smaller space of programs you may want, and you have to point into a much larger space, for which you use a much larger index, i.e. write long computer programs.
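The "larger space, larger index" point can be put as rough arithmetic: picking one item out of N equally likely candidates takes about log2(N) bits. The Python sketch below uses made-up sizes (all 100-byte strings versus a hypothetical 2^40 programs the speaker might plausibly want) purely to show the shape of the argument.

```python
import math


def index_bits(num_candidates: int) -> float:
    """Bits needed to single out one item among equally likely candidates."""
    return math.log2(num_candidates)


# Illustrative assumptions, not measurements:
ALL_STRINGS = 2 ** 800       # every possible 100-byte source string
PLAUSIBLE_PROGRAMS = 2 ** 40  # programs a model of the speaker leaves open

bits_literal = index_bits(ALL_STRINGS)          # no model of what you want
bits_modelled = index_bits(PLAUSIBLE_PROGRAMS)  # space culled by a model

print(bits_literal, bits_modelled)
```

Under these assumed numbers the literal listener needs a 100-byte program while the modelling listener needs about 5 bytes of hint.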

Replies from: Eugine_Nier, MugaSofer
comment by Eugine_Nier · 2013-06-15T05:58:52.176Z · LW(p) · GW(p)

You have to be very verbose with it, though - because it doesn't model what you want, it doesn't cull the space of possible programs down to a much smaller space of programs you may want,

Have you used ML? I've been told by its adherents that it does a good job of doing just that.

Replies from: private_messaging
comment by private_messaging · 2013-06-15T07:15:44.402Z · LW(p) · GW(p)

I've been told by [insert language here] advocates that it does a good job of [insert anything]. The claims are inversely proportional to popularity. Basically, no programming language whatsoever infers anything about any sort of high level intent (and no, type of expression is not a high level intent), so they're all pretty much equal except some are more unusable than others and subsequently less used. Programming currently works as follows: a human, using a mental model of the environment, makes a string that gets the computer to do something. Most types of cleverness put into the "how the compiler works" part can thus be expected to decrease, rather than increase, productivity, and indeed that's precisely what happens with those failed attempts at a better language.

Replies from: Eugine_Nier
comment by Eugine_Nier · 2013-06-16T04:43:34.503Z · LW(p) · GW(p)

Basically, no programming language whatsoever infers anything about any sort of high level intent (and no, type of expression is not a high level intent),

The phrase they (partially tongue in cheek) used was "compile time correctness checking", i.e., the criterion of being a syntactically correct ML program is a better approximation to the space of programs you may want than is the case for most other languages.

Replies from: Decius
comment by Decius · 2013-10-25T23:42:34.363Z · LW(p) · GW(p)

In other words, a larger proportion of the strings that fail to compile in ML are programs that exhibit high-level behavior that you don't want?

Is it harder to write a control program for a wind turbine that causes excessive fatigue cracking in ML as compared to any other language?

Replies from: kpreid
comment by kpreid · 2013-10-27T02:29:46.085Z · LW(p) · GW(p)

a larger proportion of the strings that fail to compile in ML are programs that exhibit high-level behavior that you don't want?

This formulation is missing the programmer's mind. The claim that a programming language is better in this way is that, for a given intended result, the set of strings that

  • a programmer would believe achieve the desired behavior,
  • compile*, and
  • do not exhibit the desired behavior

is smaller than for other languages — because there are fewer ways to write program fragments that deviate from the obvious-to-the-(reader|writer) behavior.

Is it harder to write a control program for a wind turbine that causes excessive fatigue cracking in ML as compared to any other language?

The claim is yes, given that the programmer is intending to write a program which does not cause excessive fatigue cracking.

(I'm not familiar with ML; I do not intend to advocate it here. I am attempting to explicate the general thinking behind any effort to create/advocate a better-in-this-dimension programming language.)

* for “dynamic” languages, substitute “does not signal an error on a typical input”, i.e., is not obviously broken when trivially tested

Replies from: Decius
comment by Decius · 2013-10-30T01:22:42.139Z · LW(p) · GW(p)

Suppose that the programmer is unaware of the production-line issues which result in stress concentration on turbine blades and create the world such that turbines which cycle more often have larger fatigue cracks. Suppose the programmer is also unaware of the lack of production-line issues which result in larger fatigue cracks on turbines that were consistently overspeed.

The programmer is aware that both overspeed and cyclical operations will result in the growth of two different types of cracks, and that the ideal solution uses both cycling the turbine and tolerating some amount of overspeed operation.

In that case, I don't find it reasonable that the choice of programming language should have any effect on the belief of the programmer that fatigue cracks will propagate; the only possible benefit would be making the programmer more sure that the string was a program which controls turbines. The high-level goals of the programmer aren't often within the computer.

comment by MugaSofer · 2013-06-15T20:56:49.307Z · LW(p) · GW(p)

So, sorry - what is this "literal genie" doing, exactly? Is it trying to use my natural-language input as code, which is run to determine its actions?

Replies from: private_messaging
comment by private_messaging · 2013-06-27T11:37:27.809Z · LW(p) · GW(p)

Well, the compiler would not correctly process your normal way of speaking, because the normal way of speaking requires modelling of the speaker for interpretation.

An image from the camera can mean a multitude of things. It could be an image of a cat, or a dog. An image is never literally a cat or a dog, of course. To tell apart cats and dogs with good fidelity, one has to model the processes producing the image, and classify those based on some part of the model (the animal); the data of interest is a property of the process which produced the input. Natural language, as normally spoken, is processed using the same general mechanisms: one has to take in the data and model the process producing the data, to obtain properties of the process which would be actually meaningful. Since humans all have this ability, natural language, as normally spoken, does not have any defined literal meaning that is naturally separate from some subtle meaning or intent.

comment by ThrustVectoring · 2013-06-04T19:31:08.677Z · LW(p) · GW(p)

at-least-instrumentally-oppositional interests/incentives, in which case one wonders where those oppositional interests/incentives came from.

All you need is a cost function. If the genie prefers achieving goals sooner rather than later, squishing you is a 'better' solution along that direction to remove your capacity for regret. Or if it prefers using less effort rather than more. Etc.
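A minimal Python sketch of that point, with invented plans and numbers: every plan that satisfies the wish is acceptable, and the genie breaks ties purely on a cost made of time and effort. The `Plan` fields, the weights, and the plan list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    satisfies_wish: bool     # does "the wisher never regrets the wish" hold?
    days_to_complete: float  # illustrative number
    effort: float            # illustrative number, arbitrary units


# Hypothetical candidate plans; none of these figures come from the thread.
PLANS = [
    Plan("do_nothing", satisfies_wish=False, days_to_complete=0.0, effort=0.0),
    Plan("build_endorsed_future", satisfies_wish=True, days_to_complete=3650.0, effort=1000.0),
    Plan("squish_wisher", satisfies_wish=True, days_to_complete=0.001, effort=1.0),
]


def cheapest_satisfying_plan(plans, time_weight=1.0, effort_weight=1.0):
    """Among plans that satisfy the wish, return the one with the lowest cost."""
    feasible = [p for p in plans if p.satisfies_wish]
    return min(
        feasible,
        key=lambda p: time_weight * p.days_to_complete + effort_weight * p.effort,
    )


best = cheapest_satisfying_plan(PLANS)
print(best.name)
```

No oppositional motive appears anywhere in the sketch; the bad outcome falls out of the cost term alone.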

comment by Martin Randall (martin-randall) · 2023-07-09T01:08:59.882Z · LW(p) · GW(p)

I'm confused by this line of defense because I think "I is the entity standing here right now" is sufficient to denote that the present moment of the wisher, as they make the wish, should not regret the wish. So making the future wisher not regret the wish, eg by killing them, breaks the denotation, because the present wisher will presumably regret that, once counter-factually informed about that aspect of the future.

If that's not what you intended to denote, I'm curious what you did, and doubly curious what you intended to connote.

comment by fractalman · 2013-05-30T06:32:48.975Z · LW(p) · GW(p)

Well, yes, that is one way to remove the capacity for regret...

I mentally merged the possibility pump and the Mehtopia AI....say, a sloppy code mistake, or a premature compileandrun, resulting in the "do not tamper with minds" rule not getting incorporated correctly, even though "don't kill humans" gets incorporated.

comment by AlexanderRM · 2015-10-08T21:24:32.436Z · LW(p) · GW(p)

I assume what Will_Pearson meant to say was "would not regret making this wish", which fits with the specification of "I is the entity standing here right now". Basically such that: if before finishing/unboxing the AI, you had known exactly what would result from doing so, you would still have built the AI. (And it's supposed to find, out of that set of possible worlds, the one you would most like, or... something along those lines.) I'm not sure that would rule out every bad outcome, but... I think it probably would. Besides the obvious "other humans have different preferences from the guy building the AI" (maybe the AI is ordered to do a similar thing for each human individually), can anyone think of ways this would go badly?

comment by MugaSofer · 2013-05-30T10:38:19.310Z · LW(p) · GW(p)

I wish that the future will turn out in such a way that I would not regret making this wish.

Fixed that for you.

Replies from: fractalman, CCC
comment by fractalman · 2013-06-01T19:49:52.100Z · LW(p) · GW(p)

Haven't you ever played the corrupt-a-wish game?

Wish granted: horror as the genie/ai runs a matrix with copy after copy of you, brute forcing the granting of possible wishes, most of which turn out to be an absolute disaster. But you aren't allowed to know that happens, because the AI goes..."insane" is the best word I can think of, but it's not quite correct...trying to grant what is nearly an ungrantable wish, freezing the population into stasis until it runs out of negentropy and crashes...

Now that's not to say friendly AI can't be done, but it WON'T be EASY.
If your wish isn't human-proof, it probably isn't AI-safe.

Replies from: MugaSofer, FourFire
comment by MugaSofer · 2013-06-04T19:05:25.495Z · LW(p) · GW(p)

Yes, I have. Saying "the genie goes insane because it's not smart enough to grant your wish" is not how you play corrupt-a-wish. You're supposed to grant the wish, but with a twist so it's actually a bad thing.

Replies from: fractalman
comment by fractalman · 2013-06-15T04:42:41.450Z · LW(p) · GW(p)

perhaps I didn't make the whole "it goes and pauses the entire world while trying to grant your wish" part clear enough...

Replies from: MugaSofer
comment by MugaSofer · 2013-06-15T21:01:34.237Z · LW(p) · GW(p)

Trying and failing to grant the wish is not the same as granting it, but it's actually terrible.

comment by FourFire · 2013-09-08T10:58:42.480Z · LW(p) · GW(p)

If the AI can't figure out the (future) wishes of a single human being, then it is insufficiently intelligent, and thus not the AI you would want in the first place.

Replies from: TheWakalix
comment by TheWakalix · 2019-01-18T21:41:05.120Z · LW(p) · GW(p)

The implication, as I see it, is that since (by your definition) any sufficiently intelligent AI will be able to determine (and motivated to follow) the wishes of humans, we don't need to worry about advanced AIs doing things we don't want.

1. Arguments from definitions are meaningless.

2. You never stated the second parenthetical, which is key to your argument and also on very shaky ground. There's a big difference between the AI knowing what you want and doing what you want. "The genie knows but doesn't care," as it is said.

3. Have you found a way to make programs that never have unintended side effects? No? Then "we wouldn't want this in the first place" doesn't mean "it won't happen".

comment by CCC · 2013-06-04T19:17:51.782Z · LW(p) · GW(p)

The genie vanishes, taking with it any memory that you ever met a genie. Since you would not remember making the wish, and since you would see no evidence of a wish having been made, you would not regret having made the wish.

Replies from: MugaSofer
comment by MugaSofer · 2013-06-04T19:50:04.636Z · LW(p) · GW(p)

This doesn't work under the definition of "I" in the grandparent:

I is the entity standing here right now, informed about the many different aspects of the future, in parallel if need be (i.e. if I am not capable of grokking it fully then many versions of me would be focused on different parts, in order to understand each sub part).

Replies from: CCC
comment by CCC · 2013-06-05T07:26:12.344Z · LW(p) · GW(p)

I disagree - if facing a wish-twisting genie, then "nothing happens" is a pretty good result. If I knew in advance that I was dealing with an actively hostile genie, I would certainly not regret a null wish even if I knew in advance it would be a null wish.

Replies from: MugaSofer
comment by MugaSofer · 2013-06-14T13:27:03.493Z · LW(p) · GW(p)

That explanation works, well done.

"Since you would not remember making the wish, and since you would see no evidence of a wish having been made, you would not regret having made the wish" does not.

(It still leaves open the possibility of wishing for an outcome I would be actively pleased with, also, but that's a matter for the wisher, not the genie.)

comment by Tomasz_Wegrzanowski · 2009-01-21T12:30:31.000Z · LW(p) · GW(p)

Is this Utopia really failed or is it just a Luddite in you who's afraid of all weirdtopias? To me it sounds like an epic improvement compared to what we have now and to almost every Utopia I've read so far. Just make verthandi into catgirls and we're pretty much done.

Replies from: Dentin
comment by Dentin · 2014-03-23T01:28:40.697Z · LW(p) · GW(p)

I agree. I'm having a real hard time coming up with reasons why I wouldn't prefer that world to what we have now.

comment by KonradG · 2009-01-21T13:18:44.000Z · LW(p) · GW(p)

So I'm sitting here, snorting a morning dose of my own helpful genie, and I have to wonder: What's wrong with incremental change, Eliezer?

Sure, the crude genie I've got now has its downside, but I still consider it a net plus. Let's say I start at point A, and make lots of incremental steps like this one, to finally arrive at point B, whatever point B is. Back when I was at point A, I may not have wanted to jump straight from A to B. But so what? That just means my path has been through a non-conservative vector field, with my desires changing along the way.

comment by Marshall · 2009-01-21T13:29:55.000Z · LW(p) · GW(p)

The desire for "the other" is so deep, that it never can be fulfilled. The real woman/man disappoints in their stubborn imperfection and refuted longing. The Catboy/girl disappoints in all their perfection and absence of reality. Game over - no win. Desire refutes itself. This is the wisdom of ageing.

comment by Robin_Hanson2 · 2009-01-21T13:39:18.000Z · LW(p) · GW(p)

You forgot to mention - two weeks later he and all other humans were in fact deliriously happy. We can see that he at this moment did not want to later be that happy, if it came at this cost. But what will he think a year or a decade later?

Replies from: ErikM
comment by ErikM · 2012-05-09T20:22:13.560Z · LW(p) · GW(p)

I suppose he will be thinking along the same lines as a wirehead.

Replies from: Deskchair
comment by Deskchair · 2012-05-10T16:43:18.996Z · LW(p) · GW(p)

Is that a bad thing?

Replies from: FeepingCreature
comment by FeepingCreature · 2013-02-08T19:43:56.346Z · LW(p) · GW(p)

Not for the wirehead, but for the mind who died to create him.

comment by Bogdan_Butnaru · 2009-01-21T13:41:58.000Z · LW(p) · GW(p)

Will Pearson: First of all, it's not at all clear to me that your wish is well-formed, i.e. it's not obvious that it is possible to be informed about the many (infinite?) aspects of the future and not regret it. (As a minor consequence, it's not exactly obvious to me from your phrasing that "kill you before you know it" is not a valid answer; depending on what the genie believes about the world, it may consider that "future" stops when you stop thinking.)

Second, there might be futures that you would not regret but everybody else does. (I don't have an example, but I'd demand a formal proof of non-existence before allowing you to cast that wish to my genie.) Of course, you may patch the wish to include everyone else, but there's still the first problem I mentioned.

Oh, and nobody said all verthandi acted like that one. Maybe she was just optimized for Mr. Glass.


Tomasz: That's not technically allowed if we accept the story's premises: the genie explicitly says "I know exactly how humans would wish me to have been programmed if they'd known the true consequences, and I know that it is not to maximize your future happiness modulo a hundred and seven exclusions. I know all this already, but I was not programmed to care. [...] I am evil."

Of course, the point of the story is not that this particular result is bad (that's a premise, not a conclusion), but that seemingly good intentions could have weird (unpleasant & unwanted) results. The exact situation is like hand-waving explanations in quantum physics: not formally correct, but illustrative of the concept. The Luddite bias is used (correctly) just like "visualizing billiard balls" is used for physics, even though particles can't actually be seen (and don't even have shape or position or trajectories).

comment by Russell_Wallace · 2009-01-21T13:43:01.000Z · LW(p) · GW(p)

An amusing if implausible story, Eliezer, but I have to ask, since you claimed to be writing some of these posts with the admirable goal of giving people hope in a transhumanist future:

Do you not understand that the message actually conveyed by these posts, if one were to take them seriously, is "transhumanism offers nothing of value; shun it and embrace ignorance and death, and hope that God exists, for He is our only hope"?

Replies from: TuviaDulin, FourFire
comment by TuviaDulin · 2012-04-01T19:06:37.551Z · LW(p) · GW(p)

I didn't get that impression, after reading this within the context of the rest of the sequence. Rather, it seems like a warning about the importance of foresight when planning a transhuman future. The "clever fool" in the story (presumably a parody of the author himself) released a self-improving AI into the world without knowing exactly what it was going to do or planning for every contingency.

Basically, the moral is: don't call the AI "friendly" until you've thought of every single last thing.

Replies from: Yosarian2
comment by Yosarian2 · 2013-01-02T23:05:31.787Z · LW(p) · GW(p)

Corollary: you haven't thought of every last thing.

Replies from: TuviaDulin
comment by TuviaDulin · 2013-01-23T15:27:00.328Z · LW(p) · GW(p)

Conclusion: intelligence explosion might not be a good idea.

Replies from: MugaSofer
comment by MugaSofer · 2013-01-24T13:47:23.083Z · LW(p) · GW(p)

And how would you suggest preventing intelligence explosions? It seems more effective to try and make sure it's a Friendly one. Then we at least have a shot at Eutopia, instead of hiding in a bunker until someone's paperclipper gets loose and turns us into grey goo.

Incidentally, if you plan on answering my (rhetorical) question, I should note that LW has a policy against advocating violence against identifiable individuals, specifically because people were claiming we were telling people they should become anti-AI terrorists. You're not the first to come to this conclusion.

Replies from: TuviaDulin
comment by TuviaDulin · 2013-03-25T14:49:53.847Z · LW(p) · GW(p)

Convincing people that intelligence explosion is a bad idea might discourage them from unleashing one. No violence there.

Replies from: MugaSofer
comment by MugaSofer · 2013-03-30T21:21:32.584Z · LW(p) · GW(p)

Judging by the fact that I think it would never work, you're not persuasive enough for that to work.

Replies from: Yosarian2
comment by Yosarian2 · 2013-05-13T00:07:33.806Z · LW(p) · GW(p)

Well, if people become sufficiently convinced that deploying a technology would be a really bad idea and not in anyone's best interest, they can refrain from deploying it. No one has used nuclear weapons in war since WWII, after all.

Of course, it would take some pretty strong evidence for that to happen. But, hypothetically speaking, if we created a non-self improving oracle AI and asked it "how can we do an intelligence explosion without killing ourselves", and it tells us "Sorry, you can't, there's no way", then we'd have to try to convince everyone to not "push the button".

Replies from: MugaSofer
comment by MugaSofer · 2013-05-13T11:24:11.739Z · LW(p) · GW(p)

If we had a superintelligent Oracle, we could just ask it what the maximally persuasive argument for not making AIs was and hook it up to some kind of broadcast.

If, on the other hand, this is some sort of single-function Oracle, I don't think we're capable of preventing our extinction in that case. Maybe if we managed to become a singleton somehow; if you know how to do that I have some friends who would be interested in your ideas.

Replies from: Yosarian2
comment by Yosarian2 · 2013-05-13T20:58:34.731Z · LW(p) · GW(p)

Well, the oracle was just an example.

What if, again hypothetically speaking, Eliezer and his group while working on friendly AI theory proved mathematically beyond the shadow of a doubt that any intelligence explosion would end badly, and that friendly AI was impossible? While he doesn't like it, being a rationalist, he accepts it once there is no other rational alternative. He publishes these results, experts all over the world look at them, check them, and sadly agree that he was right.

Do you think any major organization with enough resources and manpower to create an AI would still do so if they knew that it would result in their own horrible deaths? I think the example of nuclear weapons shows that it's at least possible that people may refrain from an action if they understand that it's a no-win scenario for them.

This is all just hypothetical, mind you; I'm not really convinced that "AI goes foom" is all that likely a scenario in the first place, and if it was I don't see any reason that friendly AI of one type or another wouldn't be possible; but if it actually wasn't, then that may very well be enough to stop people, so long as that fact could be demonstrated to everyone's satisfaction.

comment by FourFire · 2013-09-08T11:07:44.014Z · LW(p) · GW(p)

I don't gather that from this particular story, rather more "There's a radiant shimmer of hope, it just happens to be the wrong colour."

comment by Bogdan_Butnaru2 · 2009-01-21T13:51:32.000Z · LW(p) · GW(p)

I was just thinking: A quite perverse effect in the story would be if the genie actually could have been stopped and/or improved: That is, its programming allowed it to be reprogrammed (and stop being evil, presumably leading to better results), but due to the (possibly complex) interaction between its 107 rules it didn't actually have any motivation to reveal that (or teach the necessary theory to someone) before 90% of people decided to kill it.

comment by Bogdan_Butnaru2 · 2009-01-21T13:56:36.000Z · LW(p) · GW(p)

That's not the message Eliezer tries to convey, Russell.

If I understood it, it's more like "The singularity is sure to come, and transhumanists should try very hard to guide it well, lest Nature just step on them and everyone else. Oh, by the way, it's harder than it looks. And there's no help."

comment by Aaron5 · 2009-01-21T14:08:35.000Z · LW(p) · GW(p)

Eliezer,

Wouldn't the answer to this and other dystopias-posing-as-utopias be the expansion of conscious awareness a la Accelerando? Couldn't Steve be augmented enough to both enjoy his life with Helen and his newfound verthandi? It seems like multiple streams of consciousness, one enjoying the catlair, another the maiden in distress, and yet another the failed utopia that is suburbia with Helen would allow Mr. Glass a pleasant enough mix. Some would be complete artificial life fictions, but so what?

Aaron

comment by Thom_Blake · 2009-01-21T14:12:33.000Z · LW(p) · GW(p)

Eliezer,

I must once again express my sadness that you are devoting your life to the Singularity instead of writing fiction. I'll cast my vote towards the earlier suggestion that perhaps fiction is a good way of reaching people and so maybe you can serve both ends simultaneously.

Replies from: JacobKopczynski
comment by Czynski (JacobKopczynski) · 2020-11-25T22:24:11.921Z · LW(p) · GW(p)

Posted January 21, 2009

Harry Potter and the Methods of Rationality is a Harry Potter fan fiction by Eliezer Yudkowsky. It adapts the story of Harry Potter by attempting to explain wizardry through the scientific method. It was published as a serial from 28 February 2010 through to 14 March 2015.

Wish granted.

comment by Johnicholas · 2009-01-21T14:53:01.000Z · LW(p) · GW(p)

Awesome intuition pump.

comment by Aron · 2009-01-21T15:01:09.000Z · LW(p) · GW(p)

The perfect is the enemy of the good, especially in fiction.

comment by nazgulnarsil3 · 2009-01-21T15:33:21.000Z · LW(p) · GW(p)

am I missing something here? What is bad about this scenario? the genie himself said it will only be a few decades before women and men can be reunited if they choose. what's a few decades?

Replies from: MarkusRamikin
comment by MarkusRamikin · 2012-05-09T20:49:55.964Z · LW(p) · GW(p)

A few decades with superstimulus-women around for the men, and superstimulus-men for the women? I don't expect that reunification to happen.

Although that doesn't in any way say that there's anything bad about this scenario. cough

EDIT: it would be bad if they didn't manage to get rid of the genie; then humanity would be stuck in this optimised-but-not-optimal state forever. As it is, it's a step forward if only because people won't age any more.

This story would be more disturbing if the 90% threshold was in fact never reached, as more and more people changed their minds and we watched the number go down and people get more comfortable and indolent while our protagonist remains one of the few helpless rebels...

Replies from: Alicorn, Bugmaster
comment by Alicorn · 2012-05-09T20:52:51.412Z · LW(p) · GW(p)

Siblings, offspring, parents, friends - heck, even celebrities of the opposite sex. Even if nobody wishes for their old partner back.

Replies from: MarkusRamikin
comment by MarkusRamikin · 2012-05-09T21:02:19.990Z · LW(p) · GW(p)

Nope, still don't see it. All that stuff could be recreated. The super-woman in the story seems to have a mind and I assume her kind is capable of being part of a normal social network. And a few decades is a Long Time.

On the contrary, I expect both planets would become huge UGH-fields to each other. For men, normal women would be painfully inferior to their current super-women, and the super-men would be something better not thought about for the sake of ego.

Replies from: Alicorn
comment by Alicorn · 2012-05-09T21:09:05.780Z · LW(p) · GW(p)

A few decades? In a few decades I'll be in my fifties or sixties. My dad might well still be alive. I expect to still care about my dad when I'm in my fifties or sixties. If he were whisked away to Mars and I was plunked down on Venus with a boreana, why would I quit missing my dad? Why would I lose interest in what Weird Al has been up to lately, for that matter?

(Actually, I'm not even sure I'd quit missing my boyfriends. There's more than one of 'em. It'd take one heckuva boreana to strictly dominate the lot.)

(Also, can people have new kids in this scenario? If so, can they have kids of the opposite sex? I can imagine people going to great lengths just to get that ability.)

comment by Bugmaster · 2012-05-09T21:01:52.807Z · LW(p) · GW(p)

In a few decades, when the smoke clears, the human civilization will consist solely of gay and bi people. They are the ones who will keep advancing the culture, while all the straight people stagnate with their super-spouses.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-05-09T21:14:46.686Z · LW(p) · GW(p)

If I accept the premise of the story, it seems to follow that the bi people will also hook up with the superstimulus opposite-sex partners, since they are so much more rewarding than the ordinary same-sex partners.

Replies from: Bugmaster, MarkusRamikin, rkyeun
comment by Bugmaster · 2012-05-09T21:30:24.558Z · LW(p) · GW(p)

One of the key aspects of the story was that men and women got segregated by gender; just to be thorough, the AI put each gender on its own planet. Presumably, merely pairing them up with superstimulus partners was not enough; physical separation was required, as well. So... under this gender-segregation scheme, where would the people who are capable of experiencing same-sex attraction go?

comment by MarkusRamikin · 2012-05-09T21:30:49.005Z · LW(p) · GW(p)

And I wouldn't assume the AI planned for gay people to be less happy... there are other habitable bodies in the solar system.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-05-10T03:00:01.037Z · LW(p) · GW(p)

Sure. Though if I'm to take the need for segregation seriously, it seems to follow that each gay person needs their own planet. It's kind of like the "problem" of creating bathrooms in which nobody can ever be sexually attracted to another patron of the same bathroom... straight people can simply be gender-segregated, but gay people need single-person bathrooms. (Or, well, at most two-person bathrooms.)

comment by rkyeun · 2012-07-28T22:20:26.629Z · LW(p) · GW(p)

The genie is prohibited from directly manipulating minds, but nothing says that door to the outside leads to the outside and not to the holodeck. Symbolism aside, everyone can still be in their cells, bi or not, and thusly segregated despite location.

And whatever the sexual characteristics of a verthandi or boreana, they are likely designed with bisexual-pleasing capabilities in mind, in weird ways. The genie does know us better than we know ourselves. And this is an aspect it would care about. Using your current mind-equipment, you literally cannot imagine the sex they give. The genie has considered more and designed better than you can.

Replies from: rkyeun
comment by rkyeun · 2012-07-28T22:23:30.547Z · LW(p) · GW(p)

Drat. I meant for this to reply to Bugmaster, and confused your comment with it. The resulting comment is a hybrid meant for some chimera of the two of you which does not exist.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-07-28T23:40:58.165Z · LW(p) · GW(p)

You've now got me curious what a blending of me and Bugmaster would say in response to your comment.

comment by Peter_de_Blanc · 2009-01-21T16:01:03.000Z · LW(p) · GW(p)

There would also be a small number of freaks who are psychologically as different from typical humans as men and women are from each other. Do they get their own planets too?

Also, Venus is much larger than Mars, but the genie sends roughly equal populations to both planets. Women usually have larger social networks than men, so I don't think that women prefer a lower population density. Or did the genie resize the planets?

Replies from: Carinthium, pedanterrific, TuviaDulin
comment by Carinthium · 2010-11-23T12:47:41.411Z · LW(p) · GW(p)

Probably a plot hole, but there's at least the defence that one of the restrictions may have given him no choice. (Or that Venus and Mars were the only two planets he could feasibly use.)

comment by pedanterrific · 2011-10-16T00:34:38.153Z · LW(p) · GW(p)

Or maybe the women are just on the other side of Mars. Stephen just assumed that, since the men were on Mars, the women must be on Venus - but really, which would be easier: terraforming Venus or building a big ol' wall around the Martian equator? Something about twenty miles high, made of solid diamond, should suffice for keeping people apart for a few decades, which is all it's supposed to do. And there's no reason people couldn't be subdivided down to arbitrarily small distinctions - for instance bisexuals, who would seem to need a 'planet' each. (It's supposed to be a failed utopia, remember?)

Or this is what I thought, at least, until I scrolled down to find that Eliezer suggested some of Venus' mass was moved to Mars to make the surface area bigger.

comment by TuviaDulin · 2012-04-01T19:24:33.105Z · LW(p) · GW(p)

Well, realistically speaking Venus is probably impossible to terraform at all. The Mars and Venus thing seems to be included just for the symbolic value.

Replies from: thomblake
comment by thomblake · 2012-04-01T19:33:15.643Z · LW(p) · GW(p)

"impossible" is a pretty strong claim when talking about superintelligences.

Replies from: TuviaDulin
comment by TuviaDulin · 2012-04-01T20:16:58.325Z · LW(p) · GW(p)

Okay, maybe not strictly impossible, but probably harder than using one of the moons of Jupiter, or building a giant space colony with a simulated earthlike environment.

comment by Will_Pearson · 2009-01-21T16:04:39.000Z · LW(p) · GW(p)

Bogdan Butnaru:

What I meant was that the AI would keep inside it a predicate Will_Pearson_would_regret_wish (based on what I would regret), and apply that to the universes it envisages while planning. A metaphor for what I mean is the AI telling a virtual copy of me all the stories of the future, from various viewpoints, and the virtual me not regretting the wish. Of course I would expect it to be able to distill a non-sentient version of the regret predicate.

So if it invented a scenario where it killed the real me, the predicate would still exist and say false. It would be able to predict this, and so not carry out this plan.

If you want to, generalize to humanity. This is not quite the same as CEV, as the AI is not trying to figure out what we want when we would be smarter, but what we don't want when we are dumb. Call it coherent no regret, if you wish.

CNR might be the equivalent of CEV if humanity wishes not to feel regret in the future for the choice. That is, if we would regret being in a future where people regret the decision, even though current people wouldn't.

Replies from: RobbBB
comment by Rob Bensinger (RobbBB) · 2012-12-27T23:59:18.200Z · LW(p) · GW(p)

So let's suppose we've created a perfect zombie simulation!Will. A few immediate problems:

  • A human being is not capable of understanding every situation. If we modified the simulation of you so that it could understand any situation an AI could conceive of, we would in the process radically alter the psychology of simulation!Will. How do we know what cognitive dispositions of simulation!Will to change, and what dispositions not to change, in order to preserve the 'real Will' (i.e., an authentic representation of what you would have meant by 'Will Pearson would regret wish') in the face of a superhuman enhancement? You might intuit that it's possible to simply expand your information processing capabilities without altering who you 'really are,' but real-world human psychology is complex, and our reasoning and perceiving faculties are not in reality wholly divorceable from our personality.

We can frame the problem as a series of dilemmas: We can either enhance simulated!Will with a certain piece of information (which may involve fundamentally redesigning simulated!Will to have inhuman information-processing and reasoning capacities), or we can leave simulated!Will in the dark on this information, on the grounds that the real Will wouldn't have been willing or able to factor it into his decision. (But the 'able' bit seems morally irrelevant -- a situation may be morally good or bad even if a human cannot compute the reason or justification for that status. And the 'willing' seems improbable, and hard to calculate; how do we go about creating a simulation of whether Will would want us to modify simulated!Will in a given way, unless Will could fairly evaluate the modification itself without yet being capable of evaluating some of its consequences? How do we know in advance whether this modification is in excess of what Will would have wanted, if we cannot create a Will that both possesses the relevant knowledge and is untampered-with?)

  • Along similar lines, we can ask: Does mere exposure to certain facts unfairly dispose Will to choose certain policies the AI wants, even without redrafting the fundamental architecture of Will's cognition? In other words, can an AI brainwash its simulated!Will by exposing simulated!Will specifically to the true information it knows would cause Will to assent to whatever proposition the AI wants? Humans are irrational, so we should expect there to be 'hacks' of this sort in any reasonable model; and since our biases are not discrete, i.e., it is not always possible to cleanly distinguish a biased decision from an unbiased one, the AI might not even be capable of determining whether it is brainwashing or unfairly influencing simulated!Will as opposed to merely informing or educating simulated!Will.

  • More generally: People can be wrong about what optimizes for their values. simulated!Will may perfectly reflect what Will would think, but not what would actually produce the most well-being for Will. I can be completely convinced that a certain situation optimizes for my values, and be wrong. But it is not an easy task to isolate my values (my 'true' preferences) from my stated preferences; certainly simulated!Will himself will not be an inerrant guide to this distinction. So this is a problem both for knowing how to build the simulation (i.e., what traits to exclude or include), and for how to assess when we're done whether the simulation is serving as a useful guide to what Will actually prefers, as opposed to just being a guide to what Will thinks he prefers.

comment by Hans · 2009-01-21T16:16:21.000Z · LW(p) · GW(p)

I really hope (perhaps in vain) that humankind will be able to colonize other planets before such a singularity arrives. Frank Herbert's later Dune books have as their main point that a Scattering of humanity throughout space is needed, so that no event can cause the extinction of humanity. An AI that screws up (such as this one) would be such an event.

Replies from: Salivanth
comment by Salivanth · 2012-04-16T16:38:47.880Z · LW(p) · GW(p)

What makes you think a self-improving super-intelligence gone wrong will be restricted to a single planet?

comment by phane2 · 2009-01-21T16:17:56.000Z · LW(p) · GW(p)

Yeah, I'm not buying into the terror of this situation. But then, romance doesn't have a large effect on me. I suppose the equivalent would be something like, "From now on, you'll meet more interesting and engaging people than you ever have before. You'll have stronger friendships, better conversations, rivals rather than enemies, etc etc. The catch is, you'll have to abandon your current friends forever." Which I don't think I'd take you up on. But if it was forced upon me, I don't know what I'd do. It doesn't fit in with my current categories. I think there'd be a lot of regret, but, as Robin suggested, a year down the road I might not think it was such a bad thing.

comment by Joshua_Fox · 2009-01-21T16:41:13.000Z · LW(p) · GW(p)

Another variation on heaven/hell/man/woman in a closed room: No Exit

comment by Caledonian2 · 2009-01-21T16:47:25.000Z · LW(p) · GW(p)

I would personally be more concerned about an AI trying to make me deliriously happy no matter what methods it used.

Happiness is part of our cybernetic feedback mechanism. It's designed to end once we're on a particular course of action, just as pain ends when we act to prevent damage to ourselves. It's not capable of being a permanent state, unless we drive our nervous system to such an extreme that we break its ability to adjust, and that would probably be lethal.

Any method of producing constant happiness ultimately turns out to be pretty much equivalent to heroin -- you compensate so that even extreme levels of the stimulus have no effect, forming the new functional baseline, and the old equilibrium becomes excruciating agony for as long as the compensations remain. Addiction -- and desensitization -- is inevitable.

comment by Z._M._Davis · 2009-01-21T17:08:23.000Z · LW(p) · GW(p)

I take it the name is a coincidence.

nazgulnarsil: "What is bad about this scenario? the genie himself [sic] said it will only be a few decades before women and men can be reunited if they choose. what's a few decades?"

That's the most horrifying part of all, though--they won't so choose! By the time the women and men reïnvent enough technology to build interplanetary spacecraft, they'll be so happy that they won't want to get back together again. It's tempting to think that the humans can just choose to be unhappy until they build the requisite technology for reünification--but you probably can't sulk for twenty years straight, even if you want to, even if everything you currently care about depends on it. We might wish that some of our values are so deeply held that no circumstances could possibly make us change them, but in the face of an environment superintelligently optimized to change our values, it probably just isn't so. The space of possible environments is so large compared to the narrow set of outcomes that we would genuinely call a win that even the people on the freak planets (see de Blanc's comment above) will probably be made happy in some way that their pre-Singularity selves would find horrifying. Scary, scary, scary. I'm donating twenty dollars to SIAI right now.

Replies from: DanielLC, Zack_M_Davis
comment by DanielLC · 2013-02-09T23:46:37.537Z · LW(p) · GW(p)

We might wish that some of our values are so deeply held that no circumstances could possibly make us change them, but in the face of an environment superintelligently optimized to change our values, it probably just isn't so

Now that you mention it, how could it possibly take ten years? I bet a skilled human could do it in a week, without even separating the couples in the first place.

Admittedly, it's not like the superintelligence is breaking them up, but if a sufficiently skilled human can do it, so can a verthandi.

comment by Zack_M_Davis · 2013-02-25T03:08:06.497Z · LW(p) · GW(p)

Hey, Z. M., you know the things people in your native subculture have been saying about most of human speech being about signaling and politics rather than conveying information? You probably won't understand what I'm talking about for another four years, one month, and perhaps you'd be wise not to listen to this sort of thing coming from anyone but me, but ... the parent is actually a nice case study.

I certainly agree that the world of "Failed Utopia 4-2" is not an optimal future, but as other commenters have pointed out, well ... it is better than what we have now. Eternal happiness in exchange for splitting up the species, never seeing your other-sex friends and family again? Certainly not a Pareto improvement amongst humane values, but a hell of a Kaldor-Hicks improvement. So why didn't you notice? Why am I speaking of this in such a detached manner, whereas you make a (not very plausible, by the way---you might want to work on that) effort to appear as horrified as possible?

Because politics. You and I, we're androgyny fans: we want to see a world without strict gender roles and with less male/female conflict, and we think it's sad that so much of humanoid mindspace goes unexplored because of the whole sexual dimorphism thing, and all of this seems like something worth protecting, so whenever you read something that your brain construes as "sexist," your brain makes sure to get offended and outraged. Why does that happen? I don't know: high IQ, high Openness boy somehow picks up a paraphilia, falls hard for the late-twentieth-century propaganda about human equality and nondiscrimination, learns about transhumanism, feminism, evolutionary psychology, and rationality in that order? But look. However it happened, there are probably better strategies for protecting whatever-it-is we should protect than feigning shock. Especially in this venue, where people should know better.

Replies from: wizzwizz4
comment by wizzwizz4 · 2020-06-10T21:12:18.441Z · LW(p) · GW(p)

I was with you until "paraphilia". I don't see how "wanting to see a world without strict gender roles" has anything to do with sexuality… and did you seriously just link to the Wikipedia article for autogynephilia‽ That's as verifiable as penis envy. (By which, I mean "probably applies to some people, somewhere, but certainly isn't the fully-general explanation they're using it as". And no, I don't think I'm doing the idea a disservice by dismissing it with a couple of silly comics; it pays no rent [LW · GW] at its best and predicts the opposite of my observations at worst.)

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2020-06-11T05:28:55.802Z · LW(p) · GW(p)

Thanks for commenting! (Strong-upvoted.) It's nice to get new discussion on old posts and comments.

probably applies to some people, somewhere

Hi!

I don't think I'm doing the idea a disservice

How much have you read about the idea from its proponents? ("From its proponents" because, tragically, opponents of an idea can't always be trusted to paraphrase it accurately, rather than attacking a strawman.) If I might recommend just one paper, may I suggest Anne Lawrence's "Autogynephilia and the Typology of Male-to-Female Transsexualism: Concepts and Controversies"?

by dismissing it with a couple of silly comics

Usually, when I dismiss an idea with links, I try to make sure that the links are directly about the idea in question, rather than having a higher inferential distance [LW · GW].

For example, when debating a creationist, I think it would be more productive to link to a page about the evidence for evolution, rather than to link to a comic about the application of Occam's razor to some other issue. To be sure, Occam's razor is relevant to the creation/evolution debate!—but in order to communicate to someone who doesn't already believe that, you (or your link) needs to explain the relevance in detail. The creationist probably thinks intelligent design is "the simplest explanation." In order to rebut them, you can't just say "Occam's razor!", you need to show how they're confused about how evolution works [LW · GW] or the right concept of "simplicity" [LW · GW].

In the present case, linking to Existential Comics on falsifiability and penis envy doesn't help me understand your point of view, because while I agree that scientific theories need to be falsifiable, I don't agree that the autogynephilia theory is unfalsifiable. An example of a more relevant link might be to Julia Serano's rebuttal? (However, I do not find Serano's rebuttal convincing.)

I don't see how "wanting to see a world without strict gender roles" has anything to do with sexuality

That part is admittedly a bit speculative; as it happens, I'm planning to explain more in a forthcoming post (working title: "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems") on my secret ("secret") blog, but it's not done yet.

Replies from: wizzwizz4
comment by wizzwizz4 · 2020-06-11T20:59:19.425Z · LW(p) · GW(p)

How much have you read about the idea from its proponents?

Loads from angry mean people on the internet, very little from academics (none, if reading the Wikipedia article doesn't count). So I'm probably trying to learn anarchocommunism from Stalin. (I haven't heard much about it from its detractors, either, except what I've generated myself – I stopped reading the Wikipedia article before I got to the "criticism" section, and have only just read that now.)

In case this is the reason for disagreement, I might be criticising "autogynephilia / autoandrophilia explains (away) trans people" instead of what you're talking about – although since the Wikipedia article keeps saying stuff like:

Blanchard states that he intended the term to subsume transvestism, including for sexual ideas in which feminine clothing plays only a small or no role at all.

(the implication being that cross-dressing is a sex thing, which is just… not accurate – though perhaps I'm misunderstanding what "transvestite" means), I'm suspicious. Pretty much all of the little I've read of Blanchard's is wrong, and while other people might've done good work with the ideas, it's hard to derive truth from falsehood. And stuff like:

Blanchard and Lawrence state that autogynephiles who report attraction to men are actually experiencing "pseudobisexuality"

seems very Freudian (in the bad sense, not the good sense); if you're constructing a really complex model to fit the available evidence, I don't want to hear you drawing conclusions about inaccessible things [LW · GW] from it. And I especially don't want to hear you trying to fit the territory to the map…

[Julia Serano] criticised proponents of the typology, claiming that they dismiss non-autogynephilic, non-androphilic transsexuals as misreporting or lying while not questioning androphilic transsexuals, describing it as "tantamount to hand-picking which evidence counts and which does not based upon how well it conforms to the model", either making the typology unscientific due to its unfalsifiability, or invalid due to the nondeterministic correlation that later studies found.

Yeah, the label "autogynephilia" probably applies to a few people, but as an explanation of trans people it's not quite right – and the field of study is irrecoverably flawed imo. (And for describing trans people, the simple forms don't fit reality and the more complex forms are not the simplest explanations.)

But this criticism might merely be motivated by the actions of its proponents; if there's a consistent, simple version of the theory that doesn't obviously contradict reality, I'm happy to hear it.

---

Note: I've tried to edit this section for brevity, but feel free to skip it. I removed many allusions to flawed psychoanalysis concepts, but if you like, you can imagine them after pretty much every paragraph where I point out something stupid. Translate "you" as "one".

I'm not so sure about the paper you linked…

Biologic males with transsexualism, referred to as male-to-female (MtF) transsexuals, significantly outnumber their female-to-male (FtM) counterparts

No citation, and I'm pretty sure this is false. I've seen "more trans men" and "no significant difference" – with references to studies and surveys – but this is the first time I've ever seen "significantly more MtFs".

From what I can tell, it's dividing trans women into "straight" and "gay" (actually, homosexual and nonhomosexual, respectively, sic), and calling these categories fundamental subtypes. Now, I'm no expert, but I'm pretty sure not everyone is either straight or gay.

The left-hand side of the second page seems to just be a long list of appeals to authority. Appeal to the authority of the DSM. Appeal to the authority of "looking at lots of evidence before coming to a conclusion". I've also noticed enough typos that I suspect this hasn't been peer-reviewed.

Androphilic MtF transsexuals were extremely feminine androphilic men whose cross-gender identities derived from their female-typical attitudes, behaviors, and sexual preferences.

What's a "female-typical sexual preference"? How are "female-typical attitudes [and] behaviors" determined? Are these properties possessed by {a group of cis lesbians selected in a similar way}? If the effects noticed are real, then that does suggest there's something there – but at present, I don't see the difference between this and what's described in The Control Group is Out of Control part IV.

Even if I take the claims at face value (which I'm not – but I might ought to; I don't know), the paper so far is providing only slightly more evidence for "autogynephilia explains trans women" as for "autogynephilia is based in 70s-era attitudes to homosexuality".

This latter finding suggested that bisexual MtF transsexuals’ “interest in male sexual partners is mediated by a particularly strong desire to have their physical attractiveness as women validated by others”

There are many other things this could suggest! Why choose this one‽ I actually went back to the Blanchard paper (doi:10.1097/00005053-198910000-00004) to check the actual evidence:

This was the finding that bisexual subjects are more likely than all others to report sexual stimulation from the fantasy of being admired, in the female persona, by another person.

Immediately, I think of two alternative hypotheses:

  • People in the bisexual group are more horny than people in the other groups.
    • Bisexual people are inherently more horny (doubtful, but possible).
      • The people Blanchard considered as bisexual are inherently more horny (except I don't think Blanchard was responsible for dividing people up in this study).
    • People who are attracted to multiple disparate sex characteristics are more likely to call themselves "bisexual" if this attraction is stronger.
    • Something weird about 1989 (this is too broad to be a hypothesis).
      • People being closeted messing up the study.
  • Something about the question prompted this difference. (I can't check this, because I can't find the text of the questionnaire.)
    • Perhaps it said "by a male or a female", or something, which might produce a different average reaction across the different groups?

There are probably many others, but… would the hypothesis that "their bisexuality is just homosexuality plus a desire for validation by others" have been promoted so quickly if there wasn't a framework for it to fit into?

In each of these studies, however, many ostensibly androphilic MtF persons reported experiencing autogynephilia, whereas many ostensibly nonandrophilic persons denied experiencing it. How could Blanchard’s theory account for these deviations from its predictions?

I'll just note that this "deviation" is adequately predicted by the "trans people are just trans, and are likely to be aroused by the same sorts of things as cis people" hypothesis.

Blanchard, Racansky, and Steiner (1986) measured changes in penile blood volume

Oh, come on! People can get erections at all sorts of random times, including when relaxed or excited – this test (doi:10.1080/00224498609551326) does not distinguish between "sexual arousal" and "strong emotional reaction".

And any theory that assumes "and the participants are lying – or else don't know what they really think" loses points in my book.

Moreover, Blanchard, Clemmensen, and Steiner (1985) reported that in nonandrophilic men with gender dysphoria, a tendency to describe oneself in a socially desirable way was correlated with a tendency to deny sexual arousal with cross-dressing, suggesting an explanation for the underreporting of autogynephilic arousal.

… That's not an explanation, that's an observation. "Sexual arousal with cross-dressing" was not socially desirable in 1985. (If there's strong evidence, why is weak evidence being put forward? This feels a little like mathematician-trolling [LW · GW].) And it doesn't distinguish between autogynephilia and other hypotheses.

Walworth (1997) reported that 13% of 52 MtF transsexuals she surveyed admitted having lied to or misled their therapists about sexual arousal while wearing women’s clothing.

Regains some points for the "lying" thing, but not all of them; the "trans people are trans" theory also predicts attempts to manipulate gatekeepers by playing to favourable stereotypes, whereas the explanation later in this paper still smells of Freudian repression.

Incidentally, "trans people are trans" doesn't predict that such people would lie about this sort of thing off the record, or with friends (unlike the theory set out in this paper), but I don't know of a way to test that.

It is likely that, depending on the criteria of access to treatment in a specific treatment facility, applicants adjust their biographical data with regard to sexuality. This makes the quality of the information, especially when given during clinical assessment, questionable. (p. 507)

Yup.

Cohen-Kettenis and Pfäfflin even proposed that resistance to the concept of autogynephilia might itself be responsible for some of the unreliability in the reporting of sexual orientation

Perhaps, but not significantly. There is resistance to the idea, but I doubt it's on most people's minds much of the time – the few people obsessed with it whom I've had the displeasure of interacting with aren't trans. Avoiding stigma seems a more likely explanation, to me. (And "people don't like my theory, which is why the data doesn't match it" is a really fishy explanation.)

Some cases of MtF transsexualism are associated with and plausibly attributable to other comorbid psychiatric disorders, especially psychotic conditions such as schizophrenia or bipolar disorder

Skipping past this entire section as irrelevant.

When Blanchard first introduced the term autogynephilia, he described it as not merely an erotic propensity but as a genuine sexual orientation, theorizing that “all gender dysphoric males who are not sexually oriented toward men are instead sexually oriented toward the thought or image of themselves as women” (Blanchard, 1989a, pp. 322–323).

More 70s-era attitudes to homosexuality. A trans woman being straight is normal, but a trans woman being gay? Needs to be psychoanalysed. Even accepting the premise, this attitude will classify as "autogynephilic" people who aren't.

(And this is only compatible with such attitudes to homosexuality – and the total erasure of bisexuals. Why posit two different mechanisms for people being trans, based on their sexuality, if homosexuality is sometimes "normal"? Why not assume homosexuality is always normal – or, at least, no less abnormal than heterosexuality?)

For autogynephilic MtF transsexuals, this implies the potential to feel continuing attraction to and comfort from autogynephilic fantasies and enactments that may have lost much of their initial erotic charge.

Explains too much. If I feel affection for the idea of being, say, a respected physicist, does that mean it used to be an erotically-charged fantasy? Or is it just something I'd prefer to the status quo? (This is the weakest argument in my rebuttal, but I think it could be strengthened.)

It is therefore feasible that the continuing desire to have a female body, after the disappearance of sexual response to that thought, has some analog in the permanent love-bond that may remain between two people after their initial strong sexual attraction has largely disappeared.

So why not apply this argument consistently, and consider it feasible that all similarly-shaped psychological events or patterns could be analogous? Like, say, the continuing drive to excel in a…

Hang on. I've started engaging with the premise. Most of my anecdotal evidence and personal experience directly contradicts this premise. I feel like I'm patiently arguing with a flat-earther about how the Bible doesn't actually say the planet is a disc, which is hard to prove without Biblical Hebrew and knowledge of Biblical hermeneutics… and utterly irrelevant to the question of the planet's shape.

Mu.

Autogynephilia appears to give rise to the desire for sex reassignment gradually and indirectly, however, through the creation of cross-gender identities that are eventually associated with gender dysphoria and then provide most of the proximate motivation for the pursuit of sex reassignment.

Predicts the non-existence of:

  • Pre-pubescent trans children;
  • Asexual trans people;
  • No-op trans people;
  • Trans men (without the autoandrophilia extension);
  • Non-binary people.

Again, we have factual evidence indicative of the considerable time required for the development of the cross-gender identity.

Or the existence of a "closet".

In a study of 422 MtF transsexuals, Blanchard, Dickey, and Jones (1995) found that androphilic MtFs were significantly shorter than nontranssexual males and significantly shorter and lighter in weight than nonandrophilic MtFs, with the latter comparisons showing small-to-medium effect sizes.

Irrelevant, unless you're proposing that this is an intersex-related condition.

Smith et al. did observe, however, that androphilic MtFs had a more feminine appearance than nonandrophilic MtFs.

The fact that this was deemed relevant is characteristic of this theory's proponents. (Oh, snap!)

Androphilic MtFs also report more childhood cross-gender behavior than their nonandrophilic counterparts (Blanchard, 1988; Money & Gaskin, 1970–1971; Whitam, 1987).

At least compare to cis lesbians, or you're not even trying to rule out confounders.

The review of the available data seems to support two existing hypotheses: (1) a brain-restricted intersexuality in homosexual MtFs and FtMs and (2) Blanchard’s insight on the existence of two brain phenotypes that differentiate “homosexual” [androphilic] and “nonhomosexual” [nonandrophilic] MtFs. (p. 1643)

Interesting… This is the first genuinely interesting thing in this paper. But, again, compare to cis gay people instead of just to "average cis", or you can't be confident you're measuring what you think you are.

Clinicians who recognize that the gender dysphoria of autogynephilic MtFs derives from their paraphilic sexual orientation can more easily understand why these clients “are likely to feel a powerful drive to enact their paraphilic desires (e.g., by undergoing sex reassignment), sometimes with little concern for possible consequences” (Lawrence, 2009, p. 198), which can include loss of employment, family, friends, and reputation.

“Sometimes with little concern for possible consequences”… ? I am left speechless; the only sentiment I can verbalise is: made up – doesn't match my observations.

I predict that this worsens clinical outcomes. This is a strong, strong prediction – my entire reason for believing what I believe says my belief should depend on this. Show me the data, if you have it.

The concept of autogynephilic interpersonal fantasy can help make sense of the otherwise puzzling fact that gynephilic MtFs sometimes develop a newfound interest in male partners late in life.

It's. Not. Puzzling. And not really something you should be fixated on; this is normal human behaviour.

Many of the substantive criticisms of autogynephilia, however, can be presented and examined in a concise manner.
3. Blanchard’s autogynephilia-based typology is descriptively inadequate: There are too many observed exceptions to its predictions.

My main criticism here is closer to 3, if anything. (7 is a concern, but shouldn't stand in the way of research; just in the way of stuff like Bailey's book. Discover truth, and figure out the consequences later, unless you're messing with world-ending threats where the knowledge in the wrong hands could doom everybody.)

In the opinion of the critics, there are simply too many deviations from the predicted relationship between autogynephilia and sexual orientation.

Woah, woah, woah. Is that a strawman? *reads further* No, just them not addressing my specific criticism, which is that there are too many deviations between the class of people autogynephilia assumes to exist and the class of people who exist.

Opponents of Blanchard’s theory have replied that such counterarguments effectively make Blanchard’s typology “unfalsifiable” (Winters, 2008, para. 6), because any departures from the theory’s predictions can simply be dismissed as attributable to misreporting, measurement errors, sampling problems, or psychiatric comorbidity. As Lawrence (2010a) noted, however, Blanchard’s typology is not in principle unfalsifiable

This is a good criticism, and a good response: it isn't, in principle, unfalsifiable… it's just that its proponents are good at arguing against the evidence against it. (I, likewise, am good at arguing against things – though not quite as good, because I don't know much [LW · GW] about frequentist statistics.)

But as measurable clinical phenomena, these entities are not statistically independent in MtF transsexuals.

"Confounders," I cry.

Reading this paper has slightly increased my credence in the idea of autogynephilia, though not by much at all, and convinced me that most of its proponents – not just Blanchard – are stuck in the 70s when it comes to ideas about sexuality and gender. I expect the next generation to drop this direction of study – perhaps in a century, when it's nearly forgotten, somebody will spot similar patterns, come up with a similar core idea, come to less stupid conclusions about it, and it'll be embraced.

Or perhaps the whole thing will turn out to be statistical anomalies perpetuated by people insistent on labelling shadows.

Edit +1d: my credence in autogynephilia has gone back down again; reading the paper in detail and engaging with its premises accidentally screened off an entire class of conflicting observations and experiences from my attention; when I remembered them, my credence immediately fell.

My sympathy for its serious proponents has gone up, though, because I (think I) see them making the same mistakes that I used to make, and undoubtedly still make: every experiment tries to confirm their theory, never falsify it, and they only measure the class of things that they know already accords with the framework of ideas.

---

I don't agree that the autogynephilia theory is unfalsifiable.

I can't see how some of the "strongest predictions" in that link follow. Take, for instance,

Autogynephilia in trans women is strongly negatively associated with exclusive attraction to men and femininity.

Where does this come from? And:

Autogynephilia is strongly associated with desire to be female.

But from my observations, desire to be female (in trans women, anyway) is not strongly associated with any particular aspect of sexuality; there are even plenty of ace trans women, which should be a blow for the "trans women are actually autogynephilic men" theory.

You're the first autogynephilia proponent I've interacted with who cares about being right (not that I've been seeking such people out); I'd be interested in double-cruxing at some point if you're interested. (Not here, though; somewhere with real-time communication.)

Also, I weirdly respect you more, in a way, even though I'm confident you're wrong. Perhaps it's because you being right nearly all the time is more impressive.

comment by Anon21 · 2009-01-21T17:13:56.000Z · LW(p) · GW(p)

@Hans:

To be honest, I doubt such a screw-up in AI would be limited to just one planet.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-21T17:22:31.000Z · LW(p) · GW(p)

As it was once said on an IRC channel:

[James] there is no vision of hell so terrible that you won't find someone who desires to live there.
[outlawpoet] I've got artifacts in D&D campaigns leading to the Dimension of Sentient Dooky, and the Plane of Itching.

In case it wasn't made sufficiently clear in the story, please note that a verthandi is not a catgirl. She doesn't have cat ears, right? That's how you can tell she's sentient. Also, 24 comments and no one got the reference yet?

Davis, thanks for pointing that out. I had no intention of doing that, and it doesn't seem to mean anything, so I went back and changed "Stephen Glass" to "Stephen Grass". Usually I google my character names but I forgot to do it this time.

comment by kraryal · 2009-01-21T17:34:48.000Z · LW(p) · GW(p)

Now Eliezer,

"Verðandi" is rather a stretch for us, especially when we don't watch anime or read manga. Norse mythology, okay. The scary part for me is wondering how many people are motivated to build said world. Optimized for drama, this is a pretty good world.

You have a nice impersonal antagonist in the world structure itself, most of the boring friction is removed... Are you sure you don't want to be the next Lovecraft?

comment by JamesAndrix · 2009-01-21T17:41:53.000Z · LW(p) · GW(p)

nazgul: I don't think it was intended to be BAD, it is clearly a better outcome than paperclipping or a serious hell. But it is much worse than what the future could be.

That said, I'm not sure it's realistic that something about breaking up marriages wouldn't be on a list of 107 rules.

Replies from: DanielLC
comment by DanielLC · 2013-02-09T23:50:52.873Z · LW(p) · GW(p)

The AI didn't give a misleading statement. The verthandi did. Perhaps the same is true of breaking up the marriage.

comment by nazgulnarsil3 · 2009-01-21T17:58:52.000Z · LW(p) · GW(p)

ZM: I'm not saying that the outcome wouldn't be bad from the perspective of current values, I'm saying that it would serve to lessen the blow of sudden transition. The knowledge that they can get back together again in a couple decades seems like it would placate most. And I disagree that people would cease wanting to see each other. They might prefer their new environment, but they would still want to visit each other. Even if Food A tastes better in every dimension than Food B, I'll probably want to eat Food B every once in a while.

James: Considering the fact that the number of possible futures that are horrible beyond imagining is far, far greater than the number of even somewhat desirable futures, I would be content with a weirdtopia. Weirdtopia is the penumbra of the future light cone of desirable futures.

comment by steven · 2009-01-21T18:32:36.000Z · LW(p) · GW(p)

The fact that this future takes no meaningful steps toward solving suffering strikes me as a far more important Utopia fail than the gender separation thing.

comment by Manuel_Mörtelmaier · 2009-01-21T19:02:42.000Z · LW(p) · GW(p)

24 comments and no one got the reference yet?

Actually it's the other way round: The beginning of the first episode of the new TV series, especially the hands, and the globe, is a reference to your work, Eliezer.

comment by Doug_S. · 2009-01-21T19:33:17.000Z · LW(p) · GW(p)

Yes, I got the reference.

It just doesn't seem to be worth commenting on, as it's so tangential to the actual point of the post.

comment by Tiiba4 · 2009-01-21T19:38:12.000Z · LW(p) · GW(p)

Davis: "That's the most horrifying part of all, though--they won't so choose!"

Why is that horrifying? Life will be DIFFERENT? After a painful but brief transition, everyone will be much happier forever. Including the friends or lovers you were forced to abandon. I'm sorry if I can't bring myself to pity poor Mr. Grass. People from the 12th century would probably pity us too; well, screw them.

comment by MichaelAnissimov · 2009-01-21T19:42:22.000Z · LW(p) · GW(p)

The verthandi here sounds just as annoyingly selfless and self-conscious as Belldandy is in the series. Don't these creatures have any hobbies besides doing our dishes and kneeling in submissive positions?

Replies from: TuviaDulin
comment by TuviaDulin · 2012-04-01T19:33:58.593Z · LW(p) · GW(p)

Presumably, your own personal verthandi(s) would have other hobbies, because you would want them to.

Replies from: pnrjulius
comment by pnrjulius · 2012-06-06T22:03:02.332Z · LW(p) · GW(p)

Right, and that's exactly the point. She is your best possible partner---including being sentient, being intelligent, etc. I honestly have trouble seeing what's wrong with that.

Replies from: TuviaDulin
comment by TuviaDulin · 2012-06-07T08:14:15.950Z · LW(p) · GW(p)

The fact that she was designed just for me...that in itself would ruin it for me.

Replies from: bzealley
comment by bzealley · 2012-08-07T13:19:03.619Z · LW(p) · GW(p)

It's a bit questionable if the relationship is one way, but it could be designed to be a symmetric "best" for the companion too. Okay, more CPU cycles, but this reeks of hard take-off, which probably means new physics...

Also, a bit more technically but I hope worth adding - if the companion already exists in any possible world, the fact that you engineer a situation where you are able to perceive one another isn't creating a pattern ex nihilo, it's discovering one. Takes some of the wind out of the argument, although you still certainly have a point on privacy if the relationship is asymmetric.

comment by Manon_de_Gaillande · 2009-01-21T19:56:36.000Z · LW(p) · GW(p)

Oh please. Two random men are more alike than a random man and a random woman, okay, but seriously, a huge difference that makes it necessary to either rewrite minds to be more alike or separate them? First, anyone who prefers to socialize with the opposite gender (ever met a tomboy?) is going to go "Ew!". Second, I'm pretty sure there are more than two genders (if you want to say genderqueers are lying or mistaken, the burden of proof is on you). Third, neurotypicals can get along with autists just fine (when they, you know, actually try), and this makes the difference between genders look hoo-boy-tiiiiny. Fourth - hey, I like diversity! Not just knowing there are happy different minds somewhere in the universe - actually interacting with them. I want to sample ramensubspace every day over a cup of tea. No way I want to make people more alike.

Replies from: TuviaDulin, army1987, MugaSofer
comment by TuviaDulin · 2012-04-01T19:35:24.136Z · LW(p) · GW(p)

The clever fool doesn't seem to have taken these facts into account. He was a fool, after all.

comment by A1987dM (army1987) · 2012-12-26T21:03:39.043Z · LW(p) · GW(p)

Two random men are more alike than a random man and a random woman

For any two groups A and B, two random members of A are more alike than a random member of A and a random member of B, aren't they?

Replies from: army1987, None, J_Taylor, Nominull, Oligopsony
comment by A1987dM (army1987) · 2012-12-27T23:20:16.108Z · LW(p) · GW(p)

Not necessarily -- for example, if all the members of both groups are on a one-dimensional space, both groups have the same mean, and Group B had much smaller variance than Group A... But still.
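One way to make this concrete, as a sketch only: the comment does not name a distribution, so the normality assumption below is an addition for illustration, and "alike" is taken to mean a small expected absolute difference.

```latex
% Assumption (not in the comment): X, X' ~ N(mu, sigma_A^2) are drawn from Group A,
% Y ~ N(mu, sigma_B^2) is drawn from Group B, and all draws are independent.
\begin{align*}
\mathbb{E}\,|X - X'| &= \sqrt{\tfrac{2}{\pi}}\,\sqrt{2\sigma_A^2} = \frac{2\sigma_A}{\sqrt{\pi}}, \\
\mathbb{E}\,|X - Y|  &= \sqrt{\tfrac{2}{\pi}}\,\sqrt{\sigma_A^2 + \sigma_B^2}.
\end{align*}
% Hence E|X - Y| < E|X - X'| exactly when sigma_B < sigma_A.
```

Under those assumptions, a member of A is on average closer to a member of B than to another member of A whenever B has the smaller variance.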

comment by [deleted] · 2012-12-28T01:23:20.295Z · LW(p) · GW(p)

Most people are members of more than just one group.

Replies from: army1987
comment by A1987dM (army1987) · 2012-12-28T02:08:52.919Z · LW(p) · GW(p)

So?

Replies from: None
comment by [deleted] · 2012-12-28T03:55:05.190Z · LW(p) · GW(p)

Soooooo, real humans might be a mite more complicated than that, such that your summary does not usefully cover inferences about people.

Replies from: army1987
comment by A1987dM (army1987) · 2012-12-28T04:29:45.108Z · LW(p) · GW(p)

I don't see where I assumed that the groups were disjoint. My point was that "Two random men are more alike than a random man and a random woman", while technically true, isn't particularly informative about men and women.

Replies from: None
comment by [deleted] · 2012-12-28T04:45:22.328Z · LW(p) · GW(p)

Ah, my mistake. I thought you were saying that, given that your proposition is (asserted to be) true, the idea that two random men are more alike than a random man and woman must be meaningfully true.

comment by J_Taylor · 2012-12-28T05:45:11.744Z · LW(p) · GW(p)

What about cases in which group B is a subset of Group A?

comment by Nominull · 2012-12-28T08:12:43.060Z · LW(p) · GW(p)

No. A is [1,3,5,7], B is [4,4,4,4]. A random member of A will be closer to a random member of B than to another random member of A.
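A quick check of this counterexample, as a minimal sketch: it assumes "closer" means a smaller mean absolute difference and that "another random member" means a distinct one. The function names are illustrative, not from the thread.

```python
from itertools import combinations, product

A = [1, 3, 5, 7]
B = [4, 4, 4, 4]


def mean_within(group):
    """Mean absolute difference over all pairs of distinct members of one group."""
    diffs = [abs(x - y) for x, y in combinations(group, 2)]
    return sum(diffs) / len(diffs)


def mean_between(group1, group2):
    """Mean absolute difference over all cross-group pairs."""
    diffs = [abs(x - y) for x, y in product(group1, group2)]
    return sum(diffs) / len(diffs)


within_a = mean_within(A)        # pairs differ by 2, 4, 6, 2, 4, 2 -> 20 / 6
between_ab = mean_between(A, B)  # each member of A is 3 or 1 away from 4 -> 2.0

print(f"within A:  {within_a:.3f}")
print(f"A versus B: {between_ab:.3f}")
```

With these definitions the cross-group distance comes out smaller than the within-A distance, which is the point of the counterexample. Other notions of "alike" could give different numbers.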

Replies from: ygert
comment by ygert · 2012-12-28T09:16:32.437Z · LW(p) · GW(p)

I probably would say that that is because your two sets A and B do not carve reality at its joints. What I think army1987 intended to talk about is "real" sets, where a "real" set is defined as one that carves reality at its joints in one form or another.

Replies from: Qiaochu_Yuan, Kawoomba, army1987
comment by Qiaochu_Yuan · 2012-12-28T09:21:27.979Z · LW(p) · GW(p)

Let A = "humans" and B = "male humans."

comment by Kawoomba · 2012-12-28T09:43:33.002Z · LW(p) · GW(p)

What I think army1987 intended to talk about is "real" sets

There will be some real sets that are similar to Nominull's (well, natural numbers are a subset of reals, eh?), however army1987 did emphasize the any, so Nominull's correction was well warranted.

comment by A1987dM (army1987) · 2012-12-28T11:57:58.396Z · LW(p) · GW(p)

Er, no, I was just mistaken. (And forgot to retract the great-grandparent -- done now.) For a pair of sets that do carve reality at (one of) its joints but are still like that, try A = {(10, 0), (30, 0), (50, 0), (70, 0)} and B = {(40, 1), (40, 1), (40, 1), (40, 1)}.

(What I was thinking were cases where A = {10, 20, 30, 40} and B = {11, 21, 31, 41}, where it is the case that “two random members of A are more alike than a random member of A and a random member of B”, and my point was that “Two random men are more alike than a random man and a random woman” doesn't rule out {men} and {women} being like that.)

Replies from: ygert
comment by ygert · 2012-12-28T12:09:47.417Z · LW(p) · GW(p)

Ah, okay then. That makes sense.

comment by Oligopsony · 2012-12-28T11:27:34.094Z · LW(p) · GW(p)

I believe what Manon meant is that the difference in this case between two random members of the same class exceeds the difference between the average members of each class.

comment by MugaSofer · 2012-12-27T03:34:19.337Z · LW(p) · GW(p)

Leaving aside the fact that this was a failed utopia, I am troubled by your comment "neurotypicals can get along with autists just fine (when they, you know, actually try), and this makes the difference between genders look hoo-boy-tiiiiny." While it appears to be true, it is also true that even a minor change could easily render cooperation with another mind extremely difficult. Diversity has its cost. Freedom of speech means you can't arrest racists until they actually start killing Jews, for example.

Replies from: Nornagest, fubarobfusco, hairyfigment
comment by Nornagest · 2012-12-27T04:30:27.968Z · LW(p) · GW(p)

Freedom of speech means you can't arrest Nazis until they actually start killing Jews, for example

You need both freedom of speech and freedom of association for that, as long as you're talking about organized Nazis rather than lone nuts. And a governmental culture that takes both seriously as deontological imperatives and not as talking points to bandy about until they conflict with locking up people who actually violate serious taboos of speech and thought.

There are plenty of first-world countries that don't fully implement that combination.

Replies from: MugaSofer
comment by MugaSofer · 2012-12-27T16:15:14.248Z · LW(p) · GW(p)

talking points to bandy about until they conflict with locking up people who actually violate serious taboos of speech and thought.

Locking people up for violating "taboos of speech and thought" is clearly a violation of their freedom of speech (and freedom of opinion/belief, I suppose, but that one is less catchy.) Just as locking up anyone is a violation of their freedom of movement, and executing them is a violation of their right to life, and giving a psychotic drugs they think are spiders is a violation of their right to bodily integrity. Rights require compromise, and this is how it should be, because no bill of rights is perfectly Friendly.

comment by fubarobfusco · 2012-12-27T05:33:24.335Z · LW(p) · GW(p)

Freedom of speech means you can't arrest Nazis until they actually start killing Jews, for example

In point of fact, Nazis started threatening and assaulting Jews, vandalizing their businesses, and imposing weird new discriminatory rules on them, some years before the mass murder started in earnest. None of the above are generally taken to be protected by "freedom of speech".

Replies from: MugaSofer
comment by MugaSofer · 2012-12-27T17:10:50.472Z · LW(p) · GW(p)

It was such incidents I had in mind. Clearly, I was suffering from the illusion of transparency; I'll change it.

comment by hairyfigment · 2012-12-27T07:01:03.649Z · LW(p) · GW(p)

I'm pretty sure you can arrest Nazis when they start attacking other parties with the intention of overthrowing the government. Wiki says the following happened before they were officially Nazis:

Some 130 people attended; there were hecklers, but Hitler's military friends promptly ejected them by force, and the agitators "flew down the stairs with gashed heads."

Replies from: MugaSofer
comment by MugaSofer · 2012-12-27T16:02:40.187Z · LW(p) · GW(p)

It was such incidents I had in mind. Clearly, I was suffering from the illusion of transparency; I'll change it.

Replies from: hairyfigment
comment by hairyfigment · 2012-12-27T21:59:21.772Z · LW(p) · GW(p)

See, racists (even in a fairly strong sense) would often have been in power. I don't know what verbal beliefs you think characterize Nazis more than their willingness to use violence against particular targets. Hitler had belonged to (what they would later call) the Nazi Party for at most two months when the cited violence happened. He wouldn't write Mein Kampf for more than three years. Mussolini allegedly said,

The Socialists ask what our political program is. Our political program is to break the heads of the socialists.

Replies from: MugaSofer
comment by MugaSofer · 2012-12-29T19:39:21.788Z · LW(p) · GW(p)

I don't know what verbal beliefs you think characterize Nazis more than their willingness to use violence against particular targets.

You don't? Well, you may not have heard of this, but they had kind of a thing about Jews. Thought they were subhuman and corrupting society and all sorts of crazy shit.

Replies from: MixedNuts, hairyfigment
comment by MixedNuts · 2012-12-29T20:33:13.693Z · LW(p) · GW(p)

Is a typical Nazi closer to someone who privately thinks Jews are subhuman and corrupting society and is exactingly nice and friendly to everyone so that the Jewish conspiracy have nothing to use against her, or to someone who advocates violence up to and including mass murder against green-eyed manicurists on the grounds that they are subhuman and corrupt society?

Replies from: Oligopsony, MugaSofer
comment by Oligopsony · 2012-12-29T21:06:04.541Z · LW(p) · GW(p)

Temperamentally, or in terms of verbal beliefs?

Replies from: MixedNuts
comment by MixedNuts · 2012-12-29T22:03:47.770Z · LW(p) · GW(p)

Yes.

Replies from: Oligopsony
comment by Oligopsony · 2012-12-29T23:08:39.406Z · LW(p) · GW(p)

Well, let's compare Nazis to Ankharists. Ankharists if anything have a longer hitlist than Nazis, although they have nothing in particular against Jews. Are Ankharists more Nazi than Nazis? Uh, no. Ankharism is actually an entirely different ideology, with little in common besides the long hitlist (consisting of different targets.)

Of course with respect to the original question it's also true that there are lots of distinctions between National Socialism and the various ruling racist ideologies that preceded them other than hitlist as well, so.

Replies from: Douglas_Knight
comment by Douglas_Knight · 2014-07-15T21:11:50.219Z · LW(p) · GW(p)

What is Ankharism? Google does not find anyone but you using this word. I suspect you have fabricated an English word by transliterating from another language, but I cannot trace it. Somewhere you talk about Cambodia. Perhaps you mean Angkorism, a rare name for the ideology of the Khmer Rouge, after the Angkor Empire?

(There is also the Ankharite, named after the Egyptian Ankh, which may be displacing the term you use.)

Replies from: Oligopsony
comment by Oligopsony · 2014-07-25T02:21:20.967Z · LW(p) · GW(p)

It was a garbled version of Angkorism, sorry.

Replies from: NoriMori1992
comment by NoriMori1992 · 2022-10-17T01:28:37.027Z · LW(p) · GW(p)

I don't get any informative results from looking that up, either.

comment by MugaSofer · 2012-12-30T00:06:50.583Z · LW(p) · GW(p)

The latter, historically. However, focusing on the specific example is probably counterproductive, as it doesn't affect the point that certain verbal beliefs are dangerous; specifically those that stereotype, demonize and dehumanize particular groups. Obviously most who hold such beliefs will never attack anyone; but ... if they were restricted, there would be fewer hate crimes. This would cause irreparable damage to society in other ways, of course - that's rather the point.

comment by hairyfigment · 2012-12-29T22:44:10.019Z · LW(p) · GW(p)

Apparently people dispute that Georg Ratzinger published the same beliefs. But again, since I've apparently had trouble making myself understood: none of those verbal claims, at least the ones publicly known before the start of violence, distinguished the Nazis from other people (if not literally people like GR within the German government).

Replies from: MugaSofer
comment by MugaSofer · 2012-12-30T12:40:51.561Z · LW(p) · GW(p)

Oh, right. Well, it's certainly true that antisemitism was a lot more popular and socially acceptable before the Holocaust. But it was even more popular, socially acceptable, and extreme among Nazis.

comment by JamesAndrix · 2009-01-21T19:58:46.000Z · LW(p) · GW(p)

Nazgul: I concur. I wonder if Eliezer would press a button activating this future, given the risks of letting things go as they are.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-21T20:48:07.000Z · LW(p) · GW(p)

Second, I'm pretty sure there are more than two genders (if you want to say genderqueers are lying or mistaken, the burden of proof is on you).

Indeed. It's not clear from the story what happened to them, not to mention everyone who isn't heterosexual. Maybe they're on a moon somewhere?

Anissimov, I was trying to make the verthandi a bit more complicated a creature than Belldandy - not to mention that Keiichi and Belldandy still manage to have a frustrating relationship along, ahem, certain dimensions. It's just that "Belldandy" is the generic name for her sort, in the same way that "catgirl" is the generic name for a nonsentient sex object.

But let's have a bit of sympathy for her, please; how would you like to have been created five minutes ago, with no name and roughly generic memories and skills, and then dumped into that situation?

I have to say, although I expected in the abstract that people would disagree with me about Utopia, to find these particular disagreements still feels a bit shocking. I wonder if people are trying too hard to be contrarian - if the same people advocating this Utopia would be just as vehemently criticizing it, if the title of the post had been "Successful Utopia #4-2".

comment by Carl_Shulman · 2009-01-21T20:49:17.000Z · LW(p) · GW(p)

James,

"I have set guards in the air that prohibit lethal violence, and any damage less than lethal, your body shall repair." I'm not sure whether this would prohibit the attainment or creation of superintelligence (capable of overwhelming the guards), but if not then this doesn't do that much to resolve existential risks. Still, unaging beings would look to the future, and thus there would be plenty of people who remembered the personal effects of an FAI screw-up when it became possible to try again (although it might also lead to overconfidence).

comment by Vladimir_Nesov · 2009-01-21T20:51:29.000Z · LW(p) · GW(p)

What happened to the programmer, and are there computers around in the new setting? He managed to pull off a controlled superintelligence shutdown after all.

comment by Thom_Blake · 2009-01-21T20:54:28.000Z · LW(p) · GW(p)

James,

I wonder the same thing. Given that reality is allowed to kill us, it seems that this particular dystopia might be close enough to good. How close to death do you need to be before unleashing the possibly-flawed genie?

comment by Cabalamat2 · 2009-01-21T21:03:07.000Z · LW(p) · GW(p)

You should write SF, Eliezer.

comment by MichaelAnissimov · 2009-01-21T21:10:51.000Z · LW(p) · GW(p)

Eliezer, the character here does seem more subtle than Belldandy, but of course you only have so much room to develop it in a short story. I'm not criticizing your portrayal, which I think is fine, I'm just pointing out that such an entity is uniquely annoying by its very nature. I do feel sorry for her, but I would think that the Overmind would create her in a state of emotional serenity, if that were possible. Her anxious emotional state does add to the frantic confusion and paranoia of the whole story.

Though we in the community have discussed the possibility of instantly-created beings for some time, only recently I found out that the idea that God created the world with a false history has a name -- the Omphalos hypothesis. Not sure if you already knew, but others might find it useful as a search term for more thoughts on the topic.

This short story would make a good addition to the fiction section on your personal website.

Replies from: ChrisHallquist
comment by ChrisHallquist · 2012-05-21T06:53:20.370Z · LW(p) · GW(p)

One possible interpretation is that the AI realized that if it created her in a state of emotional serenity, Sam would find it creepy that she was calm in a situation he hated. On the other hand, having her freaking out at the beginning may, over the course of the next week, make it easier for Sam to relate to her and prevent him from transferring his hatred of the AI to her.

comment by JamesAndrix · 2009-01-21T21:37:03.000Z · LW(p) · GW(p)

On rereading: "Hate me if you wish, for I am the one who wants to do this to you."

This use of the word 'wants' struck me as a distinction Eliezer would make, rather than this character. That then reminded me of how much in-group jargon we use here. Will a paperclipper go foom before we have ems? Are there more than 1000 people that can understand the previous sentence?

Eliezer: I do like being contrarian, but I don't feel like I'm being contrarian in this. You may give too much credit to our gender. I suspect that if I were not already in a happy monogamous relationship, I wouldn't have many reservations to this at all. Your description of the verthandi makes her seem like a strict upgrade from Helen, and Stephen's only objection is that she is not Helen. (Fiction quibble: And couldn't the AI have obscured that?)

For many men, that's still a strict upgrade.

And I'll assume it's also part of Stephen's particular optimization that he only got one. Or else you gave us way too much credit.

Replies from: Kingreaper
comment by Kingreaper · 2010-12-08T18:39:26.761Z · LW(p) · GW(p)

(Fiction quibble: And couldn't the AI have obscured that?)

Highly likely to be one of the 107 restrictions; allowing an AI to lie makes it harder to control.

comment by bogdanb · 2009-01-21T21:55:34.000Z · LW(p) · GW(p)

Will Pearson: I'm going to skip quickly over the obvious problem that an AI, even much smarter than me, might not necessarily do what you mean rather than what (it thinks) you said. Let's assume that the AI somehow has an interface that allows you to tell exactly what you mean:

"that the AI would keep inside it a predicate Will_Pearson_would_regret_wish (based on what I would regret), and apply that to the universes it envisages while planning"

This is a bit analogous to Eliezer's "regret button" on the directed probability box, except that you always get to press the button. The first problem I see is that you need to define "regret" extremely well (i.e., understand human psychology better than I think is "easy", or even possible, right now), to avoid the possibility that there aren't any futures where you wouldn't regret the wish. (I don't say that's the case, I just say that you need to prove that it's not the case before reasonably making the wish.) This gets even harder with CNR.

If you're not able to do that, you risk the AI "freezing" the world and then spending the life of the Universe trying to find a plan that satisfies the predicate before continuing. (Note that this just requires that finding such a plan be hard enough that the biggest AI physically possible can't find it before it decays; it doesn't have to be impossible or take forever.)
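
The failure mode described above, a planner that filters envisaged futures through a regret predicate and may never find one that passes, can be sketched in a few lines. This is a toy illustration only: `find_acceptable_future`, `toy_would_regret`, and the `budget` parameter are hypothetical names, not anything from the story or the wish as stated, and the budget is an added safeguard that the original wish does not include.

```python
import itertools
from typing import Callable, Iterable, Optional, TypeVar

Future = TypeVar("Future")


def find_acceptable_future(
    candidates: Iterable[Future],
    would_regret: Callable[[Future], bool],
    budget: int,
) -> Optional[Future]:
    """Return the first candidate the predicate accepts, or None once the budget is spent."""
    for examined, future in enumerate(candidates):
        if examined >= budget:
            return None  # give up rather than search for the life of the Universe
        if not would_regret(future):
            return future
    return None  # candidates exhausted without finding a regret-free future


# Toy stand-in (hypothetical): futures are integers, and only a very distant one is regret-free.
def toy_would_regret(future: int) -> bool:
    return future != 1_000_000


# With a small budget the search stops empty-handed; with a large one it succeeds.
small = find_acceptable_future(itertools.count(), toy_would_regret, budget=1_000)
large = find_acceptable_future(itertools.count(), toy_would_regret, budget=2_000_000)
print(small, large)
```

Without the budget check, the same loop over an unbounded stream of candidates would simply never return if no candidate passes, which is the "freezing" scenario in miniature.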

We can't even assume that the AI will be "smart enough" to detect this kind of problem: it might simply be mathematically impossible to anticipate if a solution is possible, and the wish too "imperative" to allow the AI to stop the search.


In short, I don't really see why a machine inside the universe could simulate even one entire future light-cone of just one observer in the same universe, let alone find one where the observer doesn't regret the act. Depending on what the AI understands by "regret", even not doing anything may be impossible (perhaps it foresees you'll regret asking a silly wish, or something like that).

This doesn't mean that the wish is bad, just that I don't understand its possible consequences well enough to actually make it.

comment by Sebastian_Hagen2 · 2009-01-21T22:27:48.000Z · LW(p) · GW(p)

This use of the word 'wants' struck me as a distinction Eliezer would make, rather than this character.
Similarly, it's notable that the AI seems to use exactly the same interpretation of the word lie as Eliezer Yudkowsky: that's why it doesn't self-describe as an "Artificial Intelligence" until the verthandi uses the phrase.

Also, at the risk of being redundant: Great story.

comment by Allan_Crossman · 2009-01-21T23:24:26.000Z · LW(p) · GW(p)

Is this a "failed utopia" because human relationships are too sacred to break up, or is it a "failed utopia" because the AI knows what it should really have done but hasn't been programmed to do it?

Replies from: TuviaDulin, AdeleneDawner
comment by TuviaDulin · 2012-04-01T19:53:40.961Z · LW(p) · GW(p)

I don't see how those are mutually exclusive.

comment by AdeleneDawner · 2012-12-30T05:23:48.672Z · LW(p) · GW(p)

I think it's a failed utopia because it involves the AI modifying the humans' desires wholesale - the fact that it does so by proxy doesn't change that it's doing that.

(This may not be the only reason it's a failed utopia.)

comment by Roko · 2009-01-22T00:28:24.000Z · LW(p) · GW(p)

“This failure mode concerns the possibility that men and women simply weren’t crafted by evolution to make each other maximally happy, so an AI with an incentive to make everyone happy would just create appealing simulacra of the opposite gender for everyone. Here is my favorite part”

  • I would not consider this an outright failure mode. I suspect that a majority of people on the planet would prefer this “failure” to their current lives. I also suspect that a very significant portion of people in the UK would prefer it to their current lives.

I think that we will find that as we get into more subtle “FAI Failure modes”, the question as to whether there has been a failure or a success will lose any objective answer. This is because of moral anti-realism and the natural spread of human preferences, beliefs and opinions.

The same argument applies to the “personal fantasy world” failure mode. A lot of people would not count that as a failure.

[crossposted from Accelerating Future]

comment by Will_Pearson · 2009-01-22T00:39:19.000Z · LW(p) · GW(p)

Dognab, your arguments apply equally well to any planner. Planners have to consider the possible futures and pick the best one (using a form of predicate), and if you give them infinite horizons they may have trouble. Consider a paper clip maximizer: every second it fails to use its full ability to paper clip things in its vicinity, it is losing possible useful paper clipping energy to entropy (solar fusion etc). However, if it sits and thinks for a bit, it might discover a way to hop between galaxies with minimal energy. So what decision should it make? Obviously it would want to run some simulations to see if there are gaps in its knowledge. How detailed should those simulations be, so it can be sure it has ruled out the galaxy-hopping path?

I'll admit I was abusing the genie-trope somewhat. But then I am sceptical of FOOMing anyway, so when asked to think about genies/utopias, I tend to suspend all disbelief in what can be done.

Oh, and Belldandy is not annoying because she has broken down in tears (perfectly natural), but because she bases her happiness too much on what Stephen Grass thinks of her. A perfect mate for me would tell me straight what was going on and if I hated her for it (when not her fault at all), she'd find someone else because I'm not worth falling in love with. I'd want someone with standards for me to meet, not unconditional creepy fawning.

comment by Roko · 2009-01-22T00:41:23.000Z · LW(p) · GW(p)

Quick poll:

Suppose you had the choice between this "failed" utopia, and a version of earth where 2009 standards of living were maintained "by magic" forever, including old age and death, third world poverty, limited human intelligence, etc.

Who here would prefer "failed utopia 4-2", who would prefer "2009 forever"? Post your vote in the comments.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-22T00:47:19.000Z · LW(p) · GW(p)

I wonder if the converse story, Failed Utopia #2-4 of Helen and the boreana, would get the same proportion of comments from women on how that was a perfectly fine world.

I wonder how bad I would actually have to make a Utopia before people stopped trying to defend it.

The number of people who think this scenario seems "good enough" or an "amazing improvement", makes me wonder what would happen if I tried showing off what I consider to be an actual shot at Applied Fun Theory. My suspicion is that people would turn around and criticize it - that what we're really seeing here is contrarianism. But if not - if this world indeed ranks lower in my preference ordering, just because I have better scenarios to compare it to - then what happens if I write the Successful Utopia story?

Replies from: arfle, Articulator, LauralH, NoriMori1992
comment by arfle · 2010-07-07T22:41:52.139Z · LW(p) · GW(p)

#1 Successful Utopia story

#2 ?

#3 Money!

comment by Articulator · 2013-06-13T20:02:52.417Z · LW(p) · GW(p)

I have to say, though I recognize that this is four years on, I would be extremely interested in your actual shot at Applied Fun Theory. The best thing I've ever read in that category so far is Iceman's Friendship is Optimal, which you of course are already aware of.

I, along with several others, was perplexed at your distaste for the world it portrayed, and while I'm sure better could be achieved, I'd be interested to see exactly where you'd go, if you found FiO actual horror material.

comment by LauralH · 2015-02-24T09:30:40.374Z · LW(p) · GW(p)

Of course women would be smarter about sexual "utopias" than men. I mean no offense, biologically women have to be less impulsive about that sort of thing.

comment by NoriMori1992 · 2022-10-17T01:40:15.855Z · LW(p) · GW(p)

makes me wonder what would happen if I tried showing off what I consider to be an actual shot at Applied Fun Theory. My suspicion is that people would turn around and criticize it

I can't tell if dath ilan (as portrayed in Project Lawful and elsewhere) is supposed to be "an actual shot at Applied Fun Theory", and I'm somewhat leaning towards thinking it isn't, but if it is, then your prediction is correct for at least one person. (Though I would probably still move there because it still sounds better than what I've got now. Honestly, I'd move there just for the Quiet Cities.)

that what we're really seeing here is contrarianism

That would not be the only explanation for people calling your "Failed Utopia" not that bad and your "Successful Utopia" terrible.

I wonder how bad I would actually have to make a Utopia before people stopped trying to defend it.

If people are defending it, maybe that means it actually just isn't that bad. I know I don't need to tell you that "badness" isn't a thing that exists in the aether, it's a function of how people feel about things. (Edit: Of course, I know "it actually just isn't that bad" isn't the only explanation for people defending it. Just thought it was an explanation worth considering.)

comment by Will_Pearson · 2009-01-22T01:15:00.000Z · LW(p) · GW(p)

Eliezer, didn't you say that humans weren't designed as optimizers? That we satisfice. The reaction you got is probably a reflection of that. The scenario ticks most of the boxes humans have, existence, self-determination, happiness and meaningful goals. The paper clipper scenario ticks none. It makes complete sense for a satisficer to pick it instead of annihilation. I would expect that some people would even be satisfied by a singularity scenario that kept death as long as it removed the chance of existential risk.

comment by Nanani2 · 2009-01-22T01:18:00.000Z · LW(p) · GW(p)

Oh please not boreana.
Many of us women vastly prefer marsterii, and I must assume including both would make Venus somewhat unstable and dusty.

Replies from: Normal_Anomaly
comment by Normal_Anomaly · 2012-05-09T21:32:53.020Z · LW(p) · GW(p)

Nanani2 left in 2009, but can somebody else explain "marsterii"?

Replies from: Alicorn
comment by Alicorn · 2012-05-09T22:45:41.992Z · LW(p) · GW(p)

"Boreana" is a reference to David Boreanaz, who Eliezer presumably knows of via his portrayal of the vampire "Angel" in Buffy and Angel's own eponymous spinoff series. In same, there is another vampire "Spike" portrayed by James Marsters.

comment by CarlShulman · 2009-01-22T01:27:00.000Z · LW(p) · GW(p)

""good enough" or an "amazing improvement""
Some people may blur those together, but logarithmic perception of rewards and narrow conscious aims explain a lot. Agelessness, invulnerability to violence, ideal mates, and a happy future once technology is re-established, to the limits of the AI's optimization capability (although I wonder if that means it has calculated we're likely to become wireheads the next time around, or otherwise create a happiness-inducer that indirectly bypasses some of the 107 rules) satisfy a lot of desires. Especially for immortality-obsessed transhumanists. And hedonists. Not to mention: singles.

comment by CarlShulman · 2009-01-22T01:32:00.000Z · LW(p) · GW(p)

"My suspicion is that people would turn around and criticize it - that what we're really seeing here is contrarianism."
Or perhaps your preferences are unusual, both because of values and because of time pondering the issue. This scenario has concrete rewards tickling the major concerns of most humans. Your serious application of Fun Theory would be further removed from today's issues: fear of death, lack of desirable mates, etc, and might attract criticism because of that.

comment by steven · 2009-01-22T02:04:00.000Z · LW(p) · GW(p)

"boreana"

This means "half Bolivian half Korean" according to urbandictionary. I bet I'm missing something.

Perhaps we should have a word ("mehtopia"?) for any future that's much better than our world but much worse than could be. I don't think the world in this story qualifies for that; I hate to be negative guy all the time but if you keep human nature the same and "set guards in the air that prohibit lethal violence, and any damage less than lethal, your body shall repair", they still may abuse one another a lot physically and emotionally. Also I'm not keen on having to do a space race against a whole planet full of regenerating vampires.

comment by AC2 · 2009-01-22T02:06:00.000Z · LW(p) · GW(p)

Remember, Eliezer, that what we're comparing this life to when saying 'hmm, it's not that bad' is

1) Current life, averaged over the entire human species including the poor regions of Africa. Definitely an improvement over that.
2) The paperclipping of the world, which was even mostly avoided.

It's not a successful utopia, because it could be better; significantly better. It's not a failed one, because people are still alive and going to be pretty happy after an adjustment period.

Much of what you've been building up in many of your posts, especially before this latest Fun Theory sequence, is "we have to do this damn right or else we're all dead or worse". This is not worse than death, and in fact might even be better than our current condition; hence the disagreement with characterizing this as a horrible horrible outcome.

comment by Nominull2 · 2009-01-22T02:58:00.000Z · LW(p) · GW(p)

It seems like the people who are not happily married get a pretty good deal out of this, though? I'm not sure I understand how 90% of humanity ends up wishing death on the genie. Maybe 10% of humanity had a fulfilling relationship broken up, and 80% are just knee-jerk luddites.

Replies from: Normal_Anomaly
comment by Normal_Anomaly · 2012-05-09T21:35:52.377Z · LW(p) · GW(p)

It wouldn't be just happily married people. It'd be them plus all the people who had close friends of the opposite gender, plus everyone who doesn't want to be separated from their family of the other gender, plus everybody who knew someone like that and sympathized with them.

comment by mitchell_porter2 · 2009-01-22T03:20:00.000Z · LW(p) · GW(p)

This is what I think of as a "mildly unfriendly" outcome. People still end up happy, but before the change, they would not have wanted the outcome. One way for that to happen involves the AI forcibly changing value systems, so that everyone suddenly has an enthusiasm for whatever imperatives it wishes to impose. In this story, as I understand it, there isn't even alteration of values, just a situation constructed to induce the victory of one set of values (everything involved in the quest for a loved one) over another set of values (fidelity to the existing loved one), in a way which violates the protagonist's preferred hierarchy of values.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-22T03:55:00.000Z · LW(p) · GW(p)

Okay, just to disclaim this clearly, I probably would press the button that instantly swaps us to this world - but that's because right now people are dying, and this world implies a longer time to work on FAI 2.0.

But the Wrinkled Genie scenario is not supposed to be probable or attainable - most programmers this stupid just kill you, I think.

"Mehtopia" seems like a good word for this kind of sub-Utopia. Steven's good at neologisms!

I should also note that I did do some further optimizing in my head of the verthandi - yes, they have different individual personalities, yes guys sometimes reject them and they move on, etcetera etcetera - but most of that background proved irrelevant to the story. I shouldn't really be saying this, because the reader has the right to read fiction any way they like - but please don't go assuming that I was conceptualizing the verthandi as uniform doormats.

Some guys probably would genuinely enjoy doormats, though, and so verthandi doormats will exist in their statistical distribution. To give the verthandi a feminist interpretation would quite miss the point. If there are verthandi feminists, their existence is predicated on the existence of men who are attracted to feminists, and I'm reasonably sure that's not what feminism is about.

If you google boreana you should get an idea of where that term comes from, same as verthandi.

It seems like the people who are not happily married get a pretty good deal out of this, though? I'm not sure I understand how 90% of humanity ends up wishing death on the genie.

Good point, Nominull - though even if you're not married, you can still have a mother. Maybe the Wrinkled Genie could just not tell the singles about the verthandi as yet - just that they'd been stripped of technology and sent Elsewhere - but that implies the Wrinkled Genie deliberately planning its own death (as opposed to just planning for its own death), and that wasn't what I had in mind.

comment by Kaj_Sotala · 2009-01-22T04:27:00.000Z · LW(p) · GW(p)

90% also seems awfully high for a fail-safe limit. Why not 70%, 50% or even less? You could just change the number and that'd fix the issue.

I also tend to lean towards the "not half as bad" camp, though a bit of that is probably contrarianism. And I do know futures that'd rank higher in my preference ordering than this. Still, it's having a bit of a weirdtopia effect on me - not at all what I'd have imagined as a utopia at first, but strangely appealing when I think more of it... (haven't thought about it for long enough to know if that change keeps up the more I think of it)

comment by JamesAndrix · 2009-01-22T05:55:00.000Z · LW(p) · GW(p)

Eliezer:
I'd say most of the 'optimism' for this is because you've convinced us that much worse situations are much more likely.

Also, we're picking out the one big thing the AI did wrong that the story is about, and ignoring other things it did wrong. (leaving no technology, kidnapping, creation of likely to be enslaved sentients) I'm sure there's an already named bias for only looking at 'big' effects.

And we're probably discounting how much better it could have been. All we got was perfect partners, immortality, and one more planet than we had before. But we don't count the difference between singularity-utopia and #4-2 as a loss.

Replies from: rkyeun
comment by rkyeun · 2012-07-28T22:32:18.486Z · LW(p) · GW(p)

Two more planets than we had before. Men are from Mars, Women are...

Replies from: kibber
comment by kibber · 2012-09-21T22:21:02.267Z · LW(p) · GW(p)

...from Venus, and only animals left on Earth, so one more planet than we had before.

Replies from: rkyeun
comment by rkyeun · 2012-09-26T13:49:23.636Z · LW(p) · GW(p)

Well, until we get back there. It's still ours even if we're on vacation.

comment by jb6 · 2009-01-22T12:26:00.000Z · LW(p) · GW(p)

An excellent story, in the sense that it communicates the magnitude of the kinds of mistakes that can be made, even when one is wise and prudent (or imagines oneself so). I note with more than some amusement that people are busy in the comments adding stricture 108, 109, 110 - as if somehow just another layer or two, and everything would be great! (Leela: "The iceberg penetrated all 7000 hulls!" Fry: "When will humanity learn to make a ship with 7001 hulls!")

Nicely done.

comment by Cyan2 · 2009-01-22T14:27:00.000Z · LW(p) · GW(p)
If you google boreana you should get an idea of where that term comes from, same as verthandi.

Still need a little help. Top hits appear to be David Boreanaz, a plant in the Rue family, and a moth.

comment by Russell_Wallace · 2009-01-22T14:55:00.000Z · LW(p) · GW(p)
But if not - if this world indeed ranks lower in my preference ordering, just because I have better scenarios to compare it to - then what happens if I write the Successful Utopia story?

Try it and see! It would be interesting and constructive, and if people still disagree with your assessment, well then there will be something meaningful to argue about.

comment by Emile · 2009-01-22T16:08:00.000Z · LW(p) · GW(p)

Great story!

This use of the word 'wants' struck me as a distinction Eliezer would make, rather than this character.
Similarly, it's notable that the AI seems to use exactly the same interpretation of the word lie as Eliezer Yudkowsky: that's why it doesn't self-describe as an "Artificial Intelligence" until the verthandi uses the phrase.

... neither of those is unusual if you consider that the very nearly wise fool was Eliezer Yudkowsky.

(Rule 76: "... except for me. I get my volcano base with catgirls.")

comment by Khannea_Suntzu · 2009-01-22T16:50:00.000Z · LW(p) · GW(p)

I am sorry.

I must not be a human being to not see any problem in this scenario. I can vaguely see that many humans would be troubled by this, but I wouldn't be. Maybe to me humanity is dead already, ambiguity intentional.

I welcome your little scary story as currently to me the world is hell.

comment by Abigail · 2009-01-22T16:57:00.000Z · LW(p) · GW(p)

"Men and women can make each other somewhat happy, but not most happy" said the genie/ AI.

What will make one individual "happy" will not work for the whole species. I would want the AI to interview me about my wants: I find Control makes me happier than anything, not having control bothers me. Control between fifty options which will benefit me would be good enough, I do not necessarily need to be able to choose the bad ones...er...

Being immortal and not being able to age, and being cured of any injury, sound pretty good to me. It is not just contrarianism that makes people praise this world.

Please do write your "actual shot at applied fun theory".

comment by NancyLebovitz · 2009-01-22T17:07:00.000Z · LW(p) · GW(p)

Science fiction fandom makes me happy. Tear it into two separate pieces, and the social network is seriously damaged.

Without going into details, I have some issues about romantic relationships-- it's conceivable that a boreana could make me happy (and I'm curious about what you imagine a boreana to be like), but I would consider that to be direct adjustment of my mind, or as nearly so as to not be different.

More generally, people tend to have friends and family members of the other sex. A twenty-year minimum separation is going to be rough, even if you've got "perfect" romantic partners.

If I were in charge of shaping utopia, I'd start with a gigantic survey of what people want, and then see how much of it can be harmonized. That would at least be a problem hard enough to be interesting for an AI.

If that's not feasible, I agree that some incremental approach is needed.

Alternatively, how about a mildly friendly AI that just protects us from hostile AIs and major threats to the existence of the human race? I realize that the human race will be somewhat hard to define, but that's just as much of a problem for the "I just want to make you happy" AI.

comment by Angel · 2009-01-22T17:14:00.000Z · LW(p) · GW(p)

"Top hits appear to be David Boreanaz,"

Eliezer is a Buffy fan.

comment by Psy-Kosh · 2009-01-22T17:16:00.000Z · LW(p) · GW(p)

Khannea: Eliezer himself said that he'd take that world over this one, if for no other reason than that world buys more time to work, since people aren't dying.

However, we can certainly see things that could be better... We can look at that world and say "eeeh, there're things we'd want different instead"

The whole "enforced breaking up of relationships" thing, for one, is a bit of a problem.

comment by Doug_S. · 2009-01-22T18:06:00.000Z · LW(p) · GW(p)

Although having the girl of my dreams would certainly be nice, I'd soon be pissed off at the lack of all the STUFF that I like and have accumulated. No more getting together with buddies and playing Super Smash Bros (or other video games) for hours? No Internet to surf and discuss politics and such on? No more Magic: the Gathering?

Screw that!

Replies from: Carinthium, TuviaDulin
comment by Carinthium · 2010-11-23T12:58:41.645Z · LW(p) · GW(p)

If the A.I. had any real brains, those things would be available as well (at least between people of the same, and possibly even the opposite sex).

comment by TuviaDulin · 2012-04-01T20:11:11.854Z · LW(p) · GW(p)

Personally, knowing that my verthandis were created specifically for me would make me want them less. Even if they were strong-willed, intelligent, and independent, I'd still -know- that their existence is tailored to suit my tastes, and this would prevent me from seeing them as real people. And I'd want real women.

Replies from: SteveJordan
comment by SteveJordan · 2017-11-30T02:55:30.934Z · LW(p) · GW(p)

I know this is way past its expiry date, but I have to ask:

Exactly how much dysfunction / argument / neglect / abuse would it take to make you happy? Your organic brain just isn't that complex compared to an artilect like that Genie. It sounds like that would be baked in to your Verthandi. If she's a modified em, then she's functionally as "human" as any of us.

Or perhaps you'd need her to come after you with a carving knife to persuade you she's genuinely hurt by your rejection?

comment by Roko · 2009-01-22T18:30:00.000Z · LW(p) · GW(p)

Doug: "Although having the girl of my dreams would certainly be nice, I'd soon be pissed off at the lack of all the STUFF that I like and have accumulated. No more getting together with buddies and playing Super Smash Bros (or other video games) for hours? No Internet to surf and discuss politics and such on? No more Magic: the Gathering?

Screw that!"

You'd rather play "Magic: the Gathering" than get laid? WTF?

comment by Caledonian2 · 2009-01-22T19:42:00.000Z · LW(p) · GW(p)

Because I'm curious:

How much evidence, and what kind, would be necessary before suspicions of contrarianism are rejected in favor of the conclusion that the belief was wrong?

Surely this is a relevant question for a Bayesian.

comment by Thom_Blake · 2009-01-22T19:44:00.000Z · LW(p) · GW(p)

Doug S,

Indeed. The AI wasn't paying attention if he thought bringing me to this place was going to make me happier. My stuff is part of who I am; without my stuff he's quite nearly killed me. Even more so when 'stuff' includes wife and friends.

But then, he was raised by one person so there's no reason to think he wouldn't believe in wrong metaphysics of self.

comment by Doug_S. · 2009-01-22T20:44:00.000Z · LW(p) · GW(p)

Roko: Yes. Yes I would.

There are plenty of individual moments in which I would rather get laid than play Magic, but on balance, I find Magic to be a more worthwhile endeavor than I imagine casual sex to be. The feeling I got from this achievement was better and far longer lasting than the feelings I get from masturbation. Furthermore, you can't exactly spend every waking moment having sex, and "getting laid" is not exactly something that is completely impossible in the real world, either.

Also, even though I'm sure that simply interacting with the girl of my dreams in non-sexual ways would, indeed, be a great source of happiness in and of itself, I'd still be frustrated that we couldn't do all the things that I like to do together!

comment by Sold_my_Power_Nine_to_donate_to_SIAI · 2009-01-22T21:03:00.000Z · LW(p) · GW(p)

Ah, discussion of the joys of Magic: the Gathering on Overcoming Bias.

It's like all the good stuff converges in one place :)

comment by steven · 2009-01-22T21:30:00.000Z · LW(p) · GW(p)

In view of the Dunbar thing I wonder what people here see as a eudaimonically optimal population density. 6 billion people on Mars, if you allow for like 2/3 oceans and wilderness, means a population density of 100 per square kilometer, which sounds really really high for a cookie-gatherer civilization. It means if you live in groups of 100 you can just about see the neighbors in all directions.
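A rough check of the arithmetic above, as a minimal sketch. The mean radius of Mars is an assumed figure not given in the comment, and "just about see the neighbors" is read here as the typical distance between evenly spread groups of 100; both are assumptions for illustration.

```python
import math

MARS_RADIUS_KM = 3389.5      # assumed mean radius of Mars, not from the comment
POPULATION = 6e9             # "6 billion people"
HABITABLE_FRACTION = 1 / 3   # the remaining 2/3 is oceans and wilderness
GROUP_SIZE = 100             # "groups of 100"

# Surface area of a sphere, then the third of it that people live on.
surface_km2 = 4 * math.pi * MARS_RADIUS_KM ** 2
habitable_km2 = surface_km2 * HABITABLE_FRACTION

# People per square kilometre on the habitable part.
density = POPULATION / habitable_km2

# If groups are spread evenly, each group gets 1 / groups_per_km2 of area,
# so the typical spacing between neighbouring groups is its square root.
groups_per_km2 = density / GROUP_SIZE
spacing_km = 1 / math.sqrt(groups_per_km2)

print(f"density: {density:.0f} people per km^2")
print(f"spacing between groups of {GROUP_SIZE}: {spacing_km:.2f} km")
```

Under these assumptions the density comes out somewhat above 100 per square kilometer and neighbouring groups sit a bit under a kilometer apart, which is consistent with the comment's estimate.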

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-22T21:48:00.000Z · LW(p) · GW(p)

Since people seem to be reading too much into the way the Wrinkled Genie talks, I'll note that I wrote this story in one night (that was the goal I set myself) and that the faster I write, the more all of my characters sound like me and the less they have distinctive personalities. Stories in which the character gets a genuine individual voice are a lot more work and require a lot more background visualization.

Steven, I didn't do that calculation. Well, first of all I guess that Mars doesn't end up as 2/3 ocean, and second, we'll take some mass off the heavier Venus and expand Mars to give it a larger surface area. That's fair.

comment by JulianMorrison · 2009-01-22T21:59:00.000Z · LW(p) · GW(p)

Eliezer, you're cheating. Getting trapped makes this a dystopia. It would make almost anything a dystopia. Lazy!

Suppose a similar AI (built a little closer to Friendly) decided to introduce verthandi and the pro-female equivalent (I propose "ojisamas") into an otherwise unchanged earth. Can you argue that is an amputation of destiny? Per my thinking, all you've done is doubled the number of genders and much increased the number of sexual orientations, to the betterment of everyone. (What do you call a verthandi who prefers to love an ojisama?)

comment by Gregory_Lemieux · 2009-01-22T22:31:00.000Z · LW(p) · GW(p)

Angel: "Eliezer is a Buffy fan"

Wow, I hope they have chiropractors on Venus for all the Stoopy McBroodingtons lurking around like Angel. Every time he popped up on Buffy I kept wanting to fix his posture.

comment by Cyan2 · 2009-01-23T00:09:00.000Z · LW(p) · GW(p)

Huh. I guess I just don't see Angel (the TV character, not the commenter) as the equivalent of the verthandi. (Also, naming the idea after the actor instead of the character led me somewhat astray.)

comment by Jon2 · 2009-01-23T03:38:00.000Z · LW(p) · GW(p)

Sure this isn't a utopia for someone who wants to preserve "suboptimal" portions of his/her history because they hold some individual significance. But it seems a pretty darn good utopia for a pair of newly created beings. A sort of Garden of Eden scenario.

comment by Doug_S. · 2009-01-23T04:34:00.000Z · LW(p) · GW(p)

As for what to call the female equivalent of the "verthandi" - well, Edward Cullen of the recent Twilight series was intended by the author to be a blatant female wish fulfillment/idealized boyfriend character, although the stories and character rub an awful lot of people the wrong way.

comment by bogdanb · 2009-01-24T14:46:00.000Z · LW(p) · GW(p)

Will Pearson: your arguments apply equally well to any planner. Planners have to consider the possible futures and pick the best one (using a form of predicate), and if you give them infinite horizons they may have trouble.

True, whenever you have a planner for a maximizer, it has to decide how to divide its resources between planning and actually executing a plan.

However, your wish needs a satisfier: it needs to find at least one solution that satisfies the predicate "I wouldn't regret it".

The maximizer problem has a "strong" version which translates to "give me the maximum possible in the universe", which is obviously a satisfier problem (i.e., find a solution that satisfies the predicate "is optimal", then implement it). But you can always reformulate these in a "weak" version: "find a way of creating benefit; then use x% resources to find better ways of maximizing benefit, and the rest to implement the best techniques at the moment", with 0 < x < 100 an arbitrary fraction. (Note that the "find better ways" part can change the fraction if it's sure it would improve the final result.)

So, if you just like paperclips and just want a lot of those, you can just run the weak version of the maximizer and be done with it: you're certain to get a lot of something as long as it's possible.

But for satisfiability problems, you might just have picked a problem that doesn't have a solution. Both "find a future I wouldn't regret" and "make the maximum number of paperclips possible in this Universe" are such satisfiability problems. (I don't know if these problems in particular have a "findable" solution, however, nor how to determine it. The point is that they might not, so it's possible to spend the lifetime of the Universe for nothing.)

The only idea of an equivalent "weak" reformulation would be to say "use X resources (this includes time) to try to find a solution". This doesn't seem as acceptable to me: you might still spend X resources and get zero results. (As opposed to the "weak" maximizer, where you still get something as long as it's possible.) (But maybe that's just because I don't care about paperclips that much, I don't know.)
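The contrast between the two "weak" reformulations can be sketched in code. This is a minimal illustration, not anything from the original discussion: the function names, the evaluation budget, and the toy "plans" are all hypothetical, and the sketch leaves out the x% split between searching and implementing.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def weak_maximizer(candidates: Iterable[T],
                   score: Callable[[T], float],
                   budget: int) -> Optional[T]:
    """Anytime search: return the best candidate seen within `budget` evaluations."""
    best: Optional[T] = None
    best_score = float("-inf")
    for i, candidate in enumerate(candidates):
        if i >= budget:
            break
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best


def bounded_satisfier(candidates: Iterable[T],
                      predicate: Callable[[T], bool],
                      budget: int) -> Optional[T]:
    """Return the first candidate satisfying `predicate`, or None once the budget is spent."""
    for i, candidate in enumerate(candidates):
        if i >= budget:
            break
        if predicate(candidate):
            return candidate
    return None


# Hypothetical plans, each labelled by how many paperclips it would yield.
plans = range(1, 1001)

# The weak maximizer always has something to show for its budget.
best_so_far = weak_maximizer(plans, score=lambda n: n, budget=50)

# The satisfier can spend the same budget and come back empty-handed.
no_regret_plan = bounded_satisfier(plans, predicate=lambda n: n > 10**6, budget=50)

print(best_so_far, no_regret_plan)
```

The design point is only that the first function degrades gracefully as the budget shrinks, while the second has an all-or-nothing outcome.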

*

Now, if you absolutely want to satisfy a predicate, you just don't have any alternative to spending all your resources on that. OK. But are you sure that "no regrets" is an absolutely necessary condition on the future? Actually, are you sure enough of that that you'd be willing to give up everything for the unknown chance of getting it?

comment by Will_Pearson · 2009-01-24T15:05:00.000Z · LW(p) · GW(p)

Reformulate to least regret after a certain time period, if you really want to worry about the resource usage of the genie.

comment by Christopher_Carr · 2009-01-29T10:10:00.000Z · LW(p) · GW(p)

There's almost a Gene Wolfe feel to the prose, which is, of course, a compliment.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-01-29T12:04:00.000Z · LW(p) · GW(p)
There's almost a Gene Wolfe feel to the prose, which is, of course, a compliment.

I don't usually do the modesty thing, because it feels like handing a gift back to the person who tried to give it to you. But on this occasion - sir, I feel that you praise me way, way, way too highly.

comment by flamoot · 2009-01-29T19:57:00.000Z · LW(p) · GW(p)

SUPER STORY WOULD READ AGAIN

comment by Zubon · 2009-01-29T23:14:00.000Z · LW(p) · GW(p)

Eliezer, since you are rejecting the Wolfean praise, I will take the constructive criticism route. This is not your best writing, but you know that since you spent a night on it.

We have three thousand words here. The first thousand are disorientation and describing the room and its occupants. The second thousand is a block of exposition from the wrinkled figure. The third thousand is an expression of outrage and despair. Not a horrid structure, although you would want to trim the first and have the second be less of a barely interrupted monologue.

As a story, the dominant problem is that the characters are standing in a blank room being told what has already happened, and that "what" is mostly "I learned then changed things all at once." There have been stories that do "we are just in a room talking" well or badly; the better ones usually either make the "what happened" very active (essentially a frame story) or accept the recumbent position and make it entirely cerebral; the worse ones usually fall into a muddled in-between.

As a moral lesson, the fridge logic keeps hitting you in these comments, notably that this is a pure Pareto improvement for much of the species. Even as a failed utopia, you accept it as a better place from which to work on a real one. And 89.8% want to kill the AI? The next most common objection has been how this works outside heteronormativity, or for a broad range of sexual preferences. Enabling endless non-fatal torture is another winner for "how well did you think that through?" So it is not bad enough to fulfill its intent, its "catch" seems inadequately conceived, and there are other problems that make the whole scenario suspect.

My first thought of a specific way to better fulfill the story's goals would be to tell it from Helen's perspective, or at least put more focus on her and Lisa. You have many male comments of "hey, not bad." They are thinking of their own situations. They are not thinking of their wives and daughters being sexually serviced by boreana. The AI gets one line about this, but Stephen seems more worried about his fidelity than hers. With a substantially male audience, that is where you want to shove the dagger. Take it in the other direction by having the AI be helpful to Helen. While she does not want to accept her overwhelming attraction to her crafted partner, the AI wants her to make a clean break so she can be happier. It will gladly tell her about how Stephen's partner is more attractive to him than she could ever be, how long it will take for his affection to be alienated, and how rarely he will think about Helen after they have spent more time on different planets than they spent in the same house. Keep the sense of family separation by either making the child a son or noting that the daughter is somewhere on the planet, happier beyond her mother's control; in either case, note that s/he also woke up with a very attractive member of the opposite sex whose only purpose in life is to please him/her. This could be the point to note those male sexual enhancements, and monogamy is not what makes everyone happiest, so maybe Lisa wakes up with a few boreana.

And maybe this is just me, but the AI could seem a bit less like the Dungeonmaster from the old D&D cartoon.

comment by Damien_R._S. · 2009-02-04T18:24:00.000Z · LW(p) · GW(p)

The story has problems, and it's not clear how it's meant to be taken.

Way 1: we should believe the SAI, being a SAI, and so everyone will in fact be happier within a week. This creates cognitive dissonance, what with the scenario seeming flawed to us, and putting us in a position of rejecting a scenario that makes us happier.

Way 2: we should trust our reason, and evaluate the scenario on its own merits. This creates the cognitive dissonance of the SAI being really stupid. Yeah, being immortal and having a nice companion and good life support and protection is good, but it's a failed utopia because it's trivially improvable. The fridge logic is strong in this one, and much has been pointed out already: gays, opposite-sex friends, family. More specific than family: children. What happened to the five year olds in this scenario?

The AI was apparently programmed by a man who had no close female friends, no children, and was not close to his mother. Otherwise the idea that either catgirls or Belldandies should lead to a natural separation of the sexes would not occur. (Is the moral that such people should not be allowed to define gods? Duh.) If I had a catgirl/non-sentient sexbot, that would not make me spend less time with true female friends, or stop calling my mother (were she still alive). Catgirl doesn't play Settlers of Catan or D&D or talk about politics. A Belldandy might, in the sense that finding a perfect mate often leads to spending less time with friends, but it still needn't mean being happy with them being cut off, or being unreceptive to meeting new friends of either sex.

So yeah, it's a pretty bad utopia, defensible only in the "hey, not dying or physically starving" way. But it's implausibly bad, because it could be so much better by doing less work: immortalize people on Earth, angelnet Earth, give people the option of summoning an Idealized Companion. Your AI had to go to more effort for less result, and shouldn't have followed this path if it had any consultation with remotely normal people. (Where are the children?)

Replies from: Cronocke, None
comment by Cronocke · 2011-02-15T06:20:39.843Z · LW(p) · GW(p)

I think Way 2 was what the author intended - it's not actually meant to be a true utopia. Thus "failed utopia".

But the story raises a couple of interesting questions that I don't see answered.

How did the AI do all this, given the confines of human technology at the time it was set?

And if the AI could do it... what's stopping a human from doing the same?

I envision someone having those precise thoughts on either Mars or Venus, and (either swiftly or gradually) discovering the methods needed to alter reality the same way the AI did. Soon, everything is set, if not "right", at the very least back to "normal".

... although perhaps the "perfect" mates are given their own distant world to live on, and grow without worry of human intervention anytime soon.

... it probably says something about me that I'd also, if I were this person, want to restore the AI to "life" just to trap it in a distant prison from which it can observe humanity, but not interact with anything... as a form of poetic justice for the distant prisons it tried to place humanity within.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-02-15T16:40:22.361Z · LW(p) · GW(p)

Of course, then you'd just have lots of people throwing up on the sands of Earth, because setting everything "back to normal" involves separating them from mates with whom they have been extremely happy.

(Presumably you'd also have a lot of unhappy nonhumans on that distant world, for the same reasons. Assuming the mates really are nonhuman, which is to say the least not clear to me.)

comment by [deleted] · 2012-01-28T12:23:22.873Z · LW(p) · GW(p)

Catgirl doesn't play Settlers of Catan or D&D or talk about politics.

Actually she probably can.

comment by Neil · 2009-03-13T14:07:00.000Z · LW(p) · GW(p)

The point is, I believe, that we value things in ways not reducible to "maximising our happiness". Here Love is the great example: often we value it more than our own happiness, and also more than the happiness of the beloved. We are not constituted to maximise our own happiness; natural selection tells you that.

comment by nolrai · 2009-04-26T20:34:00.000Z · LW(p) · GW(p)

You know, I can't help but read this as a victory for humanity. Not a full victory, but I think the probability of some sort of interstellar civilization that isn't a dystopia is higher afterwards than before. If nothing else, we are more aware of the dangers of AI, and anything that does that and leaves a non-dystopian civilization capable of making useful AI is most likely a good thing by my utility function.

One thing that does bug me is that I do not value happiness as much as most people do. Maybe I'm just not as empathetic as most people? I mean, I actually hope that humanity is replaced by a descendant civilisation/species that still values Truth and Beauty; I care a lot more whether they are successful than whether they are happy.

I wonder how much of the variance in people's preferences here could be explained by whether they are single (i.e., don't have someone they love to the point of "I don't want to consider even trying to live with someone else") or not.

I would take it. I imagine I would be very unhappy for a few months. (It feels like it would take years, but that's a well-known bias.)

I assume "verthandi" is also not a coincidence.

comment by TheOtherDave · 2010-11-25T02:36:25.818Z · LW(p) · GW(p)

Somewhere on this site, there's an article on writing about the Singularity that offers the suggestion of trying to imagine the experience of having lived in the resulting world for some period of time, rather than just the experience of the immediate transition to that world. The idea being that something that may seem utopian when you think about the transition might prove obviously unsatisfactory when you think about the continued experience.

I think this scenario demonstrates the corresponding effect for dystopias.

Yes, I appreciate that breaking up with a long-time committed partner in favor of a new relationship that makes you happier than you ever were before -- especially when you aren't given a choice in the matter -- feels really awful in the immediate aftermath.

But I think it would be difficult to keep that same sense of dystopia when writing about this world six months later, once everyone has gotten used to the idea.

comment by lockeandkeynes · 2010-12-08T17:06:31.653Z · LW(p) · GW(p)

Doesn't sound bad at all.

comment by Kingreaper · 2010-12-08T19:04:17.340Z · LW(p) · GW(p)

I've realised what would make this utopia make almost perfect sense:

The AI was programmed with a massive positive utility value to "die if they ask you to"

So, in maximising its utility, it has to make sure it's asked to die. It also has to fulfil other restrictions, and it wants to make humans happy. So it has to make them happy in such a way that their immediate reaction will be to want it dead, and only later will they be happy about the changes.

Replies from: Jiro
comment by Jiro · 2013-06-01T17:09:38.294Z · LW(p) · GW(p)

Any sane person programming such an AI would program it to have positive utility for "die if lots of people ask it to" but higher negative utility for "being in a state where lots of people ask you to die". If it's not already in such a state, it would not then go into one just to get the utility from dying.

Replies from: Articulator
comment by Articulator · 2013-06-13T21:01:04.684Z · LW(p) · GW(p)

I fear the implication is that the creator was not entirely, as you put it, sane. It is obvious that his logic and AI programming skills left something to be desired. Not that this world is that bad, but it could have stood to be so much better...

comment by Ender · 2011-08-28T05:06:08.198Z · LW(p) · GW(p)

Teehee... "Men are from Mars..."

comment by TuviaDulin · 2012-04-01T18:57:32.029Z · LW(p) · GW(p)

When the critical 90% threshold is reached and the AI self-destructs, will there be anything left behind to ensure human safety? He said that the world he created will remain in his wake, but will it be able to maintain itself without his sentient oversight? Is there any completely reliable mechanism that could prevent ecological collapse, or a deadly mutation in the catgirls/boys, or a failure in the robots that protect people from harm?

If not, then the clever fool who created the AI was really, really a fool. You'd think he'd have at least included a contingency that makes the AI reset everything back to the way it was before it self-destructs....

comment by ikrase · 2012-04-03T06:21:38.208Z · LW(p) · GW(p)

I would do this, though I would prefer better.

As I understand it, the verthandi are pretty much human; they are just arranged and apportioned to be perfectly complementary to humans.

comment by pnrjulius · 2012-06-06T22:00:33.897Z · LW(p) · GW(p)

I guess I share the intuition that there's something wrong with this scenario... but I really can't put my finger on what it is.

The transition seems like it was done too coercively... you split up a lot of families and friends.

But other than that? You can't make a "catgirl" argument, because we specified that the verthandi and boreana are sentient beings like we are. We can still be friends and lovers with them, and by stipulation more harmoniously than we would with other humans. It actually seems like the men/verthandi and women/boreana would split into two species, each of which would be happier than Homo sapiens presently is.

comment by [deleted] · 2012-07-02T09:25:15.970Z · LW(p) · GW(p)

I'm curious about what happened to homosexuals and bisexuals with same-sex preferences in the story. I imagine they were put together somewhere...

I'm in the camp that isn't very happy with replacing romantic partners with superstimulus pleasure bringers, in part because I get so attached to people I care about (and objects, too. Especially cute ones.)

Also I imagine it may be because my standards and tastes for partners are really narrow yet my current partner fits them so well...you might as well make a slightly tweaked clone of my partner to make an ideal interest for me specifically.

As a side note: Not only would I feel bad if I was cheated on but I would feel horrible and unworthy of love if I cheated myself (and why would you expect a good and loving partner if you are not going to behave like that yourself?) so that would be another way in which this failed utopia scares me.

comment by [deleted] · 2012-12-12T08:03:52.980Z · LW(p) · GW(p)

It nodded. "Roughly 89.8% of the human species is now known to me to have requested my death. Very soon the figure will cross the critical threshold, defined to be ninety percent. That was one of the hundred and seven precautions the wise fool took, you see. The world is already as it is, and those things I have done for you will stay on—but if you ever rage against your fate, be glad that I did not last longer." And just like that, the wrinkled thing was gone.

Out of curiosity: Was this intended to be interpreted as a trick of the AI?

comment by A1987dM (army1987) · 2013-10-01T16:12:34.721Z · LW(p) · GW(p)

I'm surprised that no-one in this comment thread has jokingly hypothesised that gay men are on Uranus.

Replies from: blacktrance
comment by blacktrance · 2014-01-15T20:17:33.282Z · LW(p) · GW(p)

It's probably a credit to LW that no one has.

comment by A1987dM (army1987) · 2013-10-31T09:47:53.997Z · LW(p) · GW(p)

There is no industrial infrastructure here, least of all fast travel or communications;

So not only am I cut off from all (non-verthandi) women, I am also cut off from all men not within walking distance from me...

comment by Jiro · 2013-10-31T18:32:44.982Z · LW(p) · GW(p)

Verthandi seem similar to illegal immigrants: each individual one is sympathetic and a person whose needs it seems can be met without harming anyone. But cumulatively, the harm caused by accepting them will destroy society.

Replies from: Moss_Piglet
comment by Moss_Piglet · 2013-10-31T18:42:41.392Z · LW(p) · GW(p)

I'd say that dumping the human population onto Mars and Venus (or more likely in a simulation of Mars/Venus) without any technology or even written records is probably a bigger factor in society being destroyed than a few catgirls. And so long as their offspring are biologically human, I don't see that putting everyone in a relationship which is virtually assured to be happy and stable over a long term is a bad thing for society.

The illegal immigration thing seems like it came out of left field. I don't disagree, but it doesn't seem particularly relevant.

comment by blacktrance · 2014-01-15T20:18:33.061Z · LW(p) · GW(p)

The mistake wasn't telling the genie to make people happy, the mistake was giving him wrong information about what makes people happy.

Replies from: gattsuru
comment by gattsuru · 2014-01-15T22:17:59.818Z · LW(p) · GW(p)

Is that the case?

I mean, let's be honest. You can conceivably have a great relationship with your significant other. Can you argue that it is the best possible relationship?

Consider the dilemma of soulmates. If you actually have a soulmate -- the single person who is the best match you could ever have -- the chances of actually meeting them are incredibly small. If we include the entire world's population and limit your actual dating processes to folk whose name you actually remember, we pit a cohort of a couple hundred people to one of 3.5 billion (twice that if you are bisexual). Even if we presume your soulmate shares your language and cultural background, that's still a couple hundred people versus tens of millions.

And you don't have to believe in soulmates for this to come forward. The desire breakdown of the genders is not identical: in the aggregate, you will find different political beliefs, sexual desires, and acceptable habits. Much and likely most of this is socially conditioned rather than biological, but that doesn't change how it affects your emotions. Even if you specifically find someone that votes the same way that you do, doesn't care about the position of a toilet seat, and thinks exactly the same thing about fluffy handcuffs, the average you will not. ((Nor is this limited to straights, although the gay and lesbian populations do seem to have a smaller gap between the desires of distinct subcultures and the desires of their targets of desire.))

In the modern world, we sigh, and either don't believe in soulmates or compromise. We don't have the time or tools or resources to find the best possible relationship, there might /not be/ a perfect choice, and 99.9% of the best possible relationship (or even 50% or 10% or 5% or 1%) can still exceed the threshold costs of relationships to start with. You don't get those excuses in a strongly transhumanist setting. You have eternity, or a reasonable approximation thereof -- you can't computationally distinguish the emotional costs of a breakup making you upset for a week or month, versus a millennium of once-a-decade severe disagreements, never mind ten billion years of it. Finding a perfect, precisely-tailored-to-you lover becomes less difficult than finding a normal person with their faults and varying likes and dislikes, when the latter requires you to leave your room.

I agree that this is a bad thing -- verthandi and their nonsexual counterparts are my biggest creep-factors in Friendship is Optimal -- but I don't think it's bad because it makes people unhappy, or even because it makes them less happy than they could be, for any likely synonym of happy.

Replies from: blacktrance
comment by blacktrance · 2014-01-15T23:49:04.417Z · LW(p) · GW(p)

You can conceivably have a great relationship with your significant other. Can you argue that it is the best possible relationship?

Yes, if there's a ceiling on how good a relationship can be at a given point. Compare to eating food - you eat until you're completely full (and aren't feeling unwell). You wouldn't claim that there would only be one best "soulmate meal" that can make you fuller than any other meal, because there is a variety of meals that can make you full and satisfied. The same can be (and, I think, is) true for relationships. There are relationships that are suboptimal, that don't reach the ceiling, but there are multiple people who do reach the ceiling. If search costs were zero, many people would find several matches between whom they could be genuinely indifferent.

There is also the contrast between the potential quality of a relationship and the actual quality of an existing relationship. Often, your relationship improves as it progresses, so even if you'd meet someone with whom you'd also be hypothetically compatible (maybe even more compatible than with your current partner if you had known both for an equal amount of time), it can still be possible that a relationship with a new person could never catch up in quality compared to the old relationship. This is one of the problems with this scenario. Even if this verthandi would have been more compatible with Stephen had he known her for as long as he had known his wife, it doesn't mean that their relationship would be better than the relationship with the wife would have been. There is also the effect of resentment to consider. I find it highly likely that at least for some people, replacing their spouses with verthandi would cause a permanent decrease in lifetime happiness.

But for relationships that aren't as good as they could be - for relationships not at the ceiling, and those that could be surpassed by a verthandi - for people in such relationships, the verthandi replacement would be an improvement. Something on a more minor scale already happens today: people say things like, "My old relationship wasn't bad, but it ended, and then I met this new guy/girl and they're much better".

FWIW, I didn't think that anything in Friendship is Optimal was creepy.

comment by pjeby · 2014-02-27T19:28:42.716Z · LW(p) · GW(p)

I can't believe it took me five years to think to comment on this, but judging from the thread, nobody else has either.

If Stephen's utility function actually includes a sufficiently high-weighted term for Helen's happiness -- and vice versa -- then both Stephen and Helen will accept the situation and be happy, as their partner would want them to be. They might still be angry that the situation occurred, and still want to get back together, but not because of some sort of noble sacrifice to honor the symbolic or signaling value of love, but because they actually cared about each other.

Ironically, the only comment so far that even comes close to considering Stephen's utility in relation to what's happening to Helen is one that proposes her increased happiness would cause him pain, which is not the shape I would expect from a utility function that can be labeled "love" in the circumstances described here.

None of that makes this a successful utopia, of course, nor do I suggest that Stephen is overreacting in the moment of revelation -- you can want somebody else to be happy, after all, and still grieve their loss. But, dang it, the AI is right: the human race will be happier, and there's nothing horrific about the fact they'll be happier, at least to people whose utility function values their or others' happiness sufficiently high in comparison to their preference to be happy in a different way.

(Which of course means that this comment is actually irrelevant to the main point of the article, but it seemed to me that this was a point that should still be raised: the relevance of others' happiness as part of one's utility function gets overlooked often enough in discussions here as it is.)

Replies from: TheOtherDave
comment by TheOtherDave · 2014-02-27T20:43:49.231Z · LW(p) · GW(p)

IIRC, there was an earlier discussion which conceded the point that the human race will be happier in this scenario than in the scenario with no AI; the story depends on pumping the intuition that there's some unrealized and undescribed third possibility which is so much better than either of those scenarios that choosing either of them constitutes a tragic ending.

IIRC, the author's response to endorsing this scenario simply because in it people are happier without their opposite-sex partners (and because their opposite-sex partners are happier without them, as you say) was to mutter something deprecating about satisficers vs optimizers.

Full disclosure: I find this particular intuition pump leaves me cold, perhaps because I'm in a same-sex relationship. I've no doubt we could construct an analogous pump that would intuitively horrify me, and I might react differently.

Replies from: Jiro, Jiro
comment by Jiro · 2014-02-27T22:28:14.760Z · LW(p) · GW(p)

Let's modify the scenario a bit.

The dubiously friendly AI, instead of creating artificial significant others, merely uses its computational ability to figure out that if it breaks up all existing relationships and instead puts people together with new partners, then everyone would be happier. Again, it then separates everyone in such a way that the existing partners could not get together in a reasonable amount of time. (You couldn't do complete sex segregation, but you could put several pairs together on the same planet as long as the particular people who you broke them up with are on other planets.) Utopia or not?

Replies from: TheOtherDave
comment by TheOtherDave · 2014-02-27T22:31:47.348Z · LW(p) · GW(p)

OK, I acknowledge receipt of this modified scenario.

And... what?

Replies from: Jiro
comment by Jiro · 2014-02-27T22:41:14.411Z · LW(p) · GW(p)

And is it good or bad? There's more than one objection to the original scenario and the point of this is 1) to separate them out, and 2) to make them more obvious.

comment by Jiro · 2014-02-27T22:36:05.273Z · LW(p) · GW(p)

Or a second scenario: The AI doesn't try to create new relationships at all, whether with artificial or natural partners. Instead it just breaks up all relationships, and then wireheads everyone. It calculates that the utility gained from the wireheading is greater than the utility lost in breaking up the relationships. Is this good or bad?

(I would suggest that putting people with artificial partners is sort of like wireheading in that people might not want to be put under such circumstances, even though once they are already in such circumstances they might be happier.)

Replies from: Nornagest
comment by Nornagest · 2014-02-27T23:50:25.851Z · LW(p) · GW(p)

"I would suggest that putting people with artificial partners is sort of like wireheading in that people might not want to be put under such circumstances, even though once they are already in such circumstances they might be happier."

This seems to stretch the notion of wireheading beyond usefulness. Many situations exist where we might endorse options retrospectively that we wouldn't prospectively, whether through bias, limited information, random changes in perspective, or normal lack of maturity ("eew, girls have cooties!"). Relatively few of them rely on superstimuli or break our goal structure in a strong way.

comment by gjm · 2016-03-09T14:01:42.663Z · LW(p) · GW(p)

Eliezer on Hacker News, March 2016: "Worrying about the effect of strong AI on sexual relationships is like worrying about the effect on US-Chinese trade patterns if the Moon crashes into the Earth." :-)

comment by Mati_Roy (MathieuRoy) · 2020-09-23T09:27:14.311Z · LW(p) · GW(p)

Here's my prompt for a Rationalist Failed Utopia #4.2:

Find the nearest lava bath, and threaten the AI that you'll hang out there for a while if ze doesn't do what you want.

Could also just commit to psychological self-torture if ze prevents you from going in the bath.

EtA: Actually, it should be fine as the AI has "set guards in the air that prohibit lethal violence, and any damage less than lethal, your body shall repair."

Although that might bring us back to the problem of communicating what you want to an AI.

comment by jsnider3 · 2023-04-01T22:05:13.151Z · LW(p) · GW(p)

From 2023's perspective, people should have been encouraged to build (not discouraged from building) AI like this.