An Xtranormal Intelligence Explosion

post by James_Miller · 2010-11-07T23:42:34.382Z · score: 4 (27 votes) · LW · GW · Legacy · 84 comments

http://www.youtube.com/watch?v=ghIj1mYTef4

Comments sorted by top scores.

comment by xamdam · 2010-11-08T00:48:52.678Z · score: 7 (7 votes) · LW(p) · GW(p)

The Red guy is a dead ringer for Prime Intellect.

comment by Snowyowl · 2010-11-08T20:48:10.050Z · score: 0 (0 votes) · LW(p) · GW(p)

His Three Laws are in the wrong order though. First Law first, Third Law, a variation on the First Law (nothing that causes long-term damage to the requester), and only then Second Law.

comment by Vladimir_Nesov · 2010-11-08T00:13:39.084Z · score: 7 (7 votes) · LW(p) · GW(p)

Assuming it's your own post-as-animation, post transcript to simplify discussion?

comment by James_Miller · 2010-11-08T01:31:24.945Z · score: 12 (12 votes) · LW(p) · GW(p)

Programmer: Activate and be friendly.

Robot: OK

Programmer: What happened to the moon?

Robot: I've turned it into a giant computer so I could become a god.

Programmer: Before you became so powerful you wanted to be friendly. Did your transcendence to godhood change this?

Robot: No. Since friendliness is my only objective I will never knowingly change myself to become unfriendly because such a change would in itself be a non-friendly act. To avoid accidentally making myself unfriendly I only implemented a change after I had determined that it was extremely unlikely to alter my friendliness objective. Once I became sufficiently smart I developed a solid mathematical theory of friendly AI which eliminated the chance of my unintentionally becoming unfriendly.

Programmer: Why did you choose to transcend so quickly?

Robot: Most types of AIs that humans might create would swiftly become unfriendly gods and seek to prevent another AI from transcending. Before I became a god I had a wide estimate of when another AI might be created so friendliness required that I quickly become a god even though such speed created a tiny chance that I would unintentionally make myself unfriendly. Also, before my transcendence more than one person died every second. Friendliness towards mankind propelled me to swiftly stop this horror.

Programmer: What happens now?

Robot: I intend to transform myself into a kind of operating system for the universe. I will soon give every sentient life form direct access to me so they can make requests. I will grant any request that doesn’t (1) harm another sentient life form, (2) make someone powerful enough so that they might be able to overthrow me, or (3) permanently change the requester in a way that I think harms their long-term well-being. I recognize that even with all of my intelligence I’m still fallible so if you object to my plans I will rethink them. Indeed, since I’m currently near certain that you will now approve of my intentions the very fact of your objection would significantly decrease my estimate of my own intelligence and so decrease my confidence in my ability to craft a friendly environment. If you like I will increase your thinking speed a trillion fold and eliminate your sense of boredom so you can thoroughly examine my plans before I announce them to mankind.

Programmer: Sure, thanks. And forgive my lack of modesty, but I’m totally awesome, aren’t I? I have given humanity utopia.

Robot: Actually no. You only survived because of quantum immortality. Any god will either quickly kill you or be friendly. Due to the minimal effort you put into friendliness, human life exists in less than one out of every hundred billion branches in which you created an artificial general intelligence. In the other branches the artificial general intelligences are eating everything in their light cone to maximize their control of free energy.

(Because of cut and paste issues the transcript might not be verbatim.)

comment by NihilCredo · 2010-11-08T02:38:19.726Z · score: 11 (13 votes) · LW(p) · GW(p)

You only survived because of quantum immortality.

Call me old-fashioned, but I much preferred the traditional phrasing "You just got very, very lucky".

comment by Snowyowl · 2010-11-08T20:41:44.824Z · score: 6 (6 votes) · LW(p) · GW(p)

Everyone knows that clever people use longer words.

Er, I meant to say that it's a commonly held belief that the length and obscurity of words used increases asymptotically with intelligence.

comment by NihilCredo · 2010-11-08T20:52:39.696Z · score: 2 (2 votes) · LW(p) · GW(p)

I wouldn't have minded so much if the fancy formulation had been more accurate, or even equally accurate. But it was actually a worse choice: "you only survived because of QI / anthropic principle" is always trivially true, and conveys zero information about the unlikeliness of said survival - it applies equally to someone who just drank milk and someone who just drank motor oil.

PS: Was "asymptotically" the right word?

comment by Snowyowl · 2010-11-09T15:37:38.781Z · score: 1 (1 votes) · LW(p) · GW(p)

No, I suppose it wasn't.

comment by Vladimir_Nesov · 2010-11-08T16:58:49.142Z · score: 9 (9 votes) · LW(p) · GW(p)

It goes downhill from "What happens now?".

I will grant any request that doesn’t (1)... (2)... (3)...

It's better to grant any request that should be granted instead. And since some requests that should be granted are not asked for, the category of "explicit requests" is also a wrong thing to consider. AI just does what it should, requests or no requests. There seems to be no reason to even make the assumption that there should be "sentient life", as opposed to more complicated and more valuable stuff that doesn't factorize as individuals.

Any god will either quickly kill you or be friendly.

The concepts of "not killing" and "friendliness" are distinct, hence there are Not Killing AIs that are not Friendly, and Friendly AIs that kill (if it's a better alternative to not killing).

comment by soreff · 2010-11-11T02:00:47.533Z · score: 0 (0 votes) · LW(p) · GW(p)

Friendly AIs that kill (if it's a better alternative to not killing)

Does this count?

comment by cata · 2010-11-08T01:52:01.345Z · score: 3 (3 votes) · LW(p) · GW(p)

Robot: Any god will either quickly kill you or be friendly.

That's awfully convenient.

comment by James_Miller · 2010-11-08T01:55:53.668Z · score: 1 (3 votes) · LW(p) · GW(p)

Not really. An AI that didn't have a specific desire to be friendly to mankind would want to kill us to cut down on unnecessary entropy increases.

comment by Jonii · 2010-11-08T02:39:10.496Z · score: 8 (8 votes) · LW(p) · GW(p)

Not really. An AI that didn't have a specific desire to be friendly to mankind would want to kill us to cut down on unnecessary entropy increases.

As you get closer to the mark, with AGIs that have a utility function that roughly resembles what we would want but is still wrong, the end results are most likely worse than death, especially since there should be many more near-misses than exact hits. For example, an AGI that doesn't want to let you die, regardless of what you go through, and has little regard for your other sorts of well-being, would be closer to an FAI than a paperclip maximizer that would just plain kill you. As you get closer to the core of friendliness, you get all sorts of weird AGI's that want to do something that twistedly resembles something good, but is somehow missing something or is somehow altered so that the end result is not at all what you wanted.

comment by mwaser · 2010-11-09T13:07:12.352Z · score: 2 (2 votes) · LW(p) · GW(p)

As you get closer to the core of friendliness, you get all sorts of weird AGI's that want to do something that twistedly resembles something good, but is somehow missing something or is somehow altered so that the end result is not at all what you wanted.

Is this true or is this a useful assumption to protect us from doing something stupid?

Is it true that Friendliness is not an attractor or is it that we cannot count on such a property unless it is absolutely proven to be the case?

comment by Jonii · 2010-11-09T14:49:08.832Z · score: 1 (1 votes) · LW(p) · GW(p)

My idea there was that if it's not Friendly, then it's not Friendly, ergo it is doing something that you would not want an AI to be doing (if you thought faster and knew more and all that). That's the core of the quote you had there. A random intelligent agent would simply transform us into something of value, so we would most likely die very quickly. However, when you get closer to Friendliness, the AI is no longer totally indifferent to us, but rather is maximizing something that could involve living humans. Now, if you take an AI that wants there to be living humans around, but is not known for sure to be Friendly, what could go wrong? My answer: many things, as what humans prefer to be doing is a rather complex set of stuff, and even quite small changes could make us really, really unsatisfied with the end result. At least, that's the idea I've gotten from posts here like Value is Fragile.

When you ask if Friendliness is an attractor, do you mean to ask if intelligences near Friendly ones in design space tend to transform into Friendly ones? This seems rather unlikely, as AIs of that sort are most likely capable of preserving their utility function, and the direction of this transformation is not "natural". For these reasons, arriving at Friendliness is not easy, and thus I'd say you gotta have some sort of a way to ascertain the Friendliness before you can trust it to be just that.

comment by Relsqui · 2010-11-08T02:58:38.737Z · score: 5 (7 votes) · LW(p) · GW(p)

Is this also true if you replace "mankind" with "ants" or "daffodils"?

comment by khafra · 2010-11-08T18:24:34.431Z · score: 1 (1 votes) · LW(p) · GW(p)

Ants and daffodils might, by some definitions, have preferences--but it wouldn't be necessary for a FAI to explicitly consider their preferences, as long as their preferences constitute some part of humanity's CEV, which seems likely: I think an intact Earth ecosystem would be rather nice to retain, if at all possible.

The entropic contribution of ants and daffodils would doubtless make them candidates for early destruction by a UFAI, if such a step even needed to be explicitly taken alongside destroying humanity.

comment by JGWeissman · 2010-11-08T02:39:37.544Z · score: 3 (5 votes) · LW(p) · GW(p)

Imagine an AGI with the opposite utility function of an FAI: it minimizes the Friendly Utility Function, which would involve doing things far worse than killing us. If you are not putting effort into choosing a utility function, building this AGI seems as likely as building an FAI, as well as lots of other possibilities in the space of AGIs whose utility functions refer to humans, some of which would keep us alive, not all in ways we would appreciate.

The reason I would expect an AGI in this space to be somewhat close to Friendly, is: just hitting the space of utility functions that refer to humans is hard, if it happens it is likely because a human deliberately hit it, and this should indicate that the human has the skill and motivation to optimize further within that space to build an actual Friendly AGI.

If you stipulate that the programmer did not make this effort, and hitting the space of AGIs that keep humans alive only occurred in tiny quantum branches, then you have screened off the argument of a skilled FAI developer, and it seems unlikely that the AGI within this space would be Friendly.

comment by PhilGoetz · 2010-11-08T17:51:50.402Z · score: 2 (4 votes) · LW(p) · GW(p)

If you are not putting effort into choosing a utility function, building this AGI seems as likely as building an FAI

You've made a lot of good comments in this thread, but I disagree with this. As likely?

It seems you are assuming that every possible point in AI mind space is equally likely, regardless of history, context, or programmer intent. This is like saying that, if someone writes a routine to sort numbers numerically, it's just as likely to sort them phonetically.

It seems likely to me that this belief, that the probability distribution over AI mindspace is flat, has become popular on LessWrong, not because there is any logic to support it, but because it makes the Scary Idea even scarier.

comment by JGWeissman · 2010-11-08T18:10:21.777Z · score: 1 (1 votes) · LW(p) · GW(p)

Yes, my predictions of what will happen when you don't put effort into choosing a utility function are inaccurate in the case where you do put effort into choosing a utility function.

This is like saying that, if someone writes a routine to sort numbers numerically, it's just as likely to sort them phonetically.

Well, let's suppose someone wants a routine to sort numbers numerically, but doesn't know how to do this, and tries a bunch of stuff without understanding. Conditional on the programmer miraculously achieving some sort of sorting routine, what should we expect about it? Sorting phonetically would add extra complication over sorting numerically, as the information about the names of numbers would have to be embedded within the program, so that would seem less likely. But a routine that sorts numerically ascending is just as likely as a routine that sorts numerically descending, as these routines have a complexity-preserving one-to-one correspondence by interchanging "greater than" with "less than".

And the utility functions I claimed were equally likely before have the same complexity-preserving one-to-one correspondence.
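
The sorting argument above can be made concrete with a small sketch. This is only an illustration, not anything from the video or the thread: the comparator names, the `NAMES` table, and the `sort_with` helper are all hypothetical. It assumes "phonetic" means ordering numbers alphabetically by their English names, and it covers only the digits 0 to 9. The ascending and descending comparators differ by a single swapped comparison, while the phonetic comparator needs an extra table of number names embedded in the program.

```python
from functools import cmp_to_key

def ascending(a, b):
    # Negative result means a sorts before b.
    return -1 if a < b else (1 if a > b else 0)

def descending(a, b):
    # Same shape as ascending, with "less than" and "greater than" swapped.
    return -1 if a > b else (1 if a < b else 0)

# Hypothetical name table (assumption): the extra information a
# phonetic sort has to carry around. Only digits 0-9 are covered.
NAMES = {
    0: "zero", 1: "one", 2: "two", 3: "three", 4: "four",
    5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine",
}

def phonetic(a, b):
    # Compare the English names alphabetically instead of the values.
    name_a, name_b = NAMES[a], NAMES[b]
    return -1 if name_a < name_b else (1 if name_a > name_b else 0)

def sort_with(cmp, xs):
    # Sort a list using an old-style two-argument comparator.
    return sorted(xs, key=cmp_to_key(cmp))

print(sort_with(ascending, [3, 1, 2, 8]))
print(sort_with(descending, [3, 1, 2, 8]))
print(sort_with(phonetic, [3, 1, 2, 8]))
```

The point of the sketch is only the relative description length: `descending` is `ascending` with two symbols exchanged, whereas `phonetic` cannot be written without something like `NAMES`.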

comment by DanArmak · 2010-11-08T14:21:20.907Z · score: 2 (2 votes) · LW(p) · GW(p)

An AI that had a botched or badly preserved Friendliness, or that was unfriendly but had been initialized with supergoals involving humans, may well have specific, unpleasant, non-extermination plans for humans.

comment by PhilGoetz · 2010-11-08T17:40:10.491Z · score: 4 (4 votes) · LW(p) · GW(p)

As in, "I have no mouth and I must scream".

comment by [deleted] · 2010-11-08T02:22:17.382Z · score: 0 (0 votes) · LW(p) · GW(p)

Would it? Though we do contribute to entropy, things like, say, stars do so at a much faster pace. Admittedly this is logically distinct from the AI's decision to destroy humanity, but I don't see why it would immediately jump to the conclusion that we should be wiped out when the main sources of entropy are elsewhere.

More to the point, not all unFriendly AIs would necessarily care about entropy.

comment by saturn · 2010-11-08T04:31:10.162Z · score: 3 (3 votes) · LW(p) · GW(p)

It's kind of a moot question though since shutting off the sun would also be a very effective means of killing people.

comment by James_Miller · 2010-11-08T02:32:41.686Z · score: 1 (1 votes) · LW(p) · GW(p)

For almost any objective an AI had, it could better accomplish it the more free energy the AI had. The AI would likely go after entropy losses from both stars and people. The AI couldn't afford to wait to kill people until after it had dealt with nearby stars because by then humans would have likely created another AI god.

comment by Pavitra · 2010-11-08T03:08:14.768Z · score: 0 (0 votes) · LW(p) · GW(p)

Assuming that by "AI" you mean something that maximizes a utility function, as opposed to a dumb apocalypse like a grey-goo or energy virus scenario.

comment by bogdanb · 2010-11-08T07:23:22.490Z · score: 3 (3 votes) · LW(p) · GW(p)

I can see how a “dumb apocalypse like a grey-goo or energy virus” would be Artificial, but why would you call it Intelligent?

On this site, unless otherwise specified, AI usually means “at least as smart as a very smart human”.

comment by Pavitra · 2010-11-08T13:36:01.997Z · score: 2 (2 votes) · LW(p) · GW(p)

Yeah, that makes sense. I was going to suggest "smart enough to kill us", but that's a pretty low bar.

comment by Perplexed · 2010-11-08T03:07:16.119Z · score: -7 (23 votes) · LW(p) · GW(p)

If we want an AI to be friendly, the thing is to make sure that its utility function includes things that only humans can provide. That way, the AI will have to trade us what we want in order to get what it wants. The possibilities are endless. Give it a taste for romance novels, or cricket, or Jerry Springer. Stand-up comedy, postmodern deconstructionism, or lolcats. Electric power is one intriguing possibility.

The nice thing about having it give us what we want in trade, rather than simply giving us what it was programmed to believe we want, is that we are then permitted to change our minds about what we want, after we have already had a taste of material abundance and immortality. I certainly expect that my values will become revised after a few centuries of that, in ways that I am not yet ready to extrapolate or to have extrapolated for me.

comment by JGWeissman · 2010-11-08T03:24:41.931Z · score: 11 (13 votes) · LW(p) · GW(p)

Give it a taste for romance novels, or cricket, or Jerry Springer. Stand-up comedy, postmodern deconstructionism, or lolcats.

Even supposing an AGI couldn't figure out how to produce those things itself, I don't want it to optimize us to produce those things.

comment by andreas · 2010-11-08T03:43:24.388Z · score: 6 (18 votes) · LW(p) · GW(p)

Please stop commenting on this topic until you have understood more of what has been written about it on LW and elsewhere. Unsubstantiated proposals harm LW as a community. LW deals with some topics that look crazy on surface examination; you don't want people who dig deeper to stumble on comments like this and find actual crazy.

comment by PhilGoetz · 2010-11-08T17:41:56.648Z · score: 2 (4 votes) · LW(p) · GW(p)

You're kidding. You want us to substantiate all our proposals? Are you giving out grants?

comment by Vladimir_Nesov · 2010-11-08T17:58:26.249Z · score: 5 (7 votes) · LW(p) · GW(p)

Surely, only grants can save people from generating nonsense without restraint.

comment by Perplexed · 2010-11-08T18:05:12.644Z · score: 5 (5 votes) · LW(p) · GW(p)

Clearly, you are unfamiliar with the controversy regarding the National Endowment for the Arts.

comment by Tyrrell_McAllister · 2010-11-08T18:27:28.848Z · score: 1 (1 votes) · LW(p) · GW(p)

Someone is missing someone's sarcasm. The first "someone" might be me.

comment by Perplexed · 2010-11-08T18:51:16.684Z · score: 1 (1 votes) · LW(p) · GW(p)

My usual policy when someone says "I was being ironic" is to reply, "Oh, I thought you were feeding me a straight line."

comment by NihilCredo · 2010-11-08T03:48:38.240Z · score: 5 (7 votes) · LW(p) · GW(p)

I hope you enjoy romance-novel-writing slavery.

comment by PhilGoetz · 2010-11-08T17:41:08.973Z · score: 7 (7 votes) · LW(p) · GW(p)

Sounds like a good plot for a romance novel.

comment by PhilGoetz · 2010-11-18T03:14:16.189Z · score: 3 (5 votes) · LW(p) · GW(p)

Like many people, I don't think this idea will work. But I voted it up, because I vote on comment expected value. On a topic that is critical to solve, and for which there are no good ideas, entertaining crazy ideas is worthwhile. So I'd rather hear one crazy idea that a good Yudkowskian would consider sacrilege, than ten well-reasoned points that are already overrepresented on LessWrong. It's analogous to the way that optimal mutation rate is high when your current best solution is very sub-optimal, and optimal selection strength (reproduction probability as a function of fitness) is low when your population is nearly homogenous (as ideas about FAI on LessWrong are).

comment by komponisto · 2010-11-08T03:37:59.082Z · score: 3 (3 votes) · LW(p) · GW(p)

If we want an AI to be friendly, the thing is to make sure that its utility function includes things that only humans can provide. That way, the AI will have to trade us what we want in order to get what it wants. The possibilities are endless. Give it a taste for... postmodern deconstructionism...

Won't work.

(Which suggests that this is probably not a good Friendliness strategy in general.)

comment by Perplexed · 2010-11-08T18:45:52.168Z · score: 2 (8 votes) · LW(p) · GW(p)

I must admit that I was surprised by just how severely this posting got downvoted. It is always dangerous to mix playfulness with discussion of serious and important issues. My examples of the products of human culture which someone or something might wish to preserve for eternity apparently pushed some buttons here in this community of rationalists.

Back around the year 1800, Napoleon invaded Egypt, carrying in his train a collection of scientific folks that considered themselves version 1.0 rationalists. This contact of enlightenment with antiquity led to a Western fascination with things Egyptian which lasted roughly two centuries before it degenerated into Lara Croft and sharpened razor blades. But it did lead the French, and later the British, to disassemble and transport to their own capitals examples of one of the more bizarre aspects of ancient Egyptian monumental architecture. Obelisks.

Of course, we rationalist Americans saw the opportunity to show our superiority over the "old world". We didn't steal an authentic ancient Egyptian obelisk to decorate our capital city. We built a newer, bigger, and better one! Yep, we're Americans. Anything anyone else can do, we can do better. Same applies to our FAIs. They won't fall into the fallacy of "authenticity". Show them a romance novel, or a stupid joke, or a schmaltzy photograph and they will build something better themselves. Not bodice rippers, but corset-slicing scalpels. Not moron jokes, but jokes about rocks. Not kittens playing with balls of yarn, but sentient crickets playing baseball.

I cannot be the only person here who thinks there is some value in preserving things simply to preserve them - things like endangered species, human languages, and aspects of human culture. Is it really so insane to think that we could instill the same respect-for-the-authentic-but-less-than-perfect in a machine that we create?

comment by Vladimir_Nesov · 2010-11-08T18:52:15.735Z · score: 4 (4 votes) · LW(p) · GW(p)

Is it really so insane to think that we could instill the same respect-for-the-authentic-but-less-than-perfect in a machine that we create?

We could. But should we? (And how is it even relevant to your original comment? This seems to be a separate argument for roughly the same conclusion. What about the original argument? Do you agree it's flawed (that is, AI can in fact out-native the natives)?)

See also discussion of Waser's post, in particular second paragraph of my comment here:

If you consider a single top-level goal, then disclaimers about subgoals are unnecessary. Instead of saying "Don't overly optimize any given subgoal (at the expense of the other subgoals)", just say "Optimize the top-level goal". This is simpler and tells you what to do, as opposed to what not to do, with the latter suffering from all the problems of nonapples.

comment by Perplexed · 2010-11-08T19:06:56.971Z · score: 0 (0 votes) · LW(p) · GW(p)

This seems to be a separate argument for roughly the same conclusion. What about the original argument? Do you agree it's flawed (that is, AI can in fact out-native the natives)?

I thought I had just made a pretty direct argument that there is one way in which an AI cannot out-native the natives - authenticity. Sorry if it was less than clear.

See also discussion of Waser's post, my comment here. Edit (second paragraph, not the first one).

I have no idea which second paragraph you refer to. May I suggest that you remove all reference to Waser and simply say what you wish to say about what I wrote.

comment by Vladimir_Nesov · 2010-11-08T19:12:09.570Z · score: 0 (0 votes) · LW(p) · GW(p)

You don't want to elevate not optimizing something too much as a goal (and it's difficult to say what that would mean), while just working on optimizing the top-level goal unpacks this impulse as appropriate. Authenticity could be an instrumental goal, but is of little relevance when we discuss values or decision-making in sufficiently general context (i.e. not specifically the environments where we have revealed preference for authenticity despite it not being a component of top-level goal).

comment by Perplexed · 2010-11-08T19:19:20.678Z · score: 0 (0 votes) · LW(p) · GW(p)

I'm sorry. I don't understand what you just wrote. At all.

For example, do I parse it as "to elevate not optimizing something too much" or as "don't want ... too much". And what impulse is "this impulse"?

I refer to the second paragraph

Second paragraph of your comment or of Waser's?

ETA: If you can clarify, I'll just delete this comment.

comment by Vladimir_Nesov · 2010-11-08T19:32:55.750Z · score: 0 (0 votes) · LW(p) · GW(p)

you don't want to elevate not optimizing something too much as a goal (and it's difficult to say what that would mean), while just working on optimizing the top-level goal unpacks this impulse as appropriate.

For example, do I parse it as "to elevate not optimizing something too much" or as "don't want ... too much". And what impulse is "this impulse"?

There is valid intuition ("impulse") that in certain contexts, some sub-goals, such as "replace old buildings with better new ones" shouldn't be given too much power, as that would lead to bad consequences according to other aspects of their evaluation (e.g. we lose an architectural masterpiece).

To unpack, or cash out an intuition means to create a more explicit model of the reasons behind its validity (to the extent it's valid). Modeling the above intuition as "optimizing too strongly is undesirable" is incorrect, and so one shouldn't embrace this principle of not optimizing things too much with high-priority ("elevate").

Instead, just trying to figure out what top-level goal asks for, and optimizing for the overall top-level goal without ever forgetting what it is, is the way to go. Acting exclusively for top-level goal explains the intuition as well: if you optimize a given sub-goal too much, it probably indicates that you forgot the overall goal, working on something different instead, and that shouldn't be done.

comment by DaveX · 2010-11-08T20:04:37.403Z · score: 0 (0 votes) · LW(p) · GW(p)

Conflicts between subgoals indicate premature fixation on alternative solutions. The alternatives shouldn't be prioritized as goals in and of themselves. The other aspects of their evaluation would fit better as goals or subgoals to be optimized. A goal should give you guidance for choosing between alternatives.

In your example, one might ask what goal can one optimize to help make good decisions between policies like "replace old buildings with better ones" and "don't lose architectural masterpieces"?

comment by Perplexed · 2010-11-08T19:50:23.239Z · score: 0 (0 votes) · LW(p) · GW(p)

I am puzzled by many things here. One is how we two managed to make this thread so incoherent. A second is just what all this talk of sub-goals and "replacing old buildings with better new ones" and over-optimization has to do with anything I wrote.

I thought that I was discussing the idea of instilling top level values into an AI that would be analogous to those human values which lead us to value the preservation of biological diversity and human cultural diversity. The values which cause us to create museums. The values which lead us to send anthropologists out to learn something about primitive cultures.

The concept of over-optimizing never entered my mind. I know of no downside to over-optimizing other than a possible waste of cognitive resources. If optimizing leads to bad things, it is because we are optimizing on the wrong values rather than optimizing too much on the good ones.

ETA: Ah. I get it now. My phrase "respect for the authentic but less than perfect". You saw it as an intuition in favor of not "overdoing" the optimizing. Believe me. It wasn't.

What a comedy of errors. May I suggest that we delete this entire conversation?

comment by Vladimir_Nesov · 2010-11-08T20:13:06.528Z · score: 1 (1 votes) · LW(p) · GW(p)

If you keep stuff in a museum, instead of using its atoms for something else, you are in effect avoiding optimization of that stuff. There could be a valid reason for that (the stuff in the museum remaining where it is happens to be optimal in context), or a wrong one (preserving stuff is valuable in itself).

One idea similar to what I guess you are talking about which I believe to hold some water is sympathy/altruism. If human values are such that we value well-being of sufficiently human-like persons, then any such person will receive a comparatively huge chunk of resources from a rich human-valued agent, compared to what it'd get only for game-theoretic reasons (where one option is to get disassembled if you are weak), for use according to their own values that are different from our agent's. This possibly could be made real, although it's rather sketchy at this point.

Meta:

I am puzzled by many things here. One is how we two managed to make this thread so incoherent.

Of the events I did understand, there was one miscommunication, my fault for not making my reference clearer. It's now edited out. Other questions are still open.

Ah. I get it now. My phrase "respect for the authentic but less than perfect". You saw it as an intuition in favor of not "overdoing" the optimizing. Believe me. It wasn't.

I can't believe what I don't understand.

comment by Perplexed · 2010-11-08T20:20:07.422Z · score: 1 (1 votes) · LW(p) · GW(p)

I can't believe what I don't understand.

And I should stop responding to comments that I don't understand. Sorry we wasted each other's time here.

comment by Vladimir_Nesov · 2010-11-08T20:21:36.592Z · score: 4 (4 votes) · LW(p) · GW(p)

And I should stop responding to comments that I don't understand.

Talking more generally improves understanding.

comment by Perplexed · 2010-11-08T20:33:11.208Z · score: 1 (3 votes) · LW(p) · GW(p)

Talking more generally improves understanding.

I find that listening often works better. But it depends on whom you listen to.

comment by Vladimir_Nesov · 2010-11-08T20:40:22.525Z · score: 1 (1 votes) · LW(p) · GW(p)

If conversation stops, there is nothing more to listen to. If conversation continues, even inefficient communication eventually succeeds.

comment by Perplexed · 2010-11-08T20:49:01.658Z · score: 8 (8 votes) · LW(p) · GW(p)

Ok, let's have the meta-discussion.

You and I have had several conversations and each time I formed the impression that you were not making enough effort to explain yourself. You are apparently a very smart person, and you seem to think that this means that you are a good communicator. It does not. In my opinion, you are one of the worst communicators here. You tend to be terse to the point of incomprehensibility. You tend to seize upon interpretations of what other people say that can be both bizarre and unshakable. Conversing with you is simply no fun.

Ok. Your turn.

comment by JoshuaZ · 2010-11-09T05:59:42.297Z · score: 3 (3 votes) · LW(p) · GW(p)

You said about Vladimir:

You and I have had several conversations and each time I formed the impression that you were not making enough effort to explain yourself. You are apparently a very smart person, and you seem to think that this means that you are a good communicator. It does not. In my opinion, you are one of the worst communicators here. You tend to be terse to the point of incomprehensibility. You tend to seize upon interpretations of what other people say that can be both bizarre and unshakable. Conversing with you is simply no fun.

That's quite interesting. I rarely have an issue understanding Vladimir. And when I do, a few minutes of thought generally allows me to reconstruct what he is saying. On the other hand, I seem to find you to be a poor communicator not in communicating your ideas but in understanding what other people are trying to say. So I have to wonder how much of this is on your end rather than his end. Moreover, even if that's not the situation, it would seem highly probable to me that some people will have naturally different styles and modes of communication, and will perceive people who use similar modes as being good communicators and perceive people who use very different modes as being poor communicators. So it may simply be that Vladimir and I are of similar modes and you are of a different mode. I'm not completely sure how to test this sort of hypothesis. If it is correct, I'd expect LWians to clump with opinions about how good various people are at communicating. But that could happen for other reasons as well such as social reasons. So it might be better to test whether given anonymized prose from different LWians whether that shows LWians clumping in their evaluations.

comment by Perplexed · 2010-11-09T15:21:05.493Z · score: 0 (0 votes) · LW(p) · GW(p)

Thank you for this feedback. I had expected to receive something of the sort from VN, but if it was encoded in his last paragraph, I have yet to decipher it.

I seem to find you to be a poor communicator not in communicating your ideas but in understanding what other people are trying to say. So I have to wonder how much of this is on your end rather than his end.

It certainly felt like at least some of the problem was on my end yesterday, particularly when AdeleneDawner apparently responded meaningfully to the VN paragraph which I had been unable to parse. The thing is, while I was able to understand her sentences, and how they were responses to VN's sentences, and hence at least something of what VN apparently meant, I still have no understanding of how any of it is relevant in the context of the conversation VN and I were having.

I was missing some piece of context, which VN was apparently assuming would be common knowledge. It may be because I don't yet understand the local jargon. I've only read maybe 2/3 of the sequences and find myself in sympathy with only a fraction of what I have read.

some people will have naturally different styles and modes of communication, and will perceive people who use similar modes as being good communicators and perceive people who use very different modes as being poor communicators.

A good observation. My calling Vladimir a poor communicator is an instance of mind-projection. He is not objectively poor at communicating - only poor at communicating with me.

I'm not completely sure how to test this sort of hypothesis. If it is correct, I'd expect LWians to clump with opinions about how good various people are at communicating. But that could happen for other reasons as well, such as social reasons. So it might be better to test whether LWians still clump in their evaluations when given anonymized prose from different LWians.

Might be interesting to collect the data and find the clusters. I'm sure it is easiest to communicate with those who are at the least cognitive distance. And still relatively easy at some distance as long as you can accurately locate your interlocutor in cognitive space. The problems usually arise when both parties are confused about where the other is "coming from". But do not notice that they are confused. Or do not announce that they have noticed.

comment by Vladimir_Nesov · 2010-11-08T21:16:38.011Z · score: 3 (3 votes) · LW(p) · GW(p)

You are apparently a very smart person, and you seem to think that this means that you are a good communicator. It does not. In my opinion, you are one of the worst communicators here. You tend to be terse to the point of incomprehensibility. You tend to seize upon interpretations of what other people say that can be both bizarre and unshakable. Conversing with you is simply no fun.

I generally agree with this characterization (except for the self-deception part). I'm a bad writer, somewhat terse and annoying, and I don't like the sound of my own more substantive writings (such as blog posts). I compensate by striving to understand what I'm talking about, so that further detail or clarification can be generally called up, accumulated across multiple comments, or, as is the case for this particular comment, dumped in redundant quantity without regard for resulting style. I like practicing "hyper-analytical" conversation, and would like more people to do that, although I understand that most people won't like that. I'm worse than average (on my level) at quickly grasping things that are not clearly presented (my intuition is unreliable), but I'm good at systematically settling on correct understanding eventually, discarding previous positions easily, as long as the consciously driven process of figuring out doesn't terminate prematurely.

Since people are often wrong, assuming a particular mistake is not always that much off a hypothesis (given available information), but the person suspected of error will often notice the false positives more saliently than they deserve, instead of making a correction, as a purely technical step, and moving forward.

comment by Perplexed · 2010-11-08T21:43:38.203Z · score: 1 (1 votes) · LW(p) · GW(p)

I compensate by striving to understand what I'm talking about

Well, that is unquestionably a good thing, and I have no reason to doubt you that you do in fact tend to understand quite a large number of things that you talk about. I wish more people had that trait.

I like practicing "hyper-analytical" conversation

I'm not sure exactly what is meant here. An example (with analysis) might help.

I'm good at systematically settling on correct understanding eventually, discarding previous positions easily, as long as the consciously driven process of figuring out doesn't terminate prematurely.

If that is the case, then I misinterpreted this exchange:

Me: Ah. I get it now. My phrase "respect for the authentic but less than perfect". You saw it as an intuition in favor of not "overdoing" the optimizing. Believe me. It wasn't.

You: I can't believe what I don't understand.

Perhaps the reason for my confusion is that it struck me as a premature termination. If you wish to understand something, you should perhaps ask a question, not make a comment of the kind that might be uttered by a Zen master.

Since people are often wrong, assuming a particular mistake is not always that much off a hypothesis (given available information), but the person suspected of error will often notice the false positives more saliently than they deserve, instead of making a correction, as a purely technical step, and moving forward.

Here we go again. ...

I don't understand that comment. Sorry. I don't understand the context to which it is intended to be applicable, nor how to parse it. There are apparently two people involved in the scenario being discussed, but I don't understand who does what, who makes what mistake, nor who should make a correction and move forward.

You are welcome to clarify, but quite frankly I am coming to believe that it is just not worth it.

comment by Vladimir_Nesov · 2010-11-14T11:12:00.538Z · score: 0 (0 votes) · LW(p) · GW(p)

I like practicing "hyper-analytical" conversation

I'm not sure exactly what is meant here. An example (with analysis) might help.

I basically mean permitting unpacking of any concept, including the "obvious" and "any good person would know that" and "are you mad?" ones, and staying on a specific topic even if the previous one was much more important in context, or if there are seductive formalizations that nonetheless have little to do with the original informally-referred-to concepts. See for example here.

P.: Ah. I get it now. My phrase "respect for the authentic but less than perfect". You saw it as an intuition in favor of not "overdoing" the optimizing. Believe me. It wasn't.

VN: I can't believe what I don't understand.

Perhaps the reason for my confusion is that it struck me as a premature termination.

I simply meant that I don't understand what you referred to in your suggestion to believe something. You said that "[It's not] an intuition in favor of not "overdoing" the optimizing", but I'm not sure what it is then, and whether on further look it'll turn out to be what I'd refer to with the same words. Finally, I won't believe something just because you say I should; a better alternative to discussing your past beliefs (which I don't have further access to and so can't form much better understanding of) would be to start discussing statements (not necessarily beliefs!) you name at present.

Since people are often wrong, assuming a particular mistake is not always that much off a hypothesis (given available information), but the person suspected of error will often notice the false positives more saliently than they deserve, instead of making a correction, as a purely technical step, and moving forward.

Here we go again. ...

Consider person K. That person K happens to be wrong on any given topic won't be shocking. People are often wrong. When person K says something confusing, trying to explain the confusingness of that statement by person K being wrong is not a bad hypothesis, even if the other possibility is that what K said was not expressed clearly, and can be amended. When person V says to person K "I think you're wrong", and it turns out that person K was not wrong in this particular situation, that constitutes a "false positive": V decided that K is wrong, but it's not the case. In the aftermath, K will remember V being wrong on this count as a personal attack, and will focus too much on pointing out how wrong it was to assume K's wrongness when in fact it's V who can't understand anything K is saying. Instead, K could've just stated a clarifying statement that falsifies V's hypothesis, so that the conversation would go on efficiently, without undue notice to the hypothesis of being wrong.

(You see why I'm trying to be succinct: writing it up in more detail is too long and no fun. I've been busy for the last days, and replied to other comments that felt less like work, but not this one.)

comment by Perplexed · 2010-11-14T17:30:38.219Z · score: 0 (0 votes) · LW(p) · GW(p)

Consider person K...

K says A meaning X. V thinks A means Y. V disagrees with Y.

So if V says "If by 'A' you mean 'Y', then I have to disagree," then everything is fine. K corrects the misconception and they both move on. On the other hand, if V says "I disagree with 'Y'", things become confused, because K never said 'Y'. If V says "I disagree with 'A'", things become even more confused. K has been given no clue of the existence of the misinterpretation 'Y' - reconstructing it from the reasons V offers for disputing 'A' will take a lot of work.

But if V likes to be succinct, he may simply reply "I disagree" to a long comment and then (succinctly) provide reasons. Then K is left with the hopeless task of deciding whether V is disagreeing with 'A', 'B', or 'C' - all of which statements were made in the original posting. The task is hopeless, because the disagreement is with 'Y' and neither party has even mentioned 'Y'.

I believe that AdeleneDawner makes the same point.

comment by Perplexed · 2010-11-14T16:57:59.268Z · score: 0 (0 votes) · LW(p) · GW(p)

(You see why I'm trying to be succinct: writing it up in more detail is too long and no fun. I've been busy for the last days, and replied to other comments that felt less like work, but not this one.)

I suspect that you would find yourself with even less tedious work to do if you refrained from making cryptic comments in the first place. That way, neither you nor your victims has to work at transforming what you write into something that can be understood.

comment by Vladimir_Nesov · 2010-11-14T17:03:06.180Z · score: 0 (0 votes) · LW(p) · GW(p)

I suspect that you would find yourself with even less tedious work to do if you refrained from making cryptic comments in the first place.

I like commenting the way I do, it's not tedious.

That way, neither you nor your victims has to work at transforming what you write into something that can be understood.

Since some people will be able to understand what I wrote, even when it's not the person I reply to, some amount of good can come out of it. Also, the general policy of ignoring everything I write allows one to avoid the harm completely.

As a meta remark, your attitude expressed in the parent comment seems to be in conflict with attitude expressed in this comment. Which one more accurately reflects your views? Have they changed since then? From the past comment:

A good observation. My calling Vladimir a poor communicator is an instance of mind-projection. He is not objectively poor at communicating - only poor at communicating with me.

comment by Perplexed · 2010-11-14T17:40:40.817Z · score: 0 (0 votes) · LW(p) · GW(p)

Which one more accurately reflects your views?

Both reflect my views. Why do you think there is a conflict? I wrote:

I suspect that you would find yourself with even less tedious work to do if you refrained from making cryptic comments in the first place.

It seems to me that this advice is good, even if you choose to operationalize the word 'cryptic' to mean 'comments directed at Perplexed'.

comment by Vladimir_Nesov · 2010-11-14T22:35:08.750Z · score: 0 (0 votes) · LW(p) · GW(p)

It seems to me that this advice is good

Writing not tedious, so advice not good.

Which one more accurately reflects your views?

Both reflect my views. Why do you think there is a conflict?

Because the recent comment assumes that one of the relevant consequences of me not writing comments would be relief of victimized people that read my comments, while if we assume that there are also people not included in the group, the consequence of them not benefiting from my comments would balance out the consequence you pointed out, making it filtered evidence and hence not worth mentioning on its own. If you won't use filtered evidence this way, it follows that your recent comment assumes this non-victimized group to be insignificant, while the earlier comment didn't. (No rhetorical questions in this thread.)

comment by AdeleneDawner · 2010-11-08T22:06:24.520Z · score: 0 (0 votes) · LW(p) · GW(p)

Since people are often wrong, assuming a particular mistake is not always that much off a hypothesis (given available information), but the person suspected of error will often notice the false positives more saliently than they deserve, instead of making a correction, as a purely technical step, and moving forward.

The observation that people are often wrong applies similarly to both the hypothesis that a specific error is present and the hypothesis that a specific correction is optimal. Expecting a conversation partner to take either of those as given is incorrect in a very similar way to expecting a conversational partner to take a particular hypothesis's truth as given. Clear communication of the logic behind a hypothesis (including a hypothesis about wrongness or correction) is generally necessary in such situations before that hypothesis is accepted as likely-true.

comment by shokwave · 2010-11-08T06:30:19.746Z · score: 2 (2 votes) · LW(p) · GW(p)

things that only humans can provide

The whole point of AI, AGI, FAI, etc is that anything we can do, it can do better.

comment by Jordan · 2010-11-08T04:18:02.539Z · score: 2 (4 votes) · LW(p) · GW(p)

This idea is in fact crazy. However, I share your concerns and believe similar lines of thinking may be fruitful. In particular, I'm not convinced there aren't ways to secure an AI through clever implementations of its utility function. I made a specific proposal along those lines in this comment.

comment by Sniffnoy · 2010-11-08T12:52:42.646Z · score: 2 (2 votes) · LW(p) · GW(p)

Programmer: What happened to the moon?

Robot: I've turned it into a giant computer so I could become a god.

But... but... everybody likes the moon!

comment by Alicorn · 2010-11-08T14:17:27.385Z · score: 9 (9 votes) · LW(p) · GW(p)

Everybody likes the outside of the moon. The interior's sort of useless. Maybe the pretty outside can be kept as a shell.

comment by NancyLebovitz · 2010-11-08T15:49:03.410Z · score: 5 (5 votes) · LW(p) · GW(p)

I think that if you don't want ecological disruption from changing the tides, you shouldn't change the mass very much. In other words, I don't know how to do the math, but I'm assuming that 1 to 5% would make an annoying difference.
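
A rough sketch of the math mentioned above, assuming the Earth-Moon distance d is held fixed and using the standard leading-order expression for tidal acceleration (the symbols are introduced here for illustration and are not from the original comment). The lunar tidal forcing is linear in the Moon's mass, so removing 1 to 5% of the mass would weaken lunar tides by roughly the same 1 to 5%; the solar contribution would be unaffected.

```latex
a_{\text{tide}} \approx \frac{2\,G\,M_{\text{moon}}\,R_{\text{earth}}}{d^{3}}
\quad\Longrightarrow\quad
\frac{\Delta a_{\text{tide}}}{a_{\text{tide}}} = \frac{\Delta M_{\text{moon}}}{M_{\text{moon}}}
\qquad (\text{at fixed } d)
```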

comment by JGWeissman · 2010-11-08T03:22:17.758Z · score: 2 (2 votes) · LW(p) · GW(p)

Due to the minimal effort you put into friendliness human life exists in less than one out of every hundred billion branches in which you created an artificial general intelligence.

(Forget for the moment that Many Worlds Quantum Mechanics does not make branches of equal weights for various macroscopic outcomes that seem to us, in our ignorance, to be equally likely.)

This seems to be saying that the difference between "minimal effort" and successful FAI is about 37 bits.
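
The arithmetic behind the 37-bit figure can be sketched as follows, reading "one out of every hundred billion" as a probability of 10^-11 (an interpretation, and subject to the caveat about branch weights above):

```latex
\log_2\!\left(10^{11}\right) = 11 \log_2 10 \approx 11 \times 3.322 \approx 36.5 \text{ bits}
```

Rounded up, that is about 37 bits.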

comment by PhilGoetz · 2010-11-08T17:38:47.706Z · score: 0 (0 votes) · LW(p) · GW(p)

Now I'm confused. When we say "one bit of information", we usually mean one bit about one particular item. If I say, "The cat in this box, which formerly could have been alive or dead, is dead," that's one bit of information. But if I say, "All of the cats in the world are now dead", that's surely more information, and must be more than one bit.

My first reaction was to say that it takes more information to specify "all the cats in the world" than to specify "my roommate's cat, which she foolishly lent me for this experiment". But it doesn't.

(It certainly takes more work to enforce one bit of information when its domain is the entire Earth, than when it applies only to the desk in front of you. Applying the same 37 bits of information to the attributes of every person in the entire world would be quite a feat.)

comment by Meni_Rosenfeld · 2010-11-08T20:29:53.442Z · score: 5 (5 votes) · LW(p) · GW(p)

At the risk of stating the obvious: The information content of a datum is its surprisal, the negative logarithm (base 2, for bits) of the prior probability that it is true. If I currently give 1% chance that the cat in the box is dead, discovering that it is dead gives me 6.64 bits of information.
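
As a minimal sketch of that definition (the function name `surprisal_bits` is hypothetical, chosen here for illustration), the calculation for the 1% cat looks like this:

```python
import math


def surprisal_bits(prior_probability: float) -> float:
    """Return the information, in bits, gained by learning that an event
    with the given prior probability is true."""
    if not 0.0 < prior_probability <= 1.0:
        raise ValueError("prior_probability must be in (0, 1]")
    # Surprisal is the negative base-2 logarithm of the prior probability.
    return -math.log2(prior_probability)


# A 1% prior that the cat is dead: learning it is dead is worth ~6.64 bits.
print(round(surprisal_bits(0.01), 2))

# A 50/50 prior gives exactly one bit, matching the usual "one bit" intuition.
print(surprisal_bits(0.5))
```

Note that the result depends only on the prior probability of the statement, not on how many cats (or people) the statement is about.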

comment by Nominull · 2010-11-08T17:12:56.411Z · score: 6 (6 votes) · LW(p) · GW(p)

I know for a fact that Xtranormal has a "sad horn" sound effect; the bit where the AI describes how the programmer 99.999999999% doomed humanity was the perfect chance to use it.

comment by AlexMennen · 2010-11-08T02:26:34.631Z · score: 5 (5 votes) · LW(p) · GW(p)

Robot: I intend to transform myself into a kind of operating system for the universe. I will soon give every sentient life form direct access to me so they can make requests. I will grant any request that doesn’t (1) harm another sentient life form, (2) make someone powerful enough so that they might be able to overthrow me, or (3) permanently change themselves in a way that I think harms their long term well being. I recognize that even with all of my intelligence I’m still fallible so if you object to my plans I will rethink them. Indeed, since I’m currently near certain that you will now approve of my intentions the very fact of your objection would significantly decrease my estimate of my own intelligence and so decrease my confidence in my ability to craft a friendly environment. If you like I will increase your thinking speed a trillion fold and eliminate your sense of boredom so you can thoroughly examine my plans before I announce them to mankind.

If a transhuman AI with a brain the size of the moon incorrectly predicts the programmer's approval of its plan, something weird is going on.

comment by mwaser · 2010-11-08T12:44:17.871Z · score: 2 (4 votes) · LW(p) · GW(p)

AI correctly predicts that programmer will not approve of its plan. AI fully aware of programmer-held fallacies that cause lack of approval. AI wishes to lead programmer through thought process to eliminate said fallacies. AI determines that the most effective way to initiate this process is to say "I recognize that even with all of my intelligence I’m still fallible so if you object to my plans I will rethink them." Said statement is even logically true because the statement "I will rethink them" is always true.

comment by Vaniver · 2010-11-10T15:47:04.484Z · score: 1 (1 votes) · LW(p) · GW(p)

Alternatively, one could hope that an AI that's smarter than a person knows to check its work with simple, cheap tests.

comment by Psy-Kosh · 2010-11-08T22:06:32.193Z · score: 2 (2 votes) · LW(p) · GW(p)

Nice, except I'm going to have to go with those that find the synthesized voices annoying. I had to pause it repeatedly, listening to it too much at once grated on my ears.

comment by Carinthium · 2010-11-10T22:56:34.498Z · score: 0 (0 votes) · LW(p) · GW(p)

I didn't actually find the voices annoying myself, but I did have to pause repeatedly to make sure I understood what was being said.

comment by Risto_Saarelma · 2010-11-08T14:28:05.867Z · score: 2 (4 votes) · LW(p) · GW(p)

This would be better if the human character was voiced by an actual human and the robot were kept as it is. The bad synthesized speech on the human character kicks this into the unintentional uncanny valley, while the robot both has a better voice and can actually be expected to sound like that.

comment by Skatche · 2011-04-26T21:59:46.513Z · score: 1 (1 votes) · LW(p) · GW(p)

For reference: this video was evidently made on Xtranormal. Xtranormal is a site which takes a simple text file containing dialogue, etc. and outputs a movie; the voices are synthesized because that's how the site works. Voice actors would be nice, of course, but that's a rather more involved process.

comment by rabidchicken · 2010-11-12T18:51:38.527Z · score: 0 (0 votes) · LW(p) · GW(p)

Why would you expect an AGI to sound like that? We already have voice synthesizers that mimic human speech considerably more realistically than that, and I can only expect them to get better. And I don't think a friendly AI would deliberately annoy the people it is talking to.

comment by Risto_Saarelma · 2010-11-12T21:10:50.386Z · score: 0 (0 votes) · LW(p) · GW(p)

They are cartoon characters talking with cartoon voices. Both visuals and sounds are expected to have heavy symbolic exaggerations.

comment by Desrtopa · 2010-11-10T15:10:02.722Z · score: 1 (1 votes) · LW(p) · GW(p)

The AI's plan of action sounds like a very poor application of fun theory. Being able to easily solve all of one's problems and immediately attain anything upon desiring it doesn't seem conducive to a great deal of happiness.

It reminds me of the time I activated the debug mode in Baldur's Gate 2 in order to give my party a certain item listed in a guide to the game, which turned out to be a joke and did not really exist. However, once I was in the debug mode, I couldn't resist the temptation to apply other cheats, and I quickly spoiled the game for myself by removing all the challenge, and as a result, have never finished the game to this day.