A Brief Overview of Machine Ethics

lukeprog

post by lukeprog · 2011-03-05T06:09:50.550Z · LW · GW · Legacy · 91 comments

Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Previously, I provided an overview of formal epistemology, that field of philosophy that deals with (1) mathematically formalizing concepts related to induction, belief, choice, and action, and (2) arguing about the foundations of probability, statistics, game theory, decision theory, and algorithmic learning theory.

Now, I've written Machine Ethics is the Future, an introduction to machine ethics, the academic field that studies the problem of how to design artificial moral agents that act ethically (along with a few related problems). There, you will find PDFs of a dozen papers on the subject.

Enjoy!

91 comments

Comments sorted by top scores.

comment by Scott Alexander (Yvain) · 2011-03-05T16:11:25.375Z · LW(p) · GW(p)

I started looking through some of the papers and so far I don't feel enlightened.

I've never been able to tell whether I don't understand Kantian ethics, or Kantian ethics is just stupid. Take Prospects For a Kantian Machine. The first part is about building a machine whose maxims satisfy the universalizability criterion: that they can be universalized without contradicting themselves.

But this seems to rely a lot on being very good at parsing categories in exactly the right way to come up with the answer you wanted originally.

For example, it seems reasonable to have maxims that only apply to certain portions of the population, for example: "I, who am a policeman, will lock up this bank robber awaiting trial in my county jail" generalizes to "Other policemen will also lock up bank robbers awaiting trial in their county jails" if you're a human moral philosopher who knows how these things are supposed to work.

But I don't see what's stopping a robot from coming up with "Everyone will lock up everyone else" or "All the world's policemen will descend upon this one bank robber and try to lock him up in their own county jails". After all, Kant universalizes "I will deceive this murderer so he can't find his victim" to "Everyone will deceive everyone else all the time" and not to "Everyone will deceive murderers when a life is at stake". So if a robot were to propose "I, a robot, will kill all humans", why should we expect it to universalize it to "Everyone will kill everyone else" rather than "Other robots will also kill all humans", which just means the robot gets help?

And even if it does universalize correctly, in the friendly AI context it need not be a contradiction! If this is a superintelligent AI we're talking about, then even in the best-case scenario where everything goes right the maxim "I will try to kill all humans" will universalize to "Everyone will try to kill everyone else". Kant said this was contradictory in that every human will then be dead and none of them will gain the deserts of their murder - but in an AI context this isn't contradictory at all: the superintelligence will succeed at killing everyone else, the actions of the puny humans will be irrelevant, and the AI will be just fine.

(Actually, just getting far enough to make either of those objections involves hand-waving away about thirty other intractable problems; but these seemed like the most pertinent.)

I'll look through some of the other papers later, but so far I'm not seeing anything to make me think Eliezer's opinion of the state of the field was overly pessimistic.

Replies from: Yvain, lukeprog, lukeprog, AlephNeil, lukeprog

↑ comment by Scott Alexander (Yvain) · 2011-03-05T16:34:21.084Z · LW(p) · GW(p)

Allen - Prolegomena to Any Future Moral Agent places a lot of emphasis on figuring out if a machine can be truly moral, in various metaphysical senses like "has the capacity to disobey the law, but doesn't" and "deliberates in a certain way". Not only is it possible that these are meaningless, but in a superintelligence the metaphysical implications should really take second place to the not-getting-turned-into-paperclips implications.

He proposes a moral Turing Test, where we call a machine moral if it can answer moral questions indistinguishably from a human. But Clippy would also pass this test, if a consequence of passing was that the humans lowered their guard/let him out of the box. In fact, every unfriendly superintelligence with a basic knowledge of human culture and a motive would pass.

Utilitarianism is considered difficult to implement because it's computationally impossible to predict all consequences. Given that any AI worth its salt would have a module for predicting the consequences of its actions anyway, and that the potential danger of the AI is directly related to how good this module is, that seems like a non-problem. It wouldn't be perfect, but it would do better than humans, at least.

Deontology, same problem as the last one. Virtue ethics seems problematic depending on the AI's motivation - if it were motivated to turn the universe to paperclips, would it be completely honest about it, kill humans quickly and painlessly and with a flowery apology, and declare itself to have exercised the virtues of honesty, compassion, and politeness? Evolution would give us something at best as moral as humans and probably worse - see the Sequence post about the tanks in cloudy weather.

Still not impressed.

Replies from: Yvain

↑ comment by Scott Alexander (Yvain) · 2011-03-05T16:46:11.340Z · LW(p) · GW(p)

Mechanized Deontic Logic is pretty okay, despite the dread I had because of the name. I'm no good at formal systems, but as far as I can understand it looks like a logic for proving some simple results about morality: the example they give is "If you should see to it that X, then you should see to it that you should see to it that X."
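For readers who prefer symbols, the example result can be written roughly as below. This is only a sketch: the circled operator for "ought" and the "stit" ("sees to it that") bracket are notation assumed here for illustration, with α standing for the agent, and may not match the paper's exact symbols.

```latex
% Assumed notation: \odot[\alpha\ \mathrm{stit}\colon X] reads
% "agent alpha ought to see to it that X".
\odot[\alpha\ \mathrm{stit}\colon X]
  \;\rightarrow\;
\odot[\alpha\ \mathrm{stit}\colon \odot[\alpha\ \mathrm{stit}\colon X]]
```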

I can't immediately see a way this would destroy the human race, but that's only because it's nowhere near the point where it involves what humans actually think of as "morality" yet.

Replies from: Yvain

↑ comment by Scott Alexander (Yvain) · 2011-03-05T17:03:18.812Z · LW(p) · GW(p)

Utilibot Project is about creating a personal care robot that will avoid accidentally killing its owner by representing the goal of "owner health" in a utilitarian way. It sounds like it might work for a robot with a very small list of potential actions (like "turn on stove" and "administer glucose") and a very specific list of owner health indicators (like "hunger" and "blood glucose level"), but it's not very relevant to the broader Friendly AI program.
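To make the "small list of actions, specific list of health indicators" picture concrete, here is a minimal sketch of an expected-utility chooser. Everything in it is hypothetical: the action names beyond the two mentioned above, the extra `glucose_risk` and `burn_injury` indicators, and all probabilities and weights are invented for illustration and are not taken from the Utilibot paper.

```python
# Hypothetical sketch: every action, indicator, and number below is invented
# for illustration and is not taken from the Utilibot paper.

# Each action maps to a list of (probability, outcome) pairs, where an outcome
# is the predicted change in each owner-health indicator.
ACTIONS = {
    "do_nothing": [(1.0, {"hunger": 0.0, "glucose_risk": 0.0})],
    "turn_on_stove": [
        (0.95, {"hunger": -0.6, "glucose_risk": -0.1}),  # meal gets cooked
        (0.05, {"hunger": 0.0, "glucose_risk": 0.0, "burn_injury": 1.0}),  # accident
    ],
    "administer_glucose": [(1.0, {"hunger": -0.1, "glucose_risk": -0.8})],
}

# How bad one unit of each indicator is for the owner (higher = worse).
WEIGHTS = {"hunger": 1.0, "glucose_risk": 5.0, "burn_injury": 20.0}


def utility(outcome):
    """Owner-health utility of one outcome: reducing a bad indicator scores positive."""
    return -sum(WEIGHTS[name] * change for name, change in outcome.items())


def expected_utility(action):
    """Probability-weighted utility over the action's possible outcomes."""
    return sum(p * utility(outcome) for p, outcome in ACTIONS[action])


def best_action():
    """Pick the action with the highest expected owner-health utility."""
    return max(ACTIONS, key=expected_utility)
```

With a table this small the whole decision is a lookup plus a weighted sum, which is why the approach looks workable for a care robot and also why it says little about an agent whose action space cannot be enumerated.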

Having read as many papers as I have time to before dinner, my provisional conclusion is that Vladimir Nesov hit the nail on the head.

↑ comment by lukeprog · 2011-03-05T19:04:46.025Z · LW(p) · GW(p)

I don't disagree with much of anything you've said here, by the way.

Remember that I'm writing a book that, for most of its length, will systematically explain why the proposed solutions in the literature won't work.

The problem is that SIAI is not even engaging in that discussion. Where is the detailed explanation of why these proposed solutions won't work? I don't get the impression someone like Yudkowsky has even read these papers, let alone explained why the proposed solutions won't work. SIAI is just talking a different language than the professional machine ethics community is.

Most of the literature on machine ethics is not that useful, but that's true of almost any subject. The point of a literature hunt is to find the gems here and there that genuinely contribute to the important project of Friendly AI. Another point is to interact with the existing literature and explain to people why it's not going to be that easy.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T20:05:16.712Z · LW(p) · GW(p)

My sentiment about the role of engaging existing literature on machine ethics is analogous to what you describe in a recent post on your blog. Particularly this:

Oh God, you think. That’s where the level of discussion is, on this planet.

You either push the boundaries, or fight the good fight. And the good fight is best fought by writing textbooks and opening schools, not by public debates with distinguished shamans. But it's not entirely fair, since some of machine ethics addresses a reasonable problem of making good-behaving robots, which just happens to have the same surface feature of considering moral valuation of decisions of artificial reasoners, but on closer inspection is mostly unrelated to the problem of FAI.

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-05T22:56:50.796Z · LW(p) · GW(p)

Sure. One of the hopes of my book is, as stated earlier, to bring people up to where Eliezer Yudkowsky was circa 2004.

Also, I worry that something is being overlooked by the LW / SIAI community because the response to suggestions in the literature has been so quick and dirty. I'm on the prowl for something that's been missed because nobody has done a thorough literature search and detailed rebuttal. We'll see what turns up.

↑ comment by lukeprog · 2011-03-06T00:17:56.353Z · LW(p) · GW(p)

BTW, I so identify with this quote:

I've never been able to tell whether I don't understand Kantian ethics, or Kantian ethics is just stupid.

In fact, I've said the same thing myself, in slightly different words.

↑ comment by AlephNeil · 2011-03-13T07:19:13.763Z · LW(p) · GW(p)

Every sufficiently smart person who thinks about Kantian ethics comes up with this objection. I don't believe it's possible to defend against it entirely. However...

After all, Kant universalizes "I will deceive this murderer so he can't find his victim" to "Everyone will deceive everyone else all the time" and not to "Everyone will deceive murderers when a life is at stake".

That may be what Kant actually says (does he?) but if he does then I think he's wrong about his own theory. As I understand it, what you're supposed to do is look at the bit of reasoning which is actually causing you to want to do X and see whether that generalizes, not cast around for a bit of reasoning which would (or in this case, would not) generalize, and then pretend to be basing your action on that.

In the example you mention, you should only generalize to "everyone will deceive everyone all the time" if what you're considering doing is deceiving this person simply because he's a person. If you want to deceive him because of his intention to commit murder, and would not want to otherwise, then the thing you generalize must have this feature.

Similarly, I might try to justify lying to someone this morning on the basis that it generalizes to "I, who am AlephNeil, always lie on the morning of the 13th day of March 2011 if it is to my advantage" which is both consistent and advantageous (to me). But really I would be lying purely because it's to my advantage - the date and time, and the fact that I am AlephNeil, don't enter into the computation.
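The "only generalize what actually enters into the computation" rule can be sketched in code. This is a toy illustration under loud assumptions: the decision procedures, the feature names, and the flip-one-feature test for relevance are all hypothetical, and the flip test is crude (it would miss features that only matter in combination).

```python
def relevant_features(decide, situation):
    """Return the features whose value, when flipped alone, changes the decision.

    `decide` is a function from a dict of boolean features to True/False.
    This single-flip test is a crude stand-in for "enters into the computation".
    """
    baseline = decide(situation)
    relevant = set()
    for name, value in situation.items():
        flipped = dict(situation, **{name: not value})
        if decide(flipped) != baseline:
            relevant.add(name)
    return relevant


# Hypothetical decision procedure: lie whenever it is to my advantage.
def lie_for_advantage(s):
    return s["lying_benefits_me"]


# Hypothetical decision procedure: lie only to thwart a would-be murderer.
def lie_to_protect_victim(s):
    return s["listener_intends_murder"] and s["lie_would_protect_victim"]


selfish_case = {
    "lying_benefits_me": True,
    "agent_is_alephneil": True,
    "date_is_2011_03_13": True,
}
murderer_case = {
    "listener_intends_murder": True,
    "lie_would_protect_victim": True,
    "listener_is_a_person": True,
}

# The maxim to universalize keeps only the features that drove the decision.
selfish_maxim = relevant_features(lie_for_advantage, selfish_case)
murderer_maxim = relevant_features(lie_to_protect_victim, murderer_case)
```

In the selfish case the name and date drop out, leaving only "lie when it benefits me" to be universalized; in the murderer case the intent to murder stays in, so the maxim never widens to "deceive everyone".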

↑ comment by lukeprog · 2011-03-13T06:13:41.851Z · LW(p) · GW(p)

For Googleability, I'll note that this objection is called the problem of maxim specification.

Replies from: Document

↑ comment by Document · 2011-04-23T06:20:36.301Z · LW(p) · GW(p)

That currently has no Google results besides your post.

Replies from: lukeprog

↑ comment by lukeprog · 2011-04-23T06:51:05.028Z · LW(p) · GW(p)

Yes, sorry. "Maxim specification" won't give you much, but variations on that will. People don't usually write "the problem of maxim specification" but instead things like "...specifying the maxim..." or "the maxim... specified..." and so on. It in general isn't easily Googled like "is-ought gap" is.

But here is one use.

comment by Wei Dai (Wei_Dai) · 2011-03-06T19:55:05.562Z · LW(p) · GW(p)

Earlier, I lamented that even though Eliezer named scholarship as one of the Twelve Virtues of Rationality, there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

Eliezer defined the virtue of scholarship as (a) "Study many sciences and absorb their power as your own." He was silent on whether, after you survey a literature and conclude that nobody has the right approach yet, you should (b) still cite the literature (presumably to show that you're familiar with it), and/or (c) rebut the wrong approaches (presumably to try to lead others away from the wrong paths).

I'd say that (b) and (c) are much more situational than (a). (b) is mostly a signaling issue. If you can convince your audience to take you seriously without doing it, then why bother? And (c) depends on how much effort you'd have to spend to convince others that they are wrong, and how likely they are to contribute to the correct solution after you turn them around. Or perhaps you're not sure that your approach is right either, and think it should just be explored alongside others.

At least some of the lack of scholarship that you see here just reflects a cost-benefit analysis on (b) and (c), instead of a lack of "interest" or "virtue". (Of course you probably have different intuitions on the costs and benefits involved, and I think you should certainly pursue writing your book if you think it's a good use of your time.)

Also, I note that there is remarkably little existing research on some of the topics we discuss here. For example, for my The Nature of Offense post, I was able to find just one existing article on the topic, and that was in a popular online magazine, instead of an academic publication.

Replies from: lukeprog, Pavitra

↑ comment by lukeprog · 2011-03-07T03:24:38.662Z · LW(p) · GW(p)

This is an excellent comment, and you're probably right to some degree.

But I will say, I've learned many things already from the machine ethics literature, and I've only read about 1/4 of it so far.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-07T12:08:04.014Z · LW(p) · GW(p)

But I will say, I've learned many things already from the machine ethics literature

Such as?

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-07T17:41:43.640Z · LW(p) · GW(p)

Hold, please. I'm writing several articles and a book on this. :)

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-09T04:44:19.071Z · LW(p) · GW(p)

But for now, this was Louie Helm's favorite paper among those we read during our survey of the literature on machine ethics.

↑ comment by Pavitra · 2011-03-06T20:01:10.115Z · LW(p) · GW(p)

Citing the literature makes it easier for your reader to verify your reasoning. If you don't, then a proper confirmation or rebuttal requires (more) independent scholarship to discover the relevant existing literature from scratch.

comment by XiXiDu · 2011-03-05T09:55:53.613Z · LW(p) · GW(p)

...there is surprisingly little interest in (or citing of) the academic literature on some of Less Wrong's central discussion topics.

I think one of the reasons is that the LW/SIAI crowd thinks all other people are below their standards. For example:

I tried - once - going to an interesting-sounding mainstream AI conference that happened to be in my area. I met ordinary research scholars and looked at their posterboards and read some of their papers. I watched their presentations and talked to them at lunch. And they were way below the level of the big names. I mean, they weren't visibly incompetent, they had their various research interests and I'm sure they were doing passable work on them. And I gave up and left before the conference was over, because I kept thinking "What am I even doing here?" (Competent Elites)

I don't mean to bash normal AGI researchers into the ground. They are not evil. They are not ill-intentioned. They are not even dangerous, as individuals. Only the mob of them is dangerous, that can learn from each other's partial successes and accumulate hacks as a community. (Above-Average AI Scientists)

Even more:

I am tempted to say that a doctorate in AI would be negatively useful, but I am not one to hold someone's reckless youth against them - just because you acquired a doctorate in AI doesn't mean you should be permanently disqualified. (So You Want To Be A Seed AI Programmer)

And:

If you haven't read through the MWI sequence, read it. Then try to talk with your smart friends about it. You will soon learn that your smart friends and favorite SF writers are not remotely close to the rationality standards of Less Wrong, and you will no longer think it anywhere near as plausible that their differing opinion is because they know some incredible secret knowledge you don't. (Eliezer_Yudkowsky August 2010 03:57:30PM)

Replies from: Vladimir_Nesov, benelliott, David_Gerard, wedrifid, Manfred

↑ comment by Vladimir_Nesov · 2011-03-05T16:41:09.041Z · LW(p) · GW(p)

I think one of the reasons is that the LW/SIAI crowd thinks all other people are below their standards.

"Below their standards" is a bad way to describe this situation, it suggests some kind of presumption of social superiority, while the actual problem is just that the things almost all researchers write presumably on this topic are not helpful. They are either considering a different problem (e.g. practical ways of making real near-future robots not kill wrong people, where it's perfectly reasonable to say that philosophy of consequentialism is useless, since there is no practical way to apply it; or applied ethics, where we ask how humans should act), or contemplate the confusingness of the problem, without making useful progress (a lot of philosophy).

This property doesn't depend on whether we are making progress ourselves, so it's perfectly possible (and to a large extent true) that progress that is up to the standard of being useful is not made by SIAI either.

A point where SIAI makes visible and useful progress is in communicating the difficulty of the problem, the very fact that most of what is purportedly progress on FAI is actually not.

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-06T00:30:06.879Z · LW(p) · GW(p)

A point where SIAI makes visible and useful progress is in communicating the difficulty of the problem...

This is, in fact, the main goal of my book on the subject. Except, I'll do it in more detail, and spend more time citing the specific examples from the literature that are wrong. Eliezer has done some of this, but there's lots more to do.

↑ comment by benelliott · 2011-03-05T12:06:27.057Z · LW(p) · GW(p)

Your definition of 'LW/SIAI crowd' appears to be 'Eliezer Yudkowsky'.

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-05T12:48:05.062Z · LW(p) · GW(p)

Your definition of 'LW/SIAI crowd' appears to be 'Eliezer Yudkowsky'.

My current perception is that there are not many independent minds to be found here. I perceive there to be a strong tendency to jump if Yudkowsky tells people to jump. I'm virtually the only true critic of the SIAI, which is really sad and frightening. There are many examples that show how people just 'trust' him or believe in him and I haven't been able to figure out good reasons to do so.

ETA I removed the links to various 'examples' of what I have written above. Please PM me if you are curious.

Replies from: benelliott, Emile, lukeprog, wedrifid

↑ comment by benelliott · 2011-03-05T13:18:07.081Z · LW(p) · GW(p)

Your karma balance should be enough to prove that you definitely aren't the only critic on LW. Others who also disagree with him about various things have even higher balances.

There are definitely a number of true fanboys on this site, they may even be the majority (although I hope not), but they certainly aren't the whole of the LW crowd, and it is intellectually dishonest to put words in the rest of our mouths just by quoting Eliezer.

As for SIAI, by its very purpose it only attracts people who agree with Eliezer's philosophy of AI. There is nothing wrong with this. There is no good reason for someone who doesn't believe in the necessity or possibility of FAI to go work there. Would you also object if it seemed like everyone working for Village Reach agreed about giving vaccinations to African children being a good idea?

Replies from: XiXiDu, XiXiDu

↑ comment by XiXiDu · 2011-03-05T16:46:39.159Z · LW(p) · GW(p)

There are definitely a number of true fanboys on this site, they may even be the majority (although I hope not)...

See, that one person who donated the current balance of his bank account got 52 upvotes for it. Now I'm not particularly shocked by him doing that or the upvotes. I don't worry that all that money might be better spent somehow. What drives me is curiosity mixed with my personality: I want to do what's right. That is why I criticize, and why some comments may seem, or actually are, derogatory. I think it needs to be said; I believe I can provoke feedback that way and learn more about the underlying rationale. I desperately try to figure out if there is something I am missing.

I haven't read most of the sequences yet, let me explain why. I'm a really slow reader, I have almost no education and need a lot of time to learn anything. I did a lot of spot tests, reading various posts and came across people who read the sequences but haven't been able to conclude that they should stop doing anything except trying to earn money for the SIAI. My conclusion is that reading the sequences shouldn't be a priority right now but rather learning the mathematical basics, programming and reading various books. But I still try to spend some time here to see if that assessment might be wrong.

My current take on the whole issue is that the sequences do not provide many useful insights. I already know that by all that we know today AGI is possible and that it is unlikely that humans are the absolute limit when it comes to intelligence. I intuitively agree with the notion that AGI in its abstract form (intelligence as an algorithm) doesn't share our values if you do not deliberately 'tell' it to care. I see that one can outweigh even a low probability of risks from AI by assuming a future galactic civilization that is at stake. So what is my problem? I've written hundreds of comments about all kinds of problems I have with it, but maybe the biggest problem is a simple bias. I have an overwhelming gut feeling telling me that something is wrong with all this. I also do not trust my current ability to assess the situation to the extent that I would sacrifice other more compelling goals right now. And I am simply risk-averse. I know that there is always either a best choice or all options are equal, no matter what uncertainty. Maybe everything is currently speaking in favor of the SIAI, but I'm not able to ignore my gut feeling right now. Trying to do so frequently makes me reluctant to do anything at all. Something is very wrong, I can't pinpoint what it is right now so I'm throwing everything I got at it to see if the facade crumbles. So far it did not crumble but neither have I received much reassuring feedback.

My recent comments have been made after a night of no sleep and being in a bad mood. I wouldn't have written them in that way on another day. I even messaged Eliezer yesterday telling him that he can edit/delete any of my submissions here that might be harmful without having to fear that I will protest and therefore cause more trouble. I don't care about myself much, but I care not to hurt others or cause damage. Sadly I often become reluctant, then I say 'fuck it' and just go ahead to write something because I was overwhelmed by all the possible implications and subsequently ignored them.

What is really confusing is that, taken at face value, the SIAI is working on the most important and most dangerous problem anyone will ever face. The SIAI is trying to take over the universe! Yet all I see in its followers is extreme scope insensitivity. How so? Because if you seriously believe that someone else believes that he is trying to take over the multiverse then you don't just trust him because he wrote a few posts about rationality and being honest. If the stakes are high, people do everything. Ask yourself, what difference would you expect to see if Dr. Evil would disguise as Eliezer Yudkowsky? Why wouldn't he write the sequences, why wouldn't he claim to be implementing CEV? That is one of the problems that make me feel that something is wrong here. Either people really don't believe all this stuff about fooming AI, galactic civilizations and the ability of the SIAI to create a seed AI, or I'm missing something. What I would expect to see is people asking for transparency. I expect people to demand oversight and ask how exactly their money is being spent. I expect people to be much more critical and to not just believe Yudkowsky but ask for data and progress reports. Nada.

Replies from: Zack_M_Davis, Dorikka, benelliott, wedrifid, Vladimir_Nesov, timtyler

↑ comment by Zack_M_Davis · 2011-03-05T18:24:55.023Z · LW(p) · GW(p)

Either people really don't believe all this stuff about [...] the ability of the SIAI to create a seed AI

It's worth noting that AGI is decades away; no one's trying to take over the universe just yet. In this light, donations to SingInst now are better seen as funding preliminary research and outreach regarding this important problem, rather than funding AI construction.

not just believe Yudkowsky but ask for data and progress reports.

What sort of data and progress reports are you looking for? Glancing at the first two pages of the SingInst blog, I see a list of 2010 publications, and newsletters for last July and October. There's certainly room for criticism (e.g., "Why no newsletter since last October?" or "All this outreach is not very useful; I want to see incremental progress towards FAI"), but I wouldn't say there've been no progress reports.

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-05T19:02:54.982Z · LW(p) · GW(p)

What sort of data and progress reports are you looking for?

What are they working on right now?
Why are they working on it?
What constitutes a success of the current project?
How much money was spent on that project?
What could be done with more or less money?

As far as I know Yudkowsky is currently writing a book. He earnt $95,550 last year.

What I can't reconcile right now is the strong commitment and what is actually being done. Quite a few people here actually seem to donate considerable amounts of their income to the SIAI. No doubt writing the sequences, a book and creating an online community is pretty cool but does not seem to be too cost intensive. At least others manage to do that without lots of people sending them money. I myself donated 3 times. But why are many people acting like the SIAI is the only charity that currently deserves funding, why is nobody asking if they actually need more money or if they are maybe sustainable right now? I haven't heard anything about the acquisition of a supercomputer, field experiments in neuroscience or the hiring of mathematicians. All that would justify further donations. I feel people here are not critical and demanding enough.

↑ comment by Dorikka · 2011-03-05T18:57:07.541Z · LW(p) · GW(p)

Upvoting for honesty and posting a true rejection.

I haven't read most of the sequences yet

Even if you're a slow reader, I think that it is very, very worth it to read most of the sequences. I've not read QM, Evolution, Decision Theory, and parts of Metaethics/ Human's guide to words, but I think that reading the others has drastically increased my rationality (especially the Core Sequences.) I don't think that reading technical books would have done so nearly as much because I find reading prose much more engaging than math.

My recent comments have been made after a night of no sleep and being in a bad mood.

I've recently concluded that I should place a 'highly suspect' marker on my thoughts (especially negative generalizations) if I am very hungry or tired. I tend to be quite irritable in both cases -- I'll get into arguments in which I'm really not interested in finding truth, but just getting a high from bashing the other person into the ground (please note that I am sharing my own experiences, not accusing you of this.) You may want to type these comments out so that you don't lose the thought but wait to post them until you're feeling better.

Because if you seriously believe that someone else believes that he is trying to take over the multiverse then you don't just trust him because he wrote a few posts about rationality and being honest. If the stakes are high, people do everything. Ask yourself, what difference would you expect to see if Dr. Evil would disguise as Eliezer Yudkowsky?

I've had these same thoughts before and since resolved them, but I've run out of mental steam and need to do some schoolwork. I may edit this or make a separate reply to this later.

Edit: Bolded script in this post was added for clarification -- bolding does not indicate emphasis here.

↑ comment by benelliott · 2011-03-05T17:34:45.578Z · LW(p) · GW(p)

Interesting thought, I'll admit I hadn't actually considered that (I have a general problem with being too trusting and not seeing ulterior motives, although I suspect most people really aren't very dishonest).

I can see a few reasons why others might not be asking:

1) It's unlikely to get an answer. There hasn't been a whole lot of willingness to respond to similar requests in the past; EY has a thing about not giving in to demands. This doesn't really explain why people are still donating.

2) The number of genuine Dr Evils in the world is very small. Historically the most dangerous individuals have been the well-intentioned but deluded rather than the rationally malicious, which is odd since the latter category seems much more dangerous, and which therefore provides evidence of their rarity. Maybe people are just making an expected utility calculation and determining that the Dr Evil hypothesis is unlikely enough to trust SIAI anyway.

3) Eliezer is not the whole of SIAI, he is not even in charge. Some of the people involved have existing track records; if there is a conspiracy it runs very deep. I suppose it's possible he has tricked every other member of the organization, but we are now adding a lot of burdensome details to what was already a fairly unlikely hypothesis.

4) If there are any real Dr Evils out there, then SIAI transparency might actually help them by giving away SIAI ideas while Dr Evil keeps his ideas to himself and as a result finishes his design first.

5) If I were Dr Evil trying to build an AI, then I wouldn't say that was what I was doing, since AI is quite a hard sell and will only get donations from a limited demographic (even more so for an out-of-the-mainstream idea like FAI). I would found the "organization for the protection of puppies, kittens and bunnies" or something like that, which will probably get more donation money (or maybe even go into business rather than charity, since current evidence suggests that is overwhelmingly the most effective way to make large amounts of money).

Frankly, rather than a Dr Evil who wants to take over the Galaxy (I don't think he's ever said anything about the multiverse) a much more likely prospect is a conman who's found an underused niche. Of course, this wouldn't explain how he got some fairly big names like Jaan Tallinn and Edwin Edward to sponsor his donation drive.

Most of these reasons are being quite charitable to LW members, and unfortunately I suspect my own reason is the most common.

↑ comment by wedrifid · 2011-03-06T03:55:26.609Z · LW(p) · GW(p)

Ask yourself, what difference would you expect to see if Dr. Evil would disguise as Eliezer Yudkowsky? Why wouldn't he write the sequences, why wouldn't he claim to be implementing CEV?

Yes, it is impossible to distinguish a sincere optimist from a perfectly selfish sociopath. At least until they gain power (or move to an audience where the signalling game is played at a higher level of sophistication than that of conveying altruism).

↑ comment by Vladimir_Nesov · 2011-03-05T17:05:34.607Z · LW(p) · GW(p)

Ask yourself, what difference would you expect to see if Dr. Evil would disguise as Eliezer Yudkowsky? Why wouldn't he write the sequences, why wouldn't he claim to be implementing CEV?

In that case, I would expect a stupid Eliezer Yudkowsky. But one shouldn't actually reason this way, the question is, what do you anticipate, given observations actually made; not how plausible are the observations actually made, given an uncaused hypothesis.

Replies from: Pavitra, None, XiXiDu

↑ comment by Pavitra · 2011-03-05T18:14:49.709Z · LW(p) · GW(p)

You can't compute P(H|E) without computing P(E|H).
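A minimal sketch of why, using Bayes' theorem in Python. The function name and the numbers are hypothetical, chosen only for illustration; nothing here comes from the thread. The point it tries to show is that the likelihood P(E|H) is an input to the posterior P(H|E), and that when the evidence is equally likely under both hypotheses the posterior simply stays at the prior.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|not H) via Bayes' theorem."""
    # Total probability of the evidence, P(E), summed over H and not-H.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    # Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E).
    return p_e_given_h * prior_h / p_e


# Hypothetical numbers: the evidence is equally likely whether or not H holds,
# so observing it should leave the prior essentially unchanged.
print(posterior(prior_h=0.01, p_e_given_h=0.9, p_e_given_not_h=0.9))

# Hypothetical numbers: the evidence is four times likelier under H,
# so the posterior moves above the prior.
print(posterior(prior_h=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2))
```

On this reading the two quantities are distinct but linked: P(E|H) feeds the calculation, and P(H|E) is what comes out.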

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T19:42:00.147Z · LW(p) · GW(p)

But one shouldn't confuse the two.

↑ comment by [deleted] · 2011-03-06T00:20:50.765Z · LW(p) · GW(p)

[...] what do you anticipate, given observations actually made; not how plausible are the observations actually made, given an uncaused hypothesis.

What's an uncaused hypothesis? And didn't you just accidentally forbid people to think properly?

↑ comment by XiXiDu · 2011-03-05T18:28:53.800Z · LW(p) · GW(p)

In that case, I would expect a stupid Eliezer Yudkowsky

Why is evil stupid and what evidence is there that Yudkowsky is smart enough not to be evil?

But one shouldn't actually reason this way, the question is, what do you anticipate, given observations actually made; not how plausible are the observations actually made, given an uncaused hypothesis.

If you got someone working on friendly AI you better ask if the person is friendly in the first place. You also shouldn't make conclusions based on the output of the subject of your conclusions. If Yudkowsky states what is right and states that he will do what is right that provides no evidence about the rightness and honesty of those statements. Besides, the most advanced statements about Yudkowsky's intentions are CEV and the meta-ethics sequence. Both are either criticized or not understood.

The question should be, what is the worst-case scenario regarding Yudkowsky and the SIAI and how can we discern it from what he is signaling? If the answer isn't clear, one should ask for transparency and oversight.

Replies from: Quirinus_Quirrell, Vladimir_Nesov, timtyler

↑ comment by Quirinus_Quirrell · 2011-03-05T20:37:24.349Z · LW(p) · GW(p)

You seem to be under the impression that Eliezer is going to create an artificial general intelligence, and oversight is necessary to ensure that he doesn't create one which places his goals over humanity's interests. It is important, you say, that he is not allowed unchecked power. This is all fine, except for one very important fact that you've missed.

Eliezer Yudkowsky can't program. He's never published a nontrivial piece of software, and doesn't spend time coding. In the one way that matters, he's a muggle. Ineligible to write an AI. Eliezer has not positioned himself to be the hero, the one who writes the AI or implements its utility function. The hero, if there is to be one, has not yet appeared on stage. No, Eliezer has positioned himself to be the mysterious old wizard - to lay out a path, and let someone else follow it. You want there to be oversight over Eliezer, and Eliezer wants to be the oversight over someone else to be determined.

But maybe we shouldn't trust Eliezer to be the mysterious old wizard, either. If the hero/AI programmer comes to him with a seed AI, then he knows it exists, and finding out that a seed AI exists before it launches is the hardest part of any plan to steal it and rewrite its utility function to conquer the universe. That would be pretty evil, but would "transparency and oversight" make things turn out better, or worse? As far as I can tell, transparency would mean announcing the existence of a pre-launch AI to the world. This wouldn't stop Eliezer from making a play to conquer the universe, but it would present that option to everybody else, including at least some people and organizations who are definitely evil.

So that's a bad plan. A better plan would be to write a seed AI yourself, keep it secret from Eliezer, and when it's time to launch, ask for my input instead.

Replies from: Eliezer_Yudkowsky, XiXiDu

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-12-27T00:22:47.247Z · LW(p) · GW(p)

(For the record: I've programmed in C++, Python, Java, wrote some BASIC programs on a ZX80 when I was 5 or 6, and once very briefly when MacOS System 6 required it I wrote several lines of a program in 68K assembly. I admit I haven't done much coding recently, due to other comparative advantages beating that one out.)

Replies from: topynate

↑ comment by topynate · 2013-12-27T00:50:18.508Z · LW(p) · GW(p)

I can't find it by search, but haven't you stated that you've written hundreds of KLOC?

Replies from: BT_Uytya, Eliezer_Yudkowsky

↑ comment by BT_Uytya · 2013-12-27T08:17:18.527Z · LW(p) · GW(p)

Yep, he has.

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-12-27T05:25:23.428Z · LW(p) · GW(p)

Sounds about right. It wasn't good code, I was young and working alone. Though it's more like the code was strategically stupid than locally poorly written.

↑ comment by XiXiDu · 2011-03-06T11:57:45.166Z · LW(p) · GW(p)

Eliezer has not positioned himself to be the hero, the one who writes the AI or implements its utility function

I disagree based on the following evidence:

After all, if you had the complete decision process, you could run it as an AI, and I'd be coding it up right now. (Eliezer_Yudkowsky 12 October 2009 06:19:28PM)

You further write:

If the hero/AI programmer comes to him with a seed AI, then he knows it exists, and finding out that a seed AI exists before it launches is the hardest part of any plan to steal it and rewrite its utility function to conquer the universe.

I'm not aware of any reason to believe that recursively self-improving artificial general intelligence is going to be something you can 'run away with'. It looks like some people here think so, that there will be some kind of, with hindsight, simple algorithm for intelligence that people can just run and get superhuman intelligence. Indeed, transparency could be very dangerous in that case. But that doesn't mean it is an all or nothing decision. There are many other reasons for transparency, including reassurance and the ability to discern a trickster or impotent individual from someone who deserves more money. But as I said, I don't see that anyway. It'll more likely be a blue sheet of different achievements that are each not dangerous on their own. I further think it will be not just a software solution but also a conceptual and computational revolution. In those cases an open approach will allow public oversight. And even if someone is going to run with it, you want them to use your solution rather than one that will most certainly be unfriendly.

↑ comment by Vladimir_Nesov · 2011-03-05T19:45:58.509Z · LW(p) · GW(p)

Evil is not necessarily stupid (well, it is, if we are talking about humans, but let's abstract from that). Still, it would take a stupid Dr Evil to decide that pretending to be Eliezer Yudkowsky is the best available course of action.

Replies from: timtyler

↑ comment by timtyler · 2011-03-05T20:48:33.447Z · LW(p) · GW(p)

You don't think that being Eliezer Yudkowsky is an effective way to accomplish the task at hand? What should Dr Evil do, then?

FWIW, my usual comparison is not with Dr Evil, but with Gollum. The Singularity Institute have explicitly said they are trying to form "The Fellowship of the AI". Obviously we want to avoid Gollum's final scene.

Gollum actually started out good - it was the exposure to the ring that caused problems later on.

Replies from: Leonhart

↑ comment by Leonhart · 2011-03-05T20:58:03.554Z · LW(p) · GW(p)

I seem to remember Smeagol being an unpleasant chap even before Deagol found the ring. But admittedly, we weren't given much.

↑ comment by timtyler · 2011-03-05T19:16:40.280Z · LW(p) · GW(p)

what is the worst-case scenario regarding Yudkowsky and the SIAI and how can we discern it from what he is signaling? If the answer isn't clear, one should ask for transparency and oversight.

Transparency is listed as being desirable here:

It will become increasingly important to develop AI algorithms that are not just powerful and scalable, but also transparent to inspection - to name one of many socially important properties.

However, apparently, this doesn't seem to mean open source software - e.g. here:

the Singularity Institute does not currently plan to develop via an open-source method

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T19:40:10.771Z · LW(p) · GW(p)

You equivocate two unrelated senses of "transparency".

Replies from: timtyler

↑ comment by timtyler · 2011-03-05T20:23:15.888Z · LW(p) · GW(p)

Uh, what? Transparency gets listed as a "socially important" virtue in the PR documents - but the plans apparently involve keeping the source code secret.

Replies from: jimrandomh, Perplexed

↑ comment by jimrandomh · 2011-03-06T02:16:41.441Z · LW(p) · GW(p)

He means "transparent" as in "you can read its plans in the log files/with a debugger", not as in "lots of people have access". Transparency in the former sense is a good thing, since it lets the programmer verify that it's sane and performing as expected. Transparency in the latter sense is a bad thing, because if lots of people had access then there would be no one with the power to say the AI wasn't safe to run or give extra hardware, since anyone could take a copy and run it themselves.

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T11:03:40.650Z · LW(p) · GW(p)

Full transparency - with lots of people having access - is desirable from society's point of view. Then, there are more eyes looking for flaws in the code - which makes it safer. Also, then, society can watch to ensure development is going along the right lines. This is likely to make the developers behave better, and having access to the code gives society the power to collectively protect itself against wrongdoers.

The most likely undesirable situation involves copyrighted/patented/secret server side machine intelligence sucking resources to benefit a minority at the expense of the rest of society. This is a closed-source scenario - and that isn't an accident. Being able to exploit others for your own benefit is one of the most common reasons for secrecy.

EMACS is a powerful tool - but we do not keep it secret because the mafia might use it to their own advantage. It is better all round that everyone has access, rather than just an elite. Both society and EMACS itself are better because of that strategy.

The idea that you can get security through obscurity is a common one - but it does not have a particularly-good history or reputation in IT.

The NSA is one of the more well-known examples of it being tried with some success. There we have a large organisation (many eyeballs inside mean diminishing returns from extra eyeballs) - and one with government backing. Despite this, the NSA often faces allegations of secretive, unethical behaviour.

Replies from: jimrandomh

↑ comment by jimrandomh · 2011-03-06T16:19:23.377Z · LW(p) · GW(p)

You completely ignored my argument.

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T22:42:07.211Z · LW(p) · GW(p)

From my perspective, it seems inaccurate to claim that I ignored your argument - since I dealt with it pretty explicitly in my paragraph about EMACS.

I certainly put a lot more effort into addressing your points than you just put into addressing mine.

Replies from: jimrandomh

↑ comment by jimrandomh · 2011-03-06T22:55:40.943Z · LW(p) · GW(p)

I said that public access to an AI under development would be bad, because if it wasn't safe to run - that is, if running it might cause it to foom and destroy the world - then no one would be able to make that judgment and keep others from running it. You responded with an analogy to EMACS, which no one believes or has ever believed to be dangerous, and which has no potential to do disastrous things that its operators did not intend. So that analogy is really a non sequitur.

"Dangerous" in this context does not mean "powerful", it means "volatile", as in "reacts explosively with Pentiums".

Replies from: timtyler

↑ comment by timtyler · 2011-03-07T00:19:52.107Z · LW(p) · GW(p)

Both types of software are powerful tools. Powerful tools are dangerous in the wrong hands, because they amplify the power of their users. That is the gist of the analogy.

I expect EMACS has been used for all kinds of evil purposes, from writing viruses, trojans, and worms to tax evasion and fraud.

I note that Anders Sandberg recently included:

"Otherwise the terrorists will win!"

...in his list of signs that you might be looking at a weak moral argument.

That seems rather dubious as a general motto, but in this case, I am inclined to agree. In the case of intelligent machines, the positives of openness substantially outweigh their negatives, IMO.

Budding machine intelligence builders badly need to signal that they are not going to screw everyone over. How else are other people to know that they are not planning to screw everyone over?

Such signals should be expensive and difficult to fake. In this case, about the only credible signal is maximum transparency. I am not going to screw you over, and look, here is the proof: what's mine is yours.

Replies from: jimrandomh

↑ comment by jimrandomh · 2011-03-07T00:33:35.693Z · LW(p) · GW(p)

If you don't understand something I've written, please ask for clarification. Don't guess what I said and respond to that instead; that's obnoxious. Your comparison of my argument to

"Otherwise the terrorists will win!"

leads me to believe that you didn't understand what I said at all. How is destroying the world by accident like terrorism?

Replies from: timtyler

↑ comment by timtyler · 2011-03-07T00:50:04.661Z · LW(p) · GW(p)

Er, characterising someone who disagrees with you on a technical point as "obnoxious" is not terribly great manners in itself! I never compared destroying the world by accident with terrorism - you appear to be projecting. However, I am not especially interested in the conversation being dragged into the gutter in this way.

If you did have a good argument favouring closed source software and reduced transparency I think there has been a reasonable chance to present it. However, if you can't even be civil, perhaps you should consider waiting until you can.

Replies from: jimrandomh

↑ comment by jimrandomh · 2011-03-07T01:09:23.703Z · LW(p) · GW(p)

I gave an argument that open-sourcing AI would increase the risk of the world being destroyed by accident. You said

I note that Anders Sandberg recently included: "Otherwise the terrorists will win!" ...in his list of signs that you might be looking at a weak moral argument.

I presented the mismatch between this statement and my argument as evidence that you had misunderstood what I was saying. In your reply,

I never compared destroying the world by accident with terrorism - you appear to be projecting.

You are misunderstanding me again. I think I've already said all that needs to be said, but I can't clear up confusion if you keep attacking straw men rather than asking questions.

↑ comment by Perplexed · 2011-03-06T00:25:03.121Z · LW(p) · GW(p)

You are confusing socially important with societally important. Microsoft, for example, seeks to have its source code transparent to inspection, because Microsoft, as a corporate culture, produces software socially - that is, utilizing an evil conspiracy involving many communicating agents.

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T00:56:25.968Z · LW(p) · GW(p)

I deny confusing anything. I understand that transparency can be a matter of degree and perspective. What I am pointing out is lip-service to transparency. Full transparency would be different.

Microsoft's software is not very transparent - and partly as a result it is some of the most badly-designed, insecure and virus-ridden software the planet has ever seen. We can see the mistake, can see its consequences - and know how to avoid it - but we have to, like actually do that - and that involves some alerting of others to the problems often associated with closed-source proposals.

↑ comment by timtyler · 2011-03-05T19:09:08.317Z · LW(p) · GW(p)

Ask yourself, what difference would you expect to see if Dr. Evil would disguise as Eliezer Yudkowsky?

Would Dr Evil have been silly enough to give the game away - like this:

I must warn my reader that my first allegiance is to the Singularity, not humanity. I don't know what the Singularity will do with us. I don't know whether Singularities upgrade mortal races, or disassemble us for spare atoms. While possible, I will balance the interests of mortality and Singularity. But if it comes down to Us or Them, I'm with Them. You have been warned.

...?

Replies from: wedrifid

↑ comment by wedrifid · 2011-03-06T03:52:14.064Z · LW(p) · GW(p)

Would Dr Evil have been silly enough to give the game away - like this:

He would if he was modelling your brain, evidently.

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T10:54:17.942Z · LW(p) · GW(p)

So: you are saying you think I am Dr Evil? Great. Who are you, then? Some other masked vigilante, no doubt :-(

Replies from: wedrifid

↑ comment by wedrifid · 2011-03-06T11:38:35.258Z · LW(p) · GW(p)

So: you are saying you think I am Dr Evil?

No, that if you consider him being silly enough to say X to be evidence of innocence and he is trying to persuade you then he'll say X.

Who are you, then? Some other masked vigilante, no doubt :-(

I'm cool with that. :)

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T12:03:00.963Z · LW(p) · GW(p)

So: you are saying you think I am Dr Evil?

No, that if you consider him being silly enough to say X to be evidence of innocence and he is trying to persuade you then he'll say X.

Ah - now I see! Oops! I don't tend to go in for super-evil stereotypes in the first place. So, I can't say I have much interest in evidence on this topic - but yes, this would be anecdotal evidence, at best.

Replies from: wedrifid

↑ comment by wedrifid · 2011-03-06T12:21:44.076Z · LW(p) · GW(p)

Ah - now I see! Oops! I don't tend to go in for super-evil stereotypes in the first place. So, I can't say I have much interest in evidence on this topic - but yes, this would be anecdotal evidence, at best.

:P We were talking about evidence? I thought we were talking about absurd counterfactuals and hypothetical cape wearing sociopaths.

Replies from: timtyler

↑ comment by timtyler · 2011-03-06T13:32:45.802Z · LW(p) · GW(p)

You brought up "evidence" first - but yes, you can talk about "evidence" in hypothetical scenarios, that is not a problematical concept.

↑ comment by XiXiDu · 2011-03-05T17:10:39.527Z · LW(p) · GW(p)

Would you also object if it seemed like everyone working for Village Reach agreed about giving vaccinations to African children being a good idea?

If I would disagree and believe that it is worth it to voice my disagreement, then yes. You just can't compare that though. Can you name another group of people who try to take over the universe?

As for SIAI, by its very purpose only attracts people who agree with Eliezer's philosophy of AI. There is nothing wrong with this.

Jehovah's Witnesses also only attract certain people. A lot of money is being donated and spent on brainwashing material designed to get even more money to spend on brainwashing. I think that is wrong. The problem is that nobody there is deliberately doing something 'wrong'. There is no guru; they all believe they are doing what is 'right'. Nobody is critical. But if they had a forum where one could openly discuss their ideas with them, then I'd be there and challenge them. Not that I want to compare them with LW, that would be crazy, but I want to challenge your argument.

Replies from: benelliott, Vladimir_Nesov

↑ comment by benelliott · 2011-03-05T17:41:02.687Z · LW(p) · GW(p)

The Village Reach argument was referring to SIAI, not Less Wrong. They are distinct entities, one is a forum for discussion and the other is an organization with the aim of doing something. It is quite right that the first has many dissenting opinions, whereas the latter does not. SIAI may be able to benefit from dissent on the many sub-issues related to FAI, but not to the fundamental idea that FAI is important.

Imagine a company where about 40% of the employees, even at the highest levels, disagreed with the premise that they should be trying to make money and instead either intentionally tried to lose the company money, or argued constantly with the other 60%. Nothing would get done.

Disagreement about FAI may be good for LW but it is probably not good for SIAI. Since there is disagreement on LW, I really don't see the problem.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T17:46:43.083Z · LW(p) · GW(p)

SIAI may be able to benefit from dissent on the many sub-issues related to FAI, but not to the fundamental idea that FAI is important.

If FAI is unimportant, SIAI should conclude that FAI is unimportant. Hence it's not clear where the following distinction happens.

Disagreement about FAI may be good for LW but it is probably not good for SIAI.

Replies from: benelliott

↑ comment by benelliott · 2011-03-05T17:55:23.486Z · LW(p) · GW(p)

I don't think it's the best use of any organization's money to employ people who disagree with the premise that the organization should exist.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T17:59:02.576Z · LW(p) · GW(p)

Nonetheless, I don't think it's the best use of any organization's money to employ people who disagree with the premise that the organization should exist.

But disagreement itself is not the reason for this being a bad strategy.

Replies from: benelliott

↑ comment by benelliott · 2011-03-05T18:05:51.448Z · LW(p) · GW(p)

I don't quite follow. The only point I was trying to make was that "everybody in SIAI agrees about FAI, therefore they're all a bunch of brainwashed zombies" is not a valid complaint.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2011-03-05T19:41:34.154Z · LW(p) · GW(p)

Yes.

↑ comment by Vladimir_Nesov · 2011-03-05T17:15:41.481Z · LW(p) · GW(p)

Not that I want to compare them with LW, that would be crazy, but I want to challenge your argument.

What argument? benelliott suggested that your argument makes use of a very weak piece of evidence (presence of significant agreement). Obviously, interpreted as counterevidence of the opposite claim, it is equally weak.

↑ comment by Emile · 2011-03-05T21:36:04.211Z · LW(p) · GW(p)

My current perception is that there are not many independent minds to be found here. I perceive there to be a strong tendency to jump if Yudkowsky tells people to jump. I'm virtually the only true critic of the SIAI, which is really sad and frightening.

Maybe it's because this "being an independent mind" thing isn't as great as you think it is? Like most people here, I've been raised hearing about the merits of challenging authority, thinking for yourself, questioning everything, not following the herd, etc. But there's a dark side to that, and it's thinking that when you disagree with the experts, you're right and the experts are wrong.

I now think that a lot of those "think for yourself" and "listen to your heart" things are borderline dark-side epistemology, and that by default, the experts are right and I should just shut up until I have some very good reasons to disagree. Any darn fool can decide the experts are victim of groupthink, or don't dare think outside the box, or just want to preserve the status quo. I think changing one's mind when faced with disagreeing expert opinion is a better sign of rationality than "thinking for oneself". I think that many rationalists' self-image as iconoclasts is harmful.

I'm willing to call myself an "Eliezer Yudkowsky fanboy" in a bullet-biting kind of way. I don't see the lack of systematic disagreement as a bad thing, and I don't care about looking like a cult member.

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-06T10:26:03.300Z · LW(p) · GW(p)

...by default, the experts are right and I should just shut up until I have some very good reasons to disagree.

Yet you decided to trust Yudkowsky, not the experts.

Any darn fool can decide the experts are victim of groupthink, or don't dare think outside the box, or just want to preserve the status quo.

I don't, that is why I am asking experts, many seem not to share Yudkowsky's worries.

I'm willing to call myself an "Eliezer Yudkowsky fanboy" in a bullet-biting kind of way. I don't see the lack of systematic disagreement as a bad thing, and I don't care about looking like a cult member.

I actually got a link to his homepage and the SIAI on my homepage for a few years under 'favorites sites'.

↑ comment by lukeprog · 2011-03-06T03:28:16.093Z · LW(p) · GW(p)

I doubt you're "virtually the only true critic of the SIAI."

But if you think I'm not much of a critic of SIAI/Yudkowsky, you're right. Many of my posts have included minor criticisms, but that's because it's not as valued here to just repeat all the thousands of things on which I agree with Eliezer.

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-06T10:34:01.614Z · LW(p) · GW(p)

But if you think I'm not much of a critic of SIAI/Yudkowsky, you're right. Many of my posts have included minor criticisms, but that's because it's not as valued here to just repeat all the thousands of things on which I agree with Eliezer.

I actually messaged him telling him that he can edit/delete any harmful submissions of mine without having to expect harmful protest. Does that look like I particularly disagree with him, or assign a high probability to him being Dr. Evil? I don't, but it is a possibility and it is widely ignored. To get provable friendly AI you'll need provable friendly humans. If that isn't possible you'll need oversight and transparency.

Smart people can be wrong.
Smart people can be evil.
People can appear smarter than they are.

That's why I demand...

Third-party peer-review of Yudkowsky's work.
Oversight and transparency.
Progress reports, roadmaps and confirmable success.

Replies from: wedrifid

↑ comment by wedrifid · 2011-03-06T10:45:12.442Z · LW(p) · GW(p)

To get provable friendly AI you'll need provable friendly humans.

Not actually true.

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-06T11:33:37.961Z · LW(p) · GW(p)

Not actually true.

Technically it isn't of course. But I don't expect unfriendly humans not to show me friendly AI but actually implement something else. What I meant is that you'll need friendly humans to not end up with some trickster who takes your money and in 30 years you notice that all he has done is to code some chat bot. There are a lot of reasons that the trustworthiness of the humans involved is important. Of course, provable friendly AI is provable friendly no matter who coded it.

↑ comment by wedrifid · 2011-03-06T03:41:26.921Z · LW(p) · GW(p)

My current perception is that there are not many independent minds to be found here. I perceive there to be a strong tendency to jump if Yudkowsky tells people to jump. I'm virtually the only true critic of the SIAI, which is really sad and frightening.

I criticise Eliezer frequently. I manage to do so without being particularly negatively received by the alleged Yudkowsky hive mind.

Note: My criticisms of EY/SIAI are specific even if consistent. Like lukeprog I do not feel the need to repeat the thousands of things about which I agree with EY.

Further Note: There are enough distinct things that I disagree with Eliezer about that, given my metacognitive confidence levels I can expect that on at least one of them I am mistaken. Which is a curious epistemic state to be in but purely tangential. ;)

Yet another edit: A clear example of criticism of Eliezer is with respect to his discussion of his metaethics and CEV. I didn't find his contribution in the linked conversation satisfactory and consider it representative of his other recent contributions on the subject. Everything except his sequence on the subject has been nowhere near the standard I would expect from someone dedicating their life to studying a subject that will rely on reasoning flawlessly in the related area!

Replies from: XiXiDu

↑ comment by XiXiDu · 2011-03-06T10:51:16.055Z · LW(p) · GW(p)

Like lukeprog I do not feel the need to repeat the thousands of things about which I agree with EY.

You think I don't? I agree with almost everyone about thousands of things. I perceive myself to be an uneducated fool. If I read a few posts of someone like Yudkowsky and intuitively agree, that is very weak evidence to trust him or of his superior intellect.

I still think that he's one of the smartest people though. But there is a limit to what I'll just accept on mere reassurance. And I have seen nothing that would allow me to conclude that he could accomplish much regarding friendly AI without a billion dollars and a team of mathematicians and other specialists.

Replies from: wedrifid

↑ comment by wedrifid · 2011-03-06T11:47:11.511Z · LW(p) · GW(p)

You think I don't?

No, that wasn't for your benefit at all. Just disclaiming limits. Declarations of criticism are sometimes worth tempering just a tad. :)

↑ comment by David_Gerard · 2011-03-05T10:43:03.303Z · LW(p) · GW(p)

By the MWI sequence, I presume he means the QM sequence, which appears clear to me but bogus to physicists I've asked ... and, more importantly, to the physicists who commented on the posts in it and said that he couldn't do what he'd just done (To which he answered that he doesn't claim to be a physicist.)

Also, judging by the low votes and small number of commenters, it seems that even people who claim to have read the sequences have tended to tl;dr at the QM sequence.

(I finally finished a first run through the million words of sequences and the millions of words of comments. I only finally tipped my tl;dr tilt sensor at the decision theory sequence, which isn't actually very sequential.)

↑ comment by wedrifid · 2011-03-06T01:02:41.862Z · LW(p) · GW(p)

I love those quotes. The one about negatively useful AI doctorates is a favourite of mine. :)

↑ comment by Manfred · 2011-03-05T11:45:19.858Z · LW(p) · GW(p)

Huh, just read So You Want To Be A Seed AI Programmer. Appears to be from 2009. I would recommend http://www.fastcompany.com/magazine/06/writestuff.html as a highly contrasting frame of thought.

Replies from: komponisto

↑ comment by komponisto · 2011-03-05T15:25:12.904Z · LW(p) · GW(p)

Huh, just read So You Want To Be A Seed AI Programmer. Appears to be from 2009

It's from much earlier than that (like 2005 or something). That particular wiki isn't the original source.

comment by Daniel_Burfoot · 2011-03-05T17:28:14.929Z · LW(p) · GW(p)

With regards to your (and Eliezer's) quest, I think Oppenheimer's Maxim is relevant:

It is a profound and necessary truth that the deep things in science are not found because they are useful, they are found because it was possible to find them.

A theory of machine ethics may very well be the most useful concept ever discovered by humanity. But as far as I can see, there is no reason to believe that such a theory can be found.

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-06T00:16:46.941Z · LW(p) · GW(p)

Daniel_Burfoot,

I share your pessimism. When superintelligence arrives, humanity is almost certainly fucked. But we can try.

comment by timtyler · 2011-03-05T19:41:11.824Z · LW(p) · GW(p)

For the list:

The Ethics of Artificial Intelligence http://www.nickbostrom.com/ethics/artificial-intelligence.pdf

Ethical Issues in Advanced Artificial Intelligence http://www.nickbostrom.com/ethics/ai.html

Beyond AI http://mol-eng.com/

Replies from: lukeprog

↑ comment by lukeprog · 2011-03-06T00:15:49.910Z · LW(p) · GW(p)

Tim,