Reply to Holden on The Singularity Institute

post by lukeprog · 2012-07-10T23:20:18.690Z · LW · GW · Legacy · 214 comments

Contents

  Contents
  Comments
  Why many people care greatly about existential risk reduction
  AI risk: the most important existential risk
  SI can purchase several kinds of AI risk reduction more efficiently than others can
  My replies to Holden, point by point
    GiveWell Labs
    Three possible outcomes
    SI's mission is more important than SI as an organization
    SI's arguments need to be clearer
    Holden's objection #1 punts to objection #2
    Tool AI
    SI's mission assumes a scenario that is far less conjunctive than it initially appears.
    SI's public argumentation
    SI's endorsements
    SI and feedback loops
    SI and rationality
    SI's goals and activities
    Theft
    Pascal's Mugging
  Summary of my reply to Holden
    Conclusion

Holden Karnofsky of GiveWell has objected to the Singularity Institute (SI) as a target for optimal philanthropy. As someone who thinks that existential risk reduction is really important and also that the Singularity Institute is an important target of optimal philanthropy, I would like to explain why I disagree with Holden on these subjects. (I am also SI's Executive Director.)

Mostly, I'd like to explain my views to a broad audience. But I'd also like to explain my views to Holden himself. I value Holden's work, I enjoy interacting with him, and I think he is both intelligent and capable of changing his mind about Big Things like this. Hopefully Holden and I can continue to work through the arguments together, though of course we are both busy with many other things.

I appreciate the clarity and substance of Holden's objections, and I hope to reply in kind. I begin with an overview of some basic points that may be familiar to most Less Wrong veterans, and then I reply point-by-point to Holden's post. In the final section, I summarize my reply to Holden.

Holden raised many different issues, so unfortunately this post needed to be long. My apologies to Holden if I have misinterpreted him at any point.


Comments

I must be brief, so I am sure many objections will leap to your mind while reading this post. To encourage constructive discussion, each question (posted as a comment on this page) that follows the template described below will receive a reply from me or another SI representative.

Please word your question as clearly and succinctly as possible, and don't assume your readers will have read this post before reading your question (because the conversations here may be used as source material for a comprehensive FAQ).

Here's an example of how you could word the first paragraph of your question: "You claimed that [insert direct quote here], and also that [insert another direct quote here]. That seems to imply that [something something]. But that doesn't seem to take into account that [blah blah blah]. What do you think of that?"

If your question needs more explaining, leave the details to subsequent paragraphs in your comment. Please post multiple questions as multiple comments, so they can be voted upon and replied to individually. If you don't follow these rules, I can't guarantee SI will have time to give you a reply. (We probably won't.)


Why many people care greatly about existential risk reduction

Why do many people consider existential risk reduction to be humanity's most important task? I can't say it much better than Nick Bostrom does, so I'll just quote him:

An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development. Although it is often difficult to assess the probability of existential risks, there are many reasons to suppose that the total such risk confronting humanity over the next few centuries is significant...

Humanity has survived what we might call natural existential risks [asteroid impacts, gamma ray bursts, etc.] for hundreds of thousands of years; thus it is prima facie unlikely that any of them will do us in within the next hundred...

In contrast, our species is introducing entirely new kinds of existential risk—threats we have no track record of surviving... In particular, most of the biggest existential risks seem to be linked to potential future technological breakthroughs that may radically expand our ability to manipulate the external world or our own biology. As our powers expand, so will the scale of their potential consequences—intended and unintended, positive and negative. For example, there appear to be significant existential risks in some of the advanced forms of biotechnology, molecular nanotechnology, and machine intelligence that might be developed in the decades ahead.

What makes existential catastrophes especially bad is not that they would [cause] a precipitous drop in world population or average quality of life. Instead, their significance lies primarily in the fact that they would destroy the future... To calculate the loss associated with an existential catastrophe, we must consider how much value would come to exist in its absence. It turns out that the ultimate potential for Earth-originating intelligent life is literally astronomical.

One gets a large number even if one confines one’s consideration to the potential for biological human beings living on Earth. If we suppose... that our planet will remain habitable for at least another billion years, and we assume that at least one billion people could live on it sustainably, then the potential exist for at least 10^18 human lives. [The numbers get way bigger if you consider the expansion of posthuman civilization to the rest of the galaxy or the prospect of mind uploading.]

Even if we use the most conservative of these estimates, which entirely ignores the possibility of space colonization and software minds, we find that the expected loss of an existential catastrophe is greater than the value of 10^16 human lives...

These considerations suggest that the loss in expected value resulting from an existential catastrophe is so enormous that the objective of reducing existential risks should be a dominant consideration whenever we act out of an impersonal concern for humankind as a whole.
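To make the arithmetic behind that conclusion concrete, here is a minimal sketch in Python. Only the 10^16 lower bound comes from the passage above; the risk-reduction figures are placeholders I chose for illustration, not estimates from Bostrom or SI.

```python
# Illustrative expected-value arithmetic for existential risk reduction.
# POTENTIAL_LIVES comes from Bostrom's most conservative estimate above;
# the risk-reduction deltas below are arbitrary placeholders, not real estimates.

POTENTIAL_LIVES = 10**16  # conservative lower bound: biological humans on Earth only

def expected_lives_preserved(risk_reduction: float) -> float:
    """Expected future lives preserved by an absolute reduction in total existential risk."""
    return risk_reduction * POTENTIAL_LIVES

for delta in (1e-2, 1e-4, 1e-6):
    print(f"reduce total existential risk by {delta:.0e} -> "
          f"~{expected_lives_preserved(delta):.1e} expected lives preserved")
```

Even the smallest of these placeholder reductions corresponds to an enormous expected number of lives, which is the sense in which the loss in expected value is "so enormous" on Bostrom's account.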

I refer the reader to Bostrom's paper for further details and additional arguments, but neither his paper nor this post can answer every objection one might think of.

Nor can I summarize all the arguments and evidence related to estimating the severity and time horizon of every proposed existential risk. Even the 500+ pages of Oxford University Press' Global Catastrophic Risks can barely scratch the surface of this enormous topic. As explained in Intelligence Explosion: Evidence and Import, predicting long-term technological progress is hard. Thus, we must

examine convergent outcomes that—like the evolution of eyes or the emergence of markets—can come about through any of several different paths and can gather momentum once they begin.

I'll say more about convergent outcomes later, but for now I'd just like to suggest that:

  1. Many humans living today value both current and future people enough that if existential catastrophe is plausible this century, then upon reflection (e.g. after counteracting their unconscious, default scope insensitivity) they would conclude that reducing the risk of existential catastrophe is the most valuable thing they can do — whether through direct work or by donating to support direct work. It is to these people I appeal. (I also have much to say to people who e.g. don't care about future people, but it is too much to say here and now.)

  2. As it turns out, we do have good reason to believe that existential catastrophe is plausible this century.

I don't have the space here to discuss the likelihood of different kinds of existential catastrophe that could plausibly occur this century (see GCR for more details), so instead I'll talk about just one of them: an AI catastrophe.


AI risk: the most important existential risk

There are two primary reasons I think AI is the most important existential risk:

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future. Machine superintelligence working in the service of humane goals could use its intelligence and resources to prevent all other existential catastrophes. (Eliezer: "I distinguish 'human', that which we are, from 'humane'—that which, being human, we wish we were.")

Reason 2: AI is probably the first existential risk we must face (given my evidence, only the tiniest fraction of which I can share in a blog post).

One reason AI may be the most urgent existential risk is that an AI catastrophe (compared to catastrophes from other sources of risk) is more likely to be a full-blown existential catastrophe (as opposed to a merely billions-dead catastrophe). Humans are smart and adaptable; we are already set up for a species-preserving number of humans to survive (e.g. in underground bunkers with stockpiled food, water, and medicine) major catastrophes from nuclear war, superviruses, supervolcano eruption, and many cases of asteroid impact or nanotechnological ecophagy.

Machine superintelligences, however, could intelligently seek out and neutralize humans which they (correctly) recognize as threats to the maximal realization of their goals. Humans are surprisingly easy to kill if an intelligent process is trying to do so. Cut off John's access to air for a few minutes, or cut off his water supply for a few days, or poke him with a sharp stick, and he dies. Forever. (Post-humans might shudder at this absurdity like we shudder at the idea that people used to die from their teeth.)

Why think AI is coming anytime soon? This is too complicated a topic to broach here. See Intelligence Explosion: Evidence and Import for a brief analysis of AI timelines. Or try The Uncertain Future, which outputs an estimated timeline for human-level AI based on your predictions of various technological developments. (SI is currently collaborating with the Future of Humanity Institute to write another paper on this subject.)

It's also important to mention that the case for caring about AI risk is less conjunctive than many seem to think, which I discuss in more detail here.


SI can purchase several kinds of AI risk reduction more efficiently than others can

The two organizations working most directly to reduce AI risk are the Singularity Institute and the Future of Humanity Institute (FHI). Luckily, these organizations complement each other well, as I pointed out back before I was running SI:

A few weeks later, Nick Bostrom (Director of FHI) said the same things (as far as I know, without having read my comment):

I think there is a sense that both organizations are synergistic. If one were about to go under... that would probably be the one [to donate to]. If both were doing well... different people will have different opinions. We work quite closely with the folks from [the Singularity Institute]...

There is an advantage to having one academic platform and one outside academia. There are different things these types of organizations give us. If you wanna get academics to pay more attention to this, to get postdocs to work on this, that's much easier to do within academia; also to get the ear of policy-makers and media... On the other hand, for [SI] there might be things that are easier for them to do. More flexibility, they're not embedded in a big bureaucracy. So they can more easily hire people with non-standard backgrounds... and also more grass-roots stuff like Less Wrong...

FHI is, despite its small size, a highly productive philosophy department. More importantly, FHI has focused its research work on AI risk issues for the past 9 months, and plans to continue on that path for at least another 12 months. This is important work that should be supported. (Note that FHI recently hired SI research associate Daniel Dewey.)

SI lacks FHI's publishing productivity and its university credibility, but as an organization SI is improving quickly, and it can seize many opportunities for AI risk reduction that FHI is not well-positioned to seize. (New organizations will also tend to be less capable of seizing these opportunities than SI, due to the financial and human capital already concentrated at SI and FHI.)

Here are some examples of projects that SI is probably better able to carry out than FHI, given its greater flexibility (and assuming sufficient funding):


My replies to Holden, point by point

Holden's post makes so many claims that I'll just have to work through his post from beginning to end, and then summarize where I think we stand at the end.


GiveWell Labs

Holden opened "Thoughts on the Singularity Institute" by noting that SI was previously outside Givewell's scope, since GiveWell was focused on specific domains like poverty reduction. With the launch of GiveWell Labs, GiveWell is now open to evaluating any giving opportunity, including SI.

I admire this move. I'm sure people have been bugging GiveWell to do this for a long time, but almost none of those people appreciate how hard it is to launch broad new initiatives like this with the limited budget of an organization like GiveWell or the Singularity Institute. Most of them also do not understand how much work is required to write something like "Thoughts on the Singularity Institute", "Reply to Holden on Tool AI", or this post.


Three possible outcomes

Next, Holden wrote:

[I hope] that one of these three things (or some combination) will happen:

  1. New arguments are raised that cause me to change my mind and recognize SI as an outstanding giving opportunity. If this happens I will likely attempt to raise more money for SI (most likely by discussing it with other GiveWell staff and collectively considering a GiveWell Labs recommendation).

  2. SI concedes that my objections are valid and increases its determination to address them. A few years from now, SI is a better organization and more effective in its mission.

  3. SI can't or won't make changes, and SI's supporters feel my objections are valid, so SI loses some support, freeing up resources for other approaches to doing good.

As explained at the top of Holden's post, I had already conceded that many of Holden's objections (especially concerning past organizational competence) are valid, and had been working to address them, even before Holden's post was published. So outcome #2 is already true in part.

I hope for outcome #1, too, but I don't expect Holden to change his opinion overnight. There are too many possible objections to which Holden has not yet heard a good response. But hopefully this post and its comment threads will successfully address some of Holden's (and others') objections.

Outcome #3 is unlikely since SI is already making changes, though of course it's possible we will be unable to raise sufficient funding for SI despite making these changes, or even because of our efforts to make these changes. (Improving general organizational effectiveness is important but it costs money and is not exciting to donors.)


SI's mission is more important than SI as an organization

Holden said:

whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.

Clearly, SI's mission is more important than SI as an organization. If somebody launches an organization more effective (at AI risk reduction) than SI but just as flexible, then SI should probably fold itself and try to move its donor base, support community, and the best of its human capital to that new organization.

That said, it's probably easier to reform SI into a more effective organization than it is to launch a new one, since SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

(On the other hand, SI has also concentrated some bad reputation which a new organization could launch without. But I still think the weight of the arguments is in favor of reforming SI.)


SI's arguments need to be clearer

Holden:

I do not believe that [my objections to SI's apparent views] constitute a sharp/tight case for the idea that SI's work has low/negative value; I believe, instead, that SI's own arguments are too vague for such a rebuttal to be possible. There are many possible responses to my objections, but SI's public arguments (and the private arguments) do not make clear which possible response (if any) SI would choose to take up and defend. Hopefully the dialogue following this post will clarify what SI believes and why.

I agree that SI's arguments are often vague. For example, Chris Hallquist reported:

I've been trying to write something about Eliezer's debate with Robin Hanson, but the problem I keep running up against is that Eliezer's points are not clearly articulated at all. Even making my best educated guesses about what's supposed to go in the gaps in his arguments, I still ended up with very little.

I know the feeling! That's why I've tried to write as many clarifying documents as I can, including the Singularity FAQ, Intelligence Explosion: Evidence and Import, The Singularity and Machine Ethics, Facing the Singularity, So You Want to Save the World, and How to Purchase AI Risk Reduction.

Unfortunately, it takes lots of resources to write up hundreds of arguments and responses to objections in clear and precise language, and we're working on it. (For comparison, Nick Bostrom's forthcoming book on machine superintelligence will barely scratch the surface of the things SI and FHI researchers have worked out in conversation, and it will probably take him 2+ years to write in total, and Bostrom is already an unusually prolific writer.) Hopefully SI's responses to Holden's post have helped to clarify our positions already.


Holden's objection #1 punts to objection #2

The first objection on Holden's numbered list was:

it seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous.

I'm glad Holden agrees with us that successful Friendly AI is very hard. SI has spent much of its effort trying to show people that the first 20 solutions they come up with all fail. See: AI as a Positive and Negative Factor in Global Risk, The Singularity and Machine Ethics, Complex Value Systems are Required to Realize Valuable Futures, etc. Holden mentions the standard SI worry about the hidden complexity of wishes, and the one about a friendly utility function still causing havoc because the AI's priors are wrong (problem 3.6 from my list of open problems in AI risk research).

There are reasons to think FAI is harder still. What if we get the utility function right and we get the priors right but the AI's values change for the worse when it updates its ontology? What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem? What if no humans are altruistic enough to choose to build FAI over an AI that will make them king of the universe? What if the idea of FAI is incoherent? (The human brain is an existence proof for the possibility of general intelligence, but we have no existence proof for the possibility of a decision theoretic agent which stably optimizes the world according to a set of preferences over states of affairs.)

So, yeah. Friendly AI is hard. But as I said elsewhere:

The point is that not trying as hard as you can to build Friendly AI is even worse, because then you almost certainly get uFAI. At least by trying to build FAI, we've got some chance of winning.

So Holden's objection #1 really just punts to objection #2, about tool-AGI, as the last paragraph in this section of Holden's post seems to indicate:

So far, all I have argued is that the development of "Friendliness" theory can achieve at best only a limited reduction in the probability of an unfavorable outcome. However, as I argue in the next section, I believe there is at least one concept - the "tool-agent" distinction - that has more potential to reduce risks, and that SI appears to ignore this concept entirely.

So if Holden's objection #2 doesn't work, then objection #1 ends up reducing to "the development of Friendliness theory can achieve at best a reduction in AI risk," which is what SI has been saying all along.


Tool AI

Holden's second numbered objection was:

SI appears to neglect the potentially important distinction between "tool" and "agent" AI.

Eliezer wrote a whole post about this here. To sum up:

(1) Whether you're working with Tool AI or Agent AI, you need the "Friendly AI" domain experts that SI is trying to recruit:

A "Friendly AI programmer" is somebody who specializes in seeing the correspondence of mathematical structures to What Happens in the Real World. It's somebody who looks at Hutter's specification of AIXI and reads the actual equations - actually stares at the Greek symbols and not just the accompanying English text - and sees, "Oh, this AI will try to gain control of its reward channel," as well as numerous subtler issues like, "This AI presumes a Cartesian boundary separating itself from the environment; it may drop an anvil on its own head." Similarly, working on TDT means e.g. looking at a mathematical specification of decision theory, and seeing "Oh, this is vulnerable to blackmail" and coming up with a mathematical counter-specification of an AI that isn't so vulnerable to blackmail.

Holden's post seems to imply that if you're building a non-self-modifying planning Oracle (aka 'tool AI') rather than an acting-in-the-world agent, you don't need a Friendly AI programmer because FAI programmers only work on agents. But this isn't how the engineering skills are split up. Inside the AI, whether an agent AI or a planning Oracle, there would be similar AGI-challenges like "build a predictive model of the world", and similar FAI-conjugates of those challenges like finding the 'user' inside an AI-created model of the universe. The insides would look a lot more similar than the outsides. An analogy would be supposing that a machine learning professional who does sales optimization for an orange company couldn't possibly do sales optimization for a banana company, because their skills must be about oranges rather than bananas.

(2) Tool AI isn't that much safer than Agent AI, because Tool AIs have lots of hidden "gotchas" that cause havoc, too. (See Eliezer's post for examples.)

These points illustrate something else Eliezer wrote:

What the human species needs from an x-risk perspective is experts on This Whole Damn Problem [of AI risk], who will acquire whatever skills are needed to that end. The Singularity Institute exists to host such people and enable their research—once we have enough funding to find and recruit them.

Indeed. We need places for experts who specialize in seeing the consequences of mathematical objects for things humans value (e.g. the Singularity Institute) just like we need places for experts on efficient charity (e.g. GiveWell).

Anyway, it's worth pointing out that Holden did not make the common (and mistaken) argument that "We should just build Tool AIs instead of Agent AIs and then we'll be fine." This is wrong for many reasons, but one obvious point is that there are incentives to build Agent AIs (because they're powerful), so even if the first 6 teams are careful enough to build only Tool AIs, the 7th team could still build Agent AI and destroy the world.

Instead, Holden pointed out that you could use Tool AI to increase your chances of successfully building agenty FAI:

if developing "Friendly AI" is what we seek, a tool-AGI could likely be helpful enough in thinking through this problem as to render any previous work on "Friendliness theory" moot. Among other things, a tool-AGI would allow transparent views into the AGI's reasoning and predictions without any reason to fear being purposefully misled, and would facilitate safe experimental testing of any utility function that one wished to eventually plug into an "agent."

After reading Eliezer's reply, however, you can probably guess my replies to this paragraph:

  1. Tool AI isn't as safe as Holden thinks.
  2. But yeah, a Friendly AI team may very well use "Tool AI" to aid Friendliness research if it can figure out a safe way to do that. This doesn't obviate the need for Friendly AI researchers; it's part of their research toolbox.

So Holden's Objection #2 doesn't work, which (as explained earlier) means that his Objection #1 (as stated) doesn't work either.


SI's mission assumes a scenario that is far less conjunctive than it initially appears.

Holden's objection #3 is:

SI's envisioned scenario is far more specific and conjunctive than it appears at first glance, and I believe this scenario to be highly unlikely.

His main concern here seemed to be that technological developments and other factors would render earlier FAI work irrelevant. But Eliezer's clarifications about what we mean by "FAI team" render this objection moot, at least as it is currently stated. The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.

Holden's confusion about what SI means by "FAI team" is common and understandable, and it is one reason that SI's mission assumes a scenario that is far less conjunctive than it appears to many. We aren't saying we need an FAI team because we know lots of specific things about how AGI will be built 30 years from now. We're saying you need experts on "the consequences of mathematical objects for things humans value" (an FAI team) because AGIs are mathematical objects and will have big consequences. That's pretty disjunctive.

Similarly, many people think SI's mission is predicated on hard takeoff. After all, we call ourselves the "Singularity Institute," Eliezer has spent a lot of time arguing for hard takeoff, and our current research summary frames AI risk in terms of recursive self-improvement.

But the case for AI as a global risk, and thus the need for dedicated experts on AI risk and "the consequences of mathematical objects for things humans value", isn't predicated on hard takeoff. Instead, it looks something like this:

(1) Eventually, most tasks are performed by machine intelligences.

The improved flexibility, copyability, and modifiability of machine intelligences make them economically dominant even without other advantages (Brynjolfsson & McAfee 2011; Hanson 2008). In addition, there is plenty of room "above" the human brain in terms of hardware and software for general intelligence (Muehlhauser & Salamon 2012; Sotala 2012; Kurzweil 2005).

(2) Machine intelligences don't necessarily do things we like.

We don't necessarily control AIs, since advanced intelligences may be inherently goal-oriented (Omohundro 2007), and even if we build advanced "Tool AIs," these aren't necessarily safe either (Yudkowsky 2012) and there will be significant economic incentives to transform them into autonomous agents (Brynjolfsson & McAfee 2011). We don't value most possible futures, but it's very hard to get an autonomous AI to do exactly what you want (Yudkowsky 2008, 2011; Muehlhauser & Helm 2012; Arkin 2009).

(3) There are things we can do to increase the probability that machine intelligences do things we like.

Further research can clarify (1) the nature and severity of the risk, (2) how to engineer goal-oriented systems safely, (3) how to increase safety with differential technological development, (4) how to limit and control machine intelligences (Armstrong et al. 2012; Yampolskiy 2012), (5) solutions to AI development coordination problems, and more.

(4) We should do those things now.

People aren't doing much about these issues now. We could wait until we understand better (e.g.) what kind of AI is likely, but: (1) it might take a long time to resolve the core issues, including difficult technical subproblems that require time-consuming mathematical breakthroughs, (2) incentives may be badly aligned (e.g. there seem to be strong economic incentives to build AI, but not to take into account the social and global risks from AI), (3) AI may not be that far away (Muehlhauser & Salamon 2012), and (4) the transition to machine dominance may be surprisingly rapid due to (e.g.) intelligence explosion (Chalmers 2010, 2012; Muehlhauser & Salamon 2012) or computing overhang.

What do I mean by "computing overhang"? We may get the hardware needed for AI long before we get the software, such that once software for general intelligence is figured out, there is tons of computing hardware sitting around for running AIs (a "computing overhang"). Thus we could switch from a world with one autonomous AI to a world with 10 billion autonomous AIs at the speed of copying software, and thereby transition rapidly from human dominance to AI dominance even without an intelligence explosion. (This is one of the many, many things we haven't yet written up in detail due to lack of resources.)
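As a toy illustration of this overhang dynamic, here is a short Python sketch. Every number in it is an arbitrary placeholder of my own, not a prediction by SI or anyone else; the point is only the shape of the calculation.

```python
# Toy "computing overhang" model: if the hardware is already abundant when the
# software arrives, AI population is limited by software copying, not chip production.
# All constants are hypothetical placeholders chosen for illustration.

WORLD_HARDWARE_FLOPS = 1e21   # assumed compute already deployed worldwide
FLOPS_PER_AI = 1e16           # assumed cost of running one human-level AI
COPY_TIME_HOURS = 1.0         # assumed time to copy the software onto new hardware

num_ais_supported = WORLD_HARDWARE_FLOPS / FLOPS_PER_AI
print(f"AIs runnable on pre-existing hardware: ~{num_ais_supported:.0e}")
print(f"Transition time scale: roughly {COPY_TIME_HOURS} hour(s) of copying, "
      "rather than years of new hardware buildout")
```

Under these made-up numbers the jump is from one AI to about 10^5 AIs in roughly the time it takes to copy software, which is the sense in which the transition could be rapid even without an intelligence explosion.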

(This broad argument is greatly compressed from a paper outline developed by Paul Christiano, Carl Shulman, Nick Beckstead, and myself. We'd love to write the paper at some point, but haven't had the resources to do so. The fuller version of this argument is of course more detailed.)


SI's public argumentation

Next, Holden turned to the topic of SI's organizational effectiveness:

when evaluating a group such as SI, I can't avoid placing a heavy weight on (my read on) the general competence, capability and "intangibles" of the people and organization, because SI's mission is not about repeating activities that have worked in the past...

There are several reasons that I currently have a negative impression of SI's general competence, capability and "intangibles."

The first reason Holden gave for his negative impression of SI is:

SI has produced enormous quantities of public argumentation... Yet I have never seen a clear response to any of the three basic objections I listed in the previous section. One of SI's major goals is to raise awareness of AI-related risks; given this, the fact that it has not advanced clear/concise/compelling arguments speaks, in my view, to its general competence.

I agree in part. Here's what I think:


SI's endorsements

The second reason Holden gave for his negative impression of SI is "a lack of impressive endorsements." This one is generally true, despite the three "celebrity endorsements" on our new donate page. More impressive than these is the fact that, as Eliezer mentioned, the latest edition of the leading AI textbook spends several pages talking about AI risk and Friendly AI, and discusses the work of SI-associated researchers like Eliezer Yudkowsky and Steve Omohundro while completely ignoring the existence of the older, more prestigious, and vastly larger mainstream academic field of "machine ethics."

Why don't we have impressive endorsements? To my knowledge, SI hasn't tried very hard to get them. That's another thing we're in the process of changing.


SI and feedback loops

The third reason Holden gave for his negative impression of SI is:

SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments... Pursuing more impressive endorsements and developing benign but objectively recognizable innovations (particularly commercially viable ones) are two possible ways to impose more demanding feedback loops.

We have thought many times about commercially viable innovations we could develop, but these would generally be large distractions from the work of our core mission. (The Center for Applied Rationality, in contrast, has many opportunities to develop commercially viable innovations in line with its core mission.)

Still, I do think it's important for the Singularity Institute to test itself with tight feedback loops wherever feasible. This is particularly difficult for a research organization doing philosophy and long-term forecasting (30 years is not a "tight" feedback loop in the slightest), but that's what FHI does and they have more "objectively impressive" (that is, "externally proclaimed") accomplishments: lots of peer-reviewed publications, some major awards for its top researcher Nick Bostrom, etc.


SI and rationality

Holden's fourth concern about SI is that it is overconfident about the level of its own rationality, and that this seems to show itself in (e.g.) "insufficient self-skepticism" and "being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously."

What would provide good evidence of rationality? Holden explains:

I endorse Eliezer Yudkowsky's statement, "Be careful … any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility." To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts...

Unfortunately, this seems to misunderstand the term "rationality" as it is meant in cognitive science. As I explained elsewhere:

Like intelligence and money, rationality is only a ceteris paribus predictor of success.

So while it's empirically true (Stanovich 2010) that rationality is a predictor of life success, it's a weak one. (At least, it's a weak predictor of success at the levels of human rationality we are capable of training today.) If you want to more reliably achieve life success, I recommend inheriting a billion dollars or, failing that, being born+raised to have an excellent work ethic and low akrasia.

The reason you should "be careful… any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility" is because you should "never end up envying someone else's mere choices." You are still allowed to envy their resources, intelligence, work ethic, mastery over akrasia, and other predictors of success.

But I don't mean to dodge the key issue. I think SIers are generally more rational than most people (and so are LWers, it seems), but I think SIers have often overestimated their own rationality, myself included. Certainly, I think SI's leaders have been pretty irrational about organizational development at many times in the past. In internal communications about why SI should help launch CFAR, one reason on my list has been: "We need to improve our own rationality, and figure out how to create better rationalists than exist today."


SI's goals and activities

Holden's fifth concern about SI is the apparent disconnect between SI's goals and its activities:

SI seeks to build FAI and/or to develop and promote "Friendliness theory" that can be useful to others in building FAI. Yet it seems that most of its time goes to activities other than developing AI or theory.

This one is pretty easy to answer. We've focused mostly on movement-building rather than direct research because, until very recently, there wasn't enough community interest or funding to seriously begin to form an FAI team. To do that you need (1) at least a few million dollars a year, and (2) enough smart, altruistic people to care about AI risk that there exist some potential superhero mathematicians for the FAI team. And to get those two things, you've got to do mostly movement-building, e.g. Less Wrong, HPMoR, the Singularity Summit, etc.


Theft

And of course, Holden is (rightly) concerned about the 2009 theft of $118,000 from SI, and the lack of public statements from SI on the matter.

Briefly:


Pascal's Mugging

In another section, Holden wrote:

A common argument that SI supporters raise with me is along the lines of, "Even if SI's arguments are weak and its staff isn't as capable as one would like to see, their goal is so important that they would be a good investment even at a tiny probability of success."

I believe this argument to be a form of Pascal's Mugging and I have outlined the reasons I believe it to be invalid...

Some problems with Holden's two posts on this subject will be explained in a forthcoming post by Steven Kaas. But as Holden notes, some SI principals like Eliezer don't use "small probability of large impact" arguments, anyway. We in fact argue that the probability of a large impact is not tiny.


Summary of my reply to Holden

Now that I have addressed so many details, let us return to the big picture. My summarized reply to Holden goes like this:

Holden's first two objections can be summarized as arguing that developing the Friendly AI approach is more dangerous than developing non-agent "Tool" AI. Eliezer's post points out that "Friendly AI" domain experts are what you need whether you're working with Tool AI or Agent AI, because (1) both of these approaches require FAI experts (experts in seeing the consequences of mathematical objects for what humans value), and because (2) Tool AI isn't necessarily much safer than Agent AI, because Tool AIs have lots of hidden gotchas, too. Thus, "What the human species needs from an x-risk perspective is experts on This Whole Damn Problem [of AI risk], who will acquire whatever skills are needed to that end. The Singularity Institute exists to host such people and enable their research — once we have enough funding to find and recruit them."

Holden's third objection was that the argument behind SI's mission is more conjunctive than it seems. I replied that the argument behind SI's mission is actually less conjunctive than it often seems, because an "FAI team" works on a broader set of problems than Holden had realized, and because the case for AI risk is more disjunctive than many people realize. These confusions are understandable, however, and they are probably a result of insufficiently clear argumentative writing from SI on these matters — a problem we are trying to fix with several recent and forthcoming papers and other communications (like this one).

Holden's next objection concerned SI as an organization: "SI has, or has had, multiple properties that I associate with ineffective organizations." I acknowledged these problems before Holden published his post, and have since outlined the many improvements we've made to organizational effectiveness since I was made Executive Director. I addressed several of Holden's specific worries here.

Finally, Holden recommended giving to a donor-advised fund rather than to SI:

I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" to be a good reason to support Organization Y. For donors determined to donate within this cause, I encourage you to consider donating to a donor-advised fund while making it clear that you intend to grant out the funds to existential-risk-reduction-related organizations in the future....

For one who accepts my arguments about SI, I believe withholding funds in this way is likely to be better for SI's mission than donating to SI

By now I've called into question most of Holden's arguments about SI, but I will still address the issue of donating to SI vs. donating to a donor-advised fund.

First: Which public charity would administer the donor-advised fund? Remember also that in the U.S., the administering charity need not spend from the donor-advised fund as the donor wishes, though they often do.

Second: As I said earlier,

it's probably easier to reform SI into a more effective organization than it is to launch a new one, since SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

The case for funding improvements and growth at SI (as opposed to starving SI as Holden suggests) is bolstered by the fact that SI's productivity and effectiveness have been improving rapidly of late, and many other improvements (and exciting projects) are on our "to-do" list if we can raise sufficient funding to implement them.

Holden even seems to share some of this optimism:

Luke's... recognition of the problems I raise... increases my estimate of the likelihood that SI will work to address them...

I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years...


Conclusion

For brevity's sake I have skipped many important details. I may also have misinterpreted Holden somewhere. And surely, Holden and other readers have follow-up questions and objections. This is not the end of the conversation; it is closer to the beginning. I invite you to leave your comments, preferably in accordance with these guidelines (for improved discussion clarity).

214 comments

Comments sorted by top scores.

comment by Rain · 2012-07-10T02:32:50.581Z · LW(p) · GW(p)

I think this post makes a strong case for needing further donations. Have $3,000.

Replies from: ChrisHallquist, lukeprog, amywilley, Ioven, MichaelAnissimov
comment by ChrisHallquist · 2012-07-19T14:31:59.143Z · LW(p) · GW(p)

I agree. Have another $1,100. Also, for those who are interested, a link to a blog post I wrote explaining why I donated.

Replies from: lukeprog
comment by lukeprog · 2012-07-24T03:06:44.433Z · LW(p) · GW(p)

Thanks!!!

comment by lukeprog · 2012-07-10T03:00:25.678Z · LW(p) · GW(p)

Thanks!!!

comment by amywilley · 2012-07-10T03:21:24.823Z · LW(p) · GW(p)

Thank you :)

comment by Ioven · 2012-07-10T03:47:32.722Z · LW(p) · GW(p)

Thanks Rain !!

comment by MichaelAnissimov · 2012-07-11T05:52:23.882Z · LW(p) · GW(p)

Thank you Rain.

comment by lukeprog · 2012-07-10T00:10:25.432Z · LW(p) · GW(p)

This post and the reactions to it will be an interesting test for my competing models about the value of giving detailed explanations to supporters. Here are just two of them:

One model says that detailed communication with supporters is good because it allows you to make your case for why your charity matters, and thus increase the donors' expectation that your charity can turn money into goods that they value, like poverty reduction or AI risk reduction.

Another model says that detailed communication with supporters is bad because (1) supporters are generally giving out of positive affect toward the organization, and (2) that positive affect can't be increased much once they grok the mission enough to start donating, but (3) the positive affect they feel toward the charity can be overwhelmed by the absolute number of the organization's statements with which they disagree, and (4) more detailed communication with supporters increases this absolute number more quickly than limited communication that repeats the same points again and again (e.g. in a newsletter).

I worry that model #2 may be closer to the truth, in part because of things like (Dilbert-creator) Scott Adams' account of why he decided to blog less:

I hoped that people who loved the blog would spill over to people who read Dilbert, and make my flagship product stronger. Instead, I found that if I wrote nine highly popular posts, and one that a reader disagreed with, the reaction was inevitably “I can never read Dilbert again because of what you wrote in that one post.” Every blog post reduced my income, even if 90% of the readers loved it.

Replies from: komponisto, AlexMennen, wedrifid, ChrisHallquist, Giles, TheOtherDave
comment by komponisto · 2012-07-10T00:34:04.484Z · LW(p) · GW(p)

An issue that SI must inevitably confront is how much rationality it will assume of its target population of donors. If it simply wanted to raise as much money as possible, there are, I expect, all kinds of Dark techniques it could use (of which decreasing communication is only the tip of the iceberg). The problem is that SI also wants to raise the sanity waterline, since that is integral to its larger mission -- and it's hard (not to mention hypocritical) to do that while simultaneously using fundraising methods that depend on the waterline being below a certain level among its supporters.

comment by AlexMennen · 2012-07-10T01:15:59.039Z · LW(p) · GW(p)

How do you expect to determine the effects of this information on donations from the comments made by supporters? In my case, for instance, I've been fairly encouraged by the explanations like this that have been coming out of SI (and had been somewhat annoyed by the lack of them previously), but my comments tend to sound negative because I tend to focus on things that I'm still not completely satisfied with.

Replies from: lukeprog
comment by lukeprog · 2012-07-10T01:28:17.430Z · LW(p) · GW(p)

It's very hard. Comments like this help a little.

comment by wedrifid · 2012-07-10T01:08:30.105Z · LW(p) · GW(p)

Another model says that detailed communication with supporters is bad because (1) supporters are generally giving out of positive affect toward the organization, and (2) that positive affect can't be increased much once they grok the mission enough to start donating, but (3) the positive affect they feel toward the charity can be overwhelmed by the absolute number of the organization's statements with which they disagree, and (4) more detailed communication with supporters increases this absolute number more quickly than limited communication that repeats the same points again and again (e.g. in a newsletter).

As an example datapoint Eliezer's reply to Holden caused a net decrease (not necessarily an enormous one) in both my positive affect for and abstract evaluation of the merit of the organisation based off one particularly bad argument that shocked me. It prompted some degree (again not necessarily a large degree) of updating towards the possibility that SingInst could suffer the same kind of mind-killed thinking and behavior I expect from other organisations in the class of pet-cause idealistic charities. (And that matters more for FAI oriented charities than save-the-puppies charities, with the whole think-right or destroy the world thing.)

When allowing for the possibility that I am wrong and Eliezer is right you have to expect most other supporters to be wrong a non-trivial proportion of the time too so too much talking is going to have negative side effects.

Replies from: lukeprog
comment by lukeprog · 2012-07-10T01:27:51.522Z · LW(p) · GW(p)

Which issue are you talking about? Is there already a comments thread about it on Eliezer's post?

Replies from: wedrifid
comment by wedrifid · 2012-07-10T01:39:55.161Z · LW(p) · GW(p)

Which issue are you talking about? Is there already a comments thread about it on Eliezer's post?

Found it. It was nested too deep in a comment tree.

The particular line was:

I would ask him what he knows now, in advance, that all those sane intelligent people will miss. I don't see how you could (well-justifiedly) access that epistemic state.

The position is something I think it is best I don't mention again until (unless) I get around to writing the post "Predicting Failure Without Details" to express the position clearly with references and what limits apply to that kind of reasoning.

Replies from: Cyan
comment by Cyan · 2012-07-10T01:43:25.856Z · LW(p) · GW(p)

Isn't it just straight-up outside view prediction?

Replies from: wedrifid
comment by wedrifid · 2012-07-10T01:48:20.443Z · LW(p) · GW(p)

Isn't it just straight-up outside view prediction?

I thought so.

comment by ChrisHallquist · 2012-07-11T08:27:57.579Z · LW(p) · GW(p)

I can think of a really big example favoring model #2 within the atheist community. On the other hand, you and Eliezer have written so much about your views on these matters that the "detailed communication" toothpaste may not be going back in the tube. And this piece made me much more inclined to support SI, particularly the disjunctive vs. conjunctive section, which did a lot for worries raised by things Eliezer has said in the past.

comment by Giles · 2012-07-24T02:56:05.958Z · LW(p) · GW(p)

Is it possible that supporters might update on communicativeness, separately from updating on what you actually have to say? Generally when I see the SI talking to people, I feel the warm fuzziness before I actually read what you're saying. It just seems like people might associate "detailed engagement with supporters and critics" with the reference class of "good organizations".

Replies from: lukeprog
comment by lukeprog · 2012-07-24T03:05:53.678Z · LW(p) · GW(p)

Yup, that might be true. I hope so.

comment by TheOtherDave · 2012-07-10T02:39:15.389Z · LW(p) · GW(p)

Presumably, even under model #1, the extent to which detailed communication increases donor expectations of my charity's ability to turn money into valuable goods depends a lot on their pre-existing expectations, the level of expectations justified by the reality, and how effective the communication is at conveying the reality.

comment by RobertLumley · 2012-07-09T22:42:50.748Z · LW(p) · GW(p)

Regarding the theft:

I was telling my friend (who recently got into HPMOR and lurks a little on LW) about Holden's critique, specifically with regard to the theft. He's an accounting and finance major, and was a bit taken aback. His immediate response was to ask if SI had an outside accountant audit their statements. We searched around and it doesn't look to us like you do. He immediately said that he would never donate to an organization that did not have an accountant audit their statements, and knowing how much I follow LW, immediately advised me not to either. This seems like a really good step for addressing the transparency issues here, and now that he mentions it, seems a very prudent and obvious thing for any nonprofit to do.

Edit 2: Luke asked me to clarify: I am not necessarily endorsing not donating to SI because of this, unless this problem is a concern of yours. My intent was only to suggest ways SI can improve and may be turning away potential donors.

Edit: He just mentioned to me that the big four accounting firms often do pro bono work because it can be a tax write-off. This may be worth investigating.

Replies from: lukeprog, lukeprog, Eliezer_Yudkowsky
comment by lukeprog · 2012-07-09T23:33:07.938Z · LW(p) · GW(p)

Also note that thefts of this size are not as rare as they appear, because many non-profits simply don't report them. I have inside knowledge about very few charities, but even I know one charity that suffered a larger theft than SI did, and they simply didn't tell anybody. They knew that donors would punish them for the theft and not at all reward them for reporting it. Unfortunately, this is probably true for SI, too, which did report the theft.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-10T00:56:21.088Z · LW(p) · GW(p)

Yep. We knew that would happen at the time - it was explicitly discussed in the Board meeting - and we went ahead and reported it anyway, partly because we didn't want to have exposable secrets, partly because we felt honesty was due our donors, and partially because I'd looked up embezzlement-related stuff online and had found that a typical nonprofit-targeting embezzler goes through many nonprofits before being reported and prosecuted by a nonprofit "heroic" enough, if you'll pardon the expression, to take the embarrassment-hit in order to stop the embezzler.

Replies from: shminux
comment by Shmi (shminux) · 2012-07-10T01:39:44.851Z · LW(p) · GW(p)

I suspect that some of the hit was due to partial disclosure. Outsiders were left guessing what exactly had transpired and why, and what specific steps were taken to address the issue. Maybe you had to do it this way for legal reasons, but this was never spelled out explicitly.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-10T19:32:35.387Z · LW(p) · GW(p)

Pretty sure it was spelled out explicitly.

comment by lukeprog · 2012-07-09T22:47:19.951Z · LW(p) · GW(p)

Yes, we're currently in the process of hiring a bookkeeper (interviewed one, scheduling interviews with 2 others), which will allow us to get our books in enough order that an accountant will audit our statements. We do have an outside accountant prepare our 990s already. Anyway, this all requires donations. We can't get our books cleaned up and audited unless we have the money to do so.

Also, it's my impression that many or most charities our size and smaller don't have their books audited by an accountant because it's expensive to do so. It's largely the kind of thing a charity does when they have a bigger budget than we currently do. But I'd be curious to see if there are statistics on this somewhere; I could be wrong.

And yes, we are investigating the possibility of getting pro bono work from an accounting firm; it's somewhere around #27 on my "urgent to-do list." :)

Edit: BTW, anyone seriously concerned about this matter is welcome to earmark their donations for "CPA audit" so that those donations are only used for (1) paying a bookkeeper to clean up our processes enough so that an accountant will sign off on them, and (2) paying for a CPA audit of our books. I will personally make sure those earmarks are honored.

Replies from: private_messaging
comment by private_messaging · 2012-07-11T16:11:39.487Z · LW(p) · GW(p)

How many possible universes could there be (what % of the universes) where not donating to a charity that does not do its accounting right while pulling in 500 grand a year would result in the destruction of mankind? 500 grand a year is not so little when you can get away with it. My GF's family owns a company smaller than that (in the US) and it has its books in order.

Replies from: homunq
comment by homunq · 2012-07-23T16:41:30.977Z · LW(p) · GW(p)

Yeah, that would be really unfair, wouldn't it? And so it's hard to believe it could be true. And so it must not be.

(I actually don't believe it is likely to be true. But the fact it sounds silly and unfairly out-of-proportion is one of the worst possible arguments against it.)

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-10T00:47:41.578Z · LW(p) · GW(p)

You can't deduct the value of services donated to nonprofits. Not sure your friend is as knowledgeable as stated. Outside accounting is expensive and the IRS standard is to start doing it once your donations hit $2,000,000/year, which we haven't hit yet. Also, SIAI recently passed an IRS audit.

Replies from: Vaniver, lukeprog, lukeprog, somervta
comment by Vaniver · 2012-07-10T01:48:13.011Z · LW(p) · GW(p)

Fifteen seconds of Googling turned up Deloitte's pro bono service, which is done for CSR and employee morale rather than tax avoidance. Requests need to originate with Deloitte personnel. I have a friend who works there who might be interested in LW, but it'd be a while before I'd be comfortable asking him to recommend SI. It's a big enough company that there are likely some HPMOR or LW fans working there.

Replies from: Cosmos, lukeprog
comment by Cosmos · 2012-07-11T01:47:13.937Z · LW(p) · GW(p)

Interesting!

"Applications for a contribution of pro bono professional services must be made by Deloitte personnel. To be considered for a pro bono engagement, a nonprofit organization (NPO) with a 501c3 tax status must have an existing relationship with Deloitte through financial support, volunteerism, Deloitte personnel serving on its Board of Directors or Trustees, or a partner, principal or director (PPD) sponsor (advocate for the duration of the engagement). External applications for this program are not accepted. Organizations that do not currently have a relationship with Deloitte are welcome to introduce themselves to the Deloitte Community Involvement Leader in their region, in the long term interest of developing one."

Deloitte is requiring a very significant investment from its employees before offering pro bono services. Nonetheless, I have significant connections there and would be willing to explore this option with them.

Replies from: RobertLumley
comment by RobertLumley · 2012-07-11T14:27:09.785Z · LW(p) · GW(p)

You might want to PM this directly to lukeprog to make sure that he sees this comment. Since you replied to Vaniver, he may not have seen it, and this seems important enough to merit the effort.

Replies from: Cosmos
comment by Cosmos · 2012-07-13T03:46:28.741Z · LW(p) · GW(p)

Thanks for the excellent idea! I did in fact email Lukeprog personally to let him know. :)

comment by lukeprog · 2012-07-10T02:03:02.756Z · LW(p) · GW(p)

Thanks. As I said, this is something on our to-do list, but I didn't know about Deloitte in particular.

comment by lukeprog · 2012-07-10T00:53:39.240Z · LW(p) · GW(p)

Clarifications:

  • In California, a non-profit is required to obtain a CPA audit once donations hit $2m/yr, which SI hasn't hit yet. That's the sense in which outside accounting is "IRS standard" after $2m/yr.
  • SI is in the process of passing an IRS audit for the year 2010.
comment by lukeprog · 2012-07-10T01:07:58.740Z · LW(p) · GW(p)

Eliezer is right: RobertLumley's friend is mistaken:

can the value of your time and services while providing pro bono legal services qualify as a charitable contribution that is deductible from gross income on your federal tax return? Unfortunately, in a word, nope.

According to IRS Publication 526, “you cannot deduct the value of your time or services, including blood donations to the Red Cross or to blood banks, and the value of income lost while you work as an unpaid volunteer for a qualified organization.”

comment by somervta · 2012-07-10T01:01:50.051Z · LW(p) · GW(p)

He may be referring to the practice of being paid for work, then giving it back as a tax-deductible charitable donation. My understanding is that you can also deduct expenses you incur while working for a non-profit - admittedly not something I can see applying to accounting. There's also cause marketing, but that's getting a bit further afield.

Replies from: GuySrinivasan, lukeprog
comment by SarahNibs (GuySrinivasan) · 2012-07-10T02:02:45.983Z · LW(p) · GW(p)

In the one instance of a non-profit getting accounting work done that I know of, the non-profit paid and then received an equal donation. Magic.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-11T00:04:28.633Z · LW(p) · GW(p)

This is exactly equivalent to not paying, which is precisely the IRS rationale for why donated services aren't directly deductible.

comment by lukeprog · 2012-07-10T01:12:11.546Z · LW(p) · GW(p)

"the big four accounting firms often do pro bono work because it can be a tax write-off" doesn't sound much like "being paid for work, then giving it back as a tax-deductible charitable donation".

Replies from: RobertLumley
comment by RobertLumley · 2012-07-10T13:56:56.617Z · LW(p) · GW(p)

In talking to him, I think he may have just known they do pro bono work and assumed it was because of taxes. Given Vaniver's comment, this seems pretty likely to me. He did say that the request usually has to originate from inside the company, which is consistent with that comment.

Replies from: somervta
comment by somervta · 2012-07-13T00:30:17.256Z · LW(p) · GW(p)

Ah. That would make more sense.

comment by gwern · 2012-07-10T00:40:13.141Z · LW(p) · GW(p)

Certainly the fact that some really awful charities are untruthful doesn't mean SI shouldn't be held accountable merely because it managed to tell the truth.

I think you're missing Luke's implied argument that it's not just 'some' charities that are untruthful, but quite a lot of them. The situation is the same as with, say, corporations getting hacked: they have no incentive to report it because only bad things will happen, and this leads to systematic underreporting, which reinforces the equilibrium, since anyone reporting honestly will be seen as an outlier (as indeed they are) and punished. A vicious circle.

(Given the frequency of corporations having problems, and the lack of market discipline for nonprofits and how they depend on patrons, I could well believe that nonprofits routinely have problems with corruption, embezzlement, self-dealing, etc.)

Replies from: David_Gerard
comment by David_Gerard · 2012-07-15T19:47:31.584Z · LW(p) · GW(p)

Charities tend to be a trusting lot and not think of this sort of thing until it happens to them. Because they don't hear about it, for the reasons Luke sets out above. I just found out about another charity that got done in a similar manner to SIAI, though for not nearly as much money, and is presently going through the pains of disclosure.

comment by Jonathan_Graehl · 2012-07-10T01:53:00.110Z · LW(p) · GW(p)

How do I know that supporting SI doesn't end up merely funding a bunch of movement-building leading to no real progress?

It seems to me that the premise of funding SI is that people smarter (or more appropriately specialized) than you will then be able to make discoveries that otherwise would be underfunded or wrongly-purposed.

I think the (friendly or not) AI problem is hard. So it seems natural for people to settle for movement-building or other support when they get stuck.

That said, some of the collateral output to date has been enjoyable.

Replies from: lukeprog, ciphergoth, lukeprog, JaneQ
comment by lukeprog · 2013-06-11T01:59:12.201Z · LW(p) · GW(p)

Behold, I come bearing real progress! :)

Replies from: Jonathan_Graehl
comment by Jonathan_Graehl · 2013-06-11T04:38:39.024Z · LW(p) · GW(p)

The best possible response. I haven't read any of them yet, but the topics seem relevant to the long-range goal of becoming convinced of the Friendliness of complicated programs.

comment by Paul Crowley (ciphergoth) · 2012-07-10T09:01:26.516Z · LW(p) · GW(p)

For SI, movement building is more directly progress than it is for, say, Oxfam, because a big part of SI's mission is to persuade people not to do the very dangerous thing.

Replies from: Jonathan_Graehl
comment by Jonathan_Graehl · 2012-07-10T22:05:01.200Z · LW(p) · GW(p)

Good point. But I don't see any evidence that anyone who was likely to create an AI soon, now won't.

Those whose profession and status are in approximating AI largely won't change course for what must seem to them like sci-fi tropes. [1]

Or, put another way, there are working computer scientists who are religious - you can't expect reason everywhere in someone's life.

[1] but in the long run, perhaps SI and others can offer a smooth transition for dangerously smart researchers into high-status alternatives such as FAI or other AI risk mitigation.

Replies from: endoself, lincolnquirk
comment by endoself · 2012-07-11T00:09:52.013Z · LW(p) · GW(p)

But I don't see any evidence that anyone who was likely to create an AI soon, now won't.

According to Luke, Moshe Looks (head of Google's AGI team) is now quite safety conscious, and a Singularity Institute supporter.

Replies from: lukeprog
comment by lukeprog · 2012-10-19T00:53:52.628Z · LW(p) · GW(p)

Update: It's not really correct to say that Google has "an AGI team." Moshe Looks has been working on program induction, and this guy said that some people are working on AI "on a large scale," but I'm not aware of any publicly-visible Google project which has the ambitions of, say, Novamente.

comment by lincolnquirk · 2012-07-11T04:46:09.442Z · LW(p) · GW(p)

The plausible story in movement-building is not convincing existing AGI PIs to stop a long program of research, but instead convincing younger people who would otherwise eventually become AGI researchers to do something safer. The evidence to look for would be people who said "well, I was going to do AI research but instead I decided to get involved with SingInst type goals" -- and I suspect someone who knows the community better might be able to cite quite a few people for whom this is true, though I don't have any names myself.

Replies from: Jonathan_Graehl
comment by Jonathan_Graehl · 2012-07-11T17:40:09.481Z · LW(p) · GW(p)

I didn't think of that. I expect current researchers to be dead or nearly senile by the time we have plentiful human substitutes/emulations, so I shouldn't care that incumbents are unlikely to change careers (except for the left tail - I'm very vague in my expectation).

comment by lukeprog · 2012-07-10T01:58:29.649Z · LW(p) · GW(p)

Movement-building is progress, but...

I hear ya. If I'm your audience, you're preaching to the choir. Open Problems in Friendly AI — more in line with what you'd probably call "real progress" — is something I've been lobbying for since I was hired as a researcher in September 2011, and I'm excited that Eliezer plans to begin writing it in mid-August, after SPARC.

some of the collateral output to date has been enjoyable

Such as?

Replies from: Jonathan_Graehl
comment by Jonathan_Graehl · 2012-07-10T22:01:01.424Z · LW(p) · GW(p)

The philosophy and fiction have been fun (though they hardly pay my bills).

I've profited from reading well-researched posts on the state of evidence-based (social-) psychology / nutrition / motivation / drugs, mostly from you, Yvain, Anna, gwern, and EY (and probably a dozen others whose names aren't available).

The bias/rationality stuff was fun to think about, but "ugh fields", for me at least, turned out to be the only thing that mattered. I imagine that's different for other types of people, though.

Additionally, the whole project seems to have connected people who didn't belong to any meaningful communities (thinking of various regional meetup clusters).

comment by JaneQ · 2012-07-12T07:18:33.823Z · LW(p) · GW(p)

It seems to me that the premise of funding SI is that people smarter (or more appropriately specialized) than you will then be able to make discoveries that otherwise would be underfunded or wrongly-purposed.

But then SI has to have a dramatically better idea of what research has to be funded to protect mankind than every other group of people capable of either performing such research or employing people to perform it.

Muehlhauser has stated that SI should be compared to alternatives in the form of other organizations working on AI risk mitigation, but that seems like an overly narrow choice, reliant on the presumption that not working on AI risk mitigation now is not itself an alternative.

For example, 100 years ago it would seem to have been too early to fund work on AI risk mitigation; that may still be the case. As time goes on, one would naturally expect opinions to form a distribution, with the first organizations offering AI risk mitigation popping up earlier than the time at which such work becomes effective. When we look into the past through the goggles of notoriety, we don't see all the failed early starts.

Replies from: Vladimir_Nesov, Jonathan_Graehl
comment by Vladimir_Nesov · 2012-07-13T00:54:16.038Z · LW(p) · GW(p)

For example, 100 years ago it would seem to have been too early to fund work on AI risk mitigation

Disagree. There are many remaining theoretical (philosophical and mathematical) difficulties whose investigation doesn't depend on the current level of technology. It would've been better to start working on the problem 300 years ago, when AI risk was still far away. The value of information on this problem is high, and we don't (didn't) know that there is nothing to be discovered; it wouldn't be surprising if some kind of progress were made.

Replies from: Eliezer_Yudkowsky, None, army1987
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-14T00:43:30.442Z · LW(p) · GW(p)

I do think OP is right that in practice, 100 years ago, it would have been really hard to figure out what an AI issue looked like. This was pre-Godel, pre-decision-theory, pre-Bayesian-revolution, and pre-computer. Yes, a sufficiently competent Earth would be doing AI math before it had the technology for computers, in full awareness of what it meant - but that's a pretty darned competent Earth we're talking about.

Replies from: JaneQ
comment by JaneQ · 2012-07-14T09:29:50.339Z · LW(p) · GW(p)

I think it is fair to say Earth was doing the "AI math" before the computers. Extending this to today: there is a lot of mathematics to be done for a good, safe AI, but how are we to know that SI has the actionable effort-planning skills required to correctly identify and fund research in such mathematics?

I know that you believe that you have the required skills; but note that in my model, such a belief can result either from the presence of extraordinary effort-planning skill or from the absence of effort-planning skill. The prior probability of extraordinary effort-planning skill is very low. Furthermore, as effort planning is, to some extent, a cross-domain skill, the prior inefficacy (which Holden criticized) seems to be fairly strong evidence against extraordinary skill in this area.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-14T23:36:07.973Z · LW(p) · GW(p)

If my writings (on FAI, on decision theory, and on the form of applied-math-of-optimization called human rationality) so far haven't convinced you that I stand a sufficient chance of identifying good math problems to solve to maintain the strength of an input into existential risk, you should probably fund CFAR instead. This is not, in any way, shape, or form, the same skill as the ability to manage a nonprofit. I have not ever, ever claimed to be good at managing people, which is why I kept trying to have other people do it.

Replies from: JaneQ
comment by JaneQ · 2012-07-16T14:07:09.557Z · LW(p) · GW(p)

I'm not sure why you think that such writings should convince a rational person that you have the relevant skill. If you were an art critic, even a very good one, that would not convince people you are a good artist.

This is not, in any way shape or form, the same skill as the ability to manage a nonprofit.

Indeed, but you are asking me to assume that the skills you display in writing your articles are the same as the skills relevant to directing the AI effort.

edit: Furthermore, when it comes to works on rationality as 'applied math of optimization', the most obvious way to evaluate those writings is to look for some great success attributable to them - some highly successful businessmen saying how much the article on such-and-such fallacy helped them succeed, that sort of thing.

Replies from: AlexanderD
comment by AlexanderD · 2012-11-14T07:27:10.998Z · LW(p) · GW(p)

It seems to me that the most obvious way to demonstrate the brilliance and excellent outcomes of the applied math of optimization would be to generate large sums of money, rather than seeking endorsements.

The Singularity Institute could begin this at no cost (beyond the opportunity cost of staff time) by employing the techniques of rationality in a simulated market - for example, if stock picking were the chosen venue. After a few months of simulated profits, SI could put in $1,000 of real money. If that kept growing, then a larger investment could be considered.

This has been done, very recently. Someone on Overcoming Bias recently wrote of how they and some friends made about $500 each with a small investment by identifying an opportunity for arbitrage between the markets on InTrade and another prediction market, without any loss.
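
For concreteness, here is a minimal sketch of that arbitrage arithmetic with made-up prices; the actual markets and prices in that anecdote aren't specified, so treat the numbers as purely illustrative.

```python
# A minimal sketch of prediction-market arbitrage, with made-up numbers.
# Buying YES on market A and NO on market B for the same binary event
# guarantees that exactly one side settles at $1 per contract, so if the
# two prices sum to less than $1 the combined position is a risk-free
# profit (ignoring fees and settlement risk).

def arbitrage_profit(yes_price_a: float, no_price_b: float, contracts: int) -> float:
    """Profit from buying YES on market A and NO on market B for the same event."""
    cost = (yes_price_a + no_price_b) * contracts
    payout = 1.00 * contracts  # one side always settles at $1 per contract
    return payout - cost

# Hypothetical prices: YES at $0.60 on one market, NO at $0.30 on the other.
print(arbitrage_profit(0.60, 0.30, 5000))  # -> 500.0
```

The hard part, of course, is finding two markets quoting inconsistent prices for the same event before the gap closes.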

Money can be made, according to proverb, by being faster, luckier, or smarter. It's impossible to create luck in the market, and in the era of microsecond purchases by Goldman Sachs it's very nearly impossible to be faster, but an organization (or perhaps associated organizations?) devoted to defeating internal biases and mathematically assessing the best choices in the world should be striving to be smarter.

While it seems very interesting and worthwhile to work on existential risk from UFAI directly, the smarter thing to do might be to devote a decade to making an immense pile of money for the institute and developing the associated infrastructure (hiring money managers, socking a bunch away into Berkshire Hathaway for safety, etc.). Then hire a thousand engineers and mathematicians. And what's more, you'll raise awareness of UFAI far more than you would have otherwise, plugging along as just another $1-2m charity.

I'm sure this must have been addressed somewhere, of course - there is simply way too much written in too many places by too many smart people. But it is odd to me that SI's page on Strategic Insight doesn't have as #1: Become Rich. Maybe if someone notices this comment, they can point me to the argument against it?

Replies from: Kawoomba
comment by Kawoomba · 2012-11-14T08:28:29.514Z · LW(p) · GW(p)

The official introductory SI pages may have to sugarcoat such issues due to PR considerations ("everyone get rich, then donate your riches" sends off a bad vibe).

As you surmised, your idea has been brought up quite often in various contexts, especially in optimal charity discussions. For many/most endeavors, the globally optimal starting steps are "acquire more capabilities / become more powerful" (players of strategy games may be more explicitly cognizant of that stratagem).

I also do remember speculation that friendly AI and unfriendly AI may act very similarly at first - both choosing the optimal path to powering up, so that they can pursue the differing goals of their respective utility functions more efficiently at a future point in time. So your thoughts on the matter seem compatible with the local belief cluster.

Your money proverb seems to still hold true; anecdotally, I'm acquainted with some CS people making copious amounts of money on NASDAQ doing simple ANOVA analyses while barely being able to spell the companies' names. So why aren't we doing that? Maybe a combination of mental inertia and being locked into a research/get-endorsements modus operandi, which may be hard to shift out of into a more active "let's create start-ups"/"let's do day-trading" mode.

A goal function of "seek influential person X's approval" will lead to a different mindset from "let quantifiable results speak for themselves"; the latter allows you to avoid optimizing every step of the way for signalling purposes.

comment by [deleted] · 2012-07-13T01:20:37.108Z · LW(p) · GW(p)

How would you even pose the question of AI risk to someone in the eighteenth century?

I'm trying to imagine what comes out the other end of Newton's chronophone, but it sounds very much like "You should think really hard about how to prevent the creation of man-made gods."

Replies from: Vladimir_Nesov, summerstay, johnlawrenceaspden, army1987
comment by Vladimir_Nesov · 2012-07-13T01:27:24.607Z · LW(p) · GW(p)

I don't think it's plausible that people could stumble on the problem statement 300 years ago, but within that hypothetical, it wouldn't have been too early.

Replies from: JaneQ
comment by JaneQ · 2012-07-13T14:04:02.280Z · LW(p) · GW(p)

It seems to me that 100 years ago (or more) you would have to consider pretty much any philosophy and mathematics to be relevant to AI risk reduction, as well as to the reduction of other potential risks, and attempts to select the work particularly conducive to AI risk reduction would not be able to succeed. Effort planning is the key to success.

On a somewhat unrelated note: reading the publications and this thread, there is a point of definition that I do not understand: what exactly does S.I. mean when it speaks of a "utility function" in the context of an AI? Is it a computable mathematical function over a model, such that the 'intelligence' component computes the action that results in the maximum of that function taken over the world state resulting from the action?
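
For concreteness, a sketch of the reading I have in mind; this is my guess at the standard expected-utility formulation, not necessarily S.I.'s own definition:

```latex
% One possible reading (a guess, not necessarily S.I.'s definition):
% the agent chooses the action whose predicted resulting world state
% maximizes a computable utility function U.
a^{*} = \arg\max_{a \in A} \; \sum_{s \in S} P(s \mid a) \, U(s)
```

Here A is the set of available actions, P(s | a) is the model's probability that action a leads to world state s, and U is the computable utility function over world states.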

Replies from: johnlawrenceaspden
comment by johnlawrenceaspden · 2012-07-16T11:30:09.402Z · LW(p) · GW(p)

Surely "Effort planning is a key to success"?

Also, and not just wanting to flash academic applause lights but also genuinely curious, which mathematical successes have been due to effort planning? Even in my own mundane commercial programming experiences, the company which won the biggest was more "This is what we'd like, go away and do it and get back to us when it's done..." than "We have this Gantt chart...".

comment by summerstay · 2012-07-18T14:06:22.541Z · LW(p) · GW(p)

There are very few people who would have understood in the 18th century, but Leibniz would have understood in the 17th. He underestimated the difficulty in creating an AI, like everyone did before the 1970s, but he was explicitly trying to do it.

Replies from: None
comment by [deleted] · 2012-07-18T17:52:04.080Z · LW(p) · GW(p)

Your definition of "explicit" must be different from mine. Working on prototype arithmetic units and toying with the universal characteristic is AI research? He subscribed wholeheartedly to the ideographic myth; the most he would have been capable of is a machine that passes around LISP tokens.

In any case, based on the Monadology, I don't believe Leibniz would consider the creation of a godlike entity to be theologically possible.

comment by johnlawrenceaspden · 2012-07-16T11:34:43.201Z · LW(p) · GW(p)

How about: "Eventually your machines will be so powerful they can grant wishes. But remember that they are not benevolent. What will you wish for when you can make a wish-machine?"

comment by A1987dM (army1987) · 2012-07-15T12:28:58.325Z · LW(p) · GW(p)

Oh, wait... The tale of the Tower of Babel was told via chronophone by people from the future right before succumbing to uFAI!

comment by A1987dM (army1987) · 2012-07-13T22:14:03.398Z · LW(p) · GW(p)

That's hindsight. Nobody could have reasonably foreseen the rise of very powerful computing machines that long ago.

comment by Jonathan_Graehl · 2012-07-12T23:49:51.525Z · LW(p) · GW(p)

100 years ago it would seem to have been too early to fund work on AI risk mitigation

Hilarious, and an unfairly effective argument. I'd like to know such people, who can entertain an idea that will still be tantalizing yet unresolved a century out.

that seems like an overly narrow choice, reliant on the presumption that not working on AI risk mitigation now is not itself an alternative.

Yes. I agree with everything else, too, with the caveat that SI is not the first organization to draw attention to AI risk - not that you said so.

comment by HoldenKarnofsky · 2012-08-01T14:16:55.899Z · LW(p) · GW(p)

I greatly appreciate the response to my post, particularly the highly thoughtful responses of Luke (original post), Eliezer, and many commenters.

Broad response to Luke's and Eliezer's points:

As I see it, there are a few possible visions of SI's mission:

  • M1. SI is attempting to create a team to build a "Friendly" AGI.
  • M2. SI is developing "Friendliness theory," which addresses how to develop a provably safe/useful/benign utility function without needing iterative/experimental development; this theory could be integrated into an AGI developed by another team, in order to ensure that its actions are beneficial.
  • M3. SI is broadly committed to reducing AGI-related risks, and work on whatever will work toward that goal, including potentially M1 and M2.

My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team. An organization with a very narrow, specific mission - such as "analyzing how to develop a provably safe/useful/benign utility function without needing iterative/experimental development" - can, relatively easily, establish which other organizations (if any) are trying to provide what it does and what the relative qualifications are; it can set clear expectations for deliverables over time and be held accountable to them; its actions and outputs are relatively easy to criticize and debate. By contrast, an organization with broader aims and less clearly relevant deliverables - such as "broadly aiming to reduce risks from AGI, with activities currently focused on community-building" - is giving a donor (or evaluator) less to go on in terms of what the space looks like, what the specific qualifications are and what the specific deliverables are. In this case it becomes more important that a donor be highly confident in the exceptional effectiveness of the organization and team as a whole.

Many of the responses to my criticisms (points #1 and #4 in Eliezer's response; "SI's mission assumes a scenario that is far less conjunctive than it initially appears" and "SI's goals and activities" section of Luke's response) correctly point out that they have less force, as criticisms, when one views SI's mission as relatively broad. However, I believe that evaluating SI by a broader mission raises the burden of affirmative arguments for SI's impressiveness. The primary such arguments I see in the responses are in Luke's list:

(1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world, and (3) the Singularity Summit, a mainstream-aimed conference that brings in people who end up making significant contributions to the movement — e.g. Tomer Kagan (an SI donor and board member) and David Chalmers (author of The Singularity: A Philosophical Analysis and The Singularity: A Reply).

I've been a consumer of all three of these, and while I've found them enjoyable, I don't find them sufficient for the purpose at hand. Others may reach a different conclusion. And of course, I continue to follow SI's progress, as I understand that it may submit more impressive achievements in the future.

Both Luke and Eliezer seem to disagree with the basic approach I'm taking here. They seem to believe that it is sufficient to establish that (a) AGI risk is an overwhelmingly important issue and that (b) SI compares favorably to other organizations that explicitly focus on this issue. For my part, I (a) disagree with the statement: "the loss in expected value resulting from an existential catastrophe is so enormous that the objective of reducing existential risks should be a dominant consideration whenever we act out of an impersonal concern for humankind as a whole"; (b) do not find Luke's argument that AI, specifically, is the most important existential risk to be compelling (it discusses only how beneficial it would be to address the issue well, not how likely a donor is to be able to help do so); (c) believe it is appropriate to compare the overall organizational impressiveness of the Singularity Institute to that of all other donation-soliciting organizations, not just to that of other existential-risk- or AGI-focused organizations. I would guess that these disagreements, particularly (a) and (c), come down to relatively deep worldview differences (related to the debate over "Pascal's Mugging") that I will probably write more about in the future.

On tool AI:

Most of my disagreements with SI representatives seem to be over how broad a mission is appropriate for SI, and how high a standard SI as an organization should be held to. However, the debate over "tool AI" is different, with both sides making relatively strong claims. Here SI is putting forth a specific point as an underappreciated insight and thus as a potential contribution/accomplishment; my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

My latest thoughts on this disagreement were posted separately in a comment response to Eliezer's post on the subject.

A few smaller points:

  • I disagree with Luke's claim that " objection #1 punts to objection #2." Objection #2 (regarding "tool AI") points out one possible approach to AGI that I believe is both consonant with traditional software development and significantly safer than the approach advocated by SI. But even if the "tool AI" approach is not in fact safer, there may be safer approaches that SI hasn't thought of. SI does not just emphasize the general problem that AGI may be dangerous (something that I believe is a fairly common view), but emphasizes a particular approach to AGI safety, one that seems to me to be highly dangerous. If SI's approach is dangerous relative to other approaches that others are taking/advocating, or even approaches that have yet to be developed (and will be enabled by future tools and progress on AGI), this is a problem for SI.
  • Luke states that rationality is "only a ceteris paribus predictor of success" and that it is a "weak one." I wish to register that I believe rationality is a strong (though not perfect) predictor of success, within the population of people who are as privileged (in terms of having basic needs met, access to education, etc.) as most SI supporters/advocates/representatives. So while I understand that success is not part of the definition of rationality, I stand by my statement that it is "the best evidence of superior general rationality (or of insight into it)."
  • Regarding donor-advised funds: opening an account with Vanguard, Schwab or Fidelity is a simple process, and I doubt any of these institutions would overrule a recommendation to donate to an organization such as SI (in any case, this is easily testable).
Replies from: Wei_Dai, DaFranker, lukeprog, John_Maxwell_IV
comment by Wei Dai (Wei_Dai) · 2012-08-02T10:29:56.051Z · LW(p) · GW(p)

My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team.

Can you describe a hypothetical organization and some examples of the impressive achievements it might have, which would pass the bar for handling mission M3? What is your estimate of the probability of such an organization coming into existence in the next five or ten years, if a large fraction of current SI donors were to put their money into donor-advised funds instead?

comment by DaFranker · 2012-08-01T15:29:42.214Z · LW(p) · GW(p)

I'm very much an outsider to this discussion, and by no means a "professional researcher", but I believe those to be the primary reasons why I'm actually qualified to make the following point. I'm sure it's been made before, but a rapid scan revealed no specific statement of this argument quite as directly and explicitly.

HoldenKarnofsky: (...) my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

I've always understood SI's position on this matter not as one of "We should not focus on building Tool AI! Fully reflectively self-modifying AGIs are the only way to go!", but rather that it is extremely unlikely that we can prevent everyone else from building one.

To my understanding, the logic goes: if any programmer with relevant skills is sufficiently convinced, by whatever means and for whatever reasons, that building a full traditional AGI is more efficient and will more "lazily" achieve his goals with fewer resources or achieve them faster, the programmer will build it whether you think it's a good idea or not. As such, SI's "Moral Imperative" is to account for this scenario, since there is a non-negligible probability of it actually happening; if they do not, they effectively become hypocritical in claiming to work towards reducing existential AI risk.

To reiterate with silly scare-formatting: It is completely irrelevant, in practice, what SI "advocates" or "promotes" as a preferred approach to building safe AI, because the probability that someone, somewhere, some day is going to use the worst possible approach is definitely non-negligible. If there is not already a sufficiently advanced Friendly AI in place to counter such a threat, we are then effectively defenseless.

To metaphorize, this is a case of: "It doesn't matter if you think only using remote-controlled battle robots would be a better way to resolve international disputes. At some point, someone somewhere is going to be convinced that killing all of you is going to be faster and cheaper and more certain of achieving their goals, so they'll build one giant bomb and throw it at you without first making sure they won't kill themselves in the process."

Replies from: John_Maxwell_IV, None
comment by John_Maxwell (John_Maxwell_IV) · 2012-08-03T05:28:21.162Z · LW(p) · GW(p)

This looks similar to this point Kaj Sotala made. My own restatement: As the body of narrow AI research devoted to making tools grows larger and larger, building agent AGI gets easier and easier, and there will always be a few Shane Legg types who are crazy enough to try it.

I sometimes suspect that Holden's true rejection to endorsing SI is that the optimal philanthropy movement is fringe enough already, and he doesn't want to associate it with nutty-seeming beliefs related to near-inevitable doom from superintelligence. Sometimes I wish SI would market themselves as being similar to nuclear risk organizations like the Bulletin of Atomic Scientists. After all, EY was an AI researcher who quit and started working on Friendliness when he saw the risks, right? I think you could make a pretty good case for SI's usefulness just working based on analogies from nuclear risk, without any mention of FOOM or astronomical waste or paperclip maximizers.

Ideally we'd have wanted to know about nuclear weapon risks before having built them, not afterwards, right?

Replies from: DaFranker
comment by DaFranker · 2012-08-03T12:58:33.371Z · LW(p) · GW(p)

Personally, I highly doubt that is Holden's true rejection, though it is most likely one of the emotional considerations that cannot be ignored from a strategic perspective. Holden claims to have gone through most of the relevant LessWrong sequences and SIAI's public presentation material, which makes the likelihood of deceptive (or self-deceptive) argumentation lower, I believe.

No, what I believe to be the real issue is that Holden and (most of) SIAI have disagreements over many specific claims used to justify broader claims - if the specific claims are granted in principle, both seem to generally agree, in good Bayesian fashion, on the broader or more general claim. Much of the disagreement on those specifics also appears to stem from different priors in ethical and moral values, as well as differences in their evaluations and models of human population behavior and specific (but often unspecified) "best guess" probabilities.

For a generalized example, one strong claim for existential risk reduction being the optimal use of effort is that even a minimal decrease in risk provides immense expected value, simply from the sheer magnitude of what humanity could most likely achieve throughout the rest of its existence. Many experts and scientists outright reject this on the grounds that "future, intangible, merely hypothetical other humans" should not be assigned value on the same order of magnitude as current humans, or even one order of magnitude lower.

comment by [deleted] · 2012-08-01T16:01:43.779Z · LW(p) · GW(p)

but rather that it is extremely unlikely that we can prevent everyone else from building one.

Well, SI's mission makes sense on the premise that the best way to prevent a badly built AGI from being developed or deployed is to build a friendly AGI which has that as one of its goals. 'Best way' here is a compromise between, on the one hand, the effectiveness of the FAI relative to other approaches, and on the other, the danger presented by the FAI itself as opposed to other approaches.

So I think Holden's position is that the ratio of danger vs. effectiveness does not weigh favorably for FAI as opposed to tool AI. So to argue against Holden, we would have to argue either that FAI will be less dangerous than he thinks, or that tool AI will be less effective than he thinks.

I take it the latter is the more plausible.

Replies from: DaFranker
comment by DaFranker · 2012-08-02T18:27:23.138Z · LW(p) · GW(p)

Indeed, we would have to argue that to argue against Holden.

My initial reaction was to counter this with a claim that we should not be arguing against anyone in the first place, but rather looking for probable truth (concentrate anticipations). And then I realized how stupid that was: Arguments Are Soldiers. If SI (and by the Blue vs Green principle, any SI-supporter) can't even defend a few claims and defeat its opponents, it is obviously stupid and not worth paying attention to.

SI needs some amount of support, yet support-maximization strategies carry a very high risk of introducing highly dangerous intellectual contamination through various forms (including self-reinforcing biases in the minds of researchers and future supporters) that could turn out to cause even more existential risk. Yet, at the same time, not gathering enough support quickly enough dramatically augments the risk that someone, somewhere, is going to trip on a power cable and poof, all humans are just gone.

I am definitely not masterful enough in mathematics and bayescraft to calculate the optimal route through this differential probabilistic maze, but I suspect others could provide a very good estimate.

Also, it's very much worth noting that these very considerations, on a meta level, are an integral part of SI's mission, so figuring out whether that premise you stated is true or not, and whether there are better solutions or not actually is SI's objective. Basically, while I might understand some of the cognitive causes for it, I am still very much rationally confused when someone questions SI's usefulness by questioning the efficiency of subgoal X, while SI's original and (to my understanding) primary mission is precisely to calculate the efficiency of subgoal X.

comment by lukeprog · 2012-08-03T04:23:00.195Z · LW(p) · GW(p)

Just a few thoughts for now:

  • I agree that some of our disagreements "come down to relatively deep worldview differences (related to the debate over 'Pascal's Mugging')." The forthcoming post on this subject by Steven Kaas may be a good place to engage further on this matter.
  • I retain the claim that Holden's "objection #1 punts to objection #2." For the moment, we seem to be talking past each other on this point. The reply Eliezer and I gave on Tool AI was not just that Tool AI has its own safety concerns, but also that understanding the tool AI approach and other possible approaches to the AGI safety problem are part of what an "FAI Programmer" does. We understand why people have gotten the impression that SI's FAI team is specifically about building a "self-improving CEV-maximizing agent", but that's just one approach under consideration, and figuring out which approach is best requires the kind of expertise that SI aims to host.
  • The evidence suggesting that rationality is a weak predictor of success comes from studies on privileged Westerners. Perhaps Holden has a different notion of what counts as a measure of rationality than the ones currently used by psychologists?
  • I've looked further into donor advised funds and now agree that the institutions named by Holden are unlikely to overrule their client's wishes.
  • I, too, would be curious to hear Holden's response to Wei Dai's question.
Replies from: aaronsw
comment by aaronsw · 2012-08-04T11:18:27.475Z · LW(p) · GW(p)

On the question of the impact of rationality, my guess is that:

  1. Luke, Holden, and most psychologists agree that rationality means something roughly like the ability to make optimal decisions given evidence and goals.

  2. The main strand of rationality research followed by both psychologists and LWers has been focused on fairly obvious cognitive biases. (For short, let's call these "cognitive biases".)

  3. Cognitive biases cause people to make choices that are most obviously irrational, but not most importantly irrational. For example, it's very clear that spinning a wheel should not affect people's estimates of how many African countries are in the UN. But do you know anyone for whom this sort of thing is really their biggest problem?

  4. Since cognitive biases are the primary focus of research into rationality, rationality tests mostly measure how good you are at avoiding them. These are the tests used in the studies psychologists have done on whether rationality predicts success.

  5. LW readers tend to be fairly good at avoiding cognitive biases (and will be even better if CFAR takes off).

  6. But there are a whole series of much more important irrationalities that LWers suffer from. (Let's call them "practical biases" as opposed to "cognitive biases", even though both are ultimately practical and cognitive.)

  7. Holden is unusually good at avoiding these sorts of practical biases. (I've found Ray Dalio's "Principles", written by Holden's former employer, an interesting document on practical biases, although it also has a lot of stuff I disagree with or find silly.)

  8. Holden's superiority at avoiding practical biases is a big part of why GiveWell has tended to be more successful than SIAI. (Givewell.org has around 30x the amount of traffic as Singularity.org according to Compete.com and my impression is that it moves several times as much money, although I can't find a 2011 fundraising total for SIAI.)

  9. lukeprog has been better at avoiding practical biases than previous SIAI leadership and this is a big part of why SIAI is improving. (See, e.g., lukeprog's debate with EY about simply reading Nonprofit Kit for Dummies.)

  10. Rationality, properly understood, is in fact a predictor of success. Perhaps if LWers used success as their metric (as opposed to getting better at avoiding obvious mistakes), they might focus on their most important irrationalities (instead of their most obvious ones), which would lead them to be more rational and more successful.

Replies from: lukeprog
comment by lukeprog · 2013-02-19T04:03:16.859Z · LW(p) · GW(p)

For the record, I basically agree with all this.

comment by John_Maxwell (John_Maxwell_IV) · 2012-08-03T19:03:01.330Z · LW(p) · GW(p)

I would guess that these disagreements, particularly (a) and (c), come down to relatively deep worldview differences (related to the debate over "Pascal's Mugging") that I will probably write more about in the future.

How does Givewell plan to deal with the possibility that people who come to Givewell looking for charity advice may have a variety of worldviews that impact their thinking on this?

comment by fubarobfusco · 2012-07-09T22:48:16.170Z · LW(p) · GW(p)

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future.

This equates "managing AI risk" and "building FAI" without actually making the case that these are equivalent. Many people believe that dangerous research can be banned by governments, for instance; it would be useful to actually make the case (or link to another place where it has been made) that managing AI risk is intractable without FAI.

Replies from: lukeprog
comment by lukeprog · 2012-07-09T23:02:18.776Z · LW(p) · GW(p)

This is one of the 10,000 things I didn't have the space to discuss in the original post, but I'm happy to briefly address it here!

It's much harder to successfully ban AI research than to successfully ban, say, nuclear weapons. Nuclear weapons require rare and expensive fissile material that requires rare heavy equipment to manufacture. Such things can be tracked to some degree. In contrast, AI research requires... um... a few computers.

Moreover, it's really hard to tell whether the code somebody is running on a computer is potentially dangerous AI stuff or something else. Even if you magically had a monitor installed on every computer to look for dangerous AI stuff, it would have to know what "dangerous AI stuff" looks like, which is hard to do before the dangerous AI stuff is built in the first place.

The monetary, military, and political incentives to build AGI are huge, and would be extremely difficult to counteract through a worldwide ban. You couldn't enforce the ban, anyway, for the reasons given above. That's why Ben Goertzel advocates "Nanny AI," though Nanny AI may be FAI-complete, as mentioned here.

I hope that helps?

Replies from: fubarobfusco, None
comment by fubarobfusco · 2012-07-10T00:35:48.944Z · LW(p) · GW(p)

Yes.

comment by [deleted] · 2012-07-13T01:07:08.606Z · LW(p) · GW(p)

It does help...I had the same reaction as fubarob.

However, your argument assumes that our general IT capabilities have already matured to the point where AGI is possible. I agree that restricting AGI research at that point is likely a lost cause. Much less clear to me is whether it is equally futile to try restricting IT and computing research, or even general technological progress, before such a point. Could we expect to bring about global technological stasis? One may be tempted to say that such an effort is doomed to a fate like that of global warming accords, except ten times deader. I disagree entirely. Both Europe and the United States, in fact, have in recent years been implementing a quite effective policy of zero economic growth! It is true that progress in computing has continued despite the general slowdown, but this hardly seems written in stone for the future. In fact, we need only consider this paper on existential risks, and its "Crunches" section, for several examples of how stasis might be brought about.

Can anyone recommend detailed discussions of broad relinquishment, from any point of view? The closest writings I know are Bill McKibben's book Enough and Bill Joy's essay, but anything else would be great.

Replies from: lukeprog
comment by lukeprog · 2012-07-13T03:09:09.289Z · LW(p) · GW(p)

I'm pretty sure we have the computing power to run at least one AGI, but I can't prove that. Still, restricting general computing progress should delay the arrival of AGI because the more hardware you have, the "dumber" you can be with solving the AGI software problem. (You can run less efficient algorithms.)

Global technological stasis seems just as hard as, or maybe harder than, restricting AGI research. The incentives for AGI are huge, but there might be some points at which you have to spend a lot of money before you get much additional economic/political/military advantage. When it comes to general computing progress, though, it always looks to be the case that a bit more investment can yield better returns, e.g. by squeezing more processor cores into a box a little more tightly.

Other difficulties of global technological stasis are discussed in the final chapter of GCR, called "The Totalitarian Threat." Basically, you'd need some kind of world government, because any country that decides to slow down its computing progress rapidly falls behind other nations. But political progress is so slow that it seems unlikely we'll get a world government in the next century unless somebody gets a decisive technological advantage via AGI, in which case we're talking about the AGI problem anyway. (The world government scenario looks like the most plausible of Bostrom's "crunches", which is why I chose to discuss it.)

Relinquishment is also discussed (in very different ways) by Berglas (2009), Kaczynski (1995), and De Garis (2005).

Replies from: None, None
comment by [deleted] · 2012-07-13T03:28:11.031Z · LW(p) · GW(p)

Global technological stasis seems just as hard or maybe harder than restricting AGI research.

I don't want to link directly to what I believe is a solution to this problem, as it was so widely misinterpreted, but I thought there was already a solution to this problem that hinged on how expensive it is to harden fabrication plants. In that case, you wouldn't even need one world government; a sufficiently large collaboration (e.g., NATO) would do the trick.

comment by [deleted] · 2012-07-13T06:00:14.651Z · LW(p) · GW(p)

Quoting the B-man:

A world government may not be the only form of stable social equilibrium that could permanently thwart progress. Many regions of the world today have great difficulty building institutions that can support high growth. And historically, there are many places where progress stood still or retreated for significant periods of time. Economic and technological progress may not be as inevitable as it appears to us.

It seems to me that saying a world government is necessary underestimates the potential impact of growth-undermining ideas. If all large governments, for example, buy into the idea that cutting government infrastructure spending in a recession boosts employment, then we can assume that global growth will slow as a result of the acceptance of this false assertion. To me the key seems to be less having some world nabob pronouncing edicts on technology and more shifting the global economy so as to make edicts merely the gift wrap.

I will definitely take a look at that chapter of GCR. Thanks also for the other links. The little paper by Berglas I found interesting. Mr. K needs no comment, naturally. It might be good reading for similar reasons that Mein Kampf is. With De Garis I have long felt that the kook factor was too high to warrant messing with him. Anyone read him much? Maybe it would be good just for some ideas.

Replies from: lukeprog
comment by lukeprog · 2012-07-13T06:45:45.367Z · LW(p) · GW(p)

It certainly could be true that economic growth and technological progress can slow down. In fact, I suspect the former will slow down, and perhaps the latter, too. That's very different from stopping the technological progress that will lead to AGI, though.

Replies from: elharo, None
comment by elharo · 2013-02-08T12:59:04.753Z · LW(p) · GW(p)

Not only can economic growth and technological progress slow down. They can stop and reverse. Just because we're now further out in front than humanity has ever been before in history does not mean that we can't go backwards. Economic growth is probably more likely to reverse than technological progress. That's what a depression is, after all.

But a sufficiently bad global catastrophe, perhaps one that destroyed the electrical grid and other key infrastructure, could reverse a lot of technological progress too and perhaps knock us way back without necessarily causing complete extinction.

comment by [deleted] · 2012-07-14T03:51:55.126Z · LW(p) · GW(p)

I think technological stasis could really use more discussion. For example I was able to find this paper by James Hughes discussing relinquishment. He however treats the issue as one of regulation and verification, similar to nuclear weapons, noting that:

We do not yet have effective global institutions capable of preventing determined research of any kind, even apocalyptic. I believe we will build strong global institutions in the next couple of decades. Global institutions may regulate for safety, prevent weaponization, support technology transfer, and so on. But there will be no support for global governance that attempts to deny developing countries the right to emerging technologies.

Regulation and verification may indeed be a kind of Gordian knot. The more specific the technologies you want to stop, the harder it becomes to do that and still advance generally. Berglas recognizes that problem in his paper and so proposes restricting computing as a whole. Even this, however, may be too specific to cut the knot. The feasibility of stopping economic growth entirely so we never reach the point where regulation is necessary seems to me an unexplored question. If we look at global GDP growth over the past 50 years it's been uniformly positive except for the most recent recession. It's also been quite variable over short periods. Clearly stopping it for longer than a few years would require some new phenomenon driving a qualitative break with the past. That does not mean however that a stop is impossible.

There does exist a small minority camp within the economics profession advocating no-growth policies for environmental or similar reasons. I wonder if anyone has created a roadmap for bringing such policies about on a global level.

Replies from: gwern
comment by gwern · 2012-07-14T23:42:55.089Z · LW(p) · GW(p)

Out of curiosity, have you read my little essay "Slowing Moore's Law"? It seems relevant.

comment by lukeprog · 2012-07-10T00:13:28.624Z · LW(p) · GW(p)

Certainly the fact that some really awful charities are untruthful doesn't mean SI shouldn't be held accountable merely because it managed to tell the truth.

I didn't mean that SI shouldn't be held accountable for the theft. I was merely lamenting my expectation that it will probably be punished for reporting it.

Replies from: Vaniver
comment by Vaniver · 2012-07-10T01:41:32.845Z · LW(p) · GW(p)

I was merely lamenting my expectation that it will probably be punished for reporting it.

Conservation of expected evidence often has unpleasant implications.

comment by lukeprog · 2012-07-09T23:43:41.336Z · LW(p) · GW(p)

A clarification. In Thoughts on the Singularity Institute, Holden wrote:

I will commit to reading and carefully considering up to 50,000 words of content that are (a) specifically marked as SI-authorized responses to the points I have raised; (b) explicitly cleared for release to the general public as SI-authorized communications. In order to consider a response "SI-authorized and cleared for release," I will accept explicit communication from SI's Executive Director or from a majority of its Board of Directors endorsing the content in question.

As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.

According to Word Count Tool, these three things add up to a mere 13,940 words.

Replies from: wedrifid
comment by wedrifid · 2012-07-10T01:11:36.990Z · LW(p) · GW(p)

As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.

Consider removing the first sentence of the final link:

This comment is not intended to be part of the 50,000-word response which Holden invited.

Replies from: lukeprog
comment by lukeprog · 2012-07-10T01:26:23.176Z · LW(p) · GW(p)

Good point. :)

Fixed.

comment by lukeprog · 2012-07-11T04:34:29.074Z · LW(p) · GW(p)

I expected more disagreement than this. Was my post really that persuasive?

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2012-07-11T08:21:49.771Z · LW(p) · GW(p)

I linked this to an IRC channel full of people skeptical of SI. One person commented that

the reply doesn't seem to be saying much

and another that

I think most arguments are 'yes we are bad but we will improve'
and some opinion-based statement about how FAI is the most important thing in the world.

Which was somewhat my reaction as well - I can't put a finger on it and say exactly what it is that's wrong, but somehow it feels like this post isn't "meaty" enough to elicit much of a reaction, positive or negative. Which on the other hand feels odd, since e.g. the "SI's mission assumes a scenario that is far less conjunctive than it initially appears" heading makes an important point that SI hasn't really communicated well in the past. Maybe it just got buried under the other stuff, or something.

Replies from: ChrisHallquist, lukeprog
comment by ChrisHallquist · 2012-07-11T09:25:35.005Z · LW(p) · GW(p)

I found the "less conjunctive" section very persuasive, suspect Kaj may be right about it getting burried.

comment by lukeprog · 2012-07-11T08:31:32.321Z · LW(p) · GW(p)

That's an unfortunate response, given that I offered a detailed DH6-level disagreement (quote the original article directly, and refute the central points), and also offered important novel argumentation not previously published by SI. I'm not sure what else people could have wanted.

If somebody figures out why Kaj and some others had the reaction they did, I'm all ears.

Replies from: TheOtherDave, Grognor, Rain, Jack
comment by TheOtherDave · 2012-07-11T13:39:44.276Z · LW(p) · GW(p)

I can't speak for anyone else, and had been intending to sit this one out, since my reactions to this post were not really the kind of reaction you'd asked for.

But, OK, my $0.02.

The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence. This puts a huge burden on you, as the person attempting to provide that evidence.

So, I'll ask you: do you think your response provides such evidence?

If you do, then your problem seems to be (as others have suggested) one of document organization. Perhaps starting out with an elevator-pitch answer to the question "Why should I believe that SI is capable of this extraordinary feat?" might be a good idea.

Because my take-away from reading this post was "Well, nobody else is better suited to do it, and SI does some cool movement-building stuff (the Sequences, the Rationality Camps, and HPMoR) that attracts smart people and encourages them to embrace a more rational approach to their lives, and SI is fixing some of its organizational and communication problems but we need more money to really make progress on our core mission."

Which, if I try to turn it into an answer to the initial question, gives me "Well, we're better-suited than anyone else because, unlike them, we're focused on the right problem... even though you can't really tell, because what we are really focused on is movement-building, but once we get a few million dollars and the support of superhero mathematicians, we will totally focus on the right problem, unlike anyone else."

If that is in fact your answer, then one thing that might help is to make a more credible visible precommitment to that eventuality.

For example: if you had that "few million dollars a year" revenue stream, and if you had the superhero mathematician, what exactly would you do with them for, say, the first six months? Lay out that project plan in detail, establish what your criteria would be to make sure you were still focused on the right problem three months in, and set up an escrow fund (a la Kickstarter, where the funds are returned if the target is not met) to support that project plan so people who are skeptical of SI's organizational ability to actually do any of that stuff have a way of supporting the plan IFF they're wrong about SI, without waiting for their wrongness to be demonstrated before providing the support.

If your answer is in fact something else, then stating it more clearly might help.

Replies from: Eliezer_Yudkowsky, lukeprog, Mass_Driver
comment by lukeprog · 2012-07-11T18:11:48.489Z · LW(p) · GW(p)

The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one... "Why should I believe that SI is capable of this extraordinary feat?"

SI is not exceptionally well-suited for x-risk mitigation relative to some ideal organization, but relative to the alternatives (as you said). But the reason I gave for this was not "unlike them, we're focused on the right problem", though I think that's true. Instead, the reasons I gave (twice!) were:

SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

As for getting back to the original problem rather than just doing movement-building, well... that's what I've been fighting for since I first showed up at SI, via Open Problems in Friendly AI. And now it's finally happening, after SPARC.

if you had that "few million dollars a year" revenue stream, and if you had the superhero mathematician, what exactly would you do with them for, say, the first six months? Lay out that project plan in detail, establish what your criteria would be to make sure you were still focused on the right problem three months in, and set up an escrow fund (a la Kickstarter, where the funds are returned if the target is not met) to support that project plan so people who are skeptical of SI's organizational ability to actually do any of that stuff have a way of supporting the plan IFF they're wrong about SI, without waiting for their wrongness to be demonstrated before providing the support.

Yes, this is a promising idea. It's also probably 40-100 hours of work, and there are many other urgent things for us to do as well. That's not meant as a dismissal, just as a report from the ground of "Okay, yes, everyone's got a bunch of great ideas, but where are the resources I'm supposed to use to do all those cool things? I've been working my ass off but I can't do even more stuff that people want without more resources."

Replies from: TheOtherDave
comment by TheOtherDave · 2012-07-11T19:20:32.041Z · LW(p) · GW(p)

It's also probably 40-100 hours of work, and there are many other urgent things for us to do as well.

Absolutely. As I said in the first place, I hadn't initially intended to reply to this, as I didn't think my reactions were likely to be helpful given the situation you're in. But your followup comment seemed more broadly interested in what people might have found compelling, and less in specific actionable suggestions, than your original post. So I decided to share my thoughts on the former question.

I totally agree that you might not have the wherewithal to do the things that people might find compelling, and I understand how frustrating that is.

It might help emotionally to explicitly not-expect that convincing people to donate large sums of money to your organization is necessarily something that you, or anyone, are able to do with a human amount of effort. Not that this makes the problem any easier, but it might help you cope better with the frustration of being expected to put forth an amount of effort that feels unreasonably superhuman.

Or it might not.

Instead, the reasons I gave (twice!) were: [..]

I'll observe that the bulk of the text you quote here is not reasons to believe SI is capable of it, but reasons to believe the task is difficult. What's potentially relevant to the former question is:

SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons [and] has lots of experience with these issues;

If that is your primary answer to "Why should I believe SI is capable of mitigating x-risk given $?", then you might want to show why the primary obstacles to mitigating x-risk are psychological/organizational issues rather than philosophical/technical ones, such that SI's competence at addressing the former set is particularly relevant. (And again, I'm not asserting that showing this is something you are able to do, or ought to be able to do. It might not be. Heck, the assertion might even be false, in which case you actively ought not be able to show it.)

You might also want to make more explicit the path from "we have experience addressing these psychological/organizational issues" to "we are good at addressing these psychological/organizational issues (compared to relevant others)". Better still might be to focus your attention on demonstrating the latter and ignore the former altogether.

Replies from: lukeprog
comment by lukeprog · 2012-07-11T20:04:46.631Z · LW(p) · GW(p)

Thank you for understanding. :)

My statement "SI has successfully concentrated lots of attention, donor support, and human capital [and also] has learned many lessons [and] has lots of experience with [these unusual, complicated] issues" was in support of "better to help SI grow and improve rather than start a new, similar AI risk reduction organization", not in support of "SI is capable of mitigating x-risk given money."

However, if I didn't also think SI was capable of reducing x-risk given money, then I would leave SI and go do something else, and indeed will do so in the future if I come to believe that SI is no longer capable of reducing x-risk given money. How to Purchase AI Risk Reduction is a list of things that (1) SI is currently doing to reduce AI risk, or that (2) SI could do almost immediately (to reduce AI risk) if it had sufficient funding.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-07-11T20:22:19.094Z · LW(p) · GW(p)

My statement [..] was in support of "better to help SI grow and improve rather than start a new, similar AI risk reduction organization", not in support of "SI is capable of mitigating x-risk given money."

Ah, OK. I misunderstood that; thanks for the clarification.
For what it's worth, I think the case for "support SI >> start a new organization on a similar model" is pretty compelling.

And, yes, the "How to Purchase AI Risk Reduction" series is an excellent step in the direction of making SI's current and planned activities, and how they relate to your mission, more concrete and transparent. Yay you!

comment by Mass_Driver · 2012-07-11T21:32:33.906Z · LW(p) · GW(p)

I strongly agree with this comment, and also have a response to Eliezer's response to it. While I share TheOtherDave's views, as TheOtherDave noted, he doesn't necessarily share mine!

It's not the large consequences that make it a priori unlikely that an organization is really good at mitigating existential risks -- it's the objectively small probabilities and lack of opportunity to learn by trial and error.

If your goal is to prevent heart attacks in chronically obese, elderly people, then you're dealing with reasonably large probabilities. For example, the AHA estimates that a 60-year-old, 5'8" man weighing 220 pounds has a 10% chance of having a heart attack in the next 10 years. You can fiddle with their calculator here. This is convenient, because you can learn by trial and error whether your strategies are succeeding. If only 5% of a group of the elderly obese under your treatment have heart attacks over the next 10 years, then you're probably doing a good job. If 12% have heart attacks, you should probably try another tactic. These are realistic swings to expect from an effective treatment -- it might really be possible to cut the rate of heart attacks in half among a particular population. This study, for example, reports a 25% relative risk reduction. If an organization claims to be doing really well at preventing heart attacks, it's a credible signal -- if they weren't doing well, someone could check their results and prove it, which would be embarrassing for the organization. So, that kind of claim only needs a little bit of evidence to support it.

On the other hand, any given existential risk has a small chance of happening, a smaller chance of being mitigated, and, by definition, little or no opportunity to learn by trial and error. For example, the odds of an artificial intelligence explosion in the next 10 years might be 1%. A team of genius mathematicians funded with $5 million over the next 10 years might be able to reduce that risk to 0.8%. However, this would be an extraordinarily difficult thing to estimate. These numbers come from back-of-the-envelope Fermi calculations, not from hard data. They can't come from hard data -- by definition, existential risks haven't happened yet. Suppose 10 years go by, and the Singularity Institute gets plenty of funding, and they declare that they successfully reduced the risk of unfriendly AI down to 0.5%, and that they are on track to do the same for the next decade. How would anyone even go about checking this claim?
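To make the contrast concrete, here is a minimal sketch using the numbers in the two paragraphs above (a 10% baseline heart-attack rate, a hypothetical drop to 5%, and a Fermi estimate of 1% AI risk reduced to 0.8% for $5 million). The per-arm sample size and the standard-error check are my own illustrative assumptions, not anything from the comment:

```python
import math

# Heart-attack case: the effect is large enough to measure directly.
baseline, treated, n = 0.10, 0.05, 1000     # 10-year rates; n patients per arm (assumed)
se = math.sqrt(baseline * (1 - baseline) / n + treated * (1 - treated) / n)
print(f"observed drop {baseline - treated:.2f} vs. standard error {se:.3f}")
# The 5-point drop is roughly four standard errors wide, so outsiders can
# check whether the charity's claim holds up.

# Existential-risk case: the claimed effect cannot be observed even once.
p_before, p_after, budget = 0.01, 0.008, 5e6   # Fermi numbers from the comment
reduction = p_before - p_after
print(f"claimed reduction {reduction:.3f}, i.e. {budget / reduction:,.0f} dollars "
      f"per unit of probability")
# There is no sample of worlds to compare, so the 0.2-point claim can only be
# argued about, not measured -- which is the asymmetry described above.
```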

An unfriendly intelligence explosion, by its very nature, will use tactics and weaknesses that we are not presently aware of. If we learn about some of these weaknesses and correct them, then uFAI would use other weaknesses. The Singularity Institute wants to promote the development of a provably friendly AI; the thought is that if the AI's source code can be shown mathematically to be friendly, then, as long as the proof is correct and the code is faithfully entered by the programmers and engineers, we can achieve absolute protection against uFAI, because the FAI will be smart enough to figure that out for us. But while it's very plausible to think that we will face significant AI risk in the next 30 years (i.e., the risk arises under a disjunctive list of conditions), it's not likely that we will face AI risk, and that AI will turn out to have the capacity to exponentially self-improve, and that there is a theoretical piece of source code that would be friendly, and that at least one such code can provably be shown to be friendly, and that a team of genius mathematicians will actually find that proof, and that these mathematicians will prevail upon a group of engineers to build the FAI before anyone else builds a competing model. This is a conjunctive scenario.

It's not at all clear to me how just generally having a team of researchers who are moderately familiar with the properties of the mathematical objects that determine the friendliness of AI could do anything to reduce existential risk if this conjunctive scenario doesn't come to pass. In other words, if we get self-replicating autonomous moderately intelligent AIs, or if it turns out that there's no such thing as a mathematical proof of friendliness, or if AI first comes about by way of whole brain emulation, then I don't understand how the Singularity Institute proposes to make itself useful. It's not a crazy thought that having a ready-made team of seasoned amateurs ready to tackle the problems of AI would yield better results than having to improvise a response team from scratch...but there are other charitable proposals (including proposals to reduce other kinds of x-risk) that I find considerably more compelling. If you want me to donate to the Singularity Institute, you'll have to come up with a better plan than "This incredibly specific scenario might come to pass and we have a small chance of being able to mitigate the consequences if it does, and even if the scenario doesn't come to pass, it would still probably be good to have people like us on hand to cope with unspecified similar problems in unspecified ways."

By way of analogy, a group of forward-thinking humanitarians in 1910 could have plausibly argued that somebody ought to start getting ready to think about ways to help protect the world against the unknown risks of new discoveries in theoretical physics...but they probably would have been better off thinking up interesting ways of stopping World War I or a re-occurrence of the dreaded 1893 Russian Flu. The odds that even a genius team of humanitarian physicists, working just from baseline knowledge about Bohr's model of the atom and Marie Curie's discovery of radioactivity, would have anticipated the specific course that cutting-edge physics would take -- involving radioactivity, chain reactions, uranium enrichment, and implosion bombs -- are already incredibly low. The further odds that they would take useful steps, in the 1910s, to devise and execute an effective plan to stop the development of nuclear weapons or even to ensure that they were not used irresponsibly, seem astronomically low. The team might manage, in a general way, to help improve the security controls on known radioactive materials -- but, as actually happened, new materials were found to be radioactive, and new ways were found of artificially enhancing the radioactivity of a substance, and in any event most governments had secret stockpiles of fissile material that would not have been reached by ordinary security controls.

Today, we know a little something about computer science, and it's understandable to want to develop expertise in how to keep computers safe -- but we can't anticipate the specific course of discoveries in cutting-edge computer science, and even if we could, it's unlikely that we'll be able to take action now to help us cope with them, and if our guesses about the future prove to be close but not exactly accurate, then it's even more unlikely that the plans we make now based on our guesses will wind up being useful.

That's why I prefer to donate to charities that are either (a) attempting to alleviate suffering that is currently and verifiably happening, e.g., Deworm the World, or (b) obviously useful for preventing existential risks in a disjunctive way, e.g., the Millennium Seed Bank. I have nothing against SI -- I wish you well and hope you grow and succeed. I think you're doing better than the vast majority of charities out there. I just also think there are even better uses for my money.

EDIT: Clarified that my views may be different from TheOtherDave's, even though I agree with his views.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-07-11T21:38:24.494Z · LW(p) · GW(p)

I should say, incidentally (since this was framed as agreement to my comment) that Mass_Driver's point is rather different from mine.

comment by Grognor · 2012-07-11T14:48:19.247Z · LW(p) · GW(p)

One sad answer is that your post is boring, which is another way of saying it doesn't have enough Dark Arts to be sufficiently persuasive.

There are many ways to infect a population with a belief; presenting evidence for its accuracy is among the least effective

-Sister Y

comment by Rain · 2012-07-11T12:52:32.234Z · LW(p) · GW(p)

It didn't have the same cohesiveness as Holden's original post; there were many more dangling threads, to borrow the same metaphor I used to say why his post was so interesting. You wrote it as a technical, thoroughly cited response and literature review instead of a heartfelt, wholly self-contained Mission Statement, and you made that very clear by stating at least 10 times that there was much more info 'somewhere else' (in conversations, in people's heads, yet to be written, etc.).

He wrote an intriguing short story, you wrote a dry paper.

Edit: Also, the answer to every question seems to be, "That will be in Eliezer's next Sequence," which postpones further debate.

comment by Jack · 2012-07-11T15:04:33.195Z · LW(p) · GW(p)

I doubt random skeptics on the internet followed links to papers. Their thoughts are unlikely to be diagnostic. The group of people who disagree with you and will earnestly go through all the arguments is small. Also, explanations of the form "Yes, this was a problem, but we're going to fix it" are usually just read as rationalizations. It sounds a bit like "Please, sir, give me another chance. I know I can do better" or "I'm sorry I cheated on you. It will never happen again". The problems actually have to be fixed before the argument is rebutted. It will go better when you can say things like "We haven't had any problems of this kind in 5 years".

Replies from: private_messaging
comment by private_messaging · 2012-07-11T15:44:11.004Z · LW(p) · GW(p)

The group of people who disagree with you and will earnestly go through all the arguments is small.

It is also really small for, e.g., a perpetual motion device constructed using gears, weights, and levers - very few people would even look at the blueprint. It is a bad strategy to dismiss a critique on the grounds that the critic did not read the whole thing. Meta-level considerations work sometimes.

Sensible priors for p(our survival is at risk | the rather technically unaccomplished are the most aware of the risk) and p(the rather technically unaccomplished are the most aware of the risk | our survival is at risk) are very, very low. Meanwhile, p(the rather technically unaccomplished are the most aware of the risk | our survival is not actually at risk) is rather high (it's commonly the case that someone is scared of something). p(high technical ability) is low to start with, p(highest technical ability) is very, very low, and p(high technical ability | no technical achievement) is much lower still, especially given reasonable awareness that technical achievement is instrumental to being taken seriously. p(ability to deceive oneself) is not very low, p(ability to deceive oneself and others) is not very low, there is a well-known tendency to overspend on safety (see the TSA), the notion of the living machine killing its creator is very, very old, and there are plenty of movies to that point. In the absence of some sort of achievement that is highly unlikely to be an evaluation error, the probability that you guys matter is very low. That's part of what Holden was talking about. His strongest point - that you are not performing to the standard - means that even if he buys into AI danger or the importance of FAI, he would not recommend donating to you.
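A toy Bayes calculation may make the shape of this prior argument easier to see; every number below is invented purely for illustration and is not anyone's actual estimate:

```python
# H = "this group's work is pivotal for our survival"
# E = "the group is loudly worried about the risk but has no notable technical achievements"
# All numbers are invented solely to illustrate how the update works.
p_h = 1e-4                # prior that a given small group is pivotal for survival
p_e_given_h = 0.2         # a pivotal group might still lack visible achievements
p_e_given_not_h = 0.05    # plenty of non-pivotal groups are scared of something

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
posterior = p_e_given_h * p_h / p_e
print(f"P(H|E) = {posterior:.1e}")   # ~4e-04: a likelihood ratio of 4 barely moves a tiny prior
```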

comment by AlexMennen · 2012-07-10T00:56:38.643Z · LW(p) · GW(p)

The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.

This is unsettling. It sounds a lot like trying to avoid saying anything specific.

Replies from: lukeprog
comment by lukeprog · 2012-07-10T01:03:47.764Z · LW(p) · GW(p)

Eliezer will have lots of specific things to say in his forthcoming "Open Problems in Friendly AI" sequence (I know; I've seen the outline). In any case, wouldn't it be a lot more unsettling if, at this early stage, we pretended we knew enough to commit entirely to one very particular approach?

Replies from: AlexMennen
comment by AlexMennen · 2012-07-10T02:16:52.206Z · LW(p) · GW(p)

It's unsettling that this is still an early stage. SI has been around for over a decade. I'm looking forward to the open problems sequence; perhaps I should shut up about the lack of explanation of SI's research for now, considering that the sequence seems like a credible promise to remedy this.

comment by Thrasymachus · 2012-07-13T14:45:48.122Z · LW(p) · GW(p)

When making the case for SI's comparative advantage, you point to these things:

... [A]nd the ability to do unusual things that are nevertheless quite effective at finding/creating lots of new people interested in rationality and existential risk reduction: (1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world...

What evidence supports these claims?

Replies from: Thrasymachus, Thrasymachus
comment by Thrasymachus · 2012-07-22T21:43:36.134Z · LW(p) · GW(p)

each question (posted as a comment on this page) that follows the template described below will receive a reply from myself or another SI representative.

I appreciate you folks are busy, but I'm going to bump as it has been more than a week. Besides, it strikes me as an important question given the prominence of these things to the claim that SI can buy x-risk reduction more effectively than other orgs.

Replies from: endoself
comment by endoself · 2012-07-24T05:16:13.606Z · LW(p) · GW(p)

You can PM Luke if you want. It's the "Send message" button next to the username on the user page.

comment by Thrasymachus · 2012-08-02T16:38:25.627Z · LW(p) · GW(p)

I'm bumping this again because there's been no response to this question (three weeks since asking), and I poked Luke via PM a week ago. Given this is the main plank supporting SI's claim that it is a good way of spending money, I think this question should be answered.

(especially compare to Holden's post)

comment by ScottMessick · 2012-07-11T18:25:08.379Z · LW(p) · GW(p)

I'm really glad you pointed out that SI's strategy is not predicated on hard take-off. I don't recall if this has been discussed elsewhere, but that's something that has always bothered me, since I think hard take-off is relatively unlikely. (Admittedly, soft take-off still considerably diminishes my expected impact of SI and of donating to it.)

Replies from: Bruno_Coelho
comment by Bruno_Coelho · 2012-07-14T05:33:21.301Z · LW(p) · GW(p)

For some time I thought EY supported hard takeoff -- the "bunch of guys in a garage" argument -- but if Luke now says it's not so, then OK.

comment by MatthewBaker · 2012-07-11T00:14:33.826Z · LW(p) · GW(p)

If I earmark my donations for "HPMOR Finale or CPA Audit, whichever comes first", would that act as positive or negative pressure towards Eliezer's fiction creation complex? (I only ask because bugging him for an update has previously been suggested to reduce update speed.)

Furthermore, Oracle AI and Nanny AI both seem to fail the heuristic of "the other country is about to beat us in a war -- should we remove the safety programming?" that I use quite often with nearly everyone outside the LW community with whom I debate AI. Thank you both for writing such concise yet detailed responses that helped me understand the problem areas of Tool AI better.

Replies from: lukeprog, David_Gerard
comment by lukeprog · 2012-07-11T00:36:49.400Z · LW(p) · GW(p)

If I earmark my donations for "HPMOR Finale or CPA Audit whichever comes first" would that act as positive or negative pressure towards Eliezer's fiction creation complex?

I think the issue is that we need a successful SPARC and an "Open Problems in Friendly AI" sequence more urgently than we need an HPMOR finale.

Replies from: shokwave, MatthewBaker
comment by shokwave · 2012-07-11T01:07:13.743Z · LW(p) · GW(p)

"Open Problems in Friendly AI" sequence

an HPMOR finale

A sudden, confusing vision just occurred, of the two being somehow combined. Aaagh.

Replies from: shminux
comment by Shmi (shminux) · 2012-07-11T04:59:21.684Z · LW(p) · GW(p)

Spoiler: Voldemort is a uFAI.

Replies from: arundelo
comment by arundelo · 2012-07-11T05:41:43.273Z · LW(p) · GW(p)

For the record:

Nothing in this story so far represents either FAI or UFAI. Consider it Word of God.

(And later in the thread, when asked about "so far": "And I have no intention at this time to do it later, but don't want to make it a blanket prohibition.")

Replies from: NancyLebovitz
comment by NancyLebovitz · 2012-07-15T02:03:33.878Z · LW(p) · GW(p)

In the earlier chapters, it seemed to me that the Hogwarts faculty dealing with Harry was something like being faced with an AI of uncertain Friendliness.

Correction: It was more like the faculty dealing with an AI that's trying to get itself out of its box.

comment by MatthewBaker · 2012-07-11T22:21:43.906Z · LW(p) · GW(p)

I think our values are positively maximized by delaying the HPMOR finale as long as possible; my post was more out of curiosity to see what would be most helpful to Eliezer.

comment by David_Gerard · 2012-07-13T08:44:06.734Z · LW(p) · GW(p)

In general - never earmark donations. It's a stupendous pain in the arse to deal with. If you trust an organisation enough to donate to them, trust them enough to use the money for whatever they see a need for. Contrapositive: If you don't trust them enough to use the money for whatever they see a need for, don't donate to them.

Replies from: MatthewBaker
comment by MatthewBaker · 2012-07-13T18:01:02.408Z · LW(p) · GW(p)

I never have before but this CPA Audit seemed like a logical thing that would encourage my wealthy parents to donate :)

comment by pcm · 2012-07-20T23:49:46.641Z · LW(p) · GW(p)

The discussion of how conjunctive SIAI's vision is seems unclear to me. Luke appears to have responded to only part of what I think Holden is likely to have meant.

Some assumptions whose conjunctions seem important to me (in order of decreasing importance):

1) The extent to which AGI will consist of one entity taking over the world versus many diverse entities with limited ability to dominate the others.

2) The size of the team required to build the first AGI (if it requires thousands of people, a nonprofit is unlikely to acquire the necessary resources; if it can be done by one person, I wouldn't expect that person to work with SIAI [1]).

3) The degree to which concepts such as "friendly" or "humane" can be made clear enough to be implemented in software.

4) The feasibility of an AGI whose goals can be explicitly programmed, before AGIs with messier goals become dominant. We have an example of intelligence with messy goals, which gives us some clues about how hard it is to create one. We have no comparable way of getting an outside view of the time and effort required for an intelligence with clean goals.

It seems reasonable to infer from this that SIAI has a greater than 90% chance of becoming irrelevant. But for an existential risk organization, a 90% chance of being irrelevant should seem like a very weak argument against it.

I believe that the creation of CFAR is a serious attempt to bypass problems associated with assumption 2, and my initial impression of CFAR is that it (but not SIAI) has a good claim to being the most valuable charity.

[1] I believe an analogy to Xanadu is useful, especially in the unlikely event that an AGI can be built by a single person. The creation of the world wide web was somewhat predictable and predicted, and for a long time Xanadu stood out as the organization which had given the most thought to how the web should be implemented. I see many similarities between the people at Xanadu and the people at SIAI in terms of vision and intelligence (although people at SIAI seem more willing to alter their beliefs). Yet if Tim Berners-Lee had joined Xanadu, he wouldn't have created the web. Two of the reasons are that the proto-transhumanist culture with which Xanadu was associated was reluctant to question the beliefs that the creators of the web needed to charge money for their product, and that the web should ensure that authors were paid for their work. I failed to question those beliefs in 1990. I haven't seen much evidence that either I or SIAI are much better today at doing the equivalent of identifying those as assumptions that were important to question.

comment by ChrisHallquist · 2012-07-13T08:15:24.342Z · LW(p) · GW(p)

After being initially impressed by this, I found one thing to pick at:

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa.

"Could" here tells you very little. The question isn't whether "build FAI" could work as a strategy for mitigating all other existential risks, it's whether that strategy has a good enough chance of working to be superior to other strategies for mitigating the other risks. What's missing is an argument for saying "yes" to that second question.

comment by AlexMennen · 2012-07-10T01:31:31.639Z · LW(p) · GW(p)

our new donate page

This is off-topic, but I'm curious: What were you and Louie working on in that photo on the donate page?

Replies from: lukeprog, beoShaffer
comment by lukeprog · 2012-07-10T01:33:43.195Z · LW(p) · GW(p)

Why, we were busy working on a photo for the donate page! :)

Hopefully that photo is a more helpful illustration of the problems we work on than a photo of our normal work, which looks like a bunch of humans hunched over laptops, reading and typing.

Replies from: komponisto, Spurlock, ciphergoth
comment by komponisto · 2012-07-10T01:39:17.214Z · LW(p) · GW(p)

Support Singularity Institute and Make Your Mark on the Future

Definite articles missing in a number of places on that page (and others at the site).

Replies from: lukeprog
comment by lukeprog · 2012-07-10T01:45:11.447Z · LW(p) · GW(p)

Fixed.

comment by Spurlock · 2012-07-13T06:14:40.239Z · LW(p) · GW(p)

Just for the sake of feedback, that photo immediately made me laugh. It just seemed so obviously staged. I agree that it's better than "hunched over laptops" though.

comment by Paul Crowley (ciphergoth) · 2012-07-11T20:55:24.786Z · LW(p) · GW(p)

I have posed for a similar photo myself. Happily a colleague had had genuine cause to draw a large, confusing looking diagram not long beforehand, so we could all stand around it pointing at bits and looking thoughtful...

Replies from: Benquo
comment by Benquo · 2012-07-23T02:47:46.411Z · LW(p) · GW(p)

Same here.

comment by beoShaffer · 2012-07-13T07:07:41.695Z · LW(p) · GW(p)

It could just be me, but it somehow seems wrong that Peter Thiel is paired with the Google option rather than the PayPal one.

comment by homunq · 2012-07-16T01:09:01.577Z · LW(p) · GW(p)

You mention "computing overhang" as a threat essentially akin to hard takeoff. But regarding the value of FAI knowledge, it does not seem similar to me at all. A hard-takeoff AI can, at least in principal, be free from darwinian pressure. A "computing overhang" explosion of many small AIs will tend to be diverse and thus subject to strong evolutionary pressures of all kinds[1]. Presuming that FAI-ness is more-or-less delicate[1.5], those pressures are likely to destroy it as AIs multiply across available computing power (or, if we're extremely "lucky"[2], to cause FAI-ness of some kind to arise as an evolutionary adaptation). Thus, the "computing overhang" argument would seem to reduce, rather than increase, the probable value [3] of the FAI knowledge / expertise developed by SI. Can you comment on this?

[1] For instance, all else equal, an AI that was easier/faster to train, or able to install/care for its own "children", or more attractive to humans to "download", would have an advantage over one that wasn't; and though certain speculative arguments can be made, it is impossible to predict the combined evolutionary consequences of these various factors.

[1.5] The presumption that FAI-ness is delicate seems to be uncontroversial in the SI paradigm.

[2] I put "lucky" in quotes, because whether or not evolution pushes AIs towards or away from friendliness is probably a fact of mathematics (modulo a sufficiently-clear definition of friendliness[4]). Thus, this is somewhat like saying, "If I'm lucky, 4319 (a number I just arbitrarily chose, not divisible by 2, 3, or 5) is a prime number." This may or may not accord with your definition of probability theory and "luck".

[3] Instrumental value, that is; in terms of averting existential risk. Computing overhang would do nothing to reduce the epistemic value – the scientific, moral, or aesthetic interest of knowing how doomed we are (and/or how we are doomed), which is probably quite significant – of the marginal knowledge/expertise developed by SI.

[4] By the way, for sufficiently-broad definitions of friendliness, it is very plausibly true that evolution produces them naturally. If "friendly" just means "not likely to result in a boring universe", then evolution seems to fit the bill, from experience. But there are many tighter meanings of "friendly" for which it's hard to imagine how evolution could hit the target. So YMMV a good amount in this regard. But it doesn't change the argument that computing overhang generally argues against, not for, the instrumental value of SI knowledge/expertise.
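As an aside on footnote [2] above: whether 4319 is prime is indeed a settled mathematical fact rather than a matter of luck, and a few lines of plain Python (nothing assumed beyond the number itself) settle it; 4319 = 7 × 617, so it is composite:

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

print(smallest_factor(4319))   # 7, so 4319 = 7 * 617 and is composite -- no luck involved
```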

Replies from: nshepperd
comment by nshepperd · 2012-07-16T05:43:12.799Z · LW(p) · GW(p)

One way for the world to quickly go from one single AI to millions of AIs is for the first AGI to deliberately copy itself, or arrange for itself to be copied many times, in order to take advantage of the world's computing power.

In this scenario, assuming the AI takes the first halfway-intelligent security measure of checksumming all its copies to prevent corruption, the vast majority of the copies will have exactly the same code. Hence, to begin with, there's no real variation for natural selection to work on. Secondly, unless the AI was programmed to have some kind of "selfish" goal system, the resulting copies will all also have the same utility function, so they'll want to cooperate, not compete (which is, after all, the reason an AI would want to copy itself. No point doing it if your copies are going to be your enemies).
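For concreteness, the kind of integrity check being described could be as simple as comparing cryptographic hashes of each copy against the original; this is a generic sketch of checksumming, not a claim about how any actual AI would be built:

```python
import hashlib

def fingerprint(code_image: bytes) -> str:
    """SHA-256 digest of a code image; any single-bit change produces a different digest."""
    return hashlib.sha256(code_image).hexdigest()

original = b"hypothetical serialized AI code and utility function"
faithful_copy = bytes(original)            # an exact copy
corrupted_copy = original + b"\x00"        # any tampering or transmission error

assert fingerprint(faithful_copy) == fingerprint(original)    # accepted
assert fingerprint(corrupted_copy) != fingerprint(original)   # rejected before it ever runs
```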

Of course, a more intelligent first AGI would—rather than creating copies—modify itself to run on a distributed architecture allowing the one AI to take advantage of all the available computing power without all the inefficiency of message passing between independent copies.

In this situation there would still seem to be huge advantages to making the first AGI Friendly, since if it's at all competent, almost all its children ought to be Friendly too, and they can consequently use their combined computing power to weed out the defective copies. In some respects it's rather like an intelligence explosion, but using extra computing power rather than code modification to increase its speed and intelligence.

I suppose one possible alternative is if the AGI isn't smart enough to figure all this out by itself, and so the main method of copying is, to begin with, random humans downloading the FAI source code from, say, wikileaks. If humans are foolish, which they are, some of them will alter the code and run the modified programs, introducing the variation needed for evolution into the system.

Replies from: homunq
comment by homunq · 2012-07-16T13:37:50.345Z · LW(p) · GW(p)

The whole assumption that prompted this scenario is that there's no hard takeoff, so the first AGI is probably around human-level in insight and ingenuity, though plausibly much faster. It seems likely that in these circumstances, human actions would still be significant. If it starts aggressively taking over computing resources, humanity will react, and unless the original programmers were unable to prevent v1.0 from being Skynet-level unfriendly, at least some humans will escalate as far as necessary to get "their" computers under their control. At that point, it would be trivially easy to start up a mutated version, perhaps even one designed for better friendliness. But once mutations happen, evolution takes over.

Oh, and by the way, checksums may not work to safeguard friendliness for v1.0. For instance, most humans seem pretty friendly, but the wrong upbringing could turn them bad.

Tl;dr: no-mutations is an inherently more-conjunctive scenario than mutations.

comment by Johnicholas · 2012-07-12T13:35:58.927Z · LW(p) · GW(p)

Thanks for posting this!

I am also grateful to Holden for provoking this - as far as I can tell, the only substantial public speech from SIAI on LessWrong. SIAI often seems to be far more concerned with internal projects than communicating with its supporters, such as most of us on LessWrong.

Replies from: lukeprog
comment by lukeprog · 2012-07-12T15:23:15.477Z · LW(p) · GW(p)

as far as I can tell, the only substantial public speech from SIAI on LessWrong

Also see How to Purchase AI Risk Reduction, So You Want to Save the World, AI Risk & Opportunity: A Strategic Analysis...

Replies from: Johnicholas
comment by Johnicholas · 2012-07-12T16:25:28.628Z · LW(p) · GW(p)

Those are interesting reviews but I didn't know they were speeches in SIAI's voice.

comment by AlexMennen · 2012-07-10T00:09:44.846Z · LW(p) · GW(p)

What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem?

This is very worrying, especially in light of the lack of a public research agenda. SI's inability to describe its research agenda suggests the possibility that they cannot describe their research agenda because they do not know what they are doing because FAI is such a ridiculously hard problem that they have no idea where to begin. I'm hoping that SI will soon be able to make it clear that this is not the case.

What if no humans are altruistic enough to choose to build FAI over an AI that will make them king of the universe?

This is weak. Humans are pretty good at cooperation, and FAI will have to be a cooperative endeavor anyway. I suppose an organization could conspire to create AGI that will optimize for the organization's collective preferences rather than humanity's collective preferences, but this won't happen because:

  1. No one will throw a fit and defect from an FAI project because they won't be getting special treatment, but people will throw a fit if they perceive unfairness, so Friendly-to-humanity-AI will be a lot easier to get funding and community support for than friendly-to-exclusive-club-AI.
  2. Our near-mode reasoning cannot comprehend how much better a personalized AGI slave would be for us personally than FAI, so people will make that sort of decision in far mode, where idealistic values can outweigh greediness.

Finally, even if some exclusive club did somehow create an AGI that was friendly to them in particular, it wouldn't be that bad. Even if people don't care about each other very much, we do at least a little bit. Let's say that an AGI optimizing an exclusive club's CEV devotes .001% of its resources to things the rest of humanity would care about, and the rest to the things that just the club cares about. This is only worse than FAI by a factor of 10^5, which is negligible compared to the difference between FAI and UFAI.
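Spelling out the arithmetic behind that last point, under the comment's implicit (and strong) assumption that the value delivered to a group scales linearly with the share of resources spent on its concerns:

```python
# Assumes value to a group is proportional to the resources spent on its concerns.
value_under_fai = 1.0               # normalize: FAI spends essentially everything on humanity's CEV
club_share_for_humanity = 1e-5      # ".001% of its resources"
value_under_club_ai = club_share_for_humanity * value_under_fai
value_under_ufai = 0.0              # roughly nothing of what humanity cares about survives

print(f"{value_under_fai / value_under_club_ai:.0f}")   # 100000 -- the "factor of 10^5"
# Both FAI and club-AI sit far above the UFAI outcome of ~0, which is the sense
# in which that factor is called negligible next to the FAI-vs-UFAI gap.
```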

Replies from: lukeprog, ChrisHallquist
comment by lukeprog · 2012-07-10T00:16:42.394Z · LW(p) · GW(p)

This is very worrying, especially in light of the lack of a public research agenda. SI's inability to describe its research agenda suggests the possibility that they cannot describe their research agenda because they do not know what they are doing because FAI is such a ridiculously hard problem that they have no idea where to begin. I'm hoping that SI will soon be able to make it clear that this is not the case.

Yeah, this is the point of Eliezer's forthcoming 'Open Problems in Friendly AI' sequence, which I personally wish he had written in 2009 after his original set of sequences.

comment by ChrisHallquist · 2012-07-11T10:17:46.003Z · LW(p) · GW(p)

I find your points about altruism unpersuasive, because humans are very good at convincing themselves that whatever's best for them, individually, is right or at least permissible. Even if they don't explicitly program it to care about only their CEV, they might work out the part of the program that's supposed to handle friendliness in a way subtly biased towards themselves.

comment by [deleted] · 2012-07-12T04:00:10.041Z · LW(p) · GW(p)

Lately I've been wondering whether it would make more sense to simply try to prevent the development of AGI rather than work to make it "friendly," at least for the foreseeable future. My thought is that AGI carries substantial existential risks, that developing other innovations first might reduce those risks, and that anything we can do to bring about such reductions is worth even enormous costs. In other words, if it takes ten thousand years to develop social or other innovations that would reduce the risk of terminal catastrophe by even 1% when AGI is finally developed, then that is well worth the delay.

Bostrom has mentioned surveillance, information restriction, and global coordination as ways of reducing risk (and I will add space exploration to make SIRCS), so why not focus on those right now instead of AGI? The same logic goes for advanced nanotechnology and biotechnology. Why develop any of these risky bio- and nanotechnologies before SIRCS? Do we think that effort spent trying to inhibit the development of AGI/bio/nano would be wasted because they are inevitable or at least so difficult to derail that "friendly" AI is our best shot? Where then has a detailed argument been made for this? Can someone point me to it? Or maybe we think SIRCS (especially surveillance) cannot be adequately developed without AGI/bio/nano? But surely global coordination and information restriction do not depend much on technology, so even without the surveillance and with limited space exploration, it still makes sense to further the others as much as possible before finally proceeding with AGI/bio/nano.

Replies from: Vaniver, Kaj_Sotala, Strange7, TheOtherDave
comment by Vaniver · 2012-07-12T04:21:00.696Z · LW(p) · GW(p)

simply try to prevent the development of AGI

That sounds like a goal, rather than a sequence of actions.

Replies from: None
comment by [deleted] · 2012-07-12T06:32:12.447Z · LW(p) · GW(p)

Sorry, I don't understand your point.

Replies from: Vaniver
comment by Vaniver · 2012-07-12T16:15:53.140Z · LW(p) · GW(p)

Consider an alternative situation: "simply try to prevent your teenage daughter from having sex." Well, actually achieving that goal takes more than just trying, and effective plans (which don't cause massive collateral damage) are rarely simple.

Replies from: None, fubarobfusco
comment by [deleted] · 2012-07-13T01:50:34.756Z · LW(p) · GW(p)

But even averting massive collateral damage could be less important than mitigating existential risk.

I think my above comment applies here.

Replies from: Vaniver
comment by Vaniver · 2012-07-13T04:48:07.812Z · LW(p) · GW(p)

It could be less important! The challenge is navigating value disagreements. Some people are willing to wait a century to make sure the future happens correctly, and others discuss how roughly 2 people die every second, which might stop once we reach the future, and others would comment that, if we delay for a century, we will be condemning them to death since we will ruin their chance of reaching the deathless future. Even among those who only care about existential risk, there are tradeoffs between different varieties of existential risk- it may be that by slowing down technological growth, we decrease our AGI risk but increase our asteroid risk.

Replies from: None
comment by [deleted] · 2012-07-14T01:40:12.082Z · LW(p) · GW(p)

Value disagreements are no doubt important. It depends on the discount rate. However, Bostrom has said that the biggest existential risks right now stem from human technology, so I think asteroid risk is not such a huge factor for the next century. If we expand that to the next ten thousand years then one might have to do some calculations.

If we assume a zero discount rate then the primary consideration becomes whether or not we can expect to have any impact on existential risk from AGI by putting it off. If we can lower the AGI-related existential risk by even 1% then it makes sense to delay AGI for even huge timespans assuming other risks are not increased too much. It therefore becomes very important to answer the question of whether such delays would in fact reduce AGI-related risk. Obviously it depends on the reasons for the delay. If the reason for the delay is a nuclear war that nearly annihilates humanity but we are lucky enough to slowly crawl back from the brink, I don't see any obvious reason why AGI-related risk would be reduced at all. But if the reason for the delay includes some conscious effort to focus first on SIRCS then some risk reduction seems likely.
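A toy expected-value comparison shows the structure of the zero-discount-rate argument; all of the numbers below are illustrative assumptions rather than estimates from this thread:

```python
# Toy comparison: develop AGI now vs. delay it by a long time, at a zero discount rate.
future_value = 1.0           # normalized value of a surviving long-run future
p_doom_now = 0.10            # AGI-related existential risk without the delay (assumed)
risk_reduction = 0.01        # absolute reduction in that risk bought by the delay (assumed)
other_risk_added = 1e-4      # extra background risk accumulated while waiting (assumed small)
delay_years = 10_000
value_per_year = 1e-9        # value produced per year, tiny next to the whole future (assumed)

ev_now = (1 - p_doom_now) * future_value
ev_delay = (1 - p_doom_now + risk_reduction - other_risk_added) * future_value \
           - delay_years * value_per_year

print(ev_delay > ev_now)     # True: with no discounting, even 1% of survival probability
                             # outweighs ten thousand years of forgone near-term value
```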

comment by fubarobfusco · 2012-07-12T19:12:39.496Z · LW(p) · GW(p)

Would you mind switching to an example that doesn't assume so much about your audience?

Replies from: Vaniver
comment by Vaniver · 2012-07-12T20:16:04.410Z · LW(p) · GW(p)

If you can come up with a good one, I'll switch. I'm having trouble finding something where the risk of collateral damage is obvious (and obviously undesirable) and there are other agents with incentives to undermine the goal.

Replies from: fubarobfusco
comment by fubarobfusco · 2012-07-12T22:38:12.856Z · LW(p) · GW(p)

Sorry — your response indicates exactly in which way I should have been more clear.

Using "teenage daughter having sex" to stand for something "obviously undesirable" assumes a lot about your audience. For one, it assumes that your audience does not contain any sexually-active teenage women; nor any sex-positive parents of teenage women; nor any sex-positive sex-educators or queer activists; nor anyone who has had positive (and thus not "obviously undesirable") experiences as (or with) a sexually active teenage woman. To any of the above folks, "teenage daughter having sex" communicates something not undesirable at all (assuming the sex is wanted, of course).

Going by cultural tropes, your choice of example gives the impression that your audience is made of middle-aged, middle-class, straight, socially conservative men — or at least, people who take the views of that sort of person to be normal, everyday, and unmarked. On LW, a lot of your audience doesn't fit those assumptions: 25% of us are under 21; 17% of us are non-heterosexual; 38% of us grew up with non-theistic family values; and between 13% and 40% of us are non-monogamous (according to the 2011 survey, for instance).

To be clear, I'm not concerned that you're offending or hurting anyone with your example. Rather, if you're trying to make a point to a general audience, you might consider drawing on examples that don't assume so much.

As for alternatives: "Simply try to prevent your house from being robbed" perhaps? I suspect that a very small fraction of LWers are burglars or promoters of burglary.

Replies from: army1987, PECOS-9, Vaniver, wedrifid
comment by A1987dM (army1987) · 2012-07-13T09:01:26.783Z · LW(p) · GW(p)

I don't have the goal of preventing my teenage daughter from having sex (firstly because I have no daughter yet, and secondly because the kind of people who would have such a goal often have a similar goal about younger sisters, and I don't -- indeed, I sometimes introduce single males to her); but I had no problem with pretending I had that goal for the sake of argument. Hell, even if Vaniver had said "simply try to cause more paperclips to exist" I would have pretended I had that goal.

BTW, I don't think that is the real reason why people flinch at such examples. If Vaniver had said “try to win your next motorcycle race” -- a goal that probably even fewer people share -- would anyone have objected?

Replies from: GLaDOS
comment by GLaDOS · 2012-07-14T17:27:44.383Z · LW(p) · GW(p)

BTW, I don't think that is the real reason why people flinch at such examples. If Vaniver had said “try to win your next motorcycle race” -- a goal that probably even fewer people share -- would anyone have objected?

I agree. I find it annoying when people pretend otherwise.

comment by PECOS-9 · 2012-07-13T00:18:19.840Z · LW(p) · GW(p)

Small correction: The term "obviously undesirable" referred to the potential collateral damage from trying to prevent the daughter from having sex, not to her having sex.

Replies from: fubarobfusco
comment by fubarobfusco · 2012-07-13T08:13:59.315Z · LW(p) · GW(p)

Oh. Well, that does make a little more sense.

comment by Vaniver · 2012-07-13T04:42:08.592Z · LW(p) · GW(p)

Using "teenage daughter having sex" to stand for something "obviously undesirable" assumes a lot about your audience.

I understand your perspective, and that's a large part of why I like it as an example. Is AGI something that's "obviously undesirable"?

comment by wedrifid · 2012-07-13T05:57:58.399Z · LW(p) · GW(p)

As for alternatives: "Simply try to prevent your house from being robbed" perhaps? I suspect that a very small fraction of LWers are burglars or promoters of burglary.

Burglary is an integral part of my family heritage. That's how we earned our passage to Australia. Specifically, burgling some items, a copper kettle among them, getting a death sentence, and having it commuted to life on the prison continent.

With those kinds of circumstances in mind, I say burglary is ethically acceptable when, say, your family is starving, but usually far too risky to be practical or advisable.

comment by Kaj_Sotala · 2012-07-12T11:05:04.717Z · LW(p) · GW(p)

Do we think that effort spent trying to inhibit the development of AGI/bio/nano would be wasted because they are inevitable or at least so difficult to derail that "friendly" AI is our best shot? Where then has a detailed argument been made for this? Can someone point me to it?

Here's one such argument, which I find quite persuasive.

Also, look at how little success the environmentalists have had with trying to restrict carbon emissions, or how the US government eventually gave up its attempts to restrict cryptography:

Lastly, national measures that prohibit publication will not work in an international community, especially in the Internet age. If either Science or Nature had refused to publish the H5N1 papers, they would have been published somewhere else. Even if some countries stop funding—or ban—this sort of research, it will still happen in another country.

The U.S. cryptography community saw this in the 1970s and early 1980s. At that time, the National Security Agency (NSA) controlled cryptography research, which included denying funding for research, classifying results after the fact, and using export-control laws to limit what ended up in products. This was the pre-Internet world, and it worked for a while. In the 1980s they gave up on classifying research, because an international community arose (6). The limited ability for U.S. researchers to get funding for block-cipher cryptanalysis merely moved that research to Europe and Asia. The NSA continued to limit the spread of cryptography via export-control laws; the U.S.-centric nature of the computer industry meant that this was effective. In the 1990s they gave up on controlling software because the international online community became mainstream; this period was called “the Crypto Wars” (7). Export-control laws did prevent Microsoft from embedding cryptography into Windows for over a decade, but it did nothing to prevent products made in other countries from filling the market gaps.

Today, there are no restrictions on cryptography, and many U.S. government standards are the result of public international competitions.

Replies from: None
comment by [deleted] · 2012-07-12T20:27:36.064Z · LW(p) · GW(p)

Anyone know of anything more on deliberate relinquishment? I have seen some serious discussion by Bill McKibben in his book Enough but that's about it.

In the linked post on the government controlling AGI development, the arguments say that it's hard to narrowly tailor the development of specific technologies. Information technology was advancing rapidly and cryptography proved impossible to control. The government putting specific restrictions on "soft AI" amid otherwise advancing IT similarly seems far-fetched. But there are other routes. Instead we could enact policies that would deliberately slow growth in broad sectors like IT, biotechnology, and anything leading to self-replicating nanotechnology. Or maybe slow economic growth entirely and have the government direct resources at SIRCS. One can hardly argue that it is impossible to slow or even stop economic growth. We are in the middle of a worldwide economic slowdown as we type. The United States has seen little growth for at least the past ten years. I think broad relinquishment certainly cannot be dismissed without extensive discussion and to me it seems the natural way to deal with existential risk.

Replies from: Kaj_Sotala, None
comment by Kaj_Sotala · 2012-07-13T10:26:11.299Z · LW(p) · GW(p)

One can hardly argue that it is impossible to slow or even stop economic growth. We are in the middle of a worldwide economic slowdown as we type. The United States has seen little growth for at least the past ten years.

Yes, but most governments are doing their best to undo that slowdown: you'd need immense political power in order to make them encourage it.

Replies from: None
comment by [deleted] · 2012-07-14T00:20:43.467Z · LW(p) · GW(p)

Given some of today's policy debates, you might need less power than one might think. I think many governments, Europe being a clear case, are not doing their best to undo the slowdown. Rather, they publicly proclaim to be doing their best while actually setting policies that are very far from optimal. In a democracy you must always wear at least a cloak of serving the perceived public interest, but that does not necessarily mean that you truly work in that perceived interest.

So when your Global Stasis Party wins 1% of the vote, you do not have 1% of people trying to bring about stasis and 99% trying to increase growth. Instead, 50% of politicians may already publicly proclaim to want increased growth while actually pursuing growth-reducing policies, and your 1% breaks the logjam and creates a 51% majority against growth. This assumes that you understand which parties are actually for and against growth -- that is, that you are wise enough to see through people's facades.

I wonder how today's policymakers would react to challengers seriously favoring no-growth economics. Would this have the effect of shifting the Overton Window? This position is so radically different from anything I've heard of that perhaps a small dose would have outsized effects.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2012-07-14T08:33:28.440Z · LW(p) · GW(p)

You're right about that. And there is already the degrowth movement, plus lately I've been hearing even some less radical politicians talking about scaling down economic growth (due to it not increasing well-being in the developed countries anymore). So perhaps something could in fact be done about that.

comment by [deleted] · 2012-07-12T21:00:11.272Z · LW(p) · GW(p)

And of course there is Bill Joy's essay. I forgot about that. But it seems like small potatoes.

comment by Strange7 · 2012-07-14T06:09:03.180Z · LW(p) · GW(p)

But surely global coordination and information restriction do not depend much on technology,

Please, oh please, think about this for five minutes. Coordination cannot happen without communication, and global communication depends very much on technology.

Replies from: wedrifid, None
comment by wedrifid · 2012-07-14T09:46:26.799Z · LW(p) · GW(p)

Coordination cannot happen without communication

Not technically true. True enough for humans though.

comment by [deleted] · 2012-07-14T06:30:48.422Z · LW(p) · GW(p)

Well, I agree that it is not as obvious as I made out. However, for this purpose it suffices to note that these innovations/social features could be greatly furthered without more technological advances.

comment by TheOtherDave · 2012-07-12T15:04:42.503Z · LW(p) · GW(p)

Do you see any reason to believe this argument wasn't equally sound (albeit with different scary technologies) thirty years ago, or a hundred?

Replies from: None
comment by [deleted] · 2012-07-12T16:02:36.620Z · LW(p) · GW(p)

Thirty years ago the argument may still have been valid, although difficult to make, since nobody knew about the risks of AGI or self-replicating assemblers. A hundred years ago it would not have been valid in this form, since we lacked surveillance and space exploration technologies.

Keep in mind that we have a certain bias on this question, since we happen to have survived up until this point in history, but there is no guarantee of that in the future.

comment by siodine · 2012-07-11T16:55:44.779Z · LW(p) · GW(p)

SI and rationality

Paraphrasing:

Holden expects us to have epistemic and instrumental powers of rationality that would make us successful in Western society; however, this is a strawman. Being rational isn't succeeding in society, but succeeding at your own goals.

(Btw, I'm going to coin a new term for this: the straw-morra [a reference to the main character from Limitless]).

Now that being said, you shouldn't anticipate that the members of SI would be morra-like.

There's a problem with this: arguments made in support of an individual are not nearly as compelling when applied to a group. We should anticipate that some people will have weird goals and end up doing things that break societal convention, but on average? I don't think so. When you have a group of people like those working at SI, you should anticipate that a few of them are morra-like -- and those people should be able to turn everything around, thereby making the group-level straw-morra not much of a strawman.

And what do you know? There is at least one morra-like person I've seen at SI: lukeprog. After Luke first heard about the singularity, he became the executive director of the SI within three months (?), and without a degree or the relevant experience. Since then, he appears to have made every effort to completely turn around SI for the better, and appears to be succeeding.

I think your argument should be that the SI has turned over a new leaf since you've joined, Luke.

(Not saying that in large part it isn't your argument, but I don't think it would be wrong to make it explicit that you will make SI successful like Jobs made Apple successful.)

Furthermore, LWers and SIers doing well on Frederick's CRT is, by itself at least, about as impressive as doing well on a multiple-choice driving test without ever having driven. Connect training for the CRT, and then doing well on the CRT, to something real via research.

Replies from: lukeprog
comment by lukeprog · 2012-07-11T17:53:16.639Z · LW(p) · GW(p)

I reject the paraphrase, and the test you link to involved a lot more than the CRT.

Replies from: siodine
comment by siodine · 2012-07-11T18:08:37.959Z · LW(p) · GW(p)

I reject the paraphrase

Why?

Direct quotes:

Holden: To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts...

That is synonymous with success in Western society. His definition of superior general rationality or insight (read: instrumental and epistemic rationality) fits with my paraphrase of that direct quote.

Luke: Unfortunately, this seems to misunderstand the term "rationality" as it is meant in cognitive science. As I explained elsewhere:

You think his definition is wrong.

Luke: Like intelligence and money, rationality is only a ceteris paribus predictor of success. So while it's empirically true (Stanovich 2010) that rationality is a predictor of life success, it's a weak one. (At least, it's a weak predictor of success at the levels of human rationality we are capable of training today.) If you want to more reliably achieve life success, I recommend inheriting a billion dollars or, failing that, being born+raised to have an excellent work ethic and low akrasia.

I.e., we shouldn't necessarily expect rational people to be successful. The only problem I see with my paraphrase is in explaining why some people aren't successful given that they're rational (per your definition), which it does by appealing to atypical goals. That should make sense if they're instrumentally rational. (Of course, this discounts luck, but I don't think luck is an overriding factor on average here.)

the test you link to involved a lot more than the CRT.

This isn't useful information unless you also link to the other tests and show why they're meaningful after training to do well on them. I would take it out of your argument, as is. (Also, it's a spider web of links -- which I've read before).

Replies from: lukeprog
comment by lukeprog · 2012-07-11T18:24:05.187Z · LW(p) · GW(p)

Your paraphrase of me was:

Holden expects us to have epistemic and instrumental powers of rationality that would make us successful in Western society; however, this is a strawman. Being rational isn't succeeding in society, but succeeding at your own goals.

But I didn't think that what Holden got wrong was a confusion between one's own goals and "success in Western society" goals. Many of SI's own goals include "success in Western society" goals like lots of accumulated wealth and power. Instead, what I thought Holden got wrong was his estimate of the relation between rationality and success.

Re: the testing. LWers hadn't trained specifically for the battery of tests given them that day, but they outperformed every other group I know of who has taken those tests. I agree that these data aren't as useful as the data CFAR is collecting now about the impact of rationality training on measures of life success, but they are suggestive enough to support a weak, qualified claim like the one I made, that "it seems" like LWers are more rational than the general population.

Replies from: komponisto, siodine
comment by komponisto · 2012-07-11T19:02:38.895Z · LW(p) · GW(p)

It occurs to me that Holden's actual reasoning (never mind what he said) is perhaps not about rationality per se and instead may be along these lines: "Since SI staff haven't already accumulated wealth and power, they probably suffer from something like insufficient work ethic or high akrasia or not-having-inherited-billions, and thus will probably be ineffective at achieving the kind of extremely-ambitious goals they have set for themselves."

Replies from: lmm
comment by lmm · 2013-02-10T20:04:34.862Z · LW(p) · GW(p)

It may or may not be Holden's, but I think you've put your finger on my real reasons for not wanting to donate to SI. I'd be interested to hear any counterpoint.

comment by siodine · 2012-07-11T18:35:34.323Z · LW(p) · GW(p)

But I didn't think that what Holden got wrong was a confusion between one's own goals and "success in Western society" goals. Many of SI's own goals include "success in Western society" goals like lots of accumulated wealth and power. Instead, what I thought Holden got wrong was his estimate of the relation between rationality and success.

Right; then I (correctly, I think) took your reasoning a step further than you did. SI's goals don't necessarily correspond with its members' goals. SIers may be there because they want to be around a lot of cool people, and may not have any particular desire to be successful (though I suspect many of them do). But this discounts luck, like the luck of being born conscientious -- with the power to accomplish your goals. And like I said, poor luck like that is unconvincing when applied to a group of people.

that "it seems" like LWers are more rational than the general population.

When I say "it seems", being an unknown here, people will likely take me to be reporting an anecdote. When you, the executive director of SI and a researcher on this topic, say "it seems", I think people will take it as a weak impression of the available research. Scientists adept at communicating with journalists get around this by saying "I speculate" instead.

comment by Aeonios · 2012-07-19T03:41:14.453Z · LW(p) · GW(p)

There are several reasons why I agree with the "Pascal's Mugging" comment:

  1. Intelligence Explosion: There are several reasons why an intelligence explosion is highly unlikely. First, upgrading computer fabrication equipment requires on the order of 5-15 billion dollars. Second, intelligence is not measured in gigaflops or petaflops, and mere improvement of fabrication technology is insufficient to increase intelligence. Finally, the requisite variety that drives innovation and creation will be extremely difficult to produce in a limited number of AIs. Succeeding in engineering or science requires copious amounts of failure, and AIs are not immune to this either.

  2. Computing Overhang: The very claim of "computing overhang" shows total ignorance of actual AI, and of the incredible complexity of human intelligence. The human brain is made up of numerous small regions which both "run programs" inside of themselves and communicate via synchronous signals with the rest of the brain in concert (in neural, not transistor, form). A human-level AI would be the same, and could not simply be run on, say, your average web server, no matter how decked out it is. An AI that could run on "extra" hardware would probably be too primitive to reproduce itself on purpose, and if it did it would be a minor nuisance at worst.

  3. The idea that AIs can be "programmed" is mostly nonsense. Very simple AIs can be "programmed", sure, but neural networks require training by experience, just like humans. An AI with human-level intelligence or greater would need to be taught like a child, and any "friendliness" that came of it would be the result of its "instincts" (I'm guessing we wouldn't want AIs with aggression) and of its experience. Additionally, as mentioned above, the need for variety in intelligence to produce real progress means that copying AIs will not be as economical as it might seem, not to mention not nearly as simple as you make it out to be.

  4. The timescales you present are absurd. Humans barely have an understanding of human psychology, and they do terribly at it with the knowledge they do have. We may have teraflops desktop computers in 20 years, but that does not imply that they will magically sprout intelligence! Technically, even with today's technology you could produce a program much more sophisticated than SHRDLU was, and receive orders of magnitude better performance than the original did, but it is the complexity of programming something that learns that prevents it from occurring commonly. It will likely be a hundred, maybe two hundred, years before we have a sophisticated enough understanding of human intelligence to reproduce it in any meaningful way. We have only taken the bare first steps into the field thus far, and development has been much slower than for the rest of the computing industry.

In short, the human stupidity that is occurring right now is a much greater threat to our future as a species than is any hypothetical superintelligent AI that might finally appear a hundred years or more in the future. If human civilization is ever to maintain its integrity long enough to produce such a thing, then widespread ignorance of economics and spirituality/psychology and general lack of sensitivity to culture and art must be dealt with first and foremost.

Replies from: DaFranker, Nautilus
comment by DaFranker · 2012-07-19T16:17:22.720Z · LW(p) · GW(p)

Nice try. You've almost succeeded at summarizing practically all the relevant arguments against the SI initiative that have already been refuted. Notice the last part there that says "have already been refuted".

Each of the assertions you make is one that members of SI have already addressed and refuted. I'd take the time to decompose your post into a list of assertions and give you links to the particular articles and posts where those arguments were taken down, but I believe this would be an unwise use of my time.

It would, at any rate, be much simpler to tell you to at least read the articles on the Facing the Singularity site, which are a good popularized introduction to the topic. In particular, the point about timescale overestimates is clearly addressed there, as is that of the "complexity" of human intelligence.

I'd also like to point out that you are overcomplicating the activity of the human brain. There are no such things as "numerous small regions" that "run programs" or "communicate". These are interpretations of patterns within the underlying natural events, which are, first and foremost, simply a huge collection of neurons sending signals to other neurons, each with its own unique set of links to particular other neurons and a domain of nearby neurons to which it could potentially link itself. This is no different from the old core-sequence article here on LessWrong where Eliezer talks about how reality doesn't actually follow the rules of aerodynamics to move air around a plane - it's merely interactions of countless tiny [bits of something] on a grand scale, with each tiny [bit of something] doing its own thing, and nowhere along the entire process do the formulae we use for aerodynamics get "solved" to decide where one of the [bits of something] must go.

Anyway, I'll cut myself short here - I doubt any more deserves to be said on this. If you are willing to learn and question yourself, and actually want to become a better rationalist and obtain more correct beliefs, the best way to start is to go read some of the articles that are already on LessWrong and actually read the material on the Singinst.org website, most of which is very readable even without prior technical knowledge or experience.

Replies from: homunq, Aeonios
comment by homunq · 2012-07-23T16:58:10.259Z · LW(p) · GW(p)

I don't pretend I've read every refutation of Aeonios's arguments that's out there, but I've read a few. Generally, those "refutations" strike me as plausible arguments by smart people, but far from bulletproof. Thus, I think that your [DaFranker's] attitude of "I know better so I barely have time for this" isn't the best one.

(I'm sorry, I don't have time to get into the details of the arguments themselves, so this post is all meta. I realize that that's somewhat hypocritical, but "hypocrisy is the tribute vice pays to virtue" so I'm OK with that.)

Replies from: DaFranker
comment by DaFranker · 2012-07-23T22:26:33.434Z · LW(p) · GW(p)

Indeed, most of them are nothing but smart arguments by smart people, and have not been formally proven. However, none of the arguments for anything in AI research is formally proven, except for some very primitive mathematics and computer science stuff. Basically, at the moment all we have to go on is a lot of thought, some circumstantial "evidence" and our sets of beliefs.

All I'm saying is that, if you watch the trend, it's much more likely (with my priors, at least) that the S.I. is "right" and that the arguments that keep being brought against it are unenlightened, in light of a few key observables: each argument against the S.I. being "refuted" one after another historically, most of the critics of the S.I. not having spent nearly as much time thinking about the issues at hand and actually researching AI, etc.

It's not that I know better; it's merely that the evidence presented to me from "both sides" (if one were to arbitrarily delimit two specific opposing factions, for simplification) and my own knowledge of the world seem to indicate that the "S.I. side" holds propositions which are much more likely to be true. I'll admit that the end result does project that attitude, but this is mainly incidental to the fact that I was pressed for time when I wrote that particular post, and I believed it would be pointless to discuss and argue further for the benefit of an outsider who hadn't yet read the relevant material on the topic at hand.

Replies from: homunq
comment by homunq · 2012-07-24T02:13:25.289Z · LW(p) · GW(p)

But in this case, "more likely to be true" means something like "a good enough argument to move my priors by roughly an order of magnitude, or two at the outside". Since in the face of our ignorance of the future, reasonable priors could differ by several orders of magnitude, even the best arguments I've seen aren't enough to dismiss any "side" as silly or not worthy of further consideration (except stuff that was obviously silly to begin with).

Replies from: DaFranker
comment by DaFranker · 2012-07-24T13:56:31.248Z · LW(p) · GW(p)

That's a very good point.

I was intuitively tempted to retort with a bunch of things about the likelihood of exceptions and the information taken into consideration, but I realized before posting that I was actually falling victim to several biases in that train of thought. You've actually given me a new way to think of the issue. I'm still of the intuition that any new way to think about it will only reinforce my beliefs and support the S.I. over time, though.

For now, I'm content to concede that I was weighting my priors, and my confidence in my own knowledge of the universe (on which my posteriors for AI issues inevitably depend, in one way or another), too heavily, among possibly other mistakes. However, this seems at first glance to be even more evidence of the need for a new mathematical or logical language to discuss these questions in more depth, detail, and formality.

comment by Aeonios · 2012-07-20T06:23:27.654Z · LW(p) · GW(p)

I decided to read through the essays on Facing the Singularity, and I found more faults than I care to address. Also, I can see why you might think that the workings of the human mind are simple, given that the general attitude here is that you should go around maximizing your "utility function". That is utter and complete nonsense, for reasons that deserve their own blog post. What I see more than anything is a bunch of ex-Christians worshipping their newfound hypothetical machine god, and doing so by lowering themselves to the level of the machine rather than raising the machine to the level of man.

I'll give one good example to make clear what I mean (from Facing the Singularity):

But that can’t possibly be correct. The probability of Linda being a bank teller can’t be less than the probability of her being a bank teller and a feminist.

This is my “Humans are crazy” Exhibit A: The laws of probability theory dictate that as a story gets more complicated, and depends on the truth of more and more claims, its probability of being true decreases. But for humans, a story often seems more likely as it is embellished with details that paint a compelling story: “Linda can’t be just a bank teller; look at her! She majored in philosophy and participated in antinuclear demonstrations. She’s probably a feminist bank teller.”


But, the thing is, context informs us that while a philosophy major is unlikely to work for a bank, a feminist is much more likely to work a "pink collar job" such as secretarial work or as a bank teller, where they can use the state to monger for positions, pay and benefits above and beyond what they deserve. A woman who otherwise would have no interest in business or finance, when indoctrinated by the feminist movement, will leap to take crappy office jobs so they can raise their fists in the air in onionistic fashion against the horrible man-oppression they righteously defeated with their superior women intellects. The simple fact that "philosophy" in a modern school amounts to "The History of Philosophy", and is utterly useless might also clue one in on the integrity or lack thereof that a person might have, although of course it isn't conclusive.

In short, impressive "logical" arguments about how probabilities of complements must be additive can only be justified in a vacuum without context, a situation that does not exist in the real world.

Replies from: Richard_Kennaway, DaFranker
comment by Richard_Kennaway · 2012-07-20T11:17:14.339Z · LW(p) · GW(p)

But, the thing is, context informs us

None of that fog obscures the basic fact that the number of feminist female bank tellers cannot possibly be greater than the number of female bank tellers. The world is complex, but that does not mean that there are no simple truths about it. This is one of them.

People have thought up all manner of ways of exonerating people from the conjunction fallacy, but if you go back to Eliezer's two posts about it, you will find some details of the experiments that have been conducted. His conclusion:

The conjunction fallacy is probably the single most questioned bias ever introduced, which means that it now ranks among the best replicated. The conventional interpretation has been nearly absolutely nailed down. Questioning, in science, calls forth answers.

The conjunction error is an error, and people do make it.
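
For concreteness, the inequality at issue can be written out as a minimal derivation (with T and F used here purely as shorthand for "Linda is a bank teller" and "Linda is a feminist"):

```latex
P(T \wedge F) \;=\; P(T)\,P(F \mid T) \;\le\; P(T), \qquad \text{since } 0 \le P(F \mid T) \le 1.
```

The same ordering survives any amount of context: conditioning on background evidence E gives P(T ∧ F | E) = P(T | E) P(F | T, E) ≤ P(T | E).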

Replies from: Aeonios
comment by Aeonios · 2012-07-21T09:22:05.612Z · LW(p) · GW(p)

I reread that section, and you are correct: given that they don't tell you whether or not she is a feminist, it cannot be used as a criterion to determine whether or not she is a banker. However, I would say that the example, in typical public-education style, is loaded and begs an incorrect answer. Since the only data you are given is insufficient to draw any conclusions, the participant is led to speculate without understanding the limitations of the question.

As for "utility function", there are at least three reasons why it is not just wrong, but entirely impossible.

1: Utility is heterogeneous. Which gives you more "utility", a bowl of ice cream or a chair? The question itself is nonsensical; the quality/type of utility gained from a chair and a bowl of ice cream is entirely different.

2: Utility is complementary. If I own a field, the field by itself may be useless to me. Add a picnic table and some food, and suddenly the field gains utility beyond the food, table, and field individually. Perhaps I could run horses through the field, or add some labor and intelligent work and turn it into a garden, but the utility I get from it depends on my preferences (which may change) and on the combination with other resources and a plan. Another example: a person who owns a yacht would probably get more "utility" out of going to the ocean than someone who does not.

3: Utility is marginal. For the first three scoops of ice cream, I'd say I get equal "utility" from each. The fourth scoop yields comparatively less "utility" than the previous three, and by the fifth the utility becomes negative, as I feel sick afterwards. By six scoops I'm throwing away ice cream. On the other hand, if I have 99 horses, whether I gain or lose one would not make much difference to the utility I get from them, but if I only have 2 horses, losing one could mean losing more than half of my utility. Different things have different useful quantities in different situations depending on how they are used.

4: Utility cannot be measured. This should be obvious. Even if we were to invent a magical brain scanner that could measure brain activity in high resolution in vivo, utility is not always the same for the same thing every time it is experienced, and you still have the apples-oranges problem that makes the comparison meaningless to begin with.

5: Human psychology is not a mere matter of using logic correctly or not. In this case, it is definitely a misapplication, but it seems the only psychology that gets any attention around here is anecdotes from college textbooks on decisions and some oversimplified mechanistic theorizing from neuroscience. You talk about anchoring like it's some horrible disease, when it's the same fundamental process required for memory and mastery of concepts. You've probably heard of dissociation but you probably wouldn't believe me if I told you that memory can be flipped on and off like a light switch at the whim of your unconscious.

That aside, treating intelligence as a machine that optimizes things is missing the entire point of intelligence. If you had ever read Douglas Hofstadter's "Godel Escher Bach", or Christopher Alexander's "The Nature of Order" series, you might have a greater appreciation for the role that abstract pattern recognition and metaphor plays in intelligence.

Finally, I read two "papers" from SI, and found them entirely unprofessional. They were both full of vague terminology and unjustified assertions, and were written in a colloquial style that pretty much begs the reader to believe the crap they're spewing. You get lots of special graphs showing how a superhuman AI would be something like two orders of magnitude more intelligent than humans, but no justification for how these machines will magically be able to produce the economic resources to reach that level of development "overnight". Comparing modern "AIs" to mice is probably the most absurd fallacy I've seen thus far. Even the most sophisticated AI for driving cars cannot drive on a real road, its "intelligence" is overall still lacking in sophistication compared to that of a honey bee, and the equipment required to produce its rudimentary driving skills far outweighs the benefits. Computer hardware may improve regularly by Moore's Law, but the field of AI research does not, and there is no evidence that we will see a jump in computer intelligence from below insects to above orangutans any time soon. When we do, it will probably take them 50-100 years to leave us fully at orangutan level.

Replies from: wedrifid, thomblake, army1987, hairyfigment, David_Gerard
comment by wedrifid · 2012-07-21T18:58:54.622Z · LW(p) · GW(p)

As for "utility function", there are at least three reasons why it is not just wrong, but entirely impossible.

You don't understand what that term means.

comment by thomblake · 2012-07-23T17:11:16.204Z · LW(p) · GW(p)

Even the most sophisticated AI for driving cars cannot drive on a real road,

This is false, though there are currently situations that may come up that will prompt such a car to give control back to the human driver, and some situations (such as high reflectivity / packed snow) that it can't handle yet.

comment by A1987dM (army1987) · 2012-07-21T18:00:25.766Z · LW(p) · GW(p)

1: Utility is heterogeneous. Which gives you more "utility", a bowl of ice cream or a chair? The question itself is nonsensical; the quality/type of utility gained from a chair and a bowl of ice cream is entirely different.

It's not nonsensical; it means “would you rather have a bowl of ice cream or a chair?” Of course the answer is “it depends”, but no-one ever claimed that U(x + a bowl of ice cream) − U(x) doesn't depend on x.
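
The usual formalism already accommodates heterogeneity, complementarity, and diminishing returns, because utility is defined over whole bundles (or world-states), not over goods in isolation, so the value of one more unit of something depends on everything else you have. A toy Python sketch, with hypothetical goods and made-up numbers, purely as illustration:

```python
import math

def utility(bundle):
    """bundle: dict mapping goods to quantities, e.g. {'ice_cream': 2, 'chair': 1}."""
    u = 0.0
    # Diminishing returns on ice cream: each extra scoop is worth less than the last.
    u += 3.0 * math.log1p(bundle.get('ice_cream', 0))
    # A chair: only the first one matters much in this toy example.
    if bundle.get('chair', 0) >= 1:
        u += 5.0
    # Complementarity: a field is worth more if you also have a picnic table.
    if bundle.get('field', 0) >= 1:
        u += 2.0
        if bundle.get('picnic_table', 0) >= 1:
            u += 4.0
    return u

def marginal_utility(bundle, good):
    """U(x + one unit of good) - U(x): what adding one unit is worth, given bundle x."""
    more = dict(bundle)
    more[good] = more.get(good, 0) + 1
    return utility(more) - utility(bundle)

print(marginal_utility({}, 'ice_cream'))               # first scoop: ~2.08
print(marginal_utility({'ice_cream': 5}, 'ice_cream')) # sixth scoop: ~0.46
print(marginal_utility({'field': 1}, 'picnic_table'))  # with a field: 4.0
print(marginal_utility({}, 'picnic_table'))            # without a field: 0.0
```

The numbers mean nothing in themselves; the point is only that a single function over bundles can express "it depends on x".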

comment by hairyfigment · 2012-07-21T19:06:16.941Z · LW(p) · GW(p)

To focus on one problem with this, you write:

treating intelligence as a machine that optimizes things is missing the entire point of intelligence. If you had ever read Douglas Hofstadter's "Godel Escher Bach", or Christopher Alexander's "The Nature of Order" series, you might have a greater appreciation for the role that abstract pattern recognition and metaphor plays in intelligence.

Eliezer has read GEB and praised it above the mountains (literally). So a charitable reader of him and his colleagues might suppose that they know the point about pattern recognition, but do not see the connection that you find obvious. And in fact I don't know what you're responding to, or what you think your second quoted sentence has to do with the first, or what practical conclusion you draw from it through what argument. Perhaps you could spell it out in detail for us mortals?

comment by David_Gerard · 2012-07-21T11:35:00.420Z · LW(p) · GW(p)

Finally, I read two "papers" from SI, and found them entirely unprofessional.

Which two papers, by the way?

comment by DaFranker · 2012-07-20T09:42:28.800Z · LW(p) · GW(p)

In short, impressive "logical" arguments about how probabilities of complements must be additive can only be justified in a vacuum without context, a situation that does not exist in the real world.

Every "context" can be described as a set of facts and parameters, AKA more data. Perfect data on the context means perfect information. Perfect information means perfect choice and perfect predictions. Sure, it might seem to you like the logical arguments expressed are "too basic to apply to the real world", but a utility function is really only ever "wrong" when it fails to apply the correct utility to the correct element ("sorting out your priorities"), whether that's by improper design, lack of self-awareness, missing information or some other hypothetical reason.

For every "no but theory doesn't apply to the real world" or "theory and practice are different" argument, there is always an explanation for the proposed difference between theory and reality, and this explanation can be included in the theory. The point isn't to throw out reality and use our own virtual-theoretical world. It's to update our model (the theory) in the most sane and rational way, over and over again (constantly and continuously) so that we get better.

Likewise, maximizing one's own utility function is not the reduce-oneself-to-machine worship of a machine god that you seem to believe it is. I have emotions, I get angry, I get irritated (e.g. at your response*), I am happy, etc. Yet it appears, in hindsight, that for several years I've been maximizing my utility function without knowing that that's what it's called (and learning the terminology and more correct/formal ways of talking about it once I started reading LessWrong).

Your "utility function" is not one simple formula that you use to plug in values to variables, compute, and then call it a decision. The utility function of a person is the entire, general completeness of what that person wants and desires and values. If I tried to write down for you my own utility function, it would be both utterly incomprehensible and probably ridiculously ugly. That's assuming I'd even be capable of writing it all down - limited self-awareness, biases, continuous change, and all that stuff.

To put it all in perspective, "maximizing one's utility function" is very much equivalent to "according to what information you have, spend as much time as you think it is worth taking to decide on the probably-best course of action available, and then act on it, such that in hindsight you'll have maximized your chances of reaching your own objectives". This doesn't mean obtaining perfect information, never being wrong, or worshipping a formula. It simply means living your own life, in your own way, with better (and improving) awareness of yourself, and updating (changing) your own beliefs when they're no longer correct so that you can act and behave more rationally. Seen this way, LessWrong is essentially a large self-help group for normal people who just want to be better at knowing things and making decisions in general.
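
To make that paraphrase concrete, here is a deliberately simplified sketch in Python, with hypothetical actions, outcomes, and made-up probabilities: nothing more than "weigh the options by your current beliefs and pick the best-looking one".

```python
# Minimal sketch (hypothetical actions, outcomes, and made-up numbers): "act on
# the probably-best course of action" read as picking the option with the
# highest expected utility under your current beliefs.

actions = {
    # action: list of (probability of outcome, utility of outcome)
    'take_the_job':    [(0.7, 10.0), (0.3, -2.0)],
    'stay_put':        [(1.0, 3.0)],
    'start_a_company': [(0.1, 50.0), (0.9, -5.0)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda name: expected_utility(actions[name]))
for name, outcomes in actions.items():
    print(name, expected_utility(outcomes))
print('chosen:', best)   # 'take_the_job' with these made-up numbers
```

The probabilities here are just whatever your current beliefs happen to be; revising them as you learn more is the "changing your beliefs when they're no longer correct" part.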

On a last note, Facing the Singularity does not contain a bunch of scientific essays that would be the final answer to all singularity concerns. At best, it can be considered one multi-chapter essay going through various points to support the primary thesis, which is that its one author believes the various experts are right about the Singularity being "imminent" (within this century, at the outside). This is clearly stated on the front page, which is also the table of contents. As I've said in my previous reply, it's a good popularized introduction. However, the real meat comes from the SingInst articles, essays, and theses, as well as some of the more official material on LessWrong. Eliezer's Timeless Decision Theory paper is a good example of more rigorous and technical writing, though it's far from the most relevant, nor do I think it's the first one that a newcomer should read. If you're interested in possible AI decision-making techniques, though, it's a very interesting and pertinent read.

*(I was slightly irritated that I failed to fully communicate my point, and at the dismissal of long-thought-over and much-debated theories, including beliefs I've revalidated time and time again over the years, along with the childish comment about ex-Christians and their "machine god". This does not mean, however, that I direct this irritation at you or at some other, unrelated outlet. My irritation is my own and a product of my own mental models.)

Edit: Fixed some of the text and added missing footnote.

comment by Nautilus · 2012-07-21T14:11:33.356Z · LW(p) · GW(p)

general lack of sensitivity to culture and art must be dealt with first

Where'd that come from? Are you an artist / anthropologist?