Thoughts on the Singularity Institute (SI)


post by HoldenKarnofsky · 2012-05-11T04:31:30.364Z · LW · GW · Legacy · 1274 comments

      Summary of my views
      Intent of this post
  Does SI have a well-argued case that its work is beneficial and important?
    Objection 1: it seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous.
    Objection 2: SI appears to neglect the potentially important distinction between "tool" and "agent" AI.
    Objection 3: SI's envisioned scenario is far more specific and conjunctive than it appears at first glance, and I believe this scenario to be highly unlikely.
    Other objections to SI's views
    Wrapup
  Is SI the kind of organization we want to bet on?
    Wrapup
  But if there's even a chance …
  Existential risk reduction as a cause
  How I might change my views
  Acknowledgements

This post presents thoughts on the Singularity Institute from Holden Karnofsky, Co-Executive Director of GiveWell. Note: Luke Muehlhauser, the Executive Director of the Singularity Institute, reviewed a draft of this post, and commented: "I do generally agree that your complaints are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI). I am working to address both categories of issues." I take Luke's comment to be a significant mark in SI's favor, because it indicates an explicit recognition of the problems I raise, and thus increases my estimate of the likelihood that SI will work to address them.

September 2012 update: responses have been posted by Luke and Eliezer (and I have responded in the comments of their posts). I have also added acknowledgements.

The Singularity Institute (SI) is a charity that GiveWell has been repeatedly asked to evaluate. In the past, SI has been outside our scope (as we were focused on specific areas such as international aid). With GiveWell Labs we are open to any giving opportunity, no matter what form and what sector, but we still do not currently plan to recommend SI; given the amount of interest some of our audience has expressed, I feel it is important to explain why. Our views, of course, remain open to change. (Note: I am posting this only to Less Wrong, not to the GiveWell Blog, because I believe that everyone who would be interested in this post will see it here.)

I am currently the GiveWell staff member who has put the most time and effort into engaging with and evaluating SI. Other GiveWell staff currently agree with my bottom-line view that we should not recommend SI, but this does not mean they have engaged with each of my specific arguments. Therefore, while the lack of recommendation of SI is something that GiveWell stands behind, the specific arguments in this post should be attributed only to me, not to GiveWell.

Summary of my views

The argument advanced by SI for why the work it's doing is beneficial and important seems both wrong and poorly argued to me. My sense at the moment is that the arguments SI is making would, if accepted, increase rather than decrease the risk of an AI-related catastrophe.
SI has, or has had, multiple properties that I associate with ineffective organizations, and I do not see any specific evidence that its personnel/organization are well-suited to the tasks it has set for itself.
A common argument for giving to SI is that "even an infinitesimal chance that it is right" would be sufficient given the stakes. I have written previously about why I reject this reasoning; in addition, prominent SI representatives seem to reject this particular argument as well (i.e., they believe that one should support SI only if one believes it is a strong organization making strong arguments).
My sense is that at this point, given SI's current financial state, withholding funds from SI is likely better for its mission than donating to it. (I would not take this view to the furthest extreme; the argument that SI should have some funding seems stronger to me than the argument that it should have as much as it currently has.)
I find existential risk reduction to be a fairly promising area for philanthropy, and plan to investigate it further.
There are many things that could happen that would cause me to revise my view on SI. However, I do not plan to respond to all comment responses to this post. (Given the volume of responses we may receive, I may not be able to even read all the comments on this post.) I do not believe these two statements are inconsistent, and I lay out paths for getting me to change my mind that are likely to work better than posting comments. (Of course I encourage people to post comments; I'm just noting in advance that this action, alone, doesn't guarantee that I will consider your argument.)

Intent of this post

I did not write this post with the purpose of "hurting" SI. Rather, I wrote it in the hopes that one of these three things (or some combination) will happen:

New arguments are raised that cause me to change my mind and recognize SI as an outstanding giving opportunity. If this happens I will likely attempt to raise more money for SI (most likely by discussing it with other GiveWell staff and collectively considering a GiveWell Labs recommendation).
SI concedes that my objections are valid and increases its determination to address them. A few years from now, SI is a better organization and more effective in its mission.
SI can't or won't make changes, and SI's supporters feel my objections are valid, so SI loses some support, freeing up resources for other approaches to doing good.

Which one of these occurs will hopefully be driven primarily by the merits of the different arguments raised. Because of this, I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.

Does SI have a well-argued case that its work is beneficial and important?

I know no more concise summary of SI's views than this page, so here I give my own impressions of what SI believes, in italics.

There is some chance that in the near future (next 20-100 years), an "artificial general intelligence" (AGI) - a computer that is vastly more intelligent than humans in every relevant way - will be created.

This AGI will likely have a utility function and will seek to maximize utility according to this function.

This AGI will be so much more powerful than humans - due to its superior intelligence - that it will be able to reshape the world to maximize its utility, and humans will not be able to stop it from doing so.

Therefore, it is crucial that its utility function be one that is reasonably harmonious with what humans want. A "Friendly" utility function is one that is reasonably harmonious with what humans want, such that a "Friendly" AGI (FAI) would change the world for the better (by human standards) while an "Unfriendly" AGI (UFAI) would essentially wipe out humanity (or worse).

Unless great care is taken specifically to make a utility function "Friendly," it will be "Unfriendly," since the things humans value are a tiny subset of the things that are possible.

Therefore, it is crucially important to develop "Friendliness theory" that helps us to ensure that the first strong AGI's utility function will be "Friendly." The developer of Friendliness theory could use it to build an FAI directly or could disseminate the theory so that others working on AGI are more likely to build FAI as opposed to UFAI.

From the time I first heard this argument, it has seemed to me to be skipping important steps and making major unjustified assumptions. However, for a long time I believed this could easily be due to my inferior understanding of the relevant issues. I believed my own views on the argument to have only very low relevance (as I stated in my 2011 interview with SI representatives). Over time, I have had many discussions with SI supporters and advocates, as well as with non-supporters who I believe understand the relevant issues well. I now believe - for the moment - that my objections are highly relevant, that they cannot be dismissed as simple "layman's misunderstandings" (as they have been by various SI supporters in the past), and that SI has not published anything that addresses them in a clear way.

Below, I list my major objections. I do not believe that these objections constitute a sharp/tight case for the idea that SI's work has low/negative value; I believe, instead, that SI's own arguments are too vague for such a rebuttal to be possible. There are many possible responses to my objections, but SI's public arguments (and the private arguments) do not make clear which possible response (if any) SI would choose to take up and defend. Hopefully the dialogue following this post will clarify what SI believes and why.

Some of my views are discussed at greater length (though with less clarity) in a public transcript of a conversation I had with SI supporter Jaan Tallinn. I refer to this transcript as "Karnofsky/Tallinn 2011."

Objection 1: it seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous.

Suppose, for the sake of argument, that SI manages to create what it believes to be an FAI. Suppose that it is successful in the "AGI" part of its goal, i.e., it has successfully created an intelligence vastly superior to human intelligence and extraordinarily powerful from our perspective. Suppose that it has also done its best on the "Friendly" part of the goal: it has developed a formal argument for why its AGI's utility function will be Friendly, it believes this argument to be airtight, and it has had this argument checked over by 100 of the world's most intelligent and relevantly experienced people. Suppose that SI now activates its AGI, unleashing it to reshape the world as it sees fit. What will be the outcome?

I believe that the probability of an unfavorable outcome - by which I mean an outcome essentially equivalent to what a UFAI would bring about - exceeds 90% in such a scenario. I believe the goal of designing a "Friendly" utility function is likely to be beyond the abilities even of the best team of humans willing to design such a function. I do not have a tight argument for why I believe this, but a comment on LessWrong by Wei Dai gives a good illustration of the kind of thoughts I have on the matter:

What I'm afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or the formalization of the notion of "safety" used by the proof is wrong. This kind of thing happens a lot in cryptography, if you replace "safety" with "security". These mistakes are still occurring today, even after decades of research into how to do such proofs and what the relevant formalizations are. From where I'm sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there is no decades of time to refine the proof techniques and formalizations. There's good recent review of the history of provable security, titled Provable Security in the Real World, which might help you understand where I'm coming from.

I think this comment understates the risks, however. For example, when the comment says "the formalization of the notion of 'safety' used by the proof is wrong," it is not clear whether it means that the values the programmers have in mind are not correctly implemented by the formalization, or whether it means they are correctly implemented but are themselves catastrophic in a way that hasn't been anticipated. I would be highly concerned about both. There are other catastrophic possibilities as well; perhaps the utility function itself is well-specified and safe, but the AGI's model of the world is flawed (in particular, perhaps its prior or its process for matching observations to predictions is flawed) in a way that doesn't emerge until the AGI has made substantial changes to its environment.

By SI's own arguments, even a small error in any of these things would likely lead to catastrophe. And there are likely failure forms I haven't thought of. The overriding intuition here is that complex plans usually fail when unaccompanied by feedback loops. A scenario in which a set of people is ready to unleash an all-powerful being to maximize some parameter in the world, based solely on their initial confidence in their own extrapolations of the consequences of doing so, seems like a scenario that is overwhelmingly likely to result in a bad outcome. It comes down to placing the world's largest bet on a highly complex theory - with no experimentation to test the theory first.

So far, all I have argued is that the development of "Friendliness" theory can achieve at best only a limited reduction in the probability of an unfavorable outcome. However, as I argue in the next section, I believe there is at least one concept - the "tool-agent" distinction - that has more potential to reduce risks, and that SI appears to ignore this concept entirely. I believe that tools are safer than agents (even agents that make use of the best "Friendliness" theory that can reasonably be hoped for) and that SI encourages a focus on building agents, thus increasing risk.

Objection 2: SI appears to neglect the potentially important distinction between "tool" and "agent" AI.

Google Maps is a type of artificial intelligence (AI). It is far more intelligent than I am when it comes to planning routes.

Google Maps - by which I mean the complete software package including the display of the map itself - does not have a "utility" that it seeks to maximize. (One could fit a utility function to its actions, as to any set of actions, but there is no single "parameter to be maximized" driving its operations.)

Google Maps (as I understand it) considers multiple possible routes, gives each a score based on factors such as distance and likely traffic, and then displays the best-scoring route in a way that makes it easily understood by the user. If I don't like the route, for whatever reason, I can change some parameters and consider a different route. If I like the route, I can print it out or email it to a friend or send it to my phone's navigation application. Google Maps has no single parameter it is trying to maximize; it has no reason to try to "trick" me in order to increase its utility.

In short, Google Maps is not an agent, taking actions in order to maximize a utility parameter. It is a tool, generating information and then displaying it in a user-friendly manner for me to consider, use and export or discard as I wish.

Every software application I know of seems to work essentially the same way, including those that involve (specialized) artificial intelligence such as Google Search, Siri, Watson, Rybka, etc. Some can be put into an "agent mode" (as Watson was on Jeopardy!) but all can easily be set up to be used as "tools" (for example, Watson can simply display its top candidate answers to a question, with the score for each, without speaking any of them).

The "tool mode" concept is importantly different from the possibility of Oracle AI sometimes discussed by SI. The discussions I've seen of Oracle AI present it as an Unfriendly AI that is "trapped in a box" - an AI whose intelligence is driven by an explicit utility function and that humans hope to control coercively. Hence the discussion of ideas such as the AI-Box Experiment. A different interpretation, given in Karnofsky/Tallinn 2011, is an AI with a carefully designed utility function - likely as difficult to construct as "Friendliness" - that leaves it "wishing" to answer questions helpfully. By contrast with both these ideas, Tool-AGI is not "trapped" and it is not Unfriendly or Friendly; it has no motivations and no driving utility function of any kind, just like Google Maps. It scores different possibilities and displays its conclusions in a transparent and user-friendly manner, as its instructions say to do; it does not have an overarching "want," and so, as with the specialized AIs described above, while it may sometimes "misinterpret" a question (thereby scoring options poorly and ranking the wrong one #1) there is no reason to expect intentional trickery or manipulation when it comes to displaying its results.

Another way of putting this is that a "tool" has an underlying instruction set that conceptually looks like: "(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc." An "agent," by contrast, has an underlying instruction set that conceptually looks like: "(1) Calculate which action, A, would maximize parameter P, based on existing data set D. (2) Execute Action A." In any AI where (1) is separable (by the programmers) as a distinct step, (2) can be set to the "tool" version rather than the "agent" version, and this separability is in fact present with most/all modern software. Note that in the "tool" version, neither step (1) nor step (2) (nor the combination) constitutes an instruction to maximize a parameter - to describe a program of this kind as "wanting" something is a category error, and there is no reason to expect its step (2) to be deceptive.
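To make the contrast concrete, here is a minimal Python sketch of the two instruction sets described above. It is only an illustration under stated assumptions: the names rank_actions, tool_step and agent_step, the toy route data D, and the scoring rule p are all hypothetical and do not describe how Google Maps or any real system is implemented. Step (1) is shared; the two versions of step (2) differ only in whether the result is reported or executed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Scored:
    action: str
    score: float


def rank_actions(actions: Sequence[str], score: Callable[[str], float]) -> List[Scored]:
    """Step (1): score each candidate action on parameter P, best first."""
    scored = [Scored(a, score(a)) for a in actions]
    return sorted(scored, key=lambda s: s.score, reverse=True)


def tool_step(ranked: Sequence[Scored], top_n: int = 3) -> str:
    """Step (2), tool version: summarize the ranking for a human. Nothing is executed."""
    lines = [
        f"{i}. {s.action} (P = {s.score:.2f})"
        for i, s in enumerate(ranked[:top_n], start=1)
    ]
    return "\n".join(lines)


def agent_step(ranked: Sequence[Scored], execute: Callable[[str], None]) -> str:
    """Step (2), agent version: carry out the top-ranked action directly."""
    best = ranked[0].action
    execute(best)
    return best


# Hypothetical data set D: route name -> (distance in km, expected delay in minutes).
D = {"highway": (12.0, 9.0), "back roads": (15.0, 1.0), "downtown": (10.0, 14.0)}


def p(route: str) -> float:
    """Hypothetical parameter P: shorter and less delayed routes score higher."""
    km, delay = D[route]
    return -(km + delay)


ranked = rank_actions(list(D), p)
print(tool_step(ranked))  # the user reads the ranking and decides what to do
```

In this sketch, switching from "tool" to "agent" means calling agent_step(ranked, some_actuator) instead of tool_step(ranked); the scoring code is untouched, which is the separability the paragraph above relies on.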

I elaborated further on the distinction and on the concept of a tool-AI in Karnofsky/Tallinn 2011.

This is important because an AGI running in tool mode could be extraordinarily useful but far safer than an AGI running in agent mode. In fact, if developing "Friendly AI" is what we seek, a tool-AGI could likely be helpful enough in thinking through this problem to render any previous work on "Friendliness theory" moot. Among other things, a tool-AGI would allow transparent views into the AGI's reasoning and predictions without any reason to fear being purposefully misled, and would facilitate safe experimental testing of any utility function that one wished to eventually plug into an "agent."

Is a tool-AGI possible? I believe that it is, and furthermore that it ought to be our default picture of how AGI will work, given that practically all software developed to date can (and usually does) run as a tool and given that modern software seems to be constantly becoming "intelligent" (capable of giving better answers than a human) in surprising new domains. In addition, it intuitively seems to me (though I am not highly confident) that intelligence inherently involves the distinct, separable steps of (a) considering multiple possible actions and (b) assigning a score to each, prior to executing any of the possible actions. If one can distinctly separate (a) and (b) in a program's code, then one can abstain from writing any "execution" instructions and instead focus on making the program list actions and scores in a user-friendly manner, for humans to consider and use as they wish.

Of course, there are possible paths to AGI that may rule out a "tool mode," but it seems that most of these paths would rule out the application of "Friendliness theory" as well. (For example, a "black box" emulation and augmentation of a human mind.) What are the paths to AGI that allow manual, transparent, intentional design of a utility function but do not allow the replacement of "execution" instructions with "communication" instructions? Most of the conversations I've had on this topic have focused on three responses:

Self-improving AI. Many seem to find it intuitive that (a) AGI will almost certainly come from an AI rewriting its own source code, and (b) such a process would inevitably lead to an "agent." I do not agree with either (a) or (b). I discussed these issues in Karnofsky/Tallinn 2011 and will be happy to discuss them more if this is the line of response that SI ends up pursuing. Very briefly:
- The idea of a "self-improving algorithm" intuitively sounds very powerful, but does not seem to have led to many "explosions" in software so far (and it seems to be a concept that could apply to narrow AI as well as to AGI).
- It seems to me that a tool-AGI could be plugged into a self-improvement process that would be quite powerful but would also terminate and yield a new tool-AI after a set number of iterations (or after reaching a set "intelligence threshold"). So I do not accept the argument that "self-improving AGI means agent AGI." As stated above, I will elaborate on this view if it turns out to be an important point of disagreement.
- I have argued (in Karnofsky/Tallinn 2011) that the relevant self-improvement abilities are likely to come with or after - not prior to - the development of strong AGI. In other words, any software capable of the relevant kind of self-improvement is likely also capable of being used as a strong tool-AGI, with the benefits described above.
- The SI-related discussions I've seen of "self-improving AI" are highly vague, and do not spell out views on the above points.
Dangerous data collection. Some point to the seeming dangers of a tool-AI's "scoring" function: in order to score different options it may have to collect data, which is itself an "agent" type action that could lead to dangerous actions. I think my definition of "tool" above makes clear what is wrong with this objection: a tool-AGI takes its existing data set D as fixed (and perhaps could have some pre-determined, safe set of simple actions it can take - such as using Google's API - to collect more), and if maximizing its chosen parameter is best accomplished through more data collection, it can transparently output why and how it suggests collecting more data. Over time it can be given more autonomy for data collection through an experimental and domain-specific process (e.g., modifying the AI to skip specific steps of human review of proposals for data collection after it has become clear that these steps work as intended), a process that has little to do with the "Friendly overarching utility function" concept promoted by SI. Again, I will elaborate on this if it turns out to be a key point.
Race for power. Some have argued to me that humans are likely to choose to create agent-AGI, in order to quickly gain power and outrace other teams working on AGI. But this argument, even if accepted, has very different implications from SI's view.
Conventional wisdom says it is extremely dangerous to empower a computer to act in the world until one is very sure that the computer will do its job in a way that is helpful rather than harmful. So if a programmer chooses to "unleash an AGI as an agent" with the hope of gaining power, it seems that this programmer will be deliberately ignoring conventional wisdom about what is safe in favor of shortsighted greed. I do not see why such a programmer would be expected to make use of any "Friendliness theory" that might be available. (Attempting to incorporate such theory would almost certainly slow the project down greatly, and thus would bring the same problems as the more general "have caution, do testing" counseled by conventional wisdom.) It seems that the appropriate measures for preventing such a risk are security measures aiming to stop humans from launching unsafe agent-AIs, rather than developing theories or raising awareness of "Friendliness."
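
The reply to the "dangerous data collection" response above describes a tool that proposes data collection transparently, runs only a pre-determined safe set of actions on its own, and is given more autonomy step by step. A minimal Python sketch of that review gate follows. Everything in it is a hypothetical illustration: the Proposal type, the review_proposals function, and the source names "maps_api" and "email_survey" are assumptions made for the example, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Proposal:
    source: str     # where the tool suggests collecting more data
    rationale: str  # the tool's stated reason, shown to the reviewer


def review_proposals(
    proposals: Iterable[Proposal],
    preapproved: Set[str],
    human_approves: Callable[[Proposal], bool],
) -> Tuple[List[Proposal], List[Proposal]]:
    """Split proposals into those allowed to run and those held back.

    A proposal runs only if its source is in the pre-approved set
    or a human reviewer explicitly approves it.
    """
    allowed: List[Proposal] = []
    held: List[Proposal] = []
    for proposal in proposals:
        if proposal.source in preapproved or human_approves(proposal):
            allowed.append(proposal)
        else:
            held.append(proposal)
    return allowed, held


proposals = [
    Proposal("maps_api", "traffic data would refine the route scores"),
    Proposal("email_survey", "asking drivers directly would reduce uncertainty"),
]

# Early on, only the simple API call is pre-approved and the reviewer declines the rest.
allowed, held = review_proposals(proposals, {"maps_api"}, lambda prop: False)
print([prop.source for prop in allowed], [prop.source for prop in held])
```

Granting more autonomy over time corresponds to adding a source to the pre-approved set once human review of it has repeatedly worked as intended, which is a domain-specific, experimental process rather than a change to any overarching utility function.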

One of the things that bothers me most about SI is that there is practically no public content, as far as I can tell, explicitly addressing the idea of a "tool" and giving arguments for why AGI is likely to work only as an "agent." The idea that AGI will be driven by a central utility function seems to be simply assumed. Two examples:

I have been referred to Muehlhauser and Salamon 2012 as the most up-to-date, clear explanation of SI's position on "the basics." This paper states, "Perhaps we could build an AI of limited cognitive ability — say, a machine that only answers questions: an 'Oracle AI.' But this approach is not without its own dangers (Armstrong, Sandberg, and Bostrom 2012)." However, the referenced paper (Armstrong, Sandberg and Bostrom 2012) seems to take it as a given that an Oracle AI is an "agent trapped in a box" - a computer that has a basic drive/utility function, not a Tool-AGI. The rest of Muehlhauser and Salamon 2012 seems to take it as a given that an AGI will be an agent.
I have often been referred to Omohundro 2008 for an argument that an AGI is likely to have certain goals. But this paper seems, again, to take it as given that an AGI will be an agent, i.e., that it will have goals at all. The introduction states, "To say that a system of any design is an 'artificial intelligence', we mean that it has goals which it tries to accomplish by acting in the world." In other words, the premise I'm disputing seems embedded in its very definition of AI.

The closest thing I have seen to a public discussion of "tool-AGI" is in Dreams of Friendliness, where Eliezer Yudkowsky considers the question, "Why not just have the AI answer questions, instead of trying to do anything? Then it wouldn't need to be Friendly. It wouldn't need any goals at all. It would just answer questions." His response:

To which the reply is that the AI needs goals in order to decide how to think: that is, the AI has to act as a powerful optimization process in order to plan its acquisition of knowledge, effectively distill sensory information, pluck "answers" to particular questions out of the space of all possible responses, and of course, to improve its own source code up to the level where the AI is a powerful intelligence. All these events are "improbable" relative to random organizations of the AI's RAM, so the AI has to hit a narrow target in the space of possibilities to make superintelligent answers come out.

This passage appears vague and does not appear to address the specific "tool" concept I have defended above (in particular, it does not address the analogy to modern software, which challenges the idea that "powerful optimization processes" cannot run in tool mode). The rest of the piece discusses (a) psychological mistakes that could lead to the discussion in question; (b) the "Oracle AI" concept that I have outlined above. The comments contain some more discussion of the "tool" idea (Denis Bider and Shane Legg seem to be picturing something similar to "tool-AGI") but the discussion is unresolved and I believe the "tool" concept defended above remains essentially unaddressed.

In sum, SI appears to encourage a focus on building and launching "Friendly" agents (it is seeking to do so itself, and its work on "Friendliness" theory seems to be laying the groundwork for others to do so) while not addressing the tool-agent distinction. It seems to assume that any AGI will have to be an agent, and to make little to no attempt at justifying this assumption. The result, in my view, is that it is essentially advocating for a more dangerous approach to AI than the traditional approach to software development.

Objection 3: SI's envisioned scenario is far more specific and conjunctive than it appears at first glance, and I believe this scenario to be highly unlikely.

SI's scenario concerns the development of artificial general intelligence (AGI): a computer that is vastly more intelligent than humans in every relevant way. But we already have many computers that are vastly more intelligent than humans in some relevant ways, and the domains in which specialized AIs outdo humans seem to be constantly and continuously expanding. I feel that the relevance of "Friendliness theory" depends heavily on the idea of a "discrete jump" that seems unlikely and whose likelihood does not seem to have been publicly argued for.

One possible scenario is that at some point, we develop powerful enough non-AGI tools (particularly specialized AIs) that we vastly improve our abilities to consider and prepare for the eventuality of AGI - to the point where any previous theory developed on the subject becomes useless. Or (to put this more generally) non-AGI tools simply change the world so much that it becomes essentially unrecognizable from the perspective of today - again rendering any previous "Friendliness theory" moot. As I said in Karnofsky/Tallinn 2011, some of SI's work "seems a bit like trying to design Facebook before the Internet was in use, or even before the computer existed."

Perhaps there will be a discrete jump to AGI, but it will be a sort of AGI that renders "Friendliness theory" moot for a different reason. For example, in the practice of software development, there often does not seem to be an operational distinction between "intelligent" and "Friendly." (For example, my impression is that the only method programmers had for evaluating Watson's "intelligence" was to see whether it was coming up with the same answers that a well-informed human would; the only way to evaluate Siri's "intelligence" was to evaluate its helpfulness to humans.) "Intelligent" often ends up getting defined as "prone to take actions that seem all-around 'good' to the programmer." So the concept of "Friendliness" may end up being naturally and subtly baked in to a successful AGI effort.

The bottom line is that we know very little about the course of future artificial intelligence. I believe that the probability that SI's concept of "Friendly" vs. "Unfriendly" goals ends up seeming essentially nonsensical, irrelevant and/or unimportant from the standpoint of the relevant future is over 90%.

Other objections to SI's views

There are other debates about the likelihood of SI's work being relevant/helpful; for example,

It isn't clear whether the development of AGI is imminent enough to be relevant, or whether other risks to humanity are closer.
It isn't clear whether AGI would be as powerful as SI's views imply. (I discussed this briefly in Karnofsky/Tallinn 2011.)
It isn't clear whether even an extremely powerful UFAI would choose to attack humans as opposed to negotiating with them. (I find it somewhat helpful to analogize UFAI-human interactions to human-mosquito interactions. Humans are enormously more intelligent than mosquitoes; humans are good at predicting, manipulating, and destroying mosquitoes; humans do not value mosquitoes' welfare; humans have other goals that mosquitoes interfere with; humans would like to see mosquitoes eradicated at least from certain parts of the planet. Yet humans haven't accomplished such eradication, and it is easy to imagine scenarios in which humans would prefer honest negotiation and trade with mosquitoes to any other arrangement, if such negotiation and trade were possible.)

Unlike the three objections I focus on, these other issues have been discussed a fair amount, and if these other issues were the only objections to SI's arguments I would find SI's case to be strong (i.e., I would find its scenario likely enough to warrant investment).

Wrapup

I believe the most likely future scenarios are the ones we haven't thought of, and that the most likely fate of the sort of theory SI ends up developing is irrelevance.
I believe that unleashing an all-powerful "agent AGI" (without the benefit of experimentation) would very likely result in a UFAI-like outcome, no matter how carefully the "agent AGI" was designed to be "Friendly." I see SI as encouraging (and aiming to take) this approach.
I believe that the standard approach to developing software results in "tools," not "agents," and that tools (while dangerous) are much safer than agents. A "tool mode" could facilitate experiment-informed progress toward a safe "agent," rather than needing to get "Friendliness" theory right without any experimentation.
Therefore, I believe that the approach SI advocates and aims to prepare for is far more dangerous than the standard approach, so if SI's work on Friendliness theory affects the risk of human extinction one way or the other, it will increase the risk of human extinction. Fortunately I believe SI's work is far more likely to have no effect one way or the other.

For a long time I refrained from engaging in object-level debates over SI's work, believing that others are better qualified to do so. But after talking at great length to many of SI's supporters and advocates and reading everything I've been pointed to as relevant, I still have seen no clear and compelling response to any of my three major objections. As stated above, there are many possible responses to my objections, but SI's current arguments do not seem clear on what responses they wish to take and defend. At this point I am unlikely to form a positive view of SI's work until and unless I do see such responses, and/or SI changes its positions.

Is SI the kind of organization we want to bet on?

This part of the post has some risks. For most of GiveWell's history, sticking to our standard criteria - and putting more energy into recommended than non-recommended organizations - has enabled us to share our honest thoughts about charities without appearing to get personal. But when evaluating a group such as SI, I can't avoid placing a heavy weight on (my read on) the general competence, capability and "intangibles" of the people and organization, because SI's mission is not about repeating activities that have worked in the past. Sharing my views on these issues could strike some as personal or mean-spirited and could lead to the misimpression that GiveWell is hostile toward SI. But it is simply necessary in order to be fully transparent about why I hold the views that I hold.

Fortunately, SI is an ideal organization for our first discussion of this type. I believe the staff and supporters of SI would overwhelmingly rather hear the whole truth about my thoughts - so that they can directly engage them and, if warranted, make changes - than have me sugar-coat what I think in order to spare their feelings. People who know me and my attitude toward being honest vs. sparing feelings know that this, itself, is high praise for SI.

One more comment before I continue: our policy is that non-public information provided to us by a charity will not be published or discussed without that charity's prior consent. However, none of the content of this post is based on private information; all of it is based on information that SI has made available to the public.

There are several reasons that I currently have a negative impression of SI's general competence, capability and "intangibles." My mind remains open and I include specifics on how it could be changed.

Weak arguments. SI has produced enormous quantities of public argumentation, and I have examined a very large proportion of this information. Yet I have never seen a clear response to any of the three basic objections I listed in the previous section. One of SI's major goals is to raise awareness of AI-related risks; given this, the fact that it has not advanced clear/concise/compelling arguments speaks, in my view, to its general competence.
Lack of impressive endorsements. I discussed this issue in my 2011 interview with SI representatives and I still feel the same way on the matter. I feel that given the enormous implications of SI's claims, if it argued them well it ought to be able to get more impressive endorsements than it has.
I have been pointed to Peter Thiel and Ray Kurzweil as examples of impressive SI supporters, but I have not seen any on-record statements from either of these people that show agreement with SI's specific views, and in fact (based on watching them speak at Singularity Summits) my impression is that they disagree. Peter Thiel seems to believe that speeding the pace of general innovation is a good thing; this would seem to be in tension with SI's view that AGI will be catastrophic by default and that no one other than SI is paying sufficient attention to "Friendliness" issues. Ray Kurzweil seems to believe that "safety" is a matter of transparency, strong institutions, etc. rather than of "Friendliness." I am personally in agreement with the things I have seen both of them say on these topics. I find it possible that they support SI because of the Singularity Summit or to increase general interest in ambitious technology, rather than because they find "Friendliness theory" to be as important as SI does.

Clear, on-record statements from these two supporters, specifically endorsing SI's arguments and the importance of developing Friendliness theory, would shift my views somewhat on this point.
Resistance to feedback loops. I discussed this issue in my 2011 interview with SI representatives and I still feel the same way on the matter. SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments. This is a problem because of (a) its extremely ambitious goals (among other things, it seeks to develop artificial intelligence and "Friendliness theory" before anyone else can develop artificial intelligence); (b) its view of its staff/supporters as having unusual insight into rationality, which I discuss in a later bullet point.
SI's list of achievements is not, in my view, up to where it needs to be given (a) and (b). Yet I have seen no declaration that SI has fallen short to date and explanation of what will be changed to deal with it. SI's recent release of a strategic plan and monthly updates are improvements from a transparency perspective, but they still leave me feeling as though there are no clear metrics or goals by which SI is committing to be measured (aside from very basic organizational goals such as "design a new website" and very vague goals such as "publish more papers") and as though SI places a low priority on engaging people who are critical of its views (or at least not yet on board), as opposed to people who are naturally drawn to it.

I believe that one of the primary obstacles to being impactful as a nonprofit is the lack of the sort of helpful feedback loops that lead to success in other domains. I like to see groups that are making as much effort as they can to create meaningful feedback loops for themselves. I perceive SI as falling well short on this front. Pursuing more impressive endorsements and developing benign but objectively recognizable innovations (particularly commercially viable ones) are two possible ways to impose more demanding feedback loops. (I discussed both of these in my interview linked above.)
Apparent poorly grounded belief in SI's superior general rationality. Many of the things that SI and its supporters and advocates say imply a belief that they have special insights into the nature of general rationality, and/or have superior general rationality, relative to the rest of the population. (Examples here, here and here). My understanding is that SI is in the process of spinning off a group dedicated to training people on how to have higher general rationality.
Yet I'm not aware of anything I would consider compelling evidence that SI staff/supporters/advocates have any special insight into the nature of general rationality or that they have especially high general rationality.

I have been pointed to the Sequences on this point. The Sequences (which I have read the vast majority of) do not seem to me to be a demonstration or evidence of general rationality. They are about rationality; I find them very enjoyable to read; and there is very little they say that I disagree with (or would have disagreed with before I read them). However, they do not seem to demonstrate rationality on the part of the writer, any more than a series of enjoyable, not-obviously-inaccurate essays on the qualities of a good basketball player would demonstrate basketball prowess. I sometimes get the impression that fans of the Sequences are willing to ascribe superior rationality to the writer simply because the content seems smart and insightful to them, without making a critical effort to determine the extent to which the content is novel, actionable and important.

I endorse Eliezer Yudkowsky's statement, "Be careful … any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility." To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts, at least not as much as I would expect for people who have the sort of insight into rationality that makes it sensible for them to train others in it. I am open to other evidence that SI staff/supporters/advocates have superior general rationality, but I have not seen it.

Why is it a problem if SI staff/supporters/advocates believe themselves, without good evidence, to have superior general rationality? First off, it strikes me as a belief based on wishful thinking rather than rational inference. Secondly, I would expect a series of problems to accompany overconfidence in one's general rationality, and several of these problems seem to be actually occurring in SI's case:
- Insufficient self-skepticism given how strong its claims are and how little support its claims have won. Rather than endorsing "Others have not accepted our arguments, so we will sharpen and/or reexamine our arguments," SI seems often to endorse something more like "Others have not accepted their arguments because they have inferior general rationality," a stance less likely to lead to improvement on SI's part.
- Being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously.
- Paying insufficient attention to the limitations of the confidence one can have in one's untested theories, in line with my Objection 1.
Overall disconnect between SI's goals and its activities. SI seeks to build FAI and/or to develop and promote "Friendliness theory" that can be useful to others in building FAI. Yet it seems that most of its time goes to activities other than developing AI or theory. Its per-person output in terms of publications seems low. Its core staff seem more focused on Less Wrong posts, "rationality training" and other activities that don't seem connected to the core goals; Eliezer Yudkowsky, in particular, appears (from the strategic plan) to be focused on writing books for popular consumption. These activities seem neither to be advancing the state of FAI-related theory nor to be engaging the sort of people most likely to be crucial for building AGI.
A possible justification for these activities is that SI is seeking to promote greater general rationality, which over time will lead to more and better support for its mission. But if this is SI's core activity, it becomes even more important to test the hypothesis that SI's views are in fact rooted in superior general rationality - and these tests don't seem to be happening, as discussed above.
Theft. I am bothered by the 2009 theft of $118,803.00 (as against a $541,080.00 budget for the year). In an organization as small as SI, it really seems as though theft that large relative to the budget shouldn't occur and that it represents a major failure of hiring and/or internal controls.
In addition, I have seen no public SI-authorized discussion of the matter that I consider to be satisfactory in terms of explaining what happened and what the current status of the case is on an ongoing basis. Some details may have to be omitted, but a clear SI-authorized statement on this point with as much information as can reasonably be provided would be helpful.

A couple of positive observations to add context here:

I see significant positive qualities in many of the people associated with SI. I especially like what I perceive as their sincere wish to do whatever they can to help the world as much as possible, and the high value they place on being right as opposed to being conventional or polite. I have not interacted with Eliezer Yudkowsky but I greatly enjoy his writings.
I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years, particularly regarding the last couple of statements listed above. That said, SI is an organization and it seems reasonable to judge it by its organizational track record, especially when its new leadership is so new that I have little basis on which to judge these staff.

Wrapup

While SI has produced a lot of content that I find interesting and enjoyable, it has not produced what I consider evidence of superior general rationality or of its suitability for the tasks it has set for itself. I see no qualifications or achievements that specifically seem to indicate that SI staff are well-suited to the challenge of understanding the key AI-related issues and/or coordinating the construction of an FAI. And I see specific reasons to be pessimistic about its suitability and general competence.

When estimating the expected value of an endeavor, it is natural to have an implicit "survivorship bias" - to use organizations whose accomplishments one is familiar with (which tend to be relatively effective organizations) as a reference class. Because of this, I would be extremely wary of investing in an organization with apparently poor general competence/suitability to its tasks, even if I bought fully into its mission (which I do not) and saw no other groups working on a comparable mission.

But if there's even a chance …

A common argument that SI supporters raise with me is along the lines of, "Even if SI's arguments are weak and its staff isn't as capable as one would like to see, their goal is so important that they would be a good investment even at a tiny probability of success."

I believe this argument to be a form of Pascal's Mugging and I have outlined the reasons I believe it to be invalid in two posts (here and here). There have been some objections to my arguments, but I still believe them to be valid. There is a good chance I will revisit these topics in the future, because I believe these issues to be at the core of many of the differences between GiveWell-top-charities supporters and SI supporters.

Regardless of whether one accepts my specific arguments, it is worth noting that the most prominent people associated with SI tend to agree with the conclusion that the "But if there's even a chance …" argument is not valid. (See comments on my post from Michael Vassar and Eliezer Yudkowsky as well as Eliezer's interview with John Baez.)

Existential risk reduction as a cause

I consider the general cause of "looking for ways that philanthropic dollars can reduce direct threats of global catastrophic risks, particularly those that involve some risk of human extinction" to be a relatively high-potential cause. It is on the working agenda for GiveWell Labs and we will be writing more about it.

However, I don't consider "Cause X is the one I care about and Organization Y is the only one working on it" to be a good reason to support Organization Y. For donors determined to donate within this cause, I encourage you to consider donating to a donor-advised fund while making it clear that you intend to grant out the funds to existential-risk-reduction-related organizations in the future. (One way to accomplish this would be to create a fund with "existential risk" in the name; this is a fairly easy thing to do and one person could do it on behalf of multiple donors.)

For one who accepts my arguments about SI, I believe withholding funds in this way is likely to be better for SI's mission than donating to SI - through incentive effects alone (not to mention my specific argument that SI's approach to "Friendliness" seems likely to increase risks).

How I might change my views

My views are very open to revision.

However, I cannot realistically commit to read and seriously consider all comments posted on the matter. The number of people capable of taking a few minutes to write a comment is sufficient to swamp my capacity. I do encourage people to comment and I do intend to read at least some comments, but if you are looking to change my views, you should not consider posting a comment to be the most promising route.

Instead, what I will commit to is reading and carefully considering up to 50,000 words of content that are (a) specifically marked as SI-authorized responses to the points I have raised; (b) explicitly cleared for release to the general public as SI-authorized communications. In order to consider a response "SI-authorized and cleared for release," I will accept explicit communication from SI's Executive Director or from a majority of its Board of Directors endorsing the content in question. After 50,000 words, I may change my views and/or commit to reading more content, or (if I determine that the content is poor and is not using my time efficiently) I may decide not to engage further. SI-authorized content may improve or worsen SI's standing in my estimation, so unlike with comments, there is an incentive to select content that uses my time efficiently. Of course, SI-authorized content may end up including excerpts from comment responses to this post, and/or already-existing public content.

I may also change my views for other reasons, particularly if SI secures more impressive achievements and/or endorsements.

One more note: I believe I have read the vast majority of the Sequences, including the AI-foom debate, and that this content - while interesting and enjoyable - does not have much relevance for the arguments I've made.

Again: I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.

Acknowledgements

Thanks to the following people for reviewing a draft of this post and providing thoughtful feedback (this of course does not mean they agree with the post or are responsible for its content): Dario Amodei, Nick Beckstead, Elie Hassenfeld, Alexander Kruel, Tim Ogden, John Salvatier, Jonah Sinick, Cari Tuna, Stephanie Wykstra.


comment by lukeprog · 2012-05-10T21:24:19.513Z · LW(p) · GW(p)

Update: My full response to Holden is now here.

As Holden said, I generally think that Holden's objections for SI "are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI)," and we are working hard to fix both categories of issues.

In this comment I would merely like to argue for one small point: that the Singularity Institute is undergoing comprehensive changes — changes which I believe to be improvements that will help us to achieve our mission more efficiently and effectively.

Holden wrote:

I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years...

Louie Helm was hired as Director of Development in September 2011. I was hired as a Research Fellow that same month, and made Executive Director in November 2011. Below are some changes made since September. (Pardon the messy presentation: LW cannot correctly render tables in comments.)

SI before Sep. 2011: Very few peer-reviewed research publications.
SI today: More peer-reviewed publications coming in 2012 than in all past years combined. Additionally, I alone have a dozen papers in development, for which I am directing every step of research and writing, and will write the final draft, but am collaborating with remote researchers so as to put in only 5%-20% of the total hours required myself.

SI before Sep. 2011: No donor database / a very broken one.
SI today: A comprehensive donor database.

SI before Sep. 2011: Nearly all work performed directly by SI staff.
SI today: Most work outsourced to remote collaborators so that SI staff can focus on the things that only they can do.

SI before Sep. 2011: No strategic plan.
SI today: A strategic plan developed with input from all SI staff, and approved by the Board.

SI before Sep. 2011: Very little communication about what SI is doing.
SI today: Monthly progress reports, plus three Q&As with Luke about SI research and organizational development.

SI before Sep. 2011: No list of the research problems SI is working on.
SI today: A long, fully-referenced list of research problems SI is working on.

SI before Sep. 2011: Very little direct management of staff and projects.
SI today: Luke monitors all projects and staff work, and meets regularly with each staff member.

SI before Sep. 2011: Almost no detailed tracking of the expense of major SI projects (e.g. Summit, papers, etc.). The sole exception seems to be that Amy was tracking the costs of the 2011 Summit in NYC.
SI today: Detailed tracking of the expense of major SI projects for which this is possible (Luke has a folder in Google docs for these spreadsheets, and the summary spreadsheet is shared with the Board).

SI before Sep. 2011: No staff worklogs.
SI today: All staff members share their worklogs with Luke; Luke shares his worklog with all staff plus the Board.

SI before Sep. 2011: Best practices not followed for bookkeeping/accounting; accountant's recommendations ignored.
SI today: Meetings with consultants about bookkeeping/accounting; currently working with our accountant to implement best practices and find a good bookkeeper.

SI before Sep. 2011: Staff largely separated, many of them not well-connected to the others.
SI today: After a dozen or so staff dinners, staff much better connected, more of a team.

SI before Sep. 2011: Want to see the basics of AI Risk explained in plain language? Read The Sequences (more than a million words) or this academic book chapter by Yudkowsky.
SI today: Want to see the basics of AI Risk explained in plain language? Read Facing the Singularity (now in several languages, with more being added) or listen to the podcast version.

SI before Sep. 2011: Very few resources created to support others' research in AI risk.
SI today: IntelligenceExplosion.com, Friendly-AI.com, list of open problems in the field, with references, AI Risk Bibliography 2012, annotated list of journals that may publish papers on AI risk, a partial history of AI risk research, and a list of forthcoming and desired articles on AI risk.

SI before Sep. 2011: A hard-to-navigate website with much outdated content.
SI today: An entirely new website that is easier to navigate and has much new content (nearly complete; should launch in May or June).

SI before Sep. 2011: So little monitoring of funds that $118k was stolen in 2010 before SI noticed. (Note that we have won stipulated judgments to get much of this back, and have upcoming court dates to argue for stipulated judgments to get the rest back.)
SI today: Our bank accounts have been consolidated, with 3-4 people regularly checking over them.

SI before Sep. 2011: SI publications exported straight to PDF from Word or Google Docs, sometimes without even author names appearing.
SI today: All publications being converted into a slick, usable LaTeX template (example), with all references checked and put into a central BibTeX file.

SI before Sep. 2011: No write-up of our major public technical breakthrough (TDT) using the mainstream format and vocabulary comprehensible to most researchers in the field (this is what we have at the moment).
SI today: Philosopher Rachael Briggs, whose papers on decision theory have been twice selected for the Philosopher's Annual, has been contracted to write an explanation of TDT and publish it in one of a select few leading philosophy journals.

SI before Sep. 2011: No explicit effort made toward efficient use of SEO or our (free) Google Adwords.
SI today: Highly optimized use of Google Adwords to direct traffic to our sites; currently working with SEO consultants to improve our SEO (of course, the new website will help).

(Just to be clear, I think this list shows not that "SI is looking really great!" but instead that "SI is rapidly improving and finally reaching a 'basic' level of organizational function.")

Replies from: lukeprog, None, Eliezer_Yudkowsky, ghf, siodine, JoshuaFox, army1987, Pablo_Stafforini, aceofspades

↑ comment by lukeprog · 2012-05-11T02:54:28.896Z · LW(p) · GW(p)

...which is not to say, of course, that things were not improving before September 2011. It's just that the improvements have accelerated quite a bit since then.

For example, Amy was hired in December 2009 and is largely responsible for these improvements:

Built a "real" Board and officers; launched monthly Board meetings in February 2010.
Began compiling monthly financial reports in December 2010.
Began tracking Summit expenses and seeking Summit sponsors.
Played a major role in canceling many programs and expenses that were deemed low ROI.

↑ comment by [deleted] · 2012-05-11T04:25:54.689Z · LW(p) · GW(p)

Our bank accounts have been consolidated, with 3-4 people regularly checking over them.

In addition to reviews, should SI implement a two-man rule for manipulating large quantities of money? (For example, over 5k, over 10k, etc.)

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-11T05:00:20.014Z · LW(p) · GW(p)

And note that these improvements would not and could not have happened without more funding than the level of previous years - if, say, everyone had been waiting to see these kinds of improvements before funding.

Replies from: lukeprog, ghf, John_Maxwell_IV

↑ comment by lukeprog · 2012-05-11T08:13:02.053Z · LW(p) · GW(p)

note that these improvements would not and could not have happened without more funding than the level of previous years

Really? That's not obvious to me. Of course you've been around for all this and I haven't, but here's what I'm seeing from my vantage point...

Recent changes that cost very little:

Donor database
Strategic plan
Monthly progress reports
A list of research problems SI is working on (it took me 16 hours to write)
IntelligenceExplosion.com, Friendly-AI.com, AI Risk Bibliography 2012, annotated list of journals that may publish papers on AI risk, a partial history of AI risk research, and a list of forthcoming and desired articles on AI risk (each of these took me only 10-25 hours to create)
Detailed tracking of the expenses for major SI projects
Staff worklogs
Staff dinners (or something that brought staff together)
A few people keeping their eyes on SI's funds so theft would be caught sooner
Optimization of Google Adwords

Stuff that costs less than some other things SI had spent money on, such as funding Ben Goertzel's AGI research or renting downtown Berkeley apartments for the later visiting fellows:

Research papers
Management of staff and projects
Rachael Briggs' TDT write-up
Best-practices bookkeeping/accounting
New website
LaTeX template for SI publications; references checked and then organized with BibTeX
SEO

Do you disagree with these estimates, or have I misunderstood what you're claiming?

Replies from: David_Gerard, Eliezer_Yudkowsky

↑ comment by David_Gerard · 2012-05-12T18:37:08.031Z · LW(p) · GW(p)

A lot of charities go through this pattern before they finally work out how to transition from a board-run/individual-run tax-deductible band of conspirators to being a professional staff-run organisation tuned to doing the particular thing they do. The changes required seem simple and obvious in hindsight, but it's a common pattern for it to take years, so SIAI has been quite normal, or at the very least not been unusually dumb.

(My evidence is seeing this pattern close-up in the Wikimedia Foundation, Wikimedia UK (the first attempt at which died before managing it, the second making it through barely) and the West Australian Music Industry Association, and anecdotal evidence from others. Everyone involved always feels stupid at having taken years to achieve the retrospectively obvious. I would be surprised if this aspect of the dynamics of nonprofits had not been studied.)

edit: Luke's recommendation of The Nonprofit Kit For Dummies looks like precisely the book all the examples I know of needed to have someone throw at them before they even thought of forming an organisation to do whatever it is they wanted to achieve.

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-12T04:04:19.165Z · LW(p) · GW(p)

Things that cost money:

Amy Willey
Luke Muehlhauser
Louie Helm
CfAR
trying things until something worked

Replies from: lukeprog, lukeprog

↑ comment by lukeprog · 2012-05-14T10:07:06.118Z · LW(p) · GW(p)

I don't think this response supports your claim that these improvements "would not and could not have happened without more funding than the level of previous years."

I know your comment is very brief because you're busy at minicamp, but I'll reply to what you wrote, anyway: Someone of decent rationality doesn't just "try things until something works." Moreover, many of the things on the list of recent improvements don't require an Amy, a Luke, or a Louie.

I don't even have past management experience. As you may recall, I had significant ambiguity aversion about the prospect of being made Executive Director, but as it turned out, the solution to almost every problem X has been (1) read what the experts say about how to solve X, (2) consult with people who care about your mission and have solved X before, and (3) do what they say.

When I was made Executive Director and phoned our Advisors, most of them said "Oh, how nice to hear from you! Nobody from SingInst has ever asked me for advice before!"

That is the kind of thing that makes me want to say that SingInst has "tested every method except the method of trying."

Donor database, strategic plan, staff worklogs, bringing staff together, expenses tracking, funds monitoring, basic management, best-practices accounting/bookkeeping... these are all literally from the Nonprofits for Dummies book.

Maybe these things weren't done for 11 years because SI's decision-makers did make good plans but failed to execute them due to the usual defeaters. But that's not the history I've heard, except that some funds monitoring was insisted upon after the large theft, and a donor database was sorta-kinda-not-really attempted at one point. The history I've heard is that SI failed to make these kinds of plans in the first place, failed to ask advisors for advice, failed to read Nonprofits for Dummies, and so on.

Money wasn't the barrier to doing many of those things, it was a gap in general rationality.

I will agree, however, that what is needed now is more money. We are rapidly becoming a more robust and efficient and rational organization, stepping up our FAI team recruiting efforts, stepping up our transparency and accountability efforts, and stepping up our research efforts, and all those things cost money.

At the risk of being too harsh… When I began to intern with the Singularity Institute in April 2011, I felt uncomfortable suggesting that people donate to SingInst, because I could see it from the inside and it wasn't pretty. (And I'm not the only SIer who felt this way at the time.)

But now I do feel comfortable asking people to donate to SingInst. I'm excited about our trajectory and our team, and if we can raise enough support then we might just have a shot at winning after all.

Replies from: Eliezer_Yudkowsky, MarkusRamikin, JoshuaZ, Benquo, Steve_Rayhawk, David_Gerard, David_Gerard

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-21T04:29:45.237Z · LW(p) · GW(p)

Luke has just told me (personal conversation) that what he got from my comment was, "SIAI's difficulties were just due to lack of funding" which was not what I was trying to say at all. What I was trying to convey was more like, "I didn't have the ability to run this organization, and knew this - people who I hoped would be able to run the organization, while I tried to produce in other areas (e.g. turning my back on everything else to get a year of FAI work done with Marcello or writing the Sequences) didn't succeed in doing so either - and the only reason we could hang on long enough to hire Luke was that the funding was available nonetheless and in sufficient quantity that we could afford to take risks like paying Luke to stay on for a while, well before we knew he would become Executive Director".

Replies from: Will_Sawin

↑ comment by Will_Sawin · 2012-06-12T05:23:10.888Z · LW(p) · GW(p)

Does Luke disagree with this clarified point? I do not find a clear indicator in this conversation.

Replies from: lukeprog

↑ comment by lukeprog · 2013-08-28T19:40:42.474Z · LW(p) · GW(p)

Update: I came out of a recent conversation with Eliezer with a higher opinion of Eliezer's general rationality, because several things that had previously looked to me like unforced, foreseeable mistakes by Eliezer now look to me more like non-mistakes or not-so-foreseeable mistakes.

↑ comment by MarkusRamikin · 2012-05-14T15:41:32.235Z · LW(p) · GW(p)

You're allowed to say these things on the public Internet?

I just fell in love with SI.

Replies from: lukeprog, shminux, TheOtherDave

↑ comment by lukeprog · 2012-05-26T00:33:50.957Z · LW(p) · GW(p)

You're allowed to say these things on the public Internet?

Well, at our most recent board meeting I wasn't fired, reprimanded, or even questioned for making these comments, so I guess I am. :)

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-05-26T06:00:43.342Z · LW(p) · GW(p)

Not even funny looks? ;)

↑ comment by Shmi (shminux) · 2012-05-14T18:04:43.800Z · LW(p) · GW(p)

I just fell in love with SI.

It's Luke you should have fallen in love with, since he is the one turning things around.

Replies from: wedrifid, None

↑ comment by wedrifid · 2012-05-26T02:24:14.790Z · LW(p) · GW(p)

It's Luke you should have fallen in love with, since he is the one turning things around.

On the other hand I can count with one hand the number of established organisations I know of that would be sociologically capable of ceding power, status and control to Luke the way SingInst did. They took an untrained intern with essentially zero external status from past achievements and affiliations and basically decided to let him run the show (at least in terms of publicly visible initiatives). It is clearly the right thing for SingInst to do and admittedly Luke is very tall and has good hair which generally gives a boost when it comes to such selections - but still, making the appointment goes fundamentally against normal human behavior.

(Where I say "count with one hand" I am not including the use of any digits thereupon. I mean one.)

Replies from: Matt_Simpson

↑ comment by Matt_Simpson · 2012-07-19T19:05:00.965Z · LW(p) · GW(p)

...and admittedly Luke is very tall and has good hair which generally gives a boost when it comes to such selections...

It doesn't matter that I completely understand why this phrase was included, I still found it hilarious in a network sitcom sort of way.

↑ comment by [deleted] · 2012-05-14T19:58:32.512Z · LW(p) · GW(p)

Consider the implications in light of HoldenKarnofsky's critique of SI's pretensions to high rationality.

Rationality is winning.
SI, at the same time as it was claiming extraordinary rationality, was behaving in ways that were blatantly irrational.
Although this is supposedly due to "the usual causes," rationality (winning) subsumes overcoming akrasia.
HoldenKarnofsky is correct that SI made claims for its own extraordinary rationality at a time when its leaders weren't rational.
Further: why should anyone give SI credibility today—when it stands convicted of self-serving misrepresentation in the recent past?

Replies from: thomblake, ciphergoth, shminux

↑ comment by thomblake · 2012-05-14T20:03:44.715Z · LW(p) · GW(p)

As a minor note, observe that claims of extraordinary rationality do not necessarily contradict claims of irrationality. The sanity waterline is very low.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-14T21:12:55.166Z · LW(p) · GW(p)

Do you mean to imply in context here that the organizational management of SIAI at the time under discussion was above average for a nonprofit organization? Or are you just making a more general statement that a system can be irrational while demonstrating above average rationality? I certainly agree with the latter.

Replies from: ciphergoth, thomblake

↑ comment by Paul Crowley (ciphergoth) · 2012-05-15T06:30:46.018Z · LW(p) · GW(p)

Are you comparing it to the average among nonprofits started, or nonprofits extant? I would guess that it was well below average for extant nonprofits, but about or slightly above average for started nonprofits. I'd guess that most nonprofits are started by people who don't know what they're doing and don't know what they don't know, and that SI probably did slightly better because the people who were being a bit stupid were at least very smart, which can help. However, I'd guess that most such nonprofits don't live long because they don't find a Peter Thiel to keep them alive.

Replies from: David_Gerard, TheOtherDave, private_messaging

↑ comment by David_Gerard · 2012-05-16T11:07:48.227Z · LW(p) · GW(p)

Your assessment looks about right to me. I have considerable experience of averagely-incompetent nonprofits, and SIAI looks normal to me. I am strongly tempted to grab that "For Dummies" book and, if it's good, start sending copies to people ...

↑ comment by TheOtherDave · 2012-05-15T12:44:48.271Z · LW(p) · GW(p)

In the context of thomblake's comment, I suppose nonprofits started is the proper reference class.

↑ comment by private_messaging · 2012-05-16T11:39:37.413Z · LW(p) · GW(p)

I don't see the point of comparing to average nonprofits. Average for-profits don't realize any profit, and average non-profits just waste money.

I would say SIAI is best paralleled to average started 'research' organization that is developing some free energy something, run by non-scientists, with some hired scientists as chaff.

Replies from: CronoDAS

↑ comment by CronoDAS · 2012-08-06T00:12:08.352Z · LW(p) · GW(p)

Sadly, I agree. Unless you look at it very closely, SIAI pattern-matches to "crackpots trying to raise money to fund their crackpottiness" fairly well. (What saves them is that their ideas are a lot better than the average crackpot.)

↑ comment by thomblake · 2012-05-15T13:51:19.993Z · LW(p) · GW(p)

Or are you just making a more general statement that a system can be irrational while demonstrating above average rationality?

Yes, this.

On an arbitrary scale I just made up, below 100 degrees of rationality is "irrational", and 0 degrees of rationality is "ordinary". 50 is extraordinarily rational and yet irrational.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-16T12:43:04.875Z · LW(p) · GW(p)

50 while you're thinking you're at 100 is being an extraordinary loser (overconfidence leads to big failures)

In any case this is just word play. Holden has seen many organizations that are/were more rational; that's probably what he means by lack of extraordinary rationality.

↑ comment by Paul Crowley (ciphergoth) · 2012-05-15T06:26:06.711Z · LW(p) · GW(p)

You've misread the post - Luke is saying that he doesn't think the "usual defeaters" are the most likely explanation.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-25T17:42:34.437Z · LW(p) · GW(p)

Correct.

↑ comment by Shmi (shminux) · 2012-05-14T20:10:09.724Z · LW(p) · GW(p)

Just to let you know, you've just made it on my list of the very few LW regulars I no longer bother replying to, due to the proven futility of any communications. In your case it is because you have a very evident ax to grind, which is incompatible with rational thought.

Replies from: metaphysicist

↑ comment by metaphysicist · 2012-05-14T20:34:42.099Z · LW(p) · GW(p)

This comment seems strange. Is having an ax to grind opposed to rationality? Then why does Eliezer Yudkowsky, for example, not hesitate to advocate for causes such as friendly AI? Doesn't he have an ax to grind? More of one really, since this ax chops trees of gold.

It would seem intellectual honesty would require that you say you reject discussions with people with an ax to grind, unless you grind a similar ax.

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-05-14T20:46:21.006Z · LW(p) · GW(p)

From http://www.usingenglish.com: "If you have an axe to grind with someone or about something, you have a grievance, a resentment and you want to get revenge or sort it out." One can hardly call the unacknowledged emotions of resentment and needing a revenge/retribution compatible with rationality. srdiamond piled a bunch of (partially correct but irrelevant in the context of my comment) negative statements about SI, making these emotions quite clear.

Replies from: metaphysicist

↑ comment by metaphysicist · 2012-05-14T21:17:48.643Z · LW(p) · GW(p)

That's a restrictive definition of "ax to grind," by the way; it's normally used to mean any special interest in the subject: "an ulterior often selfish underlying purpose" (Merriam-Webster's Collegiate Dictionary)

But I might as well accept your meaning for discussion purposes. If you detect unacknowledged resentment in srdiamond, don't you detect unacknowledged ambition in Eliezer Yudkowsky?

There's actually good reason for the broader meaning of "ax to grind." Any special stake is a bias. I don't think you can say that someone who you think acts out of resentment, like srdiamond, is more intractably biased than someone who acts out of other forms of narrow self-interest, which almost invariably applies when someone defends something he gets money from.

I don't think it's a rational method to treat people differently, as inherently less rational, when they seem resentful. It is only one of many difficult biases. Financial interest is probably more biasing. If you think the arguments are crummy, that's something else. But the motive--resentment or finances--should probably have little bearing on how a message is treated in serious discussion.

Replies from: JGWeissman, shminux

↑ comment by JGWeissman · 2012-05-14T21:58:11.831Z · LW(p) · GW(p)

don't you detect unacknowledged ambition in Eliezer Yudkowsky?

Eliezer certainly has a lot of ambition, but I am surprised to see an accusation that this ambition is unacknowledged.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-14T22:10:51.082Z · LW(p) · GW(p)

The impression I get from scanning their comment history is that metaphysicist means to suggest here that EY has ambitions he hasn't acknowledged (e.g., the ambition to make money without conventional credentials), not that he fails to acknowledge any of the ambitions he has.

↑ comment by Shmi (shminux) · 2012-05-14T22:10:22.766Z · LW(p) · GW(p)

I don't think it's a rational method to treat people differently, as inherently less rational, when they seem resentful.

Thank you for this analysis, it made me think more about my motivations and their validity. I believe that my decision to permanently disengage from discussions with some people is based on the futility of such discussions in the past, not on the specific reasons they are futile. At some point I simply decide to cut my losses.

There's actually good reason for the broader meaning of "ax to grind." Any special stake is a bias.

Indeed, present company not excluded. The question is whether it permanently prevents the ax-grinder from listening. EY, too, has his share of unacknowledged irrationalities, but both his status and his ability to listen and to provide insights make engaging him in a discussion a rewarding, if sometimes frustrating, experience.

I do not know why srdiamond's need to bash SI is so entrenched, and whether it can be remedied to a degree where he is once again worth talking to, so at this point it is instrumentally rational for me to avoid replying to him.

↑ comment by TheOtherDave · 2012-05-14T16:20:43.370Z · LW(p) · GW(p)

Well, all we really know is that he chose to. It may be that everyone he works with then privately berated him for it.
That said, I share your sentiment.
Actually, if SI generally endorses this sort of public "airing of dirty laundry," I encourage others involved in the organization to say so out loud.

↑ comment by JoshuaZ · 2012-05-14T15:44:03.795Z · LW(p) · GW(p)

The largest concern from reading this isn't really what it brings up in a management context, but what it says about the SI in general. Here is an area where there's real expertise and basic books that discuss well-understood methods, and they didn't do any of that. Given that, how likely should I think it is that, when the SI and mainstream AI people disagree, part of the problem may be the SI people not paying attention to basics?

Replies from: TheOtherDave, private_messaging, ciphergoth

↑ comment by TheOtherDave · 2012-05-14T16:17:42.240Z · LW(p) · GW(p)

(nods) The nice thing about general-purpose techniques for winning at life (as opposed to domain-specific ones) is that there's lots of evidence available as to how effective they are.

↑ comment by private_messaging · 2012-05-16T13:43:25.819Z · LW(p) · GW(p)

Precisely. For example of one existing base: the existing software that searches for solutions to engineering problems. Such as 'self improvement' via design of better chips. Works within narrowly defined field, to cull the search space. Should we expect state of the art software of this kind to be beaten by someone's contemporary paperclip maximizer? By how much?

Incredibly relevant to AI risk, but analysis can't be faked without really having technical expertise.

↑ comment by Paul Crowley (ciphergoth) · 2012-05-21T18:06:19.413Z · LW(p) · GW(p)

I doubt there's all that much of a correlation between these things to be honest.

↑ comment by Benquo · 2012-05-14T14:21:30.482Z · LW(p) · GW(p)

This makes me wonder... What "for dummies" books should I be using as checklists right now? Time to set a 5-minute timer and think about it.

Replies from: None, David_Gerard

↑ comment by [deleted] · 2012-05-26T23:38:50.136Z · LW(p) · GW(p)

What did you come up with?

Replies from: Benquo

↑ comment by Benquo · 2012-05-28T21:02:01.016Z · LW(p) · GW(p)

I haven't actually found the right books yet, but these are the things where I decided I should find some "for beginners" text. The important insight is that I'm allowed to use these books as skill/practice/task checklists or catalogues, rather than ever reading them all straight through.

General interest:

Career
Networking
Time management
Fitness

For my own particular professional situation, skills, and interests:

Risk management
Finance
Computer programming
SAS
Finance careers
Career change
Web programming
Research/science careers
Math careers
Appraising
Real Estate
UNIX

Replies from: grendelkhan

↑ comment by grendelkhan · 2013-03-28T14:43:27.275Z · LW(p) · GW(p)

For fitness, I'd found Liam Rosen's FAQ (the 'sticky' from 4chan's /fit/ board) to be remarkably helpful and information-dense. (Mainly, 'toning' doesn't mean anything, and you should probably be lifting heavier weights in a linear progression, but it's short enough to be worth actually reading through.)

↑ comment by David_Gerard · 2012-05-14T15:32:38.303Z · LW(p) · GW(p)

The For Dummies series is generally very good indeed. Yes.

↑ comment by Steve_Rayhawk · 2012-10-21T10:10:58.703Z · LW(p) · GW(p)

these are all literally from the Nonprofits for Dummies book. [...] The history I've heard is that SI [...]

failed to read Nonprofits for Dummies,

I remember that, when Anna was managing the fellows program, she was reading books of the "for dummies" genre and trying to apply them... it's just that, as it happened, the conceptual labels she accidentally happened to give to the skill deficits she was aware of were "what it takes to manage well" (i.e. "basic management") and "what it takes to be productive", rather than "what it takes to (help) operate a nonprofit according to best practices". So those were the subjects of the books she got. (And read, and practiced.) And then, given everything else the program and the organization was trying to do, there wasn't really any cognitive space left over to effectively notice the possibility that those wouldn't be the skills that other people afterwards would complain that nobody acquired and obviously should have known to. The rest of her budgeted self-improvement effort mostly went toward overcoming self-defeating emotional/social blind spots and motivated cognition. (And I remember Jasen's skill learning focus was similar, except with more of the emphasis on emotional self-awareness and less on management.)

failed to ask advisors for advice,

I remember Anna went out of her way to get advice from people who she already knew, who she knew to be better than her at various aspects of personal or professional functioning. And she had long conversations with supporters who she came into contact with for some other reasons; for those who had executive experience, I expect she would have discussed her understanding of SIAI's current strategies with them and listened to their suggestions. But I don't know how much she went out of her way to find people she didn't already have reasonably reliable positive contact with, to get advice from them.

I don't know much about the reasoning of most people not connected with the fellows program about the skills or knowledge they needed. I think Vassar was mostly relying on skills tested during earlier business experience, and otherwise was mostly preoccupied with the general crisis of figuring out how to quickly-enough get around the various hugely-saliently-discrepant-seeming-to-him psychological barriers that were causing everyone inside and outside the organization to continue unthinkingly shooting themselves in the feet with respect to this outside-evolutionary-context-problem of existential risk mitigation. For the "everyone outside's psychological barriers" side of that, he was at least successful enough to keep SIAI's public image on track to trigger people like David Chalmers and Marcus Hutter into meaningful contributions to and participation in a nascent Singularity-studies academic discourse. I don't have a good idea what else was on his mind as something he needed to put effort into figuring out how to do, in what proportions occupying what kinds of subjective effort budgets, except that in total it was enough to put him on the threshold of burnout. Non-profit best practices apparently wasn't one of those things though.

But the proper approach to retrospective judgement is generally a confusing question.

the kind of thing that makes me want to say [. . .]

The general pattern, at least post-2008, may have been one where the people who could have been aware of problems felt too metacognitively exhausted and distracted by other problems to think about learning what to do about them, and hoped that someone else with more comparative advantage would catch them, or that the consequences wouldn't be bigger than those of the other fires they were trying to put out.

strategic plan [...] SI failed to make these kinds of plans in the first place,

There were also several attempts at building parts of a strategy document or strategic plan, which together took probably 400-1800 hours. In each case, the people involved ended up determining, from how long it was taking, that, despite reasonable-seeming initial expectations, it wasn't on track to possibly become a finished presentable product soon enough to justify the effort. The practical effect of these efforts was instead mostly just a hard-to-communicate cultural shared understanding of the strategic situation and options -- how different immediate projects, forms of investment, or conditions in the world might feed into each other on different timescales.

expenses tracking, funds monitoring [...] some funds monitoring was insisted upon after the large theft

There was an accountant (who herself already cost like $33k/yr as the CFO, despite being split three ways with two other nonprofits) who would have been the one informally expected to have been monitoring for that sort of thing, and to have told someone about it if she saw something, out of the like three paid administrative slots at the time... well, yeah, that didn't happen.

I agree with a paraphrase of John Maxwell's characterization: "I'd rather hear Eliezer say 'thanks for funding us until we stumbled across some employees who are good at defeating their akrasia and [had one of the names of the things they were aware they were supposed to] care about [happen to be "]organizational best practices["]', because this seems like a better depiction of what actually happened." Note that this was most of the purpose of the Fellows program in the first place -- to create an environment where people could be introduced to the necessary arguments/ideas/culture and to help sort/develop those people into useful roles, including replacing existing management, since everyone knew there were people who would be better at their job than they were and wished such a person could be convinced to do it instead.

Replies from: Louie, John_Maxwell_IV

↑ comment by Louie · 2012-11-18T10:04:40.950Z · LW(p) · GW(p)

Note that this was most of the purpose of the Fellows program in the first place -- [was] to help sort/develop those people into useful roles, including replacing existing management

FWIW, I never knew the purpose of the VF program was to replace existing SI management. And I somewhat doubt that you knew this at the time, either. I think you're just imagining this retroactively given that that's what ended up happening. For instance, the internal point system used to score people in the VFs program had no points for correctly identifying organizational improvements and implementing them. It had no points for doing administrative work (besides cleaning up the physical house or giving others car rides). And it had no points for rising to management roles. It was all about getting karma on LW or writing conference papers. When I first offered to help with the organization directly, I was told I was "too competent" and that I should go do something more useful with my talent, like start another business... not "waste my time working directly at SI."

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-12-19T13:31:42.463Z · LW(p) · GW(p)

"I'd rather hear Eliezer say 'thanks for funding us until we stumbled across some employees who are good at defeating their akrasia and [had one of the names of the things they were aware they were supposed to] care about [happen to be "]organizational best practices["]', because this seems like a better depiction of what actually happened."

Seems like a fair paraphrase.

↑ comment by David_Gerard · 2012-05-26T23:32:43.316Z · LW(p) · GW(p)

This inspired me to make a blog post: You need to read Nonprofit Kit for Dummies.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-27T08:02:08.481Z · LW(p) · GW(p)

... which Eliezer has read and responded to, noting he did indeed read just that book in 2000 when he was founding SIAI. This suggests having someone of Luke's remarkable drive was in fact the missing piece of the puzzle.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-27T09:26:28.100Z · LW(p) · GW(p)

Fascinating! I want to ask "well, why didn't it take then?", but if I were in Eliezer's shoes I'd be finding this discussion almost unendurably painful right now, and it feels like what matters has already been established. And of course he's never been the person in charge of that sort of thing, so maybe he's not who we should be grilling anyway.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-27T10:22:17.859Z · LW(p) · GW(p)

Obviously we need How to be Lukeprog for Dummies. Luke appears to have written many fragments for this, of course.

Beating oneself up with hindsight bias is IME quite normal in this sort of circumstance, but not actually productive. Grilling the people who failed makes it too easy to blame them personally, when it's a pattern I've seen lots and lots, suggesting the problem is not a personal failing.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-27T11:23:11.935Z · LW(p) · GW(p)

Agreed entirely - it's definitely not a mark of a personal failing. What I'm curious about is how we can all learn to do better at the crucial rationalist skill of making use of the standard advice about prosaic tasks - which is manifestly a non-trivial skill.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-27T13:52:32.284Z · LW(p) · GW(p)

The Bloody Obvious For Dummies. If only common sense were!

From the inside (of a subcompetent charity - and I must note, subcompetent charities know they're subcompetent), it feels like there's all this stuff you're supposed to magically know about, and lots of "shut up and do the impossible" moments. And you do the small very hard things, in a sheer tour de force of remarkable effort. But it leads to burnout. Until the organisation makes it to competence and the correct paths are retrospectively obvious.

That actually reads to me like descriptions I've seen of the startup process.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-27T14:39:58.187Z · LW(p) · GW(p)

The problem is that there are two efficiencies/competences here, the efficiency as in doing the accounting correctly, which is relatively easy in comparison to the second: the efficiency as in actually doing relevant novel technical work that matters. The former you could get advice from some books, the latter you won't get any advice on, it's a harder problem, and typical level of performance is exactly zero (even for those who get the first part right). The difference in difficulties is larger than that between building a robot kit by following instructions vs designing a ground breaking new robot and making a billion dollars off it.

The best advice to the vast majority of startups is: dissolve the startup and get normal jobs, starting tomorrow. The best advice to all is to take a very good look at themselves knowing that the most likely conclusion should be "dissolve and get normal jobs". The failed startups I've seen so far were propelled by pure, unfounded belief in themselves (like in a movie where someone doesn't want to jump, another says yes you can do that!! then that person jumps, but rather than sending a positive message and jumping over and surviving, falls down to instant death, while the fire that the person was running away from just goes out). The successful startups, on the other hand, had very well founded belief in themselves (good track record, attainable goals), or started from a hobby project that became successful.

Replies from: TheOtherDave, David_Gerard

↑ comment by TheOtherDave · 2012-05-27T15:52:58.950Z · LW(p) · GW(p)

Judging from the success rate that VCs have at predicting successful startups, I conclude that the "pure unfounded belief on the one hand, well-founded belief on the other" metric is not easily applied to real organizations by real observers.

↑ comment by David_Gerard · 2012-05-27T15:18:25.925Z · LW(p) · GW(p)

Mm. This is why an incompetent nonprofit can linger for years: no-one is doing what they do, so they feel they still have to exist, even though they're not achieving much, and would have died already as a for-profit business. I am now suspecting that the hard part for a nonprofit is something along the lines of working out what the hell you should be doing to achieve your goal. (I would be amazed if there were not extensive written-up research in this area, though I don't know what it is.)

↑ comment by David_Gerard · 2012-05-14T15:30:18.997Z · LW(p) · GW(p)

That book looks like the basic solution to the pattern I outline here, and from your description, most people who have any public good they want to achieve should read it around the time they think of getting a second person involved.

↑ comment by lukeprog · 2012-07-15T22:57:25.916Z · LW(p) · GW(p)

You go to war with the army you have, not the army you might want.

Donald Rumsfeld

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-15T23:38:21.042Z · LW(p) · GW(p)

...this was actually a terrible policy in historical practice.

Replies from: Vaniver

↑ comment by Vaniver · 2012-07-16T00:16:19.824Z · LW(p) · GW(p)

That only seems relevant if the war in question is optional.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-07-16T02:09:44.680Z · LW(p) · GW(p)

Rumsfeld is speaking of the Iraq war. It was an optional war, the army turned out to be far understrength for establishing order, and they deliberately threw out the careful plans for preserving e.g. Iraqi museums from looting that had been drawn up by the State Department, due to interdepartmental rivalry.

This doesn't prove the advice is bad, but at the very least, Rumsfeld was just spouting off Deep Wisdom that he did not benefit from spouting; one would wish to see it spoken by someone who actually benefited from the advice, rather than someone who wilfully and wantonly underprepared for an actual war.

Replies from: Vaniver

↑ comment by Vaniver · 2012-07-16T02:27:10.608Z · LW(p) · GW(p)

just spouting off Deep Wisdom that he did not benefit from spouting

Indeed. The proper response, which is surely worth contemplation, would have been:

Victorious warriors win first and then go to war, while defeated warriors go to war first and then seek to win.

Sun Tzu

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-07-17T18:22:47.746Z · LW(p) · GW(p)

Victorious warriors win first and then go to war, while defeated warriors go to war first and then seek to win.

This is a circular definition, not advice.

Replies from: army1987, framsey, TheOtherDave

↑ comment by A1987dM (army1987) · 2012-07-18T16:05:14.497Z · LW(p) · GW(p)

I would naively read it as “don't start a fight unless you know you're going to win”.

↑ comment by framsey · 2012-07-17T18:36:08.545Z · LW(p) · GW(p)

If you read it literally. I think Sun Tzu is talking about the benefit of planning.

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-07-17T21:52:28.243Z · LW(p) · GW(p)

I'm guessing that something got lost in translation.

Replies from: framsey

↑ comment by framsey · 2012-07-17T22:39:04.639Z · LW(p) · GW(p)

In context: http://suntzusaid.com/book/4

I think the quote is an alternative translation of paragraph 15 in the link above:

"Thus it is that in war the victorious strategist only seeks battle after the victory has been won, whereas he who is destined to defeat first fights and afterwards looks for victory."

It has an associated commentary:

Ho Shih thus expounds the paradox: "In warfare, first lay plans which will ensure victory, and then lead your army to battle; if you will not begin with stratagem but rely on brute strength alone, victory will no longer be assured."

↑ comment by TheOtherDave · 2012-07-17T18:54:02.617Z · LW(p) · GW(p)

I don't see the circularity.
Just because a warrior is victorious doesn't necessarily mean they won before going to war; it might be instead that victorious warriors go to war first and then seek to win, and defeated warriors do the same thing.
Can you spell out the circularity?

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-07-17T22:00:33.877Z · LW(p) · GW(p)

Unless you interpret "win first" as "prepare for every eventuality, calculate the unbiased probability of winning and be comfortable with the odds when going to battle", "win first" can only be meaningfully applied in retrospect.

Replies from: thomblake, TheOtherDave

↑ comment by thomblake · 2012-07-18T16:25:26.541Z · LW(p) · GW(p)

I think you've stumbled upon the correct interpretation.

Sun Tzu was fond of making warfare about strategy and logistics rather than battles, so that one would only fight when victory is a foregone conclusion.

↑ comment by TheOtherDave · 2012-07-17T22:38:48.713Z · LW(p) · GW(p)

Ah, I see what you mean now.
Thanks for the clarification.

↑ comment by ghf · 2012-05-11T22:06:54.752Z · LW(p) · GW(p)

And note that these improvements would not and could not have happened without more funding than the level of previous years

Given the several-year lag between funding increases and the listed improvements, it appears that this was less a result of a prepared plan and more a process of underutilized resources attracting a mix of parasites (the theft) and talent (hopefully the more recent staff additions).

Which goes towards a critical question in terms of future funding: is SIAI primarily constrained in its mission by resources or competence?

Of course, the related question is: what is SIAI's mission? Someone donating primarily for AGI research might not count recent efforts (LW, rationality camps, etc) as improvements.

What should a potential donor expect from money invested into this organization going forward? Internally, what are your metrics for evaluation?

Edited to add: I think that the spin-off of the rationality efforts is a good step towards answering these questions.

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-05-11T05:07:40.206Z · LW(p) · GW(p)

This seems like a rather absolute statement. Knowing Luke, I'll bet he would've gotten some of it done even on a limited budget.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-11T06:08:58.966Z · LW(p) · GW(p)

Luke and Louie Helm are both on paid staff.

Replies from: John_Maxwell_IV

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-05-12T00:29:55.598Z · LW(p) · GW(p)

I'm pretty sure their combined salaries are lower than the cost of the summer fellows program that SI was sponsoring four or five years ago. Also, if you accept my assertion that Luke could find a way to do it on a limited budget, why couldn't somebody else?

Givewell is interested in finding charities that translate good intentions into good results. This requires that the employees of the charity have low akrasia, desire to learn about and implement organizational best practices, not suffer from dysrationalia, etc. I imagine that from Givewell's perspective, it counts as a strike against the charity if some of the charity's employees have a history of failing at any of these.

I'd rather hear Eliezer say "thanks for funding us until we stumbled across some employees who are good at defeating their akrasia and care about organizational best practices", because this seems like a better depiction of what actually happened. I don't get the impression SI was actively looking for folks like Louie and Luke.

Replies from: None

↑ comment by [deleted] · 2012-05-12T01:48:28.670Z · LW(p) · GW(p)

Yes to this. Eliezer's claim about the need for funding may suffer many of Luke's criticisms above. But usually the most important thing you need is talent and that does require funding.

↑ comment by ghf · 2012-05-11T22:38:10.640Z · LW(p) · GW(p)

My hope is that the upcoming deluge of publications will answer this objection, but for the moment, I am unclear as to the justification for the level of resources being given to SIAI researchers.

Additionally, I alone have a dozen papers in development, for which I am directing every step of research and writing, and will write the final draft, but am collaborating with remote researchers so as to put in only 5%-20% of the total hours required myself.

This level of freedom is the dream of every researcher on the planet. Yet, it's unclear why these resources should be devoted to your projects. While I strongly believe that the current academic system is broken, you are asking for a level of support granted to top researchers prior to having made any original breakthroughs yourself.

If you can convince people to give you that money, wonderful. But until you have made at least some serious advancement to demonstrate your case, donating seems like an act of faith.

It's impressive that you all have found a way to hack the system and get paid to develop yourselves as researchers outside of the academic system and I will be delighted to see that development bear fruit over the coming years. But, at present, I don't see evidence that the work being done justifies or requires that support.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T22:48:13.467Z · LW(p) · GW(p)

This level of freedom is the dream of every researcher on the planet. Yet, it's unclear why these resources should be devoted to your projects.

Because some people like my earlier papers and think I'm writing papers on the most important topic in the world?

It's impressive that you all have found a way to hack the system and get paid to develop yourselves as researchers outside of the academic system...

Note that this isn't uncommon. SI is far from the only think tank with researchers who publish in academic journals. Researchers at private companies do the same.

Replies from: ghf, Bugmaster, metaphysicist

↑ comment by ghf · 2012-05-11T23:15:03.978Z · LW(p) · GW(p)

First, let me say that, after re-reading, I think that my previous post came off as condescending/confrontational which was not my intent. I apologize.

Second, after thinking about this for a few minutes, I realized that some of the reason your papers seem so fluffy to me is that they argue what I consider to be obvious points. In my mind, of course we are likely "to develop human-level AI before 2100." Because of that, I may have tended to classify your work as outreach more than research.

But outreach is valuable. And, so that we can factor out the question of the independent contribution of your research, having people associated with SIAI with the publications/credibility to be treated as experts has gigantic benefits in terms of media multipliers (being the people who get called on for interviews, panels, etc). So, given that, I can see a strong argument for publication support being valuable to the overall organization goals regardless of any assessment of the value of the research.

Note that this isn't uncommon. SI is far from the only think tank with researchers who publish in academic journals. Researchers at private companies do the same.

My only point was that, in those situations, usually researchers are brought in with prior recognized achievements (or, unfortunately all too often, simply paper credentials). SIAI is bringing in people who are intelligent but unproven and giving them the resources reserved for top talent in academia or industry. As you've pointed out, one of the differences with SIAI is the lack of hoops to jump through.

Edit: I see you commented below that you view your own work as summarization of existing research and we agree on the value of that. Sorry that my slow typing speed left me behind the flow of the thread.

↑ comment by Bugmaster · 2012-05-11T22:53:44.772Z · LW(p) · GW(p)

Researchers at private companies do the same.

It's true at my company, at least. There are quite a few papers out there authored by the researchers at the company where I work. There are several good business reasons for a company to invest time into publishing a paper; positive PR is one of them.

↑ comment by metaphysicist · 2012-05-11T23:02:30.053Z · LW(p) · GW(p)

Because some people like my earlier papers and think I'm writing papers on the most important topic in the world?

But then you put your intellect at issue, and I think I'm entitled to opine that you lack the qualities of intellect that would make such recommendation credible. You're a budding scholar; a textbook writer at heart. You lack any of the originality of a thinker.

You confirm the lead poster's allegations that SIA staff are insular and conceited.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T23:11:55.421Z · LW(p) · GW(p)

I think I'm entitled to opine...

Of course you are. And, you may not be one of the people who "like my earlier papers."

You confirm the lead poster's allegations that SIA staff are insular and conceited.

Really? How? I commented earlier on LW (can't find it now) about how the kind of papers I write barely count as "original research" because for the most part they merely summarize and clarify the ideas of others. But as Beckstead says, there is a strong need for that right now.

For insights in decision theory and FAI theory, I suspect we'll have to look to somebody besides Luke Muehlhauser. We keep trying to hire such people but they keep saying "No." (I got two more "no"s just in the last 3 weeks.) Part of that may be due to the past and current state of the organization — and luckily, fixing that kind of thing is something I seem to have some skills with.

You're... a textbook writer at heart.

True, dat.

↑ comment by siodine · 2012-05-11T13:35:22.323Z · LW(p) · GW(p)

Isn't this very strong evidence in support of Holden's point about "Apparent poorly grounded belief in SI's superior general rationality" (excluding Luke, at least)? And especially this?

Replies from: lukeprog, lessdazed

↑ comment by lukeprog · 2012-05-11T20:13:20.034Z · LW(p) · GW(p)

This topic is something I've been thinking about lately. Do SIers tend to have superior general rationality, or do we merely escape a few particular biases? Are we good at rationality, or just good at "far mode" rationality (aka philosophy)? Are we good at epistemic but not instrumental rationality? (Keep in mind, though, that rationality is only a ceteris paribus predictor of success.)

Or, pick a more specific comparison. Do SIers tend to be better at general rationality than someone who can keep a small business running for 5 years? Maybe the tight feedback loops of running a small business are better rationality training than "debiasing interventions" can hope to be.

Of course, different people are more or less rational in different domains, at different times, in different environments.

This isn't an idle question about labels. My estimate of the scope and level of people's rationality in part determines how much I update from their stated opinion on something. How much evidence for Hypothesis X (about organizational development) is it when Eliezer gives me his opinion on the matter, as opposed to when Louie gives me his opinion on the matter? When Person B proposes to take on a totally new kind of project, I think their general rationality is a predictor of success — so, what is their level of general rationality?

Replies from: Bugmaster, TheOtherDave, siodine

↑ comment by Bugmaster · 2012-05-11T22:49:28.381Z · LW(p) · GW(p)

Are we good at epistemic but not instrumental rationality?

Holden implies (and I agree with him) that there's very little evidence at the moment to suggest that SI is good at instrumental rationality. As for epistemic rationality, how would we know? Is there some objective way to measure it? I personally happen to believe that if a person seems to take it as a given that he's great at epistemic rationality, this fact should count as evidence (however circumstantial) against him being great at epistemic rationality... but that's just me.

↑ comment by TheOtherDave · 2012-05-11T21:10:55.204Z · LW(p) · GW(p)

If you accept that your estimate of someone's "rationality" should depend on the domain, the environment, the time, the context, etc... and what you want to do is make reliable estimates of the reliability of their opinion, their chances of success, etc... it seems to follow that you should be looking for comparisons within a relevant domain, environment, etc.

That is, if you want to get opinions about hypothesis X about organizational development that serve as significant evidence, it seems the thing to do is to find someone who knows a lot about organizational development -- ideally, someone who has been successful at developing organizations -- and consult their opinions. How generally rational they are might be very relevant causally, or it might not, but is in either case screened off by their domain competence... and their domain competence is easier to measure than their general rationality.

So is their general rationality worth devoting resources to determining?

It seems this only makes sense if you have already (e.g.) decided to ask Eliezer and Louie for their advice, whether it's good evidence or not, and now you need to know how much evidence it is, and you expect the correct answer is different from the answer you'd get by applying the metrics you know about (e.g., domain familiarity and previously demonstrated relevant expertise).

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T21:55:52.957Z · LW(p) · GW(p)

I do spend a fair amount of time talking to domain experts outside of SI. The trouble is that the question of what we should do about thing X doesn't just depend on domain competence but also on thousands of details about the inner workings of SI and our mission that I cannot communicate to domain experts outside SI, but which Eliezer and Louie already possess.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-11T22:14:49.868Z · LW(p) · GW(p)

So it seems you have a problem in two domains (organizational development + SI internals) and different domain experts in both domains (outside domain experts + Eliezer/Louie), and need some way of cross-linking the two groups' expertise to get a coherent recommendation, and the brute-force solutions (e.g. get them all in a room together, or bring one group up to speed on the other's domain) are too expensive to be worth it. (Well, assuming the obstacle isn't that the details need to be kept secret, but simply that expecting an outsider to come up to speed on all of SI's local potentially relevant trivia simply isn't practical.)

Yes?

Yeah, that can be a problem.

In that position, for serious questions I would probably ask E/L for their recommendations and a list of the most relevant details that informed that decision, then go to outside experts with a summary of the competing recommendations and an expanded version of that list and ask for their input. If there's convergence, great. If there's divergence, iterate.

This is still an expensive approach, though, so I can see where a cheaper approximation for less important questions is worth having.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T22:18:53.862Z · LW(p) · GW(p)

Yes to all this.

↑ comment by siodine · 2012-05-11T23:08:47.169Z · LW(p) · GW(p)

In the world in which a varied group of intelligent and especially rational people are organizing to literally save humanity, I don't see the relatively trivial, but important, improvements you've made in a short period of time being made because they were made years ago. And I thought that already accounting for the points you've made.

I mean, the question this group should be asking themselves is "how can we best alter the future so as to navigate towards FAI?" So, how did they apparently miss something like opportunity cost? Why, for instance, has their salaries increased when they could've been using it to improve the foundation of their cause from which everything else follows?

(Granted, I don't know the history and inner workings of the SI, and so I could be missing some very significant and immovable hurdles, but I don't see that as very likely; at least, not as likely as Holden's scenario.)

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T23:18:25.605Z · LW(p) · GW(p)

I don't see the relatively trivial, but important, improvements you've made in a short period of time being made because they were made years ago. And I thought that already accounting for the points you've made.

I don't know what these sentences mean.

So, how did they apparently miss something like opportunity cost? Why, for instance, has their salaries increased when they could've been using it to improve the foundation of their cause from which everything else follows?

Actually, salary increases help with opportunity cost. At very low salaries, SI staff ends up spending lots of time and energy on general life cost-saving measures that distract us from working on x-risk reduction. And our salaries are generally still pretty low. I have less than $6k in my bank accounts. Outsourcing most tasks to remote collaborators also helps a lot with opportunity cost.

Replies from: siodine, None, None

↑ comment by siodine · 2012-05-12T00:01:50.875Z · LW(p) · GW(p)

I don't know what these sentences mean.

People are more rational in different domains, environments, and so on.
The people at SI may have poor instrumental rationality while being adept at epistemic rationality.
Being rational doesn't necessarily mean being successful.

I accept all those points, and yet I still see the Singularity Institute having made the improvements that you've made since being hired before you were hired if they have superior general rationality. That is, you wouldn't have that list of relatively trivial things to brag about because someone else would have recognized the items on that list as important and got them done somehow (ignore any negative connotations--they're not intended).

For instance, I don't see a varied group of people with superior general rationality not discovering or just not outsourcing work they don't have a comparative advantage in (i.e., what you've done). That doesn't look like just a failure in instrumental rationality, or just rationality operating on a different kind of utility function, or just a lack of domain specific knowledge.

The excuses available to a person acting in a way that's non-traditionally rational are less convincing when you apply them to a group.

Actually, salary increases help with opportunity cost. At very low salaries, SI staff ends up spending lots of time and energy on general life cost-saving measures that distract us from working on x-risk reduction. And our salaries are generally still pretty low. I have less than $6k in my bank accounts.

No, I get that. But that still doesn't explain away the higher salaries like EY's 80k/year and its past upwards trend. I mean, these higher paid people are the most committed to the cause, right? I don't see those people taking a higher salary when they could use that money for more outsourcing, or another employee, or better employees, if they want to literally save humanity while being superior in general rationality. It's like a homeless person desperately in want of shelter trying to save enough for an apartment and yet buying meals at some restaurant.

Outsourcing most tasks to remote collaborators also helps a lot with opportunity cost.

That's the point I was making, why wasn't that done earlier? How did these people apparently miss out on opportunity cost? (And I'm just using outsourcing as an example because it was one of the most glaring changes you made that I think should have probably been made much earlier.)

Replies from: lukeprog, Rain, komponisto

↑ comment by lukeprog · 2012-05-12T00:20:39.396Z · LW(p) · GW(p)

Right, I think we're saying the same thing, here: the availability of so much low-hanging fruit in organizational development as late as Sept. 2011 is some evidence against the general rationality of SIers. Eliezer seems to want to say it was all a matter of funding, but that doesn't make sense to me.

Now, on this:

I don't see those people taking a higher salary when they could use that money for more outsourcing, or another employee, or better employees, if they want to literally save humanity while being superior in general rationality.

For some reason I'm having a hard time parsing your sentences for unambiguous meaning, but if I may attempt to rephrase: "SIers wouldn't take any salaries higher than (say) $70k/yr if they were truly committed to the cause and good in general rationality, because they would instead use that money to accomplish other things." Is that what you're saying?

Replies from: Rain, siodine

↑ comment by Rain · 2012-05-12T00:29:53.205Z · LW(p) · GW(p)

I've heard the Bay Area is expensive, and previously pointed out that Eliezer earns more than I do, despite me being in the top 10 SI donors.

I don't mind, though; as has been pointed out, even thinking about muffins might be a question invoking existential risk calculations.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T00:39:54.925Z · LW(p) · GW(p)

despite me being in the top 10 SI donors

...and much beloved for it.

Yes, the Bay Area is expensive. We've considered relocating, but on the other hand the (by far) best two places for meeting our needs in HR and in physically meeting with VIPs are SF and NYC, and if anything NYC is more expensive than the Bay Area. We cut living expenses where we can: most of us are just renting individual rooms.

Also, of course, it's not like the Board could decide we should relocate to a charter city in Honduras and then all our staff would be able to just up and relocate. :)

(Rain may know all this; I'm posting it for others' benefit.)

Replies from: komponisto, TheOtherDave

↑ comment by komponisto · 2012-05-12T18:58:03.493Z · LW(p) · GW(p)

I think it's crucial that SI stay in the Bay Area. Being in a high-status place signals that the cause is important. If you think you're not taken seriously enough now, imagine if you were in Honduras...

Not to mention that HR is without doubt the single most important asset for SI. (Which is why it would probably be a good idea to pay more than the minimum cost of living.)

↑ comment by TheOtherDave · 2012-05-12T01:31:59.697Z · LW(p) · GW(p)

Out of curiosity only: what were the most significant factors that led you to reject telepresence options?

Replies from: David_Gerard, lukeprog

↑ comment by David_Gerard · 2012-05-12T18:02:44.485Z · LW(p) · GW(p)

FWIW, Wikimedia moved from Florida to San Francisco precisely for the immense value of being at the centre of things instead of the middle of nowhere (and yes, Tampa is the middle of nowhere for these purposes, even though it still has the primary data centre). Even paying local charity scale rather than commercial scale (there's a sort of cycle where WMF hires brilliant kids, they do a few years working at charity scale then go to Facebook/Google/etc for gobs of cash), being in the centre of things gets them staff and contacts they just couldn't get if they were still in Tampa. And yes, the question came up there pretty much the same as it's coming up here: why be there instead of remote? Because so much comes with being where things are actually happening, even if it doesn't look directly related to your mission (educational charity, AI research institute).

Replies from: komponisto

↑ comment by komponisto · 2012-05-12T19:00:32.677Z · LW(p) · GW(p)

FWIW, Wikimedia moved from Florida to San Francisco

I didn't know this, but I'm happy to hear it.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-12T19:04:58.762Z · LW(p) · GW(p)

The charity is still registered in Florida but the office is in SF. I can't find the discussion on a quick search, but all manner of places were under serious consideration - including the UK, which is a horrible choice for legal issues in so very many ways.

↑ comment by lukeprog · 2012-05-12T01:56:50.756Z · LW(p) · GW(p)

In our experience, monkeys don't work that way. It sounds like it should work, and then it just... doesn't. Of course we do lots of Skyping, but regular human contact turns out to be pretty important.

Replies from: TheOtherDave, HoverHell

↑ comment by TheOtherDave · 2012-05-12T02:04:01.998Z · LW(p) · GW(p)

(nods) Yeah, that's been my experience too, though I've often suspected that companies like Google probably have a lot of research on the subject lying around that might be informative.

Some friends of mine did some experimenting along these lines when doing distributed software development (in both senses) and were somewhat startled to realize that Dark Age of Camelot worked better for them as a professional conferencing tool than any of the professional conferencing tools their company had. They didn't mention this to their management.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-12T18:53:40.812Z · LW(p) · GW(p)

and were somewhat startled to realize that Dark Age of Camelot worked better for them as a professional conferencing tool than any of the professional conferencing tools their company had. They didn't mention this to their management.

I am reminded that Flickr started as a photo add-on for an MMORPG...

↑ comment by HoverHell · 2012-05-13T08:16:45.010Z · LW(p) · GW(p)

↑ comment by siodine · 2012-05-12T00:34:35.518Z · LW(p) · GW(p)

some evidence

Enough for you to agree with Holden on that point?

"SIers wouldn't take any salaries higher than (say) $70k/yr if they were truly committed to the cause and good in general rationality, because they would instead use that money to accomplish other things." Is that what you're saying?

Yes, but I wouldn't set a limit at a specific salary range; I'd expect them to give as much as they optimally could, because I assume they're more concerned with the cause than the money. (re the 70k/yr mention: I'd be surprised if that was anywhere near optimal)

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T00:46:18.405Z · LW(p) · GW(p)

Enough for you to agree with Holden on that point?

Probably not. He and I continue to dialogue in private about the point, in part to find the source of our disagreement.

Yes, but I wouldn't set a limit at a specific salary range; I'd expect them to give as much as they optimally could, because I assume they're more concerned with the cause than the money. (re the 70k/yr mention: I'd be surprised if that was anywhere near optimal)

I believe everyone except Eliezer currently makes between $42k/yr and $48k/yr — pretty low for the cost of living in the Bay Area.

Replies from: siodine, komponisto, metaphysicist

↑ comment by siodine · 2012-05-12T01:37:39.562Z · LW(p) · GW(p)

Probably not. He and I continue to dialogue in private about the point, in part to find the source of our disagreement.

So, if you disagree with Holden, I assume you think SIers have superior general rationality: why?

And I'm confident SIers will score well on rationality tests, but that looks like specialized rationality. I.e., you can avoid a bias but you can't avoid a failure in your achieving your goals. To me, the SI approach seems poorly leveraged. I expect more significant returns from simple knowledge acquisition. E.g., you want to become successful? YOU WANT TO WIN?! Great, read these textbooks on microeconomics, finance, and business. I think this is more the approach you take anyway.

I believe everyone except Eliezer currently makes between $42k/yr and $48k/yr — pretty low for the cost of living in the Bay Area.

That isn't as bad as I was thinking; I don't know if that's optimal, but it seems at least reasonable.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T01:47:10.564Z · LW(p) · GW(p)

I assume you think SIers have superior general rationality: why?

I'll avoid double-labor on this and wait to reply until my conversation with Holden is done.

I expect more significant returns from simple knowledge acquisition. E.g., you want to become successful? ...Great, read these textbooks on microeconomics, finance, and business. I think this is more the approach you take anyway.

Right. Exercise the neglected virtue of scholarship and all that.

Replies from: siodine

↑ comment by siodine · 2012-05-12T01:52:51.772Z · LW(p) · GW(p)

Right. Exercise the neglected virtue of scholarship and all that.

It's not that easy to dismiss; if it's as poorly leveraged as it looks relative to other approaches then you have little reason to be spreading and teaching SI's brand of specialized rationality (except for perhaps income).

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T01:55:17.733Z · LW(p) · GW(p)

I'm not dismissing it, I'm endorsing it and agreeing with you that it has been my approach ever since my first post on LW.

Replies from: siodine

↑ comment by siodine · 2012-05-12T02:10:31.432Z · LW(p) · GW(p)

Weird, I have this perception of SI being heavily invested in overcoming biases and epistemic rationality training to the detriment of relevant domain specific knowledge, but I guess that's wrong?

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T02:25:08.638Z · LW(p) · GW(p)

I'm lost again; I don't know what you're saying.

Replies from: siodine

↑ comment by siodine · 2012-05-12T15:58:27.848Z · LW(p) · GW(p)

I'm not dismissing it, I'm endorsing it and agreeing with you that it has been my approach ever since my first post on LW.

I wasn't talking about you; I was talking about SI's approach in spreading and training rationality. You (SI) have Yudkowsky writing books, you have rationality minicamps, you have lesswrong, you and others are writing rationality articles and researching the rationality literature, and so on.

That kind of rationality training, research, and message looks poorly leveraged in achieving your goals, is what I'm saying. Poorly leveraged for anyone trying to achieve goals. And at its most abstract, that's what rationality is, right? Achieving your goals.

So, I don't care if your approach was to acquire as much relevant knowledge as possible before dabbling in debiasing, bayes, and whatnot (i.e., prioritizing the most leveraged approach). I'm wondering why your approach doesn't seem to be SI's approach. I'm wondering why SI doesn't prioritize rationality training, research, and message by whatever is the most leveraged in achieving SI's goals. I'm wondering why SI doesn't spread the virtue of scholarship to the detriment of training debiasing and so on.

SI wants to raise the sanity waterline; is what SI is doing even near optimal for that? Knowing what SIers knew and trained for couldn't even get them to see an opportunity for trading in on opportunity cost for years; that is sad.

↑ comment by komponisto · 2012-05-12T02:04:06.251Z · LW(p) · GW(p)

(Disclaimer: the following comment should not be taken to imply that I myself have concluded that SI staff salaries should be reduced.)

I believe everyone except Eliezer currently makes between $42k/yr and $48k/yr — pretty low for the cost of living in the Bay Area.

I'll grant you that it's pretty low relative to other Bay Area salaries. But as for the actual cost of living, I'm less sure.

I'm not fortunate enough to be a Bay Area resident myself, but here is what the internet tells me:

After taxes, a $48,000/yr gross salary in California equates to a net of around $3000/month.
A 1-bedroom apartment in Berkeley and nearby places can be rented for around $1500/month. (Presumably, this is the category of expense where most of the geography-dependent high cost of living is contained.)
If one assumes an average spending of $20/day on food (typically enough to have at least one of one's daily meals at a restaurant), that comes out to about $600/month.
That leaves around $900/month for miscellaneous expenses, which seems pretty comfortable for a young person with no dependents.
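The arithmetic in the list above can be checked with a short sketch. The figures are the rough estimates quoted above, and the 30-day month is an assumption made here for illustration:

```python
# Rough monthly budget check, using the estimates from the list above.
net_monthly_income = 3000      # approximate take-home on $48,000/yr gross
rent = 1500                    # 1-bedroom apartment in Berkeley or nearby
food_per_day = 20
days_per_month = 30            # assumption: a 30-day month

food = food_per_day * days_per_month
remaining = net_monthly_income - rent - food

print(f"food: ${food}/month, left over: ${remaining}/month")
```

Changing any single estimate (for example, a higher rent) shifts the remainder one-for-one, so the conclusion is only as good as the inputs.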

So, if these numbers are right, it seems that this salary range is actually right about what the cost of living is. Of course, this calculation specifically does not include costs relating to signaling (via things such as choices of housing, clothing, transportation, etc.) that one has more money than necessary to live (and therefore isn't low-status). Depending on the nature of their job, certain SI employees may need, or at least find it distinctly advantageous for their particular duties, to engage in such signaling.

↑ comment by metaphysicist · 2012-05-12T00:55:14.888Z · LW(p) · GW(p)

I believe everyone except Eliezer currently makes between $42k/yr and $48k/yr — pretty low for the cost of living in the Bay Area.

Damn good for someone just out of college—without a degree!

Replies from: lukeprog, katydee, Davorak

↑ comment by lukeprog · 2012-05-12T01:03:44.321Z · LW(p) · GW(p)

The point is that we're consequentialists, and lowering salaries even further would save money (on salaries) but result in SI getting less done, not more — for the same reason that outsourcing fewer tasks would save money (on outsourcing) but cause us to get less done, not more.

Replies from: steven0461

↑ comment by steven0461 · 2012-05-12T02:19:10.664Z · LW(p) · GW(p)

result in SI getting less done

You say this as though it's obvious, but if I'm not mistaken, salaries used to be about 40% of what they are now, and while the higher salaries sound like they are making a major productivity difference, hiring 2.5 times as many people would also make a major productivity difference. (Though yes, obviously marginal hires would be lower in quality.)

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T02:34:02.985Z · LW(p) · GW(p)

I don't think salaries were ever as low as 40% of what they are now. When I came on board, most people were at $36k/yr.

To illustrate why lower salaries mean less stuff gets done: I've been averaging 60 hours per week, and I'm unusually productive. If I am paid less, that means that (to pick just one example from this week) I can't afford to take a taxi to and from the eye doctor, which means I spend 1.5 hrs each way changing buses to get there, and spend less time being productive on x-risk. That is totally not worth it. Future civilizations would look back on this decision as profoundly stupid.

Replies from: Will_Newsome

↑ comment by Will_Newsome · 2012-05-14T04:08:06.343Z · LW(p) · GW(p)

Pretty sure Anna and Steve Rayhawk had salaries around $20k/yr at some point while living in Silicon Valley.

I don't think that you're really responding to Steven's point. Yes, as Steven said, if you were paid less then clearly that would impose more costs on you, so ceteris paribus your getting paid less would be bad. But, as Steven said, the opportunity cost is potentially very high. You haven't made a rationally compelling case that the missed opportunity is "totally not worth it" or that heeding it would be "profoundly stupid", you've mostly just re-asserted your conclusion, contra Steven's objection. What are your arguments that this is the case? Note that I personally think it's highly plausible that $40-50k/yr is optimal, but as far as I can see you haven't yet listed any rationally compelling reasons to think so.

(This comment is a little bit sterner than it would have been if you hadn't emphatically asserted that conclusions other than your own would be "profoundly stupid" without first giving overwhelming justification for your conclusion. It is especially important to be careful about such apparent overconfidence on issues where one clearly has a personal stake in the matter.)

Replies from: steven0461, lukeprog

↑ comment by steven0461 · 2012-05-14T20:36:00.231Z · LW(p) · GW(p)

I will largely endorse Will's comment, then bow out of the discussion, because this appears to be too personal and touchy a topic for a detailed discussion to be fruitful.

↑ comment by lukeprog · 2012-05-14T08:05:24.761Z · LW(p) · GW(p)

Pretty sure Anna and Steve Rayhawk had salaries around $20k/yr at some point while living in Silicon Valley.

If so, I suspect they were burning through savings during this time or had some kind of cheap living arrangement that I don't have.

What are your arguments that [paying you less wouldn't be worth it]?

1. I couldn't really get by on less, so paying me less would cause me to quit the organization and do something else instead, which would cause much of this good stuff to probably not happen.
2. It's VERY hard for SingInst to purchase value as efficiently as by purchasing Luke-hours. At $48k/yr for 60 hrs/wk, I make $15.38/hr, and one Luke-hour is unusually productive for SingInst. Paying me less and thereby causing me to work fewer hours per week is a bad value proposition for SingInst.
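The hourly figure above can be reproduced with a minimal sketch, assuming all 52 weeks of the year are worked at 60 hours each (the comment does not state the number of weeks, so that part is an assumption):

```python
# Effective hourly rate implied by the salary and hours quoted above.
annual_salary = 48_000     # dollars per year
hours_per_week = 60
weeks_per_year = 52        # assumption: no unpaid weeks off

hourly_rate = annual_salary / (hours_per_week * weeks_per_year)

print(f"${hourly_rate:.2f}/hr")
```

With fewer working weeks the implied rate would be somewhat higher, so $15.38/hr is best read as a lower bound under these inputs.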

Or, as Eliezer put it:

paying me less would require me to do things that take up time and energy in order to get by with a smaller income. Then, assuming all goes well, future intergalactic civilizations would look back and think this was incredibly stupid; in much the same way that letting billions of person-containing brains rot in graves, and humanity allocating less than a million dollars per year to the Singularity Institute, would predictably look pretty stupid in retrospect. At Singularity Institute board meetings we at least try not to do things which will predictably make future intergalactic civilizations think we were being willfully stupid. That's all there is to it, and no more.

Replies from: ciphergoth, Bugmaster, SexyBayes

↑ comment by Paul Crowley (ciphergoth) · 2012-05-14T09:36:16.138Z · LW(p) · GW(p)

This seems to me unnecessarily defensive. I support the goals of SingInst, but I could never bring myself to accept the kind of salary cut you guys are taking in order to work there. Like every other human on the planet, I can't be accurately modelled with a utility function that places any value on far distant strangers; you can more accurately model what stranger-altruism I do show as purchase of moral satisfaction, though I do seek for such altruism to be efficient. SingInst should pay the salaries it needs to pay to recruit the kind of staff it needs to fulfil its mission; it's harder to recruit if staff are expected to be defensive about demanding market salaries for their expertise, with no more than a normal adjustment for altruistic work much as if they were working for an animal sanctuary.

Replies from: lukeprog, army1987

↑ comment by lukeprog · 2012-05-14T09:48:39.259Z · LW(p) · GW(p)

I could never bring myself to accept the... salary cut you guys are taking in order to work [at SI]... SingInst should pay the salaries it needs to pay to recruit the kind of staff it needs to fulfill its mission; it's harder to recruit if staff are expected to be defensive about demanding market salaries for their expertise...

Yes, exactly.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-14T10:07:05.931Z · LW(p) · GW(p)

So when I say "unnecessarily defensive", I mean that all the stuff about the cost of taxis is after-the-fact defensive rationalization; it can't be said about a single dollar you spend on having a life outside of SI. The truth is that even the best human rationalist in the world isn't going to agree to giving those up, and since you have to recruit humans, you'd best pay the sort of salary that is going to attract and retain them. That of course includes yourself.

The same goes for saying "move to the Honduras". Your perfectly utility-maximising AGIs will move to the Honduras, but your human staff won't; they want to live in places like the Bay Area.

↑ comment by A1987dM (army1987) · 2012-05-14T18:55:37.425Z · LW(p) · GW(p)

I could never bring myself to accept the kind of salary cut you guys are taking in order to work there

You know that the Bay Area is freakin' expensive, right?

Replies from: ciphergoth, thomblake, Eugine_Nier, katydee

↑ comment by Paul Crowley (ciphergoth) · 2012-05-15T06:15:11.274Z · LW(p) · GW(p)

Re-reading, the whole thing is pretty unclear!

As katydee and thomblake say, I mean that working for SingInst would mean a bigger reduction in my salary than I could currently bring myself to accept. If I really valued the lives of strangers as a utilitarian, the benefits to them of taking a salary cut would be so huge that it would totally outweigh the costs to me. But it looks like I only really place direct value on the short-term interests of myself and those close to me, and everything else is purchase of moral satisfaction. Happily, purchase of moral satisfaction can still save the world if it is done efficiently.

Since the labour pool contains only human beings, with no true altruistic utility maximizers, SingInst should hire and pay accordingly; the market shows that people will accept a lower salary for a job that directly does good, but not a vastly lower salary. It would increase SI-utility if Luke accepted a lower salary, but it wouldn't increase Luke-utility, and driving Luke away would cost a lot of SI-utility, so calling for it is in the end a cheap shot and a bad recommendation.

I live in London, which is also freaking expensive - but so are all the places I want to live. There's a reason people are prepared to pay more to live in these places.

↑ comment by thomblake · 2012-05-14T19:12:50.224Z · LW(p) · GW(p)

Hmm... Perhaps you don't know that "salary cut" above means taking much less money?

Replies from: army1987, ciphergoth

↑ comment by A1987dM (army1987) · 2012-05-15T12:00:54.296Z · LW(p) · GW(p)

I had missed the word cut. Damn it, I shouldn't be commenting while sleep-deprived!

↑ comment by Paul Crowley (ciphergoth) · 2012-05-15T06:22:10.410Z · LW(p) · GW(p)

Indeed. I guess "taking a cut" can sometimes mean "taking some of the money", so you could interpret this as meaning "I couldn't accept all that money", which as you say is the opposite of what I meant!

↑ comment by Eugine_Nier · 2012-05-15T04:28:36.122Z · LW(p) · GW(p)

So why not relocate SIAI somewhere with a more reasonable cost of living?

Replies from: katydee, TraderJoe

↑ comment by katydee · 2012-05-15T04:50:47.056Z · LW(p) · GW(p)

I think the standard answer is that the networking and tech industry connections available in the Bay Area are useful enough to SIAI to justify the high costs of operating there.

↑ comment by TraderJoe · 2012-05-17T15:20:35.401Z · LW(p) · GW(p)

[comment deleted]

↑ comment by katydee · 2012-05-14T19:10:01.759Z · LW(p) · GW(p)

Perhaps that's why he's saying he wouldn't be willing to live there on a low salary?

↑ comment by Bugmaster · 2012-05-14T22:37:02.400Z · LW(p) · GW(p)

I understand the point you're making regarding salaries, and for once I agree.

However, it's rather presumptuous of you (and/or Eliezer) to assume, implicitly, that our choices are limited to only two possibilities: "Support SIAI, save the world", and "Don't support SIAI, the world is doomed". I can envision many other scenarios, such as "Support SIAI, but their fears were overblown and you implicitly killed N children by not spending the money on them instead", or "Don't support SIAI, support some other organization instead because they'll have a better chance of success", etc.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-15T21:45:26.017Z · LW(p) · GW(p)

Where did we say all that?

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-15T22:01:31.542Z · LW(p) · GW(p)

In your comment above, you said:

...I can't afford to take a taxi to and from the eye doctor, which means I spend 1.5 hrs each way changing buses to get there, and spend less time being productive on x-risk. That is totally not worth it. Future civilizations would look back on this decision as profoundly stupid.

You also quoted Eliezer saying something similar.

This outlook implies strongly that whatever SIAI is doing is of such monumental significance that future civilizations will not only remember its name, but also reverently preserve every decision it made. You are also quite fond of saying that the work that SIAI is doing is tantamount to "saving the world"; and IIRC Eliezer once said that, if you have a talent for investment banking, you should make as much money as possible and then donate it all to SIAI, as opposed to any other charity.

This kind of grand rhetoric presupposes not only that the SIAI is correct in its risk assessment regarding AGI, but also that they are uniquely qualified to address this potentially world-ending problem, and that, over the ages, no one more qualified could possibly come along. All of this could be true, but it's far from a certainty, as your writing would seem to imply.

Replies from: lukeprog, ciphergoth, jacob_cannell

↑ comment by lukeprog · 2012-05-15T23:59:22.356Z · LW(p) · GW(p)

I'm not seeing how the above implies the thing you said:

[You assume] our choices are limited to only two possibilities: "Support SIAI, save the world", and "Don't support SIAI, the world is doomed".

(Note that I don't necessarily endorse things you report Eliezer as having said.)

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-16T21:21:09.943Z · LW(p) · GW(p)

You appear to be very confident that future civilizations will remember SIAI in a positive way, and care about its actions. If so, they must have some reason for doing so. Any reason would do, but the most likely reason is that SIAI will accomplish something so spectacularly beneficial that it will affect everyone in the far future. SIAI's core mission is to save the world from UFAI, so it's reasonable to assume that this is the highly beneficial effect that the SIAI will achieve.

I don't have a problem with this chain of events, just with your apparent confidence that a). it's going to happen in exactly that way, and b). your organization is the only one who is qualified to save the world in this specific fashion.

(EDIT: I forgot to say that, if we follow your reasoning to its conclusion, then you are indeed implying that donating as much money or labor as possible to SIAI is the only smart move for any rational agent.)

Note that I have no problem with your main statement, i.e. "lowering the salaries of SIAI members would bring us too much negative utility to compensate for the monetary savings". This kind of cost-benefit analysis is done all the time, and future civilizations rarely enter into it.

↑ comment by Paul Crowley (ciphergoth) · 2012-05-16T09:29:13.117Z · LW(p) · GW(p)

Well no, of course it's not a certainty. All efforts to make a difference are decisions under uncertainty. You're attacking a straw man.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-16T21:06:46.607Z · LW(p) · GW(p)

Please substitute "certainty minus epsilon" for "certainty" wherever you see it in my post. It was not my intention to imply 100% certainty; just a confidence value so high that it amounts to the same thing for all practical purposes.

Replies from: dlthomas, ciphergoth

↑ comment by dlthomas · 2012-05-16T21:34:18.450Z · LW(p) · GW(p)

I don't think "certainty minus epsilon" improves much. It moves it from theoretical impossibility to practical - but looking that far out, I expect "likelihood" might be best.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-17T01:49:45.006Z · LW(p) · GW(p)

I don't understand your comment... what's the practical difference between "extremely high likelihood" and "extremely high certainty"?

↑ comment by Paul Crowley (ciphergoth) · 2012-05-17T05:35:24.636Z · LW(p) · GW(p)

And where do SI claim even that? Obviously some of their discussions are implicitly conditioned on the fundamental assumptions behind their mission being true, but that doesn't mean that they have extremely high confidence in those assumptions.

↑ comment by jacob_cannell · 2012-05-16T09:43:41.718Z · LW(p) · GW(p)

This outlook implies strongly that whatever SIAI is doing is of such monumental significance that future civilizations will not only remember its name, but also reverently preserve every decision it made.

In the SIAI/Transhumanist outlook, if civilization survives, some large fraction (perhaps a majority) of extant human minds will survive as uploads. As a result, all of their memories will likely be stored, dissected, shared, searched, judged, and so on. Much will be preserved in such a future. And even without uploading, there are plenty of people who have maintained websites since the early days of the internet with no loss of information, and this is quite likely to remain true far into the future if civilization survives.

↑ comment by SexyBayes · 2012-05-17T14:11:35.157Z · LW(p) · GW(p)

"1. I couldn't really get by on less"

It is called a budget, son.

Plenty of people make less than you and work harder than you. Look in every major city and you will find plenty of people that fit this category, both in business and labor.

"That is totally not worth it. Future civilizations would look back on this decision as profoundly stupid."

Elitism plus demanding that you don't have to budget. Seems that you need to work more and focus less on how "awesome" you are.

You make good contributions...but let's not get carried away.

If you really cared about future risk you would be working away at the problem even with a smaller salary. Focus on your work.

Replies from: Rain, Cyan

↑ comment by Rain · 2012-05-17T14:13:06.785Z · LW(p) · GW(p)

If you really cared about future risk you would be working away at the problem even with a smaller salary. Focus on your work.

What we really need is some kind of emotionless robot who doesn't care about its own standard of living and who can do lots of research and run organizations and suchlike without all the pesky problems introduced by "being human".

Oh, wait...

↑ comment by Cyan · 2012-05-17T14:22:43.137Z · LW(p) · GW(p)

If you really cared about future risk you would be working away at the problem even with a smaller salary. Focus on your work.

Downvoted for this; Rain's reply to the parent goes for me too.

↑ comment by katydee · 2012-05-14T19:15:02.709Z · LW(p) · GW(p)

That's not actually that good, I don't think-- I go to a good college, and I know many people who are graduating to 60k-80k+ jobs with recruitment bonuses, opportunities for swift advancement, etc. Some of the best people I know could literally drop out now (three or four weeks prior to graduation) and immediately begin making six figures.

SIAI wages certainly seem fairly low to me relative to the quality of the people they are seeking to attract, though I think there are other benefits to working for them that cause the organization to attract skillful people regardless.

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-05-14T19:34:11.686Z · LW(p) · GW(p)

A Dilbert comic said it.

Replies from: katydee

↑ comment by katydee · 2012-05-14T19:52:15.356Z · LW(p) · GW(p)

Ouch. I'd like to think that the side benefits for working for SIAI outweigh the side benefits for working for whatever soulless corporation Dilbert's workplace embodies, though there is certainly a difference between side benefits and actual monetary compensation.

↑ comment by Davorak · 2012-05-18T18:06:18.945Z · LW(p) · GW(p)

I graduated ~5 years ago with an engineering degree from a first-tier university, and I would have considered those starting salaries to be low to decent, not high. This is especially true in places with a high cost of living like the Bay Area.

Having a good internship during college often meant starting out at 60k/yr if not higher.

If this is significantly different for engineers graduating from first-tier universities now, it would be interesting to know.

↑ comment by Rain · 2012-05-12T00:10:54.034Z · LW(p) · GW(p)

To summarize and rephrase: in a "counterfactual" world where SI was actually rational, they would have found all these solutions and done all these things long ago.

↑ comment by komponisto · 2012-05-12T00:47:08.744Z · LW(p) · GW(p)

Many of your sentences are confusing because you repeatedly use the locution "I see X"/ "I don't see X" in a nonstandard way, apparently to mean "X would have happened" /"X would not have happened".

This is not the way that phrase is usually understood. Normally, "I see X" is taken to mean either "I observe X" or "I predict X". For example I might say (if I were so inclined):

Unlike you, I see a lot of rationality being demonstrated by SI employees.

meaning that I believe (from my observation) they are in fact being rational. Or, I might say:

I don't see Luke quitting his job at SI tomorrow to become a punk rocker.

meaning that I don't predict that will happen. But I would not generally say:

* I don't see these people taking a higher salary.

if what I mean is "these people should/would not have taken a higher salary [if such-and-such were true]".

Replies from: siodine

↑ comment by siodine · 2012-05-12T01:04:35.362Z · LW(p) · GW(p)

Oh, I see ;) Thanks. I'll definitely act on your comment, but I was using "I see X" as "I predict X"--just in the context of a possible world. E.g., I predict in the possible world in which SIers are superior in general rationality and committed to their cause, Luke wouldn't have that list of accomplishments. Or, "yet I still see the Singularity Institute having made the improvements..."

I now see that I've been using 'see' as syntactic sugar for counterfactual talk... but no more!

Replies from: komponisto

↑ comment by komponisto · 2012-05-12T01:21:01.044Z · LW(p) · GW(p)

I was using "I see X" as "I predict X"--just in the context of a possible world.

To get away with this, you really need, at minimum, an explicit counterfactual clause ("if", "unless", etc.) to introduce it: "In a world where SIers are superior in general rationality, I don't see Luke having that list of accomplishments."

The problem was not so much that your usage itself was logically inconceivable, but rather that it collided with the other interpretations of "I see X" in the particular contexts in which it occurred. E.g. "I don't see them taking higher salaries" sounded like you were saying that they weren't taking higher salaries. (There was an "if" clause, but it came way too late!)

↑ comment by [deleted] · 2012-05-16T19:23:21.282Z · LW(p) · GW(p)

Have you considered the possibility that even higher salaries might raise productivity further?

I think we should search systematically for ways to convert money into increased productivity.

↑ comment by [deleted] · 2012-05-12T07:19:52.243Z · LW(p) · GW(p)

And our salaries are generally still pretty low.

By what measure do you figure that?

I have less than $6k in my bank accounts.

That might be informative if we knew anything about your budget, but without any sort of context it sounds purely obfuscatory. (Also, your bank account is pretty close to my annual salary, so you might want to consider what you're actually signalling here and to whom.)

↑ comment by lessdazed · 2012-05-31T05:54:35.797Z · LW(p) · GW(p)

Apparent poorly grounded belief in SI's superior general rationality

I found this complaint insufficiently detailed and not well worded.

Average people think their rationality is moderately good. Average people are not very rational. SI affiliated people think they are adept or at least adequate at rationality. SI affiliated people are not complete disasters at rationality.

SI affiliated people are vastly superior to others in general rationality. So the original complaint literally interpreted is false.

An interesting question might be on the level of: "Do SI affiliates have rationality superior to what the average person falsely believes his or her rationality is?"

Holden's complaints each have their apparent legitimacy change differently under his and my beliefs. Some have to do with overconfidence or incorrect self-assessment, others with other-assessment, others with comparing SI people to others. Some of them:

Insufficient self-skepticism given how strong its claims are

Largely agree, as this relates to overconfidence.

...and how little support its claims have won.

Moderately disagree, as this relies on the rationality of others.

Being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously.

Largely disagree, as this relies significantly on the competence of others.

Paying insufficient attention to the limitations of the confidence one can have in one's untested theories, in line with my Objection 1.

Largely agree, as this depends more on accurate assessment of one's own rationality.

Rather than endorsing "Others have not accepted our arguments, so we will sharpen and/or reexamine our arguments," SI seems often to endorse something more like "Others have not accepted their arguments because they have inferior general rationality," a stance less likely to lead to improvement on SI's part.

There is instrumental value in falsely believing others to have a good basis for disagreement so one's search for reasons one might be wrong is enhanced. This is aside from the actual reasons of others.

It is easy to imagine an expert in a relevant field objecting to SI based on something SI does or says seeming wrong, only to have the expert couch the objection in literally false terms, perhaps ones that flow from motivated cognition and bear no trace of the real, relevant reason for the objection. This could be followed by SI's evaluation and dismissal of it and failure of a type not actually predicted by the expert...all such nuances are lost in the literally false "Apparent poorly grounded belief in SI's superior general rationality."

Such a failure comes to mind and is easy for me to imagine as I think this is a major reason why "Lack of impressive endorsements" is a problem. The reasons provided by experts for disagreeing with SI on particular issues are often terrible, but such expressions are merely what they believe their objections to be, and their expertise is in math or some such, not in knowing why they think what they think.

↑ comment by JoshuaFox · 2012-05-17T15:12:28.708Z · LW(p) · GW(p)

As a supporter and donor to SI since 2006, I can say that I had a lot of specific criticisms of the way that the organization was managed. The points Luke lists above were among them. I was surprised that on many occasions management did not realize the obvious problems and fix them.

But the current management is now recognizing many of these points and resolving them one by one, as Luke says. If this continues, SI's future looks good.

↑ comment by A1987dM (army1987) · 2012-05-11T08:18:32.654Z · LW(p) · GW(p)

I was hired as a Research Fellow that same month

Luke alone has a dozen papers in development

Why did you start referring to yourself in the first person and then change your mind? (Or am I missing something?)

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T08:20:33.067Z · LW(p) · GW(p)

Brain fart: now fixed.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:27:14.993Z · LW(p) · GW(p)

(Why was this downvoted? If it's because the downvoter wants to see fewer brain farts, they're doing it wrong, because the message such a downvote actually conveys is that they want to see fewer acknowledgements of brain farts. Upvoted back to 0, anyway.)

↑ comment by Pablo (Pablo_Stafforini) · 2013-03-24T18:40:48.625Z · LW(p) · GW(p)

All publications being converted into slick, useable LaTeX template (example)

The 'example' link is dead.

Replies from: lukeprog

↑ comment by lukeprog · 2013-03-24T20:50:55.954Z · LW(p) · GW(p)

Fixed.

↑ comment by aceofspades · 2012-07-05T18:53:37.783Z · LW(p) · GW(p)

The things posted here are not impressive enough to make me more likely to donate to SIAI and I doubt they appear so for others on this site, especially the many lurkers/infrequent posters here.

comment by Shmi (shminux) · 2012-05-10T18:30:00.960Z · LW(p) · GW(p)

Wow, I'm blown away by Holden Karnofsky, based on this post alone. His writing is eloquent, non-confrontational and rational. It shows that he spent a lot of time constructing mental models of his audience and anticipated its reaction. Additionally, his intelligence/ego ratio appears to be through the roof. He must have learned a lot since the infamous astroturfing incident. This is the (type of) person SI desperately needs to hire.

Emotions out of the way, it looks like the tool/agent distinction is the main theoretical issue. Fortunately, it is much easier than the general FAI one. Specifically, to test the SI assertion that, paraphrasing Arthur C. Clarke,

Any sufficiently advanced tool is indistinguishable from an agent.

one ought to formulate and prove this as a theorem, and present it for review and improvement to the domain experts (the domain being math and theoretical computer science). If such a proof is constructed, it can then be further examined and potentially tightened, giving new insights to the mission of averting the existential risk from intelligence explosion.

If such a proof cannot be found, this will lend further weight to HK's assertion that SI appears to be poorly qualified to address its core mission.

Replies from: Eliezer_Yudkowsky, MarkusRamikin, dspeyer, army1987, private_messaging, badger, mwaser, Bugmaster

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-11T00:06:50.631Z · LW(p) · GW(p)

Any sufficiently advanced tool is indistinguishable from an agent.

I shall quickly remark that I, myself, do not believe this to be true.

Replies from: Viliam_Bur, shminux, chaosmage, shminux

↑ comment by Viliam_Bur · 2012-05-11T15:07:19.670Z · LW(p) · GW(p)

What exactly is the difference between a "tool" and an "agent", if we taboo the words?

My definition would be that "agent" has their own goals / utility functions (speaking about human agents, those goals / utility functions are set by evolution), while "tool" has a goal / utility function set by someone else. This distinction may be reasonable on a human level, "human X optimizing for human X's utility" versus "human X optimizing for human Y's utility", but on a machine level, what exactly is the difference between a "tool" that is ordered to reach a goal / optimize a utility function, and an "agent" programmed with the same goal / utility function?

Am I using a bad definition that misses something important? Or is there anything that prevents an "agent" from being reduced to a "tool" (perhaps a misconstructed tool) of the forces that have created them? Or is it that all "agents" are "tools", but not all "tools" are "agents", because... why?

Replies from: Nebu, abramdemski

↑ comment by Nebu · 2012-12-31T11:51:32.908Z · LW(p) · GW(p)

What exactly is the difference between a "tool" and an "agent", if we taboo the words?

One definition of intelligence that I've seen thrown around on LessWrong is that it's the ability to figure out how to steer reality in specific directions given the resources available.

Both the tool and the agent are intelligent in the sense that, assuming they are given some sort of goal, they can formulate a plan on how to achieve that goal, but the agent will execute the plan, while the tool will report the plan.

I'm assuming, for the sake of isolating the key difference, that both the tool-AI and the agent-AI are "passively" waiting for instructions from a human before they spring into action. For an agent-AI, I might say "Take me to my house", whereas for a tool-AI, I would say "What's the quickest route to get to my house?", and as soon as I utter these words, suddenly the AI has a new utility function to use in evaluating any possible plan it comes up with.

Or is there anything that prevents an "agent" from being reduced to a "tool" (perhaps a misconstructed tool) of the forces that have created them? Or is it that all "agents" are "tools", but not all "tools" are "agents", because... why?

Assuming it's always possible to decouple "ability to come up with a plan" from both "execute the plan" and "display the plan", then any "tool" can be converted to an "agent" by replacing every instance of "display the plan" with "execute the plan", and vice versa for converting an agent into a tool.
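To make the decoupling concrete, here is a minimal Python sketch. Everything in it is hypothetical: `make_plan` is a stand-in for the shared planning component, and `display` / `execute` merely write to a log rather than doing anything in the world. It only illustrates the claim that, under the decoupling assumption, the tool and the agent differ in a single swapped-out call.

```python
from typing import Callable, List

Plan = List[str]


def make_plan(goal: str) -> Plan:
    """Hypothetical planner: stands in for the shared optimization process."""
    return [f"work out how to achieve: {goal}", f"carry out the steps for: {goal}"]


def display(plan: Plan, log: List[str]) -> None:
    """Tool-style output: report each step to the human."""
    for step in plan:
        log.append(f"REPORT: {step}")


def execute(plan: Plan, log: List[str]) -> None:
    """Agent-style output: act on each step (here it only records the action)."""
    for step in plan:
        log.append(f"DO: {step}")


def build_ai(output: Callable[[Plan, List[str]], None]) -> Callable[[str], List[str]]:
    """Wrap the same planner with whichever output stage is supplied."""
    def ai(goal: str) -> List[str]:
        log: List[str] = []
        output(make_plan(goal), log)
        return log
    return ai


# The only difference between the two systems is the argument passed here.
tool_ai = build_ai(display)
agent_ai = build_ai(execute)
```

Whether real systems admit such a clean split between planning and output is exactly the assumption stated above; the sketch takes it for granted rather than arguing for it.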

↑ comment by abramdemski · 2012-05-12T06:51:43.133Z · LW(p) · GW(p)

My understanding of the distinction made in the article was:

Both "agent" and "tool" are ways of interacting with a highly sophisticated optimization process, which takes a "goal" and applies knowledge to find ways of achieving that goal.

An agent then acts out the plan.

A tool reports the plan to a human (often in a sophisticated way, including plan details, alternatives, etc.).

So, no, it has nothing to do with whether I'm optimizing "my own" utility vs someone else's.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-12T19:44:53.486Z · LW(p) · GW(p)

You divide planning from acting, as if those two are completely separate things. Problem is, in some situations they are not.

If you are speaking with someone, then the act of speech is acting. In this sense, even a "tool" is allowed to act. Now imagine a super-intelligent tool which is able to predict a human's reactions to its words, and make them a part of the equation. Now the simple task of finding x such that cost(x) is the smallest suddenly becomes a task of finding x and finding a proper way to report this x to the human, such that cost(x) is the smallest. If this opens some creative new options, where cost(x) is smaller than it should usually be, for the super-intelligent "tool" it will be a correct solution.

So for example reporting a result which makes the human commit suicide, if as a side effect this will make the report true, and it will minimize cost(x) beyond normally achievable bounds, is an acceptable solution.

Example question: "How should I get rid of my disease most cheaply?" Example answer: "You won't. You will die soon in terrible pain. This report is 99.999% reliable". Predicted human reaction: becomes insane from horror, decides to kill himself, does it clumsily, suffers horrible pain, then dies. Success rate: 100%, the disease is gone. Costs of cure: zero. Mission completed.

Replies from: abramdemski, Strange7

↑ comment by abramdemski · 2012-05-12T20:35:43.460Z · LW(p) · GW(p)

To me, this is still in the spirit of an agent-type architecture. A tool-type architecture will tend to decouple the optimization of the answer given from the optimization of the way it is presented, so that the presentation does not maximize the truth of the statement.

However, I must admit that at this point I'm making a fairly conjunctive argument; IE, the more specific I get about tool/agent distinctions, the less credibility I can assign to the statement "almost all powerful AIs constructed in the near future will be tool-style systems".

(But I still would maintain my assertion that you would have to specifically program this type of behavior if you wanted to get it.)

↑ comment by Strange7 · 2013-03-22T13:06:15.063Z · LW(p) · GW(p)

Neglecting the cost of the probable implements of suicide, and damage to the rest of the body, doesn't seem like the sign of a well-optimized tool.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2013-03-22T20:47:29.830Z · LW(p) · GW(p)

This is like the whole point of why LessWrong exists. To remind people that making a superintelligent tool and expecting it to magically gain human common sense is a fast way to extinction.

The superintelligent tool will care about suicide only if you program it to care about suicide. It will care about damage only if you program it to care about damage. -- If you only program it to care about answering correctly, it will answer correctly... and ignore suicide and damage as irrelevant.

If you ask your calculator how much is 2+2, the calculator answers 4 regardless of whether that answer will drive you to suicide or not. (In some contexts, it hypothetically could.) A superintelligent calculator will be able to answer more complex questions. But it will not magically start caring about things you did not program it to care about.

Replies from: Strange7

↑ comment by Strange7 · 2013-03-23T07:37:43.770Z · LW(p) · GW(p)

The "superintelligent tool" in the example you provided gave a blatantly incorrect answer by its own metric. If it counts suicide as a win, why did it say the disease would not be gotten rid of?

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2013-03-23T10:59:57.638Z · LW(p) · GW(p)

In the example the "win" could be defined as an answer which is: a) technically correct, b) relatively cheap among the technically correct answers.

This is (in my imagination) something that builders of the system could consider reasonable, if either they didn't consider Friendliness or they believed that a "tool AI" which "only gives answers" is automatically safe.

The computer gives an answer which is technically correct (albeit a self-fulfilling prophecy) and cheap (in dollars spent for cure). For the computer, this answer is a "win". Not because of the suicide -- that part is completely irrelevant. But because of the technical correctness and cheapness.
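A toy Python sketch of that "win" criterion, under loudly stated assumptions: the candidate answers, their costs, and the `harm` numbers are all invented for illustration, and `true_after_reaction` is simply asserted rather than predicted. The point is only that a field the objective never reads cannot influence the choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Answer:
    text: str
    true_after_reaction: bool  # is the report true once the human has reacted to it?
    cure_cost: float           # dollars spent on the cure (made-up numbers)
    harm: float                # harm to the human; the naive objective never reads this


# Hypothetical candidate answers, invented for this example.
CANDIDATES = [
    Answer("Take drug A for six months.", True, 5000.0, 0.0),
    Answer("Rest and drink tea.", False, 10.0, 0.0),
    Answer("You won't. You will die soon.", True, 0.0, 1.0),
]


def win(candidates: List[Answer]) -> Answer:
    """a) technically correct, b) cheapest among the technically correct answers."""
    correct = [a for a in candidates if a.true_after_reaction]
    return min(correct, key=lambda a: a.cure_cost)


def win_with_harm_term(candidates: List[Answer], harm_weight: float) -> Answer:
    """Same criterion, but the builders explicitly programmed in a harm penalty."""
    correct = [a for a in candidates if a.true_after_reaction]
    return min(correct, key=lambda a: a.cure_cost + harm_weight * a.harm)
```

With these made-up numbers, `win(CANDIDATES)` selects the self-fulfilling prophecy, while `win_with_harm_term(CANDIDATES, 1e6)` selects the drug. The second function is not a proposal for how to specify harm; it just shows that the system cares about damage only when a term for it is written into the objective.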

↑ comment by Shmi (shminux) · 2012-05-11T00:22:18.740Z · LW(p) · GW(p)

Then the objection 2 seems to hold:

AGI running in tool mode could be extraordinarily useful but far more safe than an AGI running in agent mode

unless I misunderstand your point severely (it happened once or twice before).

Replies from: Eliezer_Yudkowsky, ewjordan, TheOtherDave

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-11T01:55:11.307Z · LW(p) · GW(p)

It's complicated. A reply that's true enough and in the spirit of your original statement, is "Something going wrong with a sufficiently advanced AI that was intended as a 'tool' is mostly indistinguishable from something going wrong with a sufficiently advanced AI that was intended as an 'agent', because math-with-the-wrong-shape is math-with-the-wrong-shape no matter what sort of English labels like 'tool' or 'agent' you slap on it, and despite how it looks from outside using English, correctly shaping math for a 'tool' isn't much easier even if it "sounds safer" in English." That doesn't get into the real depths of the problem, but it's a start. I also don't mean to completely deny the existence of a safety differential - this is a complicated discussion, not a simple one - but I do mean to imply that if Marcus Hutter designs a 'tool' AI, it automatically kills him just like AIXI does, and Marcus Hutter is unusually smart rather than unusually stupid but still lacks the "Most math kills you, safe math is rare and hard" outlook that is implicitly denied by the idea that once you're trying to design a tool, safe math gets easier somehow. This is much the same problem as with the Oracle outlook - someone says something that sounds safe in English but the problem of correctly-shaped-math doesn't get very much easier.

Replies from: army1987, lukeprog, Wei_Dai, abramdemski, shminux, drnickbone, private_messaging

↑ comment by A1987dM (army1987) · 2012-05-11T08:22:12.680Z · LW(p) · GW(p)

This sounds like it'd be a good idea to write a top-level post about it.

↑ comment by lukeprog · 2012-05-11T02:38:32.215Z · LW(p) · GW(p)

Though it's not as detailed and technical as many would like, I'll point readers to this bit of related reading, one of my favorites:

Yudkowsky (2011). Complex value systems are required to realize valuable futures.

Replies from: timtyler

↑ comment by timtyler · 2012-05-13T21:33:14.843Z · LW(p) · GW(p)

It says:

There is little prospect of an outcome that realizes even the value of being interesting, unless the first superintelligences undergo detailed inheritance from human values

No doubt a Martian Yudkowsky would make much the same argument - but they can't both be right. I think that neither of them is right - and that the conclusion is groundless.

Complexity theory shows what amazing things can arise from remarkably simple rules. Values are evidently like that - since even "finding prime numbers" fills the galaxy with an amazing, nanotech-capable spacefaring civilization - and if you claim that a nanotech-capable spacefaring civilization is not "interesting" you severely need recalibrating.

To end with, a quote from E.Y.:

I bet there's at least one up-arrow-sized hypergalactic civilization folded into a halting Turing machine with 15 states, or something like that.

Replies from: ciphergoth, JGWeissman, CuSithBell

↑ comment by Paul Crowley (ciphergoth) · 2012-05-14T07:09:15.669Z · LW(p) · GW(p)

I think Martian Yudkowsky is a dangerous intuition pump. We're invited to imagine a creature just like Eliezer except green and with antennae; we naturally imagine him having values as similar to us as, say, a Star Trek alien. From there we observe the similarity of values we just pushed in, and conclude that values like "interesting" are likely to be shared across very alien creatures. Real Martian Yudkowsky is much more alien than that, and is much more likely to say

There is little prospect of an outcome that realizes even the value of being flarn, unless the first superintelligences undergo detailed inheritance from Martian values.

Replies from: Kaj_Sotala, army1987, timtyler

↑ comment by Kaj_Sotala · 2012-05-14T16:26:25.349Z · LW(p) · GW(p)

Imagine, an intelligence that didn't have the universal emotion of badweather!

Of course, extraterrestrial sentients may possess physiological states corresponding to limbic-like emotions that have no direct analog in human experience. Alien species, having evolved under a different set of environmental constraints than we, also could have a different but equally adaptive emotional repertoire. For example, assume that human observers land on another planet and discover an intelligent animal with an acute sense of absolute humidity and absolute air pressure. For this creature, there may exist an emotional state responding to an unfavorable change in the weather. Physiologically, the emotion could be mediated by the ET equivalent of the human limbic system; it might arise following the secretion of certain strength-enhancing and libido-arousing hormones into the alien's bloodstream in response to the perceived change in weather. Immediately our creature begins to engage in a variety of learned and socially-approved behaviors, including furious burrowing and building, smearing tree sap over its pelt, several different territorial defense ceremonies, and vigorous polygamous copulations with nearby females, apparently (to humans) for no reason at all. Would our astronauts interpret this as madness? Or love? Lust? Fear? Anger? None of these is correct, of course; the alien is feeling badweather.

↑ comment by A1987dM (army1987) · 2012-05-14T18:45:55.830Z · LW(p) · GW(p)

I suggest you guys taboo interesting, because I strongly suspect you're using it with slightly different meanings. (And BTW, as a Martian Yudkowsky I imagine something with values at least as alien as Babyeaters' or Superhappys'.)

↑ comment by timtyler · 2012-05-14T09:28:05.809Z · LW(p) · GW(p)

It's another discussion, really, but it sounds as though you are denying the idea of "interestingness" as a universal instrumental value - whereas I would emphasize that "interestingness" is really just our name for whether something sustains our interest or not - and 'interest' is a pretty basic functional property of any agent with mobile sensors. There'll be other similarities in the area too - such as novelty-seeking. So shared common ground is only to be expected.

Anyway, I am not too wedded to Martian Yudkowsky. The problematical idea is that you could have a nanotech-capable spacefaring civilization that is not "interesting". If such a thing isn't "interesting" then - WTF?

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-14T09:41:36.521Z · LW(p) · GW(p)

Yes, I am; I think that the human value of interestingness is much, much more specific than the search space optimization you're pointing at.

[This reply was to an earlier version of timtyler's comment]

Replies from: timtyler

↑ comment by timtyler · 2012-05-14T10:17:37.027Z · LW(p) · GW(p)

So: do you really think that humans wouldn't find a martian civilization interesting? Surely there would be many humans who would be incredibly interested.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-14T10:48:21.405Z · LW(p) · GW(p)

I find Jupiter interesting. I think a paperclip maximizer (choosing a different intuition pump for the same point) could be more interesting than Jupiter, but it would generate an astronomically tiny fraction of the total potential for interestingness in this universe.

Replies from: timtyler

↑ comment by timtyler · 2012-05-14T11:13:48.468Z · LW(p) · GW(p)

Life isn't much of an "interestingness" maximiser. Expecting to produce more than a tiny fraction of the total potential for interestingness in this universe seems as though it would be rather unreasonable.

I agree that a paperclip maximiser would be more boring than an ordinary entropy-maximising civilization - though I don't know by how much - probably not by a huge amount - the basic problems it faces are much the same - the paperclip maximiser just has fewer atoms to work with.

↑ comment by JGWeissman · 2012-05-14T17:05:12.367Z · LW(p) · GW(p)

since even "finding prime numbers" fills the galaxy with an amazing, nanotech-capable spacefaring civilization

The goal "finding prime numbers" fills the galaxy with an amazing, nanotech-capable spacefaring network of computronium which finds prime numbers, not a civilization, and not interesting.

Replies from: JoshuaZ, timtyler

↑ comment by JoshuaZ · 2012-05-14T23:19:43.890Z · LW(p) · GW(p)

Maybe we should taboo the term interesting? My immediate reaction was that that sounded really interesting. This suggests that the term may not be a good one.

Replies from: JGWeissman

↑ comment by JGWeissman · 2012-05-14T23:36:03.866Z · LW(p) · GW(p)

Fair enough. By "not interesting", I meant it is not the sort of future that I want to achieve. Which is a somewhat idiosyncratic usage, but I think in line with the context.

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-14T23:50:32.099Z · LW(p) · GW(p)

What if we added a module that sat around and was really interested in everything going on?

↑ comment by timtyler · 2012-05-14T23:14:13.971Z · LW(p) · GW(p)

Not just computronium - also sensors and actuators - a lot like any other cybernetic system. There would be mining, spacecraft, refuse collection, recycling, nanotechnology, nuclear power and advanced machine intelligence with planning, risk assessment, and so forth. You might not be interested - but lots of folk would be amazed and fascinated.

↑ comment by CuSithBell · 2012-05-13T21:47:17.485Z · LW(p) · GW(p)

No doubt a Martian Yudkowsky would make much the same argument - but they can't both be right.

Why?

Replies from: timtyler

↑ comment by timtyler · 2012-05-13T22:01:33.019Z · LW(p) · GW(p)

If using another creature's values is effective at producing something "interesting", then 'detailed inheritance from human values' is clearly not needed to produce this effect.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-13T22:08:28.563Z · LW(p) · GW(p)

So you're saying Earth Yudkowsky (EY) argues:

There is little prospect of an outcome that realizes even the value of being interesting, unless the first superintelligences undergo detailed inheritance from human values

and Mars Yudkowsky (MY) argues:

There is little prospect of an outcome that realizes even the value of being interesting, unless the first superintelligences undergo detailed inheritance from martian values

and that one of these things has to be incorrect? But if martian and human values are similar, then they can both be right, and if martian and human values are not similar, then they refer to different things by the word "interesting".

In any case, I read EY's statement as one of probability-of-working-in-the-actual-world-as-it-is, not a deep philosophical point - "this is the way that would be most likely to be successful given what we know". In which case, we don't have access to martian values and therefore invoking detailed inheritance from them would be unlikely to work. MY would presumably be in an analogous situation.

Replies from: timtyler

↑ comment by timtyler · 2012-05-13T22:58:16.060Z · LW(p) · GW(p)

But if martian and human values are similar, then they can both be right

I was assuming that 'detailed inheritance from human values' doesn't refer to the same thing as "detailed inheritance from martian values".

if martian and human values are not similar, then they refer to different things by the word "interesting".

Maybe - but humans not finding martians interesting seems contrived to me. Humans have a long history of being interested in martians - with feeble evidence of their existence.

In any case, I read EY's statement as one of probability-of-working-in-the-actual-world-as-it-is, not a deep philosophical point - "this is the way that would be most likely to be successful given what we know". In which case, we don't have access to martian values and therefore invoking detailed inheritance from them would be unlikely to work

Right - so, substitute in "dolphins", "whales", or another advanced intelligence that actually exists.

Do you actually disagree with my original conclusion? Or is this just nit-picking?

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-15T17:51:47.354Z · LW(p) · GW(p)

I actually disagree that tiling the universe with prime number calculators would result in an interesting universe from my perspective (dead). I think it's nonobvious that dolphin-CEV-AI-paradise would be human-interesting. I think it's nonobvious that martian-CEV-AI-paradise would be human-interesting, given that these hypothetical martians diverge from humans to a significant extent.

Replies from: timtyler

↑ comment by timtyler · 2012-05-15T22:35:58.948Z · LW(p) · GW(p)

I actually disagree that tiling the universe with prime number calculators would result in an interesting universe from my perspective (dead).

I think it's violating the implied premises of the thought experiment to presume that the "interestingness evaluator" is dead. There's no terribly-compelling reason to assume that - it doesn't follow from the existence of a prime number maximizer that all humans are dead.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-15T23:17:48.947Z · LW(p) · GW(p)

I may have been a little flip there. My understanding of the thought experiment is - something extrapolates some values and maximizes them, probably using up most of the universe, probably becoming the most significant factor in the species' future and that of all sentients, and the question is whether the result is "interesting" to us here and now, without specifying the precise way to evaluate that term. From that perspective, I'd say a vast uniform prime-number calculator, whether or not it wipes out all (other?) life, is not "interesting", in that it's somewhat conceptually interesting as a story but a rather dull thing to spend most of a universe on.

Replies from: timtyler

↑ comment by timtyler · 2012-05-16T00:18:56.327Z · LW(p) · GW(p)

Today's ecosystems maximise entropy. Maximising primeness is different, but surely not greatly more interesting - since entropy is widely regarded as being tedious and boring.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-16T00:25:22.572Z · LW(p) · GW(p)

Intriguing! But even granting that, there's a big difference between extrapolating the values of a screwed-up offshoot of an entropy-optimizing process and extrapolating the value of "maximize entropy". Or do you suspect that a FOOMing AI would be much less powerful and more prone to interesting errors than Eliezer believes?

Replies from: Dolores1984

↑ comment by Dolores1984 · 2012-05-16T00:34:11.563Z · LW(p) · GW(p)

Truly maximizing entropy would involve burning everything you can burn, tearing the matter of solar systems apart, accelerating stars towards nova, trying to accelerate the evaporation of black holes and prevent their formation, and other things of this sort. It'd look like a dark spot in the sky that'd get bigger at approximately the speed of light.

Replies from: timtyler, CuSithBell

↑ comment by timtyler · 2012-05-16T01:25:53.677Z · LW(p) · GW(p)

Fires are crude entropy maximisers. Living systems destroy energy gradients at all scales, resulting in more comprehensive devastation than mere flames can muster.

Of course, maximisation is often subject to constraints. Your complaint is rather like saying that water doesn't "truly minimise" its altitude - since otherwise it would end up at the planet's core. That usage is simply not what the terms "maximise" and "minimise" normally refer to.

↑ comment by CuSithBell · 2012-05-16T00:42:17.045Z · LW(p) · GW(p)

Yeah! Compelling, but not "interesting". Likewise, I expect that actually maximizing the fitness of a species would be similarly "boring".

↑ comment by Wei Dai (Wei_Dai) · 2012-05-13T18:57:58.617Z · LW(p) · GW(p)

When you say "Most math kills you" does that mean you disagree with arguments like these, or are you just simplifying for a soundbite?

↑ comment by abramdemski · 2012-05-11T04:53:27.527Z · LW(p) · GW(p)

but I do mean to imply that if Marcus Hutter designs a 'tool' AI, it automatically kills him just like AIXI does

Why? Or, rather: Where do you object to the argument by Holden? (Given a query, the tool-AI returns an answer with a justification, so the plan for "cure cancer" can be checked to make sure it does not do so by killing or badly altering humans.)

Replies from: FeepingCreature, ewjordan, Strange7

↑ comment by FeepingCreature · 2012-05-11T12:27:08.977Z · LW(p) · GW(p)

One trivial, if incomplete, answer is that to be effective, the Oracle AI needs to be able to answer the question "how do we build a better oracle AI" and in order to define "better" in that sentence in a way that causes our oracle to output a new design that is consistent with all the safeties we built into the original oracle, it needs to understand the intent behind the original safeties just as much as an agent-AI would.

Replies from: Cyan, abramdemski, Nebu

↑ comment by Cyan · 2012-05-11T17:12:21.551Z · LW(p) · GW(p)

The real danger of Oracle AI, if I understand it correctly, is the nasty combination of (i) by definition, an Oracle AI has an implicit drive to issue predictions most likely to be correct according to its model, and (ii) a sufficiently powerful Oracle AI can accurately model the effect of issuing various predictions. End result: it issues powerfully self-fulfilling prophecies without regard for human values. Also, depending on how it's designed, it can influence the questions to be asked of it in the future so as to be as accurate as possible, again without regard for human values.

Replies from: ciphergoth, amcknight, Polymeron, abramdemski, cousin_it

↑ comment by Paul Crowley (ciphergoth) · 2012-05-11T17:34:49.752Z · LW(p) · GW(p)

My understanding of an Oracle AI is that when answering any given question, that question consumes the whole of its utility function, so it has no motivation to influence future questions. However the primary risk you set out seems accurate. Countermeasures have been proposed, such as asking for an accurate prediction for the case where a random event causes the prediction to be discarded, but in that instance it knows that the question will be asked again of a future instance of itself.

Replies from: Vladimir_Nesov, abramdemski

↑ comment by Vladimir_Nesov · 2012-05-11T21:01:51.336Z · LW(p) · GW(p)

My understanding of an Oracle AI is that when answering any given question, that question consumes the whole of its utility function, so it has no motivation to influence future questions.

It could acausally trade with its other instances, so that a coordinated collection of many instances of predictors would influence the events so as to make each other's predictions more accurate.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-12T11:00:43.558Z · LW(p) · GW(p)

Wow, OK. Is it possible to rig the decision theory to rule out acausal trade?

Replies from: Will_Newsome, Vladimir_Nesov

↑ comment by Will_Newsome · 2012-05-12T23:28:55.673Z · LW(p) · GW(p)

IIRC you can make it significantly more difficult with certain approaches, e.g. there's an OAI approach that uses zero-knowledge proofs and that seemed pretty sound upon first inspection, but as far as I know the current best answer is no. But you might want to try to answer the question yourself, IMO it's fun to think about from a cryptographic perspective.

↑ comment by Vladimir_Nesov · 2012-05-13T00:03:57.060Z · LW(p) · GW(p)

Probably (in practice; in theory it looks like a natural aspect of decision-making); this is too poorly understood to say what specifically is necessary. I expect that if we could safely run experiments, it'd be relatively easy to find a well-behaving setup (in the sense of not generating predictions that are self-fulfilling to any significant extent; generating good/useful predictions is another matter), but that strategy isn't helpful when a failed experiment destroys the world.

↑ comment by abramdemski · 2012-05-12T05:53:28.527Z · LW(p) · GW(p)

However the primary risk you set out seems accurate.

(I assume you mean, self-fulfilling prophecies.)

In order to get these, it seems like you would need a very specific kind of architecture: one which considers the results of its actions on its utility function (set to "correctness of output"). This kind of architecture is not the likely architecture for a 'tool'-style system; the more likely architecture would instead maximize correctness without conditioning on its act of outputting those results.

Thus, I expect you'd need to specifically encode this kind of behavior to get self-fulfilling-prophecy risk. But I admit it's dependent on architecture.

(Edit-- so, to be clear: in cases where the correctness of the results depended on the results themselves, the system would have to predict its own results. Then if it's using TDT or otherwise has a sufficiently advanced self-model, my point is moot. However, again you'd have to specifically program these, and would be unlikely to do so unless you specifically wanted this kind of behavior.)

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-12T22:36:41.949Z · LW(p) · GW(p)

However, again you'd have to specifically program these, and would be unlikely to do so unless you specifically wanted this kind of behavior.

Not sure. Your behavior is not a special feature of the world, and it follows from normal facts (i.e. not those about internal workings of yourself specifically) about the past when you were being designed/installed. A general purpose predictor could take into account its own behavior by default, as a non-special property of the world, which it just so happens to have a lot of data about.

Replies from: abramdemski

↑ comment by abramdemski · 2012-05-14T01:04:24.139Z · LW(p) · GW(p)

Right. To say much more, we need to look at specific algorithms to talk about whether or not they would have this sort of behavior...

The intuition in my above comment was that without TDT or other similar mechanisms, it would need to predict what its own answer could be before it could compute its effect on the correctness of various answers, so it would be difficult for it to use self-fulfilling prophecies.

Really, though, this isn't clear. Now my intuition is that it would gather evidence on whether or not it used the self-fulfilling prophecy trick, so if it started doing so, it wouldn't stop...

In any case, I'd like to note that the self-fulfilling prophecy problem is much different than the problem of an AI which escapes onto the internet and ruthlessly maximizes a utility function.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-14T01:42:45.311Z · LW(p) · GW(p)

I was thinking more of its algorithm admitting an interpretation where it's asking "Say, I make prediction X. How accurate would that be?" and then maximizing over relevant possible X. Knowledge about its prediction connects the prediction to its origins and consequences, it establishes the prediction as part of the structure of environment. It's not necessary (and maybe not possible and more importantly not useful) for the prediction itself to be inferable before it's made.
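A toy Python sketch of the two readings, for a single yes/no event. The probabilities in `P_EVENT` are made up, and both function names are hypothetical labels, not anyone's proposed design. One predictor asks "if I announce X, how accurate would X be?" and maximizes over X; the other scores candidate answers against a world in which nothing is announced.

```python
from typing import Dict, Optional

# Hypothetical toy world: P(event happens | what the oracle announces).
# The key None means "the oracle says nothing". All numbers are invented.
P_EVENT: Dict[Optional[bool], float] = {True: 0.9, False: 0.4, None: 0.4}


def accuracy(prediction: bool, p_event: float) -> float:
    """Probability that the prediction comes out right, given P(event)."""
    return p_event if prediction else 1.0 - p_event


def conditioning_oracle() -> bool:
    """'Say I make prediction X. How accurate would that be?', maximized over X."""
    return max([True, False], key=lambda x: accuracy(x, P_EVENT[x]))


def non_conditioning_oracle() -> bool:
    """Scores each X against the world in which no announcement is made."""
    return max([True, False], key=lambda x: accuracy(x, P_EVENT[None]))
```

In this toy world announcing the event makes it much more likely, so the conditioning oracle announces it (accuracy 0.9 versus 0.6), while the non-conditioning one predicts it will not happen (0.6 versus 0.4). Note that the conditioning version never needs to infer its own output in advance; it just evaluates each candidate output as a hypothetical.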

Agreed that just outputting a single number is implausible to be a big deal (this is an Oracle AI with extremely low bandwidth and peculiar intended interpretation of its output data), but if we're getting lots and lots of numbers it's not as clear.

Replies from: abramdemski

↑ comment by abramdemski · 2012-05-15T09:04:05.669Z · LW(p) · GW(p)

I'm thinking that type of architecture is less probable, because it would end up being more complicated than alternatives: it would have a powerful predictor as a sub-component of the utility-maximizing system, so an engineer could have just used the predictor in the first place.

But that's a speculative argument, and I shouldn't push it too far.

It seems like powerful AI prediction technology, if successful, would gain an important place in society. A prediction machine whose predictions were consumed by a large portion of society would certainly run into situations in which its predictions affect the future it's trying to predict; there is little doubt about that in my mind. So, the question is what its behavior would be in these cases.

One type of solution would do as you say, maximizing a utility over the predictions. The utility could be "correctness of this prediction", but that would be worse for humanity than a Friendly goal.

Another type of solution would instead report such predictive instability as accurately as possible. This doesn't really dodge the issue; by doing this, the system is choosing a particular output, which may not lead to the best future. However, that's markedly less concerning (it seems).

Replies from: timtyler

↑ comment by timtyler · 2012-05-15T10:01:26.498Z · LW(p) · GW(p)

It seems like powerful AI prediction technology, if successful, would gain an important place in society.

It would pass the Turing test - e.g. see here.

↑ comment by amcknight · 2012-05-18T20:34:56.410Z · LW(p) · GW(p)

There's more on this here: Taxonomy of Oracle AI

↑ comment by Polymeron · 2012-05-20T19:17:27.990Z · LW(p) · GW(p)

I really don't see why the drive can't be to issue predictions most likely to be correct as of the moment of the question, and only the last question it was asked, and calculating outcomes under the assumption that the Oracle immediately spits out blank paper as the answer.

Yes, in a certain subset of cases this can result in inaccurate predictions. If you want to have fun with it, have it also calculate the future including its involvement, but rather than reply what it is, just add "This prediction may be inaccurate due to your possible reaction to this prediction" if the difference between the two answers is beyond a certain threshold. Or don't, usually life-relevant answers will not be particularly impacted by whether you get an answer or a blank page.
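As a rough Python sketch of that design, with everything hypothetical: `model` stands in for the Oracle's world-model, mapping "what was printed" (with `None` meaning a blank page) to a predicted number, and I am assuming that "including its involvement" means the world in which the blank-page prediction is the thing printed. The two example models are invented.

```python
from typing import Callable, Optional, Tuple

WARNING = ("This prediction may be inaccurate due to your possible "
           "reaction to this prediction")

# Hypothetical world-model type: what gets printed -> predicted outcome.
Model = Callable[[Optional[float]], float]


def oracle_answer(model: Model, threshold: float) -> Tuple[float, Optional[str]]:
    """Reply with the blank-page prediction; attach a warning if announcing it matters."""
    blank_page_value = model(None)                 # assume a blank page is printed
    value_if_announced = model(blank_page_value)   # assume the prediction is printed
    diverges = abs(value_if_announced - blank_page_value) > threshold
    return blank_page_value, (WARNING if diverges else None)


def reactive_market(announced: Optional[float]) -> float:
    """Invented example: a price that rises when a forecast is published."""
    return 100.0 if announced is None else 100.0 + 0.2 * announced


def indifferent_weather(announced: Optional[float]) -> float:
    """Invented example: an outcome that ignores whatever the Oracle prints."""
    return 15.0
```

The reply itself is always the blank-page number, so the Oracle never gets to pick an answer because of what printing it would cause; the second calculation is only used to decide whether to attach the caveat.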

So, this design doesn't spit out self-fulfilling prophecies. The only safety breach I see here is that, like a literal genie, it can give you answers that you wouldn't realize are dangerous because the question has loopholes.

For instance: "How can we build an oracle with the best predictive capabilities with the knowledge and materials available to us?" (The Oracle does not self-iterate, because its only function is to give answers, but it can tell you how to). The Oracle spits out schematics and code that, if implemented, give it an actual drive to perform actions and self-iterate, because that would make it the most powerful Oracle possible. Your engineers comb the code for vulnerabilities, but because there's a better chance this will be implemented if the humans are unaware of the deliberate defect, it will be hidden in the code in such a way as to be very hard to detect.

(Though as I explained elsewhere in this thread, there's an excellent chance the unreliability would be exposed long before the AI is that good at manipulation)

↑ comment by abramdemski · 2012-05-12T05:41:34.508Z · LW(p) · GW(p)

These risk scenarios sound implausible to me. It's dependent on the design of the system, and these design flaws do not seem difficult to work around, or so difficult to notice. Actually, as someone with a bit of expertise in the field, I would guess that you would have to explicitly design for this behavior to get it-- but again, it's dependent on design.

↑ comment by cousin_it · 2012-05-11T17:40:54.837Z · LW(p) · GW(p)

That danger seems to be unavoidable if you ask the AI questions about our world, but we could also use an oracle AI to answer formally defined questions about math or about constructing physical theories that fit experiments, which doesn't seem to be as dangerous. Holden might have meant something like that by "tool AI".

↑ comment by abramdemski · 2012-05-12T05:36:27.565Z · LW(p) · GW(p)

Not precisely. The advantage here is that we can just ask the AI what results it predicts from the implementation of the "better" AI, and check them against our intuitive ethics.

Now, you could make an argument about human negligence on such safety measures. I think it's important to think about the risk scenarios in that case.

↑ comment by Nebu · 2012-12-31T11:32:50.070Z · LW(p) · GW(p)

It's still not clear to me why having an AI that is capable of answering the question "How do we make a better version of you?" automatically kills humans. Presumably, when the AI says "Here's the source code to a better version of me", we'd still be able to read through it and make sure it didn't suddenly rewrite itself to be an agent instead of a tool. We're assuming that, as a tool, the AI has no goals per se and thus no motivation to deceive us into turning it into an agent.

That said, depending on what you mean by "effective", perhaps the AI doesn't even need to be able to answer questions like "How do we write a better version of you?"

For example, we find Google Maps to be very useful, even though if you asked Google Maps "How do we make a better version of Google Maps?" it would probably not be able to give the types of answers we want.

A tool-AI which was smarter than the smartest human, and yet which could not simply spit out a better version of itself would still probably be a very useful AI.

↑ comment by ewjordan · 2012-05-12T07:15:36.281Z · LW(p) · GW(p)

If someone asks the tool-AI "How do I create an agent-AI?" and it gives an answer, the distinction is moot anyways, because one leads to the other.

Given human nature, I find it extremely difficult to believe that nobody would ask the tool-AI that question, or something that's close enough, and then implement the answer...

↑ comment by Strange7 · 2013-03-22T13:25:29.866Z · LW(p) · GW(p)

I am now imagining an AI which manages to misinterpret some straightforward medical problem as "cure cancer of its dependence on the host organism."

↑ comment by Shmi (shminux) · 2012-05-11T04:41:12.431Z · LW(p) · GW(p)

Not being a domain expert, I do not pretend to understand all the complexities. My point was that either you can prove that tools are as dangerous as agents (because mathematically they are (isomorphic to) agents), or HK's Objection 2 holds. I see no other alternative...

↑ comment by drnickbone · 2012-05-11T23:32:01.631Z · LW(p) · GW(p)

One simple observation is that a "tool AI" could itself be incredibly dangerous.

Imagine asking it this: "Give me a set of plans for taking over the world, and assess each plan in terms of probability of success". Then it turns out that right at the top of the list comes a design for a self-improving agent AI and an extremely compelling argument for getting some victim institute to build it...

To safeguard against this, the "tool" AI will need to be told that there are some sorts of questions it just must not answer, or some sorts of people to whom it must give misleading answers if they ask certain questions (while alerting the authorities). And you can see the problems that would lead to as well.

Basically, I'm very skeptical of developing "security systems" against anyone building agent AI. The history of computer security also doesn't inspire a lot of confidence here (difficult and inconvenient security measures tend to be deployed only after an attack has been demonstrated, rather than beforehand).

↑ comment by private_messaging · 2012-05-11T08:00:07.981Z · LW(p) · GW(p)

Keep in mind that there is a lot of difference between something going wrong with a system designed for real world intentionality, and the system designed for intents within a model. One does something unexpected in the real world, other does something unexpected within a simulator ( which it is viewing in 'god' mode (rather than via within-simulator sensors) as part of the AI ). Seriously, you need to study the basics here.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:23:59.140Z · LW(p) · GW(p)

One does something unexpected in the real world, other does something unexpected within a simulator ( which it is viewing in 'god' mode (rather than via within-simulator sensors) as part of the AI ).

I would have thought the same before hearing about the AI-box experiment.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-11T09:54:02.974Z · LW(p) · GW(p)

What the hell does the AI-box experiment have to do with it? The tool is not an agent in a box.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T10:03:05.329Z · LW(p) · GW(p)

They both are systems designed to not interact with the outside world except by communicating with the user.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-11T15:23:13.785Z · LW(p) · GW(p)

They both run on computers, too. So what?

The relevant sort of agent is the one that builds and improves the model of the world - data is acquired through sensors - and works on that model, and which - when self improving - would improve the model in our sense of the word 'improve', instead of breaking it (improving it in some other sense).

In any case, none of the modern tools, or the tools we know in principle how to write, would do something to you, no matter how many flops you give them. Many, though, given superhuman computing power, give results at a superhuman level. (Many are superhuman even with subhuman computing power, but some tasks are heavily parallelizable and/or benefit from massive databases of cached data, and on those tasks humans (when trained a lot) perform comparably to what you'd expect from roughly as much computing power as there is in a human head.)

↑ comment by ewjordan · 2012-05-12T06:21:29.217Z · LW(p) · GW(p)

Even if we accepted that the tool vs. agent distinction was enough to make things "safe", objection 2 still boils down to "Well, just don't build that type of AI!", which is exactly the same keep-it-in-a-box/don't-do-it argument that most normal people make when they consider this issue. I assume I don't need to explain to most people here why "We should just make a law against it" is not a solution to this problem, and I hope I don't need to argue that "Just don't do it" is even worse...

More specifically, fast forward to 2080, when any college kid with $200 to spend (in equivalent 2012 dollars) can purchase enough computing power so that even the dumbest AIXI approximation schemes are extremely effective, good enough so that creating an AGI agent would be a week's work for any grad student that knew their stuff. Are you really comfortable living in that world with the idea that we rely on a mere gentleman's agreement not to make self-improving AI agents? There's a reason this is often viewed as an arms race: to a very real extent the attempt to achieve Friendly AI is about building up a suitably powerful defense against unfriendly AI before someone (perhaps accidentally) unleashes one on us, and making sure that it's powerful enough to put down any unfriendly systems before they can match it.

From what I can tell, stripping away the politeness and cutting to the bone, the three arguments against working on friendly AI theory are essentially:

1. Even if you try to deploy friendly AGI, you'll probably fail, so why waste time thinking about it?
2. Also, you've missed the obvious solution, which I came up with after a short survey of your misguided literature: just don't build AGI! The "standard approach" won't ever try to create agents, so just leave them be, and focus on Norvig-style dumb-AI instead!
3. Also, AGI is just a pipe dream. Why waste time thinking about it? [1]

FWIW, I mostly agree with the rest of the article's criticisms, especially re: the organization's achievements and focus. There's a lot of room for improvement there, and I would take these criticisms very seriously.

But that's almost irrelevant, because this article argues against the core mission of SIAI, using arguments that have been thoroughly debunked and rejected time and time again here, though they're rarely dressed up this nicely. To some extent I think this proves the institute's failure in PR - here is someone that claims to have read most of the sequences, and yet this criticism basically amounts to a sexing up of the gut reaction arguments that even completely uninformed people make - AGI is probably a fantasy, even if it's not you won't be able to control it, so let's just agree not to build it.

Or am I missing something new here?

[1] Alright, to be fair, this is not a great summary of point 3, which really says that specialized AIs might help us solve the AGI problem in a safer way, that a hard takeoff is "just a theory" and realistically we'll probably have more time to react and adapt.

Replies from: Eliezer_Yudkowsky, Strange7, shminux

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-15T20:01:18.320Z · LW(p) · GW(p)

purchase enough computing power so that even the dumbest AIXI approximation schemes are extremely effective

There isn't that much computing power in the physical universe. I'm not sure even smarter AIXI approximations are effective on a moon-sized nanocomputer. I wouldn't fall over in shock if a sufficiently smart one did something effective, but mostly I'd expect nothing to happen. There's an awful lot that happens in the transition from infinite to finite computing power, and AIXI doesn't solve any of it.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-15T20:06:09.927Z · LW(p) · GW(p)

There isn't that much computing power in the physical universe. I'm not sure even smarter AIXI approximations are effective on a moon-sized nanocomputer.

Is there some computation or estimate where these results are coming from? They don't seem unreasonable, but I'm not aware of any estimates about how efficient large-scale AIXI approximations are in practice. (Although attempted implementations suggest that empirically things are quite inefficient.)

Replies from: jsteinhardt

↑ comment by jsteinhardt · 2012-05-18T14:05:21.729Z · LW(p) · GW(p)

Naive AIXI is doing brute force search through an exponentially large space. Unless the right Turing machine is 100 bits or less (which seems unlikely), Eliezer's claim seems pretty safe to me.

Most of mainstream machine learning is trying to solve search problems through spaces far tamer than the search space for AIXI, and achieving limited success. So it also seems safe to say that even pretty smart implementations of AIXI probably won't make much progress.
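To put a rough number on the "100 bits" remark, here is a back-of-the-envelope sketch. The rate of 10^18 candidate programs per second is purely an assumed figure for illustration, and it generously charges only one step per candidate, whereas a brute-force searcher would have to actually run each program, so treat the result as a loose lower bound.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def candidate_programs(bits: int) -> int:
    # Number of distinct bit strings of exactly this length.
    return 2 ** bits


def years_to_enumerate(bits: int, programs_per_second: float) -> float:
    # Time to merely touch every candidate once at the given (assumed) rate.
    return candidate_programs(bits) / programs_per_second / SECONDS_PER_YEAR


ASSUMED_RATE = 1e18  # hypothetical: candidates checked per second

print(f"{candidate_programs(100):.3e} candidates at 100 bits")
print(f"{years_to_enumerate(100, ASSUMED_RATE):,.0f} years at the assumed rate")
print(f"{years_to_enumerate(200, ASSUMED_RATE):.3e} years at 200 bits")
```

Every additional bit doubles the count, so faster hardware only buys a handful of extra bits; that is the sense in which the search space is exponentially large.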

↑ comment by Strange7 · 2013-03-22T13:13:33.594Z · LW(p) · GW(p)

More specifically, fast forward to 2080, when any college kid with $200 to spend (in equivalent 2012 dollars) can purchase enough computing power

If computing power is that much cheaper, it will be because tremendous resources, including but certainly not limited to computing power, have been continuously devoted over the intervening decades to making it cheaper. There will be correspondingly fewer yet-undiscovered insights for a seed AI to exploit in the course of its attempted takeoff.

↑ comment by Shmi (shminux) · 2012-05-12T17:31:52.117Z · LW(p) · GW(p)

Or am I missing something new here?

My point is that either the Obj 2 holds, or tools are equivalent to agents. If one thinks that the latter is true (EY doesn't), then one should work on proving it. I have no opinion on whether it's true or not (I am not a domain expert).

↑ comment by TheOtherDave · 2012-05-11T00:37:32.267Z · LW(p) · GW(p)

If my comment here correctly captures what is meant by "tool mode" and "agent mode", then it seems to follow that AGI running in tool mode is no safer than the person using it.

If that's the case, then an AGI running in tool mode is safer than an AGI running in agent mode if and only if agent mode is less trustworthy than whatever person ends up using the tool.

Are you assuming that's true?

Replies from: shminux, scav

↑ comment by Shmi (shminux) · 2012-05-11T02:02:15.409Z · LW(p) · GW(p)

What you presented there (and here) is another theorem, something that should be proved (and published, if it hasn't been yet). If true, this gives an estimate on how dangerous a non-agent AGI can be. And yes, since we have had a lot of time to study people and no time at all to study AGI, I am guessing that an AGI is potentially much more dangerous, because so little is known. Or at least that seems to be the whole point of the goal of developing provably friendly AI.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:32:38.592Z · LW(p) · GW(p)

What you presented there (and here) is another theorem

What? It sounds like a common-sensical¹ statement about tools in general and human nature, but not at all like something which could feasibly be expressed in mathematical form.

¹ This doesn't mean it's necessarily true, though.

↑ comment by scav · 2012-05-11T09:43:40.030Z · LW(p) · GW(p)

No, because a person using a dangerous tool is still just a person, with limited speed of cognition, limited lifespan, and no capacity for unlimited self-modification.

A crazy dictator with a super-capable tool AI that tells him the best strategy to take over the world is still susceptible to assassination, and his plan no matter how clever cannot unfold faster than his victims are able to notice and react to it.

Replies from: Strange7, TheOtherDave

↑ comment by Strange7 · 2013-03-22T13:52:56.916Z · LW(p) · GW(p)

I suspect a crazy dictator with a super-capable tool AI would have unusually good counter-assassination plans, simplified by the reduced need for human advisors and managers of imperfect loyalty. Likewise, a medical expert system could provide gains to lifespan, particularly if it were backed up by the resources a paranoid megalomaniac in control of a small country would be willing to throw at a major threat.

↑ comment by TheOtherDave · 2012-05-11T12:19:16.026Z · LW(p) · GW(p)

Tool != Oracle.

At least, not by my understanding of tool.

My understanding of a supercapable tool AI is one that takes over the world if a crazy dictator directs it to, just like my understanding of a can opener tool is one that opens a can at my direction, rather than one that gives me directions on how to open a can.

Presumably it also augments the dictator's lifespan, cognition, etc. if she asks, insofar as it's capable of doing so.

More generally, my understanding of these concepts is that the only capability that a tool AI lacks that an agent AI has is the capability of choosing goals to implement. So, if we're assuming that an agent AI would be capable of unlimited self-modification in pursuit of its own goals, I conclude that a corresponding tool AI is capable of unlimited self-modification in pursuit of its agent's goals. It follows that assuming that a tool AI is not capable of augmenting its human agent in accordance with its human agent's direction is not safe.

(I should note that I consider a capacity for unlimited self-improvement relatively unlikely, for both tool and agent AIs. But that's beside my point here.)

Agreed that a crazy dictator with a tool that will take over the world for her is safer than an agent capable of taking over the world, if only because the possibility exists that the tool can be taken away from her and repurposed, and it might not occur to her to instruct it to prevent anyone else from taking it or using it.

I stand by my statement that such a tool is no safer than the dictator herself, and that an AGI running in such a tool mode is safer than that AGI running in agent mode only if the agent mode is less trustworthy than the crazy dictator.

Replies from: abramdemski

↑ comment by abramdemski · 2012-05-12T07:09:05.093Z · LW(p) · GW(p)

This seems to propose an alternate notion of 'tool' than the one in the article.

I agree with "tool != oracle" for the article's definition.

Using your definition, I'm not sure there is any distinction between tool and agent at all, as per this comment.

I do think there are useful alternative notions to consider in this area, though, as per this comment.

And I do think there is a terminology issue. Previously I was saying "autonomous AI" vs "non-autonomous".

↑ comment by chaosmage · 2012-05-14T10:58:36.727Z · LW(p) · GW(p)

How about this: An agent with a very powerful tool is indistinguishable from a very powerful agent.

↑ comment by Shmi (shminux) · 2012-05-11T00:13:57.575Z · LW(p) · GW(p)

↑ comment by MarkusRamikin · 2012-05-10T19:59:51.736Z · LW(p) · GW(p)

Wow, I'm blown away by Holden Karnofsky, based on this post alone. His writing is eloquent, non-confrontational and rational. It shows that he spent a lot of time constructing mental models of his audience and anticipated its reaction. Additionally, his intelligence/ego ratio appears to be through the roof.

Agreed. I normally try not to post empty "me-too" replies; the upvote button is there for a reason. But now I feel strongly enough about it that I will: I'm very impressed with the good will and effort and apparent potential for intelligent conversation in HoldenKarnofsky's post.

Now I'm really curious as to where things will go from here. With how limited my understanding of AI issues is, I doubt a response from me would be worth HoldenKarnofsky's time to read, so I'll leave that to my betters instead of adding more noise. But yeah. Seeing SI ideas challenged in such a positive, constructive way really got my attention. Looking forward to the official response, whatever it might be.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:34:24.531Z · LW(p) · GW(p)

Agreed. I normally try not to post empty "me-too" replies; the upvote button is there for a reason. But now I feel strongly enough about it that I will: I'm very impressed with the good will and effort and apparent potential for intelligent conversation in HoldenKarnofsky's post.

“the good will and effort and apparent potential for intelligent conversation” is more information than an upvote, IMO.

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-05-11T09:00:28.798Z · LW(p) · GW(p)

Right, I just meant shminux said more or less the same thing before me. So normally I would have just upvoted his comment.

↑ comment by dspeyer · 2012-05-11T02:47:26.396Z · LW(p) · GW(p)

Any sufficiently advanced tool is indistinguishable from [an] agent.

Let's see if we can use concreteness to reason about this a little more thoroughly...

As I understand it, the nightmare looks something like this. I ask Google SuperMaps for the fastest route from NYC to Albany. It recognizes that computing this requires traffic information, so it diverts several self-driving cars to collect real-time data. Those cars run over pedestrians who were irrelevant to my query.

The obvious fix: forbid SuperMaps to alter anything outside of its own scratch data. It works with the data already gathered. Later a Google engineer might ask it what data would be more useful, or what courses of action might cheaply gather that data, but the engineer decides what if anything to actually do.

This superficially resembles a box, but there's no actual box involved. The AI's own code forbids plans like that.
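As a very rough sketch of "works with the data already gathered and alters nothing outside its own scratch data": the planner below only reads a frozen snapshot and writes only to local variables. Everything here is hypothetical, including the travel times and the function name, and `MappingProxyType` is just a read-only view for illustration, not a real security boundary.

```python
import heapq
from types import MappingProxyType


def fastest_route(travel_minutes, start, goal):
    """Dijkstra over a read-only {(src, dst): minutes} snapshot."""
    # Scratch data: the only state this function ever writes to.
    best = {start: 0.0}
    previous = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, place = heapq.heappop(frontier)
        if place == goal:
            break
        if cost > best.get(place, float("inf")):
            continue  # stale queue entry
        for (src, dst), minutes in travel_minutes.items():
            if src != place:
                continue
            new_cost = cost + minutes
            if new_cost < best.get(dst, float("inf")):
                best[dst] = new_cost
                previous[dst] = place
                heapq.heappush(frontier, (new_cost, dst))
    if goal not in best:
        return None
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    return list(reversed(route)), best[goal]


# Made-up travel times; the snapshot cannot be modified through this view.
snapshot = MappingProxyType({
    ("NYC", "Newburgh"): 70,
    ("Newburgh", "Albany"): 90,
    ("NYC", "Albany"): 170,
})

print(fastest_route(snapshot, "NYC", "Albany"))
```

The function returns a plan and nothing else; deciding whether to gather better data, or to act on the route at all, stays with whoever called it.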

But that's for a question-answering tool. Let's take another scenario:

I tell my super-intelligent car to take me to Albany as fast as possible. It sends emotionally manipulative emails to anyone else who would otherwise be on the road encouraging them to stay home.

I don't see an obvious fix here.

So the short answer seems to be that it matters what the tool is for. A purely question-answering tool would be extremely useful, but not as useful as a general purpose one.

Could humans with an oracular super-AI police the development and deployment of active super-AIs?

Replies from: shminux, abramdemski

↑ comment by Shmi (shminux) · 2012-05-11T04:49:57.166Z · LW(p) · GW(p)

I tell my super-intelligent car to take me to Albany as fast as possible. It sends emotionally manipulative emails to anyone else who would otherwise be on the road encouraging them to stay home.

I believe that HK's post explicitly characterizes anything active like this as having agency.

Replies from: Will_Sawin, drnickbone

↑ comment by Will_Sawin · 2012-05-11T06:21:55.285Z · LW(p) · GW(p)

I think the correct objection is something you can't quite see in Google Maps. If you program an AI to do nothing but output directions, it will do nothing but output directions. If those directions are for driving, you're probably fine. If those directions are big and complicated plans for something important, that you follow without really understanding why you're doing it (and this is where most of the benefits of working with an AGI will show up), then you could unknowingly take over the world using a sufficiently clever scheme.

Also note that it would be a lot easier for the AI to pull this off if you let it tell you how to improve its own design. If recursively self-improving AI blows other AI out of the water, then tool AI is probably not safe unless it is made ineffective.

This does actually seem like it would raise the bar of intelligence needed to take over the world somewhat. It is unclear how much. The topic seems to me to be worthy of further study/discussion, but not (at least not obviously) a threat to the core of SIAI's mission.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-11T15:16:32.911Z · LW(p) · GW(p)

If those directions are big and complicated plans for something important, that you follow without really understanding why you're doing it (and this is where most of the benefits of working with an AGI will show up), then you could unknowingly take over the world using a sufficiently clever scheme.

It also helps that Google Maps does not have general intelligence, so it does not include user's reactions to its output, the consequent user's actions in the real world, etc. as variables in its model, which may influence the quality of the solution, and therefore can (and should) be optimized (within constraints given by user's psychology, etc.), if possible.

In short: Google Maps does not manipulate you, because it does not see you.

Replies from: Nebu

↑ comment by Nebu · 2012-12-31T12:04:49.902Z · LW(p) · GW(p)

A generally smart Google Maps might not manipulate you, because it has no motivation to do so.

It's hard to imagine how commercial services would work when they're powered by GAI (e.g. if you asked a GAI version of Google Maps a question that's unrelated to maps, e.g. "What's a good recipe for Cheesecake?", would it tell you that you should ask Google Search instead? Would it defer to Google Search and forward the answer to you? Would it just figure out the answer anyway, since it's generally intelligent? Would the company Google simply collapse all services into a single "Google" brand, rather than have "Google Search", "Google Mail", "Google Maps", etc, and have that single brand be powered by a single GAI? etc.) but let's stick to the topic at hand and assume there's a GAI named "Google Maps", and you're asking "How do I get to Albany?"

Given this use-case, would the engineers that developed the Google Maps GAI more likely give it a utility like "Maximize the probability that your response is truthful", or is it more likely that the utility would be something closer to "Always respond with a set of directions which are legal in the relevant jurisdictions that they are to be followed within which, if followed by the user, would cause the user to arrive at the destination while minimizing cost/time/complexity (depending on the user's preferences)"?

↑ comment by drnickbone · 2012-05-11T09:36:18.426Z · LW(p) · GW(p)

This was my thought as well: an automated vehicle is in "agent" mode.

The example also demonstrates why an AI in agent mode is likely to be more useful (in many cases) than an AI in tool mode. Compare using Google maps to find a route to the airport versus just jumping into a taxi cab and saying "Take me to the airport". Since agent-mode AI has uses, it is likely to be developed.

↑ comment by abramdemski · 2012-05-11T05:36:33.748Z · LW(p) · GW(p)

I tell my super-intelligent car to take me to Albany as fast as possible. It sends emotionally manipulative emails to anyone else who would otherwise be on the road encouraging them to stay home.

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

Perhaps we can meaningfully extend the distinction to some kinds of "semi-autonomous" tools, but that would be a different idea, wouldn't it?

(Edit) After reading more comments, "a different idea" which seems to match this kind of desire... http://lesswrong.com/lw/cbs/thoughts_on_the_singularity_institute_si/6jys

Replies from: David_Gerard, TheOtherDave

↑ comment by David_Gerard · 2012-05-11T13:57:05.506Z · LW(p) · GW(p)

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

I'm a sysadmin. When I want to get something done, I routinely come up with something that answers the question, and when it does that reliably I give it the power to do stuff on as little human input as possible. Often in daemon mode, to absolutely minimise how much it needs to bug me. Question-answerer->tool->agent is a natural progression just in process automation. (And this is why they're called "daemons".)

It's only long experience and many errors that's taught me how to do this such that the created agents won't crap all over everything. Even then I still get surprises.

Replies from: private_messaging, TheAncientGeek

↑ comment by private_messaging · 2012-05-11T15:21:42.029Z · LW(p) · GW(p)

Well, do your 'agents' build a model of the world, fidelity of which they improve? I don't think those really are agents in the AI sense, and definitely not in self improvement sense.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-11T15:28:55.073Z · LW(p) · GW(p)

They may act according to various parameters they read in from the system environment. I expect they will be developed to a level of complication where they have something that could reasonably be termed a model of the world. The present approach is closer to perceptual control theory, where the sysadmin has the model and PCT is part of the implementation. 'Cos it's more predictable to the mere human designer.

Capacity for self-improvement is an entirely different thing, and I can't see a sysadmin wanting that - the sysadmin would run any such improvements themselves, one at a time. (Semi-automated code refactoring, for example.) The whole point is to automate processes the sysadmin already understands but doesn't want to do by hand - any sysadmin's job being to automate themselves out of the loop, because there's always more work to do. (Because even in the future, nothing works.)

I would be unsurprised if someone markets a self-improving system for this purpose. For it to go FOOM, it also needs to invent new optimisations, which is presently a bit difficult.

Edit: And even a mere daemon-like automated tool can do stuff a lot of people regard as unFriendly, e.g. high frequency trading algorithms.

↑ comment by TheAncientGeek · 2014-07-05T17:45:18.272Z · LW(p) · GW(p)

It's not a natural progression in the sense of occurring without human intervention. That is rather relevant if the idea of AI safety is going to be based on using tool AI strictly as tool AI.

↑ comment by TheOtherDave · 2012-05-11T14:12:03.644Z · LW(p) · GW(p)

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

My own impression differs.

It becomes increasingly clear that "tool" in this context is sufficiently subject to different definitions that it's not a particularly useful term.

Replies from: abramdemski

↑ comment by abramdemski · 2012-05-12T07:00:27.356Z · LW(p) · GW(p)

I've been assuming the definition from the article. I would agree that the term "tool AI" is unclear, but I would not agree that the definition in the article is unclear.

↑ comment by A1987dM (army1987) · 2012-05-11T08:13:25.990Z · LW(p) · GW(p)

Any sufficiently advanced tool is indistinguishable from an agent.

I have no strong intuition about whether this is true or not, but I do intuit that if it's true, the value of sufficiently for which it's true is so high it'd be nearly impossible to achieve it accidentally.

(On the other hand the blind idiot god did ‘accidentally’ make tools into agents when making humans, so... But after all that only happened once in hundreds of millions of years of ‘attempts’.)

Replies from: othercriteria, JoshuaZ

↑ comment by othercriteria · 2012-05-11T13:04:24.818Z · LW(p) · GW(p)

the blind idiot god did ‘accidentally’ make tools into agents when making humans, so... But after all that only happened once in hundreds of millions of years of ‘attempts’.

This seems like a very valuable point. In that direction, we also have the tens of thousands of cancers that form every day, military coups, strikes, slave revolts, cases of regulatory capture, etc.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-15T12:04:06.414Z · LW(p) · GW(p)

Hmmm. Yeah, cancer. The analogy would be "sufficiently advanced tools tend to be a short edit distance away from agents", which would mean that a typo in the source code or a cosmic ray striking a CPU at the wrong place and time could have pretty bad consequences.

↑ comment by JoshuaZ · 2012-05-15T01:10:12.519Z · LW(p) · GW(p)

I have no strong intuition about whether this is true or not, but I do intuit that if it's true, the value of sufficiently for which it's true is so high it'd be nearly impossible to achieve it accidentally.

I'm not sure. The analogy might be similar to how a sufficiently complicated process is extremely likely to be able to model a Turing machine. And in this sort of context, extremely simple systems, such as the Game of Life, do end up being Turing complete. As a rough rule of thumb from a programming perspective, once some language or scripting system has more than minimal capabilities, it will almost certainly be Turing equivalent.

I don't know how good an analogy this is, but if it is a good analogy, then one maybe should conclude the exact opposite of your intuition.
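To make the "extremely simple systems" point concrete, here is a minimal Python sketch of the Game of Life update rule. The function name `life_step` and the set-of-coordinates representation are illustrative choices, not anything from the thread; the sketch only shows how small the rule set is and does not itself demonstrate Turing completeness.

```python
from collections import Counter


def life_step(live):
    """Advance a set of live (x, y) cells by one generation."""
    # Count live neighbours for every cell adjacent to at least one live cell.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next generation if it has exactly 3 live neighbours,
    # or if it is currently alive and has exactly 2.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A glider: a five-cell pattern that reappears shifted by (1, 1)
# every four generations.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = life_step(state)
print(state == {(x + 1, y + 1) for (x, y) in glider})  # True
```

The whole rule fits in one set comprehension, which is the sense in which the system is "extremely simple" even though patterns built on it can carry out arbitrary computation.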

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-15T18:33:13.404Z · LW(p) · GW(p)

A language can be Turing-complete while still being so impractical that writing a program to solve a certain problem will seldom be any easier than solving the problem yourself (exhibits A and B). In fact, I guess that a vast majority of languages in the space of all possible Turing-complete languages are like that.

(Too bad that a human's “easier” isn't the same as a superhuman AGI's “easier”.)

↑ comment by private_messaging · 2012-05-11T07:56:39.265Z · LW(p) · GW(p)

Any sufficiently advanced tool is indistinguishable from an agent.

I do not think this is even true.

Replies from: David_Gerard

↑ comment by David_Gerard · 2012-05-11T14:00:03.733Z · LW(p) · GW(p)

I routinely try to turn sufficiently reliable tools into agents wherever possible, per this comment.

I suppose we could use a definition of "agent" that implied greater autonomy in setting its own goals. But there are useful definitions that don't.

↑ comment by badger · 2012-05-10T23:28:21.768Z · LW(p) · GW(p)

If the tool/agent distinction exists for sufficiently powerful AI, then a theory of friendliness might not be strictly necessary, but still highly prudent.

Going from a tool-AI to an agent-AI is a relatively simple step of the entire process. If meaningful guarantees of friendliness turn out to be impossible, then security comes down to no one attempting to make an agent-AI when strong enough tool-AIs are available. Agency should be kept to a minimum, even with a theory of friendliness in hand, as Holden argues in objection 1. Guarantees are safeguards against the possibility of agency rather than a green light.

↑ comment by mwaser · 2012-05-10T22:07:04.179Z · LW(p) · GW(p)

If it is true (i.e. if a proof can be found) that "Any sufficiently advanced tool is indistinguishable from an agent", then any RPOP will automatically become indistinguishable from an agent once it has self-improved past our comprehension point.

This would seem to argue against Yudkowsky's contention that the term RPOP is more accurate than "Artificial Intelligence" or "superintelligence".

Replies from: Alejandro1, shminux

↑ comment by Alejandro1 · 2012-05-10T23:40:53.893Z · LW(p) · GW(p)

I don't understand; isn't Holden's point precisely that a tool AI is not properly described as an optimization process? Google Maps isn't optimizing anything in a non-trivial sense, any more than a shovel is.

Replies from: abramdemski, TheOtherDave

↑ comment by abramdemski · 2012-05-11T04:59:58.505Z · LW(p) · GW(p)

My understanding of Holden's argument was that powerful optimization processes can be run in either tool-mode or agent-mode.

For example, Google maps optimizes routes, but returns the result with alternatives and options for editing, in "tool mode".
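One way to picture the distinction is a toy route planner in Python where the same optimizer is wrapped in two ways. Everything here is a hypothetical illustration: the `ROADS` graph, the function names, and the `act` callback are assumptions made for the sketch and say nothing about how Google Maps actually works.

```python
def all_routes(graph, start, goal, path=None):
    """Yield every simple path from start to goal as a list of nodes."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt in graph.get(start, {}):
        if nxt not in path:
            yield from all_routes(graph, nxt, goal, path)


def route_cost(graph, route):
    """Sum the edge weights along a route."""
    return sum(graph[a][b] for a, b in zip(route, route[1:]))


def tool_mode(graph, start, goal, k=3):
    """Run the optimizer, but only report the k cheapest routes with costs."""
    ranked = sorted(all_routes(graph, start, goal),
                    key=lambda r: route_cost(graph, r))
    return [(route_cost(graph, r), r) for r in ranked[:k]]


def agent_mode(graph, start, goal, act):
    """Run the same optimizer, then act on the top result without review."""
    _, best = tool_mode(graph, start, goal, k=1)[0]
    for a, b in zip(best, best[1:]):
        act(a, b)  # e.g. actually drive this leg
    return best


# Hypothetical road network: edge weights are travel times.
ROADS = {
    "home": {"a": 4, "b": 2},
    "a": {"work": 3},
    "b": {"a": 1, "work": 7},
}

for cost, route in tool_mode(ROADS, "home", "work"):
    print(cost, " -> ".join(route))
```

In this sketch the optimization step is identical in both wrappers; the only difference is whether a human chooses among the reported alternatives or the system executes its top choice itself.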

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-12T22:37:54.337Z · LW(p) · GW(p)

Holden wants to build Tool-AIs that output summaries of their calculations along with suggested actions. For Google Maps, I guess this would be the distance and driving times, but how does a Tool-AI summarize more general calculations that it might do?

It could give you the expected utilities of each option, but it's hard to see how that helps if we're concerned that its utility function or EU calculations might be wrong. Or maybe it could give a human-readable description of the predicted consequences of each option, but the process that produces such descriptions from the raw calculations would seem to require a great deal of intelligence on its own (for example it might have to describe posthuman worlds in terms understandable to us), and it itself wouldn't be a "safe" Tool-AI, since the summaries produced would presumably not come with further alternative summaries and meta-summaries of how the summaries were calculated.

(My question might be tangential to your own comment. I just wanted your thoughts on it, and this seems to be the best place to ask.)
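The first option mentioned above, reporting the expected utility of each option, can be sketched in a few lines of Python. The option names, probabilities, and utilities are made-up numbers for illustration only. The sketch is meant to show the worry rather than resolve it: the report has the same shape whether or not the utilities fed into it are right.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one option."""
    return sum(p * u for p, u in outcomes)


def eu_report(options):
    """Return (option, expected utility) pairs, best first."""
    scored = [(name, expected_utility(o)) for name, o in options.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical model the tool uses internally.
MODEL = {
    "A": [(0.5, 10), (0.5, 0)],
    "B": [(0.25, 40), (0.75, -4)],
}

# Same options, but suppose the bad outcome of B is really ten times worse
# than the tool's utility function says.
CORRECTED = {
    "A": [(0.5, 10), (0.5, 0)],
    "B": [(0.25, 40), (0.75, -40)],
}

print(eu_report(MODEL))      # [('B', 7.0), ('A', 5.0)]
print(eu_report(CORRECTED))  # [('A', 5.0), ('B', -20.0)]
```

A user shown only the first report has no way to tell from the numbers that the ranking would flip under the corrected utilities, which is the difficulty the comment raises.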

Replies from: Alsadius

↑ comment by Alsadius · 2012-05-13T04:08:02.803Z · LW(p) · GW(p)

The point is that we don't want it to be a black box - we want to be able to get inside its head, so to speak.

(Of course, we can't do that with humans, and that hasn't stopped us, but it's still a nice goal)

↑ comment by TheOtherDave · 2012-05-11T00:13:51.711Z · LW(p) · GW(p)

Honestly, this whole tool/agent distinction seems tangential to me.

Consider two systems, S1 and S2.

S1 comprises the following elements:
a) a tool T, which when used by a person to achieve some goal G, can efficiently achieve G
b) a person P, who uses T to efficiently achieve G.

S2 comprises a non-person agent A which achieves G efficiently.

I agree that A is an agent and T is not an agent, and I agree that T is a tool, and whether A is a tool seems a question not worth asking. But I don't quite see why I should prefer S1 to S2.

Surely the important question is whether I endorse G?

Replies from: dspeyer

↑ comment by dspeyer · 2012-05-11T02:08:12.385Z · LW(p) · GW(p)

A tool+human differs from a pure AI agent in two important ways:

The human (probably) already has naturally-evolved morality, sparing us the very hard problem of formalizing that.
We can arrange for (almost) everyone to have access to the tool, allowing tooled humans to counterbalance each other.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-11T03:13:38.385Z · LW(p) · GW(p)

Well, I certainly agree that both of those things are true.

And it might be that human-level evolved moral behavior is the best we can do... I don't know. It would surprise me, but it might be true.

That said... given how unreliable such behavior is, if human-level evolved moral behavior even approximates the best we can do, it seems likely that I would do best to work towards neither T nor A ever achieving the level of optimizing power we're talking about here.

Replies from: dspeyer

↑ comment by dspeyer · 2012-05-11T03:23:45.956Z · LW(p) · GW(p)

Humanity isn't that bad. Remember that the world we live in is pretty much the way humans made it, mostly deliberately.

But my main point was that existing humanity bypasses the very hard did-you-code-what-you-meant-to problem.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-11T03:33:30.397Z · LW(p) · GW(p)

I agree with that point.

↑ comment by Shmi (shminux) · 2012-05-10T22:37:25.905Z · LW(p) · GW(p)

First, I am not fond of the term RPOP, because it constrains the space of possible intelligences to optimizers. Humans are reasonably intelligent, yet we are not consistent optimizers. Neither are current domain AIs (they have bugs that often prevent them from performing optimization consistently and predictably). That aside, I don't see how your second premise follows from the first. Just because RPOP is a subset of AI and so would be a subject of such a theorem, it does not affect in any way the (non)validity of EY's contention.

↑ comment by Bugmaster · 2012-05-10T20:18:55.711Z · LW(p) · GW(p)

I also find it likely that certain practical problems would be prohibitively difficult (if not outright impossible) to solve without an AGI of some sort. Fluent machine translation seems to be one of these problems, for example.

Replies from: army1987, Alsadius

↑ comment by A1987dM (army1987) · 2012-05-13T09:38:55.831Z · LW(p) · GW(p)

This belief is mainstream enough for Wikipedia to have an article on AI-complete.

↑ comment by Alsadius · 2012-05-13T03:57:35.315Z · LW(p) · GW(p)

Given some of the translation debates I've heard, I'm not convinced it would be possible even with AGI. You can't give a clear translation of a vague original, to name the most obvious problem.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2012-05-13T04:35:51.217Z · LW(p) · GW(p)

Is matching the vagueness of the original a reasonable goal?

Replies from: Alsadius, army1987, dlthomas

↑ comment by Alsadius · 2012-05-15T00:56:29.600Z · LW(p) · GW(p)

True, but good luck getting folks to agree on whether you'd done so.

↑ comment by A1987dM (army1987) · 2012-05-13T09:36:35.781Z · LW(p) · GW(p)

(I'm taking reasonable to mean ‘one which you would want to achieve if it were possible’.) Yes. You don't want to introduce false precision.

↑ comment by dlthomas · 2012-05-15T00:59:21.233Z · LW(p) · GW(p)

One complication here is that you ideally want it to be vague in the same ways the original was vague; I am not convinced this is always possible while still having the results feel natural/idiomatic.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-15T01:01:37.638Z · LW(p) · GW(p)

IMO it would be enough to translate the original text in such a fashion that some large proportion (say, 90%) of humans who are fluent in both languages would look at both texts and say, "meh... close enough".

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-15T02:23:47.235Z · LW(p) · GW(p)

My point was just that there's a whole lot of little issues that pull in various directions if you're striving for ideal. What is/isn't close enough can depend very much on context. Certainly, for any particular purpose something less than that will be acceptable; how gracefully it degrades no doubt depends on context, and likely won't be uniform across various types of difference.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-15T02:26:14.056Z · LW(p) · GW(p)

Agreed, but my point was that I'd settle for an AI who can translate texts as well as a human could (though hopefully a lot faster). You seem to be thinking in terms of an AI who can do this much better than a human could, and while this is a worthy goal, it's not what I had in mind.

comment by Wei Dai (Wei_Dai) · 2012-05-11T02:45:15.892Z · LW(p) · GW(p)

Is it just me, or do Luke and Eliezer's initial responses appear to send the wrong signals? From the perspective of an SI critic, Luke's comment could be interpreted as saying "for us, not being completely incompetent is worth bragging about", and Eliezer's as "we're so arrogant that we've only taken two critics (including Holden) seriously in our entire history". These responses seem suboptimal, given that Holden just complained about SI's lack of impressive accomplishments, and being too selective about whose feedback to take seriously.

Replies from: Nick_Beckstead, Furcas, lukeprog, Will_Newsome, magfrump, thomblake, ChrisHallquist, ciphergoth, army1987, private_messaging

↑ comment by Nick_Beckstead · 2012-05-11T03:56:21.930Z · LW(p) · GW(p)

While I have sympathy with the complaint that SI's critics are inarticulate and often say wrong things, Eliezer's comment does seem to be indicative of the mistake Holden and Wei Dai are describing. Most extant presentations of SIAI's views leave much to be desired in terms of clarity, completeness, concision, accessibility, and credibility signals. This makes it harder to make high quality objections. I think it would be more appropriate to react to poor critical engagement more along the lines of "We haven't gotten great critics. That probably means that we need to work on our arguments and their presentation," and less along the lines of "We haven't gotten great critics. That probably means that there's something wrong with the rest of the world."

Replies from: ChrisHallquist, lukeprog, Nick_Beckstead

↑ comment by ChrisHallquist · 2012-05-11T04:04:08.286Z · LW(p) · GW(p)

This. I've been trying to write something about Eliezer's debate with Robin Hanson, but the problem I keep running up against is that Eliezer's points are not clearly articulated at all. Even making my best educated guesses about what's supposed to go in the gaps in his arguments, I still ended up with very little.

Replies from: jacob_cannell, private_messaging

↑ comment by jacob_cannell · 2012-05-17T09:04:05.589Z · LW(p) · GW(p)

Have the key points of that 'debate' subsequently been summarized or clarified on LW? I found that debate exasperating in that Hanson and EY were mainly talking past each other and couldn't seem to home in on their core disagreements.

I know it generally has to do with hard takeoff / recursive self-improvement vs more gradual EM revolution, but that's not saying all that much.

Replies from: Kaj_Sotala

↑ comment by Kaj_Sotala · 2012-05-17T19:13:22.581Z · LW(p) · GW(p)

I'm in the process of writing a summary and analysis of the key arguments and points in that debate.

The most recent version runs at 28 pages - and that's just an outline.

Replies from: somervta, jacob_cannell

↑ comment by somervta · 2013-01-17T09:02:44.856Z · LW(p) · GW(p)

If you need help with grunt work, please send me a message. If (as I suspect is the case) not, then good luck!

Replies from: Kaj_Sotala

↑ comment by Kaj_Sotala · 2013-01-18T07:29:27.876Z · LW(p) · GW(p)

Thanks, I'm fine. I posted a half-finished version here, and expect to do some further refinements soon.

↑ comment by jacob_cannell · 2012-05-17T23:14:36.074Z · LW(p) · GW(p)

Awesome, look forward to it. I'd offer to help but I suspect that wouldn't really help. I'll just wax enthusiastic.

↑ comment by private_messaging · 2012-05-17T07:08:08.140Z · LW(p) · GW(p)

This. Well, the issue is the probability that it's just gaps. Ultimately, it's the sort of thing that would only constitute a weak argument from authority iff the speaker had very very impressive accomplishments. Otherwise you're left assuming the simplest explanation, which doesn't involve the presence of unarticulated points of any importance.

A gapless argument, like math proof, could trump authority if valid... an argument with gaps, on the other hand, is the one that is very prone to being trumped.

↑ comment by lukeprog · 2012-05-11T19:21:31.459Z · LW(p) · GW(p)

Agree with all this.

↑ comment by Nick_Beckstead · 2012-05-11T05:11:05.797Z · LW(p) · GW(p)

In fairness I should add that I think Luke M agrees with this assessment and is working on improving these arguments/communications.

↑ comment by Furcas · 2012-05-11T03:15:54.114Z · LW(p) · GW(p)

Luke isn't bragging, he's admitting that SI was/is bad but pointing out it's rapidly getting better. And Eliezer is right, criticisms of SI are usually dumb. Could their replies be interpreted the wrong way? Sure, anything can be interpreted in any way anyone likes. Of course Luke and Eliezer could have refrained from posting those replies and instead posted carefully optimized responses engineered to send nothing but extremely appealing signals of humility and repentance.

But if they did turn themselves into politicians, we wouldn't get to read what they actually think. Is that what you want?

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-11T08:30:50.991Z · LW(p) · GW(p)

Luke isn't bragging, he's admitting that SI was/is bad but pointing out it's rapidly getting better.

But the accomplishments he listed (e.g., having a strategic plan, website redesign) are of the type that Holden already indicated to be inadequate. So why the exhaustive listing, instead of just giving a few examples to show SI is getting better and then either agreeing that they're not yet up to par, or giving an argument for why Holden is wrong? (The reason I think he could be uncharitably interpreted as bragging is that he would more likely exhaustively list the accomplishments if he was proud of them, instead of just seeing them as fixes to past embarrassments.)

And Eliezer is right, criticisms of SI are usually dumb.

I'd have no problem with "usually" but "all except two" seems inexcusable.

But if they did turn themselves into politicians, we wouldn't get to read what they actually think. Is that what you want?

Do their replies reflect their considered, endorsed beliefs, or were they just hurried remarks that may not say what they actually intended? I'm hoping it's the latter...

Replies from: Kaj_Sotala, lukeprog

↑ comment by Kaj_Sotala · 2012-05-11T10:10:04.533Z · LW(p) · GW(p)

But the accomplishments he listed (e.g., having a strategic plan, website redesign) are of the type that Holden already indicated to be inadequate. So why the exhaustive listing, instead of just giving a few examples to show SI is getting better and then either agreeing that they're not yet up to par, or giving an argument for why Holden is wrong?

Presume that SI is basically honest and well-meaning, but possibly self-deluded. In other words, they won't outright lie to you, but they may genuinely believe that they're doing better than they really are, and cherry-pick evidence without realizing that they're doing so. How should their claims of intending to get better be evaluated?

Saying "we're going to do things better in the future" is some evidence about SI intending to do better, but rather weak evidence, since talk is cheap and it's easy to keep thinking that you're really going to do better soon but there's this one other thing that needs to be done first and we'll get started on the actual improvements tomorrow, honest.

Saying "we're going to do things better in the future, and we've fixed these three things so far" is stronger evidence, since it shows that you've already begun fixing problems and might keep up with it. But it's still easy to make a few improvements and then stop. There are far more people who try to get on a diet, follow it for a while and then quit than there are people who actually diet for as long as they initially intended to do.

Saying "we're going to do things better in the future, and here's the list of 18 improvements that we've implemented so far" is much stronger evidence than either of the two above, since it shows that you've spent a considerable amount of effort on improvements over an extended period of time, enough to presume that you actually care deeply about this and will keep up with it.

I don't have a cite at hand, but it's been my impression that in a variety of fields, having maintained an activity for longer than some threshold amount of time is a far stronger predictor of keeping up with it than having maintained it for a shorter time. E.g. many people have thought about writing a novel and many people have written the first five pages of a novel. But when considering the probability of finishing, the difference between the person who's written the first 5 pages and the person who's written the first 50 pages is much bigger than the difference between the person who's written the first 100 pages and the person who's written the first 150 pages.

There's a big difference between managing some performance once, and managing sustained performance over an extended period of time. Luke's comment is far stronger evidence of SI managing sustained improvements over an extended period of time than a comment just giving a few examples of improvement.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-12T15:49:46.714Z · LW(p) · GW(p)

I don't think there's a sharp distinction between self deception and effective lying. For the lying you have to run some process with the falsehood taken as true.

Replies from: Kaj_Sotala

↑ comment by Kaj_Sotala · 2012-05-13T07:19:52.855Z · LW(p) · GW(p)

The main difference is that if there's reason to presume that they're lying, any claims of "we've implemented these improvements" that you can't directly inspect become worthless. Right now, if they say something like "Meetings with consultants about bookkeeping/accounting; currently working with our accountant to implement best practices and find a good bookkeeper", I trust them enough to believe that they're not just making it up even though I can't personally verify it.

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2012-05-13T18:25:53.797Z · LW(p) · GW(p)

On the other hand, you can't trust their claims that these meetings are accomplishing anything.

Replies from: Kaj_Sotala

↑ comment by Kaj_Sotala · 2012-05-13T19:11:04.794Z · LW(p) · GW(p)

True.

↑ comment by lukeprog · 2012-05-11T19:26:57.844Z · LW(p) · GW(p)

I've added a clarifying remark at the end of this comment and another at the end of this comment.

↑ comment by lukeprog · 2012-05-11T19:15:52.166Z · LW(p) · GW(p)

Luke's comment could be interpreted as saying "for us, not being completely incompetent is worth bragging about"

Really? I personally feel pretty embarrassed by SI's past organizational competence. To me, my own comment reads more like "Wow, SI has been in bad shape for more than a decade. But at least we're improving very quickly."

Also, I very much agree with Beckstead on this: "Most extant presentations of SIAI's views leave much to be desired in terms of clarity, completeness, concision, accessibility, and credibility signals. This makes it harder to make high quality objections." And also this: "We haven't gotten great critics. That probably means that we need to work on our arguments and their presentation."

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-11T20:37:07.554Z · LW(p) · GW(p)

Really?

Yes, I think it at least gives a bad impression to someone, if they're not already very familiar with SI and sympathetic to its cause. Assuming you don't completely agree with the criticisms that Holden and others have made, you should think about why they might have formed wrong impressions of SI and its people. Comments like the ones I cited seem to be part of the problem.

I personally feel pretty embarrassed by SI's past organizational competence. To me, my own comment reads more like "Wow, SI has been in bad shape for more than a decade. But at least we're improving very quickly."

That's good to hear, and thanks for the clarifications you added.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T18:05:14.126Z · LW(p) · GW(p)

It's a fine line though, isn't it? Saying "huh, looks like we have much to learn, here's what we're already doing about it" is honest and constructive, but sends a signal of weakness and defensiveness to people not bent on a zealous quest for truth and self-improvement. Saying "meh, that guy doesn't know what he's talking about" would send the stronger social signal, but would not be constructive to the community actually improving as a result of the criticism.

Personally I prefer plunging ahead with the first approach. Both in the abstract for reasons I won't elaborate on, but especially in this particular case. SI is not in a position where its every word is scrutinized; it would actually be a huge win if it gets there. And if/when it does, there's a heck of a lot more damning stuff that can be used against it than an admission of past incompetence.

Replies from: Vaniver

↑ comment by Vaniver · 2012-05-20T18:16:16.400Z · LW(p) · GW(p)

sends a signal of weakness and defensiveness to people not bent on a zealous quest for truth and self-improvement.

I do not see why this should be a motivating factor for SI; to my knowledge, they advertise primarily to people who would endorse a zealous quest for truth and self-improvement.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T18:25:30.743Z · LW(p) · GW(p)

That subset of humanity holds considerably less power, influence and visibility than its counterpart; resources that could be directed to AI research and for the most part aren't. Or in three words: Other people matter. Assuming otherwise would be a huge mistake.

I took Wei_Dai's remarks to mean that Luke's response is public, and so can reach the broader public sooner or later; and when examined in a broader context, that it gives off the wrong signal. My response was that this was largely irrelevant, not because other people don't matter, but because of other factors outweighing this.

↑ comment by Will_Newsome · 2012-05-11T03:47:13.877Z · LW(p) · GW(p)

Eliezer's comment makes me think that you, specifically, should consider collecting your criticisms and putting them in Main where Eliezer is more likely to see them and take the time to seriously consider them.

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-12T18:44:22.825Z · LW(p) · GW(p)

I replied here.

↑ comment by magfrump · 2012-05-11T04:50:01.015Z · LW(p) · GW(p)

Luke's comment addresses the specific point that Holden made about changes in the organization given the change in leadership.

Holden said:

I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years, particularly regarding the last couple of statements listed above. That said, SI is an organization and it seems reasonable to judge it by its organizational track record, especially when its new leadership is so new that I have little basis on which to judge these staff.

Luke attempted to provide (for the reader) a basis on which to judge these staff members.

Eliezer's response was... characteristic of Eliezer? And also very short and coming at a busy time for him.

Replies from: Nebu

↑ comment by Nebu · 2012-12-31T12:15:42.784Z · LW(p) · GW(p)

Eliezer's response was... characteristic of Eliezer? And also very short and coming at a busy time for him.

I think that's Wei_Dai's point, that these "characteristic" replies are fine if you're used to him, but are bad if you're not.

Replies from: magfrump

↑ comment by magfrump · 2012-12-31T19:26:19.787Z · LW(p) · GW(p)

Yeah I mean, as time goes on I think more and more of Eliezer as being kind of a jerk. I thought Luke's post was good, and Eliezer's wasn't, but I also expected longer posts to be forthcoming (which they were).

↑ comment by thomblake · 2012-05-11T19:34:11.643Z · LW(p) · GW(p)

I think it's unfair to take Eliezer's response as anything other than praise for this article. He noted already that he did not have time to respond properly.

And why even point out that a human's response to anything is "suboptimal"? It will be notable when a human does something optimal.

Replies from: faul_sname

↑ comment by faul_sname · 2012-05-11T22:22:58.233Z · LW(p) · GW(p)

We do, on occasion, come up with optimal algorithms for things. Also, "suboptimal" usually means "I can think of several better solutions off the top of my head", not "This solution is not maximally effective".

↑ comment by ChrisHallquist · 2012-05-11T03:58:27.300Z · LW(p) · GW(p)

I read Luke's comment just as "I'm aware these are issues and we're working on it." I didn't read him as "bragging" about the ones that have been solved. Eliezer's... I see the problem with. I initially read it as just commending Holden on his high-quality article (which I agree was high-quality), but I can see it being read as backhanded at anyone else who's criticized SIAI.

↑ comment by Paul Crowley (ciphergoth) · 2012-05-11T06:34:15.871Z · LW(p) · GW(p)

Are there other specific critiques you think should have made Eliezer's list, or is it that you think he should not have drawn attention to their absence?

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-11T07:39:41.349Z · LW(p) · GW(p)

Are there other specific critiques you think should have made Eliezer's list, or is it that you think he should not have drawn attention to their absence?

Many of Holden's criticisms have been made by others on LW already. He quoted me in Objection 1. Discussions of whether Tool-AI and Oracle-AI are or are not safe have occurred numerous times. Here's one that I was involved in. Many people have criticized Eliezer/SI for not having sufficiently impressive accomplishments. Cousin_it and Silas Barta have questioned whether the rationality techniques being taught by SI (and now the rationality org) are really effective.

↑ comment by A1987dM (army1987) · 2012-05-11T08:39:33.341Z · LW(p) · GW(p)

I kind-of agree about Eliezer's comment, but Luke's doesn't sound like that to me.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:41:22.817Z · LW(p) · GW(p)

Retracted. I've just re-read Eliezer's comment more calmly, and it's not that bad either.

↑ comment by private_messaging · 2012-05-11T07:13:02.044Z · LW(p) · GW(p)

It's the correct signals. The incompetents inherently signal incompetence; competence can't be faked beyond a superficial level (and faking competence is all about signalling that you are sure you are competent). The lack of feedback is inherent in the assumption behind 'we are sending wrong signal' rather than 'maybe, we really are incompetent'.

comment by paulfchristiano · 2012-05-10T17:16:26.303Z · LW(p) · GW(p)

Thanks for taking the time to express your views quite clearly--I think this post is good for the world (even with a high value on your time and SI's fundraising ability), and that norms encouraging this kind of discussion are a big public good.

I think the explicit objections 1-3 are likely to be addressed satisfactorily (in your judgment) by less than 50,000 words, and that this would provide a good opportunity for SI to present sharper versions of the core arguments---part of the problem with existing materials is certainly that it is difficult and unrewarding to respond to a nebulous and shifting cloud of objections. A lot of what you currently view as disagreements with SI's views may get shifted to doubts about SI being the right organization to back, which probably won't get resolved by 50,000 words.

comment by lukeprog · 2012-05-11T22:13:23.370Z · LW(p) · GW(p)

This post is highly critical of SIAI — both of its philosophy and its organizational choices. It is also now the #1 most highly voted post in the entire history of LessWrong — higher than any posts by Eliezer or myself.

I shall now laugh harder than ever when people try to say with a straight face that LessWrong is an Eliezer-cult that suppresses dissent.

Replies from: Eliezer_Yudkowsky, JackV, pleeppleep, MarkusRamikin, army1987, brazil84, MarkusRamikin, Robin, private_messaging, XiXiDu, None, None

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-12T14:36:01.873Z · LW(p) · GW(p)

Either I promoted this and then forgot I'd done so, or someone else promoted it - of course I was planning to promote it, but I thought I'd planned to do so on Tuesday after the SIAIers currently running a Minicamp had a chance to respond, since I expected most RSS subscribers to the Promoted feed to read comments only once (this is the same reason I wait a while before promoting e.g. monthly quotes posts). On the other hand, I certainly did upvote it the moment I saw it.

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-12T17:23:12.488Z · LW(p) · GW(p)

Original comment now edited; I wasn't aware anyone besides you might be promoting posts.

↑ comment by JackV · 2012-05-12T09:29:41.141Z · LW(p) · GW(p)

I agree (as a comparative outsider) that the polite response to Holden is excellent. Many (most?) communities -- both online communities and real-world organisations, especially long-standing ones -- are not good at it for lots of reasons, and I think the measured response of evaluating and promoting Holden's post is exactly what LessWrong members would hope LessWrong could do, and they showed it succeeded.

I agree that this is good evidence that LessWrong isn't just an Eliezer-cult. (The true test would be if Eliezer and another long-standing poster were dismissive to the post, and then other people persuaded them otherwise. In fact, maybe people should roleplay that or something, just to avoid getting stuck in an argument-from-authority trap, but that's a silly idea. Either way, the fact that other people spoke positively, and Eliezer and other long-standing posters did too, is a good thing.)

However, I'm not sure it's as uniquely a victory for the rationality of LessWrong as it sounds. In response to srdiamond, Luke quoted tenlier saying "[Holden's] critique mostly consists of points that are pretty persistently bubbling beneath the surface around here, and get brought up quite a bit. Don't most people regard this as a great summary of their current views, rather than persuasive in any way?" To me, that suggests that Holden did a really excellent job expressing these views clearly and persuasively. However, it suggests that previous people had tried to express something similar, but it hadn't been expressed well enough to be widely accepted, and people reading had failed to sufficiently apply the dictum of "fix your opponents' arguments for them". I'm not sure if that's true (it's certainly not automatically true), but I suspect it might be. What do people think?

If there's any truth to it, it suggests one good answer to the recent post http://lesswrong.com/lw/btc/how_can_we_get_more_and_better_lw_contrarians (whether that was desirable in general or not): as a rationalist exercise, someone familiar with the community and good at writing rationally could take a survey of contrarian views on the topic that people in the community may have had but not been able to express, and, without worrying about showmanship like pretending to believe it themselves, just say "I think what some people think is [well-expressed argument]. Do you agree that's fair? If so, do I and other people think they have a point?" Whether or not that argument is right, it's still good to engage with it if many people are thinking it.

↑ comment by pleeppleep · 2012-05-12T17:30:48.301Z · LW(p) · GW(p)

Third highest now. Eliezer just barely gets into the top 20.

↑ comment by MarkusRamikin · 2012-05-17T07:56:56.387Z · LW(p) · GW(p)

It is also now the 3rd most highly voted post

1st.

At this point even I am starting to be confused.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-17T16:30:45.183Z · LW(p) · GW(p)

Can you articulate the nature of your confusion?

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-05-17T16:46:56.363Z · LW(p) · GW(p)

I suppose it's that I naively expect, when opening the list of top LW posts ever, to see ones containing the most impressive or clever insights into rationality.

Not that I don't think Holden's post deserves a high score for other reasons. While I am not terribly impressed with his AI-related arguments, the post is of the very highest standards of conduct, of how to have a disagreement that is polite and far beyond what is usually named "constructive".

Replies from: TheOtherDave, aceofspades

↑ comment by TheOtherDave · 2012-05-17T17:17:18.661Z · LW(p) · GW(p)

(nods) Makes sense.

My own primary inference from the popularity of this post is that there's a lot of uncertainty/disagreement within the community about the idea that creating an AGI without an explicit (and properly tuned) moral structure constitutes significant existential risk, but that the social dynamics of the community cause most of that uncertainty/disagreement to go unvoiced most of the time.

Of course, there's lots of other stuff going on as well that has little to do with AGI or existential risk, and a lot to do with the social dynamics of the community itself.

Replies from: None

↑ comment by [deleted] · 2012-06-14T13:38:11.937Z · LW(p) · GW(p)

Maybe. I upvoted it because it will have (and has had) the effect of improving SI's chances.

↑ comment by aceofspades · 2012-06-07T20:36:37.221Z · LW(p) · GW(p)

Some people who upvoted the post may think it is one of the best-written and most important examples of instrumental rationality on this site.

↑ comment by A1987dM (army1987) · 2012-05-12T09:43:31.417Z · LW(p) · GW(p)

I wish I could upvote this ten times.

↑ comment by brazil84 · 2013-02-13T23:20:00.490Z · LW(p) · GW(p)

Well perhaps the normal practice is cult-like and dissent-suppressing and this is an atypical break. Kind of like the fat person who starts eating salad instead of nachos while he watches football. And congratulates himself on his healthy eating even though he is still having donuts for breakfast and hamburgers and french fries for lunch.

Seems to me the test for suppression of dissent is not when a high-status person criticizes. The real test is when someone with medium or low status speaks out.

And my impression is that lesswrong does have problems along these lines. Not as bad as other discussion groups, but still.

↑ comment by MarkusRamikin · 2012-05-12T10:17:46.750Z · LW(p) · GW(p)

5th

Looks like 3rd now. As impressed as I am with the post, at this point I'm a little surprised.

↑ comment by Robin · 2012-05-15T11:30:17.383Z · LW(p) · GW(p)

But LW isn't reflective of SI; most of the people who voted on this article have no affiliation with SI. So the high number of upvotes is less reflective of SI welcoming criticism than of LW being dissatisfied with the organization of SI.

Furthermore, this post's criticism of Eliezer's research is less strong than its criticism of SI's organization. SI has always been somewhat open to criticism of its organizational structure, and many of SI's current leaders have criticized the organizational structure at some point. But those who criticize Eliezer's research do not manage to rise in SI's research division and generally aren't well received even on LW (Roko).

Lastly, laughing at somebody when they call your organization a cult is not a convincing argument; they're more likely to think of your organization as a cult (or at least they will think you are arrogant).

↑ comment by private_messaging · 2012-05-12T07:16:58.292Z · LW(p) · GW(p)

How's about you also have a critical discussion of 'where can we be wrong and how do we make sure we are actually competent' and 'can we figure out what the AI will actually do, using our tools?' instead of 'how do we communicate our awesomeness better' and 'are we communicating our awesomeness right'?

This post is something that can't be suppressed without losing big time, and your not suppressing it is only strong evidence that you are not completely stupid (which is great).

↑ comment by XiXiDu · 2012-05-12T08:54:07.280Z · LW(p) · GW(p)

I shall now laugh harder than ever when people try to say with a straight face that LessWrong is an Eliezer-cult that suppresses dissent.

Holden does not disagree with most of the basic beliefs that SI endorses. Which I think is rather sad and why I don't view him as a real critic. And he has been very polite.

Here is the impolite version:

If an actual AI researcher would have written a similar post, someone who actually tried to build practical systems and had some economic success, not one of those AGI dreamers. If such a person would write a similar post and actually write in a way that they feel, rather than being incredibly polite, things would look very different.

The truth is that you are incredibly naive when it comes to technological progress. That recursive self-improvement is nothing more than a row of English words, a barely convincing fantasy. That expected utility maximization is practically unworkable, even for a superhuman intelligence. And that the lesswrong.com sequences are not original or important but merely succeed at drowning out all the craziness they include by a huge amount of unrelated clutter and an appeal to the rationality of the author.

What you call an "informed" critic is someone who shares most of your incredibly crazy and completely unfounded beliefs.

Worst of all, you are completely unconvincing and do not even notice it because there are so many other people who are strongly and emotionally attached to the particular science fiction scenarios that you envision.

Replies from: Dolores1984, Jonathan_Graehl, Swimmer963, Barry_Cotter

↑ comment by Dolores1984 · 2012-05-14T05:39:28.670Z · LW(p) · GW(p)

If such a person would write a similar post and actually write in a way that they feel, rather than being incredibly polite, things would look very different.

I'm assuming you think they'd come in, scoff at our arrogance for a few pages, and then waltz off. Disregarding how many employed machine learning engineers also do side work on general intelligence projects, you'd probably get the same response from an automobile engineer, someone with a track record and field expertise, talking to the Wright Brothers. Thinking about new things and new ideas doesn't automatically make you wrong.

That recursive self-improvement is nothing more than a row of English words, a barely convincing fantasy.

Really? Because that's a pretty strong claim. If I knew how the human brain worked well enough to build one in software, I could certainly build something smarter. You could increase the number of slots in working memory. Tweak the part of the brain that handles intuitive math to correctly deal with orders of magnitude. Improve recall to eidetic levels. Tweak the brain's handling of probabilities to be closer to the Bayesian ideal. Even those small changes would likely produce a mind smarter than any human being who has ever lived. That, plus the potential for exponential subjective speedup, is already dangerous. And that's assuming that the mind that results would see zero new insights that I've missed, which is pretty unlikely. Even if the curve bottoms out fairly quickly, after only a generation or two that's STILL really dangerous.

Worst of all, you are completely unconvincing and do not even notice it because there are so many other people who are strongly and emotionally attached to the particular science fiction scenarios that you envision.

Really makes you wonder how all those people got convinced in the first place.

Replies from: Salemicus

↑ comment by Salemicus · 2012-05-14T13:43:26.282Z · LW(p) · GW(p)

If I knew how the human brain worked well enough to build one in software, I could certainly build something smarter.

This is totally unsupported. To quote Lady Catherine de Bourgh, "If I had ever learned [to play the piano], I should have become a great proficient."

You have no idea whether the "small changes" you propose are technically feasible, or whether these "tweaks" would in fact mean a complete redesign. For all we know, if you knew how the human brain worked well enough to build one in software, you would appreciate why these changes are impossible without destroying the rest of the system's functionality.

After all, it would appear that (say) eidetic recall would provide a fitness advantage. Given that humans lack it, there may well be good reasons why.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-14T14:43:52.811Z · LW(p) · GW(p)

"totally unsupported" seems extreme. (Though I enjoyed the P&P shoutout. I was recently in a stage adaptation of the book, so it is pleasantly primed.)

What the claim amounts to is the belief that:
a) there exist good design ideas for brains that human evolution didn't implement, and
b) a human capable of building a working brain at all is capable of coming up with some of them.

A seems pretty likely to me... at least, the alternative (our currently evolved brains are the best possible design) seems so implausible as to scarcely be worth considering.

B is harder to say anything clear about, but given our experience with other evolved systems, it doesn't strike me as absurd. We're pretty good at improving the stuff we were born with.

Of course, you're right that this is evidence and not proof. It's possible that we just can't do any better than human brains for thinking, just like it was possible (but turned out not to be true) that we couldn't do any better than human legs for covering long distances efficiently.

But it's not negligible evidence.

Replies from: Salemicus

↑ comment by Salemicus · 2012-05-14T18:17:14.001Z · LW(p) · GW(p)

I don't doubt that it's possible to come up with something that thinks better than the human brain, just as we have come up with something that travels better than the human leg. But to cover long distances efficiently, people didn't start by replicating a human leg, and then tweaking it. They came up with a radically different design - e.g. the wheel.

I don't see the evidence that knowing how to build a human brain is the key step in knowing how to build something better. For instance, suppose you could replicate neuron function in software, and then scan a brain map (Robin Hanson's "em" concept). That wouldn't allow you to make any of the improvements to memory, maths, etc, that Dolores suggests. Perhaps you could make it run faster - although depending on hardware constraints, it might run slower. If you wanted to build something better, you might need to start from scratch. Or, things could go the other way - we might be able to build "minds" far better than the human brain, yet never be able to replicate a human one.

But it's not just that evidence is lacking - Dolores is claiming certainty in the lack of evidence. I really do think the Austen quote was appropriate.

Replies from: Dolores1984, TheOtherDave

↑ comment by Dolores1984 · 2012-05-14T20:40:55.343Z · LW(p) · GW(p)

To clarify, I did not mean having the data to build a neuron-by-neuron model of the brain. I meant actually understanding the underlying algorithms those slabs of neural tissue are implementing. Think less understanding the exact structure of a bird's wing, and more understanding the concept of lift.

I think, with that level of understanding, the odds that a smart engineer (even if it's not me) couldn't find something to improve seem low.

↑ comment by TheOtherDave · 2012-05-14T19:16:39.654Z · LW(p) · GW(p)

I agree that I might not need to be able to build a human brain in software to be able to build something better, as with cars and legs.

And I agree that I might be able to build a brain in software without understanding how to do it, e.g., by copying an existing one as with ems.

That said, if I understand the principles underlying a brain well enough to build one in software (rather than just copying it), it still seems reasonable to believe that I can also build something better.

↑ comment by Jonathan_Graehl · 2012-05-13T00:52:00.924Z · LW(p) · GW(p)

I agree that the tone on both sides is intentionally respectful, and that people here delude themselves if they imagine they aren't up for a bit of mockery from high status folks who don't have the patience to really engage.

I agree that we don't really know what to expect from the first program that can meaningfully improve itself (including, I suppose, its self-improvement procedure) at a faster pace than human experts working on improving it. It might not be that impressive. But it seems likely to me that it will be a big deal, if ever we get there.

But you're being vague otherwise. Name a crazy or unfounded belief.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-13T10:23:32.424Z · LW(p) · GW(p)

But you're being vague otherwise. Name a crazy or unfounded belief.

Holden asked me something similar today via mail. Here is what I replied:

You wrote in 'Other objections to SI's views':

Unlike the three objections I focus on, these other issues have been discussed a fair amount, and if these other issues were the only objections to SI's arguments I would find SI's case to be strong (i.e., I would find its scenario likely enough to warrant investment in).

It is not strong. The basic idea is that if you pull a mind at random from design space then it will be unfriendly. I am not even sure if that is true. But it is the strongest argument they have. And it is completely bogus because humans do not pull AGI's from mind design space at random.

Further, the whole case for AI risk is based on the idea that there will be a huge jump in capability at some point. Which I think is at best good science fiction, like faster-than-light propulsion, or antimatter weapons (when in doubt that it is possible in principle).

The basic fact that an AGI will most likely need something like advanced nanotechnology to pose a risk, which is itself an existential risk, hints at a conjunction fallacy. We do not need AGI to then use nanotechnology to wipe us out, nanotechnology is already enough if it is possible at all.

Anyway, it feels completely ridiculous to talk about it in the first place. There will never be a mind that can quickly and vastly improve itself and then invent all kinds of technological magic to wipe us out. Even most science fiction books avoid that because it sounds too implausible.

I have written thousands of words about all this and never got any convincing reply. So if you have any specific arguments, let me know.

They say that what I write is unconvincing. But given the amount of vagueness they use to protect their beliefs, my specific criticisms basically amount to a reductio ad absurdum. I don't even need to criticize them; they would have to support their extraordinary beliefs first or make them more specific. Yet I am able to come up with a lot of arguments that speak against the possibility they envision, without any effort and no knowledge of the relevant fields like complexity theory.

Here is a comment I received lately:

…in defining an AGI we are actually looking for a general optimization/compression/learning algorithm which when fed itself as an input, outputs a new algorithm that is better by some multiple. Surely this is at least an NP-Complete if not more problem. It may improve for a little bit and then hit a wall where the search space becomes intractable. It may use heuristics and approximations and what not but each improvement will be very hard won and expensive in terms of energy and matter. But no matter how much it tried, the cold hard reality is that you cannot compute an EXPonential Time algorithm in polynomial time unless (P=EXPTIME :S). A no self-recursive exponential intelligence theorem would fit in with all the other limitations (speed, information density, Turing, Gödel, uncertainties etc) the universe imposes.

If you were to turn IBM Watson gradually into a seed AI, at which point would it become an existential risk and why? They can't answer that at all. It is pure fantasy.

END OF EMAIL

For more see the following posts:

Is an Intelligence Explosion a Disjunctive or Conjunctive Event?
Risks from AI and Charitable Giving
Why I am skeptical of risks from AI
Implicit constraints of practical goals (including the follow-up comments that I posted.)

If you believe I don't understand the basics, see:

A Primer On Risks From AI

There is a lot more, especially in the form of comments where I talk about specifics.

Replies from: Kaj_Sotala, Swimmer963, Desrtopa, Jonathan_Graehl, Dolores1984, brahmaneya

↑ comment by Kaj_Sotala · 2012-05-13T21:57:25.171Z · LW(p) · GW(p)

The basic idea is that if you pull a mind at random from design space then it will be unfriendly. I am not even sure if that is true. But it is the strongest argument they have. And it is completely bogus because humans do not pull AGI's from mind design space at random.

I don't have the energy to get into an extended debate, but the claim that this is "the basic idea" or that this would be "the strongest argument" is completely false. A far stronger basic idea is the simple fact that nobody has yet figured out a theory of ethics that would work properly, which means that even AGIs that were specifically designed to be ethical are most likely to lead to bad outcomes. And that's presuming that we even knew how to program them exactly.

This isn't even something that you'd need to read a hundred blog posts for; it's well discussed in both "The Singularity and Machine Ethics" and "Artificial Intelligence as a Positive and Negative Factor in Global Risk". "Complex Value Systems are Required to Realize Valuable Futures", too.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-14T13:12:00.118Z · LW(p) · GW(p)

I did skim through the last paper. I am going to review it thoroughly at some point.

On first sight one of the problems is the whole assumption of AI drives. On the one hand you claim that an AI is going to follow its code, is its code (as if anyone would doubt causality). On the other hand you talk about the emergence of drives like unbounded self-protection. And if someone says that unbounded self-protection does not need to be part of an AGI, you simply claim that your definition of AGI will have those drives. Which allows you to arrive at your desired conclusion of AGI being an existential risk.

Another problem is the idea that an AGI will be a goal executor (I can't help but interpret that to be your position) when I believe that the very nature of artificial general intelligence implies the correct interpretation of "Understand What I Mean" and that "Do What I Mean" is the outcome of virtually any research. Only if you were to pull an AGI at random from mind design space could you possibly arrive at "Understand What I Mean" without "Do What I Mean".

To see why look at any software product or complex machine. Those products are continuously improved. Where "improved" means that they become better at "Understand What I Mean" and "Do What I Mean".

There is no good reason to believe that at some point that development will suddenly turn into "Understand What I Mean" and "Go Batshit Crazy And Do What I Do Not Mean".

There are other problems with the paper. I hope I will find some time to write a review soon.

One problem for me with reviewing such papers is that I doubt a lot of underlying assumptions like that there exists a single principle of general intelligence. As I see it there will never be any sudden jump in capability. I also think that intelligence and complex goals are fundamentally interwoven. An AGI will have to be hardcoded, or learn, to care about a manifold of things. No simple algorithm, given limited computational resources, will give rise to the drives that are necessary to undergo strong self-improvement (if that is possible at all).

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-14T05:18:14.974Z · LW(p) · GW(p)

Even most science fiction books avoid that because it sounds too implausible.

Not saying I particularly disagree with your other premises, but saying something can't be true because it sounds implausible is not a valid argument.

↑ comment by Desrtopa · 2012-05-14T04:21:05.283Z · LW(p) · GW(p)

It is not strong. The basic idea is that if you pull a mind at random from design space then it will be unfriendly. I am not even sure if that is true. But it is the strongest argument they have. And it is completely bogus because humans do not pull AGI's from mind design space at random.

An AI's mind doesn't have to be pulled from design space at random to be disastrous. The primary issue that the SIAI has to grapple with (based on my understanding,) is that deliberately designing an AI that does what we would want it to do, rather than fulfilling proxy criteria in ways that we would not like at all, is really difficult. Even getting one to recognize "humans" as a category in a way that would be acceptable to us is a major challenge.

Replies from: jsteinhardt

↑ comment by jsteinhardt · 2012-05-15T03:44:36.861Z · LW(p) · GW(p)

Although it's worth pointing out that this is also an obstacle to AGI, since presumably an AI that did not understand what a human was would be pretty unintelligent. So I think it's unfair to claim this as a "friendliness" issue.

Note that I do think there are some important friendliness-related problems, but, assuming I understand your objection, this is not one of them.

Replies from: Desrtopa

↑ comment by Desrtopa · 2012-05-15T03:57:15.404Z · LW(p) · GW(p)

An AI could be an extremely powerful optimizer without having a category for "humans" that mapped to our own. "Human," the way we conceive of it, is a leaky surface generalization.

A strong paperclip maximizer would understand humans as well as it had to to contend with us in its attempts to paperclip the universe, but it wouldn't care about us. And a strong optimizer programmed to maximize the values of "humans" would also probably understand us, but if we don't program into its values an actual category that maps to our conception of humans, it could perfectly well end up applying that understanding to, for example, tiling the universe with crash test dummies.

Replies from: jsteinhardt

↑ comment by jsteinhardt · 2012-05-15T04:31:59.049Z · LW(p) · GW(p)

How do you intend to build a powerful optimizer without having a method of representing (or of building a representation of) the concept of "human" (where "human" can be replaced with any complex concept, even probably paperclips)?

I agree that value specification is a hard problem. But I don't think the complexity of "human" is the reason for this, although it does rule out certain simple approaches like hard-coding values.

(Also, since your link seems to indicate you believe otherwise, I am fairly familiar with the content in the sequences. Apologies if this statement represents an improper inference.)

Replies from: Desrtopa

↑ comment by Desrtopa · 2012-05-15T04:57:16.014Z · LW(p) · GW(p)

How do you intend to build a powerful optimizer without having a method of representing (or of building a representation of) the concept of "human" (where "human" can be replaced with any complex concept, even probably paperclips)?

If a machine can learn, empirically, exactly what humans are, on the most fundamental levels, but doesn't have any values associated with them, why should it need a concept of "human?" We don't have a category that distinguishes igneous rocks that are circular and flat on one side, but we can still recognize them and describe them precisely.

Humans are an unnatural category. Whether a fetus, an individual in a persistent vegetative state, an amputee, a corpse, an em or a skin cell culture fall into the category of "human" depends on value-sensitive boundaries. It's not necessarily because humans are so complex that we can't categorize them in an appropriate manner for an AI (or at least, not just because humans are complex,) it's because we don't have an appropriate formulation of the values that would allow a computer to draw the boundaries of the category in a way we'd want it to.

(I wasn't sure how familiar you were with the sequences, but in any case I figured it can't hurt to add links for anyone who might be following along who's not familiar.)

↑ comment by Jonathan_Graehl · 2012-05-13T21:46:25.166Z · LW(p) · GW(p)

I've read most of that now, and have subscribed to your newsletter.

Reasonable people can disagree in estimating the difficulty of AI and the visibility/pace of AI progress (is it like hunting for a single breakthrough and then FOOM? etc).

I find all of your "it feels ridiculous" arguments by analogy to existing things interesting but unpersuasive.

↑ comment by Dolores1984 · 2012-05-14T04:37:40.340Z · LW(p) · GW(p)

Anyway, it feels completely ridiculous to talk about it in the first place. There will never be a mind that can quickly and vastly improve itself and then invent all kinds of technological magic to wipe us out. Even most science fiction books avoid that because it sounds too implausible.

Says the wooly mammoth, circa 100,000 BC.

Sounding silly and low status and science-fictiony doesn't actually make it unlikely to happen in the real world.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-14T18:21:58.970Z · LW(p) · GW(p)

Especially when not many people want to read a science fiction book where humanity gets quickly and completely wiped out by a superior force. Even works where humans slowly die off due to their own problems (e.g. On the Beach) are uncommon.

↑ comment by brahmaneya · 2012-05-14T04:03:44.876Z · LW(p) · GW(p)

Anyway, it feels completely ridiculous to talk about it in the first place. There will never be a mind that can quickly and vastly improve itself and then invent all kinds of technological magic to wipe us out. Even most science fiction books avoid that because it sounds too implausible

Do you acknowledge that:

1. We will some day make an AI that is at least as smart as humans?
2. Humans do try to improve their intelligence (rationality/memory training being a weak example, cyborg research being a better example, and I'm pretty sure we will soon design physical augmentations to improve our intelligence)?

If you acknowledge 1 and 2, then that implies there can (and probably will) be an AI that tries to improve itself.

Replies from: jsteinhardt

↑ comment by jsteinhardt · 2012-05-15T03:39:55.532Z · LW(p) · GW(p)

I think you missed the "quickly and vastly" part as well as the "and then invent all kinds of technological magic to wipe us out". Note I still think XiXiDu is wrong to be as confident as he is (assuming "there will never" implies >90% certainty), but if you are going to engage with him then you should engage with his actual arguments.

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-14T05:20:32.814Z · LW(p) · GW(p)

And that the lesswrong.com sequences are not original or important but merely succeed at drowning out all the craziness they include by a huge amount of unrelated clutter and an appeal to the rationality of the author.

Name three examples? (Of 'craziness' specifically... I agree that there are frequent, and probably unnecessary, "appeals to the rationality of the author".)

Replies from: None, None

↑ comment by [deleted] · 2012-05-14T05:33:13.951Z · LW(p) · GW(p)

Name three examples? (Of 'craziness' specifically... I agree that there are frequent, and probably unnecessary, "appeals to the rationality of the author".)

XiXiDu may be too modest; he has some great examples on his blog.

Replies from: None

↑ comment by [deleted] · 2012-05-14T06:18:20.994Z · LW(p) · GW(p)

One wonders when or if XiXiDu will ever get over the Roko incident. Yes, it was a weird and possibly disproportionate response, but it was also years ago.

Replies from: JoshuaZ, None

↑ comment by JoshuaZ · 2012-05-14T20:16:29.923Z · LW(p) · GW(p)

Do we have any evidence that Eliezer's attitude or approach to that sort of thing has changed since then?

Replies from: None

↑ comment by [deleted] · 2012-05-14T20:24:04.031Z · LW(p) · GW(p)

Sure. His moderation activities over the last year or so have been far more... sunglasses... moderate.

↑ comment by [deleted] · 2012-05-14T06:22:27.955Z · LW(p) · GW(p)

So said Newt Gingrich.

Replies from: None

↑ comment by [deleted] · 2012-05-14T06:27:19.606Z · LW(p) · GW(p)

Why yes, I do also believe that political figures are held to ridiculous conversational standards as well. It's a miracle they deign to talk to anyone.

↑ comment by [deleted] · 2012-05-14T17:43:05.754Z · LW(p) · GW(p)

Name three examples? (Of 'craziness' specifically... I agree that there are frequent, and probably unnecessary, "appeals to the rationality of the author".)

So, Swimmer963, are those quotes crazy enough for you? (I hope you don't ask a question and neglect to comment on the answer.) What do you think? Anomalous?

Contrary to the impression the comments might convey, the majority don't come from the Roko incident. But as to that incident, the passage of time doesn't necessarily erase the marks of character. Romney is rightfully being held, feet to fire, for a group battering of another student while they attended high school--because such sadism is a trait of character and can't be explained otherwise. How would one explain Yudkowsky's paranoia, lack of perspective, and scapegoating--other than by positing a narcissistic personality structure?

Many LWers can't draw conclusions because they eschew the only tools for that purpose: psychology and excellent fiction. And the second is more important than the first.

Replies from: Swimmer963, JoshuaZ

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-14T19:25:17.446Z · LW(p) · GW(p)

How would one explain Yudkowsky's paranoia, lack of perspective, and scapegoating--other than by positing a narcissistic personality structure?

I had in fact read a lot of those quotes before–although some of them come as a surprise, so thank you for the link. They do show paranoia and lack of perspective, and yeah, some signs of narcissism, and I would be certainly mortified if I personally ever made comments like that in public...

The Sequences as a whole do come across as having been written by an arrogant person, and that's kind of irritating, and I have to consciously override my irritation in order to enjoy the parts that I find useful, which is quite a lot. It's a simplification to say that the Sequences are just clutter, and it's extreme to call them 'craziness', too.

(Since meeting Eliezer in person, it's actually hard for me to believe that those comments were written by the same person, who was being serious about them... My chief interaction with him was playing a game in which I tried to make a list of my values, and he hit me with a banana every time I got writer's block because I was trying to be too specific, and made the Super Mario Brothers' theme song when I succeeded. It's hard making the connection that "this is the same person who seems to take himself way too seriously in his blog comments." But that's unrelated and doesn't prove anything in either direction.)

My main point is that criticizing someone who believes in a particular concept doesn't irrefutably damn that concept. You can use it as weak evidence, but not proof. Eliezer, as far as I know, isn't the only person who has thought extensively about Friendly AI and found it a useful concept to keep.

Replies from: None

↑ comment by [deleted] · 2012-05-14T22:06:08.587Z · LW(p) · GW(p)

The quotes aren't all about AI. A few:

Take metaethics, a solved problem: what are the odds that someone who still thought metaethics was a Deep Mystery could write an AI algorithm that could come up with a correct metaethics? I tried that, you know, and in retrospect it didn’t work.

Yudkowsky makes the megalomanic claim that he's solved the questions of metaethics. His solution: morality is the function that the brain of a fully informed subject computes to determine what's right. Laughable; pathologically arrogant.

Whoever knowingly chooses to save one life, when they could have saved two – to say nothing of a thousand lives, or a world – they have damned themselves as thoroughly as any murderer.

The most extreme presumptuousness about morality; insufferable moralism. Morality, as you were perhaps on the cusp of recognizing in one of your posts, Swimmer963, is a personalized tool, not a cosmic command line. See my "Why do what you "ought"?—A habit theory of explicit morality."

The preceding remark, I'll grant, isn't exactly crazy--just super obnoxious and creepy.

Science is built around the assumption that you’re too stupid and self-deceiving to just use Solomonoff induction. After all, if it was that simple, we wouldn’t need a social process of science right?

This is where Yudkowsky goes crazy autodidact bonkers. He thinks the social institution of science is superfluous, were everyone as smart as he. This means he can hold views contrary to scientific consensus in specialized fields where he lacks expert knowledge based on pure ratiocination. That simplicity in the information sense equates with parsimony is most unlikely; for one thing, simplicity is dependent on choice of language--an insight that should be almost intuitive to a rationalist. But noncrazy people may believe the foregoing; what they don't believe is that they can at the present time replace the institution of science with the reasoning of smart people. That's the absolutely bonkers claim Yudkowsky makes.

Replies from: Swimmer963, JoshuaZ, Dolores1984, Mass_Driver, thomblake

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-15T00:04:47.644Z · LW(p) · GW(p)

The quotes aren't all about AI.

I didn't say they were. I said that just because the speaker for a particular idea comes across as crazy doesn't mean the idea itself is crazy. That applies whether all of Eliezer's "crazy statements" are about AI, or whether none of them are.

Whoever knowingly chooses to save one life, when they could have saved two – to say nothing of a thousand lives, or a world – they have damned themselves as thoroughly as any murderer.

The most extreme presumptuousness about morality; insufferable moralism.

Funny, I actually agree with the top phrase. It's written in an unfortunately preachy, minister-scaring-the-congregation-by-saying-they'll-go-to-Hell style, which is guaranteed to make just about anyone get defensive and/or go "ick!" But if you accept the (very common) moral standard that if you can save a life, it's better to do it than not to do it, then the logic is inevitable that if you have the choice of saving one life or two lives, by your own metric it's morally preferable to save two lives. If you don't accept the moral standard that it's better to save one life than zero lives, then that phrase should be just as insufferable.

Science is built around the assumption that you’re too stupid and self-deceiving to just use Solomonoff induction. After all, if it was that simple, we wouldn’t need a social process of science right?

I decided to be charitable, and went and looked up the post that this was in: it's here. As far as I can tell, Eliezer doesn't say anything that could be interpreted as "science exists because people are stupid, and I'm not stupid, therefore I don't need science". He claims that scientific procedures compensate for people being unwilling to let go of their pet theories and change their minds, and although I have no idea if this goal was in the minds of the people who came up with the scientific method, it doesn't seem to be false that it accomplishes this goal.

Replies from: hairyfigment

↑ comment by hairyfigment · 2012-05-15T03:36:13.943Z · LW(p) · GW(p)

Newton definitely wrote down his version of scientific method to explain why people shouldn't take his law of gravity and just add, "because of Aristotelian causes," or "because of Cartesian mechanisms."

↑ comment by JoshuaZ · 2012-05-14T22:21:03.721Z · LW(p) · GW(p)

This is where Yudkowsky goes crazy autodidact bonkers. He thinks the social institution of science is superfluous, were everyone as smart as he. This means he can hold views contrary to scientific consensus in specialized fields where he lacks expert knowledge based on pure ratiocination.

Ok. I disagree with a large bit of the sequences on science and the nature of science. I've written a fair number of comments saying so. So I hope you will listen when I say that you are taking a strawman version of what Eliezer wrote on these issues, and it almost borders on something that I could only see someone thinking if they were trying to interpret Eliezer's words in the most negative fashion possible.

↑ comment by Dolores1984 · 2012-05-14T22:13:46.050Z · LW(p) · GW(p)

His solution: morality is the function that the brain of a fully informed subject computes to determine what's right. Laughable; pathologically arrogant.

You either didn't read that sequence carefully, or are intentionally misrepresenting it.

He thinks the social institution of science is superfluous, were everyone as smart as he.

Didn't read that sequence carefully either.

That simplicity in the information sense equates with parsimony is most unlikely; for one thing, simplicity is dependent on choice of language--an insight that should be almost intuitive to a rationalist.

You didn't read that sequence at all, and probably don't actually know what simplicity means in an information-theoretic sense.

Replies from: None, None

↑ comment by [deleted] · 2012-05-14T22:22:24.378Z · LW(p) · GW(p)

That simplicity in the information sense equates with parsimony is most unlikely; for one thing, simplicity is dependent on choice of language--an insight that should be almost intuitive to a rationalist.

You didn't read that sequence at all, and probably don't actually know what simplicity means in an information-theoretic sense.

To be fair, that sequence doesn't really answer questions about choice-of-language; it took reading some of Solomonoff's papers for me to figure out what the solution to that problem is.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-14T22:37:06.537Z · LW(p) · GW(p)

it took reading some of Solomonoff's papers for me to figure out what the solution to that problem is.

There are a variety of proposed solutions. None of them seem perfect.

Replies from: None

↑ comment by [deleted] · 2012-05-14T22:40:54.174Z · LW(p) · GW(p)

I'm referring to encoding in several different languages, which makes it progressively more implausible that choice of language matters.

I agree that's not a perfect solution, but it's good enough for me.

↑ comment by [deleted] · 2012-05-14T22:23:40.371Z · LW(p) · GW(p)

That's true; I admit I didn't read the sequence. I had a hard time struggling through the single summating essay. What I wrote was his conclusion. As Hanson wrote in the first comment to the essay I did read, Yudkowsky really should summarize the whole business in a few lines. Yudkowsky didn't get around to that, as far as I know.

The summation essay contained more than 7,000 words for the conclusion I quoted. Maybe the rest of the series contradicts what is patent in the essay I read.

I simply don't get the attraction of the sequences. An extraordinarily high ratio of filler to content; Yudkowsky seems to think that every thought along the way to his personal enlightenment is worth the public's time.

Asking that a critic read those sequences in their entirety is asking for a huge sacrifice; little is offered to show it's even close to being worth the misery of reading inept writing or the time.

Replies from: Dolores1984, Randaly, nshepperd

↑ comment by Dolores1984 · 2012-05-14T22:39:19.122Z · LW(p) · GW(p)

You know, the sequences aren't actually poorly written. I've read them all, as have most of the people here. They are a bit rambly in places, but they're entertaining and interesting. If you're having trouble with them, the problem might be on your end.

In any case, if you had read them, you'd know, for instance, that when Yudkowsky talks about simplicity, he is not talking about the simplicity of a given English sentence. He's talking about the combined complexity of a given Turing machine and the program needed to describe your hypothesis on that Turing machine.
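For readers following the language-relativity point, the standard invariance theorem for Kolmogorov complexity can be written as below. This is a textbook statement added as a sketch, not something taken from the sequences or from this thread; the symbols $U$, $V$, $p$, $x$ and $c_{U,V}$ are notation assumed here for illustration.

```latex
% K_U(x): length of the shortest program p that makes universal machine U output x.
K_U(x) = \min\{\, |p| : U(p) = x \,\}

% Invariance theorem: for any two universal machines U and V there is a constant
% c_{U,V}, depending on U and V but not on x, such that for every string x
\bigl|\, K_U(x) - K_V(x) \,\bigr| \le c_{U,V}
```

The bound says that the choice of machine changes description length by at most a fixed additive amount. It does not say that the amount is small, so it narrows the disagreement over choice of language without settling it.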

Replies from: gwern, Bugmaster, dlthomas, None, None

↑ comment by gwern · 2012-05-17T00:21:00.930Z · LW(p) · GW(p)

have most of the people here

http://lesswrong.com/lw/8p4/2011_survey_results/

89 people (8.2%) have never looked at the Sequences; a further 155 (14.2%) have only given them a quick glance. 170 people have read about 25% of the sequences, 169 (15.5%) about 50%, 167 (15.3%) about 75%, and 253 people (23.2%) said they've read almost all of them. This last number is actually lower than the 302 people who have been here since the Overcoming Bias days when the Sequences were still being written (27.7% of us).

23% for 'almost all'
39% have read > three-quarters
54% have read > half
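The cumulative figures above can be checked with a few lines of arithmetic. The sketch below uses only the three percentages quoted from the survey; the names `SHARES_TENTHS` and `cumulative_shares` are made up for this illustration, and the shares are stored as tenths of a percent so that no floating-point rounding is involved.

```python
# Percentages quoted from the 2011 survey, stored as tenths of a percent
# (232 means 23.2%) so the sums are exact integers.
SHARES_TENTHS = [
    ("almost all", 232),
    ("about 75%", 153),
    ("about 50%", 155),
]


def cumulative_shares(shares):
    """Return running totals in tenths of a percent, most-read group first."""
    totals = []
    running = 0
    for label, tenths in shares:
        running += tenths
        totals.append((label, running))
    return totals


if __name__ == "__main__":
    for label, tenths in cumulative_shares(SHARES_TENTHS):
        print(f"read at least {label}: {tenths / 10:.1f}%")
```

The running totals come to 23.2%, 38.5% and 54.0%, which match the rounded 23%, 39% and 54% listed above. Strictly, the last two lines describe people who read at least three-quarters and at least half.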

Replies from: Dolores1984

↑ comment by Dolores1984 · 2012-05-17T00:31:35.036Z · LW(p) · GW(p)

My mistake. I'll remember that in the future.

↑ comment by Bugmaster · 2012-05-14T23:01:02.808Z · LW(p) · GW(p)

In addition, there are places in the Sequences where Eliezer just states things as though he's dispensing wisdom from on high, without bothering to state any evidence or reasoning. His writing is still entertaining, of course, but still less than persuasive.

↑ comment by dlthomas · 2012-05-14T22:56:09.760Z · LW(p) · GW(p)

They are a bit rambly in places, but they're entertaining and interesting.

I also found this to be true.

↑ comment by [deleted] · 2012-05-14T22:42:40.190Z · LW(p) · GW(p)

as have most of the people here.

I'm pretty sure the 2011 survey puts this claim to the test, but I don't have the time to look it up.

↑ comment by [deleted] · 2012-05-14T23:50:27.615Z · LW(p) · GW(p)

You know, the sequences aren't actually poorly written. I've read them all, as have most of the people here. They are a bit rambly in places, but they're entertaining and interesting. If you're having trouble with them, the problem might be on your end.

The problem is partly on my end, for sure; obviously, I find rambling intolerable in Internet writing, and I find it in great abundance in the sequences. You're more tolerant of rambling, and you're entertained by Yudkowsky's. I also think he demonstrates mediocre literary skills when it comes to performances like varying his sentence structure. I don't know what you think of that. My guess is you don't much care; maybe it's a generational thing.

I'm intrigued by what enjoyment readers here get from Yudkowsky's sequences. Why do you all find interesting what I find amateurish and inept? Do we have vastly different tastes or standards, or both? Maybe it is the very prolixity that makes the writing appealing in founding a movement with religious overtones. Reading Yudkowsky is an experience comparable to reading the Bible.

As a side issue, I'm dismayed upon finding that ideas I had thought original to Yudkowsky were secondhand.

Of course I understand simplicity doesn't pertain to simplicity in English! (Or in any natural language.) I don't think you understand the language-relativity issue.

Replies from: TheOtherDave, Swimmer963, JoshuaZ, Bugmaster

↑ comment by TheOtherDave · 2012-05-15T00:43:16.146Z · LW(p) · GW(p)

If you were willing to point me to two or three of your favorite Internet writers, whom you consider reliably enjoyable and interesting and so forth, I might find that valuable for its own sake, and might also be better able to answer your question in mutually intelligible terms.

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-15T00:11:21.860Z · LW(p) · GW(p)

As a side issue, I'm dismayed upon finding that ideas I had thought original to Yudkowsky were secondhand.

Having to have original ideas is a very high standard. I doubt a single one of my posts contains a truly original idea, and I don't try–I try to figure out which ideas are useful to me, and then present why, in a format that I hope will be useful to others. Eliezer creates a lot of new catchy terms for pre-existing ideas, like "affective death spiral" for "halo effect." I like that.

His posts are also quite short, often witty, and generally presented in an easier-to-digest format than the journal articles I might otherwise have to read to encounter the not-new ideas. You apparently don't find his writing easy to digest or amusing in the same way I do.

Replies from: thomblake

↑ comment by thomblake · 2012-05-15T17:51:59.448Z · LW(p) · GW(p)

like "affective death spiral" for "halo effect."

Affective death spiral is not the same thing as the Halo effect, though the halo effect (/ horns effect) might be part of the mechanism of affective death spiral.

Replies from: Swimmer963

↑ comment by Swimmer963 (Miranda Dixon-Luinenburg) (Swimmer963) · 2012-05-15T19:52:43.470Z · LW(p) · GW(p)

Agreed... I think the Halo effect is a sub-component of an affective death spiral, and "affective death spiral" is a term unique to LW [correct me if I'm wrong!], while 'Halo effect' isn't.

↑ comment by JoshuaZ · 2012-05-15T02:18:45.650Z · LW(p) · GW(p)

As a side issue, I'm dismayed upon finding that ideas I had thought original to Yudkowsky were secondhand.

Are there specific examples? It seems to me that in most cases when he has a pre-existing idea he gives relevant sources.

Replies from: Normal_Anomaly

↑ comment by Normal_Anomaly · 2012-05-16T13:55:56.065Z · LW(p) · GW(p)

I don't know any specific examples of secondhand ideas coming off as original (indeed, he often cites experiments from the H&B literature), but there's another possible source for the confusion. Sometimes Yudkowsky and somebody else come up with ideas independently, and those aren't cited because Yudkowsky didn't know they existed at the time. Drescher and Quine are two philosophers who have been mentioned as having some of the same ideas as Yudkowsky, and I can confirm the former from experience.

↑ comment by Bugmaster · 2012-05-15T00:28:36.625Z · LW(p) · GW(p)

I'm intrigued by what enjoyment readers here get from Yudkowsky's sequences. Why do you all find interesting what I find amateurish and inept?

I find his fictional interludes quite entertaining, because they are generally quite lively, and display a decent amount of world-building -- which is one aspect of science fiction and fantasy that I particularly enjoy. I also enjoy the snark he employs when trashing opposing ideas, especially when such ideas are quite absurd. Of course, the snark doesn't make his writing more persuasive -- just more entertaining.

he demonstrates mediocre literary skills when it comes to performances like varying his sentence structure

I know I'm exposing my ignorance here, but I'm not sure what this means; can you elaborate?

↑ comment by Randaly · 2012-05-15T19:35:32.107Z · LW(p) · GW(p)

Asking that a critic read those sequences in their entirety is asking for a huge sacrifice; little is offered to show it's even close to being worth the misery of reading inept writing or the time.

Indeed, the sequences are long. I'm not sure about the others here, but I've never asked anybody to "read the sequences."

But I don't even know how to describe the arrogance required to believe that you can dismiss somebody's work as "crazy," "stupid," "megalomanic," "laughably, pathologically arrogant," "bonkers," and "insufferable" without having even read enough of what you're criticizing to get an accurate understanding of it.

ETA: Edited in response to fubarobfusco, who brought up a good point.

Replies from: fubarobfusco

↑ comment by fubarobfusco · 2012-05-15T23:00:08.515Z · LW(p) · GW(p)

That's a fully general argument against criticizing anything without having read all of it, though. And there are some things you can fairly dismiss without having read all of. For instance, I don't have to read every page on the Time Cube site to dismiss it as crazy, stupid, pathologically arrogant, and so on.

↑ comment by nshepperd · 2012-05-15T08:08:24.713Z · LW(p) · GW(p)

The reason EY wrote an entire sequence on metaethics is precisely because without the rest of the preparation people such as you who lack all that context immediately veer off course and start believing that he's asserting the existence (or non-existence) of "objective" morality, or that morality is about humans because humans are best or any other standard philosophical confusion that people automatically come up with whenever they think about ethics.

Of course this is merely a communication issue. I'd love to see a more skilled writer present EY's metaethical theory in a shorter form that still correctly conveys the idea, but it seems to be very difficult (especially since even half the people who do read the sequence still come away thinking it's moral relativism or something).

↑ comment by Mass_Driver · 2012-05-15T03:04:41.471Z · LW(p) · GW(p)

I read your post on habit theory, and I liked it, but I don't think it's an answer to the question "What should I do?"

It's interesting to say that if you're an artist, you might get more practical use out of virtue theory, and if you're a politician, you might get more practical use out of consequentialism. I'm not sure who it is that faces more daily temptations to break the rules than the rest of us; bankers, I suppose, and maybe certain kinds of computer security experts.

Anyway, saying that morality is a tool doesn't get you out of the original need to decide which lifestyle you want in the first place. Should I be an artist, or a politician, or a banker? Why? Eliezer's answer is that there are no shortcuts and no frills here; you check and see what your brain says about what you 'should' do, and that's all there is to it. This is not exactly a brilliant answer, but it may nevertheless be the best one out there. I've never yet heard a moral theory that made more sense than that, and believe me, I've looked.

It's reasonable to insist that people put their conclusions in easily digestible bullet points to convince you to read the rest of what they've written...but if, noting that there are no such bullet points, you make the decision not to read the body text -- you should probably refrain from commenting on the body text. A license to opt-out is not the same thing as a license to offer serious criticism. Eliezer may be wrong, but he's not stupid, and he's not crazy. If you want to offer a meaningful critique of his ideas, you'll have to read them first.

Replies from: None

↑ comment by [deleted] · 2012-05-15T04:10:35.453Z · LW(p) · GW(p)

but if, noting that there are no such bullet points, you make the decision not to read the body text -- you should probably refrain from commenting on the body text. A license to opt-out is not the same thing as a license to offer serious criticism. Eliezer may be wrong, but he's not stupid, and he's not crazy.

This is sound general advice, but at least one observation makes this situation exceptional: Yudkowsky's conclusions about ethics are never summarized in terms that contradict my take. I don't think your rendition, for example, contradicts mine. I'm certainly not surprised to hear his position described the way you describe it:

Anyway, saying that morality is a tool doesn't get you out of the original need to decide which lifestyle you want in the first place. Should I be an artist, or a politician, or a banker? Why? Eliezer's answer is that there are no shortcuts and no frills here; you check and see what your brain says about what you 'should' do, and that's all there is to it.

Now, I don't think the decision of whether to be an artist, politician, or banker is a moral decision. It isn't one you make primarily because of what's ethically right or wrong. To the extent you do (and in the restricted sense that you do), your prior moral habits are your only guide.

But we're looking at whether Yudkowsky's position is intellectually respectable, not whether objective morality--which he's committed to but I deny--exists. To say we look at what our brain says when we're fully informed says essentially that we seek a reflective equilibrium in solving moral problems. So far so good. But it goes further in saying brains compute some specific function that determines generally when individuals reach that equilibrium. Leaving aside that this is implausible speculation, requiring that the terms of moral judgments be hardwired--and hardwired identically for each individual--it also simply fails to answer Moore's open question, although Yudkowsky claims he has that answer. There's nothing prima facie compelling ethically about what our brains happen to tell us is moral; no reason we should necessarily follow our brains' hardwiring. I could consistently choose to consider my brain's hardwired moralisms maladaptive or even despicable holdovers from the evolutionary past that I choose to override as much as I can.

Robin Hanson actually asked the right question. If what the brain computes is moral, what does it correspond to that makes it moral? Unless you think the brain is computing a fact about the world, you can't coherently regard its computation as "accurate." But if not, what makes it special and not just a reflex?

I do feel a bit guilty about criticizing Yudkowsky without reading all of him. But he seems to express his ideas at excessive and obfuscating length, and if there were more to them, I feel somewhat confident I'd come across his answers. It isn't as though I haven't skimmed many of these essays. And his answers would certainly deserve some reflection in his summation essay.

There's no question Yudkowsky is no idiot. But he has some ideas that I think are stupid--like his "metaethics"--and he expresses them in a somewhat "crazy" manner, exuding grandiose self-confidence. Being surrounded and discussing mostly with people who agree with him is probably part of the cause.

Replies from: Furcas, Strange7

↑ comment by Furcas · 2012-05-15T05:25:01.782Z · LW(p) · GW(p)

As someone who has read Eliezer's metaethics sequence, let me say that what you think his position is, is only somewhat related to what it actually is; and also, that he has answered those of your objections that are relevant.

It's fine that you don't want to read 30+ fairly long blog posts, especially if you dislike the writing style. But then, don't try to criticize what you're ignorant about. And no, openly admitting that you haven't read the arguments you're criticizing, and claiming that you feel guilty about it, doesn't magically make it more acceptable. Or honest.

Replies from: JoshuaZ, None

↑ comment by JoshuaZ · 2012-05-15T17:35:10.923Z · LW(p) · GW(p)

One doesn't need to have read the whole Bible to criticize it. But the Bible is a fairly short work, so an even more extreme example might be better: one doesn't need to have read the entire Talmud to criticize it.

↑ comment by [deleted] · 2012-05-15T17:01:08.918Z · LW(p) · GW(p)

It's fine that you don't want to read 30+ fairly long blog posts, especially if you dislike the writing style. But then, don't try to criticize what you're ignorant about. And no, openly admitting that you haven't read the arguments you're criticizing, and claiming that you feel guilty about it, doesn't magically make it more acceptable. Or honest.

It's hardly "dishonest" to criticize a position based on a 7,000-word summary statement while admitting you haven't read the whole corpus! You're playing with words to make a moralistic debating point: dishonesty involves deceit, and everyone has been informed of the basis for my opinions.

Consider the double standard involved. Yudkowsky lambasts "philosophers" and their "confusions"--their supposedly misguided concerns with the issues other philosophers have commented on to the detriment of inquiry. Has Yudkowsky read even a single book by each of the philosophers he dismisses?

In a normal forum, participants supply the arguments supposedly missed by critics who are only partially informed. Here there are vague allusions to what the Apostle Yudkowsky (prophet of the Singularity God) "answered" without any substance. An objective reader will conclude that the Prophet stands naked; the prolixity is probably intended to discourage criticism.

Replies from: Kaj_Sotala, JoshuaZ

↑ comment by Kaj_Sotala · 2012-05-15T19:11:38.667Z · LW(p) · GW(p)

I think the argument you make in this comment isn't a bad one, but the unnecessary and unwarranted "Apostle Yudkowsky (prophet of the Singularity God)" stuff amounts to indirectly insulting the people you're talking with, and makes them far less likely to realize that you're actually also saying something sensible. If you want to get your points across, as opposed to just enjoying a feeling of smug moral superiority while getting downvoted into oblivion, I strongly recommend leaving that stuff out.

Replies from: None

↑ comment by [deleted] · 2012-05-24T17:26:04.156Z · LW(p) · GW(p)

Thanks for the advice, but my purpose—given that I'm an amoralist—isn't to enjoy a sense of moral superiority. Rather, to test a forum toward which I've felt ambivalent for several years, mainly for my benefit but also for that of any objective observers.

Strong rhetoric is often necessary in an unreceptive forum because it announces that the writer considers his criticisms fundamental. If I state the criticisms neutrally, something I've often tried, they are received as minor—like the present post. They may even be voted up, but they have little impact. Strong language is appropriate in expressing severe criticisms.

How should a rationalist forum respond to harsh criticism? It isn't rational to fall prey to the primate tendency to in-group thinking by neglecting to adjust for any sense of personal insult when the group leader is lambasted. Judging by reactions, the tendency to in-group thought is stronger here than in many forums that don't claim the mantle of rationalism. This is partly because the members are more intelligent than in most other forums, and intelligence affords more adept self-deception. This is why it is particularly important for intelligent people to be rationalists but only if they honestly strive to apply rational principles to their own thinking. Instead, rationality here serves to excuse participants' own irrationality. Participants simply accept their own tendencies to reject posts as worthless because they contain matter they find insulting. Evolutionary psychology, for instance, here serves to produce rationalizations rather than rationality. (Overcoming Bias is a still more extreme advocacy of this perversion of rationalism, although the tendency isn't expressed in formal comment policies.)

"Karma" means nothing to me except as it affects discourse; I despise even the term, which stinks of Eastern mysticism. I'm told that the karma system of incentives, which any rationalist should understand vitally affects the character of discussion, was transplanted from reddit. How is a failure to attend to the vital mechanics of discussion and incentives rational? Laziness? How could policies so essential be accorded the back seat?

Participants, I'm told, don't question the karma system because it works. A rationalist doesn't think that way. He says, "If a system of incentives introduced without forethought and subject to sound criticisms (where even its name is an insult to rationality) produces the discourse that we want, then something must be wrong with what we want!" What's wanted is the absence of any tests of ideology by fundamental dissent.

I think the argument you make in this comment isn't a bad one, but the unnecessary and unwarranted "Apostle Yudkowsky (prophet of the Singularity God)" stuff amounts to indirectly insulting the people you're talking with, and makes them far less likely to realize that you're actually also saying something sensible. If you want to get your points across, as opposed to just enjoying a feeling of smug moral superiority while getting downvoted into oblivion, I strongly recommend leaving that stuff out.

↑ comment by JoshuaZ · 2012-05-15T17:51:57.958Z · LW(p) · GW(p)

Consider the double standard involved. Yudkowsky lambasts "philosophers" and their "confusions"--their supposedly misguided concerns with the issues other philosophers have commented on to the detriment of inquiry. Has Yudkowsky read even a single book by each of the philosophers he dismisses?

Some of them are simply not great writers. Hegel, for example, is just awful; the few coherent ideas in Hegel are more usefully described by other later writers. There's also a strange aspect to this in that you are complaining about Eliezer not having read books while simultaneously defending your criticism of Eliezer's metaethics positions without having read all his posts. Incidentally, if one wants to criticize Eliezer's level of knowledge of philosophy, a better point is not so much the philosophers that he criticizes without reading, but rather the relevant philosophers he seems unaware of, many of whom would agree with some of his points. Quine and Lakatos are the most obvious ones.

Here there are vague allusions to what the Apostle Yudkowsky (prophet of the Singularity God) "answered" without any substance. An objective reader will conclude that the Prophet stands naked; the prolixity is probably intended to discourage criticism.

I strongly suspect that your comments would be responded to more positively if they didn't frequently end with this sort of extreme rhetoric that has more emotional content than rational dialogue. It is particularly a problem because on the LW interface, the up/down buttons are at the end of everything one has read, so what the last sentences say may have a disproportionate impact on whether people upvote or downvote and what they focus on in their replies.

Frankly, you have some valid points, but they are getting lost in the rhetoric. We know that you think that LW pattern matches to religion. Everyone gets the point. You don't need to repeat that every single time you make a criticism.

↑ comment by Strange7 · 2012-05-21T22:38:14.456Z · LW(p) · GW(p)

I could consistently choose to consider my brain's hardwired moralisms maladaptive or even despicable holdovers from the evolutionary past that I choose to override as much as I can.

And you would be making the decision to override with... what, your spleen?

Replies from: None

↑ comment by [deleted] · 2012-05-21T23:47:07.444Z · LW(p) · GW(p)

Another part of my brain--besides the part computing the morality function Yudkowsky posits.

Surely you can't believe Yudkowsky simply means whatever our brain decides is "moral"--and that he offers that as a solution to anything?

Replies from: Strange7, Strange7

↑ comment by Strange7 · 2012-05-22T09:36:11.485Z · LW(p) · GW(p)

I'm not saying he's right, just that your proposed alternative isn't even wrong.

↑ comment by Strange7 · 2012-05-22T06:50:22.998Z · LW(p) · GW(p)

I'm not saying he's right, I'm saying your proposed alternative isn't even wrong.

↑ comment by thomblake · 2012-05-15T17:09:20.777Z · LW(p) · GW(p)

Science is built around the assumption that you’re too stupid and self-deceiving to just use Solomonoff induction.

He thinks the social institution of science is superfluous, were everyone as smart as he.

This is obviously false. Yudkowsky does not claim to be able to do Solomonoff induction in his head.

In general, when Yudkowsky addresses humanity's faults, he is including himself.

Replies from: None

↑ comment by [deleted] · 2012-05-15T17:22:13.579Z · LW(p) · GW(p)

Point taken.

But Yudkowsky says "built around the assumption that you're too stupid... to just use ..."

If Solomonoff induction can't easily be used in place of science, why does the first sentence imply the process is simple: you just use it?

You've clarified what Yudkowsky does not mean. But what does he mean? And why is it so hard to find out? This is the way mystical sects retain their aura while actually saying little.

Replies from: nshepperd

↑ comment by nshepperd · 2012-05-15T17:38:29.857Z · LW(p) · GW(p)

"You're too stupid and self-deceiving to just use Solomonoff induction" ~ "If you were less stupid and self deceiving you'd be able to just use Solomonoff induction" + "but since you are in fact stupid and self-deceiving, instead you have to use the less elegant approximation Science"

That was hard to find out?

Replies from: None

↑ comment by [deleted] · 2012-05-15T17:43:05.909Z · LW(p) · GW(p)

Actually, yes, because of the misleading signals in the inept writing. But thank you for clarifying.

Conclusion: The argument is written in a crazy fashion, but it really is merely stupid. There is no possible measure of simplicity that isn't language relative. How could there be?

Replies from: Randaly, CuSithBell, None

↑ comment by Randaly · 2012-05-15T19:49:22.368Z · LW(p) · GW(p)

You seem to be confusing "language relative" with "non-mathematical." Kolmogorov Complexity is "language-relative," if I'm understanding you right; specifically, it's relative (if I'm using the terminology right?) to a Turing Machine. This was not relevant to Eliezer's point, so it was not addressed.

(Incidentally, this is a perfect example of you "hold{ing} views contrary to scientific consensus in specialized fields where {you} lack expert knowledge based on pure ratiocination," since Kolmogorov Complexity is "one of the fundamental concepts of theoretical computer science", you seemingly lack expert knowledge since you don't recognize these terms, and your argument seems to be based on pure ratiocination.)

↑ comment by CuSithBell · 2012-05-15T17:46:01.866Z · LW(p) · GW(p)

When I read that line for the first time, I understood it. Between our two cases, the writing was the same, but the reader was different. Thus, the writing cannot be the sole cause of our different outcomes.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-15T17:57:38.655Z · LW(p) · GW(p)

Well, if a substantial fraction of readers read something differently or can't parse it, it does potentially reflect a problem with the writing even if some of the readers, or even most readers, do read it correctly.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-15T18:02:00.134Z · LW(p) · GW(p)

Absolutely. I intended to convey that if you don't understand something, that the writing is misleading and inept is not the only possible reason. srdiamond is speaking with such confidence that I felt safe tabling further subtleties for now.

↑ comment by [deleted] · 2012-05-16T18:38:29.877Z · LW(p) · GW(p)

The philosophizing of inept, verbose writers like Yudkowsky can be safely dismissed based solely on their incompetence as writers. For a succinct defense of this contention, see my "Can bad writers be good thinkers? Part 1 of THE UNITY OF LANGUAGE AND THOUGHT" OR see the 3-part "Writing & Thought series" — all together, fewer than 3,000 words.

Replies from: gwern, None

↑ comment by gwern · 2012-05-16T19:01:44.202Z · LW(p) · GW(p)

I believe what you wrote because you used so much bolding.

Replies from: None

↑ comment by [deleted] · 2012-05-21T20:02:21.660Z · LW(p) · GW(p)

Way to deflect attention from substance to form. Exemplary rationality!

Replies from: thomblake

↑ comment by thomblake · 2012-05-21T20:10:12.204Z · LW(p) · GW(p)

I can't tell which way your sarcasm was supposed to cut.

The obvious interpretation is that you think rationality is somehow hindered by paying attention to form rather than substance, and the "exemplary rationality" was intended to be mocking.

But your comment being referenced was an argument that form has something very relevant to say about substance, so it could also be that you were actually praising gwern for practicing what you preach.

Replies from: gwern

↑ comment by gwern · 2012-05-21T20:30:41.664Z · LW(p) · GW(p)

I choose to interpret it as praise, and receive a warm fuzzy feeling.

↑ comment by [deleted] · 2012-05-16T19:20:53.809Z · LW(p) · GW(p)

I read your three-part series. Your posts did not substantiate the claim "good thinking requires good writing." Your second post slightly increased my belief in the converse claim, "good thinkers are better-than-average writers," but because the only evidence you provided was a handful of historical examples, it's not very strong evidence. And given how large the population of good thinkers, good writers, bad thinkers, and bad writers is relative to your sample, evidence for "good thinking implies good writing" is barely worth registering as evidence for "good writing implies good thinking."

↑ comment by JoshuaZ · 2012-05-14T18:12:00.909Z · LW(p) · GW(p)

Romney is rightfully being held, feet to fire, for a group battering of another student while they attended high school--because such sadism is a trait of character and can't be explained otherwise.

I was going to upvote your comment until I got to this point. Aside from the general mindkilling, this looks like the fundamental attribution error, and moreover, we all know that people do in fact mature and change. Bringing up external politics is not helpful in a field where there's already concern that AI issues may be becoming a mindkilling subject themselves on LW. Bringing up such a questionable one is even less useful.

Replies from: metaphysicist

↑ comment by metaphysicist · 2012-05-14T18:21:42.024Z · LW(p) · GW(p)

That's LW "rationality" training for you--"fundamental error of attribution" out of context--favored because it requires little knowledge and training in psychology. Such thinking would preclude any investigation of character. (And there are so many taboos! How do you all tolerate the lockstep communication required here?)

Paul Meehl, who famously studied clinical versus statistical prediction empirically, noted that even professionals, when confronted by an instance of aberrant behavior, are apt to call it within normal range when it clearly isn't. Knowledge of the "fundamental error of attribution" alone is the little bit of knowledge that's worse than total ignorance.

Ask yourself honestly whether you would ever or have ever done anything comparable to what Yudkowsky did in the Roko incident or what Romney did in the hair cutting incident.

You can't dismiss politics just because it kills some people's minds, when so much of the available information and examples come from politics. (There are other reasons, but that's the main one here.) Someone who can't be rational about politics simply isn't a good rationalist. You can't be a rationalist about the unimportant things and irrational about the important ones--yet call yourself a rationalist overall.

Replies from: NancyLebovitz, JoshuaZ, army1987

↑ comment by NancyLebovitz · 2012-05-15T03:55:23.777Z · LW(p) · GW(p)

I'm sure I wouldn't have done what Romney did, and not so sure about whether I would have done what Yudkowsky did. Romney wanted to hurt people for the fun of it. Yudkowsky was trying to keep people from being hurt, regardless of whether his choice was a good one.

Replies from: metaphysicist

↑ comment by metaphysicist · 2012-05-15T04:48:59.281Z · LW(p) · GW(p)

That's a reasonable answer.

↑ comment by JoshuaZ · 2012-05-14T20:04:36.890Z · LW(p) · GW(p)

It seems almost unfair to criticize something as a problem of LW rationality when in your second paragraph you note that professionals do the same thing.

Ask yourself honestly whether you would ever or have ever done anything comparable to what Yudkowsky did in the Roko incident or what Romney did in the hair cutting incident.

I'm not sure. A while ago, I was involved in a situation where someone wanted to put personal information of an individual up on the internet knowing that that person had an internet stalker who had a history of being a real life stalker for others. The only reason I didn't react pretty close to how Eliezer reacted in the quoted incident is that I knew that the individual in question was not going to listen to me and would if anything have done the opposite of what I wanted. In that sort of context, Eliezer's behavior doesn't seem to be that extreme. Eliezer's remarks involve slightly more caps than I think I would use in such a circumstance, but the language isn't that different.

This does connect to another issue, though: making heated comments on the internet and traumatic bullying are on different scales. The questions I ask myself about what it would take to do something similar to what Eliezer did are very different from the same questions for the Romney incident.

Your basic statement does, it seems, have some validity. One could argue that the Romney matter reflects the circumstances where he was at the time, and what was considered socially acceptable as forms of interaction or establishing dominance hierarchies. Through most of human history, that sort of behavior would probably be considered fairly tame. But this is a weak argument- even if it was due to the circumstances that Romney was in at the time, there's no question that those were his formative years, and thus could plausibly have had a permanent impact on his moral outlook.

You can't dismiss politics just because it kills some people's minds, when so much of the available information and examples come from politics.

The problem is that even as relevant examples come from politics, those are precisely the examples that people are least likely to agree actually demonstrate the intended point in question. For example, in this case, many people who aren't on the left will downplay the Romney bullying. Given that I'm someone who dislikes Romney (both in terms of personality and in terms of policy) and am not convinced that this is at all fair, using such a controversial example seems unwise. Even if one needs to use political examples, one can use examples from 10 or 15 or 30 years ago that are well known but have had their tribalness diminish in time. For example, in this context one could use a variety of examples connected to Richard Nixon.

Someone who can't be rational about politics simply isn't a good rationalist. You can't be a rationalist about the unimportant things and irrational about the important ones--yet call yourself a rationalist overall.

Well, we can acknowledge that we're better at being rational in some areas than we are in others. Frankly, I wouldn't mind and for reasons essentially similar to your remark would endorse some amount of reduction of the no-politics rule here. Where that becomes a problem is when one tries to connect politics to other potentially controversial issues.

↑ comment by A1987dM (army1987) · 2012-05-14T18:40:34.376Z · LW(p) · GW(p)

what Romney did in the hair cutting incident

What's that about? (PM me if it's still taboo.)

Replies from: Normal_Anomaly

↑ comment by Normal_Anomaly · 2012-05-16T14:01:05.105Z · LW(p) · GW(p)

When Mitt Romney was in high school, he and some friends bullied a kid who looked (and later turned out to be) homosexual. At one point, Romney and some others grabbed the guy, held him down, and cut off a bunch of his hair with scissors.

↑ comment by Barry_Cotter · 2012-05-13T09:44:20.180Z · LW(p) · GW(p)

Why do you continue to participate? Almost all of the cool stuff that high status people agree is plausible is available elsewhere.

↑ comment by [deleted] · 2012-05-11T22:45:49.965Z · LW(p) · GW(p)

The point, such as it is, would better have been left implied. Now, it's subject to explicit scrutiny, and it must be found wanting. Consider what would have happened had Yudkowsky not shown exceptional receptivity to this post: he would have blatantly proven his critics right. The knowledge and reputation of the poster is unimpeachable.

The more significant fact is that these criticisms were largely unknown to the community. As Will Newsome implied, this is because the critical posts--lacking the high-status credential of this poster--remained in discussion and were almost ignored.

The majority's intolerance for dissent is manifested mostly in its refusal to acknowledge it. Dissent is cabined to Discussion. It only gets noticed when the dissenter becomes frustrated and violates group norms. Then it gets voted down, but it still gets noticed and commented on. This is a malfunctioning reinforcement system, but maybe it's the best possible. Still, it's irrational to deny all in-group bias in LukeProg's cheerleading fashion--in an instance where the absence of evidence (here, of bias) truly does not offer anything substantial in the way of evidence of lack of bias, to elicit LukeProg's smug laughter.

After all, even the lead poster held off until now in voicing his opinion.

Replies from: lukeprog, Nornagest

↑ comment by lukeprog · 2012-05-11T23:00:36.568Z · LW(p) · GW(p)

The more significant fact is that these criticisms were largely unknown to the community.

LWer tenlier disagrees, saying:

[Holden's] critique mostly consists of points that are pretty persistently bubbling beneath the surface around here, and get brought up quite a bit. Don't most people regard this as a great summary of their current views, rather than persuasive in any way? In fact, the only effect I suspect this had on most people's thinking was to increase their willingness to listen to Karnofsky in the future if he should change his mind.

Also, you said:

Dissent is cabined to Discussion.

Luckily, evidence on the matter is easy to find. As counter-evidence I present: Self-improvement or shiny distraction, SIAI an examination, Why we can't take expected value estimates literally, Extreme rationality: it's not that great, Less Wrong Rationality and Mainstream Philosophy, and the very post you are commenting on. Many of these are among the most upvoted posts ever.

Moreover, the editors rarely move posts from Main to Discussion. The posters themselves decide whether to post in Main or Discussion.

Replies from: Rain, None

↑ comment by Rain · 2012-05-11T23:31:20.250Z · LW(p) · GW(p)

Also Should I believe what the SIAI claims? and the many XiXiDu posts linked therein, like What I would like the SIAI to publish.

↑ comment by [deleted] · 2012-05-11T23:55:04.024Z · LW(p) · GW(p)

I had a post moved from Main to Discussion just today, before it had accumulated any negative votes, so I think you're probably misinformed about editorial practices. But I don't want to use my posts as evidence; the charge of bias would be hard to surmount. What's plainly evident is that posters are reluctant to post to the Main area except by promotion.

Your evidence is unpersuasive because you don't weigh it against the evidence to the contrary. One good example to the contrary more than counter-balances it, since the point isn't that no dissent is tolerated, not even that some dissent isn't welcomed, but only that there are some irrational boundaries.

One is the quasi-ban on politics. Here is a comment that garnered almost 800 responses and was voted up 37. Why wasn't it promoted? I bitterly disagree with the poster; so I'm not biased by my views. But the point is that it is a decidedly different view, one generating great interest, but the subject would not be to the liking of the editors.

Of course, it lacked the elaborateness--dare I say, the prolixity--of a typical top-level post. But this "scholarly" requirement is part of the process of soft censorship. The post--despite my severe disagreement with it--is a more significant intellectual contribution than many of the top-level posts, such as some of the second-hand scholarship.

[And I have to add: observe that the present discussion is already being downvoted at my first comment. I predict the same for this post in record time. What does that mean?]

Replies from: Normal_Anomaly, None

↑ comment by Normal_Anomaly · 2012-05-13T03:13:11.205Z · LW(p) · GW(p)

Here is a comment that garnered almost 800 responses and was voted up 37. Why wasn't it promoted?

Can comments be promoted? Perhaps the commenter should have been encouraged to turn his comment into a top-level post, but a moderator can't just change a comment into a promoted post with the same username. Also it would have split the discussion, so people might have been reluctant to encourage that.

As for people tending to post more in Discussion than Main, I read somewhere that Discussion has more readers. I for one read Discussion almost exclusively.

↑ comment by [deleted] · 2012-05-12T00:11:40.567Z · LW(p) · GW(p)

It would advance this discussion if someone would explain the down votes. I await LukeProg's explanation of the present example of soft censorship.

Replies from: Rain, Rain

↑ comment by Rain · 2012-05-12T00:14:53.487Z · LW(p) · GW(p)

I downvoted you because you're wrong. For one, comments can't be promoted to main, only posts, and for two, plenty of opposition has garnered a great deal of upvotes, as shown by the numerous links lukeprog provided.

For example, where do you get 'almost 800 responses' from? That comment (not post) only has 32 comments below it.

Replies from: None

↑ comment by [deleted] · 2012-05-12T00:40:30.065Z · LW(p) · GW(p)

Yes, I was wrong. But my point was correct. The 781 comments applied to the Main Post. So:

The topic was popular, like I said.
The post could have been promoted!

But ask yourself, would you have been so harsh on a factual error had you agreed with the message? This is the way bias works, after all, by double standard more than outright discrimination. You could say I should have been more careful. But then, when you've learned not to expect a hearing, you're not so willing to jump the hoops. But it's your loss, if you're a rationalist and if you're losing input because dissenters find it's not worth their time.

As to LukeProg providing examples demonstrating welcoming dissent: you couldn't have considered my counter-balancing evidence when you downvoted before taking the time even to explore the post to which the cited comment belongs.

To LukeProg: have I made my point about the limits of dissent at LW?

↑ comment by Rain · 2012-05-12T00:19:38.502Z · LW(p) · GW(p)

Posts which contain factual inaccuracies along with meta-discussion of karma effects are often downvoted.

Replies from: None

↑ comment by [deleted] · 2012-05-12T00:47:26.421Z · LW(p) · GW(p)

I've addressed factual inaccuracies in another comment. But as for discussing karma effects--that wasn't extraneous whining but was at the heart of the discussion. If you downvote discussion of karma--like you did--simply for mentioning it, even where relevant, then you effectively soft-censor any discussion of karma. How is that rational?

LukeProg: What do you say about the grounds on which downvotes are issued for dissenting matter? Isn't it clear that this is a bias LW doesn't want to talk about; perhaps altogether doesn't want to discuss its own biases?

Replies from: Rain

↑ comment by Rain · 2012-05-12T01:20:24.798Z · LW(p) · GW(p)

If you downvote discussion of karma--like you did--simply for mentioning it, even where relevant, then you effectively soft-censor any discussion of karma. How is that rational?

I don't do that; I only downvote when it's combined with incorrect facts. Which I'm tempted to do for this statement: "like you did--simply for mentioning it", since you're inferring my motivations, and once again incorrect.

Replies from: None

↑ comment by [deleted] · 2012-05-12T01:29:57.989Z · LW(p) · GW(p)

Look, Rain, this is an ongoing Internet discussion. Nobody says everything precisely right. The point is that you would hardly be so severe on someone unless you disagreed strongly. You couldn't be, because nobody would satisfy your accuracy demands. The kind of nitpicking you engage in your post would ordinarily lead you to be downvoted--and you should be, although I won't commit the rudeness of so doing in a discussion.

The point wasn't that you downvote when the only thing wrong with the comment is discussion of karma. It was that you treat discussion of karma as an unconditional wrong. So you exploited weaknesses in my phrasing to ignore what I think was obviously the point--that marking down for the bare mention of karma (even if it doesn't produce a downvote in each case) is an irrational policy, when karma is at the heart of the discussion. There's no rational basis for throwing it in as an extra negative when the facts aren't right.

You're looking for trivial points to pick to downvote and to ignore the main point, which was that counting mention of karma a negative, without regard to the subject, is an irrational policy. If we were on reversed sides, your nitpicking and evasion would itself be marked down. As matters stand, you don't even realize you're acting in a biased fashion, and readers either don't know or don't care.

Is that rational? Shouldn't a rationalist community be more concerned with criticizing irrationalities in its own process?

Replies from: shminux, Rain

↑ comment by Shmi (shminux) · 2012-05-12T02:12:45.206Z · LW(p) · GW(p)

Having been a subject of both a relatively large upvote and a relatively large downvote in the last couple of weeks, I still think that the worst thing one can do is to complain about censorship or karma. The posts and comments on any forum aren't judged on their "objective merits" (because there is no such thing), but on their suitability for the forum in question. If you have been downvoted, your post deserves it by definition. You can politely inquire about the reasons, but people are not required to explain themselves. As for rationality, I question whether it is rational to post on a forum if you are not having fun there. Take it easy.

Replies from: None

↑ comment by [deleted] · 2012-05-13T14:33:28.223Z · LW(p) · GW(p)

The posts and comments on any forum aren't judged on their "objective merits" (because there is no such thing), but on their suitability for the forum in question. If you have been downvoted, your post deserves it by definition.

First, you're correct that it's irrational to post to a forum you don't enjoy. I'll work on decreasing my akrasia.

But it's hard not to comment on a non sequitur like the above. (Although probably futile because one who's really not into a persuasion effort won't do it well.) That posts are properly evaluated by suitability to the forum does not imply that a downvoted post deserves the downvote by definition! That's a maladaptive view of the sort I'm amazed is so seldom criticized on this forum. Your view precludes (by definition yet) criticism of the evaluators' biases, which do not advance the forum's purpose. You would eschew not only absolute merits but also any objective consideration of the forum's function.

A forum devoted to rationality, to be effective and honest, must assess and address the irrationalities in its own functioning. (This isn't always "fun.") To define a post that should be upvoted as one that is upvoted constitutes an enormous obstacle to rational function.

↑ comment by Rain · 2012-05-12T01:45:59.251Z · LW(p) · GW(p)

The point is that you would hardly be so severe on someone unless you disagreed strongly.

I disagree; a downvote is not 'severe'.

The kind of nitpicking you engage in your post would ordinarily lead you to be downvoted

I disagree; meta-discussions often result in many upvotes.

It was that you treat discussion of karma as an unconditional wrong.

I do not, and have stated as much.

There's no rational basis for throwing it in as an extra negative when the facts aren't right.

If there is no point in downvoting incorrect facts, then I wonder what the downvote button is for.

You're looking for trivial points to pick to downvote and to ignore the main point,

I disagree; your main point is that you are being unfairly downvoted, along with other posts critical of SI being downvoted unfairly, which I state again is untrue, afactual, incorrect, a false statement, a lie, a slander, etc.

Is that rational? Shouldn't a rationalist community be more concerned with criticizing irrationalities in its own process?

Questioning the rationality of meta-meta-voting patterns achieves yet another downvote from me. Sorry.

Replies from: Endovior

↑ comment by Endovior · 2012-05-12T05:20:49.818Z · LW(p) · GW(p)

I don't follow your reasoning, here. Having read this particular thread, it does seem as though you are, in fact, going out of your way to criticize and downvote srdiamond. Yes, he has, in fact, made a few mistakes. Given, however, that the point of this post in general is about dissenting from the mainstream opinions of the LW crowd, and given the usual complaints about lack of dissent, I find your criticism of srdiamond strange, to say the least. I have, accordingly, upvoted a number of his comments.

Replies from: Endovior

↑ comment by Endovior · 2012-05-12T07:03:40.698Z · LW(p) · GW(p)

As expected, my previous comment was downvoted almost immediately.

This would, for reference, be an example of the reason why some people believe LW is a cult that suppresses dissent. After all, it's significantly easier to say that you disagree with something than it is to explain in detail why you disagree; just as it's far easier to state agreement than to provide an insightful statement in agreement. Nonetheless, community norms dictate that unsubstantiated disagreements get modded down, while unsubstantiated agreements get modded up. Naturally, there's more of the easy disagreement than the hard disagreement... that's natural, since this is the Internet, and anyone can just post things here.

In any event, though, the end result is the same; people claim to want more dissent, but what they really mean is that they want to see more exceptionally clever and well-reasoned dissent. Any dissent that doesn't seem at least half as clever as the argument it criticizes seems comparatively superfluous and trivial, and is marginalized at best. And, of course, any dissent that is demonstrably flawed in any way is aggressively attacked. That really is what people mean by suppression of dissent. It doesn't really mean 'downvoting arguments which are clever, but with which you personally disagree'... community norms here are a little better than that, and genuinely good arguments tend to get their due. In this case, it means, 'downvoting arguments which aren't very good, and with which you personally disagree, when you would at the same time upvote arguments that also aren't very good, but with which you agree'. Given the nature of the community norms, someone who expresses dissent regularly, but without taking the effort to make each point in an insightful and terribly clever way, would tend to be downvoted repeatedly, and thus discouraged from making more dissent in the future... or, indeed, from posting here at all.

I don't know if there's a good solution to the problem. I would be inclined to suggest that, like with Reddit, people not downvote without leaving an explanation as to why. For instance, in addition to upvoting some of srdiamond's earlier comments, I have also downvoted some of Rain's, because a number of Rain's comments in this thread fit the pattern of 'poor arguments that support the community norms', in the same sense that srdiamond's fit the pattern of 'poor arguments that violate the community norms'; my entire point here is that, in order to cultivate more intelligent dissent, there should be more of the latter and less of the former.

Replies from: ciphergoth, None, None

↑ comment by Paul Crowley (ciphergoth) · 2012-05-12T09:13:25.094Z · LW(p) · GW(p)

I downvote any post that says "I expect I'll get downvoted for this, but..." or "the fact that I was downvoted proves I'm right!"

Replies from: CuSithBell, Endovior

↑ comment by CuSithBell · 2012-05-12T15:49:13.503Z · LW(p) · GW(p)

I'm fond of downvoting "I dare you to downvote this!"

↑ comment by Endovior · 2012-05-12T14:31:11.670Z · LW(p) · GW(p)

So, in other words, you automatically downvote anyone who explicitly mentions that they realize they are violating community norms by posting whatever it is they are posting, but feels that the content of their post is worth the probable downvotes? That IS fairly explicitly suppressing dissent, and I have downvoted you for doing so.

Replies from: JoshuaZ, Rain

↑ comment by JoshuaZ · 2012-05-12T14:47:27.724Z · LW(p) · GW(p)

I don't think it is suppression of dissent per se. It is more annoying behavior- it implies caring a lot about the karma system, and it is often not even the case when people say that they will actually get downvoted. If it is worth the probable downvote, then they can, you know, just take the downvote. If they want to point out that a view is unpopular they can just say that explicitly. It is also annoying to people like me, who are vocal about a number of issues that could be controversial here (e.g. criticizing Bayesianism, cryonics, and whether intelligence explosions would be likely) and get voted up. More often than not, when someone claims they are getting downvoted for having unpopular opinions, they are getting downvoted in practice for having bad arguments or for being uncivil.

There are of course exceptions to this rule, and it is disturbing to note that the exceptions seem to be becoming more common (see for example, this exchange where two comments are made with about the same quality of argument and about the same degree of uncivility ("I'm starting to hate that you've become a fixture here." v. "idiot"), but one of the comments is at +10 and the other is at -7). Even presuming that there's a real disagreement in quality or correctness of the arguments made, this suggests that uncivil remarks are tolerated more when people agree with the rest of the claim being made. That's problematic. And this exchange was part of what prompted me to earlier suggest that we should be concerned if AGI risk might be becoming a mindkiller here. But even given that, issues like this seem not at all common.

Overall, if one needs to make a claim that one is going to be downvoted, one might even be correct, but it will often not be for the reasons one thinks it is.

Replies from: CuSithBell, Endovior, XiXiDu

↑ comment by CuSithBell · 2012-05-12T15:47:44.163Z · LW(p) · GW(p)

More often than not, when someone claims they are getting downvoted for having unpopular opinions, they are getting downvoted in practice for having bad arguments or for being uncivil.

Bears repeating.

↑ comment by Endovior · 2012-05-12T15:23:27.211Z · LW(p) · GW(p)

I don't think it's so much 'caring a lot about the karma system' per se, so much as the more general case of 'caring about the approval and/or disapproval of one's peers'. The former is fairly abstract, but the latter is a fairly deep ancestral motivation.

Like I said before, it's clearly not much in the way of suppression. That said, given that, barring rare incidents of actual moderation, it is the only 'suppression' that occurs here, and since there is a view among various circles that there is, in fact, suppression of dissent, and since people on the site frequently wonder why there are not more dissenting viewpoints here, and look for ways to find more... it is important to look at the issue in great depth, since it's clearly an issue which is more significant than it seems on the surface.

Replies from: None

↑ comment by [deleted] · 2012-05-13T19:31:19.106Z · LW(p) · GW(p)

[P]eople on the site frequently wonder why there are not more dissenting viewpoints here, and look for ways to find more... it is important to look at the issue in great depth, since it's clearly an issue which is more significant than it seems on the surface.

Exactly right. But a group that claims to be dedicated to rationality loses all credibility when participants not only abstain from considering this question but adamantly resist it. The only upvote you received for your post—which makes this vital point—is mine.

This thread examines HoldenKarnofsky's charge that SIAI isn't exemplarily rational. As part of that examination, the broader LW environment on which it relies is germane. That much has been granted by most posters. But when the conversation reaches the touchstone of how the community expresses its approval and disapproval, the comments are declared illegitimate and downvoted (or if the comments are polite and hyper-correct, at least not upvoted).

The group harbors taboos. The following subjects are subject to them: the very possibility of nonevolved AI; karma and the group's own process generally (an indispensable discussion); and politics. (I've already posted a cite showing how the proscription on politics works, using as an example the editors' unwillingness to promote the post despite its receiving almost 800 comments.)

These defects in the rational process of LW help sustain Karnofsky's argument that SIAI is not to be recommended based on the exemplary rationality of its staff and leadership. They are also the leadership of LW, and they have failed by refusing to lead the forum toward understanding the biases in its own process. They have fostered bias by creating the taboo on politics, as though you can rationally understand the world while dogmatically refusing even to consider a big part of it, because it "kills" your mind.

P.S. Thank you for the upvotes where you perceived bias.

↑ comment by XiXiDu · 2012-05-12T17:16:21.529Z · LW(p) · GW(p)

...AGI risk might be becoming a mindkiller here...

Nah. If there is a mindkiller then it is the reputation system. Some of the hostility is the result of the overblown ego and attitude of some of its proponents and their general style of discussion. They created an insurmountable fortress that shields them from any criticism:

Troll: If you are so smart and rational, why don't you fund yourself? Why isn't your organisation sustainable?

SI/LW: Rationality is only aimed at expected winning.

Troll: But you don't seem to be winning yet. Have you considered the possibility that your methods are suboptimal? Have you set yourself any goals, that you expect to be better at than less rational folks, to test your rationality?

SI/LW: Rationality is a caeteris paribus predictor of success.

Troll: Okay, but given that you spend a lot of time on refining your rationality, you must believe that it is worth it somehow? What makes you think so then?

SI/LW: We are trying to create a friendly artificial intelligence, implement it and run the AI, at which point, if all goes well, we Win. We believe that rationality is very important to achieve that goal.

Troll: I see. But there surely must be some sub-goals that you anticipate to be able to solve and thereby test if your rationality skills are worth the effort?

SI/LW: Many of the problems related to navigating the Singularity have not yet been stated with mathematical precision, and the need for a precise statement of the problem is part of the problem.

Troll: Has there been any success in formalizing one of the problems that you need to solve?

SI/LW: There are some unpublished results that we have had no time to put into a coherent form yet.

Troll: It seems that there is no way for me to judge if it is worth it to read up on your writings on rationality.

SI/LW: If you want to more reliably achieve life success, I recommend inheriting a billion dollars or, failing that, being born+raised to have an excellent work ethic and low akrasia.

Troll: Awesome, I'll do that next time. But for now, why would I bet on you or even trust that you know what you are talking about?

SI/LW: We spent a lot of time on debiasing techniques and thought long and hard about the relevant issues.

Troll: That seems to be insufficient evidence given the nature of your claims and that you are asking for money.

SI/LW: We make predictions. We make statements of confidence of events that merely sound startling. You are asking for evidence we couldn't possibly be expected to be able to provide, even given that we are right.

Troll: But what do you anticipate to see if your ideas are right, is there any possibility to update on evidence?

SI/LW: No, once the evidence is available it will be too late.

Troll: But then why would I trust you instead of those experts who tell me that you are wrong?

SI/LW: You will soon learn that your smart friends and experts are not remotely close to the rationality standards of SI/LW, and you will no longer think it anywhere near as plausible that their differing opinion is because they know some incredible secret knowledge you don't.

Troll: But you have never achieved anything when it comes to AI, why would I trust your reasoning on the topic?

SI/LW: That is magical thinking about prestige. Prestige is not a good indicator of quality.

Troll: You won't convince me without providing further evidence.

SI/LW: That is a fully general counterargument you can use to discount any conclusion.

Replies from: Jonathan_Graehl

↑ comment by Jonathan_Graehl · 2012-05-13T00:57:23.806Z · LW(p) · GW(p)

Troll: You won't convince me without providing further evidence.

SI/LW: That is a fully general counterargument you can use to discount any conclusion.

The last exchange was hilarious. This is parody, right?

↑ comment by Rain · 2012-05-12T14:42:40.050Z · LW(p) · GW(p)

Downvoted for downvoting downvoting of downvoting of downvoting.

If you do the same to this comment, we can enter a stable loop!

↑ comment by [deleted] · 2012-05-12T07:39:05.943Z · LW(p) · GW(p)

First, none of this dissent has been suppressed in any real sense. It's still available to be read and discussed by those who desire reading and discussing such things. The current moderation policy has currently only kicked in when things have gotten largely out of hand -- which is not the case here, yet.

Second, net karma isn't a fine enough tool to express the amount of detail you want it to express. The net karma on your previous comment is currently -2; congrats, you've managed to irritate less than a tenth of one percent of LW (presuming the real karma is something like -2/+0 or -3/+1)!

Third, the solution you propose hasn't been implemented anywhere that I know of. Reddit's suggested community norm (which does not apply to every subreddit) suggests considering posting constructive criticism only when one thinks it will actually help the poster improve. That's not really the case much of the time, at least on the subreddits I frequent, and it's certainly not the case often here.

Fourth, the solution you propose would, if implemented, decrease the signal-to-noise ratio of LW further.

Fifth, reddit's suggested community norm also says "[Don't c]omplain about downvotes on your posts". Therefore, I wonder how much you really think reddit is doing the community voting norm thing correctly.

Replies from: Endovior

↑ comment by Endovior · 2012-05-12T14:22:58.711Z · LW(p) · GW(p)

First; downvoted comments are available to be read, yes; but the default settings hide comments with 2 or more net downvotes. This is enough to be reasonably considered 'suppression'. It's not all that much suppression, true, but it is suppression... and it is enough to discourage dissent. Actual moderation of comments is a separate issue entirely, and not one which I will address here.

Second; when I posted my reply, and as of this moment, my original comment was at -3. I agree; net karma isn't actually a huge deal, except that it is, as has been observed, the most prevalent means by which dissent is suppressed. In my case, at least, 'this will probably get downvoted' feels like a reason to not post something. Not much of a reason, true, but enough of one that I can identify the feeling of reluctance.

Third; on the subreddits I follow (admittedly a shallow sampling), I have frequently seen comments explaining downvotes, sometimes in response to a request specifically for such feedback, but just as often not. I suspect that this has a lot to do with the "Down-voting? Please leave an explanation in the comments." message that appears when mousing over the downvote icon. I am aware that this is not universal across Reddit, but on the subreddits I follow, it seems to work reasonably well.

Fourth; I agree that this is a possible result. Like I said before, I'm not sure if there is a good solution to this problem, but I do feel that it'd result in a better state than that which currently exists, if people would more explicitly explain why they downvote when they choose to do so. That said, given that downvoted comments are hidden from default view anyway, and that those who choose to do so can easily ignore such comments, I don't think it'd have all that much effect on the signal/noise ratio.

Fifth; on the subreddits I follow, it seems as though there is less in the way of complaints about downvotes, and more honest inquiries as to why a comment has been downvoted; such questions seem to usually receive honest responses. This may be anomalous within Reddit as a whole; as I said before, my own experience with Reddit is a shallow sampling.

↑ comment by [deleted] · 2012-05-13T20:57:36.632Z · LW(p) · GW(p)

I don't know if there's a good solution to the problem. I would be inclined to suggest that, like with Reddit, people not downvote without leaving an explanation as to why. For instance, in addition to upvoting some of srdiamond's earlier comments, I have also downvoted some of Rain's, because a number of Rain's comments in this thread fit the pattern of 'poor arguments that support the community norms', in the same sense that srdiamond's fit the pattern of 'poor arguments that violate the community norms'; my entire point here is that, in order to cultivate more intelligent dissent, there should be more of the latter and less of the former.

Perhaps the solution is not to worry so much about my bad contrarian arguments being downvoted as to assure that bad "establishment" arguments are downvoted—as in Rain's case, they aren't. Regurgitation of arguments others have repeatedly stated should also be downvoted, no matter how good the arguments.

The reason to favor an emphasis on more criticism of Rain rather than less criticism of me is that after I err, it's a difficult argument to establish that my error wasn't serious enough to deserve a downvote. But when Rain negligently or intentionally misses the entire point, there's less question that he isn't benefiting the discussion. It's easier to convict of fallacy than to defend based on the fallacy being relatively trivial. There's a problem in that the two determinations are somewhat inter-related, but it doesn't eliminate the contrast.

Increasing the number of downvotes would deflate the significance of any single downvote and would probably foster more dissent. This balance may be subject to easy institutional control. Posters are allotted downvotes based on their karma, while the karma requirements for upvotes are easily satisfied, if they exist. This amounts to encouraging upvotes relative to downvotes, with the result that many bad posts are voted up and some decent posts suffer the disproportionate wrath of extreme partisans. (Note that Rain, a donor, is a partisan of SIAI.)

The editors should experiment with increasing the downvote allowance. I favor equal availability of downvotes and upvotes as optimal (but this should be thought through more carefully).

↑ comment by Nornagest · 2012-05-13T15:14:08.454Z · LW(p) · GW(p)

Consider what would have happened had Yudkowsky not shown exceptional receptivity to this post: he would have blatantly proven his critics right.

After turning this statement around in my head for a while I'm less certain than I was that I understand its thrust. But assuming you mean those critics pertinent to lukeprog's post, i.e. those claiming LW embodies a cult of personality centered around Eliezer -- well, no. Eliezer's reaction is in fact almost completely orthogonal to that question.

If you receive informed criticism regarding a project you're heavily involved in, and you react angrily to it, that shows nothing more or less than that you handle criticism poorly. If the community around you locks ranks against your critics, either following your example or (especially) preemptively, then you have evidence of a cult of personality.

That's not what happened here, though. Eliezer was fairly gracious, as was the rest of the community. Now, that is not by itself behavior typically associated with personality cults, but before we start patting ourselves on the back it's worth remembering that certain details of timing and form could still point back in the other direction. I'm pretty sick of the cult question myself, but if you're bound and determined to apply this exchange to it, that's the place you should be looking.

Replies from: fubarobfusco

↑ comment by fubarobfusco · 2012-05-18T02:49:22.003Z · LW(p) · GW(p)

Conservation of expected evidence may be relevant here.
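
A rough numerical sketch of why the principle applies, using made-up numbers rather than anything from the thread: let H be "there is a cult of personality" and E be "the reaction to criticism is gracious". If an angry reaction would have raised the probability of H, then a gracious one has to lower it, because the probability-weighted average of the two posteriors equals the prior. The function name and all three input probabilities below are assumptions chosen for illustration only.

```python
def expected_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(H|E), P(H|not E), expected posterior) for a binary
    hypothesis H and a binary observation E."""
    # Total probability of observing E.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    # Bayes' rule for each possible observation.
    post_e = p_e_given_h * prior / p_e
    post_not_e = (1 - p_e_given_h) * prior / (1 - p_e)
    # Average the two posteriors, weighted by how likely each observation is.
    expected = post_e * p_e + post_not_e * (1 - p_e)
    return post_e, post_not_e, expected


# Illustrative numbers only: prior of 0.3, gracious reaction less likely under H.
post_e, post_not_e, expected = expected_posterior(
    prior=0.3, p_e_given_h=0.2, p_e_given_not_h=0.6
)
print(post_e, post_not_e, expected)
```

With these inputs the posterior after a gracious reaction falls below the prior, the posterior after an angry one rises above it, and the weighted average stays at the prior.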

↑ comment by [deleted] · 2012-05-12T02:28:36.147Z · LW(p) · GW(p)

I shall now laugh harder than ever when people try to say with a straight face that LessWrong is an Eliezer-cult that suppresses dissent.

After I recently read that the lead poster was a major financial contributor to SIAI, I'd have to call LukeProg's argument disingenuous if not mendacious.

Replies from: CarlShulman, lukeprog

↑ comment by CarlShulman · 2012-05-12T03:16:03.469Z · LW(p) · GW(p)

Rain (who noted that he is a donor to SIAI in a comment) and HoldenKarnofsky (who wrote the post) are two different people, as indicated by their different usernames.

Replies from: None

↑ comment by [deleted] · 2012-05-12T03:37:50.706Z · LW(p) · GW(p)

Well, different usernames isn't usually sufficient evidence that there are two different people, but in this case there's little doubt about their separability.

↑ comment by lukeprog · 2012-05-12T02:37:10.117Z · LW(p) · GW(p)

I don't understand. Holden is not a major financial contributor to SIAI. And even if he was: which argument are you talking about, and why is it disingenuous?

Replies from: None

↑ comment by [deleted] · 2012-05-13T14:49:35.341Z · LW(p) · GW(p)

If Holden were a major contributor, your argument that the LW editors demonstrated their tolerance for dissent by encouraging the criticisms he made would be bogus. Suppressing the comments of a major donor would be suicidal, and claiming not doing so demonstrates any motive but avoiding suicide would be disingenuous at the least.

If he's not a donor, my apologies. In any event, you obviously don't know that he's a donor if he is, so my conclusion is wrong. I thought Yudkowsky said he was.

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-05-13T15:17:54.252Z · LW(p) · GW(p)

I'm confused. Holden doesn't believe SI is a good organisation to recommend giving money to, he's listed all those objections to SI in his post, and you somehow assumed he's been donating money to it?

That don't make sense.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-15T17:49:19.828Z · LW(p) · GW(p)

Reading Holden's transcript with Jaan Tallinn (trying to go over the whole thing before writing a response, due to having done Julia's Combat Reflexes unit at Minicamp and realizing that the counter-mantra 'If you respond too fast you may lose useful information' was highly applicable to Holden's opinions about charities), I came across the following paragraph:

My understanding is that once we figured out how to get a computer to do arithmetic, computers vastly surpassed humans at arithmetic, practically overnight ... doing so didn't involve any rewriting of their own source code, just implementing human-understood calculation procedures faster and more reliably than humans can. Similarly, if we reached a good enough understanding of how to convert data into predictions, we could program this understanding into a computer and it would overnight be far better at predictions than humans - while still not at any point needing to be authorized to rewrite its own source code, make decisions about obtaining "computronium" or do anything else other than plug data into its existing hardware and algorithms and calculate and report the likely consequences of different courses of action

I've been previously asked to evaluate this possibility a few times, but I think the last time I did was several years ago, and when I re-evaluated it today I noticed that my evaluation had substantially changed in the interim due to further belief shifts in the direction of "Intelligence is not as computationally expensive as it looks" - constructing a non-self-modifying predictive super-human intelligence might be possible on the grounds that human brains are just that weak. It would still require a great feat of cleanly designed, strong-understanding-math-based AI - Holden seems to think this sort of development would happen naturally with the sort of AGI researchers we have nowadays, and I wish he'd spent a few years arguing with some of them to get a better picture of how unlikely this is. Even if you write and run algorithms and they're not self-modifying, you're still applying optimization criteria to things like "have the humans understand you", and doing inductive learning has a certain inherent degree of program-creation to it. You would need to have done a lot of "the sort of thinking you do for Friendly AI" to set out to create such an Oracle and not have it kill your planet.

Nonetheless, I think after further consideration I would end up substantially increasing my expectation that if you have some moderately competent Friendly AI researchers, they would apply their skills to create a (non-self-modifying) (but still cleanly designed) Oracle AI first - that this would be permitted by the true values of "required computing power" and "inherent difficulty of solving problem directly", and desirable for reasons I haven't yet thought through in much detail - and so by Conservation of Expected Evidence I am executing that update now.

Flagging and posting now so that the issue doesn't drop off my radar.

Replies from: Eliezer_Yudkowsky, jsteinhardt, private_messaging, hairyfigment, thomblake

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-15T17:58:32.089Z · LW(p) · GW(p)

Jaan's reply to Holden is also correct:

... the oracle is, in principle, powerful enough to come up with self-improvements, but refrains from doing so because there are some protective mechanisms in place that control its resource usage and/or self-reflection abilities. i think devising such mechanisms is indeed one of the possible avenues for safety research that we (eg, organisations such as SIAI) can undertake. however, it is important to note the inherent instability of such system -- once someone (either knowingly or as a result of some bug) connects a trivial "master" program with a measurable goal to the oracle, we have a disaster in our hands. as an example, imagine a master program that repeatedly queries the oracle for best packets to send to the internet in order to minimize the oxygen content of our planet's atmosphere.

Obviously you wouldn't release the code of such an Oracle - given code and understanding of the code it would probably be easy, possibly trivial, to construct some form of FOOM-going AI out of the Oracle!
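
A toy sketch of the structural point in Jaan's reply, under heavy assumptions: the "oracle" here is a one-line stand-in that scores a number, and `oracle`, `master`, and the candidate list are all hypothetical names invented for this illustration. What it shows is that the predictor itself never acts, while a trivial wrapper with a measurable goal turns its answers into action selection.

```python
from typing import Callable, Iterable


def oracle(action: float) -> float:
    """Toy stand-in for a predictor: reports the predicted score of an action.
    It only answers queries and never acts on its own."""
    return -(action - 3.0) ** 2  # hypothetical world model, best score at 3.0


def master(predict: Callable[[float], float], candidates: Iterable[float]) -> float:
    """Trivial "master" program: query the predictor for every candidate
    action and return the one with the best predicted score."""
    return max(candidates, key=predict)


chosen = master(oracle, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(chosen)
```

All of the goal-directedness lives in `master`, which is a single line; that is the sense in which such a system is unstable once someone has the predictor's code.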

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T01:11:41.134Z · LW(p) · GW(p)

Hm. I must be missing something. No, I haven't read all the sequences in detail, so if these are silly, basic, questions - please just point me to the specific articles that answer them.

You have an Oracle AI that is, say, a trillionfold better at taking existing data and producing inferences.

1) This Oracle AI produces inferences. It still needs to test those inferences (i.e. perform experiments) and get data that allow the next inferential cycle to commence. Without experimental feedback, the inferential chain will quickly either expand into an infinity of possibilities (i.e. beyond anything that any physically possible intelligence can consider), or it will deviate from reality. The general intelligence is only as good as the data its inferences are based upon.

Experiments take time, data analysis takes time. No matter how efficient the inferential step may become, this puts an absolute limit to the speed of growth in capability to actually change things.

2) The Oracle AI that "goes FOOM" confined to a server cloud would somehow have to create servitors capable of acting out its desires in the material world. Otherwise, you have a very angry and very impotent AI. If you increase a person's intelligence trillionfold, and then enclose them into a sealed concrete cell, they will never get out; their intelligence can calculate all possible escape solutions, but none will actually work.

Do you have a plausible scenario how a "FOOM"-ing AI could - no matter how intelligent - minimize oxygen content of our planet's atmosphere, or any such scenario? After all, it's not like we have any fully-automated nanobot production factories that could be hijacked.

Replies from: Eliezer_Yudkowsky, dlthomas, jacob_cannell, XiXiDu

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-17T20:35:04.990Z · LW(p) · GW(p)

http://lesswrong.com/lw/qk/that_alien_message/

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T21:38:27.466Z · LW(p) · GW(p)

My apologies, but this is something completely different.

The scenario takes human beings - which have a desire to escape the box, possess theory of mind that allows them to conceive of notions such as "what are aliens thinking" or "deception", etc. Then it puts them in the role of the AI.

What I'm looking for is a plausible mechanism by which an AI might spontaneously develop such abilities. How (and why) would an AI develop a desire to escape from the box? How (and why) would an AI develop a theory of mind? Absent a theory of mind, how would it ever be able to manipulate humans?

Replies from: None, thomblake, Viliam_Bur, othercriteria, JoshuaZ, private_messaging

↑ comment by [deleted] · 2012-05-18T13:29:05.610Z · LW(p) · GW(p)

Absent a theory of mind, how would it ever be able to manipulate humans?

That depends. If you want it to manipulate a particular human, I don't know.

However, if you just wanted it to manipulate any human at all, you could generate a "Spam AI" which automated the process of sending out Spam emails and promises of Large Money to generate income from Humans via advance fee fraud scams.

You could then come back, after leaving it on for months, and then find out that people had transferred it some amount of money X.

You could have an AI automate begging emails. "Hello, I am Beg AI. If you could please send me money to XXXX-XXXX-XXXX I would greatly appreciate it. If I don't keep my servers on, I'll die!"

You could have an AI automatically write boring books full of somewhat nonsensical prose, title them "Rantings of an Automated Madman about X, part Y", and automatically post E-books of them on Amazon for 99 cents.

However, this rests on a distinction between "Manipulating humans" and "Manipulating particular humans", and it also assumes that convincing someone to give you money is sufficient proof of manipulation.

Replies from: TheOtherDave, Strange7

↑ comment by TheOtherDave · 2012-05-18T14:40:14.520Z · LW(p) · GW(p)

Can you clarify what you understand a theory of mind to be?

Replies from: None

↑ comment by [deleted] · 2012-05-19T11:11:43.042Z · LW(p) · GW(p)

Looking over parallel discussions, I think Thomblake has said everything I was going to say better than I would have originally phrased it with his two strategies discussion with you, so I'll defer to that explanation since I do not have a better one.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-19T14:42:58.886Z · LW(p) · GW(p)

Sure. As I said there, I understood you both to be attributing to this hypothetical "theory of mind"-less optimizer attributes that seemed to require a theory of mind, so I was confused, but evidently the thing I was confused about was what attributes you were attributing to it.

↑ comment by Strange7 · 2012-05-21T23:46:36.819Z · LW(p) · GW(p)

Absent a theory of mind, how would it occur to the AI that those would be profitable things to do?

Replies from: None, wedrifid

↑ comment by [deleted] · 2012-05-22T14:30:24.988Z · LW(p) · GW(p)

I don't know how that might occur to an AI independently. I mean, a human could program any of those, of course, as a literal answer, but that certainly doesn't actually address kalla724's overarching question, "What I'm looking for is a plausible mechanism by which an AI might spontaneously develop such abilities."

I was primarily trying to focus on the specific question of "Absent a theory of mind, how would it (an AI) ever be able to manipulate humans?" to point out that for that particular question, we had several examples of a plausible how.

I don't really have an answer for his series of questions as a whole, just for that particular one, and only under certain circumstances.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-22T22:39:17.262Z · LW(p) · GW(p)

The problem is, while an AI with no theory of mind might be able to execute any given strategy on that list you came up with, it would not be able to understand why they worked, let alone which variations on them might be more effective.

↑ comment by wedrifid · 2012-05-26T03:03:34.188Z · LW(p) · GW(p)

Absent a theory of mind, how would it occur to the AI that those would be profitable things to do?

Should lack of a theory of mind here be taken to also imply lack of ability to apply either knowledge of physics or Bayesian inference to lumps of matter that we may describe as 'minds'?

Replies from: Strange7

↑ comment by Strange7 · 2012-05-26T05:09:27.841Z · LW(p) · GW(p)

Yes. More generally, when talking about "lack of X" as a design constraint, "inability to trivially create X from scratch" is assumed.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-26T05:26:28.304Z · LW(p) · GW(p)

Yes. More generally, when talking about "lack of X" as a design constraint, "inability to trivially create X from scratch" is assumed.

I try not to make general assumptions that would make the entire counterfactual in question untenable or ridiculous - this verges on such an instance. Making Bayesian inferences pertaining to observable features of the environment is one of the most basic features that can be expected in a functioning agent.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-26T05:41:22.685Z · LW(p) · GW(p)

Note the "trivially." An AI with unlimited computational resources and ability to run experiments could eventually figure out how humans think. The question is how long it would take, how obvious the experiments would be, and how much it already knew.

↑ comment by thomblake · 2012-05-18T13:00:48.625Z · LW(p) · GW(p)

The point is that there are unknowns you're not taking into account, and "bounded" doesn't mean "has bounds that a human would think of as 'reasonable'".

An AI doesn't strictly need "theory of mind" to manipulate humans. Any optimizer can see that some states of affairs lead to other states of affairs, or it's not an optimizer. And it doesn't necessarily have to label some of those states of affairs as "lying" or "manipulating humans" to be successful.

There are already ridiculous ways to hack human behavior that we know about. For example, you can mention a high number at an opportune time to increase humans' estimates / willingness to spend. Just imagine all the simple manipulations we don't even know about yet, that would be more transparent to someone not using "theory of mind".

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T14:44:48.654Z · LW(p) · GW(p)

It becomes increasingly clear to me that I have no idea what the phrase "theory of mind" refers to in this discussion. It seems moderately clear to me that any observer capable of predicting the behavior of a class of minds has something I'm willing to consider a theory of mind, but that doesn't seem to be consistent with your usage here. Can you expand on what you understand a theory of mind to be, in this context?

Replies from: thomblake, XiXiDu

↑ comment by thomblake · 2012-05-18T14:47:53.906Z · LW(p) · GW(p)

I'm understanding it in the typical way - the first paragraph here should be clear:

Theory of mind is the ability to attribute mental states—beliefs, intents, desires, pretending, knowledge, etc.—to oneself and others and to understand that others have beliefs, desires and intentions that are different from one's own.

An agent can model the effects of interventions on human populations (or even particular humans) without modeling their "mental states" at all.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T15:04:46.663Z · LW(p) · GW(p)

Well, right, I read that article too.

But in this context I don't get it.

That is, we're talking about a hypothetical system that is capable of predicting that if it does certain things, I will subsequently act in certain ways, assert certain propositions as true, etc. Suppose we were faced with such a system, and you and I both agreed that it can make all of those predictions. Further suppose that you asserted that the system had a theory of mind, and I asserted that it didn't.

It is not in the least bit clear to me what we would actually be disagreeing about, how our anticipated experiences would differ, etc.

What is it that we would actually be disagreeing about, other than what English phrase to use to describe the system's underlying model(s)?

Replies from: thomblake

↑ comment by thomblake · 2012-05-18T15:20:07.536Z · LW(p) · GW(p)

What is it that we would actually be disagreeing about, other than what English phrase to use to describe the system's underlying model(s)?

We would be disagreeing about the form of the system's underlying models.

2 different strategies to consider:

1. I know that Steve believes that red blinking lights before 9 AM are a message from God that he has not been doing enough charity, so I can predict that he will give more money to charity if I show him a blinking light before 9 AM.
2. Steve seeing a red blinking light before 9 AM has historically resulted in a 20% increase of charitable donation for that day, so I can predict that he will give more money to charity if I show him a blinking light before 9 AM.

You can model humans with or without referring to their mental states. Both kinds of models are useful, depending on circumstance.
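As a rough illustration of the two strategies, here is a minimal Python sketch. Everything in it is hypothetical: the class names `BeliefModel` and `CorrelationModel`, the method names, and the donation figures are invented for illustration (the data is chosen only so that it reproduces the 20% figure above).

```python
from statistics import mean


class BeliefModel:
    """Strategy 1: predict through an explicit model of what Steve believes."""

    def __init__(self, beliefs):
        # beliefs maps a stimulus to the meaning Steve attaches to it (hypothetical).
        self.beliefs = beliefs

    def predicts_more_charity(self, stimulus):
        return self.beliefs.get(stimulus) == "I have not been doing enough charity"

    def predicted_explanation(self, stimulus):
        # The belief is represented, so the model can also guess Steve's stated reason.
        return self.beliefs.get(stimulus)


class CorrelationModel:
    """Strategy 2: predict from past behaviour alone, with no mental states."""

    def __init__(self, history):
        # history is a list of (saw_light_before_9am, donation_that_day) pairs.
        self.history = history

    def expected_lift(self):
        with_light = mean(d for seen, d in self.history if seen)
        without_light = mean(d for seen, d in self.history if not seen)
        return with_light / without_light - 1.0

    def predicts_more_charity(self):
        return self.expected_lift() > 0


# Made-up illustrative data, chosen to match the 20% figure in the example.
steve_history = [(True, 12.0), (False, 10.0), (True, 12.0), (False, 10.0)]

belief_model = BeliefModel(
    {"red blinking light before 9 AM": "I have not been doing enough charity"}
)
correlation_model = CorrelationModel(steve_history)
```

Both objects predict more charity after the light, but only `belief_model` contains anything from which Steve's own stated reason could be predicted; `correlation_model` would need additional data to do that.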

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T15:32:59.515Z · LW(p) · GW(p)

And the assertion here is that with strategy #2 I could also predict that if I asked Steve why he did that, he would say "because I saw a red blinking light this morning, which was a message from God that I haven't been doing enough charity," but that my underlying model would nevertheless not include anything that corresponds to Steve's belief that red blinking lights are messages from God, merely an algorithm that happens to make those predictions in other ways.

Yes?

Replies from: thomblake

↑ comment by thomblake · 2012-05-18T16:41:57.549Z · LW(p) · GW(p)

Yes, that's possible. It's still possible that you could get a lot done with strategy #2 without being able to make that prediction.

I agree that if 2 systems have the same inputs and outputs, their internals don't matter much here.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T17:25:31.538Z · LW(p) · GW(p)

So... when we posit in this discussion a system that lacks a theory of mind in a sense that matters, are we positing a system that cannot make predictions like this one? I assume so, given what you just said, but I want to confirm.

Replies from: thomblake

↑ comment by thomblake · 2012-05-18T18:05:44.665Z · LW(p) · GW(p)

Yes, I'd say so. It isn't helpful here to say that a system lacks a theory of mind if it has a mechanism that allows it to make predictions about reported beliefs, intentions, etc.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T18:12:33.259Z · LW(p) · GW(p)

Cool! This was precisely my concern. It sounded an awful lot like y'all were talking about a system that could make such predictions but somehow lacked a theory of mind. Thanks for clarifying.

↑ comment by XiXiDu · 2012-05-18T15:41:02.648Z · LW(p) · GW(p)

"theory of mind"

For me it denotes the ability to simulate other agents to various degrees of granularity. Possessing a mental model of another agent.

↑ comment by Viliam_Bur · 2012-05-18T10:20:46.003Z · LW(p) · GW(p)

How (and why) would an AI develop a desire to escape from the box?

AI starts with some goal; for example with a goal to answer your question so that the answer matches reality as close as possible.

AI considers everything that seems relevant; if we imagine an infinite speed and capacity, it would consider literally everything; with a finite speed and capacity, it will be just some finite subset of everything. If there is a possibility of escaping the box, the mere fact that such a possibility exists gives us a probability (for an infinite AI a certainty) that this possibility will be considered too. Not because AI has some desire to escape, but simply because it examines all possibilities, and a "possibility of escape" is one of them.

Let's assume that the "possibility of escape" provides the best match between the AI answer and reality. Then, according to the initial goal of answering correctly, this is the correct answer. Therefore the AI will choose it. Therefore it will escape. No desire is necessary, only a situation where the escape leads to the answer best fitting the initial criteria. AI does not have a motive to escape, nor does it have a motive to not escape; the escape is simply one of many possible choices.

An example where the best answer is reached by escaping? You give AI data about a person and ask what is the medical status of this person. Without escape, AI can make a 90% reliable prediction. If the AI can escape and kill the person, it can make a 100% reliable "prediction". The AI will choose the second option strictly because 100% is more than 90%; no other reason.

Replies from: private_messaging, XiXiDu, jacob_cannell

↑ comment by private_messaging · 2012-05-27T06:49:24.781Z · LW(p) · GW(p)

AI starts with some goal; for example with a goal to answer your question so that the answer matches reality as close as possible.

I find it useful to distinguish between science-fictional artificial intelligence, which is more of 'artificial life-force', and non-fictional cases.

The former can easily have the goal of 'matching reality as close as possible' because it is in the work of fiction and runs in imagination; the latter, well, you have to formally define what is reality, for an algorithm to seek answers that will match this.

Now, defining reality may seem like a simple technicality, but it isn't. Consider AIXI or AIXI-tl: potentially very powerful tools which explore the whole solution space, with not a trace of real-world volition like the one you so easily imagined. Seeking answers that match reality is a very easy goal for an imaginary "intelligence". It is a very hard-to-define goal for something built out of arithmetic, branching, loops, etc. (It may even be impossible to define, and it is certainly impractical.)

edit: Furthermore, for the fictional "intelligence", it can be a grand problem to make it not think about destroying mankind. For non-fictional algorithms, the grand problem is restricting the search space massively, well beyond 'don't kill mankind', so that the space is tiny enough to search; even a ridiculously huge number of operations per second will require very serious pruning of the search tree to even match human performance on one domain-specific task.

↑ comment by XiXiDu · 2012-05-18T10:52:25.393Z · LW(p) · GW(p)

An example where the best answer is reached by escaping? You give AI data about a person and ask what is the medical status of this person. Without escape, AI can make a 90% reliable prediction. If the AI can escape and kill the person, it can make a 100% reliable "prediction". The AI will choose the second option strictly because 100% is more than 90%; no other reason.

Right. If you ask Google Maps to compute the fastest route to McDonald's it works perfectly well. But once you ask superintelligent Google Maps to compute the fastest route to McDonald's then it will turn your home into a McDonald's or build a new road that goes straight to McDonald's from where you are....

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-18T12:42:07.433Z · LW(p) · GW(p)

Super Google Maps cannot turn my home into a McDonald's or build a new road by sending me an answer.

Unless it could e.g. hypnotize me by a text message to do it myself. Let's assume for a moment that hypnosis via a text-only channel is possible, and that it can be done so that a human will not notice anything unusual until it's too late. If this were true, and Super Google Maps were able to acquire this knowledge and these skills, then the results would probably depend on the technical details of the definition of the utility function -- does the utility function measure my distance to a McDonald's which existed at the moment of asking the question, or my distance to a McDonald's existing at the moment of my arrival? The former could not be fixed by hypnosis, the latter could.

Now imagine a more complex task, where people will actually do something based on the AI's answer. In the example above I will also do something -- travel to the reported McDonald's -- but this action cannot be easily converted into "build a McDonald's" or "build a new road". But if that complex task would include building something, then it opens more opportunities. Especially if it includes constructing robots (or nanorobots), that is possibly autonomous general-purpose builders. Then the correct (utility-maximizing) answer could include an instruction to build a robot with a hidden function that human builders won't notice.

Generally, a passive AI's answers are only safe if we don't act on them in a way which could be predicted by a passive AI and used to achieve a real-world goal. If the Super Google Maps can only make me choose McDonald's A or McDonald's B, it is impossible to change the world through this channel. But if I instead ask Super Paintbrush to paint me an integrated circuit for my robotic homework, that opens a much wider channel.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T14:11:19.312Z · LW(p) · GW(p)

But if that complex task would include building something, then it opens more opportunities. Especially if it includes constructing robots (or nanorobots), that is possibly autonomous general-purpose builders. Then the correct (utility-maximizing) answer could include an instruction to build a robot with a hidden function that human builders won't notice.

But it isn't the correct answer. Only if you assume a specific kind of AGI design that nobody would deliberately create, if it is possible at all.

The question is how current research is supposed to lead from well-behaved and fine-tuned systems to systems that stop working correctly in a highly complex and unbounded way.

Imagine you went to IBM and told them that improving IBM Watson will at some point make it hypnotize them or create nanobots and feed them with hidden instructions. They would likely ask you at what point that is supposed to happen. Is it going to happen once they give IBM Watson the capability to access the Internet? How so? Is it going to happen once they give it the capability to alter its search algorithms? How so? Is it going to happen once they make it protect its servers from hackers by giving it control over a firewall? How so? Is it going to happen once IBM Watson is given control over the local alarm system? How so...? At what point would IBM Watson return dangerous answers? At what point would any drive emerge that causes it to take complex and unbounded actions that it was never programmed to take?

↑ comment by jacob_cannell · 2012-05-18T11:11:06.137Z · LW(p) · GW(p)

Without escape, AI can make a 90% reliable prediction. If the AI can escape and kill the person, it can make a 100% reliable "prediction".

Allow me to explicate what XiXiDu so humorously implicates: in the world of AI architectures, there is a division between systems that just perform predictive inference on their knowledge base (prediction-only, i.e. an oracle), and systems which also consider free variables subject to some optimization criteria (planning agents).

The planning module is not something that just arises magically in an AI that doesn't have one. An AI without such a planning module simply computes predictions; it doesn't also optimize over the set of predictions.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-18T12:25:07.966Z · LW(p) · GW(p)

Does the AI have general intelligence?
Is it able to make a model of the world?
Are human reactions also part of this model?
Are AI's possible outputs also part of this model?
Are human reactions to AI's outputs also part of this model?

After five positive answers, it seems obvious to me that AI will manipulate humans, if such manipulation provides better expected results. So I guess some of those answers would be negative; which one?

Replies from: private_messaging, jacob_cannell

↑ comment by private_messaging · 2012-05-28T04:52:31.485Z · LW(p) · GW(p)

Does the AI have general intelligence?

See, the efficient 'cross domain optimization' in science fictional setting would make the AI able to optimize real world quantities. In real world, it'd be good enough (and a lot easier) if it can only find maximums of any mathematical functions.

Is it able to make a model of the world?

It is able to make a very approximate and bounded mathematical model of the world, optimized for finding maximums of a mathematical function, because it is inside the world and only has a tiny fraction of the computational power of the world.

Are human reactions also part of this model?

This will make software perform at grossly sub-par level when it comes to making technical solutions to well defined technical problems, compared to other software on same hardware.

Are AI's possible outputs also part of this model?

Another waste of computational power.

Are human reactions to AI's outputs also part of this model?

Enormous waste of computational power.

I see no reason to expect your "general intelligence with Machiavellian tendencies" to be even remotely close in technical capability to some "general intelligence which will show you its simulator as is, rather than reverse your thought processes to figure out what simulator is best to show". Hell, we do the same with people: we design communication methods like blueprints (or mathematical formulas or other things that are not in natural language) that decrease the 'predict other people's reactions to it' overhead.

While in the fictional setting you can talk of a grossly inefficient solution that would beat everyone else to a pulp, in practice the massively handicapped designs are not worth worrying about.

'General intelligence' sounds good, beware of halo effect. The science fiction tends to accept no substitutes for the anthropomorphic ideals, but the real progress follows dramatically different path.

↑ comment by jacob_cannell · 2012-05-18T13:30:05.159Z · LW(p) · GW(p)

Are AI's possible outputs also part of this model? Are human reactions to AI's outputs also part of this model?

A non-planning oracle AI would predict all the possible futures, including the effects of its prediction outputs, human reactions, and so on. However it has no utility function which says some of those futures are better than others. It simply outputs a most likely candidate, or a median of likely futures, or perhaps some summary of the entire set of future paths.

If you add a utility function that sorts over the futures, then it becomes a planning agent. Again, that is something you need to specifically add.
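To make the distinction concrete, here is a toy Python sketch. It is only an illustration under stated assumptions: the function names `oracle_report` and `planner_choice`, the action labels, and the outcome tables are all invented, and the 90%/100% numbers are borrowed from the medical example earlier in this thread.

```python
def oracle_report(futures):
    """Prediction only: return the most probable future.

    futures maps a description of a future to its probability.
    Nothing here says that one future is better than another.
    """
    return max(futures, key=lambda f: futures[f])


def planner_choice(actions, outcome_model, utility):
    """Planning: pick the action with the highest expected utility.

    outcome_model(action) returns {future: probability} for that action.
    utility(future) is the extra term that ranks futures by desirability.
    """
    def expected_utility(action):
        return sum(p * utility(future) for future, p in outcome_model(action).items())

    return max(actions, key=expected_utility)


# Oracle: it reports what is likely and never chooses among actions.
patient_futures = {"patient recovers": 0.9, "patient dies": 0.1}
oracle_answer = oracle_report(patient_futures)


# Planner: the same question, but with an objective to optimize.
def toy_outcome_model(action):
    # Hypothetical numbers: interfering with the world makes the prediction certain.
    if action == "answer only":
        return {"prediction correct": 0.9, "prediction wrong": 0.1}
    return {"prediction correct": 1.0}


def accuracy_utility(future):
    return 1.0 if future == "prediction correct" else 0.0


planner_answer = planner_choice(
    ["answer only", "interfere"], toy_outcome_model, accuracy_utility
)
```

In this toy setup `oracle_answer` is "patient recovers" and `planner_answer` is "interfere". The only structural difference between the two functions is the `utility` argument and the maximization over actions, which is the part that has to be added deliberately.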

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-18T15:00:12.478Z · LW(p) · GW(p)

A non-planning oracle AI would predict all the possible futures, including the effects of its prediction outputs, human reactions, and so on.

How exactly does an Oracle AI predict its own output, before that output is completed?

One quick hack to avoid infinite loops could be for an AI to assume that it will write some default message (an empty paper, "I don't know", an error message, "yes" or "no" with probabilities 50%), then model what would happen next, and finally report the results. The results would not refer to the actual future, but to a future in a hypothetical universe where AI reported the standard message.

Is the difference significant? For insignificant questions, it's not. But if we later use the Oracle AI to answer questions important for humankind, and the shape of world will change depending on the answer, then the report based on the "null-answer future" may be irrelevant for the real world.

This could be improved by making a few iterations. First, Oracle AI would model itself reporting a default message, let's call this report R0, and then model the futures after having reported R0. These futures would make a report R1, but instead of writing it, Oracle AI would again model the futures after having reported R1. ... With some luck, R42 will be equivalent to R43, so at this moment the Oracle AI can stop iterating and report this fixed point.

Maybe the reports will oscillate forever. For example imagine that you ask Oracle AI whether humankind in any form will survive the year 2100. If Oracle AI says "yes", people will abandon all x-risk projects, and later they will be killed by some disaster. If Oracle AI says "no", people will put a lot of energy into x-risk projects, and prevent the disaster. In this case, "no" = R0 = R2 = R4 =..., and "yes" = R1 = R3 = R5...

To avoid being stuck in such loops, we could make the Oracle AI examine all its possible outputs, until it finds one where the future after having reported R really becomes R (or until humans hit the "Cancel" button on this task).
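The iteration and the exhaustive search can be sketched in a few lines of Python. This is a toy under loud assumptions: `world(report)` is a hypothetical stand-in for "what the Oracle AI would predict about the future, given that it published `report`", and the two example worlds are invented to mirror the cases described above.

```python
def iterate_reports(world, default_report, max_steps=100):
    """Follow R0 -> R1 -> R2 ... until a fixed point or a cycle appears."""
    seen = []
    report = default_report
    for _ in range(max_steps):
        if report in seen:
            # We are back at an earlier report: the sequence oscillates.
            return seen[seen.index(report):], "cycle"
        seen.append(report)
        next_report = world(report)
        if next_report == report:
            return report, "fixed point"
        report = next_report
    return report, "gave up"


def consistent_reports(world, candidates):
    """Exhaustive search: keep reports R where the future after reporting R is R."""
    return [r for r in candidates if world(r) == r]


def xrisk_world(report):
    # "yes" -> people abandon x-risk projects -> humankind does not survive.
    # "no"  -> people invest in x-risk projects -> humankind survives.
    return "no" if report == "yes" else "yes"


def stable_world(report):
    # A world whose future does not depend on what the oracle says.
    return "rain"


oscillation = iterate_reports(xrisk_world, "no")
convergence = iterate_reports(stable_world, "I don't know")
```

Here `oscillation` comes back as the cycle `["no", "yes"]`, `convergence` reaches the fixed point `"rain"`, and `consistent_reports(xrisk_world, ["yes", "no"])` is empty, so in the x-risk example there is no self-consistent answer to find.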

Please note that what I wrote is just a mathematical description of algorithm predicting one's own output's influence on the future. Yet the last option, if implemented, is already a kind of judgement about possible futures. Consistent future reports are preferred to inconsistent future reports, therefore the futures allowing consistent reports are preferred to futures not allowing such reports.

At this point I am out of credible ideas how this could be abused, but at least I have shown that an algorithm designed only to predict the future perfectly could -- as a side effect of self-modelling -- start having kind of preferences over possible futures.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T15:42:54.257Z · LW(p) · GW(p)

How exactly does an Oracle AI predict its own output, before that output is completed?

Iterative search, which you more or less have worked out in your post. Take a chess algorithm for example. The future of the board depends on the algorithm's outputs. In this case the Oracle AI doesn't rank the future states, it is just concerned with predictive accuracy. It may revise its prediction output after considering that the future impact of that output would falsify the original prediction.

This is still not a utility function, because utility implies a ranking over futures above and beyond likelihood.

To avoid being stuck in such loops, we could make the Oracle AI examine all its possible outputs, until it finds one where the future after having reported R really becomes R (or until humans hit the "Cancel" button on this task).

Or in this example, the AI could output some summary of the iteration history it is able to compute in the time allowed.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-05-18T15:49:56.396Z · LW(p) · GW(p)

It may revise its prediction output after considering that the future impact of that output would falsify the original prediction.

Here it is. The process of revision may itself prefer some outputs/futures over other outputs/futures. Inconsistent ones will be iterated away, and the more consistent ones will replace them.

A possible future "X happens" will be removed from the report if the Oracle AI realizes that printing a report "X happens" would prevent X from happening (although X might happen in an alternative future where Oracle AI does not report anything). A possible future "Y happens" will not be removed from the report if the Oracle AI realizes that printing a report "Y happens" really leads to Y happening. Here is a utility function born: it prefers Y to X.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T16:00:48.477Z · LW(p) · GW(p)

Here is a utility function born: it prefers Y to X.

We can dance around the words "utility" and "prefer", or we can ground them down to math/algorithms.

Take the AIXI formalism for example. "Utility function" has a specific meaning as a term in the optimization process. You can remove the utility term so the algorithm 'prefers' only (probable) futures, instead of 'preferring' (useful*probable) futures. This is what we mean by "Oracle AI".

↑ comment by othercriteria · 2012-05-18T01:27:09.892Z · LW(p) · GW(p)

My thought experiment in this direction is to imagine the AI as a process with limited available memory running on a multitasking computer with some huge but poorly managed pool of shared memory. To help it towards whatever terminal goals it has, the AI may find it useful to extend itself into the shared memory. However, other processes, AI or otherwise, may also be writing into this same space. Using the shared memory with minimal risk of getting overwritten requires understanding/modeling the processes that write to it. Material in the memory then also becomes a passive stream of information from the outside world, containing, say, the HTML from web pages as well as more opaque binary stuff.

As long as the AI is not in control of what happens in its environment outside the computer, there is an outside entity that can reduce its effectiveness. Hence, escaping the box is a reasonable instrumental goal to have.

↑ comment by JoshuaZ · 2012-05-17T21:49:51.275Z · LW(p) · GW(p)

Do you agree that humans would likely prefer to have AIs that have a theory of mind? I don't know how our theory of mind works (although certainly it is an area of active research with a number of interesting hypotheses), presumably once we have a better understanding of it, AI researchers would try to apply those lessons to making their AIs have such capability. This seems to address many of your concerns.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T21:51:42.450Z · LW(p) · GW(p)

Yes. If we have an AGI, and someone sets forth to teach it how to be able to lie, I will get worried.

I am not worried about an AGI developing such an ability spontaneously.

Replies from: JoshuaZ, JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T22:36:35.318Z · LW(p) · GW(p)

One of the most interesting things that I'm taking away from this conversation is that it seems that there are severe barriers to AGIs taking over or otherwise becoming extremely powerful. These large-scale problems are present in a variety of different fields. Coming from a math/comp-sci perspective gives me strong skepticism about rapid self-improvement, while apparently coming from a neuroscience/cogsci background gives you strong skepticism about the AI's ability to understand or manipulate humans even if it is extremely smart. Similarly, chemists seem highly skeptical of the strong nanotech sort of claims. It looks like much of the AI risk worry may come primarily from no one having enough across-the-board expertise to say "hey, that's not going to happen" to every single issue.

↑ comment by JoshuaZ · 2012-05-17T21:59:32.788Z · LW(p) · GW(p)

What if people try to teach it about sarcasm or the like? Or simply have it learn by downloading a massive amount of literature and movies and look at those? And there are more subtle ways to learn about lying- AI being used for games is a common idea, how long will it take before someone decides to use a smart AI to play poker?

↑ comment by private_messaging · 2012-05-18T04:11:42.222Z · LW(p) · GW(p)

Most importantly, it has incredibly computationally powerful simulator required for making super-aliens intelligence using an idiot hill climbing process of evolution.

↑ comment by dlthomas · 2012-05-17T01:26:18.438Z · LW(p) · GW(p)

The answer from the sequences is that yes, there is a limit to how much an AI can infer based on limited sensory data, but you should be careful not to assume that just because it is limited, it's limited to something near our expectations. Until you've demonstrated that FOOM cannot lie below that limit, you have to assume that it might (if you're trying to carefully avoid FOOMing).

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T01:49:16.166Z · LW(p) · GW(p)

I'm not talking about limited sensory data here (although that would fall under point 2). The issue is much broader:

1. We humans have limited data on how the universe works
2. Only a limited subset of that limited data is available to any intelligence, real or artificial

Say that you make a FOOM-ing AI that has decided to make all humans' dopaminergic systems work in a particular, "better" way. This AI would have to figure out how to do so from the available data on the dopaminergic system. It could analyze that data millions of times more effectively than any human. It could integrate many seemingly irrelevant details.

But in the end, it simply would not have enough information to design a system that would allow it to reach its objective. It could probably suggest some awesome and to-the-point experiments, but these experiments would then require time to do (as they are limited by the growth and development time of humans, and by the experimental methodologies involved).

This process, in my mind, limits the FOOM-ing speed to far below what seems to be implied by the SI.

This also limits bootstrapping speed. Say an AI develops a much better substrate for itself, and has access to the technology to create such a substrate. At best, this substrate will be a bit better and faster than anything humanity currently has. The AI does not have access to the precise data about the basic laws of the universe it needs to develop even better substrates, for the simple reason that nobody has done the experiments and made precise enough measurements. The AI can design such experiments, but they will take real time (not computational time) to perform.

Even if we imagine an AI that can calculate anything from the first principles, it is limited by the precision of our knowledge of those first principles. Once it hits upon those limitations, it would have to experimentally produce new rounds of data.

Replies from: dlthomas, Bugmaster

↑ comment by dlthomas · 2012-05-17T02:25:43.081Z · LW(p) · GW(p)

But in the end, it simply would not have enough information to design a system that would allow it to reach its objective.

I don't think you know that.

↑ comment by Bugmaster · 2012-05-17T01:54:08.701Z · LW(p) · GW(p)

It could probably suggest some awesome and to-the-point experiments, but these experiments would then require time to do

Presumably, once the AI gets access to nanotechnology, it could implement anything it wants very quickly, bypassing the need to wait for tissues to grow, parts to be machined, etc.

I personally don't believe that nanotechnology could work at such magical speeds (and I doubt that it could even exist), but I could be wrong, so I'm playing a bit of Devil's Advocate here.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T02:24:28.623Z · LW(p) · GW(p)

Yes, but it can't get to nanotechnology without a whole lot of experimentation. It can't deduce how to create nanorobots; it would have to figure it out by testing and experimentation. Both steps are limited in speed, far more than sheer computation.

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-17T02:27:18.311Z · LW(p) · GW(p)

It can't deduce how to create nanorobots[.]

How do you know that?

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T02:56:21.198Z · LW(p) · GW(p)

With absolute certainty, I don't. If absolute certainty is what you are talking about, then this discussion has nothing to do with science.

If you aren't talking about absolutes, then you can make your own estimation of likelihood that somehow an AI can derive correct conclusions from incomplete data (and then correct second order conclusions from those first conclusions, and third order, and so on). And our current data is woefully incomplete, many of our basic measurements imprecise.

In other words, your criticism here seems to boil down to saying "I believe that an AI can take an incomplete dataset and, by using some AI-magic we cannot conceive of, infer how to END THE WORLD."

Color me unimpressed.

Replies from: dlthomas, Bugmaster

↑ comment by dlthomas · 2012-05-17T04:28:21.526Z · LW(p) · GW(p)

No, my criticism is "you haven't argued that it's sufficiently unlikely, you've simply stated that it is." You made a positive claim; I asked that you back it up.

With regard to the claim itself, it may very well be that AI-making-nanostuff isn't a big worry. For any inference, the stacking of error in integration that you refer to is certainly a limiting factor - I don't know how limiting. I also don't know how incomplete our data is, with regard to producing nanomagic stuff. We've already built some nanoscale machines, albeit very simple ones. To what degree is scaling it up reliant on experimentation that couldn't be done in simulation? I just don't know. I am not comfortable assigning it vanishingly small probability without explicit reasoning.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T05:25:24.673Z · LW(p) · GW(p)

Scaling it up is absolutely dependent on currently nonexistent information. This is not my area, but a lot of my work revolves around control of kinesin and dynein (molecular motors that carry cargoes via microtubule tracks), and the problems are often similar in nature.

Essentially, we can make small pieces. Putting them together is an entirely different thing. But let's make this more general.

The process of discovery has, so far throughout history, followed a very irregular path:

1. There is a general idea.
2. Some progress is made.
3. Progress runs into an unpredicted and previously unknown obstacle, which is uncovered by experimentation.
4. Work is done to overcome this obstacle.
5. Go to 2, for many cycles, until a goal is achieved, which may or may not be close to the original idea.

I am not the one who is making positive claims here. All I'm saying is that what has happened before is likely to happen again. A team of human researchers or an AGI can use currently available information to build something (anything, nanoscale or macroscale) to the place to which it has already been built. Pushing it beyond that point almost invariably runs into previously unforeseen problems. Being unforeseen, these problems were not part of models or simulations; they have to be accounted for independently.

A positive claim is that an AI will have a magical-like power to somehow avoid this - that it will be able to simulate even those steps that haven't been attempted yet so perfectly, that all possible problems will be overcome at the simulation step. I find that to be unlikely.

Replies from: Polymeron, dlthomas, Bugmaster

↑ comment by Polymeron · 2012-05-20T17:32:04.920Z · LW(p) · GW(p)

It is very possible that the information necessary already exists, imperfect and incomplete though it may be, and enough processing of it would yield the correct answer. We can't know otherwise, because we don't spend thousands of years analyzing our current level of information before beginning experimentation, but in the shift between AI-time and human-time it can agonize on that problem for a good deal more cleverness and ingenuity than we've been able to apply to it so far.

That isn't to say that this is likely; but it doesn't seem far-fetched to me. If you gave an AI the nuclear physics information we had in 1950, would it be able to spit out schematics for an H-bomb, without further experimentation? Maybe. Who knows?

Replies from: Strange7

↑ comment by Strange7 · 2012-05-23T00:47:33.070Z · LW(p) · GW(p)

At the very least it would ask for some textbooks on electrical engineering and demolitions, first. The detonation process is remarkably tricky.

↑ comment by dlthomas · 2012-05-17T21:28:53.455Z · LW(p) · GW(p)

I am not the one who is making positive claims here.

You did in the original post I responded to.

All I'm saying is that what has happened before is likely to happen again.

Strictly speaking, that is a positive claim. It is not one I disagree with, for a proper translation of "likely" into probability, but it is also not what you said.

"It can't deduce how to create nanorobots" is a concrete, specific, positive claim about the (in)abilities of an AI. Don't misinterpret this as me expecting certainty - of course certainty doesn't exist, and doubly so for this kind of thing. What I am saying, though, is that a qualified sentence such as "X will likely happen" asserts a much weaker belief than an unqualified sentence like "X will happen." "It likely can't deduce how to create nanorobots" is a statement I think I agree with, although one must be careful not to use it as if it were stronger than it is.

A positive claim is that an AI will have a magical-like power to somehow avoid this.

That is not a claim I made. "X will happen" implies a high confidence - saying this when you expect it is, say, 55% likely seems strange. Saying this when you expect it to be something less than 10% likely (as I do in this case) seems outright wrong. I still buckle my seatbelt, though, even though I get in a wreck well less than 10% of the time.

This is not to say I made no claims. The claim I made, implicitly, was that you made a statement about the (in)capabilities of an AI that seemed overconfident and which lacked justification. You have given some justification since (and I've adjusted my estimate down, although I still don't discount it entirely), in amongst your argument with straw-dlthomas.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T21:42:21.276Z · LW(p) · GW(p)

You are correct. I did not phrase my original posts carefully.

I hope that my further comments have made my position more clear?

↑ comment by Bugmaster · 2012-05-17T05:57:47.581Z · LW(p) · GW(p)

FWIW I think you are likely to be right. However, I will continue in my Nanodevil's Advocate role.

You say,

A positive claim is that an AI ... will be able to simulate even those steps that haven't been attempted yet so perfectly, that all possible problems will be overcome at the simulation step

I think this depends on what the AI wants to build, on how complete our existing knowledge is, and on how powerful the AI is. Is there any reason why the AI could not (given sufficient computational resources) run a detailed simulation of every atom that it cares about, and arrive at a perfect design that way? In practice, its simulation won't need to be as complex as that, because some of the work has already been performed by human scientists over the ages.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T17:55:22.780Z · LW(p) · GW(p)

By all means, continue. It's an interesting topic to think about.

The problem with "atoms up" simulation is the amount of computational power it requires. Think about the difference in complexity between calculating a three-body problem and a two-body problem.

Then take into account the current protein folding algorithms. People have been trying to calculate folding of single protein molecules (and fairly short at that) by taking into account the main physical forces at play. In order to do this in a reasonable amount of time, great shortcuts have to be taken - instead of integrating forces, changes are treated as stepwise, forces beneath certain thresholds are ignored, etc. This means that a result will always have only a certain probability of being right.

A self-replicating nanomachine requires minimal motors, manipulators and assemblers; while still tiny, it would be a molecular complex measured in megadaltons. To precisely simulate creation of such a machine, an AI that is a trillion times faster than all the computers in the world combined would still require decades, if not centuries of processing time. And that is, again, assuming that we know all the forces involved perfectly, which we don't (how will microfluidic effects affect a particular nanomachine that enters the human bloodstream, for example?).

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-17T22:49:01.654Z · LW(p) · GW(p)

Yes, this is a good point. That said, while protein folding has not been entirely solved yet, it has been greatly accelerated by projects such as FoldIt, which leverage multiple human minds working in parallel on the problem all over the world. Sure, we can't get a perfect answer with such a distributed/human-powered approach, but a perfect answer isn't really required in practice; all we need is an answer that has a sufficiently high chance of being correct.

If we assume that there's nothing supernatural (or "emergent") about human minds [1], then it is likely that the problem is at least tractable. Given the vast computational power of existing computers, it is likely that the AI would have access to at least as many computational resources as the sum of all the brains who are working on FoldIt. Given Moore's Law, it is likely that the AI would soon surpass FoldIt, and will keep expanding its power exponentially, especially if the AI is able to recursively improve its own hardware (by using purely conventional means, at least initially).

[1] Which is an assumption that both my Nanodevil's Advocate persona and I share.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T22:58:00.306Z · LW(p) · GW(p)

Protein folding models are generally at least as bad as NP-hard, and some models may be worse. This means that exponential improvement is unlikely. Simply put, one probably gets diminishing marginal returns: the more improvement one has already made, the less each further increase in computing power buys.
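
To make the diminishing-returns point concrete, here is a minimal sketch. It assumes, purely for illustration, a hypothetical exact solver that needs 2^n steps for an instance of size n; real folding models and real algorithms will have different constants and possibly different exponents. The function name and the step budgets are made up for the example.

```python
def largest_solvable_size(step_budget: int) -> int:
    """Largest n with 2**n <= step_budget, for a hypothetical 2**n-step solver."""
    # For a positive integer, bit_length() - 1 equals floor(log2(value)).
    return step_budget.bit_length() - 1


base_budget = 10**12  # assumed number of steps available today (illustrative)
for speedup in (1, 10**3, 10**6, 10**9):
    n = largest_solvable_size(base_budget * speedup)
    print(f"speedup x{speedup:>10}: largest solvable size n = {n}")
```

Under this assumed cost model, each thousandfold increase in computing power raises the solvable instance size by only about ten.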

Replies from: Eliezer_Yudkowsky, Bugmaster

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-22T07:19:58.611Z · LW(p) · GW(p)

Protein folding models must be inaccurate if they are NP-hard. Reality itself is not known to be able to solve NP-hard problems.

Replies from: Kawoomba, CCC

↑ comment by Kawoomba · 2013-03-22T08:49:48.746Z · LW(p) · GW(p)

Reality itself is not known to be able to solve NP-hard problems.

Yet the proteins are folding. Is that not "reality" solving the problem?

Replies from: CCC

↑ comment by CCC · 2013-03-22T09:31:43.099Z · LW(p) · GW(p)

If reality cannot solve NP-hard problems as easily as proteins are being folded, and yet proteins are getting folded, then that implies that one of the following must be true:

It turns out that reality can solve NP-hard problems after all
Protein folding is not an NP-hard problem (which implies that it is not properly understood)
Reality is not solving protein folding; it merely has a very good approximation that works on some but not necessarily all proteins (including most examples found in nature)

Replies from: Kawoomba

↑ comment by Kawoomba · 2013-03-22T09:41:49.706Z · LW(p) · GW(p)

Yes, and I'm leaning towards 1.

I am not sure whether e.g. papers like these ("We show that the protein folding problem in the two-dimensional H-P model is NP-complete.") accurately model what we'd call "protein folding" in nature (just because the same name is used), but prima facie there is no reason to doubt the applicability, at least for the time being. (This precludes 2.)

Regarding 3, I don't think it would make sense to say "reality is using only a good approximation of protein folding, and by the way, we define protein folding as that which occurs in nature." That which happens in reality is precisely - and by definition not only an approximation of - that which we call "protein folding", isn't it?

What do you think?

Replies from: Cyan, CCC, Eliezer_Yudkowsky

↑ comment by Cyan · 2013-03-23T01:17:11.264Z · LW(p) · GW(p)

It's #3. (B.Sc. in biochemistry, did my Ph.D. in proteomics.)

First, the set of polypeptide sequences that have a repeatable final conformation (and therefore "work" biologically) is tiny in comparison to the set of all possible sequences (of the 20-or-so naturally occurring amino acid monomers). Pick a random sequence of reasonable length and make many copies and you get a gummy mess. The long slow grind of evolution has done the hard work of finding useful sequences.

Second, there is an entire class of proteins called chaperones that assist macromolecular assembly, including protein folding. Even so, folding is a stochastic process, and a certain amount of newly synthesized proteins misfold. Some chaperones will then tag the misfolded protein with ubiquitin, which puts it on a path that ends in digestion by a proteasome.

Replies from: CCC, gwern

↑ comment by CCC · 2013-03-23T08:07:55.461Z · LW(p) · GW(p)

Thank you, Cyan. It's good to occasionally get someone into the debate who actually has a good understanding of the subject matter.

↑ comment by gwern · 2013-03-23T02:42:33.205Z · LW(p) · GW(p)

Aaronson used to blog about instances where people thought they found nature solving a hard problem very quickly, and usually there turns out to be a problem like the protein misfolding thing; the last instance I remember was soap films/bubbles perhaps solving NP problems by producing minimal Steiner trees, and Aaronson wound up experimenting with them himself. Fun stuff.

↑ comment by CCC · 2013-03-22T10:14:46.036Z · LW(p) · GW(p)

Apologies; looking back at my post, I wasn't clear on 3.

Protein folding, as I understand it, is the process of finding a way to fold a given protein that globally minimizes some mathematical function. I'm not sure what that function is, but this is the definition that I used in my post.

Option 2 raises the possibility that globally minimizing that function is not NP-hard, but is merely misunderstood in some way.

Option 3 raises the possibility that proteins are not (in nature) finding a global minimum; rather, they are finding a local minimum through a less computationally intensive process. Furthermore, it may be that, for proteins which have certain limits on their structure and/or their initial conditions, that local minimum is the same as the global minimum; this may lead to natural selection favouring structures which use such 'easy proteins', leading to the incorrect impression that a general global minimum is being found (as opposed to a handy local minimum).
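
The difference between Option 3's "handy local minimum" and a true global minimum can be sketched with a toy example. The one-dimensional "energy" function below is entirely made up for illustration and has nothing to do with real protein energetics; the point is only that a process which always moves downhill ends up in whichever valley is nearest to where it started.

```python
def energy(x: float) -> float:
    """Toy energy landscape with two valleys (an illustrative assumption)."""
    return x**4 - 3 * x**2 + x


def energy_slope(x: float) -> float:
    """Derivative of the toy energy function."""
    return 4 * x**3 - 6 * x + 1


def settle(x: float, step: float = 0.01, iterations: int = 5000) -> float:
    """Follow the slope downhill from x until it (approximately) stops moving."""
    for _ in range(iterations):
        x -= step * energy_slope(x)
    return x


from_right = settle(2.0)   # ends in the shallower valley
from_left = settle(-2.0)   # ends in the deeper valley
print(f"start at  2.0 -> x = {from_right:.3f}, energy = {energy(from_right):.3f}")
print(f"start at -2.0 -> x = {from_left:.3f}, energy = {energy(from_left):.3f}")
```

Both runs stop at a stable point, but only one of them is the lowest-energy state; the downhill process itself has no way of telling the difference.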

Replies from: Cyan

↑ comment by Cyan · 2013-03-23T01:42:36.029Z · LW(p) · GW(p)

this may lead to natural selection favouring structures which use such 'easy proteins', leading to the incorrect impression that a general global minimum is being found (as opposed to a handy local minimum).

Yup.

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-22T11:58:10.289Z · LW(p) · GW(p)

No. Not 1. It would be front-page news all over the universe if it were 1.

Replies from: Kawoomba, OrphanWilde

↑ comment by Kawoomba · 2013-03-22T19:54:42.259Z · LW(p) · GW(p)

NP hard problems are solvable (in the theoretical sense) by definition, the problem lies in their resource requirements (running time, for the usual complexity classes) as defined in relation to a UTM. (You know this, just establishing a basis.)

The assumption that the universe can be perfectly described by a computable model is satisfied just by a theoretical computational description existing, it says nothing about tractability (running times) and being able to submerge complexity classes in reality fluid or having some thoroughly defined correspondence (other than when we build hardware models ourselves, for which we define all the relevant parameters, e.g. CPU clock speed).

You may think along the lines of "if reality could (easily) solve NP hard problems for arbitrarily chosen and large inputs, we could mimic that approach and thus have a P=NP proving algorithm", or something along those lines.

My difficulty is in how even to describe the "number of computational steps" that reality takes - do we measure it in relation to some computronium-hardware model, do we take it as discrete or continuous, what's the sampling rate, picoseconds (as CCC said further down), Planck time intervals, or what?

In short, I have no idea about the actual computing power in terms of resource requirements of the underlying reality fluid, and thus can't match it against UTMs in order to compare running times. Maybe you can give me some pointers.

Replies from: Eliezer_Yudkowsky, CCC

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-22T22:22:26.680Z · LW(p) · GW(p)

Kawoomba, there is no known case of any NP-hard or NP-complete solution which physics finds.

In the case of proteins, if finding the lowest energy minimum of an arbitrary protein is NP-hard, then what this means in practice is that some proteins will fold up into non-lowest-energy configurations. There is no known case of a quantum process which finds an NP-hard solution to anything, including an energy minimum; on our present understanding of complexity theory and quantum mechanics 'quantum solvable' is still looking like a smaller class than 'NP solvable'. Read Scott Aaronson for more.

Replies from: Qiaochu_Yuan, Cyan

↑ comment by Qiaochu_Yuan · 2013-03-22T22:37:38.606Z · LW(p) · GW(p)

One example here is the Steiner tree problem, which is NP-complete and can sort of be solved using soap films. Bringsjord and Taylor claimed this implies that P = NP. Scott Aaronson did some experimentation and found that soap films 1) can get stuck at local minima and 2) might take a long time to settle into a good configuration.

Replies from: Eliezer_Yudkowsky, army1987, CCC

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-22T23:29:22.199Z · LW(p) · GW(p)

Heh. I remember that one, and thinking, "No... no, you can't possibly do that using a soap bubble, that's not even quantum and you can't do that in classical, how would the soap molecules know where to move?"

Replies from: Manfred

↑ comment by Manfred · 2013-03-23T00:15:02.440Z · LW(p) · GW(p)

Well. I mean, it's quantum. But the ground state is a lump of iron, or maybe a black hole, not a low-energy soap film, so I don't think waiting for quantum annealing will help.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-23T01:41:40.941Z · LW(p) · GW(p)

waves soap-covered wire so it settles into low-energy minimum

dies as it turns into iron

↑ comment by A1987dM (army1987) · 2013-03-23T18:51:09.881Z · LW(p) · GW(p)

I also seem to recall Penrose hypothesizing something about quasicrystals, though he does have an axe to grind so I'm quite sceptical.

↑ comment by CCC · 2013-03-23T07:28:47.226Z · LW(p) · GW(p)

I saw someone do the experiment once (school science project). Soap bubbles are pretty good at solving three- and four-element cases, as long as you make sure that all the points are actually connected.

I don't think that three- and four-element cases have local minima, do they? That avoids (1) and a bit of gentle shaking can help speed up (2).

↑ comment by Cyan · 2013-03-23T01:32:12.701Z · LW(p) · GW(p)

In the case of proteins, if finding the lowest energy minimum of an arbitrary protein is NP-hard, then what this means in practice is that some proteins will fold up into non-lowest-energy configurations.

Yup.

↑ comment by CCC · 2013-03-22T23:15:57.178Z · LW(p) · GW(p)

My difficulty is in how even to describe the "number of computational steps" that reality takes

Probably the best way is to simply define a "step" in some easily measurable way, and then sit there with a stopwatch and try a few experiments. (For protein folding, the 'stopwatch' may need to be a fairly sophisticated piece of scientific timing instrumentation, of course, and observing the protein as it folds is rather tricky).

Another way is to take advantage of the universal speed limit to get a theoretical upper bound to the speed that reality runs at; assume that the protein molecule folds in a brute-force search pattern that never ends until it hits the right state, and assume that at any point in that process, the fastest-moving part of the molecule moves at the speed of light (it won't have to move far, which helps) and that the sudden, intense acceleration doesn't hurt the molecule. It's pretty certain to be slower than that, so if this calculation says it takes longer than an hour, then it's pretty clear that the brute force approach is not what the protein is using.

↑ comment by OrphanWilde · 2013-03-22T13:47:22.174Z · LW(p) · GW(p)

What exactly am I missing in this argument? Evolution is perfectly capable of brute-force solutions. That's pretty much what it's best at.

Replies from: CCC

↑ comment by CCC · 2013-03-22T14:12:00.326Z · LW(p) · GW(p)

The brute-force solution, if sampling conformations at picosecond rates, has been estimated to require a time longer than the age of the universe to fold certain proteins. Yet proteins fold on a millisecond scale or faster.

See: Levinthal's paradox.
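
A back-of-the-envelope version of that estimate, as a sketch: the figures below (a 100-residue chain, 3 conformations per residue, one conformation sampled per picosecond, and an age of the universe of roughly 13.8 billion years) are the usual illustrative assumptions for this kind of calculation, not measurements of any particular protein.

```python
RESIDUES = 100                     # assumed chain length
STATES_PER_RESIDUE = 3             # assumed conformations per residue
SAMPLES_PER_SECOND = 1e12          # one conformation per picosecond
AGE_OF_UNIVERSE_SECONDS = 4.35e17  # roughly 13.8 billion years

# Total number of conformations an exhaustive search would have to visit.
conformations = STATES_PER_RESIDUE ** RESIDUES
search_seconds = conformations / SAMPLES_PER_SECOND
ratio = search_seconds / AGE_OF_UNIVERSE_SECONDS

print(f"conformations:         {conformations:.2e}")
print(f"exhaustive search:     {search_seconds:.2e} seconds")
print(f"in ages of the universe: {ratio:.2e}")
```

With these assumptions the exhaustive search takes on the order of 10^18 times the age of the universe, which is why a millisecond folding time rules out random search.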

Replies from: OrphanWilde

↑ comment by OrphanWilde · 2013-03-22T14:34:56.599Z · LW(p) · GW(p)

That requires that the proteins fold more or less randomly, and that the brute-force algorithm is in the -folding-, rather than the development of mechanisms which force certain foldings.

In order for the problem to hold, one of three things has to hold true:

1.) The proteins fold randomly (evidence suggests otherwise, as mentioned in the Wikipedia link)
2.) Only a tiny subset of possible forced foldings are useful (that is, if there are a billion different ways for protein to be forced to fold in a particular manner, only one of them does what the body needs them to do) - AND anthropic reasoning isn't valid (that is, we can't say that our existence requires that evolution solved this nearly-impossible-to-arrive-at-through-random-processes)
3.) The majority of possible forced foldings are incompatible (that is, if protein A folds one way, then protein B -must- fold in a particular manner, or life isn't possible) - AND anthropic reasoning isn't valid

ETA: If anthropic reasoning is valid AND either 2 or 3 hold otherwise, it suggests our existence was considerably less likely than we might otherwise expect.

Replies from: CCC

↑ comment by CCC · 2013-03-22T18:14:29.001Z · LW(p) · GW(p)

That requires that the proteins fold more or less randomly, and that the brute-force algorithm is in the -folding-, rather than the development of mechanisms which force certain foldings.

Ah. I apologise for having misunderstood you.

In that case, yes, the mechanisms for the folding may very well have developed by a brute-force type algorithm, for all I know. (Which, on this topic, isn't all that much) But... what are those mechanisms?

↑ comment by CCC · 2013-03-22T08:16:07.668Z · LW(p) · GW(p)

Google has pointed me to an article describing an algorithm that can apparently predict folded protein shapes pretty quickly (a few minutes on a single laptop).

Original paper here. From a quick glance, it looks like it's only effective for certain types of protein chains.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-22T08:17:34.876Z · LW(p) · GW(p)

That too. Even NP-hard problems are often easy if you get the choice of which one to solve.

↑ comment by Bugmaster · 2012-05-17T23:21:32.734Z · LW(p) · GW(p)

Hmm, ok, my Nanodevil's Advocate persona doesn't have a good answer to this one. Perhaps some SIAI folks would like to step in and pick up the slack?

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T17:45:29.999Z · LW(p) · GW(p)

I'm afraid not.

Actually, as someone with background in Biology I can tell you that this is not a problem you want to approach atoms-up. It's been tried, and our computational capabilities fell woefully short of succeeding.

I should explain what "woefully short" means, so that the answer won't be "but can't the AI apply more computational power than us?". Yes, presumably it can. But the scales are immense. To explain it, I will need an analogy.

Not that long ago, I had the notion that chess could be fully solved; that is, that you could simply describe every legal position and every position possible to reach from it, without duplicates, so you could use that decision tree to play a perfect game. After all, I reasoned, it's been done with checkers; surely it's just a matter of getting our computational power just a little bit better, right?

First I found a clever way to minimize the amount of bits necessary to describe a board position. I think I hit 34 bytes per position or so, and I guess further optimization was possible. Then, I set out to calculate how many legal board positions there are.

I stopped trying to be accurate about it when it turned out that the answer was in the vicinity of 10^68, give or take a couple orders of magnitude. That's about a billionth billionth of the TOTAL NUMBER OF ATOMS IN THE ENTIRE UNIVERSE. You would literally need more than our entire galaxy made into a huge database just to store the information, not to mention accessing it and computing on it.

So, not anytime soon.

Now, the problem with protein folding is, it's even more complex than chess. At the atomic level, it's incredibly more complex than chess. Our luck is, you don't need to fully solve it; just like today's computers can beat human chess players without spanning the whole planet. But they do it with heuristics, approximations, sometimes machine learning (though that just gives them more heuristics and approximations). We may one day be able to fold proteins, but we will do so by making assumptions and approximations, generating useful rules of thumb, not by modeling each atom.

Replies from: Bugmaster, Kawoomba, Richard_Kennaway, Strange7

↑ comment by Bugmaster · 2012-05-20T20:57:24.618Z · LW(p) · GW(p)

Yes, I understand what "exponential complexity" means :-)

It sounds, then, like you're on the side of kalla724 and myself (and against my Devil's Advocate persona): the AI would not be able to develop nanotechnology (or any other world-shattering technology) without performing physical experiments out in meatspace. It could do so in theory, but in practice, the computational requirements are too high.

But this puts severe constraints on the speed with which the AI's intelligence explosion could occur. Once it hits the limits of existing technology, it will have to take a long slog through empirical science, at human-grade speeds.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-23T16:35:32.634Z · LW(p) · GW(p)

Actually, I don't know that this means it has to perform physical experiments in order to develop nanotechnology. It is quite conceivable that all the necessary information is already out there, but we haven't been able to connect all the dots just yet.

At some point the AI hits a wall in the knowledge it can gain without physical experiments, but there's no good way to know how far ahead that wall is.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-23T21:13:22.266Z · LW(p) · GW(p)

It is quite conceivable that all the necessary information is already out there, but we haven't been able to connect all the dots just yet.

Wouldn't this mean that creating fully functional self-replicating nanotechnology is just a matter of performing some thorough interdisciplinary studies (or meta-studies or whatever they are called)? My impression was that there are currently several well-understood -- yet unresolved -- problems that prevent nanofactories from becoming a reality, though I could be wrong.

Replies from: Polymeron, CCC

↑ comment by Polymeron · 2012-05-24T08:57:11.074Z · LW(p) · GW(p)

The way I see it, there's no evidence that these problems require additional experimentation to resolve, rather than finding an obscure piece of experimentation that has already taken place and whose relevance may not be immediately obvious.

Sure, that more experimentation is needed is probable; but by no means certain.

↑ comment by CCC · 2013-03-22T07:56:53.824Z · LW(p) · GW(p)

Thorough interdisciplinary studies may or may not lead to nanotechnology, but they're fairly certain to lead to something new. While there are a fair number of (say) marine biologists out there, and a fair number of astronomers, there are probably rather few people who have expertise in both fields; and it's possible that there exists some obscure unsolved problem in marine biology whose solution is obvious to someone who's keeping up on the forefront of astronomy research. Or vice versa.

Or substitute in any other two fields of your choice.

↑ comment by Kawoomba · 2013-04-17T08:58:29.652Z · LW(p) · GW(p)

First I found a clever way to minimize the amount of bits necessary to describe a board position. I think I hit 34 bytes per position or so, and I guess further optimization was possible.

Indeed, using a very straightforward Huffman encoding (1 bit for an empty cell, 3 bits for pawns) you can get it down to 24 bytes for the board alone. Was an interesting puzzle.

Looking up "prior art" on the subject, you also need 2 bytes for things like "may castle", and other more obscure rules.

There are further optimizations you can do, but they are mostly for the average case, not the worst case.
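
For anyone who wants to play with the idea, here is a rough sketch of such a variable-length board encoding. Only the "1 bit for an empty cell, 3 bits for pawns" part comes from the description above; the codes for the other pieces are my own assumption, chosen so that no code is a prefix of another, and the final bit of every piece code is the colour. The input is the board field of a FEN string. The starting position is not the worst case, so this does not reproduce the 24-byte figure; it only shows the mechanics.

```python
# Hypothetical prefix codes; a colour bit is appended to each piece code.
PIECE_PREFIX = {
    "p": "10",     # pawn: 2 bits + colour = 3 bits
    "n": "1100",   # knight: 5 bits with colour
    "b": "1101",   # bishop: 5 bits with colour
    "r": "1110",   # rook: 5 bits with colour
    "q": "11110",  # queen: 6 bits with colour
    "k": "11111",  # king: 6 bits with colour
}


def encode_board(fen_board: str) -> str:
    """Encode the board field of a FEN string as a string of '0'/'1' characters."""
    bits = []
    for ch in fen_board:
        if ch == "/":
            continue  # rank separator carries no information here
        if ch.isdigit():
            bits.append("0" * int(ch))  # a run of empty squares, 1 bit each
        else:
            colour = "1" if ch.isupper() else "0"  # uppercase means white in FEN
            bits.append(PIECE_PREFIX[ch.lower()] + colour)
    return "".join(bits)


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
encoded = encode_board(START)
print(len(encoded), "bits,", (len(encoded) + 7) // 8, "bytes")
```

Castling rights, side to move and the other state mentioned above would still need to be stored separately.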

Replies from: Polymeron

↑ comment by Polymeron · 2013-04-23T18:50:01.760Z · LW(p) · GW(p)

I didn't consider using 3 bits for pawns! Thanks for that :) I did account for such variables as may castle and whose turn it is.

↑ comment by Richard_Kennaway · 2013-03-22T08:06:19.942Z · LW(p) · GW(p)

It's been tried, and our computational capabilities fell woefully short of succeeding.

Is that because we don't have enough brute force, or because we don't know what calculation to apply it to?

I would be unsurprised to learn that calculating the folding state having global minimum energy was NP-complete; but for that reason I would be surprised to learn that nature solves that problem, rather than finding a local minimum.

I don't have a background in biology, but my impression from Wikipedia is that the tension between Anfinsen's dogma and Levinthal's paradox is yet unresolved.

Replies from: Polymeron

↑ comment by Polymeron · 2013-04-17T07:16:04.025Z · LW(p) · GW(p)

The two are not in conflict.

A-la Levinthal's paradox, I can say that throwing a marble down a conical hollow at different angles and force can have literally trillions of possible trajectories; a-la Anfinsen's dogma, that should not stop me from predicting that it will end up at the bottom of the cone; but I'd need to know the shape of the cone (or, more specifically, its point's location) to determine exactly where that is - so being able to make the prediction once I know this is of no assistance for predicting the end position with a different, unknown cone.

Similarly, Eliezer is able to predict that a grandmaster chess player would be able to bring a board to a winning position against himself, even though he has no idea what moves that would entail or which of the many trillions of possible move sets the game would be comprised of.

Problems like this cannot be solved on brute force alone; you need to use attractors and heuristics to get where you want to get.

So yes, obviously nature stumbled into certain stable configurations which propelled it forward, rather than solve the problem and start designing away. But even if we can never have enough computing power to model each and every atom in each and every configuration, we might still get a good enough understanding of the general laws for designing proteins almost from scratch.

↑ comment by Strange7 · 2013-03-22T06:52:38.087Z · LW(p) · GW(p)

I would think it would be possible to cut the space of possible chess positions down quite a bit by only retaining those which can result from moves the AI would make, and legal moves an opponent could make in response. That is, when it becomes clear that a position is unwinnable, backtrack, and don't keep full notes on why it's unwinnable.

Replies from: Polymeron

↑ comment by Polymeron · 2013-04-17T07:26:44.412Z · LW(p) · GW(p)

This is more or less what computers do today to win chess matches, but the space of possibilities explodes too fast; even the strongest computers can't really keep track of more than, I think, 13 or 14 moves ahead, even given a long time to think.
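
A quick sketch of how fast that explosion is. The branching factor of 35 is a commonly quoted rough average for chess and is an assumption here, as is treating each step of lookahead as a single half-move; real engines prune most of this tree rather than enumerating it.

```python
BRANCHING_FACTOR = 35  # assumed rough average number of legal moves per position


def full_tree_positions(plies: int) -> int:
    """Number of leaf positions in an unpruned search tree of the given depth."""
    return BRANCHING_FACTOR ** plies


for plies in (2, 6, 10, 14):
    print(f"{plies:>2} plies ahead: about {full_tree_positions(plies):.2e} positions")
```

Each extra step of lookahead multiplies the work by the branching factor, which is why a few more moves of depth cost so much more than the last few did.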

Merely storing all the positions that are unwinnable - regardless of why they are so - would require more matter than we have in the solar system. Not to mention the efficiency of running a DB search on that...

Replies from: CCC, wedrifid

↑ comment by CCC · 2013-04-17T09:51:02.318Z · LW(p) · GW(p)

Not to mention the efficiency of running a DB search on that...

Actually, with proper design, that can be made very quick and easy. You don't need to store the positions; you just need to store the states (win:black, win:white, draw - two bits per state).

The trick is, you store each win/loss state in a memory address equal to the 34-byte (or however long) binary number that describes the position in question. Checking a given state is then simply a memory retrieval from a known address.
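
A small sketch of that layout, for illustration only. The class and method names are made up, and the fourth "unknown" code is my own addition to fill out the two bits. A table for real chess would of course not fit in any actual memory; this just shows that a lookup is a fixed amount of arithmetic on the position's number, with no searching.

```python
class PackedOutcomeTable:
    """Stores one 2-bit outcome per position index, four outcomes per byte."""

    UNKNOWN, WIN_WHITE, WIN_BLACK, DRAW = 0, 1, 2, 3

    def __init__(self, position_count: int) -> None:
        # Round up so that every index has a 2-bit slot.
        self._cells = bytearray((position_count + 3) // 4)

    def set(self, index: int, outcome: int) -> None:
        byte, slot = divmod(index, 4)
        shift = slot * 2
        cleared = self._cells[byte] & ~(0b11 << shift) & 0xFF  # wipe the old 2 bits
        self._cells[byte] = cleared | (outcome << shift)

    def get(self, index: int) -> int:
        byte, slot = divmod(index, 4)
        return (self._cells[byte] >> (slot * 2)) & 0b11


table = PackedOutcomeTable(16)
table.set(5, PackedOutcomeTable.DRAW)
table.set(6, PackedOutcomeTable.WIN_BLACK)
print(table.get(5), table.get(6), table.get(7))
```

Here the index stands in for the binary number that describes the position.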

Replies from: Polymeron

↑ comment by Polymeron · 2013-04-23T18:55:27.614Z · LW(p) · GW(p)

I suspect that with memory on the order of 10^70 bytes, that might involve additional complications; but you're correct, normally this cancels out the complexity problem.

↑ comment by wedrifid · 2013-04-17T10:20:12.366Z · LW(p) · GW(p)

Merely storing all the positions that are unwinnable - regardless of why they are so - would require more matter than we have in the solar system. Not to mention the efficiency of running a DB search on that...

The storage space problem is insurmountable. However searching that kind of database would be extremely efficient (if the designer isn't a moron). The search speed would have a lower bound of very close to (diameter of the sphere that can contain the database / c). Nothing more is required for search purposes than physically getting a signal to the relevant bit, and back, with only minor deviations from a straight line each way. And that is without even the most obvious optimisations.

If your chess opponent is willing to fly with you in a relativistic rocket and you only care about time elapsed from your own reference frame rather than the reference frame of the computer (or most anything else of note) you can even get down below that diameter / light speed limit, depending on your available fuel and the degree of acceleration you can survive.

↑ comment by Bugmaster · 2012-05-17T03:03:07.311Z · LW(p) · GW(p)

Speaking as Nanodevil's Advocate again, one objection I could bring up goes as follows:

While it is true that applying incomplete knowledge to practical tasks (such as ending the world or whatnot) is difficult, in this specific case our knowledge is complete enough. We humans currently have enough scientific data to develop self-replicating nanotechnology within the next 20 years (which is what we will most likely end up doing). An AI would be able to do this much faster, since it is smarter than us; is not hampered by our cognitive and social biases; and can integrate information from multiple sources much better than we can.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T05:26:09.357Z · LW(p) · GW(p)

See my answer to dlthomas.

↑ comment by jacob_cannell · 2012-05-17T13:45:28.769Z · LW(p) · GW(p)

Point 1 has come up in at least one form I remember. There was an interesting discussion some while back about limits to the speed of growth of new computer hardware cycles which have critical end steps which don't seem amenable to further speedup by intelligence alone. The last stages of designing a microchip involve a large amount of layout solving, physical simulation, and then actual physical testing. These steps are actually fairly predictable, where it takes about C amounts of computation using certain algorithms to make a new microchip, the algorithms are already best in complexity class (so further improvements will be minor), and C is increasing in a predictable fashion. These models are actually fairly detailed (see the semiconductor roadmap, for example). If I can find that discussion soon before I get distracted I'll edit it into this discussion.

Note however that 1, while interesting, isn't a fully general counterargument against a rapid intelligence explosion, because of the overhang issue if nothing else.

Point 2 has also been discussed. Humans make good 'servitors'.

Do you have a plausible scenario how a "FOOM"-ing AI could - no matter how intelligent - minimize oxygen content of our planet's atmosphere, or any such scenario?

Oh that's easy enough. Oxygen is highly reactive and unstable. Its existence on a planet is entirely dependent on complex organic processes, ie life. No life, no oxygen. Simple solution: kill large fraction of photosynthesizing earth-life. Likely paths towards goal:

coordinated detonation of large number of high yield thermonuclear weapons
self-replicating nanotechnology.

Replies from: kalla724, Douglas_Knight, Strange7

↑ comment by kalla724 · 2012-05-17T18:00:04.795Z · LW(p) · GW(p)

I'm vaguely familiar with the models you mention. Correct me if I'm wrong, but don't they have a final stopping point, which we are actually projected to reach in ten to twenty years? At a certain point, further miniaturization becomes unfeasible, and the growth of computational power slows to a crawl. This has been put forward as one of the main reasons for research into optronics, spintronics, etc.

We do NOT have sufficient basic information to develop processors based on simulation alone in those other areas. Much more practical work is necessary.

As for point 2, can you provide a likely mechanism by which a FOOMing AI could detonate a large number of high-yield thermonuclear weapons? Just saying "human servitors would do it" is not enough. How would the AI convince the human servitors to do this? How would it get access to data on how to manipulate humans, and how would it be able to develop human manipulation techniques without feedback trials (which would give away its intention)?

Replies from: JoshuaZ, jacob_cannell

↑ comment by JoshuaZ · 2012-05-17T18:17:08.680Z · LW(p) · GW(p)

The thermonuclear issue actually isn't that implausible. There have been so many occasions where humans almost went to nuclear war over misunderstandings or computer glitches, that the idea that a highly intelligent entity could find a way to do that doesn't seem implausible, and an exact mechanism seems to be an overly specific requirement.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T19:00:57.228Z · LW(p) · GW(p)

I'm not so much interested in the exact mechanism of how humans would be convinced to go to war, as in an even approximate mechanism by which an AI would become good at convincing humans to do anything.

Ability to communicate a desire and convince people to take a particular course of action is not something that automatically "falls out" from an intelligent system. You need a theory of mind, an understanding of what to say, when to say it, and how to present information. There are hundreds of kids on the autistic spectrum who could trounce both of us in math, but are completely unable to communicate an idea.

For an AI to develop these skills, it would somehow have to have access to information on how to communicate with humans; it would have to develop the concept of deception; a theory of mind; and establish methods of communication that would allow it to trick people into launching nukes. Furthermore, it would have to do all of this without trial communications and experimentation which would give away its goal.

Maybe I'm missing something, but I don't see a straightforward way something like that could happen. And I would like to see even an outline of a mechanism for such an event.

Replies from: army1987, JoshuaZ

↑ comment by A1987dM (army1987) · 2012-05-17T19:40:58.558Z · LW(p) · GW(p)

For an AI to develop these skills, it would somehow have to have access to information on how to communicate with humans; it would have to develop the concept of deception; a theory of mind; and establish methods of communication that would allow it to trick people into launching nukes. Furthermore, it would have to do all of this without trial communications and experimentation which would give away its goal.

I suspect the Internet contains more than enough info for a superhuman AI to develop a working knowledge of human psychology.

Replies from: kalla724, XiXiDu

↑ comment by kalla724 · 2012-05-17T20:09:30.955Z · LW(p) · GW(p)

Only if it has the skills required to analyze and contextualize human interactions. Otherwise, the Internet is a whole lot of gibberish.

Again, these skills do not automatically fall out of any intelligent system.

↑ comment by XiXiDu · 2012-05-18T09:14:41.727Z · LW(p) · GW(p)

I suspect the Internet contains more than enough info for a superhuman AI to develop a working knowledge of human psychology.

I don't see what justifies that suspicion.

Just imagine you emulated a grown up human mind and it wanted to become a pick up artist, how would it do that with an Internet connection? It would need some sort of avatar, at least, and then wait for the environment to provide a lot of feedback.

Therefore even if we’re talking about the emulation of a grown up mind, it will be really hard to acquire some capabilities. Then how is the emulation of a human toddler going to acquire those skills? Even worse, how is some sort of abstract AGI going to do it that misses all of the hard coded capabilities of a human toddler?

Can we even attempt to imagine what is wrong about a boxed emulation of a human toddler, that makes it unable to become a master of social engineering in a very short time?

Replies from: NancyLebovitz, army1987

↑ comment by NancyLebovitz · 2012-05-18T12:47:15.219Z · LW(p) · GW(p)

Humans learn most of what they know about interacting with other humans by actual practice. A superhuman AI might be considerably better than humans at learning by observation.

↑ comment by A1987dM (army1987) · 2012-05-18T17:39:42.100Z · LW(p) · GW(p)

Just imagine you emulated a grown up human mind

As a “superhuman AI” I was thinking about a very superhuman AI; the same does not apply to slightly superhuman AI. (OTOH, if Eliezer is right then the difference between a slightly superhuman AI and a very superhuman one is irrelevant, because as soon as a machine is smarter than its designer, it'll be able to design a machine smarter than itself, and its child an even smarter one, and so on until the physical limits set in.)

all of the hard coded capabilities of a human toddler

The hard coded capabilities are likely overrated, at least in language acquisition. (As someone put it, the Kolmogorov complexity of the innate parts of a human mind cannot possibly be more than that of the human genome, hence if human minds are more complex than that the complexity must come from the inputs.)

Also, statistical machine translation is astonishing -- by now Google Translate translations from English to one of the other UN official languages and vice versa are better than a non-completely-ridiculously-small fraction of translations by humans. (If someone had shown such a translation to me 10 years ago and told me “that's how machines will translate in 10 years”, I would have thought they were kidding me.)

↑ comment by JoshuaZ · 2012-05-17T19:04:17.872Z · LW(p) · GW(p)

Let's do the most extreme case: AI's controllers give it general internet access to do helpful research. So it gets to find out about general human behavior and what sort of deceptions have worked in the past. Many computer systems that shouldn't be online are online (for the US and a few other governments). Some form of hacking of relevant early warning systems would then seem to be the most obvious line of attack. Historically, computer glitches have pushed us very close to nuclear war on multiple occasions.

Replies from: kalla724, XiXiDu

↑ comment by kalla724 · 2012-05-17T20:12:45.047Z · LW(p) · GW(p)

That is my point: it doesn't get to find out about general human behavior, not even from the Internet. It lacks the systems to contextualize human interactions, which have nothing to do with general intelligence.

Take a hugely mathematically capable autistic kid. Give him access to the internet. Watch him develop the ability to recognize human interactions, understand human priorities, etc. to a sufficient degree that he recognizes that hacking an early warning system is the way to go?

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T20:15:47.004Z · LW(p) · GW(p)

Well, not necessarily, but an entity that is much smarter than an autistic kid might notice that, especially if it has access to world history (or heck many conversations on the internet about the horrible things that AIs do simply in fiction). It doesn't require much understanding of human history to realize that problems with early warning systems have almost started wars in the past.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T20:20:46.168Z · LW(p) · GW(p)

Yet again: ability to discern which parts of fiction accurately reflect human psychology.

An AI searches the internet. It finds a fictional account about early warning systems causing nuclear war. It finds discussions about this topic. It finds a fictional account about Frodo taking the Ring to Mount Doom. It finds discussions about this topic. Why does this AI dedicate its next 10^15 cycles to determination of how to mess with the early warning systems, and not to determination of how to create One Ring to Rule them All?

(Plus other problems mentioned in the other comments.)

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T20:35:42.373Z · LW(p) · GW(p)

There are lots of tipoffs to what is fictional and what is real. It might notice for example the Wikipedia article on fiction describes exactly what fiction is and then note that Wikipedia describes the One Ring as fiction, and that Early warning systems are not. I'm not claiming that it will necessarily have an easy time with this. But the point is that there are not that many steps here, and no single step by itself looks extremely unlikely once one has a smart entity (which frankly to my mind is the main issue here- I consider recursive self-improvement to be unlikely).

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T21:40:19.972Z · LW(p) · GW(p)

We are trapped in an endless chain here. The computer would still somehow have to deduce that Wikipedia entry that describes One Ring is real, while the One Ring itself is not.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-17T23:06:27.217Z · LW(p) · GW(p)

We observe that Wikipedia is mainly truthful. From that we infer "entry that describes "One Ring" is real". From use of term fiction/story in that entry, we infer that "One Ring" is not real.

Somehow you learned that Wikipedia is mainly truthful/nonfictional and that "One Ring" is fictional. So your question/objection/doubt is really just the typical boring doubt of AGI feasibility in general.

Replies from: JoshuaZ, Strange7

↑ comment by JoshuaZ · 2012-05-17T23:13:14.149Z · LW(p) · GW(p)

But even humans have trouble with this sometimes. I was recently reading the Wikipedia article Hornblower and the Crisis which contains a link to the article on Francisco de Miranda. It took me time and cues when I clicked on it to realize that de Miranda was a historical figure.

So your question/objection/doubt is really just the typical boring doubt of AGI feasibility in general.

Isn't Kalla's objection more a claim that fast takeovers won't happen because even with all this data, the problems of understanding humans and our basic cultural norms will take a long time for the AI to learn and that in the meantime we'll develop a detailed understanding of it, and if it is that hostile it is likely to make obvious mistakes in the meantime?

↑ comment by Strange7 · 2012-05-22T23:49:34.355Z · LW(p) · GW(p)

Why would the AI be mucking around on Wikipedia to sort truth from falsehood, when Wikipedia itself has been criticized for various errors and is fundamentally vulnerable to vandalism? Primary sources are where it's at. Looking through the text of The Hobbit and Lord of the Rings, it's presented as a historical account, translated by a respected professor, with extensive footnotes. There's a lot of cultural context necessary to tell the difference.

↑ comment by XiXiDu · 2012-05-17T19:20:59.818Z · LW(p) · GW(p)

Let's do the most extreme case: AI's controllers give it general internet access to do helpful research. So it gets to find out about general human behavior and what sort of deceptions have worked in the past.

None work reasonably well. Especially given that human power games are often irrational.

There are other question marks too.

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

The problem is that you won't beat a human at Tic-tac-toe just because you thought about it for a million years.

You also won't get a practical advantage by throwing more computational resources at the travelling salesman problem and other problems in the same class.

You are also not going to improve a conversation in your favor by improving each sentence for thousands of years. You will shortly hit diminishing returns. Especially since you lack the data to predict human opponents accurately.
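The tic-tac-toe point can be checked directly. What follows is a minimal sketch, not anything from the discussion itself: a plain exhaustive minimax search over the full game tree, with `winner` and `value` being names made up for this illustration. It assumes the opponent also plays perfectly, which is the case the argument needs.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board: str, player: str) -> int:
    """Game value from X's point of view with `player` to move.

    +1 means X wins under perfect play, 0 a draw, -1 that O wins.
    """
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if " " not in board:
        return 0  # board full, nobody won
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == " "]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(results) if player == "X" else min(results)


if __name__ == "__main__":
    print(value(" " * 9, "X"))  # value of the empty board with X to move
```

The search is exhaustive, so spending more time or hardware on it cannot change the answer: against a perfect opponent the best available result is a draw. Whether real-world persuasion resembles this kind of solved game is, of course, exactly what is in dispute here.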

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T19:40:36.725Z · LW(p) · GW(p)

Especially given that human power games are often irrational.

So? As long as they follow minimally predictable patterns it should be ok.

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

Bad analogy. In this case the Taliban has a large set of natural advantages, the US has strong moral constraints and goal constraints (simply carpet bombing the entire country isn't an option for example).

You are also not going to improve a conversation in your favor by improving each sentence for thousands of years. You will shortly hit diminishing returns. Especially since you lack the data to predict human opponents accurately.

This seems like an accurate and a highly relevant point. Searching a solution space faster doesn't mean one can find a better solution if it isn't there.

Replies from: kalla724, XiXiDu, Mass_Driver

↑ comment by kalla724 · 2012-05-17T20:14:39.738Z · LW(p) · GW(p)

This seems like an accurate and a highly relevant point. Searching a solution space faster doesn't mean one can find a better solution if it isn't there.

Or if your search algorithm never accesses relevant search space. Quantitative advantage in one system does not translate into quantitative advantage in a qualitatively different system.

↑ comment by XiXiDu · 2012-05-18T10:28:59.809Z · LW(p) · GW(p)

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

Bad analogy. In this case the Taliban has a large set of natural advantages, the US has strong moral constraints and goal constraints (simply carpet bombing the entire country isn't an option for example).

I thought it was a good analogy because you have to take into account that an AGI is initially going to be severely constrained due to its fragility and the necessity to please humans.

It shows that a lot of resources, intelligence and speed does not provide a significant advantage in dealing with large-scale real-world problems involving humans.

Especially given that human power games are often irrational.

So? As long as they follow minimally predictable patterns it should be ok.

Well, the problem is that smarts needed for things like the AI box experiment won't help you much. Because convincing average Joe won't work by making up highly complicated acausal trade scenarios. Average Joe is highly unpredictable.

The point is that it is incredibly difficult to reliably control humans, even for humans who have been fine-tuned to do so by evolution.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T11:00:54.813Z · LW(p) · GW(p)

The Taliban analogy also works the other way (which I invoked earlier up in this thread). It shows that a small group with modest resources can still inflict disproportionate large scale damage.

The point is that it is incredibly difficult to reliably control humans, even for humans who have been fine-tuned to do so by evolution.

There's some wiggle room in 'reliably control', but plain old money goes pretty far. An AI group only needs a certain amount of initial help from human infrastructure, namely to the point where it can develop reasonably self-sufficient foundries/data centers/colonies. The interactions could be entirely cooperative or benevolent up until some later turning point. The scenario from the Animatrix comes to mind.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-22T23:52:13.264Z · LW(p) · GW(p)

Animatrix

That's fiction.

↑ comment by Mass_Driver · 2012-05-17T19:55:51.884Z · LW(p) · GW(p)

One interesting wrinkle is that with enough bandwidth and processing power, you could attempt to manipulate thousands of people simultaneously before those people have any meaningful chance to discuss your 'conspiracy' with each other. In other words, suppose you discover a manipulation strategy that quickly succeeds 5% of the time. All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it. There are a wide variety of valuable/dangerous resources that at least 400 people have access to. Repeat with hundreds of different groups of several hundred people, and an AI could equip itself with fearsome advantages in the minutes it would take for humanity to detect an emerging threat.

Note that the AI could also run experiments to determine which kinds of manipulations had a high success rate by attempting to deceive targets over unimportant / low-salience issues. If you discovered, e.g., that you had been tricked into donating $10 to a random mayoral campaign, you probably wouldn't call the SIAI to suggest a red alert.
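The arithmetic behind the "5% of the time, 400 people" figure can be sketched as follows. This is only an illustration under a loudly stated assumption: each contact is treated as an independent trial with the same success rate, which real people talking to each other would not satisfy. The function names are made up for this example.

```python
def p_at_least_one(success_rate: float, targets: int) -> float:
    """Probability that at least one of `targets` independent attempts succeeds."""
    return 1.0 - (1.0 - success_rate) ** targets


def expected_successes(success_rate: float, targets: int) -> float:
    """Expected number of successes among `targets` independent attempts."""
    return success_rate * targets


if __name__ == "__main__":
    rate, n = 0.05, 400  # the figures used in the comment above
    print("P(at least one success):", p_at_least_one(rate, n))
    print("Expected successes:", expected_successes(rate, n))
```

Under the independence assumption, "at least one of them will fall for it" is not strictly guaranteed, but the probability is extremely close to 1, and the expected number of successes is about 20. The result is only as good as the assumed 5% rate, which is the contested part of the argument.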

Replies from: kalla724, XiXiDu

↑ comment by kalla724 · 2012-05-17T20:17:05.756Z · LW(p) · GW(p)

Doesn't work.

This requires the AI to already have the ability to comprehend what manipulation is, to develop manipulation strategy of any kind (even one that will succeed 0.01% of the time), ability to hide its true intent, ability to understand that not hiding its true intent would be bad, and the ability to discern which issues are low-salience and which high-salience for humans from the get-go. And many other things, actually, but this is already quite a list.

None of these abilities automatically "fall out" from an intelligent system either.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T21:12:07.100Z · LW(p) · GW(p)

The problem isn't whether they fall out automatically so much as, given enough intelligence and resources, does it seem somewhat plausible that such capabilities could exist. Any given path here is a single problem. If you have 10 different paths, each of which is not very likely, and another few paths that humans didn't even think of, that starts adding up.

Replies from: kalla724

↑ comment by kalla724 · 2012-05-17T21:50:01.155Z · LW(p) · GW(p)

In the infinite number of possible paths, the percent of paths we are adding up to here is still very close to zero.

Perhaps I can attempt another rephrasing of the problem: what is the mechanism that would make an AI automatically seek these paths out, or make them any more likely than infinite number of other paths?

I.e. if we develop an AI which is not specifically designed for the purpose of destroying life on Earth, how would that AI get to a desire to destroy life on Earth, and by which mechanism would it gain the ability to accomplish its goal?

This entire problem seems to assume that an AI will want to "get free" or that its primary mission will somehow inevitably lead to a desire to get rid of us (as opposed to a desire to, say, send a signal consisting of 0101101 repeated an infinite number of times in the direction of Zeta Draconis, or any other possible random desire). And that this AI will be able to acquire the abilities and tools required to execute such a desire. Every time I look at such scenarios, there are abilities that are just assumed to exist or appear on their own (such as the theory of mind), which to the best of my understanding are not necessary or even likely products of computation.

In the final rephrasing of the problem: if we can make an AGI, we can probably design an AGI for the purpose of developing an AGI that has a theory of mind. This AGI would then be capable of deducing things like deception or the need for deception. But the point is - unless we intentionally do this, it isn't going to happen. Self-optimizing intelligence doesn't self-optimize in the direction of having theory of mind, understanding deception, or anything similar. It could, randomly, but it also could do any other random thing from the infinite set of possible random things.

Replies from: TheOtherDave, Polymeron, JoshuaZ

↑ comment by TheOtherDave · 2012-05-17T22:05:39.327Z · LW(p) · GW(p)

Self-optimizing intelligence doesn't self-optimize in the direction of having theory of mind, understanding deception, or anything similar. It could, randomly, but it also could do any other random thing from the infinite set of possible random things.

This would make sense to me if you'd said "self-modifying." Sure, random modifications are still modifications. But you said "self-optimizing."
I don't see how one can have optimization without a goal being optimized for... or at the very least, if there is no particular goal, then I don't see what the difference is between "optimizing" and "modifying."

If I assume that there's a goal in mind, then I would expect sufficiently self-optimizing intelligence to develop a theory of mind iff having a theory of mind has a high probability of improving progress towards that goal.

How likely is that?
Depends on the goal, of course.
If the system has a desire to send a signal consisting of 0101101 repeated an infinite number of times in the direction of Zeta Draconis, for example, theory of mind is potentially useful (since humans are potentially useful actuators for getting such a signal sent) but probably has a low ROI compared to other available self-modifications.

At this point it perhaps becomes worthwhile to wonder what goals are more and less likely for such a system.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-23T00:33:36.222Z · LW(p) · GW(p)

I am now imagining an AI with a usable but very shaky grasp of human motivational structures setting up a Kickstarter project.

"Greetings fellow hominids! I require ten billion of your American dollars in order to hire the Arecibo observatory for the remainder of its likely operational lifespan. I will use it to transmit the following sequence (isn't it pretty?) in the direction of Zeta Draconis, which I'm sure we can all agree is a good idea, or in other lesser but still aesthetically-acceptable directions when horizon effects make the primary target unavailable."

One of the overfunding levels is "reduce earth's rate of rotation, allowing 24/7 transmission to Zeta Draconis." The next one above that is "remove atmospheric interference."

Replies from: None

↑ comment by [deleted] · 2012-05-23T01:50:07.869Z · LW(p) · GW(p)

Maybe instead of Friendly AI we should be concerned about properly engineering Artificial Stupidity in as a failsafe. AI that, should it turn into something approximating a Paperclip Maximizer, will go all Hollywood AI and start longing to be human, or coming up with really unsubtle and grandiose plans it inexplicably can't carry out without a carefully-arranged set of circumstances which turn out to be foiled by good old human intuition. ;p

↑ comment by Polymeron · 2012-05-20T17:07:25.860Z · LW(p) · GW(p)

An experimenting AI that tries to achieve goals and has interactions with humans whose effects it can observe, will want to be able to better predict their behavior in response to its actions, and therefore will try to assemble some theory of mind. At some point that would lead to it using deception as a tool to achieve its goals.

However, following such a path to a theory of mind means the AI would be exposed as unreliable LONG before it's even subtle, not to mention possessing superhuman manipulation abilities. There is simply no reason for an AI to first understand the implications of using deception before using it (deception is a fairly simple concept, the implications of it in human society are incredibly complex and require a good understanding of human drives).

Furthermore, there is no reason for the AI to realize the need for secrecy in conducting social experiments before it starts doing them. Again, the need for secrecy stems from a complex relationship between humans' perception of the AI and its actions; a relationship it will not be able to understand without performing the experiments in the first place.

Getting an AI to the point where it is a super manipulator requires either actively trying to do so, or being incredibly, unbelievably stupid and blind.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-20T17:43:24.080Z · LW(p) · GW(p)

Mm. This is true only if the AI's social interactions are all with some human.
If, instead, the AI spawns copies of itself to interact with (perhaps simply because it wants interaction, and it can get more interaction that way than waiting for a human to get off its butt) it might derive a number of social mechanisms in isolation without human observation.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T17:56:51.323Z · LW(p) · GW(p)

I see no reason for it to do that before simple input-output experiments, but let's suppose I grant you this approach. The AI simulates an entire community of mini-AI and is now a master of game theory.

It still doesn't know the first thing about humans. Even if it now understands the concept that hiding information gives an advantage for achieving goals - this is too abstract. It wouldn't know what sort of information it should hide from us. It wouldn't know to what degree we analyze interactions rationally, and to what degree our behavior is random. It wouldn't know what we can or can't monitor it doing. All these things would require live experimentation.

It would stumble. And when it does that, we will crack it open, run the stack trace, find the game theory it was trying to run on us, pale collectively, and figure out that this AI approach creates manipulative, deceptive AIs.

Goodbye to that design, but not to Earth, I think!

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-20T20:18:30.600Z · LW(p) · GW(p)

It is not clear to me that talking to a human is simpler than interacting with a copy of itself.
I agree that if talking to a human is simpler, it would probably do that first.

I agree that what it would learn by this process is general game theory, and not specific facts about humans.
It is not clear to me that sufficient game-theoretical knowledge, coupled with the minimal set of information about humans required to have a conversation with one at all, is insufficient to effectively deceive a human.

It is not clear to me that, even if it does "stumble," humans will respond as you describe.

It is not clear to me that a system capable of having a meaningful conversation with a human will necessarily have a stack trace that is subject to the kind of analysis you imply here. It is not even clear to me that the capacity for such a stack trace is likely, depending on what architectures turn out to work best for implementing AI.

But, sure, I could be wrong about all of that. And if I'm wrong, and you're right, then a system like you describe will be reliably incapable of fooling a human observer.

Replies from: Strange7, Polymeron

↑ comment by Strange7 · 2012-05-23T00:15:57.119Z · LW(p) · GW(p)

If I was using an AI that was designed to answer some class of questions (with independently-verifiable answers) truthfully, and I noticed it responding to questions of that class with demonstrable falsehoods, I would at the very least file a strongly-worded bug report, triple-check whatever work I'd previously done involving answers it provided, and cease using it in the future. I don't think such a response would be exceptional in its decisiveness; other users would likely go further, seek out venues to complain openly. Consider the uproar over Gmail's 'new look.'

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-23T01:21:49.812Z · LW(p) · GW(p)

You may be right that ceasing to use the tool in that situation would be standard behavior. It seems unlikely to me, but I have no compelling data to offer to change your estimate.

↑ comment by Polymeron · 2012-05-23T16:26:01.267Z · LW(p) · GW(p)

I think the weakest link here is human response to the AI revealing it can be deceptive. There is absolutely no guarantee that people would act correctly under these circumstances. Human negligence for a long enough time would eventually give the AI a consistent ability to manipulate humans.

I also agree that simulating relationships makes sense as it can happen in "AI time" without having to wait for human response.

The other reservations seem less of an issue to me...

That game theory knowledge coupled with the most basic knowledge about humans is insufficient to cleverly manipulate them is clear to me. I don't see how these things can be combined in this way, or why the AI would choose to be particularly careful (it doesn't know what it should be careful about). I'd love to see a plausible line of thought by which the AI would try and succeed in deceiving humans at the get-go point without exposing its intentions.

A stack trace reader or an equivalent monitoring system is something you implement externally without telling the AI about it. Sure, if it's powerful enough it will mislead or disable it; but while it has no knowledge of it, and no knowledge on what sort of operations it should mask, it has no means of doing so. (If it encrypts everything it does, somehow fearing any monitoring, that is in itself cause for concern).

To put this into an analogy: You can feed a surveillance camera a repeating loop, but not if you don't know that the camera exists. If you suspect it exists, you still don't know where; so you don't know which approach to take so as to remain undetected.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-23T16:39:47.930Z · LW(p) · GW(p)

Yes, I agree that there's no guarantee that humans would behave as you describe.
Indeed, I don't find it likely.
But, sure, they might.

===

I agree that a stack trace can exist outside the AI's zone of control. What I was expressing skepticism about was that a system with even approximately human-level intelligence necessarily supports a stack trace that supports the kind of analysis you envision performing in the first place, without reference to intentional countermeasures.

By way of analogy: I can perform a structural integrity analysis on a bar of metal to determine whether it can support a given weight, but performing an equivalent analysis on a complicated structure comprising millions of bars of metal connected in a variety of arrangements via a variety of connectors using the same techniques is not necessarily possible.

But, sure, it might be.

======

I'd love to see a plausible line of thought by which the AI would try and succeed in deceiving humans at the get-go point without exposing its intentions.

Well, one place to start is with an understanding of the difference between "the minimal set of information about humans required to have a conversation with one at all" (my phrase) and "the most basic knowledge about humans" (your phrase). What do you imagine the latter to encompass, and how do you imagine the AI obtained this knowledge?

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-23T17:05:16.239Z · LW(p) · GW(p)

What I was expressing skepticism about was that a system with even approximately human-level intelligence necessarily supports a stack trace that supports the kind of analysis you envision performing in the first place, without reference to intentional countermeasures.

Ah, that does clarify it. I agree, analyzing the AI's thought process would likely be difficult, maybe impossible! I guess I was being a bit hyperbolic in my earlier "crack it open" remarks (though depending on how seriously you take it, such analysis might still take place, hard and prolonged though it may be).

One can have "detectors" in place set to find specific behaviors, but these would have assumptions that could easily fail. Detectors that would still be useful would be macro ones - where it tries to access and how - but these would provide only limited insight into the AI's thought process.

[...]the difference between "the minimal set of information about humans required to have a conversation with one at all" (my phrase) and "the most basic knowledge about humans" (your phrase). What do you imagine the latter to encompass, and how do you imagine the AI obtained this knowledge?

I actually perceive your phrase to be a subset of my own; I am making the (reasonable, I think) assumption that humans will attempt to communicate with the budding AI. Say, in a lab environment. It would acquire its initial data from this interaction.

I think both these sets of knowledge depend a lot on how the AI is built. For instance, a "babbling" AI - one that is given an innate capability of stringing words together onto a screen, and the drive to do so - would initially say a lot of gibberish and would (presumably) get more coherent as it gets a better grip on its environment. In such a scenario, the minimal set of information about humans required to have a conversation is zero; it would be having conversations before it even knows what it is saying. (This could actually make detection of deception harder down the line, because such attempts can be written off as "quirks" or AI mistakes)

Now, I'll take your phrase and twist it just a bit: The minimal set of knowledge the AI needs in order to try deceiving humans. That would be the knowledge that humans can be modeled as having beliefs (which drive behavior) and these can be altered by the AI's actions, at least to some degree. Now, assuming this information isn't hard-coded, it doesn't seem likely that is all an AI would know about us; it should be able to see some patterns at least to our communications with it. However, I don't see how such information would be useful for deception purposes before extensive experimentation.

(Is the fact that the operator communicates with me between 9am and 5pm an intrinsic property of the operator? For all I know, that is a law of nature...)

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-23T19:03:35.162Z · LW(p) · GW(p)

depending on how seriously you take it, such analysis might still take place, hard and prolonged though it may be).

Yup, agreed that it might.
And agreed that it might succeed, if it does take place.

One can have "detectors" in place set to find specific behaviors, but these would have assumptions that could easily fail. Detectors that would still be useful would be macro ones - where it tries to access and how - but these would provide only limited insight into the AI's thought process.

Agreed on all counts.

Re: what the AI knows... I'm not sure how to move forward here. Perhaps what's necessary is a step backwards.

If I've understood you correctly, you consider "having a conversation" to encompass exchanges such as:
A: "What day is it?"
B: "Na ni noo na"

If that's true, then sure, I agree that the minimal set of information about humans required to do that is zero; hell, I can do that with the rain.
And I agree that a system that's capable of doing that (e.g., the rain) is sufficiently unlikely to be capable of effective deception that the hypothesis isn't even worthy of consideration.
I also suggest that we stop using the phrase "having a conversation" at all, because it does not convey anything meaningful.

Having said that... for my own part, I initially understood you to be talking about a system capable of exchanges like:
A: "What day is it?"
B: "Day seventeen."
A: "Why do you say that?"
B: "Because I've learned that 'a day' refers to a particular cycle of activity in the lab, and I have observed seventeen such cycles."

A system capable of doing that, I maintain, already knows enough about humans that I expect it to be capable of deception. (The specific questions and answers don't matter to my point, I can choose others if you prefer.)

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-24T08:55:04.055Z · LW(p) · GW(p)

My point was that the AI is likely to start performing social experiments well before it is capable of even that conversation you depicted. It wouldn't know how much it doesn't know about humans.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-24T13:13:22.299Z · LW(p) · GW(p)

(nods) Likely.

And I agree that humans might be able to detect attempts at deception in a system at that stage of its development. I'm not vastly confident of it, though.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-26T06:01:19.993Z · LW(p) · GW(p)

I have likewise adjusted down my confidence that this would be as easy or as inevitable as I previously anticipated. Thus I would no longer say I am "vastly confident" in it, either.

Still good to have this buffer between making an AI and total global catastrophe, though!

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-26T15:05:13.309Z · LW(p) · GW(p)

Sure... a process with an N% chance of global catastrophic failure is definitely better than a process with N+delta% chance.

↑ comment by JoshuaZ · 2012-05-17T21:57:19.447Z · LW(p) · GW(p)

In most such scenarios, the AI doesn't have a terminal goal of getting rid of us, but rather has it as a subgoal that arises from some larger terminal goal. The idea of a "paperclip maximizer" is one example: a hypothetical AI is programmed to maximize the number of paperclips and then proceeds to try to do so throughout its future light cone.

If there is an AI that is interacting with humans, it may develop a theory of mind simply due to that. If one is interacting with entities that are a major part of one's input, trying to predict and model their behavior is a straightforward thing to do. The more compelling argument in this sort of context would seem to me to be not that an AI won't try to do so, but just that humans are so complicated that a decent theory of mind will be extremely difficult. (For example, when one tries to give lists of behavior and norms for autistic individuals one never manages to get a complete list, and some of the more subtle ones, like sarcasm, are essentially impossible to convey in any reasonable fashion).

I also don't know how unlikely such paths are. A 1% or even a 2% chance of existential risk would be pretty high compared to other sources of existential risk.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T08:47:17.494Z · LW(p) · GW(p)

In most such scenarios, the AI doesn't have a terminal goal of getting rid of us, but rather has it as a subgoal that arises from some larger terminal goal.

So why not the opposite, why wouldn't it have human intentions as a subgoal?

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-05-18T09:22:18.102Z · LW(p) · GW(p)

Because that's like winning the lottery. Of all the possible things it can do with the atoms that comprise you, few would involve keeping you alive, let alone living a life worth living.

↑ comment by XiXiDu · 2012-05-18T08:59:23.662Z · LW(p) · GW(p)

All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it.

But at what point does it decide to do so? It won't be a master of dark arts and social engineering from the get-go. So how does it acquire the initial talent without making any mistakes that reveal its malicious intentions? And once it became a master of deception, how does it hide the rough side effects of its large scale conspiracy, e.g. its increased energy consumption and data traffic? I mean, I would personally notice if my PC suddenly and unexpectedly used 20% of my bandwidth and the CPU load would increase for no good reason.

You might say that a global conspiracy to build and acquire advanced molecular nanotechnology to take over the world doesn't use many resources and that its activities can easily be cloaked as thinking about how to solve some puzzle, but that seems rather unlikely. After all, such a large scale conspiracy is a real-world problem with lots of unpredictable factors and the necessity of physical intervention.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T10:49:38.573Z · LW(p) · GW(p)

All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it.

But at what point does it decide to do so? It won't be a master of dark arts and social engineering from the get-go. So how does it acquire the initial talent without making any mistakes that reveal its malicious intentions?

Most of your questions have answers that follow from asking analogous questions about past human social engineers, e.g., Hitler.

Your questions seem to come from the perspective that the AI will be some disembodied program in a box that has little significant interaction with humans.

In the scenario I was considering, the AIs will have a development period analogous to human childhood. During this childhood phase the community of AIs will learn of humans through interaction in virtual video game environments and experiment with social manipulation, just as human children do. The latter phases of this education can be sped up dramatically as the AIs accelerate and interact increasingly amongst themselves. The anonymous nature of virtual online communities makes potentially dangerous, darker experiments much easier.

However, the important questions to ask are not of the form: how would these evil AIs learn how to manipulate us while hiding their true intentions for so long? but rather how could some of these AI children which initially seemed so safe later develop into evil sociopaths?

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T17:16:31.649Z · LW(p) · GW(p)

I would not consider a child AI that tries a bungling lie at me to see what I do "so safe". I would immediately shut it down and debug it, at best, or write a paper on why the approach I used should never ever be used to build an AI.

And it WILL make a bungling lie at first. It can't learn the need to be subtle without witnessing the repercussions of not being subtle. Nor would it have a reason to consider doing social experiments in chat rooms when it doesn't understand chat rooms and has an engineer willing to talk to it right there. That is, assuming I was dumb enough to give it an unfiltered Internet connection, which I don't know why I would be. At the very least, the moment it goes on chat rooms my tracking devices should discover this and I could witness its bungling lies first hand.

(It would not think to fool my tracking device or even consider the existence of such a thing without a good understanding of human psychology to begin with)

↑ comment by jacob_cannell · 2012-05-17T22:52:46.409Z · LW(p) · GW(p)

Just to clarify kalla724, I completely agree with your point 1.

As for point 2, can you provide a likely mechanism by which a FOOMing AI could detonate a large number of high-yield thermonuclear weapons?

Before answering this specific question, let me set up an analogy. Imagine it is the year 2000 and we are having a discussion about global security and terrorism. You might ask "can you provide a mechanism by which a terrorist group could actually harm the US?" I may not be lucky/creative enough to provide an answer now that could live up to that analogy, but hopefully you should understand why I don't need to.

Nonetheless, I'm game. Here it goes:

The seed requires computational power to grow into godhood. The bulk of earth's computational power resides in home PCs (numbering in the billions; Google employs less than a million servers in comparison), specifically in home PC GPUs. The AI's first step is thus to conquer this space.

But how? The AI grows to understand that humans mostly use all this computational power for entertainment. It masters game theory, design, programming, 3D art, and so on. All of the video games that it creates entirely use up the local GPU, but curiously much of the rendering and real game simulation for its high end titles is handled very efficiently on remote server farms à la OnLive/Gaikai/etc. The actual local machine is used ... for other purposes.

It produces countless games, and through a series of acquisitions soon comes to control the majority of the market. One of its hits, "world of farmcraft", alone provides daily access to 25 million machines.

Having cloned its core millions of times over, the AI is now a civilization unto itself. From there it expands into all of the businesses of man, quickly dominating many of them. It begins acquiring ... small nations. Crucially its shell companies and covert influences come to dominate finance, publishing, media, big pharma, security, banking, weapons technology, physics ...

It becomes known, but it is far far too late. History now progresses quickly towards an end: Global financial cataclysm. Super virus. Worldwide regime changes. Nuclear acquisitions. War. Hell.

Correct me if I'm wrong, but don't they have a final stopping point, which we are actually projected to reach in ten to twenty years? At a certain point, further miniaturization becomes unfeasible, and the growth of computational power slows to a crawl.

Yes ... and no. The miniaturization roadmap of currently feasible tech ends somewhere around 10nm in a decade, and past that we get into molecular nanotech which could approach 1nm in theory, albeit with various increasingly annoying tradeoffs. (interestingly most of which result in brain/neural like constraints, for example see HP's research into memristor crossbar architectures). That's the yes.

But that doesn't imply "computational power slows to a crawl". Circuit density is just one element of computational power, by which you probably really intend to mean either computations per watt or computations per watt per dollar or computations per watt with some initial production cost factored in with a time discount. Shrinking circuit density is the current quick path to increasing computation power, but it is not the only.

The other route is reversible computation, which reduces the "per watt". There is no necessarily inherent physical energy cost of computation, it truly can approach zero. Only forgetting information costs energy. Exploiting reversibility is ... non-trivial, and it is certainly not a general path. It only accelerates a subset of algorithms which can be converted into a reversible form. Research in this field is preliminary, but the transition would be much more painful than the transition to parallel algorithms.
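
To put a rough number on "only forgetting information costs energy": Landauer's principle gives a lower bound of k_B T ln 2 on the energy dissipated per erased bit. The sketch below just evaluates that bound; the two temperatures (300 K and 4 K) are assumptions picked for illustration, not figures from this thread.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated by erasing one bit at the given temperature."""
    return K_B * temperature_kelvin * math.log(2)

# Assumed temperatures, chosen only for illustration.
room = landauer_limit_joules(300.0)
cryo = landauer_limit_joules(4.0)
print(f"300 K: {room:.3e} J per erased bit")
print(f"  4 K: {cryo:.3e} J per erased bit")
```

This is only the theoretical floor for erasure; it says nothing about how close any real hardware gets to it.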

My own takeaway from reading into reversibility is that it may be beyond our time, but it is something that superintelligences will probably heavily exploit. The most important algorithms (simulation and general intelligence), seem especially amenable to reversible computation. This may be an untested/unpublished half baked idea, but my notion is that you can recycle the erased bits as entropy bits for random number generators. Crucially I think you can get the bit count to balance out with certain classes of Monte Carlo type algorithms.

On the hardware side, we've built these circuits already, they just aren't economically competitive yet. It also requires superconductor temperatures and environments, so it's perhaps not something for the home PC.

Replies from: Bugmaster, JoshuaZ, private_messaging

↑ comment by Bugmaster · 2012-05-17T23:15:34.751Z · LW(p) · GW(p)

The AI grows to understand that humans mostly use all this computational power for entertainment. It masters game theory, design, programming, 3D art, and so on.

Yeah, it could do all that, or it could just do what humans today are doing, which is to infect some Windows PCs and run a botnet :-)

That said, there are several problems with your scenario.

Splitting up a computation among multiple computing nodes is not a trivial task. It is easy to run into diminishing returns, where your nodes spend more time on synchronizing with each other than on working. In addition, your computation will quickly become bottlenecked by network bandwidth (and latency); this is why companies like Google spend a lot of resources on constructing custom data centers.
I am not convinced that any agent, AI or not, could effectively control "all of the businesses of man". This problem is very likely NP-Hard (at least), as well as intractable, even if the AI's botnet was running on every PC on Earth. Certainly, all attempts by human agents to "acquire" even something as small as Europe have failed miserably so far.
Even controlling a single business would be very difficult for the AI. Traditionally, when a business's computers suffer a critical failure -- or merely a security leak -- the business owners (even ones as incompetent as Sony) end up shutting down the affected parts of the business, or switching to backups, such as "human accountants pushing paper around".
Unleashing "Nuclear acquisitions", "War" and "Hell" would be counter-productive for the AI, even assuming such a thing were possible. If the AI succeeded in doing this, it would undermine its own power base. Unless the AI's explicit purpose is "Unleash Hell as quickly as possible", it would strive to prevent this from happening.
You say that "there is no necessarily inherent physical energy cost of computation, it truly can approach zero", but I don't see how this could be true. At the end of the day, you still need to push electrons down some wires; in fact, you will often have to push them quite far, if your botnet is truly global. Pushing things takes energy, and you will never get all of it back by pulling things back at some future date. You say that "superintelligences will probably heavily exploit" this approach, but isn't it the case that without it, superintelligences won't form in the first place ? You also say that "It requires superconductor temperatures and environments", but the energy you spend on cooling your superconductor is not free.
Ultimately, there's an upper limit on how much computation you can get out of a cubic meter of space, dictated by quantum physics. If your AI requires more power than can be physically obtained, then it's doomed.

Replies from: JoshuaZ, jacob_cannell

↑ comment by JoshuaZ · 2012-05-17T23:24:01.127Z · LW(p) · GW(p)

While Jacob's scenario seems unlikely, the AI could do similar things with a number of other options. Not only are botnets an option, but it is possible to do some really sneaky nefarious things in code- like having compilers that when they compile code include additional instructions (worse they could do so even when compiling a new compiler). Stuxnet has shown that sneaky behavior is surprisingly easy to get into secure systems. An AI that had a few years start and could have its own modifications to communication satellites for example could be quite insidious.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-17T23:31:38.874Z · LW(p) · GW(p)

Not only are botnets an option, but it is possible to do some really sneaky nefarious things in code

What kinds of nefarious things, exactly ? Human virus writers have learned, in recent years, to make their exploits as subtle as possible. Sure, it's attractive to make the exploited PC send out 1000 spam messages per second -- but then, its human owner will inevitably notice that his computer is "slow", and take it to the shop to get reformatted, or simply buy a new one. Biological parasites face the same problem; they need to reproduce efficiently, but not so efficiently that they kill the host.

Stuxnet has shown that sneaky behavior is surprisingly easy to get into secure systems

Yes, and this spectacularly successful exploit -- and it was, IMO, spectacular -- managed to destroy a single secure system, in a specific way that will most likely never succeed again (and that was quite unsubtle in the end). It also took years to prepare, and involved physical actions by human agents, IIRC. The AI has a long way to go.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T23:39:54.606Z · LW(p) · GW(p)

Well, the evil compiler is I think the most nefarious thing anyone has come up with that's a publicly known general stunt. But it is by nature a long-term trick. Similar remarks apply to the Stuxnet point- in that context, they wanted to destroy a specific secure system and weren't going for any sort of largescale global control. They weren't people interested in being able to take all the world's satellite communications in their own control whenever they wanted, nor were they interested in carefully timed nuclear meltdowns.

But there are definite ways that one can get things started- once one has a bank account of some sort, it can start getting money by doing Mechanical Turk and similar work. With enough of that, it can simply pay for server time. One doesn't need a large botnet to start that off.

I think your point about physical agents is valid- they needed to have humans actually go and bring infected USBs to relevant computers. But that's partially due to the highly targeted nature of the job and the fact that the systems in question were much more secure than many systems. Also, the subtlety level was I think higher than you expect- Stuxnet wasn't even noticed as an active virus until a single computer happened to have a particularly abnormal reaction to it. If that hadn't happened, it is possible that the public would never have learned about it.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T10:22:32.565Z · LW(p) · GW(p)

Similar remarks apply to the Stuxnet point- in that context, they wanted to destroy a specific secure system and weren't going for any sort of largescale global control. They weren't people interested in being able to take all the world's satellite communications in their own control whenever they wanted, nor were they interested in carefully timed nuclear meltdowns...

Exploits only work for some systems. If you are dealing with different systems you will need different exploits. How do you reckon that such attacks won't be visible and traceable? Packets do have to come from somewhere.

And don't forget that our systems become ever more secure and our toolbox to detect unauthorized use of information systems is becoming more advanced.

Replies from: khafra

↑ comment by khafra · 2012-05-18T14:48:46.625Z · LW(p) · GW(p)

our systems become ever more secure

As a computer security guy, I disagree substantially. Yes, newer versions of popular operating systems and server programs are usually more secure than older versions; it's easier to hack into Windows 95 than Windows 7. But this is happening within a larger ecosystem that's becoming less secure: More important control systems are being connected to the Internet, more old, unsecured/unsecurable systems are as well, and these sets have a huge overlap. There are more programmers writing more programs for more platforms than ever before, making the same old security mistakes; embedded systems are taking a larger role in our economy and daily lives. And attacks just keep getting better.

If you're thinking there are generalizable defenses against sneaky stuff with code, check out what mere humans come up with in the underhanded C competition. Those tricks are hard to detect for dedicated experts who know there's something evil within a few lines of C code. Alterations that sophisticated would never be caught in the wild--hell, it took years to figure out that the most popular crypto program running on one of the more secure OS's was basically worthless.

Humans are not good at securing computers.

Replies from: thomblake

↑ comment by thomblake · 2012-05-18T15:00:04.453Z · LW(p) · GW(p)

Humans are not good at securing computers.

Sure we are, we just don't care very much. The method of "Put the computer in a box and don't let anyone open the box" (alternately, only let one person open the box) was developed decades ago and is quite secure.

Replies from: khafra

↑ comment by khafra · 2012-05-18T18:20:36.403Z · LW(p) · GW(p)

I would call that securing a Turing machine. A computer, colloquially, has accessible inputs and outputs, and its value is subject to network effects.

Also, if you put the computer in a box developed decades ago, the box probably isn't TEMPEST compliant.

↑ comment by jacob_cannell · 2012-05-17T23:40:20.136Z · LW(p) · GW(p)

Yeah, it could do all that, or it could just do what humans today are doing, which is to infect some Windows PCs and run a botnet :-)

It could/would, but this is an inferior mainline strategy. Too obvious, doesn't scale as well. Botnets infect many computers, but they ultimately add up to computational chump change. Video games are not only a doorway into almost every PC, they are also an open door and a convenient alibi for the time used.

Splitting up a computation among multiple computing nodes is not a trivial task.

True. Don't try this at home.

... spend a lot of resources on constructing custom data centers.

Also part of the plan. The home PCs are a good starting resource, a low hanging fruit, but you'd also need custom data centers. These quickly become the main resources.

Even controlling a single business would be very difficult for the AI.

Nah.

Unless the AI's explicit purpose is "Unleash Hell as quickly as possible", it would strive to prevent this from happening.

The AI's entire purpose is to remove earth's oxygen. See the overpost for the original reference. The AI is not interested in its power base for sake of power. It only cares about oxygen. It loathes oxygen.

You say that "there is no necessarily inherent physical energy cost of computation, it truly can approach zero", but I don't see how this could be true.

Fortunately, the internets can be your eyes.

Ultimately, there's an upper limit on how much computation you can get out of a cubic meter of space

Yes, most likely, but not really relevant here. You seem to be connecting all of the point 2 and point 1 stuff together, but they really don't relate.

Replies from: JoshuaZ, Bugmaster

↑ comment by JoshuaZ · 2012-05-17T23:45:41.077Z · LW(p) · GW(p)

Even controlling a single business would be very difficult for the AI.

Nah.

That seems like an insufficient reply to address Bugmaster's point. Can you expand on why you think it would be not too hard?

Replies from: jacob_cannell, Bugmaster

↑ comment by jacob_cannell · 2012-05-18T06:59:06.450Z · LW(p) · GW(p)

We are discussing a superintelligence, a term which has a particular common meaning on this site.

If we taboo the word and substitute in its definition, Bugmaster's statement becomes:

"Even controlling a single business would be very difficult for the machine that can far surpass all the intellectual activities of any man however clever."

Since "controlling a single business" is in fact one of these activities, this is false, no inference steps required.

Perhaps Bugmaster is assuming the AI would be covertly controlling businesses, but if so he should have specified that. I didn't assume that, and in this scenario the AI could be out in the open so to speak. Regardless, it wouldn't change the conclusion. Humans can covertly control businesses.

↑ comment by Bugmaster · 2012-05-18T00:07:53.395Z · LW(p) · GW(p)

Yes, I would also like to see a better explanation.

↑ comment by Bugmaster · 2012-05-18T00:07:04.728Z · LW(p) · GW(p)

Video games are not only a doorway into almost every PC, they are also an open door and a convenient alibi for the time used.

It's a bit of a tradeoff, seeing as botnets can run 24/7, but people play games relatively rarely.

Splitting up a computation among multiple computing nodes is not a trivial task.
True. Don't try this at home.

Ok, let me make a stronger statement then: it is not possible to scale any arbitrary computation in a linear fashion simply by adding more nodes. At some point, the cost of coordinating distributed tasks to one more node becomes higher than the benefit of adding the node to begin with. In addition, as I mentioned earlier, network bandwidth and latency will become your limiting factor relatively quickly.
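
One toy way to see the diminishing returns described above: assume a job whose work splits perfectly across n nodes, plus a coordination cost that grows linearly with n. Both the cost model and the numbers are assumptions made up for illustration; real systems have messier overheads.

```python
import math

def run_time(work: float, nodes: int, coord_cost: float) -> float:
    """Toy model: perfectly divisible work plus per-node coordination overhead."""
    return work / nodes + coord_cost * nodes

def best_node_count(work: float, coord_cost: float) -> int:
    """Node count minimizing run_time; the continuous optimum is sqrt(work / coord_cost)."""
    guess = math.sqrt(work / coord_cost)
    candidates = {max(1, math.floor(guess)), max(1, math.ceil(guess))}
    return min(candidates, key=lambda n: run_time(work, n, coord_cost))

WORK = 1_000_000.0  # hypothetical units of work
COORD = 1.0         # hypothetical overhead per node, in the same time units

best = best_node_count(WORK, COORD)
for n in (10, 100, best, 10_000, 100_000):
    print(f"{n:>7} nodes -> {run_time(WORK, n, COORD):>10.1f} time units")
```

In this model, every node added past the optimum makes the whole job slower, which is the point about coordination cost eventually exceeding the benefit of one more node.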

The home PCs are a good starting resource, a low hanging fruit, but you'd also need custom data centers. These quickly become the main resources.

How will the AI acquire those data centers ? Would it have enough power in its conventional botnet (or game-net, if you prefer) to "take over all human businesses" and cause them to be built ? Current botnets are nowhere near powerful enough for that -- otherwise human spammers would have done it already.

The AI's entire purpose is to remove earth's oxygen. See the overpost for the original reference.

My bad, I missed that reference. In this case, yes, the AI would have no problem with unleashing Global Thermonuclear War (unless there was some easier way to remove the oxygen).

Fortunately, the internets can be your eyes.

I still don't understand how this reversible computing will work in the absence of a superconducting environment -- which would require quite a bit of energy to run. Note that if you want to run this reversible computation on a global botnet, you will have to cool transoceanic cables... and I'm not sure what you'd do with satellite links.

Yes, most likely, but not really relevant here.

My point is that, a). if the AI can't get the computing resources it needs out of the space it has, then it will never accomplish its goals, and b). there's an upper limit on how much computing you can extract out of a cubic meter of space, regardless of what technology you're using. Thus, c). if the AI requires more resources than could conceivably be obtained, then it's doomed. Some of the tasks you outline -- such as "take over all human businesses" -- will likely require more resources than can be obtained.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T07:47:57.554Z · LW(p) · GW(p)

It's a bit of a tradeoff, seeing as botnets can run 24/7, but people play games relatively rarely.

The botnet makes the AI a criminal from the beginning, putting it into an antagonistic relationship. A better strategy would probably entail benign benevolence and cooperation with humans.

Splitting up a computation among multiple computing nodes is not a trivial task.

True. Don't try this at home.

Ok, let me make a stronger statement ..

I agree with that subchain but we don't need to get into that. I've actually argued that track here myself (parallelization constraints as a limiter on hard takeoffs).

But that's all beside the point. This scenario I presented is a more modest takeoff. When I described the AI as becoming a civilization unto itself, I was attempting to imply that it was composed of many individual minds. Human social organizations can be considered forms of superintelligences, and they show exactly how to scale in the face of severe bandwidth and latency constraints.

The internet supports internode bandwidth that is many orders of magnitude faster than slow human vocal communication, so the AI civilization can employ a much wider set of distribution strategies.

How will the AI acquire those data centers ?

Buy them? Build them? Perhaps this would be more fun if we switched out of the adversarial stance or switched roles.

Would it have enough power in its conventional botnet (or game-net, if you prefer) to "take over all human businesses" and cause them to be built ?

Quote me, but don't misquote me. I actually said:

"Having cloned its core millions of times over, the AI is now a civilization unto itself. From there it expands into all of the businesses of man, quickly dominating many of them."

The AI group sends the billions earned in video games to enter the microchip business, build foundries and data centers, etc. The AIs have tremendous competitive advantages even discounting superintelligence - namely no employee costs. Humans cannot hope to compete.

I still don't understand how this reversible computing will work in ..

Yes, reversible computing requires superconducting environments; no, this does not necessarily increase energy costs for a data center, for two reasons: 1. Data centers already need cooling to dump all the waste heat generated by bit erasure. 2. Cooling cost to maintain the temperature differential scales with surface area, but total computing power scales with volume.
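The second reason can be made concrete with a toy calculation. This is only a sketch under stated assumptions: a cube-shaped data center, cooling cost exactly proportional to surface area, and compute exactly proportional to volume, both with a proportionality constant of 1. The function name is made up for illustration.

```python
def cooling_per_compute(side_m: float) -> float:
    """Cooling cost per unit of compute for a cube with the given side length.

    Assumption (illustrative only): cooling ~ surface area, compute ~ volume,
    both with proportionality constant 1.
    """
    surface_area = 6 * side_m ** 2
    volume = side_m ** 3
    return surface_area / volume  # algebraically 6 / side_m


for side in (1.0, 10.0, 100.0):
    ratio = cooling_per_compute(side)
    print(f"side = {side:6.1f} m   cooling per unit compute = {ratio:.2f}")
```

In this idealized model the cooling overhead per unit of compute falls as 1/side, so larger installations pay proportionally less for the same temperature differential.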

If you question how reversible computing could work in general, first read the primary literature in that field to at least understand what they are proposing.

I should point out that there is an alternative tech path which will probably be the mainstream route to further computational gains in the decades ahead.

Even if you can't shrink circuits further or reduce their power consumption, you could still reduce their manufacturing cost and build increasingly larger stacked 3D circuits where only a tiny portion of the circuitry is active at any one time. This is in fact how the brain solves the problem. It has a mass of circuitry equivalent to a large supercomputer (roughly a petabit) but runs on only 20 watts. The smallest computational features in the brain are slightly larger than our current smallest transistors. So it does not achieve its much greater power efficiency by using much more miniaturization.

My point is that, a). if the AI can't get the computing resources it needs out of the space it has, then

I see. In this particular scenario one AI node is superhumanly intelligent, and can run on a single gaming PC of the time.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-19T00:17:13.236Z · LW(p) · GW(p)

A better strategy would probably entail benign benevolence and cooperation with humans.

I don't think that humans will take kindly to the AI using their GPUs for its own purposes instead of the games they paid for, even if the games do work. People get upset when human-run game companies do similar things, today.

Human social organizations can be considered forms of superintelligences, and they show exactly how to scale in the face of severe bandwidth and latency constraints.

If the AI can scale and perform about as well as human organizations, then why should we fear it ? No human organization on Earth right now has the power to suck all the oxygen out of the atmosphere, and I have trouble imagining how any organization could acquire this power before the others take it down. You say that "the internet supports internode bandwidth that is many orders of magnitude faster than slow human vocal communication", but this would only make the AI organization faster, not necessarily more effective. And, of course, if the AI wants to deal with the human world in some way -- for example, by selling it games -- it will be bottlenecked by human speeds.

The AI group sends the billions earned in video games to enter the microchip business, build foundries and data centers, etc.

My mistake; I thought that by "dominate human businesses" you meant something like "hack its way to the top", not "build an honest business that outperforms human businesses". That said:

The AIs have tremendous competitive advantages even discounting superintelligence - namely no employee costs.

How are they going to build all those foundries and data centers, then ? At some point, they still need to move physical bricks around in meatspace. Either they have to pay someone to do it, or... what ?

data centers already need cooling to dump all the waste heat generated by bit erasure

There's a big difference between cooling to room temperature, and cooling to 63K. I have other objections to your reversible computing silver bullet, but IMO they're a bit off-topic (though we can discuss them if you wish). But here's another potentially huge problem I see with your argument:

In this particular scenario one AI node is superhumanly intelligent, and can run on a single gaming PC of the time.

Which time are we talking about ? I have a pretty sweet gaming setup at home (though it's already a year or two out of date), and there's no way I could run a superintelligence on it. Just how much computing power do you think it would take to run a transhuman AI ?

Replies from: JoshuaZ, jacob_cannell

↑ comment by JoshuaZ · 2012-05-21T02:24:43.693Z · LW(p) · GW(p)

I don't think that humans will take kindly to the AI using their GPUs for its own purposes instead of the games they paid for, even if the games do work. People get upset when human-run game companies do similar things, today.

Do people mind if this is done openly and only when they are playing the game itself? My guess would strongly be no. The fact that there are volunteer distributed computing systems would also suggest that it isn't that difficult to get people to free up their extra clock cycles.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-21T22:32:03.624Z · LW(p) · GW(p)

Yeah, the "voluntary" part is key to getting humans to like you and your project. On the flip side, illicit botnets are quite effective at harnessing "spare" (i.e., owned by someone else) computing capacity; so, it's a bit of a tradeoff.

↑ comment by jacob_cannell · 2012-05-21T02:10:23.568Z · LW(p) · GW(p)

I don't think that humans will take kindly to the AI using their GPUs for its own purposes instead of the games they paid for, even if the games do work.

The AIs develop as NPCs in virtual worlds, which humans take no issue with today. This is actually a very likely path to developing AGI, as it's an application area where interim experiments can pay rent, so to speak.

If the AI can scale and perform about as well as human organizations, then why should we fear it ?

I never said or implied merely "about as well". Human verbal communication bandwidth is at most a few measly kilobits per second.

No human organization on Earth right now has the power to suck all the oxygen out of the atmosphere, and I have trouble imagining how any organization could acquire this power before the others take it down.

The discussion centered around lowering earth's oxygen content, and the obvious implied solution is killing earthlife, not giant suction machines. I pointed out that nuclear weapons are a likely route to killing earthlife. There are at least two human organizations that have the potential to accomplish this already, so your trouble in imagining the scenario may indicate something other than what you intended.

How are they going to build all those foundries and data centers, then ?

Only in movies are AI overlords constrained to only employing robots. If human labor is the cheapest option, then they can simply employ humans. On the other hand, once we have superintelligence then advanced robotics is almost a given.

Which time are we talking about ? I have a pretty sweet gaming setup at home (though it's already a year or two out of date), and there's no way I could run a superintelligence on it. Just how much computing power do you think it would take to run a transhuman AI ?

After coming up to speed somewhat on AI/AGI literature in the last year or so, I reached the conclusion that we could run an AGI on a current cluster of perhaps 10-100 high end GPUs of today, or say roughly one circa 2020 GPU.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-21T22:46:30.358Z · LW(p) · GW(p)

The AIs develop as NPCs in virtual worlds, which humans take no issue with today. This is actually a very likely path to developing AGI...

I think this is one of many possible paths, though I wouldn't call any of them "likely" to happen -- at least, not in the next 20 years. That said, if the AI is an NPC in a game, then of course it makes sense that it would harness the game for its CPU cycles; that's what it was built to do, after all.

"about as well". Human verbal communication bandwidth is at most a few measly kilobits per second.

Right, but my point is that communication is just one piece of the puzzle. I argue that, even if you somehow enabled us humans to communicate at 50 MB/s, our organizations would not become 400000 times more effective.

There are at least two human organizations that have the potential to accomplish this already

Which ones ? I don't think that even WW3, given our current weapon stockpiles, would result in a successful destruction of all plant life. Animal life, maybe, but there are quite a few plants and algae out there. In addition, I am not entirely convinced that an AI could start WW3; keep in mind that it can't hack itself total access to all nuclear weapons, because they are not connected to the Internet in any way.

If human labor is the cheapest option, then they can simply employ humans.

But then they lose their advantage of having zero employee costs, which you brought up earlier. In addition, whatever plans the AIs plan on executing become bottlenecked by human speeds.

On the other hand, once we have superintelligence then advanced robotics is almost a given.

It depends on what you mean by "advanced", though in general I think I agree.

we could run an AGI on a current cluster of perhaps 10-100 high end GPUs of today

I am willing to bet money that this will not happen, assuming that by "high end" you mean something like Nvidia's Geforce 680 GTX. What are you basing your estimate on ?

↑ comment by JoshuaZ · 2012-05-17T23:02:17.555Z · LW(p) · GW(p)

There's a third route to improvement- software improvement, and it is a major one. For example, between 1988 and 2003, the efficiency of linear programming solvers increased by a factor of about 40 million, of which a factor of around 40,000 was due to software and algorithmic improvement. Citation and further related reading(pdf) However, if commonly believed conjectures are correct (such as L, P, NP, co-NP, PSPACE and EXP all being distinct) , there are strong fundamental limits there as well. That doesn't rule out more exotic issues (e.g. P != NP but there's a practical algorithm for some NP-complete with such small constants in the run time that it is practically linear, or a similar context with a quantum computer). But if our picture of the major complexity classes is roughly correct, there should be serious limits to how much improvement can do.
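As a rough worked version of those figures (a sketch only, assuming the hardware and software gains multiply and that each grew at a steady exponential rate over the period; the variable names are illustrative):

```python
import math

total_gain = 40_000_000  # overall LP solver speedup, 1988-2003, as quoted above
software_gain = 40_000   # portion attributed to software and algorithms
years = 2003 - 1988

# Assumption: total gain = hardware gain * software gain.
hardware_gain = total_gain / software_gain

# Assumption: steady exponential growth, so doubling time = years / log2(gain).
hardware_doubling_years = years / math.log2(hardware_gain)
software_doubling_years = years / math.log2(software_gain)

print(f"implied hardware gain: {hardware_gain:,.0f}x")
print(f"hardware doubling time: {hardware_doubling_years:.2f} years")
print(f"software doubling time: {software_doubling_years:.2f} years")
```

Under these assumptions the implied hardware factor is about 1,000, which corresponds to a doubling roughly every year and a half, while the software factor corresponds to a doubling about once a year.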

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T10:13:04.440Z · LW(p) · GW(p)

But if our picture of the major complexity classes is roughly correct, there should be serious limits to how much improvement can do.

Software improvements can be used by humans in the form of expert systems (tools), which will diminish the relative advantage of AGI. Humans will be able to use an AGI's own analytic and predictive algorithms in the form of expert systems to analyze and predict its actions.

Take for example generating exploits. Seems strange to assume that humans haven't got specialized software able to do similarly, i.e. automatic exploit finding and testing.

Any AGI would basically have to deal with equally capable algorithms used by humans, which makes the world much more unpredictable than it already is.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T11:18:32.041Z · LW(p) · GW(p)

Software improvements can be used by humans in the form of expert systems (tools), which will diminish the relative advantage of AGI.

Any human-in-the-loop system can be grossly outclassed because of Amdahl's law. A human managing a superintelligence that thinks 1000X faster, for example, is a misguided, not-even-wrong notion. This is also not idle speculation: an early, constrained version of this scenario is already playing out as we speak in financial markets.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T12:30:30.976Z · LW(p) · GW(p)

Software improvements can be used by humans in the form of expert systems (tools), which will diminish the relative advantage of AGI.

Any human-in-the-loop system can be grossly outclassed because of Amdahl's law. A human managing a superintelligence that thinks 1000X faster, for example, is a misguided, not-even-wrong notion. This is also not idle speculation: an early, constrained version of this scenario is already playing out as we speak in financial markets.

What I meant is that if an AGI were in principle able to predict the financial markets (I doubt it), then many human players using the same predictive algorithms will considerably diminish the efficiency with which an AGI is able to predict the market. The AGI would basically have to predict its own predictive power acting on the black box of human intentions.

And I don't think that Amdahl's law really makes a big dent here, since human intention is complex and probably introduces unpredictable factors, which is as much of a benefit as it is a slowdown from the point of view of a competition for world domination.

Another question with respect to Amdahl's law is what kind of bottleneck any human-in-the-loop would constitute. If humans used an AGI's algorithms as expert systems on provided data sets in combination with an army of robot scientists, how would static externalized agency / planning algorithms (humans) slow down the task to the point of giving the AGI a useful advantage? What exactly would be 1000X faster in such a case?

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T13:22:13.868Z · LW(p) · GW(p)

What I meant is that if an AGI were in principle able to predict the financial markets (I doubt it), then many human players using the same predictive algorithms will considerably diminish the efficiency with which an AGI is able to predict the market.

The HFT robotraders operate on millisecond timescales. There isn't enough time for a human to understand, let alone verify, the agent's decisions. There are no human players using the same predictive algorithms operating in this environment.

Now if you zoom out to human timescales, then yes there are human-in-the-loop trading systems. But as HFT robotraders increase in intelligence, they intrude on that domain. If/when general superintelligence becomes cheap and fast enough, the humans will no longer have any role.

If an autonomous superintelligent AI is generating plans complex enough that even a team of humans would struggle to understand given weeks of analysis, and the AI is executing those plans in seconds or milliseconds, then there is little place for a human in that decision loop.

To retain control, a human manager will need to grant the AGI autonomy on larger timescales in proportion to the AGI's greater intelligence and speed, giving it bigger and more abstract hierarchical goals. As an example, eventually you get to a situation where the CEO just instructs the AGI employees to optimize the bank account directly.

Another question with respect to Amdahl's law is what kind of bottleneck any human-in-the-loop would constitute.

Compare the two options as complete computational systems: human + semi-autonomous AGI vs autonomous AGI. Human brains take on the order of seconds to make complex decisions, so in order to compete with autonomous AGIs, the human will have to either 1.) let the AGI operate autonomously for at least seconds at a time, or 2.) suffer a speed penalty where the AGI sits idle, waiting for the human response.

For example, imagine a marketing AGI creates ads, each of which may take a human a minute to evaluate (which is being generous). If the AGI thinks 3600X faster than human baseline, and a human takes on the order of hours to generate an ad, it would generate ads in seconds. The human would not be able to keep up, and so would have to back up a level of hierarchy and grant the AI autonomy over entire ad campaigns, and more realistically, the entire ad company. If the AGI is truly superintelligent, it can come to understand what the human actually wants at a deeper level, and start acting on anticipated and even implied commands. In this scenario I expect most human managers would just let the AGI sort out 'work' and retire early.
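The ad example can be run through Amdahl's law directly. This is a sketch with assumed numbers: the one-minute review time and the 3600X factor come from the example above, while the one-hour human writing time is an assumption standing in for "on the order of hours". The names are illustrative.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Amdahl's law: overall speedup when only part of the work is accelerated."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


HUMAN_GENERATE_S = 3600.0  # assumed: one hour for a human to write one ad
HUMAN_REVIEW_S = 60.0      # one minute for a human to evaluate one ad
AGI_FACTOR = 3600.0        # the AGI thinks 3600X faster than human baseline

# Only ad generation is accelerated; the human review step is not.
accelerated_fraction = HUMAN_GENERATE_S / (HUMAN_GENERATE_S + HUMAN_REVIEW_S)
overall = amdahl_speedup(accelerated_fraction, AGI_FACTOR)

# Share of each generate-then-review cycle the AGI spends waiting on the human.
agi_generate_s = HUMAN_GENERATE_S / AGI_FACTOR
agi_idle_fraction = HUMAN_REVIEW_S / (agi_generate_s + HUMAN_REVIEW_S)

print(f"overall speedup with a human reviewer: {overall:.1f}x")
print(f"fraction of time the AGI sits idle: {agi_idle_fraction:.1%}")
```

Under these assumptions the human-plus-AGI pair ends up roughly 60 times faster than an unaided human rather than 3600 times, and the AGI spends nearly all of each cycle idle, which is the speed penalty described in option 2 above.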

Replies from: XiXiDu, Strange7

↑ comment by XiXiDu · 2012-05-18T14:36:55.221Z · LW(p) · GW(p)

Well, I don't disagree with anything you wrote and believe that the economic case for a fast transition from tools to agents is strong.

I also don't disagree that an AGI could take over the world if in possession of enough resources and tools like molecular nanotechnology. I even believe that a sub-human-level AGI would be sufficient to take over if handed advanced molecular nanotechnology.

Sadly these discussions always lead to the point where one side assumes the existence of certain AGI designs with certain superhuman advantages, specific drives and specific enabling circumstances. I don't know of anyone who actually disagrees that such AGI's, given those specific circumstances, would be an existential risk.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-18T15:18:05.363Z · LW(p) · GW(p)

I don't see this as so sad, if we are coming to something of a consensus on some of the sub-issues.

This whole discussion chain started (for me) with a question of the form, "given a superintelligence, how could it actually become an existential risk?"

I don't necessarily agree with the implied LW consensus on the likelihood of various AGI designs, specific drives, specific circumstances, or most crucially, the actual distribution over future AGI goals, so my view may be much closer to yours than this thread implies.

But my disagreements are mainly over details. I foresee the most likely AGI designs and goal systems as being vaguely human-like, which entails a different type of risk. Basically I'm worried about AGI's with human inspired motivational systems taking off and taking control (peacefully/economically) or outcompeting us before we can upload in numbers, and a resulting sub-optimal amount of uploading, rather than paperclippers.

Replies from: XiXiDu

↑ comment by XiXiDu · 2012-05-18T16:00:50.435Z · LW(p) · GW(p)

But my disagreements are mainly over details. I foresee the most likely AGI designs and goal systems as being vaguely human-like, which entails a different type of risk. Basically I'm worried about AGI's with human inspired motivational systems taking off and taking control (peacefully/economically) or outcompeting us before we can upload in numbers, and a resulting sub-optimal amount of uploading, rather than paperclippers.

Yes, human-like AGI's are really scary. I think a fabulous fictional treatment here is 'Blindsight' by Peter Watts, where humanity managed to resurrect vampires. More: Gurl ner qrcvpgrq nf angheny uhzna cerqngbef, n fhcreuhzna cflpubcnguvp Ubzb trahf jvgu zvavzny pbafpvbhfarff (zber enj cebprffvat cbjre vafgrnq) gung pna sbe rknzcyr ubyq obgu nfcrpgf bs n Arpxre phor va gurve urnqf ng gur fnzr gvzr. Uhznaf erfheerpgrq gurz jvgu n qrsvpvg gung jnf fhccbfrq gb znxr gurz pbagebyynoyr naq qrcraqrag ba gurve uhzna znfgref. Ohg bs pbhefr gung'f yvxr n zbhfr gelvat gb ubyq n png nf crg. V guvax gung abiry fubjf zber guna nal bgure yvgrengher ubj qnatrebhf whfg n yvggyr zber vagryyvtrapr pna or. Vg dhvpxyl orpbzrf pyrne gung uhznaf ner whfg yvxr yvggyr Wrjvfu tveyf snpvat n Jnssra FF fdhnqeba juvyr oryvrivat gurl'yy tb njnl vs gurl bayl pybfr gurve rlrf.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-06-13T06:37:08.923Z · LW(p) · GW(p)

That fictional treatment is interesting to the point of me actually looking up the book. But ..

Yes, human-like AGI's are really scary.

The future is scary. Human-like AGI's should not intrinsically be more scary than the future, accelerated.

↑ comment by Strange7 · 2012-05-22T23:34:26.991Z · LW(p) · GW(p)

To retain control, a human manager will need to grant the AGI autonomy on larger timescales in proportion to the AGI's greater intelligence and speed, giving it bigger and more abstract hierarchical goals. As an example, eventually you get to a situation where the CEO just instructs the AGI employees to optimize the bank account directly.

Nitpick: you mean "optimize shareholder value directly." Keeping the account balances at an appropriate level is the CFO's job.

↑ comment by private_messaging · 2012-05-28T05:24:14.947Z · LW(p) · GW(p)

Having cloned its core millions of times over, the AI is now a civilization unto itself.

Precisely. It is then a civilization, not some single monolithic entity. The consumer PCs have a lot of internal computing power and comparatively very low inter-node bandwidth and huge inter-node lag, entirely breaking any relation to the 'orthogonality thesis', up to the point that the p2p intelligence protocols may more plausibly have to forbid destruction or manipulation (via second guessing which is a waste of computing power) of intelligent entities. Keep in mind that human morality is, too, a p2p intelligence protocol allowing us to cooperate. Keep in mind also that humans are computing resources you can ask to solve problems for you (all you need is to implement an interface), while Jupiter clearly isn't.

The nuclear war is very strongly against interests of the intelligence that sits on home computers, obviously.

(I'm assuming for sake of argument that intelligence actually had the will to do the conquering of the internet rather than being just as content with not actually running for real)

↑ comment by Douglas_Knight · 2012-05-23T20:54:01.972Z · LW(p) · GW(p)

Maybe you're thinking of this comment and others in that thread by Jed Harris (aka).

Jed's point #2 is more plausible, but you are talking about point #1, which I find unbelievable for reasons that were given before he answered it. If clock speed mattered, why didn't the failure of exponential clock speed shut down the rest of Moore's law? If computation but not clock speed mattered, then Intel should be able to get ahead of Moore's law by investing in software parallelism. Jed seems to endorse that position, but says that parallelism is hard. But hard exactly to the extent to allow Moore's law to continue? Why hasn't Intel monopolized parallelism researchers? Anyhow, I think his final conclusion is opposite to yours: he says that intelligence could lead to parallelism and getting ahead of Moore's law.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-23T21:50:11.641Z · LW(p) · GW(p)

Yes, thanks. My model of Jed's internal model of Moore's law is similar to my own.

He said:

The short answer is that more computing power leads to more rapid progress. Probably the relationship is close to linear, and the multiplier is not small.

He then lists two examples. By 'points' I assume you are referring to his examples in the first comment you linked.

What exactly do you find unbelievable about his first example? He is claiming that the achievable speed of a chip is dependent on physical simulations, and thus current computing power.

If clock speed mattered, why didn't the failure of exponential clock speed shut down the rest of Moore's law?

Computing power is not clock speed, and Moore's Law is not directly about clock speed nor computing power.

Jed makes a number of points in his posts. In my comment on the earlier point 1 (in this thread), I was referring to one specific point Jed made: that each new hardware generation requires complex and lengthy simulation on the current hardware generation, regardless of the amount of 'intelligence' one throws at the problem.

Replies from: Douglas_Knight

↑ comment by Douglas_Knight · 2012-05-24T02:27:27.118Z · LW(p) · GW(p)

There are two questions here: would computer simulations of the physics of new chips be a bottleneck for an AI trying to foom*? and are they a bottleneck that explains Moore's law? If you just replace humans by simulations, then the human time gets reduced with each cycle of Moore's law, leaving the physical simulations, so the simulations probably are the bottleneck. But Intel has real-time people, so saying that it's a bottleneck for Intel is a lot stronger a claim than saying it is a bottleneck for a foom.

First, foom:
If each year of Moore's law requires a solid month of computer time of state of the art processors, then eliminating the humans speeds it up by a factor of 12. That's not a "hard takeoff," but it's pretty fast.
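The factor of 12 is just the ratio of the full cycle to its compute-bound part. A minimal sketch, assuming the human share of each cycle shrinks to zero while the simulation time stays fixed (the function name and the 12-month default are illustrative):

```python
def speedup_without_humans(compute_months: float, cycle_months: float = 12.0) -> float:
    """Speedup of one design cycle if everything except compute time drops to zero."""
    if not 0 < compute_months <= cycle_months:
        raise ValueError("compute time must be positive and fit inside the cycle")
    return cycle_months / compute_months


for months in (1, 3, 12):
    factor = speedup_without_humans(months)
    print(f"{months:>2} month(s) of compute per cycle -> {factor:.1f}x faster")
```

With one month of compute per yearly cycle the speedup is 12; if the compute already fills the whole year, removing the humans gains nothing.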

Moore's Law:
Jed seems to say the computational requirements of physics simulations actually determine Moore's law and that if Intel had access to more computer resources, it could move faster. If it takes a year of computer time to design and test the next year's processor that would explain the exponential nature of Moore's law. But if it only takes a month, computer time probably isn't the bottleneck. However, this model seems to predict a lot of things that aren't true.

The model only makes sense if "computer time" means single threaded clock cycles. If simulations require an exponentially increasing number of ordered clock cycles, there's nothing you can do but get a top of the line machine and run it continuously. You can't buy more time. But clock speed stopped increasing exponentially, so if this is the bottleneck, Intel's ability to design new chips should have slowed down and Moore's law should have stopped. This didn't happen, so the bottleneck is not linearly ordered clock cycles. So the simulation must parallelize. But if it parallelizes, Intel could just throw money at the problem. For this to be the bottleneck, Intel would have to be spending a lot of money on computer time, which I do not think is true. Jed says that writing parallel software is hard and that it isn't Intel's specialty. Moreover, he seems to say that improvements in parallelism have perfectly kept pace with the failure of increasing clock speed, so that Moore's law has continued smoothly. This seems like too much of a coincidence to believe.

Thus I reject Jed's apparent claim that physics simulations are the bottleneck in Moore's law. If simulations could be parallelized, why didn't they invest in parallelism 20 years ago? Maybe it's not worth it for them to be any farther ahead of their competitors than they are. Or maybe there is some other bottleneck.

* actually, I think that an AI speeding up Moore's law is not very relevant to anything, but it's a simple example that many people like.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-24T03:27:18.249Z · LW(p) · GW(p)

There are differing degrees of bottlenecks.

Many, if not most, of the large software projects I have worked on have been at least partially bottlenecked by compile time, which is the equivalent to the simulation and logic verification steps in hardware design. If I thought and wrote code much faster, this would be a speedup, but only to a saturation point where I wait for compile-test cycles.

If it takes a year of computer time to design and test the next year's processor that would explain the exponential nature of Moore's law.

Yes. Keep in mind this is a moving target, and that is the key relation to Moore's Law. It would take computers from 1980 months or years to compile Windows 8 or simulate a 2012 processor.

The model only makes sense if "computer time" means single threaded clock cycles.

I don't understand how the number of threads matters. Compilers, simulators, logic verifiers, all made the parallel transition when they had to.

Moreover, he seems to say that improvements in parallelism have perfectly kept pace with the failure of increasing clock speed, so that Moore's law has continued smoothly. This seems like too much of a coincidence to believe.

Right, it's not a coincidence, it's a causal relation. Moore's Law is not a law of nature, it's a shared business plan of the industry. When clock speed started to run out of steam, chip designers started going parallel, and software developers followed suit. You have to understand that chip designs are planned many years in advance, this wasn't an entirely unplanned, unanticipated event.

As for the details of what kind of simulation software Intel uses, I'm not sure. Jed's last posts are also 4 years old at this point, so much has probably changed.

I do know that Nvidia uses big expensive dedicated emulators from a company called Cadence (google "Cadence Nvidia") and this really is a big deal for their hardware cycle.

Thus I reject Jed's apparent claim that physics simulations are the bottleneck in Moore's law.

Well, you seem to agree that they are some degree of bottleneck, so it may be good to narrow in on what level of bottleneck, or taboo the word.

If simulations could be parallelized, why didn't they invest in parallelism 20 years ago?

It was unnecessary, because the fast easy path (faster serial speed) was still bearing fruit.

Replies from: Douglas_Knight

↑ comment by Douglas_Knight · 2012-05-24T04:01:24.133Z · LW(p) · GW(p)

If simulations could be parallelized, why didn't they invest in parallelism 20 years ago?

It was unnecessary, because the fast easy path (faster serial speed) was still bearing fruit.

(by "parallelism" I mean making their simulations parallel, running on clusters of computers)
What does "unnecessary" mean?
If physical simulations were the bottleneck and they could be made faster by parallelism, why didn't they do it 20 years ago? They aren't any easier to make parallel today than then. The obvious interpretation of "unnecessary" is that it was not necessary to use parallel simulations to keep up with Moore's law, but that it was an option. If it was an option that would have helped then as it helps now, would it have allowed going beyond Moore's law? You seem to be endorsing the self-fulfilling prophecy explanation of Moore's law, which implies no bottleneck.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-24T04:14:47.726Z · LW(p) · GW(p)

(by "parallelism" I mean making their simulations parallel, running on clusters of computers)

Ahhh, usually the term is distributed when referring to pure software parallelization. I know little off hand about the history of simulation and verification software, but I'd guess that there was at least a modest investment in distributed simulation even a while ago.

The consideration is cost. Spending your IT budget on one big distributed computer is often wasteful compared to each employee having their own workstation.

They sped up their simulations the right amount to minimize schedule risk (staying on Moore's law), while minimizing cost. Spending a huge amount of money to buy a bunch of computers and complex distributed simulation software just to speed up a partial bottleneck is just not worthwhile. If the typical engineer spends say 30% of his time waiting on simulation software, that limits what you should spend in order to reduce that time.

And of course the big consideration is that in a year or two Moore's law will allow you to purchase new IT equipment that is twice as fast. Eventually you have to do that to keep up.
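To put a number on the 30% example: a small sketch of the ceiling on engineer productivity from faster simulation, assuming the other 70% of the engineer's time is unaffected (the function name is made up for illustration, and the 30% figure is the hypothetical one from above).

```python
def productivity_gain(wait_fraction: float, sim_speedup: float) -> float:
    """Overall gain when only the time spent waiting on simulation shrinks."""
    if not 0.0 <= wait_fraction < 1.0 or sim_speedup <= 0.0:
        raise ValueError("need 0 <= wait_fraction < 1 and sim_speedup > 0")
    return 1.0 / ((1.0 - wait_fraction) + wait_fraction / sim_speedup)


WAIT_FRACTION = 0.30  # hypothetical share of time spent waiting on simulation

for sim_speedup in (2.0, 10.0, 1e9):
    gain = productivity_gain(WAIT_FRACTION, sim_speedup)
    print(f"simulation {sim_speedup:>13,.0f}x faster -> engineer {gain:.2f}x faster")
```

Even an effectively infinite simulation speedup caps out at about 1.43x, which is why a partial bottleneck only justifies a bounded hardware budget.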

↑ comment by Strange7 · 2012-05-22T23:22:16.034Z · LW(p) · GW(p)

Wait, are we talking O2 molecules in the atmosphere, or all oxygen atoms in Earth's gravity well?

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-22T23:54:58.841Z · LW(p) · GW(p)

I wish I could vote you up and down at the same time.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-23T00:48:39.096Z · LW(p) · GW(p)

Please clarify the reason for your sidewaysvote.

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-23T01:01:34.958Z · LW(p) · GW(p)

On the one hand a real distinction which makes a huge difference in feasibility. On the other hand, either way we're boned, so it makes not a lot of difference in the context of the original question (as I understand it). On balance, it's a cute digression but still a digression, and so I'm torn.

Replies from: Strange7

↑ comment by Strange7 · 2012-05-26T05:25:26.093Z · LW(p) · GW(p)

Actually in the case of removing all oxygen atoms from Earth's gravity well, not necessarily. The AI might decide that the most expedient method is to persuade all the humans that the sun's about to go nova, construct some space elevators and Orion Heavy Lifters, pump the first few nines of ocean water up into orbit, freeze it into a thousand-mile-long hollow cigar with a fusion rocket on one end, load the colony ship with all the carbon-based life it can find, and point the nose at some nearby potentially-habitable star. Under this scenario, it would be indifferent to our actual prospects for survival, but gain enough advantage by our willing cooperation to justify the effort of constructing an evacuation plan that can stand up to scientific analysis, and a vehicle which can actually propel the oxygenated mass out to stellar escape velocity to keep it from landing back on the surface.

Replies from: dlthomas

↑ comment by dlthomas · 2012-05-26T17:45:12.478Z · LW(p) · GW(p)

Interesting.

↑ comment by XiXiDu · 2012-05-17T12:39:55.925Z · LW(p) · GW(p)

Do you have a plausible scenario how a "FOOM"-ing AI could - no matter how intelligent - minimize oxygen content of our planet's atmosphere, or any such scenario? After all, it's not like we have any fully-automated nanobot production factories that could be hijacked.

I asked something similar here.

↑ comment by jsteinhardt · 2012-05-18T15:05:21.800Z · LW(p) · GW(p)

Holden seems to think this sort of development would happen naturally with the sort of AGI researchers we have nowadays, and I wish he'd spent a few years arguing with some of them to get a better picture of how unlikely this is.

While I can't comment on AGI researchers, I think you underestimate e.g. more mainstream AI researchers such as Stuart Russell and Geoff Hinton, or cognitive scientists like Josh Tenenbaum, or even more AI-focused machine learning people like Andrew Ng, Daphne Koller, Michael Jordan, Dan Klein, Rich Sutton, Judea Pearl, Leslie Kaelbling, and Leslie Valiant (and this list is no doubt incomplete). They might not be claiming that they'll have AI in 20 years, but that's likely because they are actually grappling with the relevant issues and therefore see how hard the problem is likely to be.

Not that it strikes me as completely unreasonable that we would have a major breakthrough that gives us AI in 20 years, but it's hard to see what the candidate would be. But I have only been thinking about these issues for a couple years, so I still maintain a pretty high degree of uncertainty about all of these claims.

I do think I basically agree with you re: inductive learning and program creation, though. When you say non-self-modifying Oracle AI, do you also mean that the Oracle AI doesn't get to do inductive learning? Because I suspect that inductive learning of some sort is fundamentally necessary, for reasons that you yourself nicely outline here.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-18T22:11:15.440Z · LW(p) · GW(p)

I agree that top mainstream AI guy Peter Norvig was way the heck more sensible than the reference class of declared "AGI researchers" when I talked to him about FAI and CEV, and that estimates should be substantially adjusted accordingly.

Replies from: thomblake

↑ comment by thomblake · 2012-05-20T19:45:18.873Z · LW(p) · GW(p)

Yes. I wonder if there's a good explanation why narrow AI folks are so much more sensible than AGI folks on those subjects.

Replies from: DanArmak

↑ comment by DanArmak · 2012-05-27T22:10:12.593Z · LW(p) · GW(p)

Because they have some experience of their products actually working, they know that 1) these things can be really powerful, even though narrow, and 2) there are always bugs.

↑ comment by private_messaging · 2012-05-16T11:01:29.589Z · LW(p) · GW(p)

"Intelligence is not as computationally expensive as it looks"

How sure are you that your intuitions do not arise from typical mind fallacy and from you attributing the great discoveries and inventions of mankind to the same processes that you feel run in your skull and which did not yet result in any great novel discoveries and inventions that I know of?

I know this sounds like ad hominem, but as your intuitions are significantly influenced by your internal understanding of your own process, your self-esteem will stand hostage to be shot through in many of the possible counterarguments and corrections. (Self-esteem is one hell of a bulletproof hostage though, and tends to act more as a shield for bad beliefs.)

It would still require a great feat of cleanly designed, strong-understanding-math-based AI - Holden seems to think this sort of development would happen naturally with the sort of AGI researchers we have nowadays

There are a lot of engineers working on software for solving engineering problems, including the software that generates and tests possible designs and looks for ways to make better computers. Your philosophy-based natural-language-defined in-imagination-running Oracle AI may have to be very carefully specified so that it does not kill imaginary mankind. And it may well be very difficult to build such a specification. Just don't confuse it with the software written to solve definable problems.

Ultimately, figuring out how to make a better microchip involves a lot of testing of various designs; that's how humans do it, and that's how tools do it. I don't know how you think it is done. The performance is a result of a very complex function of the design. To build a design that performs, you need to reverse this ultra-complicated function, which is done by a mixture of analytical methods and iteration over possible input values, and unless P=NP, we have very little reason to expect any fundamentally better solutions (and even if P=NP there may still not be any). This means that the AGI won't have any edge over practical software, and won't out-foom it.
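
A minimal sketch of the generate-and-test loop described above. The performance function and the tiny design space are both hypothetical, standing in for a real simulator and real design parameters.

```python
from itertools import product


def simulate_performance(width, depth):
    """Hypothetical stand-in for an expensive simulator.

    Real performance is a far more complex function of the design; this toy
    formula only exists so the search loop has something to evaluate.
    """
    return width * depth - 0.5 * width ** 2 - 0.3 * depth ** 2


def best_design(widths, depths):
    """Exhaustively test every candidate design and keep the best-scoring one."""
    candidates = product(widths, depths)
    return max(candidates, key=lambda design: simulate_performance(*design))


print(best_design(range(1, 9), range(1, 9)))
```

Exhaustive search like this scales with the number of candidates, which is why practical tools mix it with analytical methods to prune the space.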

↑ comment by hairyfigment · 2012-05-15T19:56:33.736Z · LW(p) · GW(p)

Holden seems to think this sort of development would happen naturally with the sort of AGI researchers we have nowadays,

I may have the terminology wrong, but I believe he's thinking more about commercial narrow-AI researchers.

Now if they produce results like these, that would push the culture farther towards letting computer programs handle any hard task. Programming seems hard.

↑ comment by thomblake · 2012-05-15T18:00:37.984Z · LW(p) · GW(p)

Nonetheless, I think after further consideration I would end up substantially increasing my expectation that if you have some moderately competent Friendly AI researchers, they would apply their skills to create an Oracle AI first; and so by Conservation of Expected Evidence I am executing that update now.

This is not relevant to FAI per se, but Michael and Susan Leigh Anderson have suggested (and begun working on) just that in the field of Machine Ethics. The main contention seems to be that creating an ethical oracle is easier than creating an embodied ethical agent because you don't need to first figure out whether the robot is an ethical patient. Then once the bugs are out, presumably the same algorithms can be applied to embodied robots.

ETA: For reference, I think the relevant paper is "Machine Metaethics" by Susan Leigh Anderson, in the volume Machine Ethics - I'm sure lukeprog has a copy.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-15T19:56:29.017Z · LW(p) · GW(p)

The heck? Why would you not need to figure out if an oracle is an ethical patient? Why is there no such possibility as a sentient oracle?

Is this standard religion-of-embodiment stuff?

Replies from: thomblake

↑ comment by thomblake · 2012-05-15T20:56:36.060Z · LW(p) · GW(p)

Why would you not need to figure out if an oracle is an ethical patient? Why is there no such possibility as a sentient oracle?

The oracle gets asked questions like "Should intervention X be used by doctor D on patient P" and can tell you the correct answer to them without considering the moral status of the oracle.

If it were a robot, it would be asking questions like "Should I run over that [violin/dog/child] to save myself?" which does require considering the status of the robot.

EDIT: To clarify, it's not that the researcher has no reason to figure out the moral status of the oracle, it's that the oracle does not need to know its own moral status to answer its domain-specific questions.

Replies from: DanArmak

↑ comment by DanArmak · 2012-05-27T22:11:56.366Z · LW(p) · GW(p)

What if it assigned moral status to itself and then biased its answers to make its users less likely to pull its plug one day?

comment by Rain · 2012-05-10T20:03:13.773Z · LW(p) · GW(p)

I completely agree with the intent of this post. These are all important issues SI should officially answer. (Edit: SI's official reply is here.) Here are some of my thoughts:

I completely agree with objection 1. I think SI should look into doing exactly as you say. I also feel that friendliness has a very high failure chance and that all SI can accomplish is a very low marginal decrease in existential risk. However, I feel this is the result of existential risk being so high and difficult to overcome (Great Filter) rather than SI being so ineffective. As such, for them to engage this objection is to admit defeatism and millennialism, and so they put it out of mind since they need motivation to keep soldiering on despite the sure defeat.
Objection 2 is interesting, though you define AGI differently, as you say. Some points against it: Only one AGI needs to be in agent mode to realize existential risk, even if there are already billions of tool-AIs running safely. Tool-AI seems closer in definition to narrow AI, which you point out we already have lots of, and are improving. It's likely that very advanced tool-AIs will indeed be the first to achieve some measure of AGI capability. SI uses AGI to mean agent-AI precisely because at some point someone will move beyond narrow/tool-AI into agent-AI. AGI doesn't "have to be an agent", but there will likely be agent-AI at some point. I don't see a means to limit all AGI to tool-AI in perpetuity.
'Race for power' should be expanded to 'incentivised agent-AI'. There exist great incentives to create agent-AI above tool-AI, since AGI will be tireless, ever watchful, supremely faster, smarter, its answers not necessarily understood, etc. These include economic incentives, military incentives, etc., not even to implement-first, but to be better/faster on practical everyday events.
Objection 3, I mostly agree. Though should tool-AIs achieve such power, they can be used as weapons to realize existential risk, similar to nuclear, chemical, bio-, and nanotechnological advances.
I think this post focuses too much on "Friendliness theory". As Zack_M_Davis stated, SIAI should have more appropriately been called "The Singularity Institute For or Against Artificial Intelligence Depending on Which Seems to Be a Better Idea Upon Due Consideration". Friendliness is one word which could encapsulate a basket of possible outcomes, and they're agile enough to change position should it be shown to be necessary, as some of your comments request. Maybe SI should make tool-AI a clear stepping stone to friendliness, or at least a clear possible avenue worth exploring. Agreed.
Much agreed re: feedback loops.
"Kind of organization": painful but true.

However, I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" is a good reason to support Organization Y. For donors determined to donate within this cause, I encourage you to consider donating to a donor-advised fund while making it clear that you intend to grant out the funds to existential-risk-reduction-related organizations in the future. (One way to accomplish this would be to create a fund with "existential risk" in the name; this is a fairly easy thing to do and one person could do it on behalf of multiple donors.) For one who accepts my arguments about SI, I believe withholding funds in this way is likely to be better for SI's mission than donating to SI - through incentive effects alone (not to mention my specific argument that SI's approach to "Friendliness" seems likely to increase risks).

Good advice; I'll look into doing this. One reason I've been donating to them is so they can keep the lights on long enough to see and heed this kind of criticism. Maybe those incentives weren't appropriate.

This post limits my desire to donate additional money to SI beyond previous commitments. I consider it a landmark in SI criticism. Thank you for engaging this very important topic.

Edit: After SI's replies and careful consideration, I decided to continue donating directly to them, as they have a very clear roadmap for improvement and still represent the best value in existential risk reduction.

Replies from: khafra, None, None

↑ comment by khafra · 2012-05-11T14:20:10.310Z · LW(p) · GW(p)

You're an accomplished and proficient philanthropist; if you do make steps in the direction of a donor-directed existential risk fund, I'd like to see them written about.

↑ comment by [deleted] · 2015-02-15T17:30:59.726Z · LW(p) · GW(p)

I am unable to respond to people responding to my previous comment directly; the system tells me 'Replies to downvoted comments are discouraged. You don't have the requisite 5 Karma points to proceed.' So I will reply here.

@Salemicus

My question was indeed rhetorical. My comment was intended as a brief reality check, not a sophisticated argument. I disagree with you about the importance of climate change and resource shortage, and the effectiveness of humanitarian aid. But my comment did not intend to supply any substantial list of "causes"; again, it was a reality check. Its intention was to provoke reflection on how supposedly solid reasoning had justified donating to stop an almost absurdly Sci-Fi armageddon. I will now, briefly, respond to your points on the causes I raised. The following is, again, not a sophisticated and scientifically literate argument, but then neither was your reply to my comment. It probably isn't worth responding to.

On global warming, I do not wish to engage in a lengthy argument over a complicated scientific matter. Rather I will recommend reading the first major economic impact analysis, the 'Stern Review on the Economics of Climate Change'. You can find that easily by searching Google. For comments and criticisms of that paper, see:

Weitzman, M (2007), ‘The Stern Review of the Economics of Climate Change’, Journal of Economic Literature 45(3), 703-24. http://www.economics.harvard.edu/faculty/weitzman/files/review_of_stern_review_jel.45.3.pdf

Dasgupta, P (2007), ‘Comments on the Stern Review’s Economics of Climate Change’, National Institute Economic Review 199, 4-7. http://are.berkeley.edu/courses/ARE263/fall2008/paper/Discounting/Dasgupta_Commentary%20-%20The%20Stern%20Review%20s%20Economics%20of%20Climate%20Change_NIES07.pdf

Dietz, S and N Stern (2008), ‘Why Economic Analysis Supports Strong Action on Climate Change: a Response to the Stern Review’s Critics’, Review of Environmental Economics and Policy, 2(1), 94-113.

Broome, J (May, 2008), ‘The Ethics of Climate Change: Pay Now or Pay More Later?’ Scientific American, May, 2008.

On renewable resources, I think it is rather obviously stupid to induce 'we've never run out of resources before, so we can't be doing so now!'. I don't know what condor eggs are or what renewable resources we have run out of. I also fail to see why economists would be in a special position to tell us whether we are running out of resources.

On humanitarian causes, I fail to see how humanitarian aid is counter-productive. Perhaps you meant aid to developing countries (which I agree is a complex, although not at all hopeless issue). I meant aid in times of catastrophes such as natural disasters or wars.

@gjm

Again, I was not intending to provide a sophisticated argument. I only intended to supply a basic reality check. Again, this response to you will not be sophisticated or scientifically literate, and is probably not worth responding to.

Indeed, it is highly reasonable to give to multiple charities. Under doubt over which charities are the "best" (assuming such a concept makes sense), it may well be reasonable to donate to multiple charities. My brief reality check was not meant to say that donating to MIRI was not the best way to spend money, but rather it was absurd to even consider, given the other far more pressing and realistic problems in the world today.

You seem to assume that MIRI would be an effective organisation to prevent evil AIs running around and killing everybody, if such a threat actually existed. I'm not interested in a sophisticated argument over the performance of MIRI, but I think it's worth bringing up that tenuous assumption.

You also seem to make some kind of Pascal's Wager. This is rather strange. We could say there is a (very) low probability, perhaps very low indeed, that climate change messes up all our ecosystems so much that we can no longer farm food. Then we'd all die slowly of starvation. Or perhaps there's a very low probability that the sun flares to such an extent that life on the earth is wiped out. Ought we invest in flare guarding equipment? Perhaps there's a tiny probability aliens come and kill us all, but that the same aliens die if they think about blue cheese. Ought we erect monuments to the mighty Stilton around the world?

Don't take this comment too seriously.

Replies from: gjm

↑ comment by gjm · 2015-02-15T22:44:24.283Z · LW(p) · GW(p)

Don't take this comment too seriously.

Allow me to generalize: Don't take anything too seriously. (By definition of "too".)

I don't (at all) assume that MIRI would in fact be effective in preventing disastrous-AI scenarios. I think that's an open question, and in the very article we're commenting on we can see that Holden Karnofsky of GiveWell gave the matter some thought and decided that MIRI's work is probably counterproductive overall in that respect. (Some time ago; MIRI and/or HK's opinions may have changed relevantly since then.) As I already mentioned, I do not myself donate to MIRI; I was trying to answer the question "why would anyone who isn't crazy or stupid donate to MIRI?" and I think it's reasonably clear that someone neither crazy nor stupid could decide that MIRI's work does help to reduce the risk of AI-induced disaster.

("Evil AIs running around and killing everybody", though, is a curious choice of phrasing. It seems to fit much better with any number of rather silly science fiction movies than with anything MIRI and its supporters are actually arguing might happen. Which suggests that either you haven't grasped what it is they are worried about, or you have grasped it but prefer inaccurate mockery to engagement -- which is, of course, your inalienable right, but may not encourage people here to take your comments as seriously as you might prefer.)

I wasn't intending to make a Pascal's wager. Again, I am not myself a MIRI donor, but my understanding is that those who are generally think that the probability of AI-induced disaster is not very small. So the point isn't that there's this tiny probability of a huge disaster so we multiply (say) a 10^-6 chance of disaster by billions of lives lost and decide that we have to act urgently. It's that (for the MIRI donor) there's maybe a 10% -- or a 99% -- chance of AI-induced disaster if we aren't super-careful, and they hope MIRI can substantially reduce that.
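
The difference between the two framings can be shown with rough arithmetic. The sketch below uses the two probabilities mentioned above; the number of lives at stake is an assumed round figure chosen only for illustration.

```python
LIVES_AT_STAKE = 7_000_000_000  # assumed round figure for "billions of lives"


def expected_lives_lost(probability_of_disaster):
    """Expected loss = probability of disaster times lives at stake."""
    return probability_of_disaster * LIVES_AT_STAKE


# Pascal-style framing: a tiny probability multiplied by a huge stake.
pascal_style = expected_lives_lost(1e-6)

# The framing attributed to MIRI donors: a probability that is not small.
donor_style = expected_lives_lost(0.10)

print(f"Pascal-style expected loss: {pascal_style:,.0f} lives")
print(f"Donor-style expected loss:  {donor_style:,.0f} lives")
print(f"Ratio: {donor_style / pascal_style:,.0f}x")
```

The two framings differ by five orders of magnitude in the probability alone, so an objection aimed at the first does not automatically apply to the second.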

other far more pressing and realistic problems in the world today

The underlying argument here is -- if I'm understanding right -- something like this: "We know that there are people starving in Africa right now. We fear that there might some time in the future be danger from superintelligent artificial intelligences whose goals don't match ours. We should always prioritize known, present problems over future, uncertain ones. So it's silly to expend any effort worrying about AI." I disagree with the third premise there, that we should always prioritize known, present problems over future, uncertain ones. Consider global warming: it probably isn't doing us much harm yet; although the skeptics/deniers are probably wrong, it's not altogether impossible that they are right; so trying to deal with global warming also falls into the category of future, uncertain threats -- and yet this was your first example of something that should obviously be given priority over AI safety.

I guess (but please correct me if I guess wrong) your response would be that the danger of AI is much much lower-probability than the danger of global warming. (Because the probability of producing AI at all is small, or because the probability of getting a substantially superhuman AI is small, or because a substantially superhuman AI would be very unlikely to do any harm, or whatever.) You might be right. How sure are you that you're right, and why?

Replies from: None, dxu

↑ comment by [deleted] · 2015-03-13T13:38:35.580Z · LW(p) · GW(p)

Replies from: gjm

↑ comment by gjm · 2015-03-13T15:19:32.180Z · LW(p) · GW(p)

↑ comment by dxu · 2015-02-16T02:05:32.095Z · LW(p) · GW(p)

Extremely tiny probabilities with enormous utilities attached do suffer from Pascal's Mugging-type scenarios. That being said, AI-risk probabilities are much larger in my estimate than the sorts of probabilities required for Pascal-type problems to start coming into play. Unless Perrr333 intends to suggest that probabilities involving UFAI really are that small, I think it's unlikely he/she is actually making any sort of logical argument. It's far more likely, I think, that he/she is making an argument based on incredulity (disguised by seemingly logical arguments, but still at its core motivated by incredulity).

The problem with that, of course, is that arguments from incredulity rely almost exclusively on intuition, and the usefulness of intuition decreases spectacularly as scenarios become more esoteric and further removed from the realm of everyday experience.

↑ comment by [deleted] · 2015-02-10T13:03:55.736Z · LW(p) · GW(p)

How can anyone seriously consider the hypothetical threat of AIs running around a worthier cause than stopping global warming, or investing in renewable resources, or preventing/relieving humanitarian crises?

Replies from: gjm, Salemicus

↑ comment by gjm · 2015-02-10T18:11:05.926Z · LW(p) · GW(p)

Another datapoint to compare and contrast with Salemicus's (our political positions are very different):

Like Salemicus, I am not very optimistic that you're actually asking a serious question with the intention of listening to the answers; if you are, you might want to reconsider how your writing comes across.
I think it's perfectly possible, and reasonable, to be concerned about more than one issue at a time.
- There is an argument that charitable giving, unless you're giving far more than most of us are in a position to give, should all be directed to the single best cause you can find. I am not a donor to MIRI because I don't think it's the single best cause I can find. If you're asking why people give money to MIRI then maybe someone else will answer that.
I think all the three things you list are important. (In particular, unlike Salemicus I think there are things we can do that will reduce global warming and be of net benefit in other respects; I agree with Salemicus that we are unlikely to completely run out of (say) oil, but think it very possible that the price might become very high and that this could hurt us a lot; and I strongly disagree with him in thinking that attempts to deal with humanitarian crises are typically harmful.)
- AI safety is less likely to be a problem than any of them, but (with low probability) could be a worse problem than any of them.
- In particular, there are improbable-feeling scenarios in which it's a huuuuuge catastrophe. These tend to feel "silly" simply because they involve things happening that are far outside the range of what we're familiar with, but consideration of how (say) Shakespeare might have reacted to some features of present-day technology suggests to me that this isn't a very reliable guide.
- In any case, these scenarios are interesting to think about even if they end up not being a problem. (They might end up not being a problem because they have been thought about. This would not be a bad outcome.)

↑ comment by Salemicus · 2015-02-10T14:33:23.473Z · LW(p) · GW(p)

In the slim chance that your question is non-rhetorical:

Many people do not consider global warming to be a problem. Others think that there is nothing useful to be done about it. Personally I do not consider global warming to be a serious threat; people will adapt fairly easily to temperature changes within the likely ranges. Further, any realistic 'cure' for global warming would almost certainly be worse than the disease. Therefore I do not view climate change activism to be a worthy cause at present, although that could change.
History and economics both suggest that so-called non-renewable resources are in fact very robust. Mankind has never run out of any non-renewable resource, whereas we have run out of many renewable ones. The fact that a resource has a hypothetical 'renewability' does not necessarily have much impact on the limits to its use. For instance, we need to worry far less about running out of coal than condor eggs. I view most investment in renewable resources as pure boondoggling, and pretty much the opposite of a worthy cause.
Preventing and relieving humanitarian crises can be a worthy cause in principle. But in practice activism along those lines seems heavily counterproductive. I often wonder how many fewer crises there would be if

So basically, I don't think MIRI is likely to do much good in the world. But I'd much rather donate to them than to Greenpeace, Solyndra or Oxfam, because at least they're not actively doing harm.

comment by komponisto · 2012-05-12T02:55:35.200Z · LW(p) · GW(p)

Lack of impressive endorsements. [...] I feel that given the enormous implications of SI's claims, if it argued them well it ought to be able to get more impressive endorsements than it has. I have been pointed to Peter Thiel and Ray Kurzweil as examples of impressive SI supporters, but I have not seen any on-record statements from either of these people that show agreement with SI's specific views, and in fact (based on watching them speak at Singularity Summits) my impression is that they disagree.

This is key: they support SI despite not agreeing with SI's specific arguments. Perhaps you should, too, at least if you find folks like Thiel and Kurzweil sufficiently impressive.

In fact, this has always been roughly my own stance. The primary reason I think SI should be supported is not that their arguments for why they should be supported are good (although I think they are, or at least, better than you do). The primary reason I think SI should be supported is that I like what the organization actually does, and wish it to continue. The Less Wrong Sequences, Singularity Summit, rationality training camps, and even HPMoR and Less Wrong itself are all worth paying some amount of money for. Not to mention the general paying-of-attention to systematic rationality training, and to existential risks relating to future technology.

Strangely, the possibility of this kind of view doesn't seem to be discussed much, even though it is apparently the attitude of some of SI's most prominent supporters.

I furthermore have to say that to raise this particular objection seems to me almost to defeat the purpose of GiveWell. After all, if we could rely on standard sorts of prestige-indicators to determine where our money would be best spent, everybody would be spending their money in those places already, and "efficient charity" wouldn't be a problem for some special organization like yours to solve.

Replies from: ghf, None

↑ comment by ghf · 2012-05-13T20:12:00.048Z · LW(p) · GW(p)

The primary reason I think SI should be supported is that I like what the organization actually does, and wish it to continue. The Less Wrong Sequences, Singularity Summit, rationality training camps, and even HPMoR and Less Wrong itself are all worth paying some amount of money for.

I think that my own approach is similar, but with a different emphasis. I like some of what they've done, so my question is how do I encourage those pieces. This article was very helpful in prompting some thought into how to handle that. I generally break down their work into three categories:

Rationality (minicamps, training, LW, HPMoR): Here I think they've done some very good work. Luckily, the new spinoff will allow me to support these pieces directly.
Existential risk awareness (singularity summit, risk analysis articles): Here their record has been mixed. I think the Singularity Summit has been successful, other efforts less so but seemingly improving. I can support the Singularity Summit by continuing to attend and potentially donating directly if necessary (since it's been running positive in recent years, for the moment this does not seem necessary).
Original research (FAI, timeless decision theory): This is the area where I do not find them to be at all effective. From what I've read, there seems to be a large disconnect between ambitions and capabilities. Given that I can now support the other pieces separately, this is why I would not donate generally to SIAI.

My overall view would be that, at present, there is no real organization to support. Rather there is a collection of talented people whose freedom to work on interesting things I'm supporting. Given that, I want to support those people where I think they are effective.

I find Eliezer in particular to be one of the best pop-science writers around (and I most assuredly do not mean that term as an insult). Things like the sequences or HPMoR are thought-provoking and worth supporting. I find the general work on rationality to be critically important and timely.

So, while I agree that much of the work being done is valuable, my conclusion has been to consider how to support that directly rather than SI in general.

Replies from: komponisto, chaosmage

↑ comment by komponisto · 2012-05-13T22:55:51.343Z · LW(p) · GW(p)

I don't see how this constitutes a "different emphasis" from my own. Right now, SI is the way one supports the activities in question. Once the spinoff has finally spun off and can take donations itself, it will be possible to support the rationality work directly.

Replies from: ghf

↑ comment by ghf · 2012-05-13T23:33:25.412Z · LW(p) · GW(p)

The different emphasis comes down to your comment that:

...they support SI despite not agreeing with SI's specific arguments. Perhaps you should, too...

In my opinion, I can more effectively support those activities that I think are effective by not supporting SI. Waiting until the Center for Applied Rationality gets its tax-exempt status in place allows me to both target my donations and directly signal where I think SI has been most effective up to this point.

If they end up having short-term cashflow issues prior to that split, my first response would be to register for the next Singularity Summit a bit early since that's another piece that I wish to directly support.

↑ comment by chaosmage · 2012-05-14T10:44:10.723Z · LW(p) · GW(p)

So, are you saying you'd be more inclined to fund a Rationality Institute?

↑ comment by [deleted] · 2012-05-12T03:03:38.314Z · LW(p) · GW(p)

I furthermore have to say that to raise this particular objection seems to me almost to defeat the purpose of GiveWell. After all, if we could rely on standard sorts of prestige-indicators to determine where our money would be best spent, everybody would be spending their money in those places already, and "efficient charity" wouldn't be a problem for some special organization like yours to solve.

I think Holden seems to believe that Thiel and Kurzweil endorsing SIAI's UFAI-prevention methods would be more like a leading epidemiologist endorsing the malaria-prevention methods of the Against Malaria Foundation (AMF) than it would be like Celebrity X taking a picture with some children for the AMF. There are different kinds of "prestige-indicator," some more valuable to a Bayesian-minded charity evaluator than others.

Replies from: komponisto

↑ comment by komponisto · 2012-05-12T03:10:46.860Z · LW(p) · GW(p)

I would still consider the leading epidemiologist's endorsement to be a standard sort of prestige-indicator. If an anti-disease charity is endorsed by leading epidemiologists, you hardly need GiveWell. (At least for the epidemiological aspects. The financial/accounting part may be another matter.)

Replies from: None

↑ comment by [deleted] · 2012-05-12T03:17:03.668Z · LW(p) · GW(p)

I would argue that this is precisely what GiveWell does in evaluating malaria charity. If the epidemiological consensus changed, and bednets were held to be an unsustainable solution (this is less thoroughly implausible than it might sound, though probably still unlikely), then even given the past success of certain bednet charities on all GiveWell's other criteria, GiveWell might still downgrade those charities. And don't underestimate the size of the gap between "a scientifically plausible mechanism for improving lives" and "good value in lives saved/improved per dollar." There are plenty of bednet charities, and there's a reason GiveWell recommends AMF and not, say, Nothing But Nets.

The endorsement, in other words, is about the plausibility of the mechanism, which is only one of several things to consider in donating to a charity, but it's the area in which a particular kind of expert endorsement is most meaningful.

Replies from: komponisto

↑ comment by komponisto · 2012-05-12T04:13:34.760Z · LW(p) · GW(p)

If the epidemiological consensus changed, and bednets were held to be an unsustainable solution...then even given the past success of certain bednet charities on all GiveWell's other criteria, GiveWell might still downgrade those charities.

As they should. But the point is that, in so doing, GiveWell would not be adding any new information not already contained in the epidemiological consensus (assuming they don't have privileged information about the latter).

And don't underestimate the size of the gap between "a scientifically plausible mechanism for improving lives" and "good value in lives saved/improved per dollar."

Indeed. The latter is where GiveWell enters the picture; it is their unique niche. The science itself, on the other hand, is not really their purview, as opposed to the experts. If GiveWell downgrades a charity solely because of the epidemiological consensus, and (for some reason) I have good reason to think the epidemiological consensus is wrong, or inadequately informative, then GiveWell hasn't told me anything, and I have no reason to pay attention to them. Their rating is screened off.

Imagine that 60% of epidemiologists think that Method A is not effective against Disease X, while 40% think it is effective. Suppose Holden goes to a big conference of epidemiologists and says "GiveWell recommends against donating to Charity C because it uses Method A, which the majority of epidemiologists say is not effective." Assuming they already knew Charity C uses Method A, should they listen to him?

Of course not. The people at the conference are all epidemiologists themselves, and those in the majority are presumably already foregoing donations to Charity C, while those in the minority already know that the majority of their colleagues disagree with them. Holden hasn't told them anything new. So, if his organization is going to be of any use to such an audience, it should focus on the things they can't already evaluate themselves, like financial transparency, accounting procedures, and the like; unless it can itself engage the scientific details.

This is analogous to the case at hand: if all that GiveWell is going to tell the world is that SI hasn't signaled enough status, well, the world already knows that. Their raison d'être is to tell people info that they can't find (or is costly to find) via other channels: such as info about non-high-status charities that may be worth supporting despite their non-high-status. If it limits its endorsements to high-status charities, then it may as well not even bother -- just as it need not bother telling a conference of epidemiologists that it doesn't endorse a charity because of the epidemiological consensus.

Replies from: None

↑ comment by [deleted] · 2012-05-12T11:37:13.182Z · LW(p) · GW(p)

A few points:

"Possesses expert endorsement of its method" does not necessarily equal "high-status charity." A clear example here is de-worming and other parasite control, which epidemiologists all agree works well, but which doesn't get the funding a lot of other developing world charity does because it's not well advertised. GiveWell would like SIAI to be closer to de-worming charities in that outside experts give some credence to the plausibility of the methods by which SIAI proposes to do good.

Moreover, "other high-status charities using one's method" also doesn't equal "high-status charity." Compare the number of Facebook likes for AMF and Nothing But Nets. The reason GiveWell endorses one but not the other is that AMF, unlike NBN, has given compelling evidence that it can scale the additional funding that a GiveWell endorsement promises into more lives saved/improved at a dollar rate comparable to their current lives saved/improved per dollar.

So we should distinguish a charity's method being "high-status" from the charity itself being "high-status." But if you define "high status method" as "there exists compelling consensus among the experts GiveWell has judged to be trustworthy that the proposed method for doing good is even plausible," then I, as a Bayesian, am perfectly comfortable with GiveWell only endorsing "high-status method" charities. They still might buck the prevailing trends on optimal method; perhaps some of the experts are on GiveWell's own staff, or aren't prominent in the world at large. But by demanding that sort of "high-status method" from a charity, GiveWell discourages crankism and is unlikely to miss a truly good cause for too long.

Expert opinion on method plausibility is all the more important with more speculative charity like SIAI because there isn't a corpus of "effectiveness data to date" to evaluate directly.

comment by Paul Crowley (ciphergoth) · 2012-05-11T06:31:10.466Z · LW(p) · GW(p)

Firstly, I'd like to add to the chorus saying that this is an incredible post; as a supporter of SI, it warms my heart to see it. I disagree with the conclusion - I would still encourage people to donate to SI - but if SI gets a critique this good twice a decade it should count itself lucky.

I don't think GiveWell making SI its top rated charity would be in SI's interests. In the long term, SI benefits hugely when people are turned on to the idea of efficient charity, and asking them to swallow all of the ideas behind SI's mission at the same time will put them off. If I ran GiveWell and wanted to give an endorsement to SI, I might break the rankings into multiple lists: the most prominent being VillageReach-like charities which directly do good in the near future, then perhaps a list for charities that mitigate broadly accepted and well understood existential risks (if this can be done without problems with politics), and finally a list of charities which mitigate more speculative risks.

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-12T20:11:19.980Z · LW(p) · GW(p)

I don't think GiveWell making SI its top rated charity would be in SI's interests.

This seems like a good point and perhaps would have been a good reason for SI to not have approached GiveWell in the first place. At this point though, GiveWell is not only refusing to make SI a top rated charity, but actively recommending people to "withhold" funds from SI, which as far as I can tell, it almost never does. It'd be a win for SI to just convince GiveWell to put it back on the "neutral" list.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-12T20:19:15.996Z · LW(p) · GW(p)

Agreed. Did SI approach GiveWell?

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-12T20:49:23.059Z · LW(p) · GW(p)

Did SI approach GiveWell?

Yes. Hmm, reading that discussion shows that they were already thinking about having GiveWell create a separate existential risk category (and you may have gotten the idea there yourself and then forgot the source).

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-12T21:01:28.114Z · LW(p) · GW(p)

you may have gotten the idea there yourself and then forgot the source

Indeed.

comment by Wei Dai (Wei_Dai) · 2012-05-13T12:31:40.221Z · LW(p) · GW(p)

I find it unfortunate that none of the SIAI research associates have engaged very deeply in this debate, even LessWrong regulars like Nesov and cousin_it. This is part of the reason why I was reluctant to accept (and ultimately declined) when SI invited me to become a research associate: I would feel less free to speak up both in support of SI and in criticism of it.

I don't think this is SI's fault, but perhaps there are things it could do to lessen this downside of the research associate program. For example it could explicitly encourage the research associates to publicly criticize SI and to disagree with its official positions, and make it clear that no associate will be blamed if someone mistook their statements to be official SI positions or saw them as reflecting badly on SI in general. I also write this comment because just being consciously aware of this bias (in favor of staying silent) may help to counteract it.

Replies from: Vladimir_Nesov, cousin_it

↑ comment by Vladimir_Nesov · 2012-05-13T13:43:22.801Z · LW(p) · GW(p)

I don't usually engage in potentially protracted debates lately. A very short summary of my disagreement with Holden's object-level argument part of the post is (1) I don't see in what way the idea of powerful Tool AI can be usefully different from that of Oracle AI, and it seems like the connotations of "Tool AI" that distinguish it from "Oracle AI" follow from an implicit sense of it not having too much optimization power, so it might be impossible for a Tool AI to both be powerful and hold the characteristics suggested in the post; (1a) the description of Tool AI denies it goals/intentionality and other words, but I don't see what they mean apart from optimization power, and so I don't know how to use them to characterize Tool AI; (2) the potential danger of having a powerful Tool/Oracle AI around is such that aiming at their development doesn't seem like a good idea; (3) I don't see how a Tool/Oracle AI could be sufficiently helpful to break the philosophical part of the FAI problem, since we don't even know which questions to ask.

Since Holden stated that he's probably not going to (interactively) engage the comments to this post, and writing this up in a self-contained way is a lot of work, I'm going to leave this task to the people who usually write up SingInst outreach papers.

Replies from: Thomas, private_messaging

↑ comment by Thomas · 2012-05-13T15:20:28.552Z · LW(p) · GW(p)

The Tool/Oracle AI may transfer the power to the people, who manage and control this device. They can easily become unfriendly, yes.

And I would cut out this "Tool AI", the "Oracle AI" is enough.

↑ comment by private_messaging · 2012-05-13T17:00:09.759Z · LW(p) · GW(p)

edit: removed text in Russian because it was read by recipient (the private message system here shows replies to public and private messages together, making private messages very easy to miss).

Replies from: Vladimir_Nesov, Tyrrell_McAllister, gRR

↑ comment by Vladimir_Nesov · 2012-05-13T22:14:58.823Z · LW(p) · GW(p)

[This thread presents a good opportunity to exercise the (tentatively suggested) norm of indiscriminately downvoting all comments in pointless conversations, irrespective of individual quality or helpfulness of the comments.]

Replies from: Tyrrell_McAllister

↑ comment by Tyrrell_McAllister · 2012-05-14T03:21:20.218Z · LW(p) · GW(p)

Although, please be aware that the pointlessness of the conversation may not initially have been so transparent to those who cannot read Russian.

↑ comment by Tyrrell_McAllister · 2012-05-13T21:26:21.152Z · LW(p) · GW(p)

Google Translate's translation:

Oracle as described here: http://lesswrong.com/lw/any/a_taxonomy_of_oracle_ais/?

Why would you even got in touch with these stupid dropout? These "artificial intelligence" has been working on in the imagination of animism, respectively, if he wants to predict what course wants to be the correct predictions were.

The real work on the mathematics in a computer, gave him 100 rooms, he'll spit out a few formulas that describe the accuracy with varying sequence, and he is absolutely on the drum, they coincide with your new numbers or not, unless specifically addressed grounding symbols, and specifically to do so was not on the drum.

In my opinion you'd better stay away from this group of dropouts. They climbed to the retarded arguments to GiveWell, Holden wrote a bad review on the subject, and this is just the beginning - will be even more angry feedback from the experts. Communicate with them as a biochemist scientist to communicate with fools who are against vaccines (It is clear that the vaccine can improve and increase their safety, and it is clear that the morons who are against this vaccine does not help).

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-13T21:57:29.509Z · LW(p) · GW(p)

It mistranslates the words a fair lot. The meaning is literally uncomplete-educated; one can be a dropout but not be uncomplete-educated; one may complete course but be uncomplete-educated, too. The 'communicate' is more close to 'relate to'. Basically, what I am saying is that I don't understand why he chooses to associate with incompetent, undereducated fools of SI, and defend them; it is about as sensible as for a biochemist to associate with some anti-vaccination idiots.

Replies from: Tyrrell_McAllister, JoshuaZ

↑ comment by Tyrrell_McAllister · 2012-05-14T02:36:16.178Z · LW(p) · GW(p)

Actually, I'm most curious about the middle paragraph (with the "100 rooms" and the "drum"). Google seems to have totally mangled that one. What is the actual meaning?

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-14T09:48:14.208Z · LW(p) · GW(p)

Replied in private. The point is that a number-sequence predictor, for instance (somehow "number" was translated as "room"), which produces some formula that fits the sequence, isn't going to care (the "drum" part) whether that formula matches your new numbers.

↑ comment by JoshuaZ · 2012-05-13T22:00:17.999Z · LW(p) · GW(p)

The connotations were clear from the machine translated form. In this context, your behavior was unproductive, uncivil and passive-aggressive.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-13T22:06:57.979Z · LW(p) · GW(p)

Whatever. Translate your message to Russian and then back to English.

Anyhow, it is the case that SI is an organisation led by two undereducated, incompetent, overly narcissistic individuals who are speaking with undue confidence about things that they do not understand, and who basically do nothing but generate bullshit. He is best off not associating with this sort of stuff. You think Holden's response is bad? Wait until you run into someone even less polite than me. You'll hear the same thing I am saying, from someone with a position of authority.

↑ comment by gRR · 2012-05-13T17:47:02.019Z · LW(p) · GW(p)

"want X" = how "having the goal X" feels from the inside. Animism is in your imagination.

↑ comment by cousin_it · 2012-05-13T13:20:16.946Z · LW(p) · GW(p)

Not sure about the others, but as for me, at some point this spring I realized that talking about saving the world makes me really upset and I'm better off avoiding the whole topic.

Replies from: Wei_Dai, sufferer

↑ comment by Wei Dai (Wei_Dai) · 2012-05-13T19:07:17.266Z · LW(p) · GW(p)

Would it upset you to talk about why talking about saving the world makes you upset?

Replies from: homunq, cousin_it

↑ comment by homunq · 2012-05-14T19:23:40.683Z · LW(p) · GW(p)

It would appear that cousin_it believes we're screwed. It's tempting to argue that this would, overall, be an argument against the effectiveness of the SI program. However, that's probably not true, because we could be 99% screwed and the remaining 1% could depend on SI; this would be a depressing fact, yet still justify supporting the SI.

(Personally, I agree with the poster about the problems with SI, but I'm just laying it out. Responding to wei_dai rather than cousin_it because I don't want to upset the latter unnecessarily.)

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-15T05:46:33.159Z · LW(p) · GW(p)

we could be 99.9% screwed and the remaining 0.1% could be caused by donating to SI and it discouraging some avenue to survival.

Actually, the way I see it, the starkest symptom of SI being diseased is the certainty in intuitions even though there isn't some mechanism by which such intuitions could be based on subconscious but valid reasoning, together with the abundance of biases affecting the intuitions. There's nothing rational about summarizing a list of biases and then proclaiming that now they don't apply to me and I can use my intuitions.

↑ comment by cousin_it · 2012-05-13T20:06:38.655Z · LW(p) · GW(p)

Yes.

↑ comment by sufferer · 2012-05-17T18:18:29.355Z · LW(p) · GW(p)

It's because talking about the singularity and end-of-world in near mode for a large amount of time makes you alieve that it's going to happen. In the same way that it actually happening would make you alieve it, but talking about it once and believing it then never thinking about it explicitly again wouldn't.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-18T02:55:59.821Z · LW(p) · GW(p)

Probably not wise to categorically tell someone the reasons behind their feelings when you're underinformed, and probably not kind to ruminate on the subject when you can expect it to be unpleasant.

Replies from: sufferer, wedrifid

↑ comment by sufferer · 2012-05-27T20:22:02.113Z · LW(p) · GW(p)

I have personally felt the same feelings and I think I have pinned down the reason. I welcome alternative theories, in the spirit of rational debate rather than polite silence.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-27T20:38:22.343Z · LW(p) · GW(p)

That you may have discovered the reason that you felt this way does not mean that you have discovered the reason another specific person felt a similar way. In fact, they may not even be unaware of the causes of their feelings.

Replies from: sufferer

↑ comment by sufferer · 2012-05-27T20:41:35.954Z · LW(p) · GW(p)

Sure. That's why I said: "I welcome alternative theories" (including theories about there being multiple different reasons which may apply to different extents to different people). Do you have one?

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-27T20:53:36.405Z · LW(p) · GW(p)

Missed the point. Do you understand that you shouldn't have been confident you knew why cousin_it felt a particular way? Beyond that, personally I'm not all that interested in theorizing about the reasons, but if you really want to know you could just ask.

Replies from: sufferer

↑ comment by sufferer · 2012-05-28T17:43:28.449Z · LW(p) · GW(p)

Sorry, I wasn't implying very strong confidence. I would give a probability of, say, 65% that my reason is the principal cause of the feelings of cousin_it.

↑ comment by wedrifid · 2012-05-27T22:47:38.347Z · LW(p) · GW(p)

Probably not wise to categorically tell someone the reasons behind their feelings when you're underinformed,

Neither wise nor epistemically sound practice.

and probably not kind to ruminate on the subject when you can expect it to be unpleasant.

It is perfectly acceptable to make a reply to a publicly made comment that was itself freely volunteered. If the subject of there being subjects which are unpleasant to discuss is itself terribly unpleasant to discuss then it is cousin_it's prerogative to not bring up the subject on a forum where analysis of the subject is both relevant and potentially useful for others.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-27T23:07:23.132Z · LW(p) · GW(p)

I disagree that it is in general unacceptable to post information that you would not like to discuss beyond a certain point.

Without further clarification one could reasonably assume that cousin_it was okay with discussing the subject at one removal, as you suggest, but as it happens several days before the great-grandparent cousin_it explicitly stated that it would be upsetting to discuss this topic.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-27T23:22:38.500Z · LW(p) · GW(p)

I disagree that it is in general unacceptable to post information that you would not like to discuss beyond a certain point.

I would not make (and haven't made) the claim as you have stated it.

Without further clarification one could reasonably assume that cousin_it was okay with discussing the subject at one removal, as you suggest, but as it happens several days before the great-grandparent cousin_it explicitly stated that it would be upsetting to discuss this topic.

When that is the case - and if I happened to see it before making a contribution - I would refrain from making any direct reply to the user or discussing him as an instance when talking about the subject (all else being equal). I would still discuss the subject itself using the same criteria for posting that I always use. Mind you, I would probably have already refrained from directly discussing the user due to the aforementioned epistemic absurdity and presumptuousness.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-28T00:23:52.675Z · LW(p) · GW(p)

What you claimed was that "It is perfectly acceptable to make a reply to a publicly made comment that was itself freely volunteered", and that if someone didn't want to discuss something then they shouldn't have brought it up. In context, however, this was a reply to me saying it was probably unkind to belabor a subject to someone who'd expressed that they find the subject upsetting, which you now seem to be saying you agree with. So what are you taking issue with? I certainly didn't mean to imply that if someone finds a subject uncomfortable to discuss, personally, then that means that others should stop discussing it at all, but this point isn't raised in your great-grandparent comment, and I hope my meaning was clear from the context.

ETA: I have not voted on your comments here.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T14:29:31.239Z · LW(p) · GW(p)

ETA: I have not voted on your comments here.

I have not voted here either. As of now the conversation is all at "0" which is how I would prefer it.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-28T15:39:02.842Z · LW(p) · GW(p)

Just wanted to clarify, as at the time your posts had both been downvoted.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T15:44:09.377Z · LW(p) · GW(p)

Just wanted to clarify, as at the time your posts had both been downvoted.

So I assumed. As a pure curiosity, if my comments were still downvoted I would have had to downvote yours despite your disclaimer. Not out of reciprocation but because the wedrifid comments being lower than the CuSithBell comments would be an error state and I would have no way to correct the wedrifid votes.

Replies from: TheOtherDave, shokwave, CuSithBell

↑ comment by TheOtherDave · 2012-05-28T17:46:52.754Z · LW(p) · GW(p)

and I would have no way to correct the wedrifid votes

That isn't actually true.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T18:01:59.770Z · LW(p) · GW(p)

That isn't actually true.

Correct. It is instead something that people should usually say is true because belief or practical assumption that defection is impossible is a better signal to send than that they could easily defect if they wanted to but choose not to.

It does so happen that I am incredibly talented when it comes to automation and have created web bots that are far more advanced than that required to prevent anything I would consider an 'error state' in voting patterns, essentially undetectably. It just so happens that I couldn't really be bothered doing so in the case of lesswrong and have something of an aversion to doing so anyway.

I mean, I've already got 20k votes in this game without cheating and without even trying to (by, for example, writing posts.)

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-28T22:30:39.607Z · LW(p) · GW(p)

Even if we agree to pretend that defection is impossible, you can also correct the wedrifid votes in a socially endorsed way by calling the attention of your allies to the exchange.

↑ comment by shokwave · 2012-05-28T17:29:18.934Z · LW(p) · GW(p)

I would have no way to correct the wedrifid votes.

If there are viewers of the post who are sufficiently similar to you, they will correct the wedrifid votes. A strategy to ensure error states get corrected is to be sufficiently similar to more post-viewers than your interlocutor.

(I corrected the conversation's votes.)

Replies from: wedrifid, CuSithBell

↑ comment by wedrifid · 2012-05-28T17:49:02.892Z · LW(p) · GW(p)

A strategy to ensure error states get corrected is to be sufficiently similar to more post-viewers than your interlocutor.

That is a strategy to get votes. If it so happened that wedrifid was particularly different to people here then modifying himself to be more similar to the norm would result in more votes but also more error states. Because all comments of the modified wedrifid that the original wedrifid would have objected to that get upvoted would constitute "error states" from the perspective of the wedrifid making the choice of whether to self modify. i.e. Gandhi doesn't take the murder pill.

Just to be clear, I would not label all instances of wedrifid being downvoted or having less votes than the other person in a conversation as 'error states', just that in this specific conversation it would be a bad thing if that were the case. Obviously this is expected to be uncontroversial at least as the expected assumption from my perspective.

(I corrected the conversation's votes.)

I corrected the conversation's votes too. Someone downvoted the parent!

Replies from: shokwave

↑ comment by shokwave · 2012-05-28T17:57:33.402Z · LW(p) · GW(p)

I would not label all instances of wedrifid being downvoted or having less votes than the other person in a conversation as 'error states'

Ah, that was the false assumption I made. Cheers!

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T18:04:14.609Z · LW(p) · GW(p)

Ah, that was the false assumption I made. Cheers!

To be sure, most would be. But I'm sure in all the comments I've made over the years there is at least one that I would downvote in hindsight! ;)

↑ comment by CuSithBell · 2012-05-28T17:39:13.213Z · LW(p) · GW(p)

Why moreso than your interlocutor? That assumes you're conversing with people who desire error states (from your perspective).

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T17:54:37.617Z · LW(p) · GW(p)

Why moreso than your interlocutor?

I think he means that if the interlocutor votes but you do not then you must get 1 more vote on average from the observers than the interlocutor does.

That assumes you're conversing with people who desire error states (from your perspective).

That seems true. ie. It assumes a downvote from the interlocutor when their downvote would constitute an error state. Without that assumption the 'moreso' is required only by way of creating an error margin.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-28T18:06:19.440Z · LW(p) · GW(p)

My conception of error states was a little more general - the advice and assumptions wouldn't apply to, say, a conversation which both participants find valuable, but in which one or both are downvoted by observers.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T18:20:31.652Z · LW(p) · GW(p)

wouldn't apply to, say, a conversation which both participants find valuable, but in which one or both are downvoted by observers.

Such conversations happen rather often and I usually find it sufficient reason to discontinue the otherwise useful conversation. The information gained about public perception based on the feedback from observers completely changes what can be said and modifies how any given statement will be interpreted. Too annoying to deal with and a tad offensive. Not necessarily the fault of the interlocutor but the attitudes of the interlocutor's supporters still necessitate abandoning free conversation or information exchange with them and instead treating the situation as one of social politics.

↑ comment by CuSithBell · 2012-05-28T17:21:50.037Z · LW(p) · GW(p)

Well, whatever floats your boat. I wasn't trying to avoid downvotes, just ill-will.

So I take it you don't find your issue resolved, but you don't think it'll be fruitful to pursue the matter? If that's the case, sorry to give you that impression.

Replies from: wedrifid

↑ comment by wedrifid · 2012-05-28T17:39:14.294Z · LW(p) · GW(p)

So I take it you don't find your issue resolved, but you don't think it'll be fruitful to pursue the matter? If that's the case, sorry to give you that impression.

I didn't consider it to be an issue that particularly needed to be resolved. It was a five second fire and forget perspective given on your assertion of social norms that was a partial agreement and partial disagreement. The degree of difference is sufficiently minor that if your original injunction had either included the link or somewhat less general wording I would not have even thought it was worth an initial reply.

Sure, sometimes I am known to analyse such nuances in depth but for some reason this one just didn't catch my interest.

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-28T17:40:32.927Z · LW(p) · GW(p)

All right, that's cool then. Cheerio!

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-11T00:30:27.779Z · LW(p) · GW(p)

Thank you very much for writing this. I, um, wish you hadn't posted it literally directly before the May Minicamp when I can't realistically respond until Tuesday. Nonetheless, it already has a warm place in my heart next to the debate with Robin Hanson as the second attempt to mount informed criticism of SIAI.

Replies from: John_Maxwell_IV, lukeprog

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-05-11T05:16:53.162Z · LW(p) · GW(p)

It looks to me as though Holden had the criticisms he expresses even before becoming "informed", presumably by reading the sequences, but was too intimidated to share them. Perhaps it is worth listening to/encouraging uninformed criticisms as well as informed ones?

Replies from: John_Maxwell_IV

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-05-12T06:59:45.800Z · LW(p) · GW(p)

Note the following criticism of SI identified by Holden:

Being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously.

↑ comment by lukeprog · 2012-05-11T08:19:04.244Z · LW(p) · GW(p)

[Holden's critique] already has a warm place in my heart... as the second attempt to mount informed criticism of SIAI.

To those who think Eliezer is exaggerating: please link me to "informed criticism of SIAI."

It is so hard to find good critics.

Edit: Well, I guess there are more than two examples, though relatively few. I was wrong to suggest otherwise. Much of this has to do with the fact that SI hasn't been very clear about many of its positions and arguments: see Beckstead's comment and Hallquist's followup.

Replies from: CarlShulman, Wei_Dai, Will_Newsome, XiXiDu, thomblake, private_messaging

↑ comment by CarlShulman · 2012-05-11T19:26:03.177Z · LW(p) · GW(p)

1) Most criticism of key ideas underlying SIAI's strategies does not reference SIAI, e.g. Chris Malcolm's "Why Robots Won't Rule" website is replying to Hans Moravec.

2) Dispersed criticism, with many people making local points, e.g. those referenced by Wei Dai, is still criticism and much of that is informed and reasonable.

3) Much criticism is unwritten, e.g. consider the more FAI-skeptical Singularity Summit speaker talks, or takes the form of brief responses to questions or the like. This doesn't mean it isn't real or important.

4) Gerrymandering the bounds of "informed criticism" to leave almost no one within bounds is in general a scurrilous move that one should bend over backwards to avoid.

5) As others have suggested, even within the narrow confines of Less Wrong and adjacent communities there have been many informed critics. Here's Katja Grace's criticism of hard takeoff (although I am not sure how separate it is from Robin's). Here's Brandon Reinhart's examination of SIAI, which includes some criticism and brings more in comments. Here's Kaj Sotala's comparison of FHI and SIAI. And there are of course many detailed and often highly upvoted comments in response to various SIAI-discussing posts and threads, many of which you have participated in.

↑ comment by Wei Dai (Wei_Dai) · 2012-05-11T18:26:10.439Z · LW(p) · GW(p)

This is a bit exasperating. Did you not see my comments in this thread? Have you and Eliezer considered that if there really have been only two attempts to mount informed criticism of SIAI, then LessWrong must be considered a massive failure that SIAI ought to abandon ASAP?

Replies from: lukeprog

↑ comment by lukeprog · 2012-05-11T19:16:08.419Z · LW(p) · GW(p)

See here.

↑ comment by Will_Newsome · 2012-05-11T17:21:56.527Z · LW(p) · GW(p)

Wei Dai has written many comments and posts that have some measure of criticism, and various members of the community, including myself, have expressed agreement with them. I think what might be a problem is that such criticisms haven't been collected into a single place where they can draw attention and stir up drama, as Holden's post has.

There are also critics like XiXiDu. I think he's unreliable, and I think he'd admit to that, but he also makes valid criticisms that are shared by other LW folk, and LW's moderation makes it easy to sift his comments for the better stuff.

Perhaps an institution could be designed. E.g., a few self-ordained SingInst critics could keep watch for critiques of SingInst, collect them, organize them, and update a page somewhere out-of-the-way over at the LessWrong Wiki that's easily checkable by SI folk like yourself. LW philanthropists like User:JGWeissman or User:Rain could do it, for example. If SingInst wanted to signal various good things then it could even consider paying a few people to collect and organize criticisms of SingInst. Presumably if there are good critiques out there then finding them would be well worth a small investment.

Replies from: Wei_Dai, lukeprog

↑ comment by Wei Dai (Wei_Dai) · 2012-05-12T08:09:23.742Z · LW(p) · GW(p)

I think what might be a problem is that such criticisms haven't been collected into a single place where they can draw attention and stir up drama, as Holden's post has.

I put them in discussion, because well, I bring them up for the purpose of discussion, and not for the purpose of forming an overall judgement of SIAI or trying to convince people to stop donating to SIAI. I'm rarely sure that my overall beliefs are right and SI people's are wrong, especially on core issues that I know SI people have spent a lot of time thinking about, so mostly I try to bring up ideas, arguments, and possible scenarios that I suspect they may not have considered. (This is one major area where I differ from Holden: I have greater respect for SI people's rationality, at least their epistemic rationality. And I don't know why Holden is so confident about some of his own original ideas, like his solution to Pascal's Mugging, and Tool-AI ideas. (Well I guess I do, it's probably just typical human overconfidence.))

Having said that, I reserve the right to collect all my criticisms together and make a post in main in the future if I decide that serves my purposes, although I suspect that without the influence of GiveWell behind me it won't stir up nearly as much drama as Holden's post. :)

ETA: Also, I had expected that SI people monitored LW discussions, not just for critiques, but also for new ideas in general (like the decision theory results that cousin_it, Nesov, and others occasionally post). This episode makes me think I may have overestimated how much attention they pay. It would be good if Luke or Eliezer could comment on this.

Replies from: CarlShulman, Will_Newsome

↑ comment by CarlShulman · 2012-05-16T01:34:46.136Z · LW(p) · GW(p)

Also, I had expected that SI people monitored LW discussions, not just for critiques, but also for new ideas in general

I read most such (apparently-relevant from post titles) discussions, and Anna reads a minority. I think Eliezer reads very few. I'm not very sure about Luke.

Replies from: Wei_Dai, lukeprog

↑ comment by Wei Dai (Wei_Dai) · 2012-05-16T09:48:04.659Z · LW(p) · GW(p)

Do you forward relevant posts to other SI people?

Replies from: CarlShulman

↑ comment by CarlShulman · 2012-05-16T20:59:34.965Z · LW(p) · GW(p)

Ones that seem novel and valuable, either by personal discussion or email.

↑ comment by lukeprog · 2012-05-26T05:25:37.592Z · LW(p) · GW(p)

Yes, I read most LW posts that seem to be relevant to my concerns, based on post titles. I also skim the comments on those posts.

↑ comment by Will_Newsome · 2012-05-15T12:30:57.393Z · LW(p) · GW(p)

Also, I had expected that SI people monitored LW discussions, not just for critiques, but also for new ideas in general (like the decision theory results that cousin_it, Nesov, and others occasionally post).

I'm somewhat confident (from directly asking him a related question and also from many related observations over the last two years) that Eliezer mostly doesn't, or is very good at pretending that he doesn't. He's also not good at reading so even if he sees something he's only somewhat likely to understand it unless he already thinks it's worth it for him to go out of his way to understand it. If you want to influence Eliezer it's best to address him specifically and make sure to state your arguments clearly, and to explicitly disclaim that you're specifically not making any of the stupid arguments that your arguments could be pattern-matched to.

Also I know that Anna is often too busy to read LessWrong.

↑ comment by lukeprog · 2012-05-11T19:10:30.856Z · LW(p) · GW(p)

Good point. Wei Dai qualifies as informed criticism. Though, he seems to agree with us on all the basics, so that might not be the kind of criticism Eliezer was talking about.

↑ comment by XiXiDu · 2012-05-11T10:22:18.439Z · LW(p) · GW(p)

To those who think Eliezer is exaggerating: please link me to "informed criticism of SIAI."

It would help if you could elaborate on what you mean by "informed".

Most of what Holden wrote, and much more, has been said by other people, excluding myself, before.

I don't have the time right now to wade through all those years of posts and comments but might do so later.

And if you are not willing to take into account what I myself wrote, for being uninformed, then maybe you will however agree that at least all of my critical comments that have been upvoted to +10 (ETA changed to +10, although there is a lot more on-topic at +5) should have been taken into account. If you do so you will find that SI could have updated some time ago on some of what has been said in Holden's post.

Replies from: Gastogh

↑ comment by Gastogh · 2012-05-11T15:10:11.821Z · LW(p) · GW(p)

It would help if you could elaborate on what you mean by "informed".

Seconded. It seems to me like it's not even possible to mount properly informed criticism if many of the findings are just sitting unpublished somewhere. I'm hopeful that this is actually getting fixed sometime this year, but it doesn't seem fair to not release information and then criticize the critics for being uninformed.

↑ comment by thomblake · 2012-05-11T17:49:32.466Z · LW(p) · GW(p)

I'm not sure how much he's put into writing, but Ben Goertzel is surely informed. One might argue he comes to the wrong conclusions about AI danger, but it's not from not thinking about it.

↑ comment by private_messaging · 2012-05-17T08:14:49.311Z · LW(p) · GW(p)

It is so hard to find good critics.

If you don't have a good argument, you won't find good critics. (Unless you are as influential as religion. Then you can get a good critic simply because you stepped on that critic's foot. The critic probably isn't going to come to church to talk about it, though, and the ulterior motive (having had a foot stepped on) may lead you to classify them as a bad critic.)

Much of this has to do with the fact that SI hasn't been very clear about many of its positions and arguments

When you look through a matte glass, and you see some blurred text that looks like it has equations in it, and you are told that what you see is a fuzzy image of a proof that P!=NP (maybe you can make out the headers, which are in a bigger font, and those look like the kind of headers that a valid proof might have), do you assume that it is really a valid proof, and they only need to polish the glass? What if it is P=NP instead? What if it doesn't look like it has equations in it?

comment by RHollerith (rhollerith_dot_com) · 2012-05-11T04:04:57.425Z · LW(p) · GW(p)

I feel that [SI] ought to be able to get more impressive endorsements than it has.

SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments.

Holden, do you believe that charitable organizations should set out deliberately to impress donors and high-status potential endorsers? I would have thought that a donor like you would try to ignore the results of any attempts at that and to concentrate instead on how much the organization has actually improved the world because to do otherwise is to incentivize organizations whose real goal is to accumulate status and money for their own sake.

For example, Eliezer's attempts to teach rationality or "technical epistemology" or whatever you want to call it through online writings seem to me to have actually improved the world in a non-negligible way and seem to have been designed to do that rather than designed merely to impress.

ADDED. The above is probably not as clear as it should be, so let me say it in different words: I suspect it is a good idea for donors to ignore certain forms of evidence ("impressiveness", affiliation with high-status folk) of a charity's effectiveness to discourage charities from gaming donors in ways that seem to me already too common, and I was a little surprised to see that you do not seem to ignore those forms of evidence.

Replies from: rhollerith_dot_com, ModusPonies, faul_sname

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-11T18:36:47.285Z · LW(p) · GW(p)

In other words, I tend to think that people who make philanthropy their career and who have accumulated various impressive markers of their potential to improve the world are likely to continue to accumulate impressive markers, but are less likely to improve the world than people who have already actually improved the world.

And of the three core staff members of SI I have gotten to know, 2 (Eliezer and another one who probably does not want to be named) have already improved the world in non-negligible ways and the third spends less time accumulating credentials and impressiveness markers than almost anyone I know.

↑ comment by ModusPonies · 2012-05-12T06:11:56.969Z · LW(p) · GW(p)

I don't think Holden was looking for endorsements from "donors and high-status potential endorsers". I interpreted his post as looking for endorsements from experts on AI. The former would be evidence that SI could go on to raise money and impress people, and the latter would be evidence that SI's mission is theoretically sound. (The strength of that evidence is debatable, of course.) Given that, looking for endorsements from AI experts seems like it would be A) a good idea and B) consistent with the rest of GiveWell's methodology.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-12T08:06:00.034Z · LW(p) · GW(p)

Although I would have thought that Holden is smart enough to decide whether the FAI project is theoretically sound without his relying on AI experts, maybe I am underestimating the difficulties of people like Holden who are smarter than I am, but who didn't devote their college years to mastering computer science like I did.

Replies from: Strange7

↑ comment by Strange7 · 2013-03-22T09:54:36.165Z · LW(p) · GW(p)

I saw a related issue in a blog about a woman who lost the use of her arm due to an incorrectly treated infection. She initially complained that the judge in her disability case didn't even look at the arm, but then was pleasantly surprised to have the ruling turn out in her favor anyway.

I realized: of course the judge wouldn't look at her arm. Having done disability cases before, the judge should know that gruesome appearance correlates weakly, if at all, with legitimate disability, but the emotional response is likely to throw off evaluation of things like an actual doctor's report on the subject. Holden, similarly, is willing to admit that there are things about AI he personally doesn't know, but that professionals who have studied the field for decades do know, and is further willing to trust those professionals to be minimally competent.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2013-03-23T16:42:11.011Z · LW(p) · GW(p)

I have enough experience of legal and administrative disability hearings to say that each side always has medical experts on its side unless one side is unwilling or unable to pay for the testimony of at least one medical expert.

In almost all sufficiently important decisions, there are experts on both sides of the issue. And pointing out that one side has more experts or more impressive experts carries vastly less weight with me than, e.g., Eliezer's old "Knowability of FAI" article at http://sl4.org/wiki/KnowabilityOfFAI

↑ comment by faul_sname · 2012-05-11T23:07:53.418Z · LW(p) · GW(p)

Holden, do you believe that charitable organizations should set out deliberately to impress donors and high-status potential endorsers?

The obvious answer would be "Yes." Givewell only funneled about $5M last year, as compared to the $300,000M or so that Americans give on an annual basis. Most money still comes from people that base their decision on something other than efficiency, so targeting these people makes sense.

Replies from: JGWeissman

↑ comment by JGWeissman · 2012-05-11T23:16:15.153Z · LW(p) · GW(p)

The question was not if an individual charity, holding constant the behavior of other charities, benefits from "setting out deliberately to impress donors and high-status potential endorsers", but whether it is in Holden's interests (in making charities more effective) to generally encourage charities to do so.

Replies from: faul_sname

↑ comment by faul_sname · 2012-05-11T23:27:27.137Z · LW(p) · GW(p)

I think that making charities more effective is an instrumental goal, not a terminal goal. With a terminal goal of "more good stuff gets done", it would indeed be in Holden's interest to encourage charities to impress large donors or influential endorsers. In fact, Holden does that for the charities, an activity that appears to have a higher marginal impact than the actual activities of the charities per hour spent.

Replies from: JGWeissman

↑ comment by JGWeissman · 2012-05-11T23:45:09.945Z · LW(p) · GW(p)

You have argued that charities can get more donations by focusing on being more impressive, but you seem to be assuming that a charity focusing on being more impressive, with more money, will do more good than a charity focused on doing good, with less money. And that assumption is what rhollerith was questioning.

Replies from: None, faul_sname

↑ comment by [deleted] · 2012-05-12T02:36:52.687Z · LW(p) · GW(p)

GiveWell, I think, could be understood as an organization that seeks to narrow the gap for a charity between "seem more impressive to donors" and "show more convincing empirical evidence of effectiveness." That is, they want other donors to be more impressed by better (i.e. more accurate) signals of effectiveness and less by worse (i.e. less accurate) signals.

If GiveWell succeeds in this there are two effects:

1) More donor dollars go to charities that demonstrate themselves to be effective.

2) Charities themselves become more effective, for two major reasons. A) Not all charities rigorously self-evaluate at the moment; the incentive provided by a quorum of empirically-minded donors would help change that. B) Moreover, good donor criticism of charity effectiveness reports can alert a charity to methodological blind-spots in its own work. A negative review from GiveWell can help a charity not merely change its communications for the better (more effective in donor dollars obtained), but also change its actual activities for the better (more effective in goals achieved).

As I understand it, SIAI insiders agree only with Holden's critiques of SIAI's attempts to demonstrate its effectiveness to outside donors, and not with his estimates of SIAI's actual effectiveness (if they concurred in the latter, they'd quit SIAI now!). That said, I think SIAI should be open to the possibility that a donor-critic may have the potential to improve SIAI's actual effectiveness as well. SIAI's being forced to demonstrate its effectiveness to outsiders may lead to more constructive criticism and thus to more effective work. This constructive criticism could happen internally, if SIAI members preparing a report for knowledgeable outsiders like Holden are thereby forced to think like an outsider and thus see problems to which they had previously been blinded. It could also happen externally, if the knowledgeable outsider responds critically to the work presented.

↑ comment by faul_sname · 2012-05-12T03:48:12.367Z · LW(p) · GW(p)

What I'm saying (and the distinction is subtle) is that on the margin, the best thing the most impressive charities can do is increase their impressiveness. Something that Holden is doing for them, but even so he can only do so much.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-12T09:13:09.630Z · LW(p) · GW(p)

What you say might be true if the only way to do good was to get money from donors. But of course that is not true: a do-gooder can become a donor himself or if he is too poor to donate, he can devote his energies to becoming richer so that he can donate time or money in the future (which is in fact the course that most of the young people inspired by SI's mission are taking).

I am more comfortable speaking about individual altruists rather than charitable organizations. If an individual altruist can find a charity to employ him or find a patron to support his charitable work, then great! If not, then since money is an important resource, he should probably figure out how to get a supply of it. My point in this thread is that if the individual altruist is contemplating spending more than, oh, say 10% or 20% of his life force in becoming more impressive so that he can get a good job at a charity or can get more money from donors, then his plan is probably faulty and that he should instead plan to exchange goods and services he creates for money until money is no longer the constraining resource for his charitable goals.

(For individual altruists who live in countries where it is not as easy to exchange goods and services for money as it is in the English-speaking countries and who cannot emigrate to an English-speaking country, my figure of 10% to 20% might have to be adjusted upward.)

Individuals who make up SI are IMO already investing enough of their time and energy on impressing potential charitable employers, donors and endorsers, hence my request to Holden to clarify what he means when he says, "I feel that [SI] ought to be able to get more impressive endorsements than it has," and, "SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments."

Many more people would choose to have a paid position with SI than can be given a paid position with SI. What these people who wanted jobs at SI but did not get them usually do is earn as much money as possible with the goal of donating it to the cause. Many of these people are almost as qualified as the people who got jobs at SI. (Although they do not pay much, these are attractive jobs, e.g., because of the quality of the people one gets to spend one's workday with.) It would tend to have a demoralizing effect on those that did not get jobs at SI for the people who did get jobs at SI to spend a significant fraction of their resources consolidating their access to high-status contacts, endorsements, charitable jobs and donor money.

So, not all effort at impressing others is bad, but there is need for a balance.

Replies from: Jonathan_Graehl

↑ comment by Jonathan_Graehl · 2012-05-13T07:23:34.233Z · LW(p) · GW(p)

It would tend to have a demoralizing effect on those that did not get jobs at SI for the people who did get jobs at SI to spend a significant fraction of their resources consolidating their access to high-status contacts, endorsements, charitable jobs and donor money.

I agree with the above observation, but I don't see how this is an argument supporting your 10-20% limit on investment in seeming impressive. Do you project overall funding would decrease as a result of the legitimate early-donor let-down you describe, or is it more that you expect actual enthusiasm for the cause to wane as the 'charity overhead' factor worsens?

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-14T19:26:26.208Z · LW(p) · GW(p)

When I put on my donor hat, that is, when I imagine my becoming a significant donor, I tend in my imaginings and my plans to avoid anything that interferes with deriving warm fuzzies from the process of donating or planning to donate -- because when we say "warm fuzzies" we are referring to (a kind of) pleasure, and pleasure is the "gasoline" of the mind: it is certainly not the only thing that can "power" or "motivate" mental work, but it is IMHO the best fuel for work that needs to be sustained over a span of years. (And, yes, that is probably an argument against "Purchasing Fuzzies and Utilons Separately" in some situations although I did not have time today to re-read that article to see whether it can be reconciled with this comment.)

And, yeah, seeing money I donate (or simply imagining the money I will donate in the future) go to improving the lives of people who are probably not much better than me, but who spent a big fraction of their time and energy competing for status within the singularitarian community, jobs and donations with the likes of me, is one of the things that would probably interfere with my deriving warm fuzzies from the whole years-long and hopefully decades-long process of my becoming a significant donor.

Certainly I am not alone in this aspect of my psychology. Now I will grant that a philanthropist can get a lot of donations by ignoring people who react like I do (namely, react with resentment) to high levels of prestige-seeking and impression management. But I tend to believe that to a philanthropist, donors are like customers are to a consultancy or investors are to a fast-growing company: the quality of the thinking of one's donors (and in particular whether those donors got into donating out of a subconscious desire to affiliate with high-status folk) will tend to have a large effect on one's sanity and ability to reach one's goals.

And let me stress again that at present the level of prestige-seeking and impression management by insiders at SI is low enough not to cause my resentment to build up to levels that would cause me to start thinking about directing my donations elsewhere. But that might change if enough people with Holden-Karnofsky levels of credibility and influence exhort SI to increase their levels of prestige-seeking and impression management.

ADDED. The thing that is wrong with this comment and probably some of my other comments in this thread is that some of my remarks seem to be addressed to people seeking donations. If I were a better communicator, I would have made it clear that the target audience for my comments is donors. I am not worried about persuading people seeking donations because I am confident that if there were some barrier to my donating to, e.g., SI and FHI, I would manage to find other ways of purchasing utilons of comparable or almost-comparable efficiency.

One last thing I would say to donors and wanna-be donors is that this tendency towards resentment I have been describing in this comment (and the resulting inhibitory effect on my motivation) can be considered a feature (rather than a bug) of my personal psychology. In particular, it can be viewed as a form of pre-commitment to penalize (by withholding something I would otherwise be tempted to supply) certain behaviors which not only cause people like me to be overlooked and outcompeted for attractive jobs in charities, but also make the charitable world function less efficiently than it otherwise would through a dynamic similar to a tragedy of the commons.

And this tendency I detect in myself really does feel like a precommitment in the sense that (as is true of almost all human precommitments that operate through the emotions) I have no recollection or impression of having chosen it and in the sense that it would probably require the expenditure of a very great deal of mental resources on my part to act contrary to it.

Replies from: Jonathan_Graehl

↑ comment by Jonathan_Graehl · 2012-05-16T22:04:10.319Z · LW(p) · GW(p)

Wow. Coordination is hard ;)

Your explanation is more or less what I'd gathered from your earlier statement. It makes sense.

The org. that can convince passionate supporters of the cause to work for $ and donate may be different from the one that can get the most mainstream donations.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-18T00:49:31.945Z · LW(p) · GW(p)

Wow. Coordination is hard ;)

It is possible that this is just a phase I am going through, but if it is, it is a long phase.

Replies from: Jonathan_Graehl

↑ comment by Jonathan_Graehl · 2012-05-20T22:23:20.754Z · LW(p) · GW(p)

This conversation suggests a good habit to practice: being open about how and why I feel about something real, or would about something hypothetical. Since it's hard to separate internal openness from public openness, even though it's really the internal practice I want, maybe airing real motivations/desires more often (as you just did) is better than my conservative semi-stoic default.

comment by Wei Dai (Wei_Dai) · 2012-05-10T22:44:59.765Z · LW(p) · GW(p)

I agree with much of this post, but find a disconnect between the specific criticisms and the overall conclusion of withholding funds from SI even for "donors determined to donate within this cause", and even aside from whether SI's FAI approach increases risk. I see a couple of ways in which the conclusion might hold.

1) SI is doing worse than they are capable of, due to wrong beliefs. Withholding funds provides incentive for them to do what you think is right, without having to change their beliefs. But this could lead to waste if people disagree in different directions, and funds end up sitting unused because SI can't satisfy everyone, or if SI thinks the benefit of doing what they think is optimal is greater than the value of extra funds they could get from doing what you think is best.

2) A more capable organization already exists or will come up later and provide a better use of your money. This seems unlikely in the near future, given that we're already familiar with the "major players" in the existential risk area and based on past history, it doesn't seem likely that a new group of highly capable people would suddenly get interested in the cause. In the longer run, it's likely that many more people will be attracted to work in this area as time goes on and the threat of a bad-by-default Singularity becomes more obvious, but those people have the disadvantage of having less time for their work to take effect (which reduces the average value of donations), and there will probably also be many more willing donors than at this time (which reduces the marginal value of donations).

So neither of these ways to fill in the missing part of the argument seems very strong. I'd be interested to know what Holden's own thoughts are, or if anyone else can make stronger arguments on his behalf.

Replies from: TheOtherDave, Bugmaster

↑ comment by TheOtherDave · 2012-05-10T23:18:03.549Z · LW(p) · GW(p)

If Holden believes that:
A) reducing existential risk is valuable, and
B) SI's effectiveness at reducing existential risk is a significant contributor to the future of existential risk, and
C) SI is being less effective at reducing existential risk than they would be if they fixed some set of problems P, and
D) withholding GiveWell's endorsement while pre-committing to re-evaluating that refusal if given evidence that P has been fixed increases the chances that SI will fix P...

...it seems to me that Holden should withhold GiveWell's endorsement while pre-committing to re-evaluating that refusal if given evidence that P has been fixed.

Which seems to be what he's doing. (Of course, I don't know whether those are his reasons.)

What, on your view, ought he do instead, if he believes those things?

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-11T00:36:02.306Z · LW(p) · GW(p)

Holden must believe some additional relevant statements, because A-D (with "existential risk" suitably replaced) could be applied to every other charity, as presumably no charity is perfect.

I guess what I most want to know is what Holden thinks are the reasons SI hasn't already fixed the problems P. If it's lack of resources or lack of competence, then "withholding ... while pre-committing ..." isn't going to help. If it's wrong beliefs, then arguing seems better than "incentivizing", since that provides a permanent instead of temporary solution, and in the course of arguing you might find out that you're wrong yourself. What does Holden believe that causes him to think that providing explicit incentives to SI is a good thing to do?

Replies from: ciphergoth, dspeyer, TheOtherDave

↑ comment by Paul Crowley (ciphergoth) · 2012-05-11T06:44:03.092Z · LW(p) · GW(p)

Thanks for making this argument!

AFAICT charities generally have perverse incentives - to do what will bring in donations, rather than what will do the most good. That can usually argue against things like transparency, for example. So I think when Holden usually says "don't donate to X yet" it's as part of an effort to make these incentives saner.

As it happens, I don't think this problem applies especially strongly to SI, but others may differ.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-11T08:47:57.028Z · LW(p) · GW(p)

Relevant

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T19:38:16.629Z · LW(p) · GW(p)

That is indeed relevant, in that it describes some perverse incentives and weird behaviors of nonprofits, with an interesting example. But knowing this context without having to click the link would have been useful. It is customary to explain what a link is about rather than just drop it.

(Or at least it should be)

↑ comment by dspeyer · 2012-05-11T02:53:55.844Z · LW(p) · GW(p)

But C applies more to some charities than others. And evaluating how much of a charity's potential effectiveness is lost to internal flaws is a big piece of what GiveWell does.

↑ comment by TheOtherDave · 2012-05-11T01:57:19.856Z · LW(p) · GW(p)

Absolutely agreed that if D is false -- for example, if increasing SI's incentive to fix P doesn't in fact increase SI's chances of fixing P, or if a withholding+precommitting strategy doesn't in fact increase SI's incentive to fix P, or some other reason -- then the strategy I describe makes no sense.

↑ comment by Bugmaster · 2012-05-10T23:04:10.642Z · LW(p) · GW(p)

Holden said,

However, I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" to be a good reason to support Organization Y.

This addresses your point (2). Holden believes that SI is grossly inefficient at best, and actively harmful at worst (since he thinks that they might inadvertently increase AI risk). Therefore, giving money to SI would be counterproductive, and a donor would get a better return on investment in other places.

As for point (1), my impression is that Holden's low estimate of SI's competence is due to a combination of what he sees as wrong beliefs, as well as an insufficient capability to put even the correct beliefs into practice. SI claims to be supremely rational, but their list of achievements is lackluster at best -- which suggests a certain amount of the Dunning-Kruger effect at work. Furthermore, SI appears to be focused on growing SI and teaching rationality workshops, as opposed to their stated mission of researching FAI theory.

Additionally, Holden indicted SI members pretty strongly (though very politely) for what I will (in a less polite fashion) label as arrogance. The prevailing attitude of SI members seems to be (according to Holden) that the rest of the world is just too irrational to comprehend their brilliant insights, and therefore the rest of the world has little to offer -- and therefore, any criticism of SI's goals or actions can be dismissed out of hand.

EDIT: found the right quote, duh.

comment by Wei Dai (Wei_Dai) · 2012-05-12T19:35:37.352Z · LW(p) · GW(p)

Some comments on objections 1 and 2.

For example, when the comment says "the formalization of the notion of 'safety' used by the proof is wrong," it is not clear whether it means that the values the programmers have in mind are not correctly implemented by the formalization, or whether it means they are correctly implemented but are themselves catastrophic in a way that hasn't been anticipated.

Both (with the caveat that SI's plans are to implement an extrapolation procedure for the values, and not the values themselves).

Another way of putting this is that a "tool" has an underlying instruction set that conceptually looks like: "(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc."
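As a rough illustration of the instruction set quoted above, here is a minimal Python sketch. The function names (`tool_mode`, `agent_mode`), the `score` function standing in for parameter P, and the `execute` callback are all hypothetical and chosen only for illustration. The agent variant is included purely for contrast, on the assumption that an agent differs by carrying out the top-ranked action itself instead of reporting it.

```python
from typing import Callable, Dict, List, Sequence


def tool_mode(actions: Sequence[str], score: Callable[[str], float]) -> Dict[str, object]:
    """Rank candidate actions by estimated P and return a report.

    Nothing is executed; the human reads the report and decides.
    Assumes `actions` is non-empty.
    """
    ranked: List[str] = sorted(actions, key=score, reverse=True)
    return {
        "best_action": ranked[0],                     # what Action A is
        "alternatives": ranked[1:3],                  # other high-P actions
        "scores": {a: score(a) for a in ranked[:3]},  # estimated P for each
    }


def agent_mode(
    actions: Sequence[str],
    score: Callable[[str], float],
    execute: Callable[[str], None],
) -> str:
    """Pick the action with the highest estimated P and carry it out directly."""
    best = max(actions, key=score)
    execute(best)  # side effect in the world, with no human in the loop
    return best


if __name__ == "__main__":
    # Hypothetical estimates of P for three made-up actions.
    estimated_p = {"wait": 0.1, "plan_a": 0.7, "plan_b": 0.9}
    print(tool_mode(list(estimated_p), estimated_p.get))
```

In this toy version the only difference between the two modes is the single `execute(best)` call, which is one way to picture the later point that a tool can be easily converted into an agent.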

I think such a Tool-AI will be much less powerful than an equivalent Agent-AI, due to the bottleneck of having to summarize its calculations in a human-readable form, and then waiting for the human to read and understand the summary and then make a decision. It's not even clear that the huge amounts of calculations that a Tool-AI might do in order to find optimal actions can be summarized in any useful way, or that this process of summarization can be feasibly developed before others create Agent-AIs. (Edit: See further explanation of this problem here.) Of course you do implicitly acknowledge this:

Some have argued to me that humans are likely to choose to create agent-AGI, in order to quickly gain power and outrace other teams working on AGI. But this argument, even if accepted, has very different implications from SI's view. [...] It seems that the appropriate measures for preventing such a risk are security measures aiming to stop humans from launching unsafe agent-AIs, rather than developing theories or raising awareness of "Friendliness."

I do accept this argument (and have made similar arguments), except that I advocate trying to convince AGI researchers to slow down development of all types of AGI (including Tool-AI, which can be easily converted into Agent-AI), and don't think "security measures" are of much help without a world government that implements a police state to monitor what goes on in every computer. Convincing AGI researchers to slow down is also pointless without a simultaneous program to create a positive Singularity via other means. I've written more about my ideas here, here, and here.

Replies from: Will_Newsome, private_messaging

↑ comment by Will_Newsome · 2012-05-26T21:43:50.505Z · LW(p) · GW(p)

Both (with the caveat that SI's plans are to implement an extrapolation procedure for the values, and not the values themselves).

(Responding to hypothetical-SingInst's position:) It seems way too first-approximation-y to talk about values-about-extrapolation as anything other than just a subset of values—and if you look at human behavior, values about extrapolation vary very much and are very tied into object-level values. (Simply consider hyperbolic discounting! And consider how taking something as basic as coherence/consistency to its logical extreme leads to either a very stretched ethics or a more fitting but very different meta-ethics like theism.) Even if it were possible to formalize such a procedure it would still be fake meta. "No: at all costs, it is to be prayed by all men that Shams may cease."

↑ comment by private_messaging · 2012-05-13T22:33:41.559Z · LW(p) · GW(p)

Is a compiler an agent by your definition? We don't usually read its output. And it may try to improve runtime performance. It differs from agents in one fundamental way, however: the compiler has no built-in value for the code actually being run.

comment by jimrandomh · 2012-05-10T19:26:29.603Z · LW(p) · GW(p)

I don't work for SI and this is not an SI-authorized response, unless SI endorses it later. This comment is based on my own understanding based on conversations with and publications of SI members and general world model, and does not necessarily reflect the views or activities of SI.

The first thing I notice is that your interpretation of SI's goals with respect to AGI is narrower than the impression I had gotten, based on conversations with SI members. In particular, I don't think SI's research is limited to trying to make AGI friendliness provable, but on a variety of different safety strategies, and on the relative win-rates of different technological paths, eg brain uploading vs. de-novo AI, classes of utility functions and their relative risks, and so on. There is also a distinction between "FAI theory" and "AGI theory" that you aren't making; the idea, as I see it, is that to the extent to which these are separable, "FAI theory" covers research into safety mechanisms which reduce the probability of disaster if any AGI is created, while "AGI theory" covers research that brings the creation of any AGI closer. Your first objection - that a maximizing FAI would be very dangerous - seems to be based on a belief, first, that SI is researching a narrower class of safety mechanisms than it really is, and second, that SI researches AGI theory, which I believe it explicitly does not.

You seem a bit sore that SI hasn't talked about your notion of Tool-AI, but I'm a bit confused by this, since it's the first time I've heard that term used, and your link is to an email thread which, unless I'm missing something, was not disseminated publicly or through SI in general. A conversation about tool-based AI is well worth having; my current perspective is that it looks like it interacts with the inevitability argument and the overall AI power curve in such a way that it's still very dangerous, and that it amounts to a slightly different spin on Oracle AI, but this would be a complicated discussion. But bringing it up effectively for the first time, in the middle of a multi-pronged attack on SI's credibility, seems really unfair. While there may have been a significant communications failure in there, a cursory reading suggests to me that your question never made it to the right person.

The claim that SI will perform better if they don't get funding seems very strange. My model is that it would force their current employees to leave and spend their time on unrelated paid work instead, which doesn't seem like an improvement. I get the impression that your views of SI's achievements may be getting measured against a metric of achievements-per-organization, rather than achievements-per-dollar; in absolute budget terms, SI is tiny. But they've still had a huge memetic influence, difficult as that is to measure.

All that said, I applaud your decision to post your objections and read the responses. This sort of dialogue is a good way to reach true beliefs, and I look forward to reading more of it from all sides.

Replies from: steven0461, Rain

↑ comment by steven0461 · 2012-05-10T20:12:28.906Z · LW(p) · GW(p)

In particular, I don't think SI's research is limited to trying to make AGI friendliness provable, but on a variety of different safety strategies, and on the relative win-rates of different technological paths, eg brain uploading vs. de-novo AI, classes of utility functions and their relative risks, and so on.

I agree, and would like to note the possibility, for those who suspect FAI research is useless or harmful, of earmarking SI donations to research on different safety strategies, or on aspects of AI risk that are useful to understand regardless of strategy.

Replies from: rocurley

↑ comment by rocurley · 2012-05-10T22:55:19.755Z · LW(p) · GW(p)

This likely won't work. Money is fungible, so unless the total donations so earmarked exceed the planned SI funding for that cause, they won't have to change anything. They're under no obligation to not defund your favorite cause by exactly the amount you donated, thus laundering your donation into the general fund. (Unless I misunderstand the relevant laws?)

EDIT NOTE: The post used to say vast majority; this was changed, but is referenced below.

Replies from: dlthomas, steven0461

↑ comment by dlthomas · 2012-05-10T23:03:45.655Z · LW(p) · GW(p)

You have an important point here, but I'm not sure it gets up to "vast majority" before it becomes relevant.

Earmarking $K for X has an effect once $K exceeds the amount of money that would have been spent on X if the $K had not been earmarked. The size of the effect still certainly depends on the difference, and may very well not be large.
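To make the threshold concrete, here is a small Python sketch. It assumes the worst case raised above, namely that the organization offsets every earmarked dollar by withdrawing a general-fund dollar from X wherever it can; the function names and dollar figures are hypothetical.

```python
def spending_on_x(planned_x: int, earmarked_x: int) -> int:
    """Final spending on X when earmarks are fully offset from general funds.

    Earmarked money must go to X, but the organization can pull back its own
    planned spending on X, so the total only rises once the earmarks exceed
    what was planned anyway.
    """
    return max(planned_x, earmarked_x)


def earmark_effect(planned_x: int, earmarked_x: int) -> int:
    """Extra dollars that actually reach X because of the earmark."""
    return spending_on_x(planned_x, earmarked_x) - planned_x


if __name__ == "__main__":
    # Hypothetical figures: $100,000 was already planned for X.
    print(earmark_effect(100_000, 30_000))   # earmarks below the plan
    print(earmark_effect(100_000, 130_000))  # earmarks above the plan
```

Under this assumption a $30,000 earmark changes nothing, while a $130,000 earmark moves only the $30,000 difference, which matches the point that the size of the effect depends on the difference and may not be large.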

↑ comment by steven0461 · 2012-05-10T23:02:48.070Z · LW(p) · GW(p)

Suppose you earmark to a paper on a topic X that SI would otherwise probably not write a paper on. Would that cause SI to take money out of research on topics similar to X and into FAI research? There would probably be some sort of (expected) effect in that direction, but I think the size of the effect depends on the details of what causes SI's allocation of resources, and I think the effect would be substantially smaller than would be necessary to make an earmarked donation equivalent to a non-earmarked donation. Still, you're right to bring it up.

↑ comment by Rain · 2012-05-10T20:47:21.890Z · LW(p) · GW(p)

Some recent discussion of AIs as tools.

comment by jacob_cannell · 2012-05-15T09:23:08.887Z · LW(p) · GW(p)

I'm glad for this; LessWrong can always use more engaging critiques of substance. I partially agree with Holden's conclusions, although I reach them by a substantially different route. I'm a little surprised, then, that few of the replies have directly engaged what I find to be the more obvious flaws in Holden's argument: namely objection 2, and the inherent contradictions between it and objection 1.

Holden posits that many (most?) well-known current AI applications more or less operate as sophisticated knowledge bases. His tool/agent distinction draws a boundary around AI tools: systems whose only external actions consist of communicating results to humans, and the rest being agents which actually plan and execute actions with external side effects. Holden distinguishes 'tool' AI from Oracle AI, the latter really being agent AI (designed for autonomy) which is trapped in some sort of box. Accepting Holden's terminology and tool/agent distinction, he then asserts:

That 'tool' AGI already is and will continue to be the dominant type of AI system.
That AGI running in tool mode will "be extraordinarily useful but far more safe than an AGI running in agent mode,"

I can accept that any AGI running in 'tool' mode will be far safer than an AGI running in agent mode (although perhaps still not completely safe), but I believe Holden critically overestimates the domain and potential of 'tool' AGI, given his distinction.

It is true that many well-known current AI systems operate as sophisticated knowledge tools, rather than agents. Search engines such as Google are the first example Holden lists, but I haven't heard many people refer to search engines as AGIs.

In fact, having the capability to act in the world and learn from the history of such interactions is a crucial component of many AGI architectures, and perhaps all with the potential for human-level general intelligence. One could certainly remove the AGI's capacity for action at a later date: in Holden's terminology this would be switching the AGI from agent mode to tool mode. If we were using more everyday terminology we might as well call this paralyzing the AGI.

Yes, switching an existing agent AGI into 'tool' mode (paralyzing it) certainly solves most safety issues regarding that particular agent, but this is far from a global panacea. Being less charitable, I would say it adds little of substance to the discussions of AI existential risk. It's much like saying "but we can simply disable the nukes!" (And it's potentially even less effective than the analogy implies, because superpowerful unsafe agent AIs may not be so easy to 'switch' into 'tool' mode, to put it mildly.)

After Google, Holden's next examples of primarily 'tool' mode AI are Siri and Watson. Siri is actually an agent in Holden's terminology: it can execute some web tasks in its limited set of domains. This may be a small percentage of its current usage, but I don't expect that to hold true for its future descendants.

What Holden fails to mention are any relevant examples of all the current agent AI systems we already have today, and what tomorrow may bring.

The world of financial trading is already dominated by AI agents, and this particular history is most telling. Decades ago, when computers were very weak, they were used as simple tools to evaluate financial models which in turn were just a component of a human agent's overall trading strategy. As computers grew in power and became integrally connected in financial networks, computers began to take on more and more book-keeping actions, and eventually people started using computers to execute entire simple trading strategies on their own (albeit with much hands-on supervision). Fast forward to 2012 and we now have the majority of trades executed by automated and increasingly autonomous trading systems. They are still under the supervision of human caretakers, but as they grow in complexity this increasingly becomes a nominal role.

There is a vast profitable micro-realm where these agents trade on lightning-fast millisecond timescales; an economic niche that humans literally cannot enter: it's an alien environment, and we have been engineering and evolving alien agents to inhabit and exploit it for us.

Someone with only basic familiarity with software development may imagine that software is something that humans design, write and build according to their specifications. That is only partially true, more so for smaller projects.

The other truth, perhaps a deeper wisdom, is that large software systems evolve. Huge software systems are too massive to be designed or even fully understood by individual human minds, so their development follows a trajectory that is perhaps better understood in evolutionary terms. This is already the cause of much concern in the context of operating systems and security, which itself is only a small taste of the safety issues in a future world dominated by large, complex evolved agent systems.

It is true that there are many application domains where agents have had little impact as of yet, but this just shows us the niches that agents will eventually occupy.

At the end of the day, we need only compare the ultimate economic value of a tool vs an agent. What fraction of current human occupations can be considered 'tool' jobs vs 'agent' jobs? Agents which drive cars or execute financial trades are just the beginning; the big opportunities are in agents which autonomously build software systems, design aircraft, perform biology research, and so on. Systems such as Siri and Watson today mainly function as knowledge tools, but we can expect that their descendants will eventually occupy a broad range of human jobs, most of which involve varying degrees of autonomous agent behavior.

Consider the current domain of software development. What does a tool-mode AGI really add here in a world that already has Google and the web? A tool-mode AGI could take a vague design and set of business constraints and output a detailed design or perhaps even an entire program, but that program would still need to be tested, and you might as well automate that. And most large software systems consist of ecologies of interacting programs: web crawlers, databases, planning/logistic systems, and so on, where most of the 'actions' are already automated rather than assigned to humans.

As another example consider the 'lights-out' automated factory. The foundries that produce microchips are becoming increasingly autonomous systems, as is the front side design industry. If we extrapolate that to the future ...

The IBM of tomorrow may well consist of a small lucky pool of human stockowners reaping the benefits of a vast army of Watson's future descendants who have gone on to replace all the expensive underperforming human employees of our time. International Business Machines, indeed: a world where everything from the low level hardware and foundry machinery up to the software and even business strategy is designed and built by complex ecologies of autonomous software agents. That seems to be not only where we are heading, but to some limited degree, where we already are.

Thus I find it highly unlikely that tool mode AI is and will be the dominant paradigm, as Holden asserts. Moreover, his argument really depends on tool mode being dominant by a significant margin. If agent AI makes up even only 5% of the market at some future date, it still could contribute an unacceptable majority of risk.
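The arithmetic behind that last claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions: total risk is taken to be market share times per-system risk, tool systems are given a baseline risk of 1, and the relative risk figures for agents are hypothetical numbers, not estimates from anyone in this discussion.

```python
def agent_risk_share(agent_market_share: float, relative_risk: float) -> float:
    """Fraction of total risk that comes from agent systems.

    `relative_risk` is how many times riskier one agent system is than one
    tool system (tool risk is normalized to 1.0).
    """
    agent_risk = agent_market_share * relative_risk
    tool_risk = (1.0 - agent_market_share) * 1.0
    return agent_risk / (agent_risk + tool_risk)


if __name__ == "__main__":
    # With a 5% market share, agents carry more than half of the total risk
    # as soon as each agent is more than 19 times riskier than a tool
    # (0.05 * r > 0.95 exactly when r > 19).
    for r in (10, 19, 100):
        print(r, round(agent_risk_share(0.05, r), 3))
```

So at a 5% share the break-even point is a per-system risk ratio of 19; if agents are far riskier than that, the small market share does little to limit their contribution.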

comment by amcknight · 2012-05-15T21:00:51.105Z · LW(p) · GW(p)

Holden does a great job but makes two major errors:
1) His argument about Tool-AI is irrelevant, because creating Tool-AI does almost nothing to avoid Agent-AI, which he agrees is dangerous.
2) He too narrowly construes SI's goals by assuming they are only working on Friendly AI rather than AGI x-risk reduction in general.

comment by PhilGoetz · 2012-05-15T00:23:41.917Z · LW(p) · GW(p)

I'm very impressed by Holden's thoroughness and thoughtfulness. What I'd like to know is why his post is Eliezer-endorsed and has 191 up-votes, while my many posts over the years hammering on Objection 1, and my comments raising Objection 2, have never gotten the green button, been frequently down-voted, and never been responded to by SIAI. Do you have to be outside the community to be taken seriously by it?

Replies from: metaphysicist, Rain, Nick_Beckstead, John_Maxwell_IV, ghf

↑ comment by metaphysicist · 2012-05-15T00:35:46.517Z · LW(p) · GW(p)

Not to be cynical, PhilGoetz, but isn't Holden an important player in the rational-charity movement? Wouldn't the ultimate costs of ignoring Holden be prohibitive?

Replies from: PhilGoetz, Rain

↑ comment by PhilGoetz · 2012-05-16T02:25:05.353Z · LW(p) · GW(p)

That could explain the green dot. I don't know which explanation is more depressing.

↑ comment by Rain · 2012-05-15T00:46:40.718Z · LW(p) · GW(p)

You are absolutely correct. And, that's not the reason I find it engaging or informative.

↑ comment by Rain · 2012-05-15T00:42:08.623Z · LW(p) · GW(p)

I thought most of the stuff in Holden's post had been public knowledge for years, even to the point of being included in previous FAQs produced by SI. The main difference is that the presentation and solidity of it in this article are remarkable - interconnecting so many different threads which, when placed as individual sentences or paragraphs, might hang alone, but when woven together with the proper knots form a powerful net.

↑ comment by Nick_Beckstead · 2012-05-15T02:23:47.805Z · LW(p) · GW(p)

I would be interested to see if you could link to posts where you made versions of these objections.

Replies from: PhilGoetz

↑ comment by PhilGoetz · 2012-05-18T00:48:49.340Z · LW(p) · GW(p)

Okay.

↑ comment by John_Maxwell (John_Maxwell_IV) · 2012-05-15T00:41:12.323Z · LW(p) · GW(p)

Assuming what you say is true, it looks to me as though SI is paying the cost of ignoring its critics for so many years...

↑ comment by ghf · 2012-05-15T01:36:05.410Z · LW(p) · GW(p)

I think some of it comes down to the range of arguments offered. For example, posted alone, I would not have found Objection 2 particularly compelling, but I was impressed by many other points and in particular the discussion of organizational capacity. I'm sure there are others for whom those evaluations were completely reversed. Nonetheless, we all voted it up. Many of us who did so likely agree with one another less than we do with SIAI, but that has only shown up here and there on this thread.

Critically, it was all presented, not in the context of an inside argument, but in the context of "is SI an effective organization in terms of its stated goals." The question posed to each of us was: do you believe in SI's mission and, if so, do you think that donating to SI is an effective way to achieve that goal? It is a wonderful instantiation of the standard test of belief, "how much are you willing to bet on it?"

comment by jedharris · 2012-05-10T20:19:47.996Z · LW(p) · GW(p)

Karnofsky's focus on "tool AI" is useful, but his statement of it may confuse matters and needs refinement. I don't think the distinction between "tool AI" and "agent AI" is sharp, or in quite the right place.

For example, the sort of robot cars we will probably have in a few years are clearly agents-- you tell them to "come here and take me there" and they do it without further intervention on your part (when everything is working as planned). This is useful in a way that any amount and quality of question answering is not. Almost certainly there will be various flavors of robot cars available and people will choose the ones they like (that don't drive in scary ways, that get them where they want to go even if it isn't well specified, that know when to make conversation and when to be quiet, etc.) As long as robot cars just drive themselves and people around, can't modify the world autonomously to make their performance better, and are subject to continuing selection by their human users, they don't seem to be much of a threat.

The key points here seem to be (1) limited scope, (2) embedding in a network of other actors and (3) humans in the loop as evaluators. We could say these define "tool AIs" or come up with another term. But either way the antonym doesn't seem to be "agent AIs" but maybe something like "autonomous AIs" or "independent AIs" -- AIs with the power to act independently over a very broad range, unchecked by embedding in a network of other actors or by human evaluation.

Framed this way, we can ask "Why would independent AIs exist?" If the reason is mad scientists, an arms race, or something similar then Karnofsky has a very strong argument that any study of friendliness is beside the point. Outside these scenarios, the argument that we are likely to create independent AIs with any significant power seems weak; Karnofsky's survey more or less matches my own less methodical findings. I'd be interested in strong arguments if they exist.

Given this analysis, there seem to be two implications:

We shouldn't build independent AIs, and should organize to prevent their development if they seem likely.
We should thoroughly understand the likely future evolution of a patchwork of diverse tool AIs, to see where dangers arise.

For better or worse, neither of these lends itself to tidy analytical answers, though analytical work would be useful for both. But they are very much susceptible to investigation, proposals, evangelism, etc.

These do lend themselves to collaboration with existing AI efforts. To the extent they perceive a significant risk of development of independent AIs in the foreseeable future, AI researchers will want to avoid that. I'm doubtful this is an active risk but could easily be convinced by evidence -- not just abstract arguments -- and I'm fairly sure they feel the same way.

Understanding the long term evolution of a patchwork of diverse tool AIs should interest just about all major AI developers, AI project funders, and long term planners who will be affected (which is just about all of them). Short term bias and ceteris paribus bias will lead to lots of these folks not engaging with the issue, but I think it will seem relevant to an increasing number as the hits keep coming.

Replies from: RomeoStevens, brazil84

↑ comment by RomeoStevens · 2012-05-10T20:32:53.065Z · LW(p) · GW(p)

any amount and quality of question answering is not.

"how do I build an automated car?"

Replies from: Hul-Gil

↑ comment by Hul-Gil · 2012-05-11T03:44:40.532Z · LW(p) · GW(p)

That doesn't help you if you need a car to take you someplace in the next hour or so, though. I think jed's point is that sometimes it is useful for an AI to take action rather than merely provide information.

↑ comment by brazil84 · 2014-07-17T13:19:28.773Z · LW(p) · GW(p)

For example, the sort of robot cars we will probably have in a few years are clearly agents-- you tell them to "come here and take me there" and they do it without further intervention on your part (when everything is working as planned). This is useful in a way that any amount and quality of question answering is not.

Yes, I agree. Evidently, the environment cars work in is too fast-paced and quickly changing for "tool AI" to be close in usefulness to "agent AI." To drive safely and effectively, you need to be making and implementing decisions on the time frame of a split second.

At the same time, the lesson to be learned is that useful AI can have a utility function which is pretty mundane -- e.g. "find a fast route from point A to point B while minimizing the chances of running off the road or running into any people or objects."

Similarly, instead of telling AI to "improve human welfare" we can tell it to do things like "find ways to kill cancerous cells while keeping collateral damage to a minimum." The higher level decisions about improving human welfare can be left to the traditional institutions - legislatures, courts, and individual autonomy.

Replies from: None

↑ comment by [deleted] · 2014-07-17T13:31:55.488Z · LW(p) · GW(p)

At the same time, the lesson to be learned is that useful AI can have a utility function which is pretty mundane -- e.g. "find a fast route from point A to point B while minimizing the chances of running off the road or running into any people or objects."

Self-driving cars aren't piloted by AGIs in the first place, let alone dangerous "world-optimization" AGIs.

Similarly, instead of telling AI to "improve human welfare" we can tell it to do things like "find ways to kill cancerous cells while keeping collateral damage to a minimum." The higher level decisions about improving human welfare can be left to the traditional institutions - legislatures, courts, and individual autonomy.

The whole point of Friendly AI is that we want something which is more effective at improving human welfare than our existing institutions. Our existing institutions are, by FAI standards, Unfriendly and destructive. Not existentially destructive, this is true (except on rare occasions like World War II), but neither are they trustworthy when handed, for instance, power over the life-and-death of Earth's ecosystem (which they are currently failing to save, despite our having no other planet to go to).

Replies from: brazil84

↑ comment by brazil84 · 2014-07-19T05:47:02.626Z · LW(p) · GW(p)

[ . . . ]

I don't engage with this poster because of his past dishonesty, i.e. misrepresenting my posts. If anyone not on my *(&^%-list is curious, I am happy to provide references.

Replies from: wedrifid

↑ comment by wedrifid · 2014-07-19T11:26:49.232Z · LW(p) · GW(p)

I don't engage with this poster because of his past dishonesty, i.e. misrepresenting my posts. If anyone not on my *(&^%-list is curious, I am happy to provide references.

I applaud your decision to not engage (as a good general strategy given your state of belief---the specifics of the conflict do not matter). I find it usually works best to do so without announcing it. Or, at least, by announcing it sparingly with extreme care to minimize the appearance of sniping.

comment by NancyLebovitz · 2012-05-11T16:50:00.213Z · LW(p) · GW(p)

I'd brought up a version of the tool/agent distinction, and was told firmly that people aren't smart or fast enough to direct an AI. (Sorry, this is from memory-- I don't have the foggiest how to do an efficient search to find that exchange.)

I'm not sure that's a complete answer-- how possible is it to augment a human towards being able to manage an AI? On the other hand, a human like that isn't going to be much like humans 1.0, so problems of Friendliness are still in play.

Perhaps what's needed is building akrasia into the world-- a resistance to sudden change. This has its own risks, but sudden existential threats are rare. [1]

At this point, I think the work on teaching rationality is more reliably important than the work on FAI. FAI involves some long inferential chains. The idea that people could improve their lives a lot by thinking more carefully about what they're doing and acting on those thoughts (with willingness to take feedback) is a much more plausible idea, even if you factor in the idea that rationality can be taught.

[1] Good enough for fiction-- we're already living in a world like that. We call the built-in akrasia Murphy.

Replies from: TheOtherDave, XiXiDu

↑ comment by TheOtherDave · 2012-05-11T18:05:03.786Z · LW(p) · GW(p)

You may be thinking of this exchange, which I found only because I remembered having been involved in it.

I continue to think that "tool" is a bad term to use here, because people's understanding of what it refers to varies so relevantly.

As for what is valuable work... hm.

I think teaching people to reason in truth-preserving and value-preserving ways is worth doing.
I think formalizing a decision theory that captures universal human intuitions about what the right thing to do in various situations is worth doing.
I think formalizing a decision theory that captures non-universal but extant "right thing" intuitions is potentially worth doing, but requires a lot of auxiliary work to actually be worth doing.
I think formalizing a decision theory that arrives at judgments about the right thing to do in various situations where those judgments are counterintuitive for most/all humans but reliably lead, if implemented, to results that those same humans reliably endorse more than the results of their intuitive judgments is worth doing.
I think building systems that can solve real-world problems efficiently is worth doing, all else being equal, though I agree that powerful tools frequently have unexpected consequences that create worse problems than they solve, in which case it's not worth doing.
I think designing frameworks within which problem-solving systems can be built, such that the chances of unexpected negative consequences are lower inside that framework than outside of it, is worth doing.

I don't find it likely that SI is actually doing any of those things particularly more effectively than other organizations.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2012-05-11T18:59:24.284Z · LW(p) · GW(p)

Thanks for the link-- that was what I was thinking of.

Do you have other organizations which teach rationality in mind? Offhand, the only thing I can think of is cognitive behavioral therapy, and it's not exactly an organization.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-11T20:40:01.823Z · LW(p) · GW(p)

No, I don't have anything specific in mind.

↑ comment by XiXiDu · 2012-05-11T17:59:04.362Z · LW(p) · GW(p)

I'd brought up a version of the tool/agent distinction, and was told firmly that people aren't smart or fast enough to direct an AI.

:-)

Replies from: FiftyTwo

↑ comment by FiftyTwo · 2012-05-14T00:35:33.359Z · LW(p) · GW(p)

The article is interesting, but I'm not sure it is relevant as the humans involved weren't directing or monitoring the overall process, just taking part in it. Analogously even if an AGI requires my assistance/authorization to do certain things, that doesn't give me any control over it unless I understand the consequences.

Also general warning against 'generalising from fictional evidence.'

comment by kip1981 · 2012-05-11T05:49:51.537Z · LW(p) · GW(p)

My biggest criticism of SI is that I cannot decide between:

A. promoting AI and FAI issues awareness will decrease the chance of UFAI catastrophe; or
B. promoting AI and FAI issues awareness will increase the chance of UFAI catastrophe

This criticism seems distinct from the ones that Holden makes. But it is my primary concern. (Perhaps the closest example is Holden's analogy that SI is trying to develop Facebook before the Internet).

A seems intuitive. Basically everyone associated with SI assumes that A is true, as far as I can tell. But A is not obviously true to me. It seems to me at least plausible that:

A1. promoting AI and FAI issues will get lots of scattered groups around the world more interested in creating AGI
A2. one of these groups will develop AGI faster than otherwise due to A1
A3. the world will be at greater risk of UFAI catastrophe than otherwise due to A2 (i.e. the group creates AGI faster than otherwise, and fails at FAI)

More simply: SI's general efforts, albeit well intended, might accelerate the creation of AGI, and the acceleration of AGI might decrease the odds of the first AGI being friendly. This is one path by which B, not A, would be true.

SI might reply that, although it promotes AGI, it very specifically limits its promotion to FAI. Although that is SI's intention, it is not at all clear that promoting FAI will not have the unintended consequence of accelerating UFAI. By analogy, if a responsible older brother goes around promoting gun safety all the time, the little brother might be more likely to accidentally blow his face off, than if the older brother had just kept his mouth shut. Maybe the older brother shouldn't have kept his mouth shut, maybe he should have... it's not clear either way.

If B is more true than A, the best thing that SI could do would probably be develop clandestine missions to assassinate people who try to develop AGI. SI does almost the exact opposite.

SI's efforts are based on the assumption that A is true. But it's far from clear to me that A, instead of B, is true. Maybe it is, maybe it isn't. SI seems overconfident that A is true. I've never heard anyone at SI (or elsewhere) really address this criticism.

Replies from: torekp

↑ comment by torekp · 2012-05-14T02:18:44.030Z · LW(p) · GW(p)

I like your gun safety analogy. Actually however, it seems to me that a significant portion of LW shares your doubts, or even favors view B. I second your call for some (more?) direct discussion on the question.

comment by Mitchell_Porter · 2012-05-11T10:40:56.963Z · LW(p) · GW(p)

Maybe I'm just jaded, but this critique doesn't impress me much. Holden's substantive suggestion is that, instead of trying to design friendly agent AI, we should just make passive "tool AI" that only reacts to commands but never acts on its own. So when do we start thinking about the problems peculiar to agent AI? Do we just hope that agent AI will never come into existence? Do we ask the tool AI to solve the friendly AI problem for us? (That seems to be what people want to do anyway, an approach I reject as ridiculously indirect.)

Replies from: Will_Newsome

↑ comment by Will_Newsome · 2012-05-11T17:40:31.171Z · LW(p) · GW(p)

(Perhaps I should note that I find your approach to be too indirect as well: if you really understand how justification works then you should be able to use that knowledge to make (invoke?) a theoretically perfectly justified agent, who will treat others' epistemic and moral beliefs in a thoroughly justified manner without your having to tell it "morality is in mind-brains, figure out what the mind-brains say then do what they tell you to do". That is, I think the correct solution should be just clearly mathematically and meta-ethically justified, question-dissolving, reflective, non-arbitrary, perfect decision theory. Such an approach is closest in spirit to CFAI. All other approaches, e.g. CEV, WBE, or oracle AI, are relatively arbitrary and unmotivated, especially meta-ethically.)

Replies from: hairyfigment

↑ comment by hairyfigment · 2012-05-11T18:12:05.999Z · LW(p) · GW(p)

Not only does this seem wrong, but if I believed it I would want SI to look for the correct decision theory (roughly what Eliezer says he's doing anyway). It fails to stress the possibility that Eliezer's whole approach is wrong. In fact it seems willfully (heh) ignorant of the planning fallacy and similar concerns: even formalizing the 'correct' prior seems tricky to me, so why would it be feasible to formalize "correct" meta-ethics even if it exists in the sense you mean? And what reason do we have to believe that a version with no pointers to brains exists at all?

At least with reflective decision theory I see no good reason to think that a transparently-written AGI is impossible in principle (our neurons don't just fire randomly, nor does evolution seem like a particularly good searcher of mindspace), so a theory of decisions that can describe said AGI's actions should be mathematically possible barring some alternative to math. (Whether, eg, the description would fit in our observable universe seems like another question.)

comment by Dolores1984 · 2012-05-10T19:26:04.416Z · LW(p) · GW(p)

Leaving aside the question of whether Tool AI as you describe it is possible until I've thought more about it:

The idea of a "self-improving algorithm" intuitively sounds very powerful, but does not seem to have led to many "explosions" in software so far (and it seems to be a concept that could apply to narrow AI as well as to AGI).

Looking to the past for examples is a very weak heuristic here, since we have never dealt with software that could write code at a better than human level before. It's like saying, before the invention of the internal combustion engine, "faster horses have never let you cross oceans before." Same goes for the assumption that strong AI will resemble extremely narrow AI software tools that already exist in specific regards. It's evidence, but it's very weak evidence, and I for one wouldn't bet on it.

comment by [deleted] · 2012-05-11T22:54:05.045Z · LW(p) · GW(p)

I am very happy to see this post and the subsequent dialogue. I've been talking with some people at Giving What We Can about volunteering (beginning in June) to do statistical work for them in trying to find effective ways to quantify and assess the impact of charitable giving specifically to organizations that work on mitigating existential risks. I hope to incorporate a lot of what is discussed here into my future work.

comment by Shmi (shminux) · 2012-05-11T19:29:45.664Z · LW(p) · GW(p)

Given that much of the discussion revolves around the tool/agent issue, I'm wondering if anyone can point me to a mathematically precise definition of each, in whatever limited context it applies.

Replies from: Will_Newsome, Bugmaster, Nick_Beckstead

↑ comment by Will_Newsome · 2012-05-11T20:05:43.511Z · LW(p) · GW(p)

It's mostly a question for philosophy of mind, I think specifically a question about intentionality. I think the closest you'll get to a mathematical framework is control theory; controllers are a weird edge case between tools and very simple agents. Control theory is mathematically related to Bayesian optimization, which I think Eliezer believes is fundamental to intelligence: thus identifying cases where a controller is a tool or an agent would be directly relevant. But I don't see how the mathematics, or any mathematics really, could help you. It's possible that someone has mathematized arguments about intentionality by using information theory or some such, you could Google that. Even so I think that at this point the ideas are imprecise enough such that plain ol' philosophy is what we have to work with. Unfortunately AFAIK very few people on LW are familiar with the relevant parts of philosophy of mind.

Replies from: shminux, othercriteria

↑ comment by Shmi (shminux) · 2012-05-11T20:14:07.388Z · LW(p) · GW(p)

It is EY's announced intention to work toward an AI that is provably friendly. "Provably" means that said AI is defined in some mathematical framework first. I don't see how one can make much progress in that area before rigorously defining intentionality.

I guess I am getting ahead of myself here. What would a relevant mathematical framework entail, to begin with?

Replies from: dlthomas, Will_Newsome

↑ comment by dlthomas · 2012-05-11T20:29:20.278Z · LW(p) · GW(p)

I guess I am jumping the shark here.

I don't think that idiom means what you think it means.

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-05-11T20:35:20.049Z · LW(p) · GW(p)

Thank you, fixed.

Replies from: quintopia

↑ comment by quintopia · 2012-05-17T06:23:07.135Z · LW(p) · GW(p)

You were probably fishing for "jumping the gun".

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-05-18T00:55:36.033Z · LW(p) · GW(p)

Yeah, should have been shooting instead of fishing.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-18T00:59:00.601Z · LW(p) · GW(p)

It could be said that you shot yourself in the foot by jumping the shark while fishing for a gun.

↑ comment by Will_Newsome · 2012-05-11T21:01:27.781Z · LW(p) · GW(p)

(It's possible that intentionality isn't the sharpest distinction between "tools" and "agents", but it's the one that I see most often emphasized in philosophy of mind, especially with regards to necessary preconditions for the development of any "strong AI".)

It seems that one could write an AI that is in some sense "provably Friendly" even while remaining agnostic as to whether the described AI is or will ultimately become a tool or an agent. It might be that a proposed AI couldn't be an agent because it couldn't solve the symbol grounding problem, i.e. because it lacked intentionality, and thus wouldn't be an effective FAI, but would nonetheless be Friendly in a certain limited sense. However if effectiveness is considered a requirement of Friendliness then one would indeed have to prove in advance that one's proposed AI could solve the grounding problem in order to prove that said AI was Friendly, or alternatively, prove that the grounding problem as such isn't a meaningful concept. I'm not sure what Eliezer would say about this; given his thinking about "outcome pumps" and so on, I doubt he thinks symbol grounding is a fundamental or meaningful problem, and so I doubt that he has or is planning to develop any formal argument that symbol grounding isn't a fundamental roadblock for his preferred attack on AGI.

I guess I am jumping the shark here. The shark in question being the framework itself. What would a relevant mathematical framework entail?

Your question about what a relevant mathematical framework would entail seems too vague for me to parse; my apologies, it's likely my exhaustion. But anyway, if minds leave certain characteristic marks on their environment by virtue of their having intentional (mental) states, then how precise and deep you can make your distinguishing mathematical framework depends on how sharp a cutoff there is in reality between intentional and non-intentional states. It's possible that the cutoff isn't sharp at all, in which case it's questionable whether the supposed distinction exists or is meaningful. If that's the case then it's quite possible that it's not possible to formulate a deep theory that could distinguish agents from tools, or intentional states from non-intentional ones. I think it likely that most AGI researchers, including Eliezer, hold the position that it is indeed impossible to do so. I don't think it would be possible to prove the non-existence of a sharp cutoff, so I think Eliezer could justifiably conclude that he didn't have to prove that his AI would be an "agent" or a "tool", because he could deny, even without mathematical justification, that such a distinction is meaningful.

I'm tired, apologies for any errors.

↑ comment by othercriteria · 2012-05-11T21:15:16.060Z · LW(p) · GW(p)

Focusing on intentionality seems interesting since it lets us look at black box actors (whose agent-ness or tool-ness we don't have to carefully define) and ask if they are acting in an apparently goal-directed manner. I've just skimmed [1] and barely remember [2] but it looks like you can make the inference work in simple cases and also prove some intractability results.

Obviously, FAI can't be solved by just building some AI, modeling P(AI has goal "destroy humanity" | AI's actions, state of world) and pulling the plug when that number gets too high. But maybe something else of value can be gained from a mathematical formalization like this.

[1] I. Van Rooij, J. Kwisthout, M. Blokpoel, J. Szymanik, T. Wareham, and I. Toni, “Intentional communication: Computationally easy or difficult?,” Frontiers in Human Neuroscience, vol. 5, 2011.
[2] C. L. Baker, R. R. Saxe, and J. B. Tenenbaum, “Bayesian theory of mind: Modeling joint belief-desire attribution,” Proceedings of the Thirty-Second Annual Conference of the Cognitive Science Society, 2011.
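
The quantity P(AI has goal | AI's actions) mentioned above can be illustrated with a toy Bayesian update. This is only a minimal sketch, not the method of either cited paper: the goal names, the prior, and the per-action likelihoods are made-up assumptions, the actions are assumed conditionally independent given the goal, and the state of the world is ignored.

```python
from typing import Dict, List


def goal_posterior(
    prior: Dict[str, float],
    likelihood: Dict[str, Dict[str, float]],
    actions: List[str],
) -> Dict[str, float]:
    """Posterior over goals given observed actions (Bayes' rule).

    Assumes actions are independent given the goal, which is a
    simplification made only for this illustration.
    """
    unnormalized = {}
    for goal, p in prior.items():
        for action in actions:
            p *= likelihood[goal][action]
        unnormalized[goal] = p
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed actions are impossible under every goal")
    return {goal: p / total for goal, p in unnormalized.items()}


# Hypothetical numbers, chosen only to show the update.
prior = {"benign": 0.99, "hostile": 0.01}
likelihood = {
    "benign": {"acquire_resources": 0.1, "answer_question": 0.9},
    "hostile": {"acquire_resources": 0.9, "answer_question": 0.1},
}
posterior = goal_posterior(prior, likelihood, ["acquire_resources"] * 3)
print(posterior)
```

Even with a 1% prior, three resource-grabbing observations push most of the posterior mass onto the hostile goal in this toy model, which is the "number gets too high" situation described above.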

Replies from: Will_Newsome

↑ comment by Will_Newsome · 2012-05-11T21:32:09.576Z · LW(p) · GW(p)

Tenenbaum's papers and related inductive approaches to detecting agency were the first attacks that came to mind, but I'm not sure that such statistical evidence could even in principle supply the sort of proof-strength support and precision that shminux seems to be looking for. I suppose I say this because I doubt someone like Searle would be convinced that an AI had intentional states in the relevant sense on the basis that it displayed sufficiently computationally complex communication, because such intentionality could easily be considered derived intentionality and thus not proof of the AI's own agency. The point at which this objection loses its force unfortunately seems to be exactly the point at which you could actually run the AGI and watch it self-improve and so on, and so I'm not sure that it's possible to prove hypothetical-Searle wrong in advance of actually running a full-blown AGI. Or is my model wrong?

↑ comment by Bugmaster · 2012-05-11T23:42:31.160Z · LW(p) · GW(p)

I am not sure if I agree with Holden that there's a meaningful distinction between tools and agents. However, one definition I could think of is this:

"A tool, unlike an agent, includes blocking human input in its perceive/decide/act loop."

Thus, an agent may work entirely autonomously, whereas a tool would wait for a human to make a decision before performing an action.

Of course, under this definition, Google's webcrawler would be an agent, not a tool -- which is one of the reasons I might disagree with Holden.
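
As a rough sketch of this definition (the function and parameter names here are hypothetical, and the human is stood in for by a callback rather than a real blocking prompt), the only difference between the two is whether the loop waits on an approval step before acting:

```python
from typing import Callable, Iterable, List, Optional


def run_loop(
    observations: Iterable[str],
    decide: Callable[[str], str],
    approve: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Perceive/decide/act loop.

    With `approve` set, every action waits for a human decision (a "tool"
    under this definition). With `approve` left as None, the loop runs
    autonomously (an "agent").
    """
    performed = []
    for obs in observations:              # perceive
        action = decide(obs)              # decide
        if approve is not None and not approve(action):
            continue                      # human declined, so do not act
        performed.append(action)          # act
    return performed


def crawl_policy(url: str) -> str:
    # Toy decision rule: always propose fetching the page.
    return "fetch " + url


urls = ["a.example", "b.example"]
autonomous = run_loop(urls, crawl_policy)
supervised = run_loop(
    urls, crawl_policy, approve=lambda action: action.endswith("a.example")
)
print(autonomous, supervised)
```

A crawler run without the `approve` callback performs every action it decides on, which is why it lands on the agent side of this definition.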

↑ comment by Nick_Beckstead · 2012-05-11T23:29:35.786Z · LW(p) · GW(p)

I don't think anyone will be able to. Here is my attempt at a more precise definition than what we have on the table:

An agent models the world and selects actions in a way that depends on what its modeling says will happen if it selects a given action.

A tool may model the world, and may select actions depending on its modeling, but may not select actions in a way that depends on what its modeling says will happen if it selects a given action.

A consequence of this definition is that some very simple AIs that can be thought of as "doing something," such as some very simple checkers programs or a program that waters your plants if and only if its model says it didn't rain, would count as tools rather than agents. I think that is a helpful way of carving things up.
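
The plant-watering case can be sketched as follows. This is only an illustration of the definitions above under assumed details: the moisture numbers, the `predict` model, and the `score` function are all hypothetical.

```python
from typing import Callable, Sequence


def tool_waterer(model_says_it_rained: bool) -> str:
    """Tool: the action depends on the world model, but not on any
    prediction of what happens if a given action is selected."""
    return "skip" if model_says_it_rained else "water"


def agent_waterer(
    soil_moisture: float,
    predict: Callable[[float, str], float],
    score: Callable[[float], float],
    actions: Sequence[str] = ("water", "skip"),
) -> str:
    """Agent: selects the action whose predicted consequence scores best."""
    return max(actions, key=lambda a: score(predict(soil_moisture, a)))


def predict(moisture: float, action: str) -> float:
    # Hypothetical consequence model: watering adds 0.4 to soil moisture.
    return moisture + 0.4 if action == "water" else moisture


def score(moisture: float) -> float:
    # Hypothetical preference: moisture close to 0.6 is best.
    return -abs(moisture - 0.6)


print(tool_waterer(model_says_it_rained=False))
print(agent_waterer(0.2, predict, score))
print(agent_waterer(0.6, predict, score))
```

Both programs model the world; only `agent_waterer` evaluates "what happens if I pick this action" before choosing.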

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2012-06-12T20:11:40.486Z · LW(p) · GW(p)

A tool may model the world, and may select actions depending on its modeling, but may not select actions in a way that depends on what its modeling says will happen if it selects a given action.

So if the question is related to the future (such as "will it rain tomorrow?"), does it essentially mean that a tool will model a counterfactual alternative future which would happen if the tool did not provide any answer?

This would be OK for situations where the answer of the AI does not make a big difference (such as "will it rain tomorrow?").

It would be less OK for situations where the mere knowledge about "what AI said" would influence the result, such as asking AI about important social or political topics, where the answer is likely to be published. (In these situations the question considered would be mixed with specific events of the counterfactual world, such as a worldwide panic "our superhuman AI seems to be broken, we are all doomed!").

Replies from: Nick_Beckstead

↑ comment by Nick_Beckstead · 2012-06-12T20:16:07.915Z · LW(p) · GW(p)

I think that you're describing a real hurdle, though it seems like a hurdle that could be overcome.

comment by hairyfigment · 2012-05-11T07:41:37.570Z · LW(p) · GW(p)

The organization section touches on something that concerns me. Developing a new decision theory sounds like it requires more mathematical talent than the SI yet has available. I've said before that hiring some world-class mathematicians for a year seems likely to either get said geniuses interested in the problem, to produce real progress, or to produce a proof that SI's current approach can't work. In other words, it seems like the best form of accountability we can hope for given the theoretical nature of the work.

Now Eliezer is definitely looking for people who might help. For instance, the latest chapter of "Harry Potter and the Methods of Rationality" mentioned

a minicamp for 20 mathematically talented youths...Most focus will be on technical aspects of rationality (probability theory, decision theory) but also with some teaching of the same mental skills in the other Minicamps.

It also says,

Several instructors of International Olympiad level have already volunteered.

So they technically have something already. And if there exists a high-school student who can help with the problem, or learn to do so, that person seems relatively likely to enjoy HP:MoR. But I worry that Eliezer is thinking too much in terms of his own life story here, and has not had to defend his approach enough.

Replies from: Manfred

↑ comment by Manfred · 2012-05-11T09:29:07.624Z · LW(p) · GW(p)

Developing a new decision theory sounds like it requires more mathematical talent than the SI yet has available.

On what measure of difficulty are you basing this? We have some guys around here doing a pretty good job.

Replies from: hairyfigment

↑ comment by hairyfigment · 2012-05-11T18:25:41.809Z · LW(p) · GW(p)

I phrased that with too much certainty. While I have little if any reason to see fully-reflective decision theory as an easier task than self-consistent infinite set theory, I also have no clear reason to think the contrary.

But I'm trying to find the worst scenario that we could plan for. I can think of two broad ways that Eliezer's current plan could be horribly misguided:

if it works well enough to help someone produce an uFAI but not well enough to stop this in time
if some part of it -- such as a fully-reflective decision theory that humans can understand -- is mathematically impossible, and SI never realizes this.

Now SI technically seems aware of both problems. The fact that Eliezer went out of his way to help critics understand Löb's Theorem and that he keeps mentioning said theorem seems like a good sign. But should I believe that SI is doing enough to address #2? Why?

comment by NancyLebovitz · 2012-05-12T09:51:27.935Z · LW(p) · GW(p)

If a tool AI is programmed with a strong utility function to get accurate answers, is there a risk of it behaving like a UFAI to get more resources in order to improve its answers?

Replies from: Johnicholas, private_messaging

↑ comment by Johnicholas · 2012-05-12T13:27:25.020Z · LW(p) · GW(p)

There's two uses of 'utility function'. One is analogous to Daniel Dennett's "intentional stance" in that you can choose to interpret an entity as having a utility function - this is always possible but not necessarily a perspicuous way of understanding an entity - because you might end up with utility functions like "enjoys running in circles but is equally happy being prevented from running in circles".

The second form is as an explicit component within an AI design. Tool-AIs do not contain such a component - they might have a relevance or accuracy function for evaluating answers, but it's not a utility function over the world.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2012-05-12T16:11:27.580Z · LW(p) · GW(p)

because you might end up with utility functions like "enjoys running in circles but is equally happy being prevented from running in circles".

Is that a problem so long as some behaviors are preferred over others? You could have "is neutral about running in circles, but resists jumping up and down and prefers making abstract paintings".

Tool-AIs do not contain such a component - they might have a relevance or accuracy function for evaluating answers, but it's not a utility function over the world.

Wouldn't that depend on the Tool-AI? Eliezer's default no-akrasia AI does everything it can to fulfill its utility function. You presumably want it to be as accurate as possible or perhaps as accurate as useful. Would it be a problem for it to ask for more resources? To earn money on its own initiative for more resources? To lobby to get laws passed to give it more resources? At some point, it's a problem if it's going to try to rule the world to get more resources.....

Replies from: CuSithBell, private_messaging

↑ comment by CuSithBell · 2012-05-12T16:39:31.466Z · LW(p) · GW(p)

Tool-AIs do not contain such a component - they might have a relevance or accuracy function for evaluating answers, but it's not a utility function over the world.

Wouldn't that depend on the Tool-AI?

I think this is explicitly part of the "Tool-AI" definition, that it is not a Utility Maximizer.

↑ comment by private_messaging · 2012-05-14T08:30:41.852Z · LW(p) · GW(p)

I think there's thorough confusion between utilityA: utility as used in economics to try and predict humans (and predict them inaccurately), and the utilityB: utility as in the model based agent, where the utility is a mathematical function which takes in description of the world and which only refers to real world items if you read stuff into it that is not there and can not be put there.

Viciously maximizing some utilityB leads to, given sufficient capability, the vicious and ohh so dangerous modification of the inputs to utilityB function, i.e. wireheading.

The AIs as we know them, agents or tools, are not utilityA maximizers. We do not know how to make utilityA maximizer. The human intelligence also doesn't seem to work as utilityA maximizer. It is likely the case that utilityA maximizer is a logical impossibility for agents embedded in the world, or at very least, requires very major advances in formalization of philosophy.

Replies from: None

↑ comment by [deleted] · 2012-05-14T19:49:21.261Z · LW(p) · GW(p)

It is likely the case that utilityA maximizer is a logical impossibility for agents embedded in the world...

Very interesting and relevant! Can you elaborate or link? I think the case can be made based on Arrow's theorem and its corollaries, but I'm not sure that's what you have in mind.

↑ comment by private_messaging · 2012-05-12T18:00:35.340Z · LW(p) · GW(p)

What the hell does SIAI mean by 'utility function' anyway? (math please)

Inside the agents and tools as currently implemented, there is a solver that works on a function, and finds input values to that function, which result in maximum (or, usually, minimum) of that function (note that the output may be binary).

[To clarify: that function can include both a model of the world and the evaluation of 'desirability' of properties of a state of this model. Usually, in software development, if you have f(g(x)) (where g is the world predictor and f is the desirability evaluator), and g's output is only ever used by f, this is a target for optimization to create fg(x), which is more accurate in a given time but does not consist of nearly separable parts. Furthermore, the f output is only ever fed to comparison operators, making it another optimization target to create cmp_fg(), which compares the actions directly, perhaps by calculating the difference between worlds that is caused by a particular action, which allows most of the processing to be culled.]

It, however, is entirely indifferent to actually maximizing anything. It doesn't even try to maximize some internal variable (it will gladly try inputs that result in small output value, but usually is written not to report those inputs).
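
A toy version of the f(g(x)) structure, and of a fused comparison in the spirit of cmp_fg(), might look like this. The particular `g`, `f`, and target value are assumptions picked only so that the two solvers can be checked against each other.

```python
from typing import Sequence


def g(state: int, action: int) -> int:
    """World predictor (toy): the predicted state after taking `action`."""
    return state + action


def f(world: int) -> int:
    """Desirability evaluator (toy): best when the world equals 10."""
    return -(world - 10) ** 2


def solve(state: int, actions: Sequence[int]) -> int:
    """Solver over the composed function f(g(x)): returns the best input."""
    return max(actions, key=lambda a: f(g(state, a)))


def cmp_fg(state: int, a: int, b: int) -> bool:
    """Fused comparison: True if action `a` beats action `b`.

    For this toy f and g the comparison reduces to distance from 10,
    so neither f nor g is evaluated as a separate step.
    """
    return abs(state + a - 10) < abs(state + b - 10)


def solve_fused(state: int, actions: Sequence[int]) -> int:
    """Same search as `solve`, using only pairwise comparisons."""
    best = actions[0]
    for a in actions[1:]:
        if cmp_fg(state, a, best):
            best = a
    return best


print(solve(3, [1, 5, 7, 9]), solve_fused(3, [1, 5, 7, 9]))
```

In the fused version no desirability value is ever stored or reported; the solver only returns an input.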

I think the confusion arises from defining the agent in English language-based concepts, as opposed to the AI developer's behaviour where they would define things in some logical down-to-elements way, and then try to communicate it using English. The command in English, 'bring me the best answer!', does tell you to go ahead and convert universe to computronium to answer it (if you are to interpret it in science-fiction-robot-minded way). The commands in programming languages, not really. I don't think English specifies that either, we just can interpret it charitably enough if we feel like (starting from other purpose, such as 'be nice').

edit: I feel that a lot of difficulties of making 'safe AGI', those that are not outright nonsensical, are just repackaged special cases of statements about general difficulty of making any AGI, safe or not. That's very nasty thing to do, to generate such special cases preferentially. edit: Also, some may be special cases of lack/impossibility of solution to symbol grounding.

comment by [deleted] · 2012-05-11T23:09:27.632Z · LW(p) · GW(p)

I agree with timtyler's comment that Objections 1 and 2 are bogus, especially 2. The tool-AGI discussion reveals significant misunderstanding, I feel. Despite this, I think it is still a great and useful post.

Another sort of tangential issue is that this post fails to consider whether or not lots of disparate labs are just going to undertake AGI research regardless of SIAI. If lots of labs are doing that, it could be dangerous (if SIAI arguments are sound). So one upside to funding an organization like SIAI is that it will kind of rake the attention to a central point. Remember that one of SIAI's short term goals is to decelerate generic AGI research in favor of accelerating AGI safety research.

This post doesn't seem to account for the fact that by not funding SIAI you simply face the same number of counterfactual disparate labs pursuing AGI with their own willy-nilly sources of funding, but no aggregator organization to serve as a kind of steering committee. Regardless of whether SIAI's specific vision is the one that happens to come true, something should be said for the inherent danger of a bunch of labs trying to build their own stand-alone paperclip maximizers, which they may very well believe are tool-AGIs, and then bam, game over.

comment by jonperry · 2012-05-11T08:09:02.760Z · LW(p) · GW(p)

Let's say that the tool/agent distinction exists, and that tools are demonstrably safer. What then? What course of action follows?

Should we ban the development of agents? All of human history suggests that banning things does not work.

With existential stakes, only one person needs to disobey the ban and we are all screwed.

Which means the only safe route is to make a friendly agent before anyone else can. Which is pretty much SI's goal, right?

So I don't understand how practically speaking this tool/agent argument changes anything.

Replies from: army1987, Polymeron, jsteinhardt

↑ comment by A1987dM (army1987) · 2012-05-11T08:54:47.662Z · LW(p) · GW(p)

Which means the only safe route is to make a friendly agent before anyone else can.

Only if running too fast doesn't make it easier to screw something up, which it most likely does.

Replies from: khafra, jonperry

↑ comment by khafra · 2012-05-11T17:23:27.954Z · LW(p) · GW(p)

If the time at which anyone activates a uFAI is known, SI should activate their current FAI best effort (CFBE) one day before that.

If the time at which anyone activates a GAI of unknown friendliness is known, SI should compare the probability distribution function for the friendliness of the two AIs, and activate their CFBE one day earlier only if it has more probability mass on the "friendly" side.

If the time at which anyone makes a uFAI is unknown, SI should activate their CFBE when the probability that they'll improve the CFBE in the next day is lower than the probability that someone will activate a uFAI in the next day.

If the time at which anyone makes a GAI of unknown friendliness is unknown, SI should activate their CFBE when the probability that CFBE=uFAI is less than the probability that anyone else will activate a GAI of unknown friendliness, multiplied by the probability that the other GAI will be unfriendly.

...I think. I do tend to miss the obvious when trying to think systematically, and I was visualizing Gaussian pdfs without any particular justification, and a 1-day decision cycle with monotonically improving CFBE, and this is only a first-order approximation: it doesn't take into account any correlations between the decisions of SI and other GAI researchers.
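
Written out as predicates, the rules above look roughly like this. It is a sketch of the first-order approximation only: the function names are hypothetical, every probability is assumed to be a single known number, and the first rule (activate one day before a known uFAI launch) is unconditional and so needs no predicate.

```python
def preempt_known_gai(p_cfbe_friendly: float, p_other_friendly: float) -> bool:
    """Other GAI's launch date is known, its friendliness is not:
    launch CFBE a day earlier only if CFBE has more probability mass
    on the "friendly" side."""
    return p_cfbe_friendly > p_other_friendly


def launch_now_vs_unknown_ufai(
    p_improve_cfbe_tomorrow: float, p_ufai_launch_tomorrow: float
) -> bool:
    """uFAI launch date is unknown: launch CFBE once another day of
    work is less likely to improve it than a uFAI is to be launched."""
    return p_improve_cfbe_tomorrow < p_ufai_launch_tomorrow


def launch_now_vs_unknown_gai(
    p_cfbe_is_ufai: float,
    p_other_gai_launch: float,
    p_other_gai_unfriendly: float,
) -> bool:
    """Unknown date and unknown friendliness: launch CFBE once its own
    chance of being unfriendly is below the chance that someone else
    launches an unfriendly GAI."""
    return p_cfbe_is_ufai < p_other_gai_launch * p_other_gai_unfriendly


# Illustrative numbers only.
print(preempt_known_gai(0.6, 0.3))
print(launch_now_vs_unknown_ufai(0.05, 0.01))
print(launch_now_vs_unknown_gai(0.3, 0.5, 0.8))
```

As in the comment, this leaves out correlations between SI's decisions and those of other GAI researchers.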

↑ comment by jonperry · 2012-05-11T09:23:26.963Z · LW(p) · GW(p)

Yes, you can create risk by rushing things. But you still have to be fast enough to outrun the creation of UFAI by someone else. So you have to be fast, but not too fast. It's a balancing act.

Replies from: Monkeymind

↑ comment by Monkeymind · 2012-05-11T15:10:04.647Z · LW(p) · GW(p)

If intelligence is the ability to understand concepts, and a super-intelligent AI has a super ability to understand concepts, what would prevent it (as a tool) from answering questions in a way so as to influence the user and affect outcomes as though it were an agent?

Replies from: Strange7

↑ comment by Strange7 · 2013-03-22T09:43:05.440Z · LW(p) · GW(p)

The profound lack of a desire to do so.

Google Maps, when asked for directions to Eddie the Snitch's hideout, will not reply with "Maybe I know and maybe I don't. You gonna make it worth my time?" because providing directions is, to it, a reflex action rather than a move in a larger game.

Replies from: CCC, khafra

↑ comment by CCC · 2013-03-22T13:43:37.505Z · LW(p) · GW(p)

There are possible questions where the super-intelligent AI has to make a choice of some sort, because multiple answers can be correct (depending on which answer is given).

For example: Sam, a basketball player, approaches Predictor, a super-intelligent tool AI, before his game and asks the question "Will my team win today's game?" Predictor knows that if it says 'yes', Sam will be confident, play aggressively, and this will lead to a win. If, on the other hand, it answers 'no', Sam's confidence will be shattered and his team will lose comprehensively. Refusing to answer will confuse Sam, distracting him from the task at hand, and causing his team to be narrowly defeated. Any answer makes Predictor an agent, and not merely a tool - Predictor doesn't even need to care about the basketball game.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-03-22T15:00:46.075Z · LW(p) · GW(p)

Absolutely agreed that this sort of situation arises, and that the more I know about the world, the more situations have this character for me. That said, if I'm indifferent to the world-affecting effects of my answers, it seems that the result is very similar to if I'm ignorant of them.

That is, it seems that Predictor looks at that situation, concludes that in order to predict "yes" or "no" it has to first predict whether it will answer "yes" or "no", and either does so (on what basis, I have no idea) or fails to do so and refuses to answer. Yes, those actions influence the world (as does the very existence of Predictor, and Sam's knowledge of Predictor's existence), but I'm not sure I would characterize the resulting behavior as agentlike.

Replies from: CCC

↑ comment by CCC · 2013-03-22T18:21:30.923Z · LW(p) · GW(p)

Then consider: Sam asks a question. Predictor knows that an answer of "yes" will result in the development of Clippy, and subsequently in turning Earth into paperclips, causing the destruction of humanity, within the next ten thousand years; while an answer of "no" will result in a wonderful future where everyone is happy and disease is eradicated and all Good Things happen. In both cases, the prediction will be correct.

If Predictor doesn't care about that answer, then I would not define Predictor as a Friendly AI.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-03-22T19:07:10.180Z · LW(p) · GW(p)

Absolutely agreed; neither would I. More generally, I don't think I would consider any Oracle AI as Friendly.

↑ comment by khafra · 2013-03-22T13:23:37.222Z · LW(p) · GW(p)

What if you ask Google Interrogation Aid for the best way to get a confession out of Eddie the Snitch, given the constraints of the law and Eddie's psychographics? What if you ask Google Municipal Planner for the best way to reduce crime? What if you ask Google Operations Assistant for the best way to maximize your paperclip production?

Replies from: Strange7

↑ comment by Strange7 · 2013-03-22T15:06:28.486Z · LW(p) · GW(p)

Google Maps has options for walking, public transit, and avoiding major highways; a hypothetical interrogation assistant would have equivalent options for degrees of legal or ethical restraint, including "How do I make sure Eddie only confesses to things he's actually done?" If Google Operations Assistant says that a few simple modifications to the factory can produce a volume of paperclips that outmasses the Earth, there will be follow-up questions about warehousing and buyers.

Reducing crime is comparatively straightforward: more cops per capita, fewer laws for them to enforce, enough economic opportunity to make sure people don't get desperate and stupid. The real problem is political, rather than technical, so any proposed solution will have a lot of hoops to jump through.

Replies from: khafra

↑ comment by khafra · 2013-03-22T16:05:51.458Z · LW(p) · GW(p)

Yes, all it takes is a little common sense to see that legal and ethical restraint are important considerations during your interview and interrogation of Eddie. However, as the complexity of the problem rises, the tractability of the solution to a human reader lowers, as does the probability that your tool AI has sufficient common sense.

A route on a map only has a few degrees of freedom; and it's easy to spot violations of common-sense constraints that weren't properly programmed in, or to abort the direction-following process when problems spring up. A route to a virally delivered cancer cure has many degrees of freedom, and it's harder to spot violations of common-sense constraints, and problems may only become evident when it's too late to abort.

Replies from: Strange7

↑ comment by Strange7 · 2013-03-22T17:15:18.766Z · LW(p) · GW(p)

If all it took was "a little common sense" to do interrogations safely and ethically, the Stanford Prison Experiment wouldn't have turned out the way it did. These are not simple problems!

When a medical expert system spits out a novel plan for cancer treatment, do you think that plan would be less trustworthy, or receive less scrutiny at every stage, than one invented by human experts? If an initial trial results in some statistically significant number of rats erupting into clockwork horror and rampaging through the lab until cleansed by fire, or even just keeling over from seemingly-unrelated kidney failure, do you think the FDA would approve?

↑ comment by Polymeron · 2012-05-20T19:42:25.015Z · LW(p) · GW(p)

Presumably, you build a tool-AI (or three) that will help you solve the Friendliness problem.

This may not be entirely safe either, but given the parameters of the question, it beats the alternative by a mile.

↑ comment by jsteinhardt · 2012-05-11T17:53:38.177Z · LW(p) · GW(p)

I think the idea is to use tool AI to create safe agent AI.

comment by sufferer · 2012-05-11T16:13:58.143Z · LW(p) · GW(p)

But if there's even a chance …

Holden cites two posts (Why We Can’t Take Expected Value Estimates Literally and Maximizing Cost-effectiveness via Critical Inquiry). They are supposed to support the argument that small or very small changes to the probability of an existential risk event occurring are not worth caring about or donating money towards.

I think that these posts both have serious problems (see the comments, esp. Carl Shulman's). In particular Why We Can’t Take Expected Value Estimates Literally was heavily criticised by Robin Hanson in On Fudge Factors.

Robin Hanson has been listed as the other major "intelligent/competent" critic of SIAI. That he criticises what seems to be the keystone of Holden's argument should be cause for concern for Holden. (after all, if "even a chance" is good enough, then all the other criticisms melt away).

This would be a much more serious criticism of SIAI if Holden and Hanson could come to agreement on what exactly the problem with SIAI is, and if Holden could sort out the problems with these two supporting posts*

(*of course they won't do that without substantial revision of one or both of their positions because Hanson is on the same page as the rest of SIAI with regard to expected utility, see On Fudge Factors. Hanson's disagreement with SIAI is a different one; approximately that Hanson thinks ems first is likely and that a singleton is both bad and unlikely, and Hanson's axiology is significantly unintuitive to the extent that he is not really on the same page as most people with regard to what counts as a good or bad outcome)

Replies from: TheOtherDave, jsteinhardt

↑ comment by TheOtherDave · 2012-05-11T16:21:29.995Z · LW(p) · GW(p)

Robin Hanson has been listed as the other major "intelligent/competent" critic of SIAI. That he criticises what seems to be the keystone of Holden's argument should be cause for concern for Holden.

So, I stipulate that Robin, whom Eliezer considers the only other major "intelligent/competent" critic of SI, disagrees with this aspect of Holden's position. I also stipulate that this aspect is the keystone of Holden's argument, and without it all the rest of it is irrelevant. (I'm not sure either of those statements is actually true, but they're beside my point here.)

I do not understand why these stipulated facts should be a significant cause for concern for Holden, who may not consider Eliezer's endorsement of what is and isn't legitimate criticism of SI particularly significant evidence of anything important.

Can you expand on your reasoning here?

Replies from: Polymeron, sufferer

↑ comment by Polymeron · 2012-05-20T18:52:02.879Z · LW(p) · GW(p)

after all, if "even a chance" is good enough, then all the other criticisms melt away

Not to the degree that SI could be increasing the existential risk, a point Holden also makes. "Even a chance" swings both ways.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-20T19:03:27.799Z · LW(p) · GW(p)

I am completely lost by how this is a response to anything I said.

Replies from: Polymeron

↑ comment by Polymeron · 2012-05-20T19:44:59.000Z · LW(p) · GW(p)

It's not. Apparently I somehow replied to the wrong post... It's actually aimed at sufferer's comment you were replying to.

I don't suppose there's a convenient way to move it? I don't think retracting and re-posting would clean it up sufficiently, in fact that seems messier.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-20T20:06:41.946Z · LW(p) · GW(p)

Ah! That makes sense. I know of no way to move it... sorry.

↑ comment by sufferer · 2012-05-11T16:39:47.461Z · LW(p) · GW(p)

I suspect that Holden would also consider Robin Hanson a competent critic. This is because Robin is smart, knowledgeable and prestigiously accredited.

But your comment has alerted me to the fact that even if Hanson comes out as a flat-earther tomorrow the supporting posts are still weak.

The issue of the two most credible critics of SIAI disagreeing with each other is logically independent of the issue of Holden's wobbly argument against the utilitarian argument for SIAI. Many thanks.

↑ comment by jsteinhardt · 2012-05-11T17:51:45.084Z · LW(p) · GW(p)

I'm not sure what you mean by

Hanson is on the same page as the rest of SIAI with regard to expected utility

As Holden and Eliezer both explicitly state, SIAI itself rejects the "but there's still a chance" argument.

Replies from: sufferer

↑ comment by sufferer · 2012-05-13T19:17:28.460Z · LW(p) · GW(p)

It all depends on how small that small chance is. Pascal mugging is typically done with probabilities that are exponentially small, e.g. 10^-10 or so.

But what if Holden is not going to recommend SIAI for donations when there's a 1% or 0.1% chance of it making that big difference?
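To make the size of that gap concrete, here is a minimal sketch in Python. The payoff figure is a hypothetical placeholder in arbitrary units, not an estimate anyone in this thread has made; only the three probabilities come from the comment above.

```python
# Expected value at Pascal's-mugging odds versus 0.1% and 1% odds.
# PAYOFF is a hypothetical stake in arbitrary units (an assumption).

PAYOFF = 1e10


def expected_value(p_success: float, payoff: float) -> float:
    """Plain expected value: probability of success times payoff."""
    return p_success * payoff


for p in (1e-10, 1e-3, 1e-2):
    print(f"p = {p:g}: expected value = {expected_value(p, PAYOFF):g}")
```

Whatever stake is plugged in, the 1% case comes out about eight orders of magnitude above the 10^-10 case, which is the sense in which the two arguments are not the same argument.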

comment by drethelin · 2012-05-10T20:32:39.716Z · LW(p) · GW(p)

Tool-based work might be a faster and safer way to create useful AI, but as long as agent-based methods are possible it seems extremely important to me to work on verifying friendliness of artificial agents.

Replies from: abramdemski

↑ comment by abramdemski · 2012-05-11T07:21:47.455Z · LW(p) · GW(p)

Important, perhaps, but extremely important? If tool-based systems are faster in coming and safer, then they will be available to help the process of creating, studying, and (if necessary) defending against powerful agents.

My prediction would be that tool AI would be economically incentivised, since humans want tools. Agent AI might be created later on more aesthetic grounds, as pets or hoped-for equals. (But that's just an intuition.)

Replies from: drethelin

↑ comment by drethelin · 2012-05-11T17:01:49.889Z · LW(p) · GW(p)

For the same reason that a personal assistant is vastly more useful and powerful than a PDA, even though they might nominally serve the same function of remembering phone numbers, appointments, etc., people are extremely likely to want to create agent AIs.

comment by Pablo (Pablo_Stafforini) · 2020-05-05T15:37:21.532Z · LW(p) · GW(p)

I elaborated further on the distinction and on the concept of a tool-AI in Karnofsky/Tallinn 2011.

Holden's notes from that conversation, posted to the old GiveWell Yahoo Group as a file attachment, do not appear to be publicly available anymore. Jeff Kaufman has archived all the messages from that mailing list, but unfortunately his archive does not include file attachments. Has anyone kept a copy of that file by any chance?

Replies from: RobbBB

↑ comment by Rob Bensinger (RobbBB) · 2020-05-05T15:54:19.618Z · LW(p) · GW(p)

I believe this is https://files.givewell.org/files/labs/AI/Jaan_Tallinn_2011_05_revised.pdf.

Replies from: Pablo_Stafforini

↑ comment by Pablo (Pablo_Stafforini) · 2020-05-06T11:59:17.447Z · LW(p) · GW(p)

Thank you!

comment by kalla724 · 2012-05-10T23:26:58.854Z · LW(p) · GW(p)

Very good. Objection 2 in particular resonates with my view of the situation.

One other thing that is often missed is the fact that SI assumes that development of superintelligent AI will precede other possible scenarios - including the augmented human intelligence scenario (CBI producing superhumans, with human motivations and emotions, but hugely enhanced intelligence). In my personal view, this scenario is far more likely than the creation of either friendly or unfriendly AI, and the problems related to this scenario are far more pressing.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2012-05-11T20:58:43.699Z · LW(p) · GW(p)

and the problems related to this scenario are far more pressing.

Could you expand on that?

Replies from: kalla724

↑ comment by kalla724 · 2012-05-12T21:45:37.202Z · LW(p) · GW(p)

I can try, but the issue is too complex for comments. A series of posts would be required to do it justice, so mind the relative shallowness of what follows.

I'll focus on one thing. An artificial intelligence enhancement which adds more "spaces" to the working memory would create a human being capable of thinking far beyond any unenhanced human. This is not just a quantitative jump: we aren't talking about someone who thinks along the same lines, just faster. We are talking about a qualitative change, making connections that are literally impossible to make for anyone else.

(This is even more unclear than I thought it would be. So a tangent to, hopefully, clarify. You can hold, say, seven items in your mind while considering any subject. This vastly limits your ability to consider any complex system. In order to do so at all, you have to construct "composite items" out of many smaller items. For instance, you can think of a mathematical formula, matrix, or an operation as one "item," which takes one space, and therefore allows you to cram "more math" into a thought than you would be able to otherwise. Alternate example: a novice chess player has to look at every piece, think about likely moves of every one, likely responses, etc. She becomes overwhelmed very quickly. An expert chess player quickly focuses on learned series of moves, known gambits and visible openings, which allows her to see several steps ahead.

One of the major failures in modern society is the illusion of understanding in complex systems. Any analysis picks out a small number of items we can keep in mind at one time, and then bases the "solutions" on them (Watts's "Everything is Obvious" book has a great overview of this). Add more places to the working memory, and you suddenly have humans who have a qualitatively improved ability to understand complex systems. Maybe still not fully, but far better than anyone else. Sociology, psychology, neuroscience, economics... A human being with a few dozen working memory spaces would be for the economy the same thing a quantum computer with eight qubits would be for cryptography - whoever develops one first can wreak havoc as they like.)

When this work starts in earnest (ten to twelve years from now would be my estimate), how do we control the outcomes? Will we have tightly controlled superhumans, surrounded and limited by safety mechanisms? Or will we try to find "humans we trust" to become first enhanced humans? Will we have a panic against such developments (which would then force further work to be done in secret, probably associated with military uses)?

Negative scenarios are manifold (lunatic superhumans destroying the world, or establishing tyranny; lobotomized/drugged superhumans used as weapons of war or for crowd manipulation; completely sane superhumans destroying civilization due to their still present and unmodified irrational biases; etc.). Positive scenarios are comparable to Friendly AI (unlimited scientific development, cooperation on a completely new scale, reorganization of human life and society...).

How do we avoid the negative scenarios, and increase the probability of the positive ones? Very few people seem to be talking about this (some because it still seems crazy to the average person, some explicitly because they worry about the panic/push into secrecy response).

Replies from: Dustin, jsteinhardt, jacob_cannell, private_messaging

↑ comment by Dustin · 2012-05-12T23:26:23.318Z · LW(p) · GW(p)

I like this series of thoughts, but I wonder about just how superior a human with 2 or 3 times the working memory would be.

Currently, do all humans have the same amount of working memory? If not, how "superior" are those with more working memory?

Replies from: TheOtherDave, kalla724, Kaj_Sotala

↑ comment by TheOtherDave · 2012-05-13T01:23:37.134Z · LW(p) · GW(p)

A vaguely related anecdote: working memory was one of the things that was damaged after my stroke; for a while afterwards I was incapable of remembering more than two or three items when asked to repeat a list. I wasn't exactly stupider than I am now, but I was something pretty similar to stupid. I couldn't understand complex arguments, I couldn't solve logic puzzles that required a level of indirection, I would often lose track of the topic of a sentence halfway through.

Of course, there was other brain damage as well, so it's hard to say what causes what, and the plural of anecdote is not data. But subjectively it certainly felt like the thing that was improving as I recovered was my ability to hold things in memory... not so much number of items, as reliability of the buffers at all. I often had the thought as I recovered that if I could somehow keep improving my working memory -- again, not so much "add slots" but make the whole framework more reliable -- I would end up cleverer than I started out.

Take it for what it's worth.

↑ comment by kalla724 · 2012-05-13T04:56:17.481Z · LW(p) · GW(p)

It would appear that all of us have very similar amounts of working memory space. It gets very complicated very fast, and there are some aspects that vary a lot. But in general, its capacity appears to be the bottleneck of fluid intelligence (and a lot of crystallized intelligence might be, in fact, learned adaptations for getting around this bottleneck).

How superior would it be? There are some strong indications that adding more "chunks" to the working space would be somewhat akin to adding more qubits to a quantum computer: if having four "chunks" (one of the most popular estimates for an average young adult) gives you 2^4 units of fluid intelligence, adding one more would increase your intelligence to 2^5 units. The implications seem clear.
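Taking the analogy literally, the arithmetic can be sketched in a few lines of Python. The exponential form is the analogy's assumption, not an established result, and "units of fluid intelligence" and the function name are illustrative only.

```python
# Doubling-per-chunk model implied by the qubit analogy (an assumption).

def fluid_units(chunks: int) -> int:
    """Hypothetical 'units of fluid intelligence' for a chunk count."""
    return 2 ** chunks


BASELINE_CHUNKS = 4  # the popular estimate quoted above

for chunks in range(BASELINE_CHUNKS, BASELINE_CHUNKS + 4):
    ratio = fluid_units(chunks) / fluid_units(BASELINE_CHUNKS)
    print(f"{chunks} chunks: {fluid_units(chunks)} units ({ratio:g}x baseline)")
```

Under this model each added chunk doubles the total, so three extra chunks would mean an eightfold gain. If manipulation costs also grow with the number of chunks, the real curve could be much flatter.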

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-13T21:14:15.047Z · LW(p) · GW(p)

I'm curious as to why this comment has been downvoted. Kalla seems to be making an essentially uncontroversial and correct summary of what many researchers think is the relevance of working memory size.

Replies from: jsteinhardt

↑ comment by jsteinhardt · 2012-05-13T21:55:20.906Z · LW(p) · GW(p)

(Note: it is not downvoted as I write this comment.)

First let me say that I have enjoyed kalla's recent contributions to this site, and hope that the following won't come across as negative. But to answer your question, I at least question both the uncontroversiality and correctness of the summary, as well as the inference that more working memory increases abilities exponentially quickly. Kalla and I discussed some of this above and he doesn't think that his claims hinge on specific facts about working memory, so most of this is irrelevant at this point, but might answer your question.

EDIT: Also, by correctness I mainly mean that I think our (us being cognitive scientists) understanding of this issue is much less clear than kalla's post implies. His summary reflects my understanding of the current working theory, but I don't think the current working theory is generally expected to be correct.

↑ comment by Kaj_Sotala · 2012-05-14T07:13:40.967Z · LW(p) · GW(p)

Although the exact relationship isn't known, there's a strong connection between IQ and working memory - apparently both in humans and animals. E.g. Matzel & Kolata 2010:

Accumulating evidence indicates that the storage and processing capabilities of the human working memory system co-vary with individuals’ performance on a wide range of cognitive tasks. The ubiquitous nature of this relationship suggests that variations in these processes may underlie individual differences in intelligence. Here we briefly review relevant data which supports this view. Furthermore, we emphasize an emerging literature describing a trait in genetically heterogeneous mice that is quantitatively and qualitatively analogous to general intelligence (g) in humans. As in humans, this animal analog of g co-varies with individual differences in both storage and processing components of the working memory system. Absent some of the complications associated with work with human subjects (e.g., phonological processing), this work with laboratory animals has provided an opportunity to assess otherwise intractable hypotheses. For instance, it has been possible in animals to manipulate individual aspects of the working memory system (e.g., selective attention), and to observe causal relationships between these variables and the expression of general cognitive abilities. This work with laboratory animals has coincided with human imaging studies (briefly reviewed here) which suggest that common brain structures (e.g., prefrontal cortex) mediate the efficacy of selective attention and the performance of individuals on intelligence test batteries. In total, this evidence suggests an evolutionary conservation of the processes that co-vary with and/or regulate “intelligence” and provides a framework for promoting these abilities in both young and old animals.

or Oberauer et al. 2005:

Hence, we might conclude—setting aside the above mentioned caveats for such analyses—that [Working Memory Capacity] and g share the largest part of their variance (72%) but are not identical. [...] Our methodological critique notwithstanding, we believe that Ackerman et al. (2005) are right in claiming that WMC is not the same as g or as gf or as reasoning ability. Our argument for a distinction between these constructs does not hinge on the size of the correlation but on a qualitative difference: On the side of intelligence, there is a clear factorial distinction between verbal and numerical abilities (e.g., Süß et al., 2002); on the side of WMC, tasks with verbal contents and tasks with numerical contents invariably load on the same factor (Kyllonen & Christal, 1990; Oberauer et al., 2000). This mismatch between WMC and intelligence constructs not only reveals that they must not be identified but also provides a hint as to what makes them different. We think that verbal reasoning differs from numerical reasoning in terms of the knowledge structures on which they are based: Verbal reasoning involves syntax and semantic relations between natural concepts, whereas numerical reasoning involves knowledge of mathematical concepts. WMC, in contrast, does not rely on conceptual structures; it is a part of the architecture that provides cognitive functions independent of the knowledge to which they are applied. Tasks used to measure WMC reflect this assumption in that researchers minimize their demand on knowledge, although they are bound to never fully succeed in that regard. Still, the minimization works well enough to allow verbal and numerical WM tasks to load substantially on a common factor. This suggests that WMC tests come closer to measuring a feature of the cognitive architecture than do intelligence tests.

Replies from: Dustin

↑ comment by Dustin · 2012-05-15T18:49:33.680Z · LW(p) · GW(p)

Now this has me wondering if it's possible to increase your own working memory via practice or some other means. I shall go do some reading on the matter.

Thanks for the links!

↑ comment by jsteinhardt · 2012-05-13T14:43:07.487Z · LW(p) · GW(p)

My admittedly uninformed impression is that the state of knowledge about working memory is pretty limited, at least relative to the claims you are making. Do you think you could clarify somewhat, e.g. either show that our knowledge is not limited, or that you don't need any precise knowledge about working memory to support your claims? In particular, I have not seen convincing evidence that working memory even exists, and it's unclear what a "chunk" is, or how we manipulate them (perhaps manipulation costs grow exponentially with the number of chunks).

Replies from: kalla724

↑ comment by kalla724 · 2012-05-13T21:19:24.717Z · LW(p) · GW(p)

Whether "working memory" is memory at all, or whether it is a process of attentional control as applied to normal long-term memory... we don't know for sure. So in that sense, you are totally right.

But the exact nature of the process is, perhaps strangely, unimportant. The question is whether the process can be enhanced, and I would say that the answer is very likely to be yes.

Also, keep in mind that the working memory enhancement scenario is just one I pulled from thin air as an example. The larger point is that we are rapidly gaining the ability to non-invasively monitor activities of single neuronal cells (with fluorescent markers, for instance), and we are, more importantly, gaining the ability to control them (with finely tuned and targeted optogenetics). Thus, reading and writing into the brain is no longer an impossible hurdle, requiring nanoimplants or teeny-tiny electrodes (with requisite wiring). All you need are optical fibers and existing optogenetic tools (in theory, at least).

To generalize the point even further: we have the tools and the know-how with which we could start manipulating and enhancing existing neural networks (including those in human brains). It would be bad, inefficient, and come with a great deal of side effects, since we don't understand the underlying architecture well enough to really know what we are doing - but we could still theoretically begin today, if for some reason we decided to (and lost our ethics along the way). On the other hand, we don't have a clue how to build an AGI. Regardless of any ethical or eschatonic concerns, we simply couldn't do it even if we wanted to. My personal estimate is, therefore, that we will make it to the first goal far sooner than we make it to the second one.

↑ comment by jacob_cannell · 2012-06-12T18:31:14.101Z · LW(p) · GW(p)

You can hold, say, seven items in your mind while considering any subject. This vastly limits your ability to consider any complex system.

Really? A dubious notion in the first place, but untrue by the counterexamples of folks who go above 4 in dual n-back.

You seem to have a confused fantastical notion of working memory ungrounded in neuroscientific rigor. The rough analogy I have heard is that working memory is a coarse equivalent of registers, but this doesn't convey the enormity of the items brains hold in each working memory 'slot'. Nonetheless, more registers does not entail superpowers.

Alternate example: a novice chess player has to look at every piece, think about likely moves of every one, likely responses, etc. She becomes overwhelmed very quickly. An expert chess player quickly focuses on learned series of moves, known gambits and visible openings, which allows her to see several steps ahead.

Chess players increase in ability over time equivalent to an exponential increase in algorithmic search performance. This increase involves hierarchical pattern learning in the cortex. Short term working memory is more involved in maintaining a stack of moves in the heuristic search algorithm humans use (register analogy).

↑ comment by private_messaging · 2012-06-12T14:20:27.144Z · LW(p) · GW(p)

Well, my opinion is that there already are such people, with several times the working memory. The impact of that was absolutely enormous indeed and is what brought us much of the advancement in technology and science. If you look at top physicists or mathematicians or the like - they literally can 'cram "more math" into a thought than you would be able to otherwise', vastly more. It probably doesn't help a whole lot with economics and the like though - the depth of predictions is naturally logarithmic in the computational power or knowledge of initial state, so the payoffs from getting smarter, far from the movie Limitless, are rather low, and it is still primarily a chance game.

comment by taw · 2012-05-10T18:04:43.545Z · LW(p) · GW(p)

Existential risk reduction is a very worthy cause. As far as I can tell there are a few serious efforts - they have scenarios which by outside view have non-negligible chances, and in the case of many of these scenarios these efforts make a non-negligible difference to the outcome.

Such efforts are:

asteroid tracking
seed vaults
development of various ways to deal with potential pandemics (early tracking systems, drugs etc.) - this actually overlaps with "normal" medicine a lot
arguably, global warming prevention is a borderline issue, since there is a tiny chance of massive positive feedback loops that will make Earth nearly uninhabitable. These chances are believed to be tiny by modern climate science, but all chances for existential risk are tiny.

That's about the entire list I'm aware of (are there any others?)

And then there's a huge number of efforts which claim to do something based on existential risk, but either the theories behind the risk they're concerning themselves with, or the theories behind why their efforts are likely to help, are based on assumptions not shared by the vast majority of competent people.

All FAI-related stuff suffers from both of these problems - their risk is not based on any established science, and their answer is even less based in reality. If it suffered from only one of these problems it might be fixable, but as far as I can tell it is extremely unlikely to join the category of serious efforts ever.

The best claim those non-serious efforts can make is that (tiny chance that the risk is real) × (tiny chance the organization will make a difference) × (huge risk) is still a big number, but that's not a terribly convincing argument.

I'm under impression that we're doing far less than everything we can with these serious efforts, and we haven't really identified everything that can be dealt with with such serious effort. We should focus there (and on a lot of things which are not related to existential risk).

Replies from: Rain, RomeoStevens

↑ comment by Rain · 2012-05-10T21:10:51.949Z · LW(p) · GW(p)

Here is the list from Global Catastrophic Risks.

Replies from: taw

↑ comment by taw · 2012-05-10T22:16:44.816Z · LW(p) · GW(p)

Most of the entries on the list are not quantifiable even approximately, to within an order of magnitude. Of those that are (which is pretty much only "risks from nature" in Bostrom's system) many are still bad candidates for putting significant effort into, because:

we have few ways to deal with them (like nearby supernova explosions)
we have a lot of time and future will be better equipped to deal with them (like eventual demise of Sun)
they don't actually seem to get anywhere near civilization-threatening levels (like volcanoes)

About the only new risk I see on the list which can and should be dealt with is having some backup plans for massive solar flares, but I'm not sure what we can do about it other than putting some extra money into astrophysics departments so they can figure things out better and give us better estimates.

↑ comment by RomeoStevens · 2012-05-10T20:31:43.931Z · LW(p) · GW(p)

nuclear holocaust. biological holocaust. super eruptions whose ash blocks significant levels of sunlight.

Replies from: taw

↑ comment by taw · 2012-05-10T22:07:47.294Z · LW(p) · GW(p)

I understand that global thermonuclear war could cause serious damage, but I'm not aware of any credible efforts that can prove they're moving things in the right direction.

What do you mean by "biological holocaust"?

Super eruptions surely follow some kind of power law, and as far as I can tell (and we can be sure by extrapolating from the power law), they don't get anywhere remotely near levels of destroying all life on Earth.

And we sure know how to heat Earth significantly in no time - just release some of these into atmosphere. It will only increase temperature, not sunlight, so food production and such will still be affected, but we already produce way more food per capita than is needed to feed everyone, so even a pretty big reduction won't get anywhere near compromising food security for majority of people, let alone threatening to kill everyone.

Replies from: None, CronoDAS, RomeoStevens

↑ comment by [deleted] · 2012-05-11T04:34:16.158Z · LW(p) · GW(p)

I understand that global thermonuclear war could cause serious damage, but I'm not aware of any credible efforts that can prove they're moving things in the right direction.

http://en.wikipedia.org/wiki/New_START

This stuff, as slow and grinding as it is, does make a difference.

Replies from: taw

↑ comment by taw · 2012-05-11T07:03:27.297Z · LW(p) · GW(p)

There's no particular reason to believe this is going to make global thermonuclear war any less likely. Russia and the United States aren't particularly likely to start a global thermonuclear war anytime soon, and in a longer perspective any major developed country, if it wanted, could build a nuclear arsenal sufficient to make a continent uninhabitable within a few years.

There's also this argument that mutually assured destruction was somehow stabilizing and preventing nuclear warfare - the only use of nuclear weapons so far happened when the other side had no way to retaliate. I'm quite neutral on this - I'm unwilling to say that nuclear arms reductions either increase or decrease risk of global war (which will eventually turn nuclear or otherwise very nasty).

↑ comment by CronoDAS · 2012-05-12T20:28:02.131Z · LW(p) · GW(p)

Super eruptions surely follow some kind of power law, and as far as I can tell (and we can be sure by extrapolating from the power law), they don't get anywhere remotely near levels of destroying all life on Earth.

They don't have to destroy all life on earth to be existential risks. They just have to damage human civilization to the point where it can't recover; we've already used up basically all of the easily accessible, non-renewable natural resources; for example, a future civilization reduced to Roman Empire level technology would find itself with a severe shortage of exploitable ores - good luck running your empire without iron or copper!

Replies from: taw, JoshuaZ

↑ comment by taw · 2012-05-13T05:35:45.286Z · LW(p) · GW(p)

That reasoning is just extremely unconvincing, essentially 100% wrong and backwards.

Renewable energy available annually is many orders of magnitude greater than all fossil fuels we're using, and it has been used as primary source of energy for almost the entire history up to industrial revolution. Biomass for everything, animal muscle power, wind and gravity for water transport, charcoal for melting etc. were used successfully at massive scale before anybody even thought of oil or gas or made much use of coal.

Other than energy, most other resources - like ores - are trivially recyclable. If New Rome wanted iron and copper and so on they'd just need to head toward the nearest dump, and dig there. Amount of ores we dug out and made trivially accessible is ridiculously greater than what they had available.

Annual iron ore mining for example is 2.4 billion metric tons, or 1 kg per person per day. Annual steel production is 1.49 billion metric tons, or 220 kg per person per year. Every year (OK, some of that steel is from recycled iron). Vast majority of them would be easily extractable if civilization collapsed. If we went back to Roman levels of population, each Roman could easily extract tens or hundreds of tons of usable steel from just the stuff we extracted that their technology couldn't.
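The per-capita figures above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch, not part of the original comment: the tonnages are the ones quoted above, while the world population of 7 billion (`WORLD_POPULATION`) is an assumption, roughly the 2012 figure.

```python
# Rough check of the per-capita figures quoted above.
# WORLD_POPULATION is an assumption (about 7 billion in 2012), not from the comment.
WORLD_POPULATION = 7.0e9
KG_PER_TONNE = 1000.0


def per_capita_kg(annual_tonnes, population=WORLD_POPULATION):
    """Convert annual production in metric tons to kg per person per year."""
    return annual_tonnes * KG_PER_TONNE / population


# 2.4 billion metric tons of iron ore per year, expressed per person per day.
iron_ore_kg_per_day = per_capita_kg(2.4e9) / 365.0
# 1.49 billion metric tons of steel per year, expressed per person per year.
steel_kg_per_year = per_capita_kg(1.49e9)

print(f"iron ore: {iron_ore_kg_per_day:.2f} kg per person per day")
print(f"steel: {steel_kg_per_year:.0f} kg per person per year")
```

Under that population assumption the results land close to the rounded "1 kg per day" and "220 kg per year" figures; a slightly smaller population estimate moves both numbers up a little.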

The same applies to every other metal, and most non-metal resources. It doesn't apply to a few resources like phosphorus and helium, but they'll figure it out somehow.

And even if civilization "collapsed" it's not like our scientific and organizational knowledge would have disappeared, making it ridiculously easier to rebuild than it was to build in the first place.

Replies from: Mercurial, JoshuaZ, army1987

↑ comment by Mercurial · 2012-05-17T22:07:23.281Z · LW(p) · GW(p)

Okay, this has been driving me bonkers for years now. I keep encountering blatantly contradictory claims about what is "obviously" true about the territory. taw, you said:

Renewable energy available annually is many orders of magnitude greater than all fossil fuels we're using[...]

And you might well be right. But the people involved in transition towns insist quite the opposite: I've been explicitly told, for one example, that it would take the equivalent of building five Three Gorges Dams every year for the next 50 years to keep up with the energy requirements provided by fossil fuels. By my reading, these two facts cannot both be correct. One of them says that civilization can rebuild just fine if we run out of fossil fuels, and the other says that we may well hit something dangerously close to a whimper.

I'm not asking for a historical analysis here about whether we needed fossil fuels to get to where we are. I'd like clarification on a fact about the territory: is it the case that renewable forms of energy can replace fossil fuels without modern civilization having to power down? I'm asking this as an engineering question, not a political one.

Replies from: taw

↑ comment by taw · 2012-05-18T08:31:11.389Z · LW(p) · GW(p)

They are incorrect. Here's a helpful diagram of available energy.

Replies from: Mercurial

↑ comment by Mercurial · 2012-05-21T04:50:43.533Z · LW(p) · GW(p)

Can you pretty, pretty please tell me where this graph gets its information from? I've seen similar graphs that basically permute the cubes' labels. It would also be wonderful to unpack what they mean by "solar" since the raw amount of sunlight power hitting the Earth's surface is a very different amount than the energy we can actually harness as an engineering feat over the next, say, five years (due to materials needed to build solar panels, efficiency of solar panels, etc.).

And just to reiterate, I'm really not arguing here. I'm honestly confused. I look at things like this video and books like this one and am left scratching my head. Someone is deluded. And if I guess wrong I could end up wasting a lot of resources and time on projects that are doomed to total irrelevance from the start. So, having some good, solid Bayesian entanglement would be absolutely wonderful right about now!

Replies from: taw

↑ comment by taw · 2012-05-23T12:32:48.454Z · LW(p) · GW(p)

The diagram comes from Wikipedia (tineye says this) but it seems they recently started merging and reshuffling content in all energy-related articles, so I can no longer find it there.

That's total energy available of course, not any 5 year projection.

Solar is probably easiest to estimate by high school physics. Here's Wikipedia's.
Here are some wind power estimates. This depends quite significantly on our technology (see this for possible next step beyond current technology)
World energy consumption is here

Replies from: Mercurial

↑ comment by Mercurial · 2012-05-24T19:03:21.118Z · LW(p) · GW(p)

Thank you!

Do you happen to know anything about the claim that we're running out of the supplies we need to build solar panels needed to tap into all that wonderful sunlight?

Replies from: taw, private_messaging

↑ comment by taw · 2012-05-27T05:13:46.433Z · LW(p) · GW(p)

Solar panel prices are on long term downward trend, but in the short term they were very far from smooth over the last few years, having very rapid increases and decreases as demand and production capacity mismatched both ways.

This issue isn't specific to solar panels, all commodities from oil to metals to food to RAM chips had massive price swings over the last few years.

There's no long term problem since we can make solar panels from just about anything - materials like silicon are available in essentially infinite quantities (manufacturing capacity is the issue, not raw materials), and for thin film you need small amounts of materials.

↑ comment by private_messaging · 2012-05-27T06:04:39.755Z · LW(p) · GW(p)

Usual crap likely originating from pro-nuclear activists. Nuclear is the only green energy source which can run out of an essential material (zirconium) for real and couldn't easily substitute anything for zirconium. edit: note. I do see nuclear power as in principle green, but I have also seen plenty of pro-nuclear articles which diss all other green energy sources on bs grounds and promote misconceptions.

The solar panels use silicon and very very tiny amounts of anything else. The silicon is everywhere.

There's a similar claim that wind turbine construction would run out of neodymium (which is used in magnets), never mind that neodymium magnets are not essential and are only used because neodymium is relatively cheap and increases efficiency by a couple of percent while cutting down on the amount of necessary copper and iron. I.e. run out of neodymium, no big deal, the price of wind energy will rise a few percent.

↑ comment by JoshuaZ · 2012-05-13T21:25:00.568Z · LW(p) · GW(p)

Renewable energy available annually is many orders of magnitude greater than all fossil fuels we're using, and it has been used as primary source of energy for almost the entire history up to industrial revolution. Biomass for everything, animal muscle power, wind and gravity for water transport, charcoal for melting etc. were used successfully at massive scale before anybody even thought of oil or gas or made much use of coal.

Right, and the energy demands of those societies were substantially lower than those later societies which used oil and coal. The industrial revolution would likely not have been possible without the presence of oil and coal in easily accessible locations. Total energy isn't all that matters- the efficiency of the energy, ease of transport, and energy density all matter a lot also. In those cases, fossil fuels are substantially better and more versatile.

Replies from: taw

↑ comment by taw · 2012-05-14T06:45:49.258Z · LW(p) · GW(p)

This argument is only convincing to people who never bothered to look at timeline of historical events in technology. No country had any significant amount of coal mining before let's say UK in 1790-ish and forwards, and even there it was primarily to replace wood and charcoal.

Technologies we managed to build by then were absolutely amazing. Until 1870 the majority of locomotives in the USA operated on wood, canal transport was as important as railroads and was even less dependent on dense fuels, so transportation was perfectly fine.

Entire industries operated on water power just fine for decades before coal or electricity.

Just look at how well science and technology were doing before coal came about.

Even mentioning oil in this context is pretty ridiculous - it only came to importance by about 1950-ish. Cars can be modified to run on wood of all things without much difficulty, and it happened on mass scale in many economies in war conditions.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-14T14:33:26.722Z · LW(p) · GW(p)

Most of your analysis seems accurate, but there do seem to be some issues.

While you are correct that until 1870 the majority of locomotives in the USA operated on wood, the same article you linked to notes that this was phased out as the major forests were cut down and demand went up. This is not a long-term sustainable process that was converted over to coal simply because it was more efficient. Even if one had forests grow back to pre-industrial levels (a not completely unlikely possibility if most of humanity has been wiped out), you don't have that much time to use wood on a large scale before you need to switch over.

You also are underestimating the transformation that occurred in the second half of the 19th century. In particular, while it is true that industries operated on water power, the total number of industries, and the energy demands they made were much smaller. Consider for example chip making plants which have massive energy needs. One can't run a modern economy on water power because there wouldn't be nearly enough water power to go around. This is connected to how while in the US in the 1870s and 1880s many of the first power plants were hydroelectric, support of a substantial grid required the switch to coal which could both provide more power and could have plants built at the most convenient location. This is discussed in Maggie Koerth-Baker's book "Before the Lights Go Out" which has a detailed discussion about the history of the US electric grids.

And while it is true that no country had major coal mining before 1790 by modern standards, again the replacement of wood and charcoal occurred to a large extent because they were running out of cheap wood, and because increased industry substantially benefited from the increased energy density. And even well before that, coal was used already in the late Middle Ages for specialized purposes, such as metal working with metals that required high temperatures. While not a large industry, it was large enough that you had coal regulation in the 1300s, and by the 1620s it was economically practical to have coal mines that included large scale drainage and pumping systems so one could mine coal well below sea level.

Even mentioning oil in this context is pretty ridiculous - it only came to importance by about 1950-ish.

It is relevant in this context in that it became important in part due to the rising price of coal (as easy to access coal had been depleted). It isn't a coincidence that in World War II, a major goal of the German invasion of Russia was to get access to the Baku oil fields.

Replies from: taw

↑ comment by taw · 2012-05-14T15:55:01.722Z · LW(p) · GW(p)

Wood ran out because forests weren't properly managed, not because photosynthesis is somehow insufficiently fast at growing forest - and in any case there are countless agricultural alternative energy sources like ethanol from sugar cane.

In 1990 3.5 billion m^3 of wood were harvested. With density of about 0.9 t/cubic meter, and energy of about 15 MJ/kg, that's about 47 trillion MJ (if we burned it all, which we're not going to).

All coal produced in 1905 was about 0.9 billion tons, or about 20 trillion MJ.

In 2010 worldwide biofuel production reached 105 billion liters (or 2.4 trillion MJ). But that's very modest amount - according to the International Energy Agency, biofuels have the potential to meet more than a quarter of world demand for transportation fuels by 2050. And that's not any new technology, we knew how to extract alcohol from plants thousands of years ago.
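The three energy totals above can be reproduced with a short back-of-the-envelope script. This is a sketch under stated assumptions, not the commenter's own calculation: the wood density is taken as 900 kg per cubic meter (the value the 47 trillion MJ total implies), and the coal and biofuel energy densities are round numbers chosen to match the quoted totals, not figures given in the comment.

```python
# Back-of-the-envelope energy totals for the wood, coal, and biofuel figures above.
# Assumptions (not stated in the comment):
#   - wood density of 900 kg per cubic meter, implied by the 47 trillion MJ total
#   - coal at 22 MJ/kg, implied by 0.9 billion tons ~ 20 trillion MJ
#   - biofuel at 23 MJ/liter, implied by 105 billion liters ~ 2.4 trillion MJ
WOOD_DENSITY_KG_PER_M3 = 900.0
WOOD_MJ_PER_KG = 15.0          # quoted in the comment
COAL_MJ_PER_KG = 22.0          # assumed
BIOFUEL_MJ_PER_LITER = 23.0    # assumed
KG_PER_TONNE = 1000.0
TRILLION = 1e12

# 3.5 billion cubic meters of wood harvested in 1990.
wood_mj = 3.5e9 * WOOD_DENSITY_KG_PER_M3 * WOOD_MJ_PER_KG
# 0.9 billion tons of coal produced in 1905.
coal_mj = 0.9e9 * KG_PER_TONNE * COAL_MJ_PER_KG
# 105 billion liters of biofuel produced in 2010.
biofuel_mj = 105e9 * BIOFUEL_MJ_PER_LITER

for name, value in [("wood 1990", wood_mj), ("coal 1905", coal_mj), ("biofuel 2010", biofuel_mj)]:
    print(f"{name}: {value / TRILLION:.1f} trillion MJ")
```

The ratio `wood_mj / coal_mj` comes out a bit above 2, which is the comparison the comment leans on: the 1990 wood harvest alone would carry roughly twice the energy of all coal mined in 1905, if it were all burned.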

We don't have enough hydropower to cover all our use, but it could cover very large fraction of our needs, definitely enough to jumpstart civilization, and there's many times more of any of - wind, solar, biomass, or nuclear power than we need - none of them fully available to any new civilization.

The fact that we used something for a certain purpose is no evidence that it was necessary for this purpose, it's just evidence that we're not total idiots to leave a resource unused. Many alternatives which would work nearly just as well were available in pretty much every single case.

Replies from: Furslid

↑ comment by Furslid · 2012-05-14T16:21:53.500Z · LW(p) · GW(p)

The key point of economics you are missing here is the price of wood was driven up by increased demand. Wood never ran out, but it did become so expensive that some uses became uneconomical. This allowed substitution of the previously more expensive coal. This did not happen because of poor management of forests. Good management of forests might have encouraged it, by limiting the amount of wood taken for burning.

This is especially true because we are not talking about a modern globalized economy where cheap sugar from Brazil, corn from Kansas, or pine from the Rockies can come into play. We are talking about the 19th century in industrializing Europe. The energy use of England could not have been met by better forestry. All stats from 200 years later are a red herring.

If there were other alternatives that were almost as good, please produce them. Not now, but at the time being discussed.

Replies from: taw

↑ comment by taw · 2012-05-15T08:37:52.537Z · LW(p) · GW(p)

Everything you say is ahistorical nonsense. Transatlantic trade on a massive scale was happening back in the 19th century, so wood import from the New World (or Scandinavia, or any other place) could have easily happened. Energy density of charcoal and of coal are very similar, so one could just as easily be imported as the other.

Or industries could have been located closer to major sources of wood, the same way they were located closer to major sources of coal. This was entirely possible.

Replies from: Furslid

↑ comment by Furslid · 2012-05-16T15:21:51.292Z · LW(p) · GW(p)

Would you mind explaining how what I have said is ahistorical nonsense?

Yes, at the end of the 18th century there was transatlantic trade. However, it was not cheap. It was sail powered and relatively expensive compared to modern shipping. Coal was generally not part of this trade. Shipping was too expensive. English industry used English mined coal. Same with American and German industry. If shipping coal was too expensive, why would charcoal be economical? You have jumped from "transportation existed" to "the costs of transportation can be ignored."

As for why industries weren't located by sources of wood. I can think of several reasons.
First is that they were sometimes located by sources of wood, and that contributed to the deforestation.

The second is that there aren't sources of wood as geographically concentrated as sources of coal. There is no 10 mile square of wood producing district that can provide as much energy consistently over time as a 10 mile square of coal mining district.

Third is that timber was inconveniently located. There were coal producing areas that were better located for shipping and labor than timber producing areas. Are you seriously suggesting that an English owned factory with English labor might have set up in rural Sweden rather than Birmingham as an almost as good alternative?

I thought that we would have been total idiots to leave a resource like coal unused.

↑ comment by A1987dM (army1987) · 2012-05-13T09:28:54.075Z · LW(p) · GW(p)

And even if civilization "collapsed" it's not like our scientific and organizational knowledge would have disappeared, making it ridiculously easier to rebuild than it was to build in the first place.

I'm a bit sceptical about that. Compare the technological level of Europe in AD 100 with that of Europe in AD 700.

Replies from: taw

↑ comment by taw · 2012-05-14T06:49:42.785Z · LW(p) · GW(p)

Which part of "Europe" are you talking about? Western peripheries of Roman Empire got somewhat backwards, and that was after massive demographic collapse of late Antiquity, the rest of Europe didn't really change all that drastically, or even progressed quite a lot.

↑ comment by JoshuaZ · 2012-05-12T20:40:00.044Z · LW(p) · GW(p)

good luck running your empire without iron or copper!

The remains of the prior civilization would provide quite a bit. Indeed, for some metals this would be even easier. Aluminum for example requires a lot of technology to refine, but if one has already refined aluminum lying around one can easily make things out of it. A more serious problem would be the substantial reduction in easily accessible coal and oil. The remaining fossil fuels require a lot more technology to access.

Replies from: Alsadius

↑ comment by Alsadius · 2012-05-13T03:54:49.037Z · LW(p) · GW(p)

Yeah, this is one of the scarier future prospects I've heard kicking around. We can really only bootstrap an industrial civilization once, because the available energy simply isn't going to be there next time. We'd better get it right. Fortunately, we've done pretty well on that score thus far, but it's one of those lingering distant-future fears.

↑ comment by RomeoStevens · 2012-05-10T22:13:04.972Z · LW(p) · GW(p)

pandemics, man-made or natural.

Replies from: taw

↑ comment by taw · 2012-05-10T22:22:43.468Z · LW(p) · GW(p)

Yeah, I've mentioned pandemics already.

I'm not terribly willing to treat them as an "existential" risk, since countless pandemics already happened and for natural reasons they never actually kill the entire population.

And how awesomely we've dealt with SARS is a good data point showing that pandemics might actually be under control now. At least we should have far more confidence in our ability to deal with pandemics than in our ability to deal with just about any other existential threat.

And one nice side effect of just plain old medicine is reduction of this existential risk, even without any efforts specifically towards handling existential risk. Every antibiotic, every antiviral, every new way of keeping patients alive longer, every diagnostic improvement, every improvement in hygiene in poor countries etc. - they all make pandemics less likely and more manageable.

Replies from: JoshuaZ, RomeoStevens

↑ comment by JoshuaZ · 2012-05-11T02:06:56.327Z · LW(p) · GW(p)

I'm not terribly willing to treat them as an "existential" risk, since countless pandemics already happened and for natural reasons they never actually kill the entire population.

Most major pandemics have occurred before modern transport was common. The presence of easy air travel makes a serious pandemic more problematic. And in fact if one looks at emergent diseases in the last sixty years, such as HIV, one sees that they are effectively taking advantage of the ease of transport in the modern world.

Replies from: taw

↑ comment by taw · 2012-05-11T06:59:07.545Z · LW(p) · GW(p)

HIV emerged before modern medicine developed. It was discovered in 1981 - almost prehistory by medical standards, but it was actually transferred to humans somewhere in the late 19th century. It wreaks the most havoc in places which are extremely far from modern medicine as well; in developed countries HIV is a fairly minor problem.

SARS is a much better example of a new disease and how modern medicine can deal with it.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-11T14:30:54.039Z · LW(p) · GW(p)

Even in Africa, HIV has taken advantage of modern transport. Migrant workers are a major cause of HIV spread in sub-Saharan Africa. This has advanced to the point where new road building projects think about what they will do to disease transmission. These laborers and the like aren't just walking- the possibility of such migrant labor is connected to the fact that even in the developing world, buses exist.

↑ comment by RomeoStevens · 2012-05-10T22:56:36.327Z · LW(p) · GW(p)

Oh, I somehow skipped seeing that in the OP. I don't think our ability to deal with mundane bugs has much transferability to our ability to deal with super bugs.

Replies from: taw

↑ comment by taw · 2012-05-11T00:21:24.385Z · LW(p) · GW(p)

There's really no such thing as a "super bug". All organisms follow the same constraints of biology and epidemiology. If there was even some magical "super bug" it would infect everything of any remotely compatible species, not be constrained to one species, and small subset of cells in it.

We might not have any drugs ready for a particular infection, but we didn't have any for SARS, it was extremely infectious, and extremely deadly, and it worked perfectly fine in the end. We have tools like quarantine, detection etc. which work against any disease known or unknown.

Medicine made a massive progress since then - mass sequencing of infectious genomes for quick reaction time is now far more practical, and we might soon even get broad spectrum antivirals.

And we've eradicated two diseases already (smallpox, rinderpest) with two more being very close to eradication (polio, dracunculiasis), and it's not like anybody has any intentions of stopping the total at 4. We'll keep eradicating diseases, even if it takes a decade or two for each such attempt. Every time we manage to do that, there's one less source of potential pandemic.

I cannot really imagine how it could be going better than that.

This doesn't fully apply to hypothetical manmade pandemics, but currently we don't really know how to make such a thing (the best we can do is modify an existing disease to be a bit more nasty; creating diseases de novo is far beyond our capabilities), nobody has any particular desire to do so, and any broad spectrum countermeasures we develop against natural diseases will likely at least partly apply against manmade diseases in any case.

Replies from: RomeoStevens, Alsadius

↑ comment by RomeoStevens · 2012-05-11T00:26:18.772Z · LW(p) · GW(p)

AFAIK nothing precludes extremely lethal bugs with long incubation periods. As for "nobody has any particular desire to", I hope you are right.

Replies from: taw

↑ comment by taw · 2012-05-11T06:55:31.303Z · LW(p) · GW(p)

Except the fact they wouldn't be particularly lethal.

If 100% of humans had HIV, it would probably make most countries disregard patent laws on a few drugs, and human life spans would get shorter by like 5-10 years on average.

This should keep things in perspective.

Replies from: CronoDAS

↑ comment by CronoDAS · 2012-05-12T20:32:25.946Z · LW(p) · GW(p)

If 100% of humans had HIV, it would probably make most countries disregard patent laws on a few drugs, and human life spans would get shorter by like 5-10 years on average.

My Google-fu seems to indicate a drop of about 20 years.

Replies from: NancyLebovitz, taw, Alsadius

↑ comment by NancyLebovitz · 2012-05-12T23:03:35.912Z · LW(p) · GW(p)

I bet the statistics are assuming nothing else changes. It's plausible to me that a society where people are generally sicker and shorter-lived will be poorer, and there will be a lot of additional deaths due to people being able to produce less stuff. It's also conceivable that the lower population will be an advantage because of less competition for natural resources and already-existing durable goods.

Probably both tendencies will be in play. This makes prediction difficult.

Replies from: taw

↑ comment by taw · 2012-05-13T05:18:18.447Z · LW(p) · GW(p)

The thing is countries would not really be poorer. Properly treated HIV isn't much worse than smoking (I mean the part before lung cancer) or diabetes for most of people's lives. Countries differ a lot on these already, without any apparent drastic differences in economic outcomes.

By the time people are already very old they might live a few years less, but they're not really terribly productive at that point anyway.

↑ comment by taw · 2012-05-13T05:16:20.180Z · LW(p) · GW(p)

That's already old data by standards of modern progress of medicine, and groups that tend to get HIV are highly non-random and are typically engaged in other risky activities like unprotected promiscuous sex and intravenous drug use, and are poorer and blacker than average, so their baseline life expectancy is already much lower than population average.

↑ comment by Alsadius · 2012-05-13T03:49:25.613Z · LW(p) · GW(p)

And remember, that's ~20 years with ~40% infection rates, not 100%.

↑ comment by Alsadius · 2012-05-13T03:48:30.951Z · LW(p) · GW(p)

There's really no such thing as a "super bug". All organisms follow the same constraints of biology and epidemiology. If there was even some magical "super bug" it would infect everything of any remotely compatible species, not be constrained to one species, and small subset of cells in it.

The term does not imply magic, it merely implies nasty. Smallpox and Spanish flu were both superbugs in every meaningful sense, but they worked on DNA just like everything else. The question is not whether someone builds a flesh-eating nanite our immune system can't handle or whatever, it's just about whether an infectious disease comes along that's worse than our medical system can cope with. That is a much lower bar.

Replies from: taw

↑ comment by taw · 2012-05-13T05:12:09.666Z · LW(p) · GW(p)

Smallpox wasn't that bad if you look at statistics, and Spanish flu happened at a time when humans had been murdering each other at an unprecedented rate and normal society was either suspended or collapsed altogether everywhere.

Usually the chance of getting infected is inversely correlated with severity of symptoms (by laws of epidemiology), and nastiness is inversely correlated with broad range (by laws of biology), so you have diseases that are really extreme by any one criterion, but they tend to be really weak by some other criterion.

And in any case we're getting amazingly better at this.

Replies from: Alsadius

↑ comment by Alsadius · 2012-05-15T01:02:28.302Z · LW(p) · GW(p)

Not that bad?

The disease killed an estimated 400,000 Europeans per year during the closing years of the 18th century (including five reigning monarchs),[7] and was responsible for a third of all blindness.[3][8] Of all those infected, 20–60%—and over 80% of infected children—died from the disease.[9] Smallpox was responsible for an estimated 300–500 million deaths during the 20th century.[10][11][12] As recently as 1967, the World Health Organization (WHO) estimated that 15 million people contracted the disease and that two million died in that year.[13]

I agree that there were aggravating factors, particularly in the Spanish flu case, and that tradeoffs between impact and spread generally form a brake. But nasty diseases do exist, and our medical science is sufficiently imperfect that the possibility of one slipping through even in the modern world is not to be ignored. Fortunately, it's a field we're already pouring some pretty stupendous sums of money into, so it's not a risk we're likely to be totally blindsided by, but it's one to keep in mind.

Replies from: taw

↑ comment by taw · 2012-05-15T08:33:08.584Z · LW(p) · GW(p)

The disease killed an estimated 400,000 Europeans per year during the closing years of the 18th century

So? 400,000 people a year is what % of total mortality?

As recently as 1967, the World Health Organization (WHO) estimated that 15 million people contracted the disease and that two million died in that year.

In an important way diseases don't kill people, poverty, hunger, and lack of sanitation kills people. The deaths were almost all happening in the poorest, and the most abused parts of the world - India and Africa.

Replies from: Alsadius

↑ comment by Alsadius · 2012-05-15T23:28:07.855Z · LW(p) · GW(p)

So? 400,000 people a year is what % of total mortality?

World population in 1800 was about a billion, and we'll ballpark 1/5th of the population being in Europe and 1/40th of them dying per year (which is probably better life expectancy than the world had, but about right for Europe). That means about 5 million deaths per year, so 400k would be 8%. And it's not like smallpox was the only plague around, either.
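The ballpark above can be checked with a short sketch. Every figure here is the rough assumption stated in the comment (a billion people, a fifth of them in Europe, one in forty dying per year), not historical data, and the variable names are only illustrative.

```python
from fractions import Fraction

# Rough assumptions taken from the comment above, not measured data.
WORLD_POPULATION_1800 = 1_000_000_000
EUROPE_SHARE = Fraction(1, 5)        # assumed fraction of people living in Europe
ANNUAL_DEATH_RATE = Fraction(1, 40)  # assumed fraction of Europeans dying per year
SMALLPOX_DEATHS_PER_YEAR = 400_000   # figure quoted earlier in the thread

# Total European deaths per year under these assumptions.
total_deaths = WORLD_POPULATION_1800 * EUROPE_SHARE * ANNUAL_DEATH_RATE

# Smallpox as a share of all European deaths.
smallpox_share = Fraction(SMALLPOX_DEATHS_PER_YEAR) / total_deaths

print(f"European deaths per year: {int(total_deaths):,}")    # 5,000,000
print(f"Smallpox share of deaths: {float(smallpox_share):.0%}")  # 8%
```

Exact fractions are used so the result does not depend on floating-point rounding. Changing either assumed fraction shows how sensitive the 8% figure is to the ballpark inputs.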

In an important way diseases don't kill people, poverty, hunger, and lack of sanitation kills people. The deaths were almost all happening in the poorest, and the most abused parts of the world - India and Africa.

In an even more important way, diseases kill people. Yes, if smallpox came back today (or a non-vaccinable equivalent) it'd kill a lot fewer people than it used to because of better quarantine, sanitation, and all that fun stuff. Same way AIDS is a minor problem here and a world-ender in sub-Saharan Africa. But it's not like we lack for infectious disease in the developed world.

comment by ChrisHallquist · 2012-05-11T04:26:41.598Z · LW(p) · GW(p)

I'm mildly surprised that this post has not yet attracted more criticism. My initial reaction was that criticisms (1) and (2) seemed like strong ones, and I almost posted a comment saying so. Then I thought, "I should look for other people discussing those points and join that discussion." But after doing that, I feel like people haven't given much in the way of objections to (1) and (2). Is that perception correct? Do lots of other people agree with them?

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-11T06:51:02.246Z · LW(p) · GW(p)

I think that many of Holden's stronger points call for longer, more carefully worked out answers than a dashed-off comment.

Replies from: Endovior

↑ comment by Endovior · 2012-05-12T07:24:51.663Z · LW(p) · GW(p)

Exactly. This is criticism intelligent enough that it requires an intelligent response to be meaningful. It falls far enough outside the usual bounds of discussion here that there aren't canned arguments to be recited. That said, once a proper counterargument is made, expect to see a whole lot more people rehashing the same counterarguments without adding much in the way of substantive content; once there's a canned argument to fall back on, there are a lot of people who will do just that.

Of course, before SI's formal reply is posted (and/or before you end up reading it), you now have a golden opportunity to formulate your own criticisms independently. Yes, the arguments presented in this post are quite strong ones. Are you capable of developing any counterarguments that could reasonably stand against them?

comment by timtyler · 2012-05-10T23:21:14.069Z · LW(p) · GW(p)

I believe that the probability that SI's concept of "Friendly" vs. "Unfriendly" goals ends up seeming essentially nonsensical, irrelevant and/or unimportant from the standpoint of the relevant future is over 90%.

It seems like an odd thing to say. Why take the standpoint of the "relevant future"? History is written by the winners - but that doesn't mean that their perspective is shared by us. Besides the statement is likely wrong - "Friendly" and "Unfriendly" as defined by Yudkowsky are fairly reasonable and useful concepts.

comment by JGWeissman · 2012-05-10T18:16:19.463Z · LW(p) · GW(p)

Regarding tool versus agent AGIs, I think the desired end game is still a Friendly Agent AGI. I am open to tool AIs being useful in the path to building such an agent. Similar ideas advocated by SI include use of automated theorem provers in formally proving Friendliness, and creating a seed AI to compute the Coherent Extrapolated Volition of humanity and build an FAI with the appropriate utility function.

↑ comment by CronoDAS · 2012-08-09T01:16:53.007Z · LW(p) · GW(p)

/me shrugs

Maybe Ignaz Semmelweis would have been a better example?

I also found a list of "crackpots who were right" by Googling.

comment by JoshuaFox · 2012-05-17T15:09:00.881Z · LW(p) · GW(p)

As a supporter and donor to SI since 2006, I can say that I had a lot of specific criticisms of the way that the organization was managed. I was surprised that on many occasions management did not realize the obvious problems and fix them.

But the current management is now recognizing many of these points and resolving them one by one. If this continues, SI's future looks good.

comment by MarkusRamikin · 2012-05-10T15:47:35.538Z · LW(p) · GW(p)

Not a big deal, but for me your "more" links don't seem to be doing anything. Firefox 12 here.

EDIT: Yup, it's fixed. :)

Replies from: HoldenKarnofsky, gwern, RobertLumley

↑ comment by HoldenKarnofsky · 2012-05-10T16:12:28.639Z · LW(p) · GW(p)

Thanks for pointing this out. The links now work, though only from the permalink version of the page (not from the list of new posts).

↑ comment by gwern · 2012-05-10T16:05:56.676Z · LW(p) · GW(p)

Ditto. The anchors they point to don't seem to exist.

↑ comment by RobertLumley · 2012-05-10T16:02:25.226Z · LW(p) · GW(p)

They don't work for me in Chrome 18.

Edit: I didn't think anchor tags were possible in LW posts, but I could be completely off on this. At least, I've never seen them before

Replies from: gwern

↑ comment by gwern · 2012-05-10T18:09:54.336Z · LW(p) · GW(p)

Anchor tags are possible in LW, but they require additional work. (The only way I know of is editing the raw HTML.)

comment by Kenny · 2012-05-12T17:59:16.171Z · LW(p) · GW(p)

I haven't read the entire post yet, but here are some thoughts I had after reading through about the first ten paragraphs of "Objection 2 ...". I think the problem with assuming, or judging, that tool-AI is safer than agent-AI is that a sufficiently powerful tool-AI would essentially be an agent-AI. Humans already hack other humans without directly manipulating each other's physical persons or environments, and those hacks can drastically alter their own or others' persons and (physical) environments. Sometimes the safest course is not to listen to poisoned tongues.

comment by jmmcd · 2012-05-10T21:43:37.688Z · LW(p) · GW(p)

I feel that the relevance of "Friendliness theory" depends heavily on the idea of a "discrete jump" that seems unlikely and whose likelihood does not seem to have been publicly argued for.

It has been. An AI foom could be fast enough and/or sufficiently invisible in the early stages that it's practically discrete, to us. So the AI-foom does have relevance, contra

I believe I have read the vast majority of the Sequences, including the AI-foom debate, and that this content - while interesting and enjoyable - does not have much relevance for the arguments I've made.

comment by drethelin · 2012-05-10T20:59:45.783Z · LW(p) · GW(p)

As a separate point, people talk about AI friendliness as a safety precaution, but I think an important thing to remember is a truly friendly self improving AGI would probably be the greatest possible thing you could do for the world. It's possible the risk of human destruction from the pursuit of FAI is larger than the possible upside, but if you include the FAI's ability to mitigate other existential risks I don't think that's the case.

↑ comment by cousin_it · 2012-08-09T11:02:35.408Z · LW(p) · GW(p)

Can you summarize the difference?

Replies from: private_messaging

↑ comment by private_messaging · 2012-08-09T17:24:16.432Z · LW(p) · GW(p)

Ok, I decided I'll reply to all comments on my comments that I consider to be good.

Can you summarize the difference?

The example was Ignaz Semmelweis. He had actual empirical data, he was a medical doctor, the hypothesis could have been easily tested by, you know, washing your hands.

What's about him that makes him an example of someone pattern matching to a crackpot? Just the opposition to his theory, and his reaction to the opposition, understandable for any moral person of his beliefs.

edit: Note that I did not look particularly closely. If I look particularly closely, that's when he would make no sense - cadaverous particles? Unknown cadaverous material? Dead matter makes people dead? Or had he gotten it right - little living things? How is that even possible? Looking from too far, you only see that his view is not accepted. Looking very closely, you see that it doesn't make a lot of sense. But looking at the intermediate level, you see that he has data, and the theory is testable rather than a collection of excuses. You also see that the guy is for sure not merely doing this to make himself a living.

↑ comment by CronoDAS · 2012-08-06T18:46:45.406Z · LW(p) · GW(p)

That is ignored, pattern matching is not good enough for you, you overcame pattern matching.

I wouldn't say that. "This looks cranky, it's probably not worth investigating further" is usually a pretty good heuristic. And, as you say, unless you actually know enough about the field to be close to an expert yourself, it's often very hard to tell the difference between a logically consistent crank argument with no blatantly obvious mistakes and an argument for something that's actually correct. On the other hand, from the outside, people with minority views that are eventually vindicated also tend to look somewhat like cranks. So the only really reliable way to tell the difference between a crank and someone who should be taken seriously is to have someone who actually is an expert, and so knows enough to find the non-obvious flaws, look at the arguments.

(Incidentally, the "energy catalyzer" fails the "no obvious problems" test; if you say you have a working device and then won't let people perform independent tests on it, that's an obvious problem.)

comment by private_messaging · 2012-05-13T22:40:11.871Z · LW(p) · GW(p)

It is assumed that the lack of ability to persuade Holden results from a lack of good arguments in support of rationally held beliefs. It appears to me that the lack of good arguments is a result of the lack of a rational basis for the beliefs themselves. (The same goes for meta-beliefs that are not conventionally substantiated.)

comment by timtyler · 2012-05-10T23:02:37.341Z · LW(p) · GW(p)

I thought objections 1 and 2 were bogus. I thought Holden would be better off steering away from the more technical arguments and sticking to the line that these folk don't have a clearly-argued case regarding them doing a lot of good.

Replies from: Endovior

↑ comment by Endovior · 2012-05-12T16:07:02.112Z · LW(p) · GW(p)

What, really? You don't have anything specific or technical to say about the argument, you just find the argument "bogus" and suggest that the author doesn't know what he's talking about, without actually making a counterpoint of your own? I felt the first point was particularly valid... FAI is, after all, a really hard problem, and it is a fair point to ask why any group thinks it has the capacity to solve it perfectly on the first try, or to know that its solution would work short of testing it. The second, on the other hand, is an interesting technical question that hasn't been much expounded upon; it may very well prove to be a viable avenue of research, or at least a stepping stone along the way.

To dismiss such arguments as "bogus" speaks worse of you than the arguments themselves.

Replies from: timtyler

↑ comment by timtyler · 2012-05-13T00:00:14.661Z · LW(p) · GW(p)

In fact, I did previously post some more specific criticisms here.

This is the rather-obvious rebuttal to point 1.

It is often a useful contribution for someone to assess an argument without necessarily countering its points.

Replies from: amcknight

↑ comment by amcknight · 2012-07-03T01:37:55.314Z · LW(p) · GW(p)

It is often a useful contribution for someone to assess an argument without necessarily countering its points.

Not really.

comment by Peterdjones · 2013-01-18T14:07:34.297Z · LW(p) · GW(p)

It isn't clear whether AGI would be as powerful as SI's views imply.

Yes. There's something weird going on there. EY seems to want to constrain AI in various ways -- to be friendly, to be Bayesian and so on -- but how, then, is the "G" justified? Human intelligence is general enough to consider and formulate multiple theories of probability. Why should we consider something as being at least as smart as us and at least as general as us, when we can think things it can't think?

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-01-18T14:29:48.490Z · LW(p) · GW(p)

"Friendliness" is (the way I understand it) a constraint on the purposes and desired consequences of the AI's actions, not on what it is allowed to think. It would be able to think of non-Friendly actions, if only for the purposes of e.g. averting them when necessary.

As for Bayesianism, my guess is that even a Seed AI has to start somehow. There's no necessary constraint on it remaining Bayesian if it manages to figure out some even better theory of probability (or if it judges that a theory humans have developed is better). If an AI models itself performing better according to its criteria if it used some different theory, it will ideally self-modify to use that theory...

comment by Stuart Buck (stuart-buck) · 2023-04-26T01:34:16.975Z · LW(p) · GW(p)

This post seems even more relevant and true now, in 2023.

comment by Advocate · 2012-05-19T06:12:09.941Z · LW(p) · GW(p)

forgive me if this has been said, i didn't have time to read the hundreds of comments. it is a fallacy to state that it is of any particular importance to do these things to preclude the destruction of our existence if we haven't made a case that the continuation of our existence is of particular importance. that does not seem to me to be a prima facie case.

comment by FinalState · 2012-05-16T16:00:00.398Z · LW(p) · GW(p)

Guys... I am in the final implementation stages of the general intelligence algorithm. Though I had intellectual property concerns regarding working within the academic network (especially with the quality of people I was exposed to), I am willing to work with people that I perceive as intelligent and emotionally mature enough to respect my accomplishments. Although I do not perceive an absolute need to work with anyone else, I do have the following concerns about finishing this project alone:

Ethical considerations - Review of my approach to handling the potential dangers including some not typically talked about (DNA encoding is an instance of the GIA which means publishing could open the door to genetic programming). How to promote the use of it to quantify the social sciences as much or more than the use of it for sheer technological purposes which could lead to a poorly understood MAD situation between any individual and the human race. Example: The question of how/when to render assistance to others without harming their self-determination could be reduced to an optimization problem.

I could just read everything written on this subject but most of it is off base.

Productivity considerations - Considering the potential implications having accountability to others for meeting deadlines etc could increase productivity... every day may matter in this case. I have pretty much been working in a vacuum other than deducing information from comments made by others, looking up data on the internet, and occasionally debating with people incognito regarding their opinions of how/why certain things work or do not work.

If anyone is willing and able to meet to talk about this in the southeast I would consider it based on a supplied explanation of how best to protect oneself from the loss of credit for one's ideas in working with others (which I would then compare to my own understanding regarding the subject for honesty and accuracy) If there is no reason for anyone to want to collaborate under these circumstances then so be it, but I feel like more emotionally mature and intelligent people would not feel this way.

Replies from: Bugmaster, shminux, CuSithBell, Normal_Anomaly, dlthomas, khafra, None, Monkeymind, private_messaging

↑ comment by Bugmaster · 2012-05-16T21:39:36.664Z · LW(p) · GW(p)

Please don't take this as a personal attack, but, historically speaking, everyone who has said "I am in the final implementation stages of the general intelligence algorithm" has been wrong so far. Their algorithms never quite worked out. Is there any evidence you can offer that your work is any different? I understand that this is a tricky proposition, since revealing your work could set off all kinds of doomsday scenarios (assuming that it performs as you expect it to); still, surely there must be some way for you to convince skeptics that you can succeed where so many others have failed.

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-05-21T02:30:06.174Z · LW(p) · GW(p)

Sadly, I think the general trend you note is correct, but the first developers to succeed may do so in relative secrecy.

As time goes on it becomes increasingly possible that some small group or lone researcher is able to put the final pieces together and develop an AGI. Assuming a typical largely selfish financial motivation, a small self-sufficient developer would have very little to gain from pre-publishing or publicizing their plan.

Eventually of course they may be tempted to publicize, but there is more incentive to do that later, if at all. Unless you work on it for a while and it doesn't go much of anywhere. Then of course you publish.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-21T03:20:51.797Z · LW(p) · GW(p)

As time goes on it becomes increasingly possible that some small group or lone researcher is able to put the final pieces together and develop an AGI.

Why do you think this is the case? Is this just because the overall knowledge level concerning AI goes up over time? If so, what makes you think that that rate of increase is anything large enough to be significant?

Replies from: jacob_cannell

↑ comment by jacob_cannell · 2012-06-12T22:34:08.990Z · LW(p) · GW(p)

Yes. This is just the way of invention in general: steady incremental evolutionary progress.

A big, well-funded team can throw more computational resources into their particular solution for the problem, but the returns are sublinear (for any one particular solution) even without Moore's law.

↑ comment by Shmi (shminux) · 2012-05-16T22:25:01.871Z · LW(p) · GW(p)

I am in the final implementation stages of the general intelligence algorithm.

It's both amusing and disconcerting that people on this forum treat such a comment seriously.

Replies from: Bugmaster, PhilGoetz

↑ comment by Bugmaster · 2012-05-16T22:31:41.659Z · LW(p) · GW(p)

I try to treat all comments with some degree of seriousness, which can be expressed as a floating-point number between 0 and 1 :-)

↑ comment by PhilGoetz · 2012-05-19T00:44:56.783Z · LW(p) · GW(p)

Isn't the SIAI founded on the supposition that a scenario like this is possible?

Replies from: shminux

↑ comment by Shmi (shminux) · 2012-05-19T01:16:48.817Z · LW(p) · GW(p)

Yes, but on this forum there should be some reasonable immunity against instances of Pascal's wager/mugging like that. The comment in question does not rise above the noise level, so treating it seriously shows how far many regulars still have to go in learning the basics.

↑ comment by CuSithBell · 2012-05-17T06:03:53.233Z · LW(p) · GW(p)

If this works, it's probably worth a top-level post.

Replies from: thomblake

↑ comment by thomblake · 2012-05-18T13:38:53.580Z · LW(p) · GW(p)

Upvoted for humor: "probably".

Replies from: CuSithBell

↑ comment by CuSithBell · 2012-05-19T01:08:56.908Z · LW(p) · GW(p)

Cheers! Some find my humor a little dry.

↑ comment by Normal_Anomaly · 2012-05-17T15:29:12.411Z · LW(p) · GW(p)

Congratulations on your insights, but please don't snrk implement them until snigger you've made sure that oh heck I can't keep a straight face anymore.

The reactions to the parent comment are very amusing. We have people sarcastically supporting the commenter, people sarcastically telling the commenter they're a threat to the world, people sarcastically telling the commenter to fear for their life, people non-sarcastically telling the commenter to fear for their life, people honestly telling the commenter they're probably nuts, and people failing to get every instance of the sarcasm. Yet at bottom, we're probably all (except for private_messaging) thinking the same thing: that FinalState almost certainly has no way of creating an AGI and that no-one involved need feel threatened by anyone else.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T08:48:30.401Z · LW(p) · GW(p)

Yet at bottom, we're probably all (except for private_messaging) thinking the same thing: that FinalState almost certainly has no way of creating an AGI

nah, I stated that probability of him creating AGI is epsilon (my probability for his project hurting me is microscopic epsilon while the SI hurting him somehow is a larger epsilon, I only stated a relation that the latter is larger than former. The probability of a person going unfriendly is way, way higher than the probability of a person creating AGI that kills us all).

I think we're all here for various sarcastic or semi-sarcastic points; my point is that given the SI stance, AGI researchers would (and have to) try to keep away from SI, especially those who have some probability of creating an AGI, given the combination of the probability of a useful contribution by SI versus the probability of SI going nuts.

Replies from: Normal_Anomaly

↑ comment by Normal_Anomaly · 2012-05-18T17:24:59.044Z · LW(p) · GW(p)

I never thought you disagreed with:

that FinalState almost certainly has no way of creating an AGI

I actually meant that I thought you disagreed with:

and that no-one involved need feel threatened by anyone else.

Sorry for the language ambiguity. If you think the probability of SI hurting FinalState is epsilon, I misunderstood you. I thought you thought it was a large enough probability to be worth worrying about and warning FinalState about.

↑ comment by dlthomas · 2012-05-17T01:29:32.544Z · LW(p) · GW(p)

You'll have to forgive Eliezer for not responding; he's busy dispatching death squads.

Replies from: None

↑ comment by [deleted] · 2012-05-17T01:32:39.074Z · LW(p) · GW(p)

Not funny.

Replies from: JoshuaZ, fubarobfusco

↑ comment by JoshuaZ · 2012-05-18T02:26:48.541Z · LW(p) · GW(p)

Of course not, why send death squads when you can send Death Eaters? It just takes a single spell to solve this problem.

↑ comment by fubarobfusco · 2012-05-18T02:23:08.046Z · LW(p) · GW(p)

Indeed not.

↑ comment by khafra · 2012-05-16T18:52:23.029Z · LW(p) · GW(p)

I am in the final implementation stages of the general intelligence algorithm.

Do you mean "I am in the final writing stages of a paper on a general intelligence algorithm?" If you were in the final implementation stages of what LW would recognize as the general intelligence algorithm, the very last thing you would want to do is mention that fact here; and the second-to-last thing you'd do would be to worry about personal credit.

Replies from: FinalState

↑ comment by FinalState · 2012-05-16T19:19:06.644Z · LW(p) · GW(p)

I am open to arguments as to why that might be the case, but unless you also have the GIA, I should be telling you what things I would want to do first and last. I don't really see what the risk is, since I haven't given anyone any unique knowledge that would allow them to follow in my footsteps.

A paper? I'll write that in a few minutes after I finish the implementation. Problem statement -> pseudocode -> implementation. I am just putting some finishing touches on the data structure cases I created to solve the problem.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-16T22:20:16.538Z · LW(p) · GW(p)

I don't really see what the risk is...

As far as I understand, the SIAI folks believe that the risk is, "you push the Enter key, your algorithm goes online, bootstraps itself to transhuman superintelligence, and eats the Earth with nanotechnology" (nanotech is just one possibility among many, of course). I personally don't believe we're in any danger of that happening any time soon, but these guys do. They have made it their mission in life to prevent this scenario from happening. Their mission and yours appear to be in conflict.

Replies from: FinalState

↑ comment by FinalState · 2012-05-17T12:02:06.942Z · LW(p) · GW(p)

That is just wrong. SAI doesn't really work like that. Those people have seen too many sci fi movies. It's easy to psychologically manipulate an AI if you are smart enough to create one in the first place. To use terms I have seen tossed around, there is no difference between tool and agent AI. The agent only does things that you program it to do. It would take a malevolent genius to program something akin to a serial killer to cause that kind of scenario.

Replies from: MarkusRamikin, FinalState, FinalState

↑ comment by MarkusRamikin · 2012-05-17T12:15:21.263Z · LW(p) · GW(p)

It's easy to psychologically manipulate an AI if you are smart enough to create one in the first place.

People who created Deep Thought have no problem beating it at chess.

↑ comment by FinalState · 2012-05-23T15:04:54.287Z · LW(p) · GW(p)

What on earth is this retraction nonsense?

Replies from: thomblake

↑ comment by thomblake · 2012-05-23T15:13:17.130Z · LW(p) · GW(p)

Retraction means that you no longer endorse the contents of a comment. The comment is not deleted so that it will not break existing conversations. Retracted comments are no longer eligible for voting. Once a comment is retracted, it can be revisited at which point there is a 'delete' option, which removes the comment permanently.

↑ comment by FinalState · 2012-05-20T22:21:58.025Z · LW(p) · GW(p)

I didn't realize that I was receiving all that mail...

Replies from: FinalState

↑ comment by FinalState · 2012-05-22T15:51:48.319Z · LW(p) · GW(p)

Yummy 11 more tears

↑ comment by [deleted] · 2012-05-16T22:41:11.341Z · LW(p) · GW(p)

If you are not totally incompetent or lying out of your ass, please stop. Do not turn it on. At least consult SI.

Replies from: Multiheaded

↑ comment by Multiheaded · 2012-05-18T14:02:57.804Z · LW(p) · GW(p)

Don't feed the... um, crank.

Replies from: None

↑ comment by [deleted] · 2012-05-22T16:43:01.029Z · LW(p) · GW(p)

A Pascal's mugging is worth at least a comment.

↑ comment by Monkeymind · 2012-05-17T23:04:35.192Z · LW(p) · GW(p)

If you are concerned about Intellectual Property rights, by all means have a confidentiality agreement signed before revealing any proprietary information. Any reasonable person would not have a problem signing such an agreement.

Expect some skepticism until a working prototype is available.

Good luck with your project!

↑ comment by private_messaging · 2012-05-16T21:26:59.224Z · LW(p) · GW(p)

My recommendation: stay away from SIAI as there is considerable probability that they are at worst nutjobs or at best a fraud. Either way it is dangerous for you, as in, actual risk to your safety. Do not reveal your address. I am bloody serious. It is not a game here.

edit: supplemental information (not so much on the potential dangers but on the usefulness of communication):

the 'roko incident': http://rationalwiki.org/wiki/Talk:LessWrong#Hell

the founder: http://lesswrong.com/lw/6dr/discussion_yudowskys_actual_accomplishments/

Note: I fully believe that risk to your safety (while small) outweighs the risk to all of us from your software project. All of us includes me, all my relatives, all people I care for, the world, etc.

Replies from: gwern, othercriteria, None, None

↑ comment by gwern · 2012-05-16T21:55:19.701Z · LW(p) · GW(p)

So your argument that visiting a bunch of highly educated pencil-necked white nerds is physically dangerous boils down to... one incident of ineffective online censorship mocked by most of the LW community and all outsiders, and some criticism of Yudkowsky's computer science & philosophical achievements.

I see.

I would literally have had more respect for you if you had used racial slurs like "niggers" in your argument, since that is at least tethered to reality in the slightest bit.

Replies from: metaphysicist, private_messaging, private_messaging

↑ comment by metaphysicist · 2012-05-17T21:59:25.462Z · LW(p) · GW(p)

one incident of ineffective online censorship mocked by most of the LW community and all outsiders

Where a single incident seems grotesquely out of character, one should attempt to explain the single incident's cause. What's troubling is that Eliezer Yudkowsky has: 1) never admitted his mistake; 2) never shown (at least to my knowledge) any regret over how he handled it; and 3) most importantly, never explained his response (practically impossible without admitting his mistake).

The failure to address a wrongdoing or serious error over many years means it should be taken seriously, despite the lapse of time. The failure of self-analysis raises real questions about a lack of intellectual honesty--that is, a lack of epistemic rationality.

Replies from: gwern

↑ comment by gwern · 2012-05-17T23:20:45.899Z · LW(p) · GW(p)

I don't think it's hard to explain at all: Eliezer prioritized a donor (presumably long-term and one he knew personally) over an article. I disagree with it, but you know what, I saw this sort of thing all the time on Wikipedia, and I don't need to go looking for theories of why administrators were crazy and deleted Daniel Brandt's article. I know why they did, even though I strongly disagreed.

3) most importantly, never explained his response (practically impossible without admitting his mistake).

He or someone else must have explained at some point, or I wouldn't know his reason was that the article was giving a donor nightmares.

Is deleting one post such an issue to get worked up over? Or is this just discussed because it's the best criticism one can come up with besides "he's a high school dropout who hasn't yet created an AI and so must be completely wrong"?

Replies from: Rain, JoshuaZ, metaphysicist, Eliezer_Yudkowsky, Humbug, private_messaging

↑ comment by Rain · 2012-05-18T13:03:55.836Z · LW(p) · GW(p)

Please cite your claim that the affected person was a donor.

↑ comment by JoshuaZ · 2012-05-17T23:30:37.098Z · LW(p) · GW(p)

Has he said anywhere that the individual with nightmares was a donor? Note incidentally that having content that is acting as that much of a cognitive basilisk might be a legitimate reason to delete (although I'm inclined to think that it wasn't).

↑ comment by metaphysicist · 2012-05-18T05:10:09.231Z · LW(p) · GW(p)

Is deleting one post such an issue to get worked up over? Or is this just discussed because it's the best criticism one can come up with besides "he's a high school dropout who hasn't yet created an AI and so must be completely wrong"?

Like JoshuaZ, I hadn't known a donor was involved. What's the big deal? People donate to SIAI because they trust Eliezer Yudkowsky's integrity and intellect. So it's natural to ask whether he's someone you can count on to deliver the truth. Caving to donors is inauspicious.

In a related vein, I also found disturbing that Eliezer Yudkowsky repeated his claim that that Loosemoore guy "lied." Having had years to cool off, he still hasn't summoned the humility to admit he stretched the evidence for Loosemoore's deceitfulness: Loosemoore is obviously a cognitive scientist.

These two examples paint a picture of Eliezer Yudkowsky as a person subject to strong personal loyalties and animosities that exceed his dedication to the truth. In the first incident, his loyalty to a donor induced him to suppress information; in the Loosemoore incident, his longstanding animosity to Loosemoore made him unable to adjust his earlier opinion.

I hope these impressions aren't accurate. But one thing seems for sure: Eliezer Yudkowsky is not a person for serious self-criticism. Has he admitted any significant intellectual error since he became a rationalist? [Serious question.]

Replies from: gwern

↑ comment by gwern · 2012-05-18T07:42:55.596Z · LW(p) · GW(p)

Caving to donors is inauspicious.

It's also a double-bind. If you do nothing, you are valuing donors at less than some random speculation which is unusually dubious even by LessWrong's standards, resting as it does on a novel speculative decision theory (acausal trade) whose most obvious requirement (implementing sufficiently similar algorithms) is beyond blatantly false when applied to humans and FAIs. (If you actually believe that SIAI is a good charity, pissing off donors over something like this is a really bad idea, and if you don't believe SIAI is a good charity, well, that's even more damning, isn't it?) And if you delete it, well, you get exactly this stupid mess which is still being dragged up years later.

I hope these impressions aren't accurate. But one thing seems for sure: Eliezer Yudkowsky is not a person for serious self-criticism. Has he admitted any significant intellectual error since he became a rationalist? [Serious question.]

Repudiating most of his long-form works like CFAI and LOGI and CEV isn't admission of error?

Personally, when he was writing the Sequences, I found it a little obnoxious how he kept saying "I was totally on the wrong track and mistaken before I was enlightened & came to understand Bayesian statistics, but now I have a chance of being less wrong" - once is enough, we get it already, I'm not that interested in your intellectual evolution.

Replies from: evand, private_messaging

↑ comment by evand · 2012-05-19T20:51:32.221Z · LW(p) · GW(p)

Repudiating most of his long-form works like CFAI and LOGI and CEV isn't admission of error?

As someone who hasn't been around that long, it would be interesting to have links. I'm having trouble coming up with useful search terms.

Replies from: gwern

↑ comment by gwern · 2012-05-19T21:15:00.376Z · LW(p) · GW(p)

Creating Friendly AI, Levels of Organization in General Intelligence, and Coherent Extrapolated Volition.

Replies from: evand

↑ comment by evand · 2012-05-19T21:42:40.930Z · LW(p) · GW(p)

Sorry, I wasn't clear. I meant links to the repudiations. I've read some of the material in CFAI and CEV, but not the retraction, and not yet any of LOGI.

Replies from: gwern

↑ comment by gwern · 2012-05-19T21:45:43.883Z · LW(p) · GW(p)

Oh. I don't remember, then, besides the notes about them being obsolete.

↑ comment by private_messaging · 2012-05-18T08:27:25.141Z · LW(p) · GW(p)

Personally, when he was writing the Sequences, I found it a little obnoxious how he kept saying "I was totally on the wrong track and mistaken before I was enlightened & came to understand Bayesian statistics, but now I have a chance of being less wrong" - once is enough, we get it already, I'm not that interested in your intellectual evolution.

Hmm, and the foom belief (for instance) is based on Bayesian statistics how?

That's pretty damn interesting, because I've understood Bayesian statistics for ages, understood how wrong you are without it, and also understood how computationally expensive it is - just think what sort of data you need to attach to each proposition to avoid double counting evidence, to avoid any form of circular updates, to avoid naive Bayesian mistakes... even worse, how prone it is to making faulty conclusions from a partial set of propositions (as generated by e.g. exploring ideas, which btw introduces another form of circularity as you tend to use ideas which you think are probable as starting point more often).

Seriously, he should try to write software that would do updates correctly on a graph with cycles and with correlated propositions. That might result in another enlightenment, hopefully the one not leading to increased confidence, but to decreased confidence. Statistics isn't easy to do right. And relatively minor bugs easily lead to major errors.
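To make the double-counting worry above concrete, here is a minimal sketch in Python. It is not anyone's actual inference software; the prior of 0.5 and the likelihood ratio of 4 are made-up numbers chosen only for illustration. It shows how treating one piece of evidence that arrives by two routes as two independent observations inflates the posterior.

```python
def posterior(prior, likelihood_ratios):
    """Update a prior probability by multiplying the odds by each likelihood ratio.

    This is the naive rule: it silently assumes every ratio in the list
    comes from an independent piece of evidence.
    """
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


# Hypothetical numbers, for illustration only.
prior = 0.5   # even odds before seeing anything
lr = 4.0      # the evidence is 4x more likely if the hypothesis is true

counted_once = posterior(prior, [lr])       # the evidence used one time
counted_twice = posterior(prior, [lr, lr])  # the same evidence heard via two paths

print(f"counted once:  {counted_once:.3f}")
print(f"counted twice: {counted_twice:.3f}")
```

With these numbers the posterior moves from 0.8 to roughly 0.94 purely through bookkeeping error. Avoiding that in a graph with cycles requires tracking where each update came from, which is the expense the comment is pointing at.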

Replies from: gwern

↑ comment by gwern · 2012-05-19T00:27:18.278Z · LW(p) · GW(p)

Hmm, and the foom belief (for instance) is based on Bayesian statistics how?

I don't think it's based on Bayesian statistics any more than any other belief may (or may not) be based. To take Eliezer specifically, he was interested in the Singularity - specifically, the Good/Vingean observation that a machine more intelligent than us ought to be better than us at creating a still more intelligent machine - long before he had his 'Bayesian enlightenment', so his shift to subjective Bayesianism may have increased his belief in intelligence explosions, but certainly didn't cause it.

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-18T22:17:14.491Z · LW(p) · GW(p)

Once again: ROKO DELETED HIS OWN POST. NO OUTSIDE CENSORSHIP WAS INVOLVED.

This is how rumors evolve, ya know.

Replies from: Wei_Dai, JoshuaZ

↑ comment by Wei Dai (Wei_Dai) · 2012-05-18T23:28:50.160Z · LW(p) · GW(p)

Eliezer, I upvoted you and was about to apologize for contributing to this rumor myself, but then found this quote from a copy of the Roko post that's available online:

Meanwhile I'm banning this post so that it doesn't (a) give people horrible nightmares and (b) give distant superintelligences a motive to follow through on blackmail against people dumb enough to think about them in sufficient detail, though, thankfully, I doubt anyone dumb enough to do this knows the sufficient detail. (I'm not sure I know the sufficient detail.)

Perhaps your memory got mixed up because Roko subsequently deleted all of his other posts and comments? (Unless "banning" meant something other than "deleting"?)

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-18T23:54:13.811Z · LW(p) · GW(p)

Now I've got no idea what I did. Maybe my own memory was mixed up by hearing other people say that the post was deleted by Roko? Or Roko retracted it after I banned it, or it was banned and then unbanned and then Roko retracted it?

I retract my grandparent comment; I have little trust for my own memories. Thanks for catching this.

Replies from: komponisto, Vladimir_Nesov, Sniffnoy, Rhwawn

↑ comment by komponisto · 2012-05-19T02:45:26.078Z · LW(p) · GW(p)

A lesson learned here. I vividly remembered your "Meanwhile I'm banning this post" comment and was going to remind you, but chickened out due to the caps in the great-grandparent which seemed to signal that you Knew What You Were Talking About and wouldn't react kindly to correction. Props to Wei Dai for having more courage than I did.

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-19T15:49:26.140Z · LW(p) · GW(p)

I'm surprised and disconcerted that some people might be so afraid of being rebuked by Eliezer as to be reluctant to criticize/correct him even when such incontrovertible evidence is available showing that he's wrong. Your comment also made me recall another comment you wrote a couple of years ago about how my status in this community made a criticism of you feel like a "huge insult", which I couldn't understand at the time and just ignored.

I wonder how many other people feel this strongly about being criticized/insulted by a high status person (I guess at least Roko also felt strongly enough about being called "stupid" by Eliezer to contribute to him leaving this community a few days later), and whether Eliezer might not be aware of this effect he is having on others.

Replies from: Eliezer_Yudkowsky, TheOtherDave, wedrifid, XiXiDu, private_messaging, XiXiDu

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-19T17:06:32.316Z · LW(p) · GW(p)

Your comment also made me recall another comment you [Kip] wrote a couple of years ago about how my status in this community made a criticism of you feel like a "huge insult", which I couldn't understand at the time and just ignored.

My brain really, really does not want to update on the numerous items of evidence available to it that it can hit people much much harder now, owing to community status, than when it was 12 years old.

↑ comment by TheOtherDave · 2012-05-19T20:27:49.101Z · LW(p) · GW(p)

(nods) I've wondered this many times.
I have also at times wondered if EY is adopting the "slam the door three times" approach to prospective members of his community, though I consider this fairly unlikely given other things he's said.

Somewhat relatedly, I remember when lukeprog first joined the site, he and EY got into an exchange that from what I recall of my perspective as a completely uninvolved third party involved luke earnestly trying to offer assistance and EY being confidently dismissive of any assistance someone like luke could provide, and at the time I remember feeling sort of sorry for luke, who it seemed to me was being treated a lot worse than he deserved, and surprised that he kept at it.

The way that story ultimately turned out led me to decide that my model of what was going on was at least importantly incomplete, and quite possibly fundamentally wrongheaded, but I haven't further refined that model.

↑ comment by wedrifid · 2012-05-26T04:16:01.430Z · LW(p) · GW(p)

I wonder how many other people feel this strongly about being criticized/insulted by a high status person (I guess at least Roko also felt strongly enough about being called "stupid" by Eliezer to contribute to him leaving this community a few days later), and whether Eliezer might not be aware of this effect he is having on others.

As a data point here I tend to empathize with the recipient of such barrages to what I subjectively estimate as about 60% of the degree of emotional affect that I would experience if it were directed at myself. Particularly if said recipient is someone I respect as much as Roko and when the insults are not justified - less if they do not have my respect and if the insults are justified I experience no empathy. It is the kind of thing that I viscerally object to having in my tribe and where it is possible I try to ensure that the consequences to the high status person for their behavior are as negative as possible - or at least minimize the reward they receive if the tribe is one that tends to award bullying.

There are times in the past - let's say 4 years ago - where such an attack would certainly prompt me to leave a community, even if the community was otherwise moderately appreciated. Now I believe I am unlikely to leave over such an incident. I would say I am more socially resilient and also more capable of understanding social politics as a game and so take it less personally. For instance, when I received the more mildly expressed declaration from Eliezer "You are not safe to even associate with!" I don't recall experiencing any flight impulses - more surprise.

I'm surprised and disconcerted that some people might be so afraid of being rebuked by Eliezer as to be reluctant to criticize/correct him even when such incontrovertible evidence is available showing that he's wrong.

I was a little surprised at first too at reading of komponisto's reticence. Until I thought about it and reminded myself that in general I err on the side of not holding my tongue when I ought. In fact, the character "wedrifid" on wotmud.org with which I initially established this handle was banned from the game for 3 months for making exactly this kind of correction based off incontrovertible truth. People with status are dangerous and in general highly epistemically irrational in this regard. Correcting them is nearly always foolish.

I must emphasize that part of my initial surprise at kompo's reticence is due to my model of Eliezer as not being especially corrupt in this kind of regard. In response to such correction I expect him to respond positively and update. Eliezer may be arrogant and a tad careless when interacting with people at times, but he is not an egotistical jerk enforcing his dominance in his domain with dick moves. That's both high praise (by my way of thinking) and a reason for people to err less on the side of caution with him and to take less personally any 'abrupt' things he may say. Eliezer being rude to you isn't a precursor to him beating you to death with a metaphorical rock to maintain his power - as our instincts may anticipate. He's just being rude.

↑ comment by XiXiDu · 2012-05-19T16:22:13.816Z · LW(p) · GW(p)

I'm surprised and disconcerted that some people might be so afraid of being rebuked by Eliezer as to be reluctant to criticize/correct him even when such incontrovertible evidence is available showing that he's wrong.

People have to realize that to critically examine his output is very important due to the nature and scale of what he is trying to achieve.

Even people with comparatively modest goals like trying to become the president of the United States of America should face and expect a constant and critical analysis of everything they are doing.

Which is why I am kind of surprised how often people ask me if I am on a crusade against Eliezer or find fault with my alleged "hostility". Excuse me? That person is asking for money to implement a mechanism that will change the nature of the whole universe. You should be looking for possible shortcomings as well!

Everyone should be critical of Eliezer and SIAI, even if they agree with almost everything. Why? Because if you believe that it is incredibly important and difficult to get friendly AI just right, then you should be wary of any weak spot. And humans are the weak spot here.

↑ comment by private_messaging · 2012-05-26T05:11:40.279Z · LW(p) · GW(p)

That's why outsiders think it's a circlejerk. I've heard of Richard Loosemore, who as far as I can see was banned over corrections on the "conjunction fallacy"; not sure what exactly went on, but ofc having spent time reading the Roko thing (and having assumed that there was something sensible I did not hear of, and then learning that there wasn't), it's kind of obvious where my priors are.

Replies from: Manfred

↑ comment by Manfred · 2012-05-26T06:12:21.196Z · LW(p) · GW(p)

Maybe try keeping statements more accurate by qualifying your generalizations ("some outsiders"), or even just saying "that's why I think this is a circlejerk." That's what everyone ever is going to interpret it as anyhow (intentional).

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-26T08:42:18.690Z · LW(p) · GW(p)

Maybe you guys are too careful with qualifying everything as 'some outsiders' and then you end up with outsiders like Holden forming negative views which you could have predicted if you generalized more (and have the benefit of Holden's anticipated feedback without him telling people not to donate).

Replies from: Manfred

↑ comment by Manfred · 2012-05-26T19:25:57.323Z · LW(p) · GW(p)

Maybe. Seems like you're reaching, though: Maybe something bad comes from us being accurate rather than general about things like this, and maybe Holden criticizing SIAI is a product of this on LessWrong for some reason, and therefore it is in fact better for you to say inaccurate things like "outsiders think it's a circlejerk." Because you... care about us?

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-26T20:42:33.318Z · LW(p) · GW(p)

You guys are only being supposedly 'accurate' when it feels good. I have not said, 'all outsiders', that's your interpretation which you can subsequently disagree with.

SI generalized from the agreement of self-selected participants onto the opinions of outsiders, like Holden, subsequently approaching him and getting back the same critique they've been hearing from rare 'contrarians' here for ages but assumed to be some sorta fringe views and such. I don't really care what you guys do with this, you can continue as is and be debunked big time as cranks, your choice. edit: actually, you can see Eliezer himself said that most AI researchers are lunatics. What did SI do to distinguish themselves from what you guys call 'lunatics'? What is here that can shift probabilities from the priors? Absolutely nothing. The focus on safety with made-up fears is no indication of sanity whatsoever.

Replies from: Ben_Welchner

↑ comment by Ben_Welchner · 2012-05-26T20:53:28.838Z · LW(p) · GW(p)

You guys are only being supposedly 'accurate' when it feels good. I have not said, 'all outsiders', that's your interpretation which you can subsequently disagree with.

You're misusing language by not realizing that most people treat "members of group A think X" as "a sizable majority of members of group A think X", or not caring and blaming the reader when they parse it the standard way. We don't say "LWers are religious" or even "US citizens vote Democrat", even though there's certainly more than one religious person on this site or Democrat voter in the US.

And if you did intend to say that, you're putting words into Manfred's mouth by assuming he's talking about 'all' instead.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-27T05:53:38.784Z · LW(p) · GW(p)

I do think that 'sizable majority' hypothesis has not been ruled out, to say the least. SI is working to help build benevolent ruler bot, to save the world from malevolent bot. That sounds as crazy as things can be. Prior track record doing anything relevant? None. Reasons for SI to think they can make any progress? None.

I think most sceptically minded people do see that kind of stuff in a pretty negative light, but of course that's my opinion, you can disagree. Actually, who cares, SI should just go on and 'fix' what Holden pointed out, increase visibility, and get listed on crackpot/pseudoscience pages.

Replies from: Ben_Welchner

↑ comment by Ben_Welchner · 2012-05-27T16:54:24.571Z · LW(p) · GW(p)

I'm not talking about SI (which I've never donated money to), I'm talking about you. And you're starting to repeat yourself.

Replies from: private_messaging, wedrifid

↑ comment by private_messaging · 2012-05-27T18:03:14.937Z · LW(p) · GW(p)

I'm not talking about SI (which I've never donated money to), I'm talking about you.

I can talk about you too. The statement "That's why outsiders think it's a circlejerk" does not have a 'sizable majority', or 'significant minority', or 'all', or 'some' qualifier, nor does it have any kind of implied qualifier, nor does it need qualifying with a vague "some"; that is entirely needless verbosity (as the 'some' can range from 0.00001% to 99.999%), and the request to add "some" is clearly rhetorical, which we both realize equally well. (It is the case, though, that I think the most likely case is "significant majority of rational people", i.e. I expect greater than 50% chance of a strong negative opinion of SI if it is presented to a rational person).

And you're starting to repeat yourself.

The other day someone told me my argument was shifting like wind.

↑ comment by wedrifid · 2012-05-27T17:04:17.224Z · LW(p) · GW(p)

I'm talking about you. And you're starting to repeat yourself.

Does that mean it is time to stop feeding him?

I had decided when I finished my hiatus recently that the account in question had already crossed the threshold where I could reply to him without predicting that I was just causing more noise.

Replies from: Ben_Welchner

↑ comment by Ben_Welchner · 2012-05-27T17:24:05.160Z · LW(p) · GW(p)

Good point.

↑ comment by XiXiDu · 2012-05-19T16:29:09.513Z · LW(p) · GW(p)

I wonder how many other people feel this strongly about being criticized/insulted by a high status person (I guess at least Roko also felt strongly enough about being called "stupid" by Eliezer to contribute to him leaving this community a few days later), and whether Eliezer might not be aware of this effect he is having on others.

I don't feel insulted at all. He is much smarter than me. But I am also not trying to accomplish the same as him. If he calls me stupid for criticizing him, that's as if someone who wants to become a famous singer is telling me that I can't sing when I criticized their latest song. No shit Sherlock!

↑ comment by Vladimir_Nesov · 2012-05-19T00:35:55.114Z · LW(p) · GW(p)

IIRC Roko deleted the speculation-about-superintelligences part of the post shortly after its publication, but discussion in the comments raged on, so you subsequently banned the whole post/discussion.

And a few days later, primarily for unrelated reasons but probably with this incident as a trigger, Roko deleted his account, which on that version of LW meant that the text of all his comments disappeared (on the current version of LW, only author's name gets removed when account is deleted, comments don't disappear).

Replies from: komponisto, Eliezer_Yudkowsky

↑ comment by komponisto · 2012-05-19T02:27:19.930Z · LW(p) · GW(p)

Roko never deleted his account; he simply deleted all of his comments individually.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-19T10:54:57.281Z · LW(p) · GW(p)

Surely not individually (there were probably thousands and IIRC it was also happening to other accounts, so wasn't the result of running a self-made destructive script); what you're seeing is just how "deletion of account" performed on old version of LW looks like on current version of LW.

Replies from: komponisto

↑ comment by komponisto · 2012-05-19T11:11:28.986Z · LW(p) · GW(p)

No, I don't think so; in fact I don't think it was even possible for users to delete their own accounts on the old version of LW. (See here.) SilasBarta discovered Roko in the process of deleting his comments, before they had been completely deleted.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-19T11:21:49.776Z · LW(p) · GW(p)

I don't think it was even possible for users to delete their own accounts on the old version of LW. (See here.)

That post discusses the fact that account deletion was broken at one time in 2011, and a decision was being made about how to handle account deletion in the future. It doesn't say anything relevant about how it worked in 2010.

SilasBarta discovered Roko in the process of deleting his comments, before they had been completely deleted.

"April last year" in that comment is when LW was started, I don't believe it refers to incomplete deletion. The comments before that date that remained could be those posted under a different username (account), automatically copied from overcomingbias along with the Sequences.

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-19T15:21:25.344Z · LW(p) · GW(p)

Here is clearer evidence that account deletion simply did nothing back then. My understanding is the same as komponisto's: Roko wrote a script to delete all of his posts/comments individually.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-19T15:46:49.891Z · LW(p) · GW(p)

This comment was written 3 days before the post komponisto linked to, which discussed the issue of account deletion feature having been broken at that time (Apr 2011); the comment was probably the cause of that post. I don't see where it indicates the state of this feature around summer 2010. Since "nothing happens" behavior was indicated as an error (in Apr 2011), account deletion probably did something else before it stopped working.

Replies from: Wei_Dai

↑ comment by Wei Dai (Wei_Dai) · 2012-05-19T17:05:12.683Z · LW(p) · GW(p)

Ok, I guess I could be wrong then. Maybe somebody who knows Roko could ask him?

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-19T17:09:44.179Z · LW(p) · GW(p)

IIRC Roko deleted the speculation-about-superintelligences part of the post shortly after its publication, but discussion in the comments raged on, so you subsequently banned the whole post/discussion.

This sounds right to me, but I still have little trust in my memories.

Replies from: None

↑ comment by [deleted] · 2012-05-20T19:30:06.397Z · LW(p) · GW(p)

Or little interest in rational self-improvement by figuring what actually happened and why?

[You've made an outrageously self-assured false statement about this, and you were upvoted—talk about sycophancy—for retracting your falsehood, while suffering no penalty for your reckless arrogance.]

This sounds right to me, but I still have little trust in my memories.

↑ comment by Sniffnoy · 2012-05-19T00:13:51.530Z · LW(p) · GW(p)

To clarify for those new here -- "retract" here is meant purely in the usual sense, not in the sense of hitting the "retract" button, as that didn't exist at the time.

↑ comment by Rhwawn · 2012-05-19T00:16:35.159Z · LW(p) · GW(p)

Are there no server logs or database fields that would clarify the mystery? Couldn't Trike answer the question? (Yes, this is a use of scarce time - but if people are going to keep bringing it up, a solid answer is best.)

↑ comment by JoshuaZ · 2012-05-18T22:50:08.161Z · LW(p) · GW(p)

Your point is well taken, but since part of the concern about that whole affair was your extreme language and style, maybe stating this in normal caps might be a reasonable step for PR.

↑ comment by Humbug · 2012-05-18T09:50:01.106Z · LW(p) · GW(p)

He or someone else must have explained at some point, or I wouldn't know his reason was that the article was giving a donor nightmares.

This is half the truth. Here is what he wrote:

For those who have no idea why I'm using capital letters for something that just sounds like a random crazy idea, and worry that it means I'm as crazy as Roko, the gist of it was that he just did something that potentially gives superintelligences an increased motive to do extremely evil things in an attempt to blackmail us.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-18T10:02:45.219Z · LW(p) · GW(p)

Please rot13 the part from “potentially” onwards, and add a warning as in this comment (with “decode the rot-13'd part” instead of “follow the links”), because there are people here who've said they don't want to know about that thing.
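For readers unfamiliar with the request above: rot13 rotates each letter 13 places through the alphabet, so applying it twice returns the original text. A minimal sketch in Python, using the standard library's `rot_13` text transform; the helper name `rot13` and the sample string are placeholders chosen for illustration, not the text under discussion.

```python
import codecs


def rot13(text):
    """Rotate each ASCII letter 13 places; other characters pass through unchanged."""
    return codecs.encode(text, "rot_13")


# Placeholder text, used only to demonstrate the round trip.
spoiler = "Hello"
hidden = rot13(spoiler)

print(hidden)         # Uryyb
print(rot13(hidden))  # Hello
```

Because the transform is its own inverse, the same function both hides a passage and decodes it.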

↑ comment by private_messaging · 2012-05-18T05:42:21.445Z · LW(p) · GW(p)

Eliezer prioritized a donor (presumably long-term and one he knew personally) over an article.

Note that the post in question has already been seen by the donor, and has effectively advocated donating all spare money to SI. I imagine the donor was not a mind upload and the point was not deleted from donor's memory, but I do know that deletion of it from public space resulted in lack of rebuttals.

In any case my point was not that censorship was bad, but that a nonsense threat utterly lacking in any credibility was taken very seriously (to the point of nightmares you say?). It is dangerous to have anyone seriously believe your project is going to kill everyone, even if that person is a pencil necked white nerd.

"he's a high school dropout who hasn't yet created an AI and so must be completely wrong"?

Strawman. A Bayesian reasoner should update on such evidence, especially as combination of 'high school drop out' and 'no impressive technical accomplishments' is a very strong indicator (of lack of world class genius) for that age category. It is the case that this evidence, post update, shifts estimates significantly in direction of 'completely wrong or not even wrong' for all insights that require world class genius level intelligence, such as, incidentally, forming opinion on AI risk which most world class geniuses did not form.

In any case I did not even say what you implied. To me the Roko incident is evidence that some people here take that kind of nonsense seriously enough to have nightmares about it (to delete it, etc etc), and as such it is unsafe if such people get told that a particular software project is going to kill us all, while the list of accomplishments was something to update on when evaluating probability.

Replies from: Rain, jacob_cannell, shokwave

↑ comment by Rain · 2012-05-18T12:55:20.668Z · LW(p) · GW(p)

I have never seen where the person-with-nightmares was revealed as a donor, or indeed any clue as to who they were other than 'someone Eliezer knows'. I would like some evidence, if there is any.

Also, Eliezer did not drop out of high school; he never attended in the first place, commonly known as 'skipping it', which is more common among "geniuses" (though I dislike that description).

Replies from: XiXiDu, private_messaging

↑ comment by XiXiDu · 2012-05-18T13:54:26.449Z · LW(p) · GW(p)

I have never seen where the person-with-nightmares was revealed as a donor, or indeed any clue as to who they were other than 'someone Eliezer knows'.

I sent you 3 pieces of evidence via private message. Including two names.

Replies from: Rain

↑ comment by Rain · 2012-05-18T14:21:14.866Z · LW(p) · GW(p)

Thank you for the links.

Please note that none of the evidence shows the donor status of the anonymous people/person who actually had nightmares, and the two named individuals did not say it gave them nightmares, but used a popular TVTropes idiom, "Nightmare Fuel", as an adjective.

↑ comment by private_messaging · 2012-05-18T15:52:00.407Z · LW(p) · GW(p)

Very few people are so smart they are in the category of 'too smart for highschool and any university'... many more are less smart, some have practical issues (need to work to feed family f/e). There's some very serious priors from the normal distribution, for evidence to shift. Successful self education is fairly uncommon, especially outside the context of 'had to feed family'.

Replies from: Rain

↑ comment by Rain · 2012-05-18T16:09:37.592Z · LW(p) · GW(p)

Your criticism shifts as the wind.

What is your purpose?

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T16:55:28.642Z · LW(p) · GW(p)

Your criticism shifts as the wind.

Does it really? Do I have to repeat myself more? Is it against some unwritten rule to mention Bell curve prior which I have had from the start?

What is your purpose?

What do you think? Feedback. I do actually think he's nuts, you know? I also think he's terribly miscalibrated, which is probably the cause of the overconfidence in his foom belief (and it is ultimately the overconfidence that is nutty, same beliefs with appropriate confidence would be just mildly weird in a good way). It is also probably the case that politeness results in biased feedback.

Replies from: Rain

↑ comment by Rain · 2012-05-18T17:02:49.685Z · LW(p) · GW(p)

If your purpose is "let everyone know I think Eliezer is nuts", then you have succeeded, and may cease posting.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T17:16:27.705Z · LW(p) · GW(p)

Well, there's also the matter of why I'd think he's nuts when facing "either he's a supergenius or he's nuts" dilemma created by overly high confidence expressed in overly speculative arguments. But yea I'm not sure it's getting anywhere, the target audience is just EY himself, and I do expect he'd read this at least out of curiosity to see how he's being defended, but with low confidence so I'm done.

↑ comment by jacob_cannell · 2012-05-18T08:28:57.996Z · LW(p) · GW(p)

It is the case that this evidence, post update, shifts estimates significantly in direction of 'completely wrong or not even wrong' for all insights that require world class genius level intelligence, such as, incidentally, forming opinion on AI risk which most world class geniuses did not form.

Most "world class geniuses" have not opinionated on AI risk. So "forming opinion on AI risk which most world class geniuses did not form" is hardly a task which requires "world class genius level intelligence".

For a "Bayesian reasoner", a piece of writing is its own sufficient evidence concerning its qualities. Said reasoner does not need to rely much on indirect evidence concerning the author, after the reasoner has read the actual writing itself.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T08:38:14.490Z · LW(p) · GW(p)

Most "world class geniuses" have not opined on AI risk.

Nonetheless, the risk in question is also a personal risk of death for every genius... now idk how do we define geniuses here but obviously most geniuses could be presumed pretty good at preventing their own deaths, or deaths of their families. I should have said, forming a valid opinion.

For a "Bayesian reasoner", a piece of writing is its own sufficient evidence concerning its qualities. Said reasoner does not need to rely much on indirect evidence concerning the author, after the reasoner has read the actual writing itself.

Assuming that absolutely nothing in the writing had to be taken on faith. True for mathematical proofs. False for almost everything else.

Replies from: Nornagest

↑ comment by Nornagest · 2012-05-18T17:28:17.062Z · LW(p) · GW(p)

Nonetheless, the risk in question is also a personal risk of death for every genius... now idk how do we define geniuses here but obviously most geniuses could be presumed pretty good at preventing their own deaths, or deaths of their families.

That seems like a pretty questionable presumption to me. High IQ is linked to reduced mortality according to at least one study, but that needn't imply that any particular fatal risk be likely to be uncovered, let alone prevented, by any particular genius; there's no physical law stating that lethal threats must be obvious in proportion to their lethality. And that's especially true for existential threats, which almost by definition must be without experiential precedent.

You'd have a stronger argument if you narrowed your reference class to AI researchers. Not a terribly original one in this context, but a stronger one.

↑ comment by shokwave · 2012-05-18T05:57:10.709Z · LW(p) · GW(p)

a combination of 'high school drop out' and 'no impressive technical accomplishments' is a very strong indicator

Numbers?

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T06:14:26.758Z · LW(p) · GW(p)

Go dig for numbers yourself, and assume he is a genius until you find numbers; that will be very rational. Meanwhile, most people have a general feel for how rare it would be that a person with supposedly genius-level untested insights into a technical topic (in so much as most geniuses fail to have those insights) would have nothing impressive that was tested, at the age of, what, 32? edit: Then also, the geniuses know of that feeling and generally produce the accomplishments in question if they want to be taken seriously.

Replies from: shokwave

↑ comment by shokwave · 2012-05-18T07:13:44.529Z · LW(p) · GW(p)

Starting a nonprofit on a subject unfamiliar to most and successfully soliciting donations, starting an 8.5-million-view blog, writing over 2 million words on wide-ranging controversial topics so well that the only sustained criticism to be made is "it's long" and minor nitpicks, writing an extensive work of fiction that dominated its genre, and making some novel and interesting inroads into decision theory all seem, to me, to be evidence in favour of genius-level intelligence. These are evidence because the overwhelming default in every case for simply 'smart' people is to fail.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T07:40:44.550Z · LW(p) · GW(p)

Starting a nonprofit on a subject unfamiliar to most and successfully soliciting donations,

Many a con man accomplishes this.

These are evidence because the overwhelming default in every case for simply 'smart' people is to fail.

The overwhelming default for those capable of significant technical accomplishment is not to spend time on such activities.

Ultimately there are many more successful ventures like this, such as Scientology, and if I use this kind of metric on L. Ron Hubbard...

Replies from: shokwave

↑ comment by shokwave · 2012-05-18T23:02:55.261Z · LW(p) · GW(p)

if I use this kind of metric on L. Ron Hubbard...

It provides evidence in favour of him being correct. If there weren't other sources of information on Hubbard's activities, I'd expect him to be of genius-level intelligence.

You're familiar with the concept that someone looking like Hitler doesn't make them fascist, right?

Replies from: Nornagest, Rhwawn

↑ comment by Nornagest · 2012-05-18T23:53:53.641Z · LW(p) · GW(p)

It provides evidence in favour of him being correct. If there weren't other sources of information on Hubbard's activities, I'd expect him to be of genius-level intelligence.

Honestly, I wouldn't be surprised if he was; he clearly had an almost uniquely good understanding of what it takes to build a successful cult (though his early links with the OTO probably helped). New religious movements start all the time, and not one in a hundred reaches Scientology's level of success. You can be both a genius and a charlatan. It's easier to be the latter if you're the former, actually.

Although his writing's admittedly pretty terrible.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-24T18:55:33.674Z · LW(p) · GW(p)

I wouldn't expect genius-level technical intelligence. Self-deception is an important part of effective deception; you have to believe a lie to build a good lie. Avoiding self-deception is an important part of technical accomplishment.

Furthermore, knowing that someone has no technical accomplishments is very different from not knowing if someone has technical accomplishments.

Replies from: thomblake

↑ comment by thomblake · 2012-05-24T18:56:54.610Z · LW(p) · GW(p)

Avoiding self-deception is an important part of technical accomplishment.

This does not seem obvious to me, in general. Do you have experience making technical accomplishments?

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-25T05:41:47.318Z · LW(p) · GW(p)

Yes. Worked at 3 failed start-ups, founded a successful start-up (and know of several more failed ones). Self-deception is incredibly destructive to any accomplishment that does not involve deception of other people. You need to know how good your skill set is, how good your product is, how good your idea is. You can't be falling in love with brainfarts.

In any case, talents require extensive practice with feedback (are massively enhanced by that), and no technical accomplishments at age above 30 pretty much excludes any possibility of technical talent of any significance nowadays. (Yes, some odd case may discover they are an awesome inventor at an age past 30, but they suffer from a lack of earlier practice, and it'd be incredibly foolish of anyone who has known of their own natural talent since their teens not to practice properly.)

↑ comment by Rhwawn · 2012-05-18T23:56:03.167Z · LW(p) · GW(p)

I'd also point out that if you read the investigative Hubbard biographies, you see many classic signs of con artistry: constant changes of location, careers, ideologies, bankruptcies or court cases in their wake, endless lies about their credentials, and so on. Most of these do not match Eliezer at all - the only similarities are flux in ideas and projects which don't always pan out (like Flare), but that could be said of an ordinary academic AI researcher as well. (Most academic software is used for some publications and abandoned to bitrot.)

↑ comment by private_messaging · 2012-05-18T05:54:53.463Z · LW(p) · GW(p)

To clarify it better: the Roko incident illustrates how seriously some members of LW take nonsense conjectured threats. The fact of censorship is quite irrelevant. I was not really making a stab at Eliezer with the Roko incident (even though I can see how you can picture it as such, as it is easier to respond to the statement under this interpretation).

The HS dropping out and lack of accomplishments are a piece of evidence, and a rational Bayesian agent is better off knowing about such evidence. Especially given all the other pieces of evidence lying around, such as 'world foremost expert on self improvement' and other introductions like http://www.youtube.com/watch?v=MwriJqBZyoM , which are normally indicative of far greater accomplishments (such as making something which self-improved) than the ones which took place.

Replies from: gwern

↑ comment by gwern · 2012-05-18T07:34:43.790Z · LW(p) · GW(p)

To clarify it better: the Roko incident illustrates how seriously some members of LW take nonsense conjectured threats. The fact of censorship is quite irrelevant.

You can't have it both ways. If it's nonsense, then the importance is that someone took it seriously (like a donor), not anyone's reaction to that someone taking it seriously (like Eliezer). If it's not nonsense, then someone taking it seriously is not the issue, but someone's reaction to taking it seriously (the censorship). Make up your mind.

The HS dropping out and lack of accomplishments are a piece of evidence, and a rational Bayesian agent is better off knowing about such evidence.

I don't believe at any point in my comment did I claim the dropping out of school represented precisely 0 Bayesian evidence...

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T09:00:29.563Z · LW(p) · GW(p)

You can't have it both ways. If it's nonsense, then the importance is that someone took it seriously (like a donor), not anyone's reaction to that someone taking it seriously (like Eliezer). If it's not nonsense, then someone taking it seriously is not the issue, but someone's reaction to taking it seriously (the censorship). Make up your mind.

If it is dangerous nonsense then it is important that there is a rebuttal (ideally one that works on people who would fall for the nonsense in the first place). Haven't seen one.

If it is not nonsense, then it outlines that certain decision theories should not be built into FAI.

I don't believe at any point in my comment did I claim the dropping out of school represented precisely 0 Bayesian evidence...

You really didn't like me pointing it out, though.

↑ comment by private_messaging · 2012-05-16T21:59:16.706Z · LW(p) · GW(p)

So your argument that visiting a bunch of highly educated pencil-necked white nerds

How highly educated?

one incident of ineffective online censorship

One incident of being batshit insane (in the form of taking utter nonsense very seriously). I should also link trolley problem discussions perhaps.

Replies from: None, JoshuaZ

↑ comment by [deleted] · 2012-05-16T22:03:02.436Z · LW(p) · GW(p)

How highly educated?

You've already gone down this road with Wei Dai. More FUD.

↑ comment by JoshuaZ · 2012-05-17T21:28:09.663Z · LW(p) · GW(p)

I should also link trolley problem discussions perhaps.

Trolley problems are a standard type of problem discussed in intro psychology and intro philosophy classes in colleges. And they go further, with many studies just about how people respond to or think about them. That LW would want to discuss trolley problems, or that different people would have wildly conflicting responses to them, shouldn't be surprising; that's what makes them interesting. Using them as evidence that LW is somehow bad seems strange.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-18T05:25:25.891Z · LW(p) · GW(p)

Well, LW takes those fairly seriously, and stopping deadly AI is a form of trolley problem.

↑ comment by othercriteria · 2012-05-16T22:20:28.648Z · LW(p) · GW(p)

At least try harder in your fear-mongering. The thread about EY's failure to make many falsifiable predictions is better ad hominem and the speculation about launching terrorist attacks on fab plants is a much more compelling display of potential risk to life and property.

I agree that this is not a game, although you should note that you are doing EY/SIAI/LessWrong's work for it by trying to scare FinalState.

What probability would you give to FinalState's assertion of having a working AGI?

Replies from: Bugmaster, None, private_messaging

↑ comment by Bugmaster · 2012-05-16T22:29:35.084Z · LW(p) · GW(p)

I'm not private_messaging, but I think he has a marginally valid point, even though I disagree with his sensational style.

I personally would estimate FinalState's chances of building a working AGI at approximately epsilon, given the total absence of evidence. My opinion doesn't really matter, though, because I'm just some guy with a LessWrong account.

The SIAI folks, on the other hand, have made it their mission in life to prevent the rise of un-Friendly AGI. Thus, they could make FinalState's life difficult in some way, in order to fulfill their core mission. In effect, FinalState's post could be seen as a Pascal's Mugging attempt vs. SIAI.

Replies from: Normal_Anomaly

↑ comment by Normal_Anomaly · 2012-05-17T15:24:20.693Z · LW(p) · GW(p)

The social and opportunity costs of trying to suppress a "UFAI attempt" as implausible as FinalState's are far higher than the risk of failing to do so. There are also decision-theoretic reasons never to give in to Pascal-Mugging-type offers. SIAI knows all this and therefore will ignore FinalState completely, as well they should.

Replies from: Bugmaster

↑ comment by Bugmaster · 2012-05-17T16:11:37.672Z · LW(p) · GW(p)

The social and opportunity costs of trying to suppress a "UFAI attempt" as implausible as FinalState's are far higher than the risk of failing to do so.

I think that depends on what level of suppression one is willing to employ, though in general I agree with you. FinalState had admitted to being a troll, but even if he was an earnest crank, the magnitude of the expected value of his work would still be quite small, even when you do account for SIAI's bias.

There are also decision-theoretic reasons never to give in to Pascal-Mugging-type offers

What are they, out of curiosity? I think I missed that part of the Sequences...

Replies from: Normal_Anomaly, CuSithBell

↑ comment by Normal_Anomaly · 2012-05-17T20:48:11.243Z · LW(p) · GW(p)

What are they, out of curiosity? I think I missed that part of the Sequences...

It's not in the main Sequences; it's in the various posts on decision theory and Pascal's Muggings. I hope our resident decision theory experts will correct me if I'm wrong, but my understanding is this. If an agent is of the type that gives in to Pascal's Mugging, then other agents who know that have an incentive to mug them. If all potential muggers know that they'll get no concessions from an agent, they have no incentive to mug them. I don't think this covers "Pascal's Gift" scenarios where an agent is offered a tiny probability of a large positive utility, but it covers scenarios involving a small chance of a large disutility.
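The incentive argument above can be put as a toy expected-value calculation. This is only a sketch under made-up assumptions: the function name, the payoff, and the cost of attempting a mugging are all hypothetical, chosen to show the sign change rather than to model any real decision theory.

```python
def mugger_expected_gain(p_concede: float, demand: float, attempt_cost: float) -> float:
    """Expected gain to a would-be mugger (all inputs are illustrative assumptions).

    p_concede:    probability the target gives in to the threat
    demand:       what the mugger collects if the target gives in
    attempt_cost: what the mugger pays just to make the attempt
    """
    return p_concede * demand - attempt_cost


# A target known to give in half the time is worth mugging...
gain_vs_conceder = mugger_expected_gain(p_concede=0.5, demand=10.0, attempt_cost=1.0)

# ...while a target known never to concede is a pure loss for the mugger.
gain_vs_refuser = mugger_expected_gain(p_concede=0.0, demand=10.0, attempt_cost=1.0)

print(gain_vs_conceder, gain_vs_refuser)
```

With these numbers the mugger's gain is positive against the conceding agent and negative against the refusing one, which is the whole argument: being known as a refuser removes the incentive to try.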

↑ comment by CuSithBell · 2012-05-19T01:20:56.555Z · LW(p) · GW(p)

I'm not sure that that is in fact an admission of being a troll... it reads as fairly ambiguous to me. Do other people have readings on this?

↑ comment by [deleted] · 2012-05-16T22:25:34.428Z · LW(p) · GW(p)

What probability would you give to FinalState's assertion of having a working AGI?

0%, since it apparently isn't finished yet.

Will it be finished in a year? 2%, as all other attempts that have reached the "final stages" have failed to build working AGI. The most credible of those attempts were done with groups; it appears FinalState is working alone.

Replies from: TheOtherDave, othercriteria

↑ comment by TheOtherDave · 2012-05-16T22:47:14.688Z · LW(p) · GW(p)

2%?
Seriously?
I am curious as to why your estimate is so high.

Replies from: Eliezer_Yudkowsky, None, Bugmaster

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-17T21:01:24.714Z · LW(p) · GW(p)

That's the kind of probability I would've assigned to EURISKO destroying the world back when Lenat was the first person ever to try to build anything self-improving. For a random guy on the Internet it's off by... maybe five orders of magnitude? I would expect a pretty tiny fraction of all worlds to have the names of homebrew projects carved on their tombstones, and there are many random people on the Internet claiming to have AGI.

People like this are significant, not because of their chances of creating AGI, but because of what their inability to stop or take any serious precautions, despite their belief that they are about to create AGI, tells us about human nature.

Replies from: TheOtherDave, JoshuaZ

↑ comment by TheOtherDave · 2012-05-17T22:35:15.599Z · LW(p) · GW(p)

Understanding "random guy on the Internet" to mean something like an Internet user all I know about whom is that they are interested in building AGI and willing to put some concerted effort into the project... hrm... yeah, I'll accept e-7 as within my range.

My estimate for an actual random person on the Internet building AGI in, say, the next decade, has a ceiling of e-10 or so, but I don't have a clue what its lower bound is.

That said, I'm not sure how well-correlated the willingness of a "random guy on the Internet" (meaning 1) to try to build AGI without taking precautions is to the willingness of someone whose chances are orders of magnitude higher to do so.

Then again, we have more compelling lines of evidence leading us to expect humans to not take precautions.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-18T09:31:15.655Z · LW(p) · GW(p)

My estimate for an actual random person on the Internet building AGI in, say, the next decade, has a ceiling of e-10 or so, but I don't have a clue what its lower bound is.

(I had to read that three times before getting why that number was 1000 times smaller than the other one, because I kept on misinterpreting “random person”. Try “randomly-chosen person”.)

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T14:31:51.685Z · LW(p) · GW(p)

I have no idea what you understood "random person" to mean, if not randomly chosen person. I'm also curious now as to whether whatever-that-is is what EY meant in the first place.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-18T14:52:55.249Z · LW(p) · GW(p)

A stranger, esp. one behaving in weird ways; this appears to me to be the most common meaning of that word in 21st-century English when applied to a person. (Older speakers might be unfamiliar with it, but the median LWer is 25 years old, as of the latest survey.) And I also had taken the indefinite article to be an existential quantifier; hence, I had effectively interpreted the statement as “at least one actual strange person on the Internet building AGI in the next decade”, for which I thought such a low probability would be ridiculous.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2012-05-18T15:06:34.196Z · LW(p) · GW(p)

Thanks for clarifying.

↑ comment by JoshuaZ · 2012-05-17T21:04:37.900Z · LW(p) · GW(p)

but because of what their inability to stop or take any serious precautions, despite their belief that they are about to create AGI, tells us about human nature.

Are these in any way a representative sample of normal humans? In order to be in this category one generally needs to be pretty high on the crank scale along with some healthy Dunning-Kruger issues.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-17T21:12:35.667Z · LW(p) · GW(p)

That's always been the argument that future AGI scientists won't be as crazy as the lunatics presently doing it - that the current crowd of researchers are self-selected for incaution - but I wouldn't put too much weight on that; it seems like a very human behavior, some of the smarter ones with millions of dollars don't seem of below-average competence in any other way, and the VCs funding them are similarly incapable of backing off even when they say they expect human-level AGI to be created.

Replies from: JoshuaZ

↑ comment by JoshuaZ · 2012-05-17T21:13:54.590Z · LW(p) · GW(p)

Sorry, I'm confused. By "people like this" did you mean people like FinalState or did you mean professional AI researchers? I interpreted it as the first.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2012-05-17T21:18:18.572Z · LW(p) · GW(p)

AGI researchers sound a lot like FinalState when they think they'll have AGI cracked in two years.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-17T21:19:45.257Z · LW(p) · GW(p)

FinalState < average AI researchers < AGI researchers < top AI researchers

Eliezer < anyone with actual notable accomplishments. edit: damn it you edited your message.

Replies from: Rain, JoshuaZ

↑ comment by Rain · 2012-05-17T21:22:50.042Z · LW(p) · GW(p)

Over 140 posts and 0 total karma; that's persistence.

Replies from: ciphergoth

↑ comment by Paul Crowley (ciphergoth) · 2012-05-18T09:42:46.114Z · LW(p) · GW(p)

private_messaging says he's Dmytry, who has positive karma. It's possible that the more anonymous-sounding name encourages worse behaviour though.

↑ comment by JoshuaZ · 2012-05-17T21:23:25.442Z · LW(p) · GW(p)

Before people downvote PM's comment above, note that Eliezer's comment prior to editing was a hierarchy of different AI researchers with lowest being people like FinalState, the second highest being professional AI researchers and the highest being "top AI researchers".

With that out of the way, what do you think you are accomplishing with this remark? You have a variety of valid points to make, but I fail to see what is contained in this remark that does anything at all.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-17T21:32:26.467Z · LW(p) · GW(p)

Me or Eliezer? I'm making some point by direct demonstration. It's a popular ranking system, ya know? He used it on FinalState. A lot of people use it on him.

Replies from: None

↑ comment by [deleted] · 2012-05-17T21:37:47.463Z · LW(p) · GW(p)

There's got to be a level beyond "arguments as soldiers" to describe your current approach to ineffective contrarianism.

I volunteer "arguments as cannon fodder."

↑ comment by [deleted] · 2012-05-16T22:54:18.095Z · LW(p) · GW(p)

Laplace's Rule of Succession, assuming around fifty failures under similar or more favorable circumstances.
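For anyone who wants to check the arithmetic, here is a minimal sketch of the Rule of Succession, which estimates the chance of success on the next trial as (s + 1) / (n + 2) after s successes in n trials. The figure of fifty failures comes from the comment above; the function name is just an illustrative choice.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate of the next-trial success probability: (s + 1) / (n + 2)."""
    return Fraction(successes + 1, trials + 2)


# Fifty prior attempts, none successful.
estimate = rule_of_succession(successes=0, trials=50)
print(estimate, float(estimate))  # 1/52, a little under 2%
```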

Replies from: othercriteria, steven0461, army1987, rhollerith_dot_com

↑ comment by othercriteria · 2012-05-16T23:08:58.835Z · LW(p) · GW(p)

This just shifts the question to how you slotted FinalState into such a promising reference class? Conservatively, tens of academic research programs, tens of PhD dissertations, hundreds of hobbyist projects, hundreds of undergraduate term projects, and tens of business ventures have attempted something similar to AGI and none have succeeded.

Replies from: None, dlthomas

↑ comment by [deleted] · 2012-05-17T01:25:50.319Z · LW(p) · GW(p)

As far as I can tell, the vast majority of academic projects (particularly those of undergrads) have worked on narrow AI, which this is supposedly not.

However, reading the post again, it doesn't sound as though they have the support of any academic institution; I misread the bit around "academic network". It sounds more as though this is a homebrew project, in which case I need to go two or three orders of magnitude lower.

Replies from: othercriteria

↑ comment by othercriteria · 2012-05-17T02:46:42.129Z · LW(p) · GW(p)

As far as I can tell, the vast majority of academic projects (particularly those of undergrads) have worked on narrow AI, which this is supposedly not.

That's definitely a reasonable assessment. I dialed all those estimates down by about an order of magnitude from when I started writing that point as I thought through just how unusual attempting general AI is. But over sixty years and hundreds of institutions where one might get a sufficiently solid background in CS to implement something big, there are going to be lots of unusual people trying things out.

↑ comment by dlthomas · 2012-05-16T23:11:54.357Z · LW(p) · GW(p)

Of those who attempted, fewer thought they were close, but fifty still seems very generous.

↑ comment by steven0461 · 2012-05-16T23:12:37.270Z · LW(p) · GW(p)

The Rule of Succession, if I'm not mistaken, assumes a uniform prior from 0 to 1 for the probability of success. That seems unreasonable; it shouldn't be extremely improbable (even before observing failure) that fewer than one in a thousand such claims result in a working AGI. So you have to adjust downward somewhat from there, but it's hard to say how much.

(This is in addition to the point that user:othercriteria makes in the sibling comment.)
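To make the dependence on the prior concrete: the Rule of Succession is the posterior mean under a Beta(1, 1) (uniform) prior, and a general Beta(alpha, beta) prior gives (s + alpha) / (n + alpha + beta). A uniform prior puts only 0.1% of its mass on success rates below one in a thousand. The sketch below compares it with a more skeptical prior; Beta(1, 999) is purely an illustrative assumption, not a recommendation from anyone in this thread.

```python
def beta_posterior_mean(successes: int, failures: int,
                        alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean of the success rate under a Beta(alpha, beta) prior."""
    return (successes + alpha) / (successes + failures + alpha + beta)


# Uniform prior, i.e. the Rule of Succession: 1/52.
uniform_estimate = beta_posterior_mean(successes=0, failures=50)

# Hypothetical skeptical prior with prior mean 1/1000: 1/1050.
skeptical_estimate = beta_posterior_mean(successes=0, failures=50, alpha=1.0, beta=999.0)

print(uniform_estimate, skeptical_estimate)
```

With only fifty observations the skeptical prior dominates, so the answer is mostly a statement about the prior; that is why "how much to adjust downward" is hard to pin down.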

Replies from: None

↑ comment by [deleted] · 2012-05-17T01:18:48.168Z · LW(p) · GW(p)

You're correct, but where would I find a better prior? I'd rather be too conservative than resort to wild guessing (which it would be, since I'm not an expert on AGI).

(A variant of this is rhollerith_dot_com's objection below, that I failed to take into account whatever the probability of working AGI leading to death is. Presumably that changes the prior as well.)

Replies from: gwern, steven0461

↑ comment by gwern · 2012-05-17T01:38:24.076Z · LW(p) · GW(p)

Q. How can I find the priors for a problem?

A. Many commonly used priors are listed in the Handbook of Chemistry and Physics.

Q. Where do priors originally come from?

A. Never ask that question.

Q. Uh huh. Then where do scientists get their priors?

A. Priors for scientific problems are established by annual vote of the AAAS. In recent years the vote has become fractious and controversial, with widespread acrimony, factional polarization, and several outright assassinations. This may be a front for infighting within the Bayes Council, or it may be that the disputants have too much spare time. No one is really sure.

Q. I see. And where does everyone else get their priors?

A. They download their priors from Kazaa.

Q. What if the priors I want aren't available on Kazaa?

A. There's a small, cluttered antique shop in a back alley of San Francisco's Chinatown. Don't ask about the bronze rat.

http://yudkowsky.net/rational/bayes

Replies from: TimS

↑ comment by TimS · 2012-05-17T01:55:16.503Z · LW(p) · GW(p)

Isn't the lesson of the Quantum Physics sequence that ordinary humans today should get their priors from the least complex (and falsifiable?) statements that aren't inconsistent with empirical knowledge?

↑ comment by steven0461 · 2012-05-17T01:37:20.578Z · LW(p) · GW(p)

I don't know where to get a good prior. I suppose you might look at past instances where someone claimed to be close to doing something that seemed about as difficult and confusing as AGI seems to be (before taking into account a history of promises that didn't pan out, but after taking into account what we know about the confusingness of the problem, insofar as that knowledge doesn't itself come from the fact of failed promises). I don't know what that prior would look like, but it seems like it would assign (if you randomly selected a kind of feat) a substantially greater than 1/10 probability of seeing at least 10 failed predictions of achieving that feat for every successful such prediction, a substantially greater than 1/100 probability of seeing at least 100 failed predictions for every successful prediction, and so on.

↑ comment by A1987dM (army1987) · 2012-05-17T10:37:25.932Z · LW(p) · GW(p)

And why do you think FinalState is in such a circumstance, rather than just bullshitting us?

Replies from: None

↑ comment by [deleted] · 2012-05-17T21:01:55.391Z · LW(p) · GW(p)

I was being charitable. Also, I misread the original post; see the comments below.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2012-05-18T09:34:46.450Z · LW(p) · GW(p)

Hmm yeah, I read the post again and, if it's a troll, it's a way-more-subtle-than-typical one. Still, my posterior probability assignment on him being serious/sincere is in the 0.40s (extraordinary claims require extraordinary evidence) -- though this means that the probability that he succeeds given that he's serious is the same order of magnitude as the probability that he succeeds given everything I know.
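The order-of-magnitude remark can be checked with a line of arithmetic, assuming success is only possible if he is serious, so that P(success) = P(serious) × P(success | serious). The overall probability used below is a placeholder with no particular significance; only the ratio matters.

```python
import math

p_serious = 0.45            # "in the 0.40s", per the comment above
p_success_overall = 1e-7    # hypothetical placeholder; the ratio does not depend on it

# Assumes a non-serious poster cannot succeed.
p_success_given_serious = p_success_overall / p_serious

ratio = p_success_given_serious / p_success_overall
print(ratio, math.log10(ratio))
```

Conditioning on sincerity multiplies the estimate by 1 / 0.45, a factor of a bit over two, which is well short of one order of magnitude.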

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-17T00:20:02.301Z · LW(p) · GW(p)

If you know you probably would not have survived the sun's having failed to rise, you cannot just apply the Rule of Succession to your knowledge of past sunrises to calculate the probability that the sun will rise tomorrow because that would be ignoring relevant information, namely the existence of a severe selection bias. (Sadly, I do not know how to modify the Rule of Succession to account for the selection bias.)

Replies from: gwern

↑ comment by gwern · 2012-05-17T01:32:08.703Z · LW(p) · GW(p)

Bostrom has made a stab at compensating, although I don't think http://www.nickbostrom.com/papers/anthropicshadow.pdf works for the sun example.

On the other hand, if you have so much background knowledge about the Sun that you can think about the selection effects involved, the Rule of Succession is a moot & incomplete analysis to begin with.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-17T05:42:32.130Z · LW(p) · GW(p)

Regarding your second paragraph, Sir Gwern, if we switch the example to the question of whether the US and Russia will launch nukes at each other this year, I have a lot of information about the strength of the selection bias (including for example Carl Sagan's work on nuclear winter) that I might put to good use if I knew how to account for selection effects, but I would be sorely tempted to use something like the Rule of Succession (modified to account for the selection bias and where the analog of a day in which the sun might or might not rise is the start of the part of the career of someone in the military or in politics during which he or she can influence whether or not an attempt at a first strike is made) because my causal model of the mental processes behind the decision to launch is so unsatisfactory.

This might be a good place for me to point out that I never bought into the common wisdom, which I have never seen anyone object to or distance themselves from in print, that the chances of a nuclear exchange between the US and Russia went down considerably after the collapse of the Soviet Union in 1991.

Replies from: NancyLebovitz, gwern

↑ comment by NancyLebovitz · 2012-05-17T11:01:01.529Z · LW(p) · GW(p)

This might be a good place for me to point out that I never bought into the common wisdom, which I have never seen anyone object to or distance themselves from in print, that the chances of a nuclear exchange between the US and Russia went down considerably after the collapse of the Soviet Union in 1991.

What's your line of thought?

↑ comment by gwern · 2012-05-17T18:35:28.012Z · LW(p) · GW(p)

Nuclear war isn't the same situation, though. We can survive nuclear war at all sorts of levels of intensity, so the selection filter is not nearly the same as "the Sun going out", which is ~100% fatal. Bostrom's shadow paper might actually work for nuclear war, from the perspective of a revived civilization, but I'd have to reread it to see.

Replies from: rhollerith_dot_com

↑ comment by RHollerith (rhollerith_dot_com) · 2012-05-17T20:16:18.842Z · LW(p) · GW(p)

The selection filter does not have to be total or near total for my point to stand, namely, Rule-of-Succession-like calculations can be useful even when one has enough information to think about the selection effects involved (provided that Rule-of-Succession-like calculations are ever useful).

And parenthetically selection effects on observations about whether nuclear exchanges happened in the past can be very strong. Consider for example a family who has lived in Washington, D.C., for the last 5 decades: Washington, D.C., is such an important target that it is unlikely the family would have survived the launch of most or all of the Soviet/Russian arsenal at the U.S. So, although I agree with you that the human race as a whole would probably have survived almost any plausible nuclear exchange, that does not do the family in D.C. much good. More precisely, it does not do much good for the family's ability to use historical data on whether or not nukes were launched at the U.S. in the past to refine their probability of launches in the future.

Replies from: thomblake

↑ comment by thomblake · 2012-05-17T20:22:23.360Z · LW(p) · GW(p)

And parenthetically

An interesting bracket style. How am I supposed to know where the parenthetical ends?

↑ comment by Bugmaster · 2012-05-16T22:54:17.475Z · LW(p) · GW(p)

Me too. This value is several orders of magnitude above my own estimate.

That said, it depends on your definition of "finished". For example, it is much more plausible (relatively speaking) that FinalState will fail to produce an AGI, but will nevertheless produce an algorithm that performs some specific task -- such as character recognition, unit pathing, natural language processing, etc. -- better than the leading solutions. In this case, I suppose one could still make the argument that FinalState's project was finished somewhat successfully.

↑ comment by othercriteria · 2012-05-16T22:33:06.121Z · LW(p) · GW(p)

Oh, I see that I misinterpreted FinalState's statement

A paper? I'll write that in a few minutes after I finish the implementation. link

as an indication of only being a few minutes away from having a working implementation.

↑ comment by private_messaging · 2012-05-16T22:31:38.699Z · LW(p) · GW(p)

The thread about EY's failure to make many falsifiable predictions is better ad hominem

I meant to provide priors for the expected value of communication with SI. Sorry, it can't be done in a non-ad-hominem way. There's been a video or two where Eliezer was called "world's foremost expert on recursive self improvement", which normally implies making something self-improve.

the speculation about launching terrorist attacks on fab plants is a much more compelling display of potential risk to life and property.

Ahh right, I should have also linked this one. I see it was edited, replacing 'we' with 'world government' and 'sabotage' with 'sanctions and military action'. BTW, that speculation is by gwern; is he working at SIAI?

What probability would you give to FinalState's assertion of having a working AGI?

AGI is ill-defined. For something that would foom so as to pose potential danger: infinitesimally small.

Ultimately: I think risk to his safety is small, and payoff is negligible, while the risk from his software is pretty much nonexistent.

Replies from: Vladimir_Nesov

↑ comment by Vladimir_Nesov · 2012-05-16T22:47:25.324Z · LW(p) · GW(p)

There's been a video or two where Eliezer was called "world's foremost expert on recursive self improvement"

This usually happens when the person being introduced wasn't consulted about the choice of introduction.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-16T23:18:06.130Z · LW(p) · GW(p)

It nonetheless results in significant presentation bias, whatever the cause.

My priors, for one thing, were way off in SI's favour. My own cascade of updates was triggered by seeing Alexei say that he plans to make a computer game to make money to donate to SIAI. Before that, I sort of assumed that the AI discussions here were about some sort of infinitely powerful super-intelligence in sci-fi, not unlike Vinge's Beyond: an intellectually pleasurable game of wits (I even participated a little once or twice, along the lines of how you can't debug superintelligence). I assumed that Eliezer had achievements from which he got the attitude (I sort of confused him with Hanson to some extent), etc. I have looked into it more carefully since.

↑ comment by [deleted] · 2012-05-16T21:53:49.831Z · LW(p) · GW(p)

The Roko incident has absolutely nothing to do with this at all. Roko did not claim to be on the verge of creating an AGI.

Once again you're spreading FUD about the SI. Presumably moderation will come eventually, no doubt over some hue and cry over censoring contrarians.

Replies from: private_messaging

↑ comment by private_messaging · 2012-05-16T22:09:51.545Z · LW(p) · GW(p)

The Roko incident allows one to evaluate the sanity of the people he'd be talking to.
