The AI Timelines Scam

post by jessicata (jessica.liu.taylor) · 2019-07-11T02:52:58.917Z · LW · GW · 100 comments


[epistemic status: that's just my opinion, man. I have highly suggestive evidence, not deductive proof, for a belief I sincerely hold]

"If you see fraud and do not say fraud, you are a fraud." --- Nassim Taleb

I was talking with a colleague the other day about an AI organization that claims:

  1. AGI is probably coming in the next 20 years.
  2. Many of the reasons we have for believing this are secret.
  3. They're secret because if we told people about those reasons, they'd learn things that would let them make an AGI even sooner than they would otherwise.

His response was (paraphrasing): "Wow, that's a really good lie! A lie that can't be disproven."

I found this response refreshing, because he immediately jumped to the most likely conclusion.

Near predictions generate more funding

Generally, entrepreneurs who are optimistic about their projects get more funding than ones who aren't. AI is no exception. For a recent example, see the Human Brain Project. Its founder, Henry Markram, predicted in 2009 that the project would succeed in simulating a human brain by 2019 (see his TED talk, at 14:22); by 2013, the project was already widely considered a failure.

The Human Brain Project received 1.3 billion euros of funding from the EU.

It's not hard to see why. To justify receiving large amounts of money, a project's leader must claim the project is actually worth that much. And AI projects are more impactful if it is, in fact, possible to develop AI soon. So there is economic pressure toward inflating estimates of the chance AI will be developed soon.

Fear of an AI gap

The missile gap was a lie by the US Air Force to justify building more missiles, by falsely claiming that the Soviet Union had more missiles than the US.

Similarly, there's historical precedent for an AI-gap lie used to justify more AI development. Fifth Generation Computer Systems was an ambitious 1982 project by the Japanese government (funded at $400 million as of 1992, or about $730 million in 2019 dollars) to create artificial intelligence through massively parallel logic programming.

The project is widely considered to have failed. From a 1992 New York Times article:

A bold 10-year effort by Japan to seize the lead in computer technology is fizzling to a close, having failed to meet many of its ambitious goals or to produce technology that Japan's computer industry wanted.


That attitude is a sharp contrast to the project's inception, when it spread fear in the United States that the Japanese were going to leapfrog the American computer industry. In response, a group of American companies formed the Microelectronics and Computer Technology Corporation, a consortium in Austin, Tex., to cooperate on research. And the Defense Department, in part to meet the Japanese challenge, began a huge long-term program to develop intelligent systems, including tanks that could navigate on their own.


The Fifth Generation effort did not yield the breakthroughs to make machines truly intelligent, something that probably could never have realistically been expected anyway. Yet the project did succeed in developing prototype computers that can perform some reasoning functions at high speeds, in part by employing up to 1,000 processors in parallel. The project also developed basic software to control and program such computers. Experts here said that some of these achievements were technically impressive.


In his opening speech at the conference here, Kazuhiro Fuchi, the director of the Fifth Generation project, made an impassioned defense of his program.

"Ten years ago we faced criticism of being too reckless," in setting too many ambitious goals, he said, adding, "Now we see criticism from inside and outside the country because we have failed to achieve such grand goals."

Outsiders, he said, initially exaggerated the aims of the project, with the result that the program now seems to have fallen short of its goals.

Some American computer scientists say privately that some of their colleagues did perhaps overstate the scope and threat of the Fifth Generation project. Why? In order to coax more support from the United States Government for computer science research.

(emphasis mine)

This bears similarity to some conversations on AI risk I've been party to in the past few years. The fear is that Others (DeepMind, China, whoever) will develop AGI soon, so We have to develop AGI first in order to make sure it's safe, because Others won't make sure it's safe and We will. Also, We have to discuss AGI strategy in private (and avoid public discussion), so Others don't get the wrong ideas. (Generally, these claims have little empirical/rational backing to them; they're based on scary stories, not historically validated threat models)

The claim that others will develop weapons and kill us with them by default implies a moral claim to resources, and a moral claim to be justified in making weapons in response. Such claims, if exaggerated, justify claiming more resources and making more weapons. And they weaken a community's actual ability to track and respond to real threats (as in The Boy Who Cried Wolf).

How does the AI field treat its critics?

Hubert Dreyfus, probably the most famous historical AI critic, published "Alchemy and Artificial Intelligence" in 1965, which argued that the techniques popular at the time were insufficient for AGI. Subsequently, he was shunned by other AI researchers:

The paper "caused an uproar", according to Pamela McCorduck. The AI community's response was derisive and personal. Seymour Papert dismissed one third of the paper as "gossip" and claimed that every quotation was deliberately taken out of context. Herbert A. Simon accused Dreyfus of playing "politics" so that he could attach the prestigious RAND name to his ideas. Simon said, "what I resent about this was the RAND name attached to that garbage."

Dreyfus, who taught at MIT, remembers that his colleagues working in AI "dared not be seen having lunch with me." Joseph Weizenbaum, the author of ELIZA, felt his colleagues' treatment of Dreyfus was unprofessional and childish. Although he was an outspoken critic of Dreyfus' positions, he recalls "I became the only member of the AI community to be seen eating lunch with Dreyfus. And I deliberately made it plain that theirs was not the way to treat a human being."

This makes sense as anti-whistleblower activity: ostracizing, discrediting, or punishing people who break the conspiracy to the public. Does this still happen in the AI field today?

Gary Marcus is a more recent AI researcher and critic. In 2012, he wrote:

Deep learning is important work, with immediate practical applications.


Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like "sibling" or "identical to." They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems ... use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning.

In 2018, he tweeted an article in which Yoshua Bengio (a deep learning pioneer) seemed to agree with these previous opinions. This tweet received a number of mostly-critical replies. Here's one, by AI professor Zachary Lipton:

There's a couple problems with this whole line of attack. 1) Saying it louder ≠ saying it first. You can't claim credit for differentiating between reasoning and pattern recognition. 2) Saying X doesn't solve Y is pretty easy. But where are your concrete solutions for Y?

The first criticism is essentially a claim that everybody knows that deep learning can't do reasoning. But, this is essentially admitting that Marcus is correct, while still criticizing him for saying it [ED NOTE: the phrasing of this sentence is off (Lipton publicly agrees with Marcus on this point), and there is more context, see Lipton's reply [LW(p) · GW(p)]].

The second is a claim that Marcus shouldn't criticize if he doesn't have a solution in hand. This policy deterministically results in the short AI timelines narrative being maintained: to criticize the current narrative, you must present your own solution, which constitutes another narrative for why AI might come soon.

Deep learning pioneer Yann LeCun's response is similar:

Yoshua (and I, and others) have been saying this for a long time.
The difference with you is that we are actually trying to do something about it, not criticize people who don't.

Again, the criticism is not that Marcus is wrong in saying deep learning can't do certain forms of reasoning, the criticism is that he isn't presenting an alternative solution. (Of course, the claim could be correct even if Marcus doesn't have an alternative!)

Apparently, it's considered bad practice in AI to criticize a proposal for making AGI without presenting an alternative solution. Clearly, such a policy causes large distortions!

Here's another response, by Steven Hansen (a research scientist at DeepMind):

Ideally, you'd be saying this through NeurIPS submissions rather than New Yorker articles. A lot of the push-back you're getting right now is due to the perception that you haven't been using the appropriate channels to influence the field.

That is: to criticize the field, you should go through the field, not through the press. This is standard guild behavior. In the words of Adam Smith: "People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices."

(Also see Marcus's medium article on the Twitter thread, and on the limitations of deep learning)

[ED NOTE: I'm not saying these critics on Twitter are publicly promoting short AI timelines narratives (in fact, some are promoting the opposite), I'm saying that the norms by which they criticize Marcus result in short AI timelines narratives being maintained.]

Why model sociopolitical dynamics?

This post has focused on sociopolitical phenomena involved in the short AI timelines narrative. For this, I anticipate criticism along the lines of "why not just model the technical arguments, rather than the credibility of the people involved?" To which I pre-emptively reply:

Which is not to say that modeling such technical arguments is not important for forecasting AGI. I certainly could have written a post evaluating such arguments, and I decided to write this post instead, in part because I don't have much to say on this issue that Gary Marcus hasn't already said. (Of course, I'd have written a substantially different post, or none at all, if I believed the technical arguments that AGI is likely to come soon had merit to them)

What I'm not saying

I'm not saying:

  1. That deep learning isn't a major AI advance.
  2. That deep learning won't substantially change the world in the next 20 years (through narrow AI).
  3. That I'm certain that AGI isn't coming in the next 20 years.
  4. That AGI isn't existentially important on long timescales.
  5. That it isn't possible that some AI researchers have asymmetric information indicating that AGI is coming in the next 20 years. (Unlikely, but possible)
  6. That people who have technical expertise shouldn't be evaluating technical arguments on their merits.
  7. That most of what's going on is people consciously lying. (Rather, covert deception hidden from conscious attention (e.g. motivated reasoning) is pervasive; see The Elephant in the Brain)
  8. That many people aren't sincerely confused on the issue.

I'm saying that there are systematic sociopolitical phenomena that cause distortions in AI estimates, especially towards shorter timelines. I'm saying that people are being duped into believing a lie. And at the point where 73% of tech executives say they believe AGI will be developed in the next 10 years, it's a major one.

This has happened before. And, in all likelihood, this will happen again.


Comments sorted by top scores.

comment by Scott Alexander (Yvain) · 2019-07-11T05:46:44.883Z · LW(p) · GW(p)

1. For reasons discussed on comments to previous posts here, I'm wary of using words like "lie" or "scam" to mean "honest reporting of unconsciously biased reasoning". If I criticized this post by calling you a liar trying to scam us, and then backed down to "I'm sure you believe this, but you probably have some bias, just like all of us", I expect you would be offended. But I feel like you're making this equivocation throughout this post.

2. I agree business is probably overly optimistic about timelines, for about the reasons you mention. But reversed stupidity is not intelligence [LW · GW]. Most of the people I know pushing short timelines work in nonprofits, and many of the people you're criticizing in this post are AI professors. Unless you got your timelines from industry, which I don't think many people here did, them being stupid isn't especially relevant to whether we should believe the argument in general. I could find you some field (like religion) where people are biased to believe AI will never happen, but unless we took them seriously before this, the fact that they're wrong doesn't change anything.

3. I've frequently heard people who believe AI might be near say that their side can't publicly voice their opinions, because they'll get branded as loonies and alarmists, and therefore we should adjust in favor of near-termism because long-timelinists get to unfairly dominate the debate. I think it's natural for people on all sides of an issue to feel like their side is uniquely silenced by a conspiracy of people biased towards the other side. See Against Bravery Debates for evidence of this.

4. I'm not familiar with the politics in AI research. But in medicine, I've noticed that doctors who go straight to the public with their controversial medical theory are usually pretty bad, for one of a couple of reasons. Number one, they're usually wrong, people in the field know they're wrong, and they're trying to bamboozle a reading public who aren't smart enough to figure out that they're wrong (but who are hungry for a "Galileo stands up to hidebound medical establishment" narrative). Number two, there's a thing they can do where they say some well-known fact in a breathless tone, and then get credit for having blown the cover of the establishment's lie. You can always get a New Yorker story by writing "Did you know that, contrary to what the psychiatric establishment wants you to believe, SOME DRUGS MAY HAVE SIDE EFFECTS OR WITHDRAWAL SYNDROMES?" Then the public gets up in arms, and the psychiatric establishment has to go on damage control for the next few months and strike an awkward balance between correcting the inevitable massive misrepresentations in the article while also saying the basic premise is !@#$ing obvious and was never in doubt. When I hear people say something like "You're not presenting an alternative solution" in these cases, they mean something like "You don't have some alternate way of treating diseases that has no side effects, so stop pretending you're Galileo for pointing out a problem everyone was already aware of." See Beware Stephen Jay Gould [LW · GW] for Eliezer giving an example of this, and Chemical Imbalance and the followup post for me giving an example of this. I don't know for sure that this is what's going on in AI, but it would make sense.

I'm not against modeling sociopolitical dynamics. But I think you're doing it badly, by taking some things that people on both sides feel, applying it to only one side, and concluding that means the other is involved in lies and scams and conspiracies of silence (while later disclaiming these terms in a disclaimer, after they've had their intended shocking effect).

I think this is one of the cases where we should use our basic rationality tools like probability estimates. Just from reading this post, I have no idea what probability Gary Marcus, Yann LeCun, or Steven Hansen has on AGI in ten years (or fifty years, or one hundred years). For all I know all of them (and you, and me) have exactly the same probability and their argument is completely political about which side is dominant vs. oppressed and who should gain or lose status (remember the issue where everyone assumes LWers are overly certain cryonics will work, whereas in fact they're less sure of this than the general population and just describe their beliefs differently [LW · GW] ). As long as we keep engaging on that relatively superficial monkey-politics "The other side are liars who are silencing my side!" level, we're just going to be drawn into tribalism around the near-timeline and far-timeline tribes, and our ability to make accurate predictions is going to suffer. I think this is worse than any improvement we could get by making sociopolitical adjustments at this level of resolution.

Replies from: sarahconstantin, jessica.liu.taylor, Fluttershy
comment by sarahconstantin · 2019-07-11T13:45:55.486Z · LW(p) · GW(p)

Re: 2: nonprofits and academics have even more incentives than business to claim that a new technology is extremely dangerous. Think tanks and universities are in the knowledge business; they are more valuable when people seek their advice. "This new thing has great opportunities and great risks; you need guidance to navigate and govern it" is a great advertisement for universities and think tanks. Which doesn't mean AI, narrow or strong, doesn't actually have great opportunities and risks! But nonprofits and academics aren't immune from the incentives to exaggerate.

Re: 4: I have a different perspective. The loonies who go to the press with "did you know psychiatric drugs have SIDE EFFECTS?!" are not really a threat to public information to the extent that they are telling the truth. They are a threat to the perceived legitimacy of psychiatrists. This has downsides (some people who could benefit from psychiatric treatment will fear it too much) but fundamentally the loonies are right that a psychiatrist is just a dude who went to school for a long time, not a holy man. To the extent that there is truth in psychiatry, it can withstand the public's loss of reverence, in the long run. Blind reverence for professionals is a freebie, which locally may be beneficial to the public if the professionals really are wise, but is essentially fragile. IMO it's not worth trying to cultivate or preserve. In the long run, good stuff will win out, and smart psychiatrists can just as easily frame themselves as agreeing with the anti-psych cranks in spirit, as being on Team Avoid Side Effects And Withdrawal Symptoms, Unlike All Those Dumbasses Who Don't Care (all two of them).

comment by jessicata (jessica.liu.taylor) · 2019-07-11T06:22:02.728Z · LW(p) · GW(p)
  1. I don't actually know the extent to which Bernie Madoff actually was conscious that he was lying to people. What I do know is that he ran a pyramid scheme. The dynamics happen regardless of how conscious they are. (In fact, they often work through keeping things unconscious)

  2. I'm not making the argument "business is stupid about AI timelines therefore the opposite is right".

  3. Yes, this is a reason to expect distortion in favor of mainstream opinions (including medium-long timelines). It can be modeled along with the other distortions.

  4. Regardless of whether Gary Marcus is "bad" (what would that even mean?), the concrete criticisms aren't ones that imply AI timelines are short, deep learning can get to AGI, etc. They're ones that sometimes imply the opposite, and anyway, ones that systematically distort narratives towards short timelines (as I spelled out). If it's already widely known that deep learning can't do reasoning, then... isn't that reason not to expect short AI timelines, and to expect that many of the non-experts who think so (including tech execs and rationalists) have been duped?

If you think I did the modeling wrong, and have concrete criticisms (such as the criticism that there's a distortionary effect towards long timelines due to short timelines seeming loony), then that's useful. But it seems like you're giving a general counterargument against modeling these sorts of sociopolitical dynamics. If the modeling comes out that there are more distortionary effects in one direction than another, or that there are different distortionary effects in different circumstances, isn't that important to take into consideration rather than dismissing it as "monkey politics"?

Replies from: Vaniver, Yvain, Benquo, quanticle
comment by Vaniver · 2019-07-11T16:45:46.426Z · LW(p) · GW(p)

On 3, I notice this part of your post jumps out to me:

Of course, I'd have written a substantially different post, or none at all, if I believed the technical arguments that AGI is likely to come soon had merit to them

One possibility behind the "none at all" is that 'disagreement leads to writing posts, agreement leads to silence', but another possibility is 'if I think X, I am encouraged to say it, and if I think Y, I am encouraged to be silent.'

My sense is it's more the latter, which makes this seem weirdly 'bad faith' to me. That is, suppose I know Alice doesn't want to talk about biological x-risk in public because of the risk that terrorist groups will switch to using biological weapons, but I think Alice's concerns are overblown and so write a post about how actually it's very hard to use biological weapons and we shouldn't waste money on countermeasures. Alice won't respond with "look, it's not hard, you just do A, B, C and then you kill thousands of people," because this is worse for Alice than public beliefs shifting in a way that seems wrong to her.

It is not obvious what the right path is here. Obviously, we can't let anyone hijack the group epistemology by having concerns about what can and can't be made public knowledge, but also it seems like we shouldn't pretend that everything can be openly discussed in a costless way, or that the costs are always worth it.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T01:24:40.407Z · LW(p) · GW(p)

Alice has the option of finding a generally trusted arbiter, Carol, who she tells the plan to; then, Carol can tell the public how realistic the plan is.

Replies from: Vaniver, tomas-gavenciak
comment by Vaniver · 2019-07-12T02:01:11.539Z · LW(p) · GW(p)

Do we have those generally trusted arbiters? I note that it seems like many people who I think of as 'generally trusted' are trusted because of some 'private information', even if it's just something like "I've talked to Carol and get the sense that she's sensible."

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T02:08:38.243Z · LW(p) · GW(p)

I don't think there are fully general trusted arbiters, but it's possible to bridge the gap with person X by finding person Y trusted by both you and X.

comment by Tomáš Gavenčiak (tomas-gavenciak) · 2019-07-12T10:38:33.127Z · LW(p) · GW(p)

I think that sufficiently universally trusted arbiters may be very hard to find, but Alice can also refrain from that option to prevent the issue from gaining more public attention, believing more attention, or the attention of various groups, to be harmful. I can imagine cases where more credible people (Carols) saying they are convinced that e.g. "it is really easily doable" would disproportionately give more incentive for misuse than for defense (depending on the groups the information reaches, the reliability signals those groups accept, etc.).

comment by Scott Alexander (Yvain) · 2019-07-11T07:30:47.811Z · LW(p) · GW(p)

1. It sounds like we have a pretty deep disagreement here, so I'll write an SSC post explaining my opinion in depth sometime.

2. Sorry, it seems I misunderstood you. What did you mean by mentioning business's very short timelines and all of the biases that might make them have those?

3. I feel like this is dismissing the magnitude of the problem. Suppose I said that the Democratic Party was a lying scam that was duping Americans into believing it, because many Americans were biased to support the Democratic Party for various demographic reasons, or because their families were Democrats, or because they'd seen campaign ads, etc. These biases could certainly exist. But if I didn't even mention that there might be similar biases making people support the Republican Party, let alone try to estimate which was worse, I'm not sure this would qualify as sociopolitical analysis.

4. I was trying to explain why people in a field might prefer that members of the field address disagreements through internal channels rather than the media, for reasons other than that they have a conspiracy of silence. I'm not sure what you mean by "concrete criticisms". You cherry-picked some reasons for believing long timelines; I agree these exist. There are other arguments for believing shorter timelines and that people believing in longer timelines are "duped". What it sounded like you were claiming is that the overall bias is in favor of making people believe in shorter ones, which I think hasn't been proven.

I'm not entirely against modeling sociopolitical dynamics, which is why I ended the sentence with "at this level of resolution". I think a structured attempt to figure out whether there were more biases in favor of long timelines or short timelines (for example, surveying AI researchers on what they would feel uncomfortable saying) would be pretty helpful. I interpreted this post as more like the Democrat example in 3 - cherry-picking a few examples of bias towards short timelines, then declaring short timelines to be a scam. I don't know if this is true or not, but I feel like you haven't supported it.

Bayes Theorem says that we shouldn't update on information that you could get whether or not a hypothesis were true. I feel like you could have written an equally compelling essay "proving" bias in favor of long timelines, of Democrats, of Republicans, or of almost anything; if you feel like you couldn't, I feel like the post didn't explain why you felt that way. So I don't think we should update on the information in this, and I think the intensity of your language ("scam", "lie", "dupe") is incongruous with the lack of update-able information.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-11T17:22:57.652Z · LW(p) · GW(p)
  1. Okay, that might be useful. (For a mainstream perspective on this I have agreement with, see The Scams Are Winning).

  2. The argument for most of the post is that there are active distortionary pressures towards short timelines. I mentioned the tech survey in the conclusion to indicate that the distortionary pressures aren't some niche interest, they're having big effects on the world.

  3. Will discuss later in this comment.

  4. By "concrete criticisms" I mean the Twitter replies. I'm studying the implicit assumptions behind these criticisms to see what it says about attitudes in the AI field.

I feel like you could have written an equally compelling essay “proving” bias in favor of long timelines, of Democrats, of Republicans, or of almost anything; if you feel like you couldn’t, I feel like the post didn’t explain why you felt that way.

I think this is the main thrust of your criticism, and also the main thrust of point 3. I do think lots of things are scams, and I could have written about other things instead, but I wrote about short timelines, because I can't talk about everything in one essay, and this one seems important.

I couldn't have written an equally compelling essay on biases in favor of long timelines without lying, I think, or even with lying while trying to maintain plausibility. (Note also, it seems useful for there to be essays on the Democrat party's marketing strategy that don't also talk about the Republican party's marketing strategy)

Courts don't work by the judge saying "well, you know, you could argue for anything, so what's the point in having people present cases for one side or the other?" The point is that some cases end up stronger than other cases. I can't prove that there isn't an equally strong case that there's bias in favor of long timelines, because that would be proving a negative. (Even if I did, that would be a case of "sometimes there's bias in favor of X, sometimes against X, it depends on the situation/person/etc"; the newly discovered distortionary pressures don't negate the fact that the previously discovered ones exist)

Replies from: dxu, Benito, elityre, Dagon, countingtoten
comment by dxu · 2019-07-11T17:57:00.619Z · LW(p) · GW(p)

I agree that it's difficult (practically impossible) to engage with a criticism of the form "I don't find your examples compelling", because such a criticism is in some sense opaque: there's very little you can do with the information provided, except possibly add more examples (which is time-consuming, and also might not even work if the additional examples you choose happen to be "uncompelling" in the same way as your original examples).

However, there is a deeper point to be made here: presumably you yourself only arrived at your position after some amount of consideration. The fact that others appear to find your arguments (including any examples you used) uncompelling, then, usually indicates one of two things:

  1. You have not successfully expressed the full chain of reasoning that led you to originally adopt your conclusion (owing perhaps to constraints on time, effort, issues with legibility, or strategic concerns). In this case, you should be unsurprised at the fact that other people don't appear to be convinced by your post, since your post does not present the same arguments/evidence that convinced you yourself to believe your position.

  2. You do, in fact, find the raw examples in your post persuasive. This would then indicate that any disagreement between you and your readers is due to differing priors, i.e. evidence that you would consider sufficient to convince yourself of something, does not likewise convince others. Ideally, this fact should cause you to update in favor of the possibility that you are mistaken, at least if you believe that your interlocutors are being rational and intellectually honest.

I don't know which of these two possibilities it actually is, but it may be worth keeping this in mind if you make a post that a bunch of people seem to disagree with.

comment by Ben Pace (Benito) · 2019-07-17T08:03:27.127Z · LW(p) · GW(p)

Scott's post explaining his opinion is here, and is called 'Against Lie Inflation'.

comment by Eli Tyre (elityre) · 2021-05-02T04:20:01.141Z · LW(p) · GW(p)

Note also, it seems useful for there to be essays on the Democrat party's marketing strategy that don't also talk about the Republican party's marketing strategy

Minor, unconfident, point: I'm not sure that this is true. It seems like it would result in people mostly fallacy-fallacy-ing the other side, each with their own "look how manipulative the other guys are" essays. If the target is thoughtful people trying to figure things out, they'll want to hear about both sides, no?

comment by Dagon · 2019-07-11T18:40:10.047Z · LW(p) · GW(p)
Courts don't work by the judge saying "well, you know, you could argue for anything, so what's the point in having people present cases for one side or the other?" The point is that some cases end up stronger than other cases.

I think courts spend a fair bit of effort not just on evaluating the strength of a case, but on the standing and impact of the case. Not "what else could you argue?", but "why does this complaint matter, and to whom?"

IMO, you're absolutely right that there are lots of pressures to make unrealistically short predictions for advances, and this causes a lot of punditry, academia, and industry to ... what? It's annoying, but who is harmed, and who has the ability to improve things?

Personally, I think timeline for AGI is a poorly-defined prediction - the big question is what capabilities satisfy the "AGI" definition. I think we WILL see more and more impressive performance in aspects of problem-solving and prediction that would have been classified as "intelligence" 50 years ago, but that we probably won't credit with consciousness or generality.

comment by countingtoten · 2019-07-14T02:26:10.423Z · LW(p) · GW(p)

I couldn't have written an equally compelling essay on biases in favor of long timelines without lying, I think,

Then perhaps you should start here [LW(p) · GW(p)].

comment by Benquo · 2019-07-12T02:10:22.339Z · LW(p) · GW(p)
I don't actually know the extent to which Bernie Madoff actually was conscious that he was lying to people. What I do know is that he ran a pyramid scheme.

The eponymous Charles Ponzi had a plausible arbitrage idea backing his famous scheme; it's not unlikely that he was already in over his head (and therefore desperately trying to make himself believe he'd find some other way to make his investors whole) by the time he found out that transaction costs made the whole thing impractical.

comment by quanticle · 2019-07-18T19:25:53.769Z · LW(p) · GW(p)

I don’t actually know the extent to which Bernie Madoff actually was conscious that he was lying to people. What I do know is that he ran a pyramid scheme. The dynamics happen regardless of how conscious they are. (In fact, they often work through keeping things unconscious)

Bernie Madoff pleaded guilty to running a pyramid scheme. As part of his guilty plea, he admitted that he stopped trading in the 1990s and had been paying returns out of capital since then.

I think this is an important point to make, since the implicit lesson I'm reading here is that there's no difference between giving false information intentionally ("lying") and giving false information unintentionally ("being wrong"). I would caution that that is a dangerous road to go down, as it just leads to people being silent. I would much rather receive optimistic estimates from AI advocates than receive no estimates at all. I can correct for systematic biases in data. I cannot correct for the absence of data.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-18T19:32:50.118Z · LW(p) · GW(p)

Of course there's an important difference between lying and being wrong. It's a question of knowledge states. Unconscious lying is a case in which someone says something they unconsciously know to be false/unlikely.

If the estimates are biased, you can end up with worse beliefs than you would by just using an uninformative prior. Perhaps some are savvy enough to know about the biases involved (in part because of people like me writing posts like the one I wrote), but others aren't, and get tricked into having worse beliefs than if they had used an uninformative prior.
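As a toy numerical sketch of this point (the model and all numbers are my own illustration, not anything from the post): suppose the true timeline is uniform on 5-50 years, and a biased forecaster systematically reports half the true value. A listener who takes the report at face value ends up with a higher expected squared error than one who ignores it and sticks with the prior mean.

```python
import random

random.seed(0)

N = 100_000
prior_mean = (5 + 50) / 2  # mean of the uninformative prior

naive_err = 0.0   # error of believing the biased report
prior_err = 0.0   # error of ignoring it and using the prior mean
for _ in range(N):
    t = random.uniform(5, 50)   # true timeline, in years
    report = t / 2              # systematically optimistic report
    naive_err += (report - t) ** 2
    prior_err += (prior_mean - t) ** 2

print(naive_err / N > prior_err / N)  # the biased report is worse: True
```

(Analytically: the naive listener's mean squared error is E[t^2]/4 = 231.25, versus the prior's variance of 168.75, so being "informed" by the biased report really is worse here.)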

I am not trying to punish people, I am trying to make agent-based models.

(Regarding Madoff, what you present is suggestive, but it doesn't prove that he was conscious that he had no plans to trade and was deceiving his investors. We don't really know what he was conscious of and what he wasn't.)

comment by Fluttershy · 2019-07-12T03:16:17.138Z · LW(p) · GW(p)
I'm wary of using words like "lie" or "scam" to mean "honest reporting of unconsciously biased reasoning"

When someone is systematically trying to convince you of a thing, do not be like, "nice honest report", but be like, "let me think for myself whether that is correct".

Replies from: ESRogs
comment by ESRogs · 2019-07-12T06:21:06.048Z · LW(p) · GW(p)
but be like, "let me think for myself whether that is correct".

From my perspective, describing something as "honest reporting of unconsciously biased reasoning" seems much more like an invitation for me to think for myself whether it's correct than calling it a "lie" or a "scam".

Calling your opponent's message a lie and a scam actually gets my defenses up that you're the one trying to bamboozle me, since you're using such emotionally charged language.

Maybe others react to these words differently though.

Replies from: elityre, Fluttershy
comment by Eli Tyre (elityre) · 2021-05-02T04:25:09.294Z · LW(p) · GW(p)

This comment is such a good example of managing to be non-triggering in making the point. It stands out to me amongst all the comments above it, which are at least somewhat heated.

Replies from: ESRogs
comment by ESRogs · 2021-05-08T03:59:22.401Z · LW(p) · GW(p)


comment by Fluttershy · 2019-07-12T07:21:26.796Z · LW(p) · GW(p)

It's a pretty clear way of endorsing something to call it "honest reporting".

Replies from: ESRogs
comment by ESRogs · 2019-07-12T07:38:22.513Z · LW(p) · GW(p)

Sure if you just call it "honest reporting". But that was not the full phrase used. The full phrase used was "honest reporting of unconsciously biased reasoning".

I would not call trimming that down to "honest reporting" a case of honest reporting! ;-)

If I claim, "Joe says X, and I think he honestly believes that, though his reasoning is likely unconsciously biased here", then that does not at all seem to me like an endorsement of X, and certainly not a clear endorsement.

comment by paulfchristiano · 2019-07-11T08:42:48.090Z · LW(p) · GW(p)

I agree with:

  • Most people trying to figure out what's true should be mostly trying to develop views on the basis of public information and not giving too much weight to supposed secret information.
  • It's good to react skeptically to someone claiming "we have secret information implying that what we are doing is super important."
  • Understanding the sociopolitical situation seems like a worthwhile step in informing views about AI.
  • It would be wild if 73% of tech executives thought AGI would be developed in the next 10 years. (And independent of the truth of that claim, people do have a lot of wild views about automation.)

I disagree with:

  • Norms of discourse in the broader community are significantly biased towards short timelines. The actual evidence in this post seems thin and cherry-picked. I think the best evidence is the a priori argument "you'd expect people to be biased towards short timelines, given that short timelines make our work seem more important." I think that's good as far as it goes, but the conclusion is overstated here.
  • "Whistleblowers" about long timelines are ostracized or discredited. Again, the evidence in your post seems thin and cherry-picked, and your contemporary example seems wrong to me (I commented separately). It seems like most people complaining about deep learning or short timelines have a good time in the AI community, and people with the "AGI in 20 years" view are regarded much more poorly within academia and most parts of industry. This could be about different fora and communities being in different equilibria, but I'm not really sure how that's compatible with "ostracizing." (It feels like you are probably mistaken about the tenor of discussions in the AI community.)
  • That 73% of tech executives thought AGI would be developed in the next 10 years. Willing to bet against the quoted survey: the white paper is thin on details and leaves lots of wiggle room for chicanery, while the project seems thoroughly optimized to make AI seem like a big deal soon. The claim also just doesn't seem to match my experience with anyone who might be called tech executives (though I don't know how they constructed the group).
Replies from: Vika
comment by Vika · 2019-07-22T19:46:43.971Z · LW(p) · GW(p)

Definitely agree that the AI community is not biased towards short timelines. Long timelines are the dominant view, while the short timelines view is associated with hype. Many researchers are concerned about the field losing credibility (and funding) if the hype bubble bursts, and this is especially true for those who experienced the AI winters. They see the long timelines view as appropriately skeptical and more scientifically respectable.

Some examples of statements from high-profile AI researchers that AGI is far away:

Geoffrey Hinton:

Yann LeCun:

Yoshua Bengio: [LW · GW]

Rodney Brooks:

comment by zacharylipton · 2019-07-12T18:29:19.868Z · LW(p) · GW(p)

Hi Jessica. Nice post and I agree with many of your points. Certainly, I believe—as you do—that a number of bad actors are wielding the specter of AGI sloppily and irresponsibly, either to consciously defraud people or on account of buying into something that speaks more to the messianic than to science. Perhaps ironically, one frequent debate that I have had with Gary in the past is that while he is vocally critical of exuberance over deep learning, he is himself partial to speaking rosily of nearish-term AGI, and of claiming progress (or being on the verge of progress) towards it. On the other hand, I am considerably more skeptical.

While I enjoyed the post and think we agree on many points, if you don’t mind I would like to respectfully note that I’ve been quoted here slightly out of context and would like to supply that missing context. To be sure, I think your post is written well and with honest intentions, and I know how easy it is to miss some context in Twitter threads [especially as it seems that many tweets have been deleted from this thread].

Regarding my friend Gary Marcus. I like Gary and we communicate fairly regularly, but we don’t always agree on the science or on the discourse.

In this particular case, he was specifically insisting to be “the first” to make a set of arguments (which contradicted my understanding of history). When I say “Saying it louder ≠ saying it first.”, I am truly just pushing back on this specific point—the assertion that he said it first. Among others, Judea Pearl has argued the limits of curve fitting far more rigorously and far earlier in history.

There is nothing dishonest or contradictory about agreeing with a broader point and simultaneously disagreeing with a man’s claim to have originated it. I took exception to the hyperbole, not the message.

After the quote, your post notes: “but, this is essentially admitting that Marcus is correct, while still criticizing him for saying it”—what is there to “admit”? I myself have made similar critical arguments in technical papers, position papers, blog posts, and the popular press. Characterizing “agreement” as “admitting” something makes the false insinuation that somehow I have been on the wrong side of the debate.

For the record, I am acknowledged in Gary’s paper on limitations of deep learning (which you reference here) for giving him a large amount of constructive feedback, and have myself, perhaps in defiance of Adam Smith’s aphorism, been vocally critical of my own community within the technical forums, recently publishing “Troubling Trends in Machine Learning Scholarship” at the ICML debates, which was subsequently published by CACM. I suspect this piece (as well as much of my writing at and in other formal position papers) is in the spirit of the sort of critical writing that you appear to encourage.

Further, I’d like to address the second part of the discussion. When I say “Saying X doesn’t solve Y is pretty easy. But where are your concrete solutions for Y?”, my point is that Gary doesn’t just push back on the false claims made about current technology that doesn’t do Y. He also sometimes makes public attacks on the people working on X. It would seem that their crime is that they haven’t developed technical solutions to the grand challenge of Y. If failing to solve these particular moonshots (true reasoning, solving common sense, an elegant formulation of symbol manipulation and synthesis with pattern recognition) is a crime, then Gary too is just as guilty, and the attack ought to be levied with greater humility. These attacks strike me as inappropriate and misplaced (compared to the more reasonable push-back on misinformation in the public sphere). ***To be clear, while I understand why you might have drawn that conclusion from this half-tweet, I do not believe that one must have a solution in hand to levy criticism, and my writing and technical papers attest to this.***

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T18:11:03.994Z · LW(p) · GW(p)

Thanks a lot for the clarification, and sorry I took the quote out of context! I've added a note linking to this response.

comment by sarahconstantin · 2019-07-11T14:11:00.368Z · LW(p) · GW(p)

Basically, AI professionals seem to be trying to manage the hype cycle carefully.

Ignorant people tend to be more all-or-nothing than experts. By default, they'll see AI as "totally unimportant or fictional", "a panacea, perfect in every way" or "a catastrophe, terrible in every way." And they won't distinguish between different kinds of AI.

Currently, the hype cycle has gone from "professionals are aware that deep learning is useful" (c. 2013) to "deep learning is AI and it is wonderful in every way and you need some" (c. 2015?) to "maybe there are problems with AI? burn it with fire! Nationalize! Ban!" (c. 2019).

Professionals who are still working on the "deep learning is useful for certain applications" project (which is pretty much where I sit) are quite worried about the inevitable crash when public opinion shifts from "wonderful panacea" to "burn it with fire." When the public opinion crash happens, legitimate R&D is going to lose funding, and that will genuinely be unfortunate. Everyone savvy knows this will happen. Nobody knows exactly when. There are various strategies for dealing with it.

Accelerate the decline: this is what Gary Marcus is doing.

Carve out a niche as an AI Skeptic (who is still in the AI business himself!) Then, when the funding crunch comes, his companies will be seen as "AI that even the skeptic thinks is legit" and have a better chance of surviving.

Be Conservative: this is a less visible strategy but a lot of people are taking it, including me.

Use AI only in contexts that are well justified by evidence, like rapid image processing to replace manual classification. That way, when the funding crunch happens, you'll be able to say you're not just using AI as a buzzword, you're using well-established, safe methods that have a proven track record.

Pivot Into Governance: this is what a lot of AI risk orgs are doing

Benefit from the coming backlash by becoming an advisor to regulators. Make a living not by building the tech but by talking about its social risks and harms. I think this is actually a fairly weak strategy because it's parasitic on the overall market for AI. There's no funding for AI think tanks if there's no funding for AI itself. But it's an ideal strategy for the cusp period when we're shifting from blind enthusiasm to blind panic.

Preserve Credibility: this is what Yann LeCun is doing and has been doing from day 1 (he was a deep learning pioneer and promoter even before the spectacular empirical performance results came in)

Try to forestall the backlash. Frame AI as good, not bad, and try to preserve the credibility of the profession as long as you can. Argue (honestly but selectively) against anyone who says anything bad about deep learning for any reason.

Any of these strategies may say true things! In fact, assuming you really are an AI expert, the smartest thing to do in the long run is to say only true things, and use connotation and selective focus to define your rhetorical strategy. Reality has no branding; there are true things to say that comport with all four strategies. Gary Marcus is a guy in the "AI Skeptic" niche saying things that are, afaik, true; there are people in that niche who are saying false things. Yann LeCun is a guy in the "Preserve AI Credibility" niche who says true things; when Gary Marcus says true things, Yann LeCun doesn't deny them, but criticizes Marcus's tone and emphasis. Which is quite correct; it's the most intellectually rigorous way to pursue LeCun's chosen strategy.

Replies from: elityre
comment by Eli Tyre (elityre) · 2021-05-02T04:38:46.663Z · LW(p) · GW(p)

Pivot Into Governance: this is what a lot of AI risk orgs are doing

What? How exactly is this a way of dealing with the hype bubble bursting? It seems like if it bursts for AI, it bursts for "AI governance"?

Am I missing something?

Replies from: elityre
comment by Eli Tyre (elityre) · 2021-05-02T04:39:30.631Z · LW(p) · GW(p)

Never mind. It seems like I should have just kept reading.

comment by Vaniver · 2019-07-11T16:52:12.759Z · LW(p) · GW(p)

[Note: this, and all comments on this post unless specified otherwise, is written with my 'LW user' hat on, not my 'LW Admin' or 'MIRI employee' hat on, and thus is my personal view instead of the LW view or the MIRI view.]

As someone who thinks about AGI timelines a lot, I find myself dissatisfied with this post because it's unclear which "AI Timelines Scam" you're talking about, and I'm worried that if I poke at the bits it'll feel like a motte and bailey, where it seems quite reasonable to me that '73% of tech executives thinking that the singularity will arrive in <10 years is probably just inflated 'pro-tech' reasoning,' but also it seems quite unreasonable to suggest that strategic considerations about dual-use technology should be discussed openly (or should be discussed openly because tech executives have distorted beliefs). It also seems like there's an argument for weighting urgency in planning that could lead to 'distorted' timelines while being a rational response to uncertainty.

On the first point, I think the following might be a fair description of some thinkers in the AGI space, but don't think this is a fair summary of MIRI (and I think it's illegible, to me at least, whether you are intending this to be a summary of MIRI):

This bears similarity to some conversations on AI risk I've been party to in the past few years. The fear is that Others (DeepMind, China, whoever) will develop AGI soon, so We have to develop AGI first in order to make sure it's safe, because Others won't make sure it's safe and We will. Also, We have to discuss AGI strategy in private (and avoid public discussion), so Others don't get the wrong ideas. (Generally, these claims have little empirical/rational backing to them; they're based on scary stories, not historically validated threat models)

I do think it makes sense to write more publicly about the difficulties of writing publicly, but there's always going to be something odd about it. Suppose I have 5 reasons for wanting discussions to be private, and 3 of them I can easily say. Discussing those three reasons will give people an incomplete picture that might seem complete, in a way that saying "yeah, the sum of factors is against" won't. Further, without giving specific examples, it's hard to see which of the ones that are difficult to say you would endorse and which you wouldn't, and it's not obvious to me legibility is the best standard here.

But my simple sense is that openly discussing whether or not nuclear weapons were possible (a technical claim on which people might have private information, including intuitions informed by their scientific experience) would have had costs and it was sensible to be secretive about it. If I think that timelines are short because maybe technology X and technology Y fit together neatly, then publicly announcing that increases the chances that we get short timelines because someone plugs together technology X and technology Y. It does seem like marginal scientists speed things up here.

Now, I'm paying a price here; it may be the case that people have tried to glue together technology X and technology Y and it won't work. I think private discussions on this are way better than no discussions on this, because it increases the chances that those sorts of crucial facts get revealed. It's not obvious that public discussions are all that much better on these grounds.

On the second point, it feels important to note that the threshold for "take something seriously" is actually quite small. I might think that the chance that I have Lyme disease is 5%, and yet that motivates significant action because of hugely asymmetric cost considerations, or rapid decrease in efficacy of action. I think there's often a problem where someone 'has short timelines' in the sense that they think 10-year scenarios should be planned about at all, but this can be easily mistaken for 'they think 10-year scenarios are most likely' because often if you think both an urgent concern and a distant concern are possible, almost all of your effort goes into the urgent concern instead of the distant concern (as sensible critical-path project management would suggest).

Replies from: ricraz, elityre, Fluttershy, jessica.liu.taylor
comment by Richard_Ngo (ricraz) · 2019-07-11T19:30:17.407Z · LW(p) · GW(p)
But my simple sense is that openly discussing whether or not nuclear weapons were possible (a technical claim on which people might have private information, including intuitions informed by their scientific experience) would have had costs and it was sensible to be secretive about it. If I think that timelines are short because maybe technology X and technology Y fit together neatly, then publicly announcing that increases the chances that we get short timelines because someone plugs together technology X and technology Y. It does seem like marginal scientists speed things up here.

I agree that there are clear costs to making extra arguments of the form "timelines are short because technology X and technology Y will fit together neatly". However, you could still make public that your timelines are a given probability distribution D, and the reasons which led you to that conclusion are Z% object-level views which you won't share, and (100-Z)% base rate reasoning and other outside-view considerations, which you will share.

I think there are very few costs to declaring which types of reasoning you're most persuaded by. There are some costs to actually making the outside-view reasoning publicly available - maybe people who read it will better understand the AI landscape and use that information to do capabilities research.

But having a lack of high-quality public timelines discussion also imposes serious costs, for a few reasons:

1. It means that safety researchers are more likely to be wrong, and therefore end up doing less relevant research. I am generally pretty skeptical of reasoning that hasn't been written down and undergone public scrutiny.

2. It means there's a lot of wasted motion across the safety community, as everyone tries to rederive the various arguments involved, and figure out why other people have the views they do, and who they should trust.

3. It makes building common knowledge (and the coordination which that knowledge can be used for) much harder.

4. It harms the credibility of the field of safety from the perspective of outside observers, including other AI researchers.

Also, the more of a risk you think 1 is, the lower the costs of disclosure are, because it becomes more likely that any information gleaned from the disclosure is wrong anyway. Yet predicting the future is incredibly hard! So the base rate for correctness here is low. And I don't think that safety researchers have a compelling advantage when it comes to correctly modelling how AI will reach human level (compared with thoughtful ML researchers).

Consider, by analogy, a debate two decades ago about whether to make public the ideas of recursive self-improvement and fast takeoff. The potential cost of that is very similar to the costs of disclosure now - giving capabilities researchers these ideas might push them towards building self-improving AIs faster. And yet I think making those arguments public was clearly the right decision. Do you agree that our current situation is fairly analogous?

EDIT: Also, I'm a little confused by

Suppose I have 5 reasons for wanting discussions to be private, and 3 of them I can easily say.

I understand that there are good reasons for discussions to be private, but can you elaborate on why we'd want discussions about privacy to be private?

Replies from: Vaniver
comment by Vaniver · 2019-07-12T04:58:45.570Z · LW(p) · GW(p)

I mostly agree with your analysis; especially the point about 1 (that the more likely I think my thoughts are to be wrong, the lower cost it is to share them).

I understand that there are good reasons for discussions to be private, but can you elaborate on why we'd want discussions about privacy to be private?

Most examples here have the difficulty that I can't share them without paying the costs, but here's one that seems pretty normal:

Suppose someone is a student and wants to be hired later as a policy analyst for governments, and believes that governments care strongly about past affiliations and beliefs. Then it might make sense for them to censor themselves in public under their real name because of potential negative consequences of things they said when young. However, any statement of the form "I specifically want to hide my views on X" made under their real name has similar possible negative consequences, because it's an explicit admission that the person has something to hide.

Currently, people hiding their unpopular opinions to avoid career consequences is fairly standard, and so it's not that damning to say "I think this norm is sensible" or maybe even "I follow this norm," but it seems like it would have been particularly awkward to be the first person to explicitly argue for that norm.

comment by Eli Tyre (elityre) · 2019-07-12T01:32:52.279Z · LW(p) · GW(p)


...if you think both an urgent concern and a distant concern are possible, almost all of your effort goes into the urgent concern instead of the distant concern (as sensible critical-path project management would suggest).

This isn't obvious to me. And I would be interested in a post laying out the argument, in general or in relation to AI.

Replies from: Benito, Vaniver
comment by Ben Pace (Benito) · 2019-07-12T03:00:48.318Z · LW(p) · GW(p)

The standard cite is Owen CB’s paper Allocating Risk Mitigation Across Time. Here’s one quote on this topic:

Suppose we are also unsure about when we may need the problem solved by. In scenarios where the solution is needed earlier, there is less time for us to collectively work on a solution, so there is less work on the problem than in scenarios where the solution is needed later. Given the diminishing returns on work, that means that a marginal unit of work has a bigger expected value in the case where the solution is needed earlier. This should update us towards working to address the early scenarios more than would be justified by looking purely at their impact and likelihood.


There are two major factors which seem to push towards preferring more work which focuses on scenarios where AI comes soon. The first is nearsightedness: we simply have a better idea of what will be useful in these scenarios. The second is diminishing marginal returns: the expected effect of an extra year of work on a problem tends to decline when it is being added to a larger total. And because there is a much larger time horizon in which to solve it (and in a wealthier world), the problem of AI safety when AI comes later may receive many times as much work as the problem of AI safety for AI that comes soon. On the other hand one more factor preferring work on scenarios where AI comes later is the ability to pursue more leveraged strategies which eschew object-level work today in favour of generating (hopefully) more object-level work later.

The above is a slightly unrepresentative quote; the paper is largely undecided as to whether shorter term strategies or longer term strategies are more valuable (given uncertainty over timelines), and recommends a portfolio approach (running multiple strategies, that each apply to different timelines). But that’s the sort of argument I think Vaniver was referring to.
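The diminishing-returns point in the first quoted paragraph can be made concrete with a toy model (entirely my own construction, not from the paper): say the value of safety work is the square root of the total work done by the deadline, with one unit of work accruing per year. Then a marginal unit of work is worth more in short-timeline scenarios, simply because less total work will have accumulated by then.

```python
import math

def marginal_value(years_until_solution_needed, work_per_year=1.0):
    """Marginal value of one extra unit of work, under the toy
    assumption value(W) = sqrt(W), where W is total work accumulated
    by the deadline."""
    total_work = years_until_solution_needed * work_per_year
    # d/dW sqrt(W) = 1 / (2 * sqrt(W)): smaller totals, bigger margins
    return 1 / (2 * math.sqrt(total_work))

# Marginal work is worth more when the solution is needed sooner.
print(marginal_value(10) > marginal_value(40))  # True
```

This is only the diminishing-returns factor; the quoted passage notes countervailing factors (like more leveraged strategies for long timelines), which is why the paper recommends a portfolio rather than all-in on short timelines.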

comment by Vaniver · 2019-07-12T05:08:34.364Z · LW(p) · GW(p)

Specifically, 'urgent' is measured by the difference between the time you have and the time the task will take. If I need the coffee to be done in 15 minutes and the bread to be done in an hour, but getting the bread done in an hour means preheating the oven now (whereas the coffee only takes 10 minutes to brew start to finish), then preheating the oven is urgent whereas brewing the coffee has 5 minutes of float time. If I haven't started the coffee in 5 minutes, then it becomes urgent. See critical path analysis and Gantt charts and so on.

This might be worth a post? It feels like it'd be low on my queue but might also be easy to write.
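For what it's worth, the float computation in the example above is just a subtraction (treating "now" as time zero; the numbers are the ones from the comment, with "preheat plus bake" taking the full hour):

```python
def float_time(deadline, duration, now=0):
    """Slack before a task becomes urgent: how long its start can slip.
    A float of 0 means the task is urgent right now."""
    return deadline - now - duration

print(float_time(deadline=15, duration=10))  # coffee: 5 minutes of float
print(float_time(deadline=60, duration=60))  # preheat+bake: 0, urgent now
```

The critical path is just the chain of tasks with zero float; anything on it is urgent regardless of how far away its deadline is.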

comment by Fluttershy · 2019-07-12T03:23:01.566Z · LW(p) · GW(p)
It also seems like there's an argument for weighting urgency in planning that could lead to 'distorted' timelines while being a rational response to uncertainty.

It's important to do the "what are all the possible outcomes and what are the probabilities of each" calculation before you start thinking about weightings of how bad/good various outcomes are.

Replies from: ESRogs
comment by ESRogs · 2019-07-12T06:39:22.201Z · LW(p) · GW(p)

Could you say more about what you mean here? I don't quite see the connection between your comment and the point that was quoted.

I understand the quoted bit to be pointing out that if you don't know when a disaster is coming you _might_ want to prioritize preparing for it coming sooner rather than later (e.g. since there's a future you who will be available to prepare for the disaster if it comes in the future, but you're the only you available to prepare for it if it comes tomorrow).

Of course you could make a counter-argument that perhaps you can't do much of anything in the case where disaster is coming soon, but in the long-run your actions can compound, so you should focus on long-term scenarios. But the quoted bit is only saying that there's "an argument", and doesn't seem to be making a strong claim about which way it comes out in the final analysis.

Was your comment meaning to suggest the possibility of a counter-argument like this one, or something else? Did you interpret the bit you quoted the same way I did?

Replies from: Fluttershy
comment by Fluttershy · 2019-07-12T07:24:11.126Z · LW(p) · GW(p)

Basically, don't let your thinking on what is useful affect your thinking on what's likely.

comment by jessicata (jessica.liu.taylor) · 2019-07-12T19:58:48.839Z · LW(p) · GW(p)

While there are often good reasons to keep some specific technical details of dangerous technology secret, keeping strategy secret is unwise.

In this comment, by "public" I mean "the specific intellectual public who would be interested in your ideas if you shared them", not "the general public". (I'm arguing for transparency, not mass-marketing)

Either you think the public should, in general, have better beliefs about AI strategy, or you think the public should, in general, have worse beliefs about AI strategy, or you think the public should have exactly the level of epistemics about AI strategy that it does.

If you think the public should, in general, have better beliefs about AI strategy: great, have public discussions. Maybe some specific discussions will be net-negative, but others will be net-positive, and the good will outweigh the bad.

If you think the public should, in general, have worse beliefs about AI strategy: unless you have a good argument for this, the public has reason to think you're not acting in the public interest at this point, and are also likely acting against it.

There are strong prior reasons to think that it's better for the public to have better beliefs about AI strategy. To the extent that "people doing stupid things" is a risk, that risk comes from people having bad strategic beliefs. Also, to the extent that "people not knowing what each other is going to do and getting scared" is a risk, the risk comes from people not sharing their strategies with each other. It's common for multiple nations to spy on each other to reduce the kind of information asymmetries that can lead to unnecessary arms races, preemptive strikes, etc.

This doesn't rule out that there may come a time when there are good public arguments that some strategic topics should stop being discussed publicly. But that time isn't now.

Replies from: dxu, Vaniver
comment by dxu · 2019-07-12T21:55:21.046Z · LW(p) · GW(p)

There are strong prior reasons to think that it's better for the public to have better beliefs about AI strategy.

That may be, but note that the word "prior" is doing basically all of the work in this sentence. (To see this, just replace "AI strategy" with practically any other subject, and notice how the modified statement sounds just as sensible as the original.) This is important because priors can easily be overwhelmed by additional evidence--and insofar as AI researcher Alice thinks a specific discussion topic in AI strategy has the potential to be dangerous, it's worth realizing Alice probably has some specific inside view [LW · GW] reasons to believe that's the case. And, if those inside view arguments happen to require an understanding of the topic that Alice believes to be dangerous, then Alice's hands are now tied: she's both unable to share information about something, and unable to explain why she can't share that information.
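To make the "priors can easily be overwhelmed" point concrete, here is a small illustrative calculation in the odds form of Bayes' theorem. The specific numbers (a 9:1 prior, a 20:1 likelihood ratio) are made up purely for illustration and are not drawn from the discussion above:

```python
# Bayesian updating in odds form: posterior odds = prior odds * likelihood ratio.
# Illustrates how even a strong prior can be overwhelmed by inside-view evidence.

def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds form of Bayes' theorem."""
    return prior_odds * likelihood_ratio

# Strong prior: 9:1 odds that open discussion of some topic is net-positive.
prior = 9.0

# Alice's inside-view evidence is 20x more likely if the topic is dangerous,
# i.e. a likelihood ratio of 1/20 in favor of "net-positive".
posterior = posterior_odds(prior, 1 / 20)

print(posterior)      # 0.45: odds now below 1, favoring "dangerous"
print(posterior < 1)  # True: the strong prior has been overwhelmed
```

A modest amount of evidence (a factor of 20) is enough to flip a fairly confident prior (9:1), which is why the prior alone settles little here.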

Naturally, this doesn't just make Alice's life more difficult: if you're someone on the outside looking in, then you have no way of confirming if anything Alice says is true, and you're forced to resort to just trusting Alice. If you don't have a whole lot of trust in Alice to begin with, you might assume the worst of her: Alice is either rationalizing or lying (or possibly both) in order to gain status for herself and the field she works in.

I think, however, that these are dangerous assumptions to make. Firstly, if Alice is being honest and rational, then this policy effectively punishes her for being "in the know"--she must either divulge information she (correctly) believes to be dangerous, or else suffer an undeserved reputational hit. I'm particularly wary of imposing incentive structures of this kind around AI safety research, especially considering the relatively small number of people working on AI safety to begin with.

Secondly, however: in addition to being unfair to Alice, there are more subtle effects that such a policy may have. In particular, if Alice feels pressured to disclose the reasons she can't disclose things, that may end up influencing the rate and/or quality of the research she does in the first place (Ctrl+F "walls"). This could have serious consequences down the line for AI safety research, above and beyond the object-level hazards of revealing potentially dangerous ideas to the public.

Given all of this, I don't think it's obvious that the best move at this point involves making all of the strategic arguments around AI safety public. (And note that I say this as a member of said public: I am not affiliated with MIRI or any other AI safety institution, nor am I personally acquainted with anyone who is so affiliated. This therefore makes me a direct counter-example to your claim about the public in general having reason to think secret-keeping organizations must be doing so for self-interested reasons.)

To be clear: I think there is a possible world in which your arguments make sense. I also think there is a possible world in which your arguments not only do not make sense, but would lead to a clearly worse outcome if taken seriously. It's not clear to me which of these worlds we actually live in, and I don't think you've done a sufficient job of arguing that we live in the former world instead of the latter.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T22:16:47.573Z · LW(p) · GW(p)

If someone's claiming "topic X is dangerous to talk about, and I'm not even going to try to convince you of the abstract decision theory implying this, because this decision theory is dangerous to talk about", I'm not going to believe them, because that's frankly absurd.

It's possible to make abstract arguments that don't reveal particular technical details, such as by referring to historical cases, or talking about hypothetical situations.

It's also possible for Alice to convince Bob that some info is dangerous by giving the info to Carol, who is trusted by both Alice and Bob, after which Carol tells Bob how dangerous the info is.

If Alice isn't willing to do any of these things, fine, there's a possible but highly unlikely world where she's right, and she takes a reputation hit due to the "unlikely" part of that sentence.

(Note, the alternative hypothesis isn't just direct selfishness; what's more likely is cliquish inner ring dynamics)

comment by Vaniver · 2019-09-17T22:14:14.206Z · LW(p) · GW(p)

I haven't had time to write my thoughts on when strategy research should and shouldn't be public, but I note that this recent post by Spiracular [LW · GW] touches on many of the points that I would touch on in talking about the pros and cons of secrecy around infohazards.

The main claim that I would make about extending this to strategy is that strategy implies details. If I have a strategy that emphasizes that we need to be careful around biosecurity, that implies technical facts about the relative risks of biology and other sciences.

For example, the US developed the Space Shuttle with a justification that didn't add up (ostensibly it would save money, but it was obvious that it wouldn't). The Soviets, trusting in the rationality of the US government, inferred that there must be some secret application for which the Space Shuttle was useful, and so developed a clone (so that when the secret application was unveiled, they would be able to deploy it immediately instead of having to build their own shuttle from scratch at that point). If in fact an application like that had existed, it seems likely that the Soviets could have found it by reasoning through "what do they know that I don't?" when they might not have found it by reasoning from scratch.

comment by paulfchristiano · 2019-07-11T07:50:18.378Z · LW(p) · GW(p)

For reference, the Gary Marcus tweet in question is:

“I’m not saying I want to forget deep learning... But we need to be able to extend it to do things like reasoning, learning causality, and exploring the world.” - Yoshua Bengio, not unlike what I have been saying since 2012 in The New Yorker.

I think Zack Lipton objected to this tweet because it appears to be trying to claim priority. (You might have thought it's ambiguous whether he's claiming priority, but he clarifies in the thread: "But I did say this stuff first, in 2001, 2012 etc?") The tweet and his writings more generally imply that people in the field have recently changed their view to agree with him, but many people in the field object strongly to this characterization.

The tweet is mostly just saying "I told you so." That seems like a fine time for people to criticize him about making a land grab rather than engaging on the object level, since the tweet doesn't have much object-level content. For example:

"Saying it louder ≠ saying it first. You can't claim credit for differentiating between reasoning and pattern recognition." [...] is essentially a claim that everybody knows that deep learning can't do reasoning. But, this is essentially admitting that Marcus is correct, while still criticizing him for saying it.

Hopefully Zack's argument makes more sense if you view it as a response to Gary Marcus claiming priority, which is what Gary Marcus was doing and clearly what Zack is responding to. This is not a substitute for engagement on the object level, but saying "someone else, and in fact many people in the relevant scientific field, already understood this point" is an excellent response to someone who's trying to claim credit for the point.

There are reasonable points to make about social epistemology here, but I think you're overclaiming about the treatment of critics, and that this thread in particular is a bad example to point to. It also seems like you may be mistaken about some of the context. (Zack Lipton has no love for short-timelines-pushers and isn't shy about it. He's annoyed at Gary Marcus for making bad arguments and claiming unwarranted credit, which really is independent of whether some related claims are true.)

Replies from: Benito, zacharylipton
comment by Ben Pace (Benito) · 2019-07-11T08:30:48.623Z · LW(p) · GW(p)

I also read OP as claiming that Yann LeCun is defending the field against critiques that AGI isn’t near. My current from-a-distance impression is indeed that LeCun wants to protect the field from aggressive/negative speculation in the news / on Twitter, but that he definitely cannot be accused of scamming people about how near AGI is. Quote:

I keep repeating this whenever I talk to the public: we’re very far from building truly intelligent machines. All you’re seeing now — all these feats of AI like self-driving cars, interpreting medical images, beating the world champion at Go and so on — these are very narrow intelligences, and they’re really trained for a particular purpose. They’re situations where we can collect a lot of data.

So for example, and I don’t want to minimize at all the engineering and research work done on AlphaGo by our friends at DeepMind, but when [people interpret the development of AlphaGo] as significant progress towards general intelligence, it’s wrong. It just isn’t. It’s not because there’s a machine that can beat people at Go, there’ll be intelligent robots running round the streets. It doesn’t even help with that problem, it’s completely separate. Others may claim otherwise, but that’s my personal opinion.

We’re very far from having machines that can learn the most basic things about the world in the way humans and animals can do. Like, yes, in particular areas machines have superhuman performance, but in terms of general intelligence we’re not even close to a rat. This makes a lot of questions people are asking themselves premature. That’s not to say we shouldn’t think about them, but there’s no danger in the immediate or even medium term. There are real dangers in the department of AI, real risks, but they’re not Terminator scenarios.

There is a conversation to be had about how scientific fields interface with public discussions of them in the news / on Twitter, and indeed I think it is on net very defensive. I don’t think this is especially self-serving behaviour though. My guess as to the experience of most scientists reading about their work exploding on Twitter is “I feel like massive numbers of people are quickly coordinating language to attack/use my work in ways that are inaccurate and I feel threatened and need to run damage control” and that their understanding of what is happening is indeed true. I still think public discourse should be optimised for truth over damage control, but I don’t model such folks as especially self-serving or pulling any sort of scam.

Replies from: Kaj_Sotala, jessica.liu.taylor
comment by Kaj_Sotala · 2019-07-11T09:20:40.787Z · LW(p) · GW(p)
I also read OP as claiming that Yann LeCun is defending the field against critiques that AGI isn’t near.

Same. In particular, I read the "How does the AI field treat its critics" section as saying that "the AI field used to criticize Dreyfus for saying that AGI isn't near, just as it now seems to criticize Marcus for saying that AGI isn't near". But in the Dreyfus case, he was the subject of criticism because the AI field thought that he was wrong and AGI was close. Whereas Marcus seems to be the subject of criticism because the AI field thinks he's being dishonest in claiming that anyone seriously thinks AGI to be close.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T01:30:03.374Z · LW(p) · GW(p)

Whereas Marcus seems to be the subject of criticism because the AI field thinks he’s being dishonest in claiming that anyone seriously thinks AGI to be close.

Note, this looks like a dishonest "everybody knows" flip, from saying or implying X to saying "everybody knows not-X", in order to (either way) say it's bad to say not-X. (Clearly, it isn't the case that nobody believes AGI to be close!)

(See Marcus's medium article for more details on how he's been criticized, and what narratives about deep learning he takes issue with)

Replies from: paulfchristiano
comment by paulfchristiano · 2019-07-12T04:32:14.737Z · LW(p) · GW(p)
See Marcus's medium article for more details on how he's been criticized

Skimming that post it seems like he mentions two other incidents (beyond the thread you mention).

First one:

Gary Marcus: @Ylecun Now that you have joined the symbol-manipulating club, I challenge you to read my arxiv article Deep Learning: Critical Appraisal carefully and tell me what I actually say there that you disagree with. It might be a lot less than you think.
Yann LeCun: Now that you have joined the gradient-based (deep) learning camp, I challenge you to stop making a career of criticizing it without proposing practical alternatives.
Yann LeCun: Obviously, the ability to criticize is not contingent on proposing alternatives. However, the ability to get credit for a solution to a problem is contingent on proposing a solution to the problem.

Second one:

Gary Marcus: Folks, let’s stop pretending that the problem of object recognition is solved. Deep learning is part of the solution, but we are obviously still missing something important. Terrific new examples of how much is still to be solved here: #AIisHarderThanYouThink
Critic: Nobody is pretending it is solved. However, some people are claiming that people are pretending it is solved. Name me one researcher who is pretending?
Gary Marcus: Go back to Lecun, Bengio and Hinton’s 9 page Nature paper in 2015 and show me one hint there that this kind of error was possible. Or recall initial dismissive reaction to …
Yann LeCun: Yeah, obviously we "pretend" that image recognition is solved, which is why we have a huge team at Facebook "pretending" to work on image recognition. Also why 6500 people "pretended" to attend CVPR 2018.

The most relevant quote from the Nature paper he is criticizing (he's right that it doesn't discuss methods working poorly off distribution):

Unsupervised learning had a catalytic effect in reviving interest in deep learning, but has since been overshadowed by the successes of purely supervised learning. Although we have not focused on it in this Review, we expect unsupervised learning to become far more important in the longer term. Human and animal learning is largely unsupervised: we discover the structure of the world by observing it, not by being told the name of every object.
Human vision is an active process that sequentially samples the optic array in an intelligent, task-specific way using a small, high-resolution fovea with a large, low-resolution surround. We expect much of the future progress in vision to come from systems that are trained end-to-end and combine ConvNets with RNNs that use reinforcement learning to decide where to look. Systems combining deep learning and reinforcement learning are in their infancy, but they already outperform passive vision systems at classification tasks and produce impressive results in learning to play many different video games.
Natural language understanding is another area in which deep learning is poised to make a large impact over the next few years. We expect systems that use RNNs to understand sentences or whole documents will become much better when they learn strategies for selectively attending to one part at a time.
Ultimately, major progress in artificial intelligence will come about through systems that combine representation learning with complex reasoning. Although deep learning and simple reasoning have been used for speech and handwriting recognition for a long time, new paradigms are needed to replace rule-based manipulation of symbolic expressions by operations on large vectors.
comment by jessicata (jessica.liu.taylor) · 2019-07-12T01:39:39.927Z · LW(p) · GW(p)

Ok, I added a note to the post to clarify this.

comment by zacharylipton · 2019-07-13T00:44:30.388Z · LW(p) · GW(p)

Hi Paul. Thanks for lucid analysis and generosity with your time to set the record straight here.

comment by paulfchristiano · 2019-07-11T18:31:35.511Z · LW(p) · GW(p)
in part because I don't have much to say on this issue that Gary Marcus hasn't already said.

It would be interesting to know which particular arguments made by Gary Marcus you agree with, and how you think they relate to arguments about timelines.

In this preliminary doc, it seems like most of the disagreement is driven by saying there is a 99% probability that training a human-level AI would take more than 10,000x more lifetimes than AlphaZero took games of go (while I'd be at more like 50%, and have maybe 5-10% chance that it will take many fewer lifetimes). Section 2.0.2 admits this is mostly guesswork, but ends up very confident the number isn't small. It's not clear where that particular number comes from, the only evidence gestured at is "the input is a lot bigger, so it will take a lot more lifetimes" which doesn't seem to agree with our experience so far or have much conceptual justification. (I guess the point is that the space of functions is much bigger? but if comparing the size of the space of functions, why not directly count parameters?) And why is this a lower bound?

Overall this seems like a place you disagree confidently with many people who entertain shorter timelines, and it seems unrelated to anything Gary Marcus says.

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T18:33:29.727Z · LW(p) · GW(p)

I agree with essentially all of the criticisms of deep learning in this paper, and I think these are most relevant for AGI:

  • Deep learning is data hungry, reflecting poor abstraction learning
  • Deep learning doesn't transfer well across domains
  • Deep learning doesn't handle hierarchical structure
  • Deep learning has trouble with logical inference
  • Deep learning doesn't learn causal structure
  • Deep learning presumes a stable world
  • Deep learning requires problem-specific engineering

Together (and individually), these are good reasons to expect "general strategic action in the world on a ~1-month timescale" to be a much harder domain for deep learning to learn how to act in than "play a single game of Go", hence the problem difficulty factor.

comment by Zack_M_Davis · 2019-07-22T15:39:46.506Z · LW(p) · GW(p)

Exercise for those (like me) who largely agreed with the criticism that the usage of "scam" in the title was an instance of the noncentral fallacy (drawing the category boundaries of "scam" too widely in a way that makes the word less useful): do you feel the same way about Eliezer Yudkowsky's "The Two-Party Swindle" [LW · GW]? Why or why not?

Replies from: habryka4, ESRogs
comment by habryka (habryka4) · 2019-07-22T19:35:07.622Z · LW(p) · GW(p)

I like this question.

To report my gut reaction, not necessarily endorsed yet. I am sharing this to help other people understand how I feel about this, not as a considered argument:

I have a slight sense of ickiness, though a much weaker one. "Swindle" feels less bad to me, though I also haven't really heard the term used particularly much in recent times, so its associations feel a lot less clear to me. I think I would have reacted less badly to the OP had it used "swindle" instead of "scam", but I am not super confident.

The other thing is that the case for the "two party swindle" feels a lot more robust than the case for the "AI timelines scam". I don't think you should never call something a scam or swindle, but if you do you should make really sure it actually is. Though I do still think there is a noncentral fallacy thing going on with calling it a swindle (though again it feels less noncentral for "swindle" instead of "scam").

The third thing is that the word "swindle" only shows up in the title of the post, and is not reinforced with words like "fraud" or trying to accuse any specific individual actors of lying or deceiving. "Scam" also only shows up in the title of this post, but I feel like it's much more trying to accuse people of immoral behavior, whereas Eliezer goes out of his way to emphasize that he doesn't think anyone in this situation should obviously be punished.

The last thing is that it's mostly talking about a relatively distant outgroupy-feeling thing that makes sense to analyze to learn things from it, but that I am quite confident Eliezer has little to gain from criticizing and just doesn't feel close to me (this might also partially because I grew up in Germany which does not have a two-party system). In the AI Timelines Scam post, I felt like there was more of a political undertone that highlighted the hypothesis of local political conflict to gain social points by trying to redefine words.

Replies from: Ruby
comment by Ruby · 2019-07-22T20:02:49.163Z · LW(p) · GW(p)

Agree, good question.

I was going to say much the same. I think it kind of is a noncentral fallacy too, but not one that strikes me as problematic.

Perhaps I'd add that I feel the argument/persuasion being made by Eliezer doesn't really rest on trying to import my valence towards "swindle" over to this. I don't have that much valence to a funny obscure word.

I guess it has to be said that it's a noncentral noncentral fallacy.

comment by ESRogs · 2019-07-23T04:37:04.547Z · LW(p) · GW(p)

I see two links in your comment that are both linking to the same place -- did you mean for the first one (with the text: "the criticism that the usage of "scam" in the title was an instance of the noncentral fallacy") to link to something else?

Replies from: Zack_M_Davis
comment by Zack_M_Davis · 2019-07-23T04:45:45.244Z · LW(p) · GW(p)

Yes, thank you; the intended target was the immortal Scott Alexander's "Against Lie Inflation" (grandparent edited to fix). I regret the error.

comment by Ben Pace (Benito) · 2019-07-11T03:17:16.353Z · LW(p) · GW(p)

I almost wrote a post today with roughly the following text. It seems highly related so I guess I'll write a short version.

The ability to sacrifice today for tomorrow is one of the hard skills all humans must learn. To be able to make long-term plans is not natural. Most people around me seem to be able to think about today and the next month, and occasionally the next year (job stability, housing stability, relationship stability, etc), but very rarely do I see anyone acting on plans with timescales of decades (or centuries). Robin Hanson has written about how humans naturally think in a very low-detail and unrealistic mode in far (as opposed to near) thinking, and I know that humans have a difficult time with scope sensitivity.
It seems to me that it is a very common refrain in the community to say "but timelines are short" in response to someone's long-term thinking or proposal, to suggest somewhere in the 5-25 year range until no further work matters because an AGI Foom has occurred. My epistemic state is that even if this is true (which it very well may be), most people who are thinking this way are not in fact making 10 year plans. They are continuing to make at most 2 year plans, while avoiding learning how to make longer term plans.
There is a two-step form of judo required to first learn to make 50 year plans and then secondarily restrict yourself to shorter-term plans. It is not one move, and I often see "but timelines are short" used to prevent someone from learning the first move.

I had not considered until the OP that this was actively adversarially selected for, certainly in industry, but it does seem compelling as a method for making today very exciting and stopping people from taking caution for the long-term. Will think on it more.

Replies from: Raemon, jessica.liu.taylor
comment by Raemon · 2019-07-11T07:37:52.691Z · LW(p) · GW(p)
There is a two-step form of judo required to first learn to make 50 year plans and then secondarily restrict yourself to shorter-term plans. It is not one move, and I often see "but timelines are short" used to prevent someone from learning the first move.

Is there a reason you need to do 50 year plans before you can do 10 year plans? I'd expect the opposite to be true.

(I happen to currently have neither a 50 nor 10 year plan, apart from general savings, but this is mostly because it's... I dunno kinda hard and I haven't gotten around to it or something, rather than anything to do with timelines.)

Replies from: Benito, elityre
comment by Ben Pace (Benito) · 2019-07-11T07:58:47.802Z · LW(p) · GW(p)

Is there a reason you need to do 50 year plans before you can do 10 year plans?


It’s often worth practising on harder problems to make the smaller problems second nature, and I think this is a similar situation. Nowadays I do more often notice plans that would take 5+ years to complete (that are real plans with hopefully large effect sizes), and I’m trying to push it higher.

Thinking carefully about how things are built that have lasted decades or centuries (science, the American constitution, etc) I think is very helpful for making shorter plans that still require coordination of 1000s of people over 10+ years.

Relatedly I don’t think anyone in this community working on AI risk should be devoting 100% of their probability mass to things in the <15 year scale, and so should think about plans that fail gracefully or are still useful if the world is still muddling along at relatively similar altitudes in 70 years.

Replies from: Raemon
comment by Raemon · 2019-07-11T08:05:54.191Z · LW(p) · GW(p)

Ah, that all makes sense.

comment by Eli Tyre (elityre) · 2021-05-02T04:50:45.446Z · LW(p) · GW(p)

Is there a reason you need to do 50 year plans before you can do 10 year plans? I'd expect the opposite to be true.

I think you do need to learn how to make plans that can actually work, at all, before you learn how to make plans with very limited resources.

And I think that people fall into the habit of making "plans" that they don't inner-sim actually leading to success, because they condition themselves into thinking that things are desperate and that the best action will only be the best action "in expected value", e.g. that the "right" action should look like a moonshot.

This seems concerning to me. It seems like you should be, first and foremost, figuring out how you can get any plan that works at all, and then secondarily, trying to figure out how to make it work in the time allotted. Actual, multi-step strategy shouldn't mostly feel like "thinking up some moon-shots".

comment by jessicata (jessica.liu.taylor) · 2019-07-11T03:22:11.742Z · LW(p) · GW(p)

Strongly agreed with what you have said. See also the psychology of doomsday cults.

Replies from: Benito
comment by Ben Pace (Benito) · 2019-07-11T07:36:01.249Z · LW(p) · GW(p)

Thinking more, my current sense is that this is not an AI-specific thing, but a broader societal problem where people fail to think long-term. Peter Thiel very helpfully writes about it as a distinction between “definite” and “indefinite” attitudes to the future, where in the former it is understandable and lawful and in the latter it will happen no matter what you do (fatalism). My sense is that when I have told myself to focus on short timelines, if it’s been unhealthy it’s been a general excuse for not having to look at hard problems.

comment by Jan Kulveit (jan-kulveit) · 2019-07-15T01:39:57.133Z · LW(p) · GW(p)

[purely personal view]

It seems quite easy to imagine similarly compelling socio-political and subconscious reasons why people working on AI could be biased against short AGI timelines. For example

  • short timeline estimates make the broader public agitated, which may lead to state regulation or similar interference [historical examples: industries trying to suppress info about risks]
  • researchers mostly want to work on technical problems, instead of thinking about nebulous future impacts of their work; putting more weight on short timelines would force some people to pause and think about responsibility, or suffer some cognitive dissonance, which may be unappealing/unpleasant for S1 reasons [historical examples: physicists working on nuclear weapons]
  • fears that claims about short timelines would get pattern-matched as doomsday fear-mongering / sensationalism / the subject of sci-fi movies ...

While I agree motivated reasoning is a serious concern, I don't think it's clear how the incentives sum up. If anything, claims like "AGI is unrealistic or very far away, however practical applications of narrow AI will be profound" seem to capture most of the purported benefits (AI is important) and avoid the negatives (no need to think).

comment by rohinmshah · 2019-07-11T03:30:33.461Z · LW(p) · GW(p)

Planned summary:

This post argues that AI researchers and AI organizations have an incentive to predict that AGI will come soon, since that leads to more funding, and so we should expect timeline estimates to be systematically too short. Besides the conceptual argument, we can also see this in the field's response to critics: both historically and now, criticism is often met with counterarguments based on "style" rather than engaging with the technical meat of the criticism.

Planned opinion:

I agree with the conceptual argument, and I think it does hold in practice, quite strongly. I don't really agree that the field's response to critics implies that they are biased towards short timelines -- see these [LW(p) · GW(p)] comments [LW(p) · GW(p)]. Nonetheless, I'm going to do exactly what this post critiques, and say that I put significant probability on short timelines, but not explain my reasons (because they're complicated and I don't think I can convey them, and certainly can't convey them in a small number of words).

Replies from: orthonormal
comment by orthonormal · 2019-07-11T04:59:45.435Z · LW(p) · GW(p)
both historically and now, criticism is often met with counterarguments based on "style" rather than engaging with the technical meat of the criticism

Is there any group of people who reliably don't do this? Is there any indication that AI researchers do this more often than others?

Replies from: rohinmshah
comment by rohinmshah · 2019-07-11T05:27:50.870Z · LW(p) · GW(p)


Note that even if AI researchers do this similarly to other groups of people, that doesn't change the conclusion that there are distortions that push towards shorter timelines.

comment by shminux · 2019-07-11T05:04:42.687Z · LW(p) · GW(p)

I see clear parallels with the treatment of Sabine Hossenfelder blowing the whistle on the particle physics community pushing for a new $20B particle accelerator. She has been going through the same adversity as any high-profile defector from a scientific community, and the arguments against her are the same ones you are listing.

comment by Ben Pace (Benito) · 2021-01-07T04:41:15.352Z · LW(p) · GW(p)

This is a cogent, if sparse, high-level analysis of the epistemic distortions around megaprojects [LW · GW] in AI and other fields.

It points out that projects like the Human Brain Project and the Fifth Generation Computer Systems project made massive promises, raised around a billion dollars each, and totally flopped. I don't expect this was a simple error; I expect there were indeed systematic epistemic distortions involved, perpetuated at all levels.

It points out that similar-scale projects are being evaluated today involving various major AI companies globally, and that the same sorts of distortionary anti-epistemic tendencies can still be observed. Critics of the ideas currently getting billions of dollars (deep learning leading to AGI) are met with replies that systematically exclude the possibility of 'stop, halt, and catch fire', and instead offer only 'why are you talking about problems and not solutions' and 'do this through our proper channels within the field, not in this unconstrained public forum', which are exactly the sorts of replies you'd expect to see when a megaproject is protecting itself.

The post also briefly addresses why it's worth modeling the sociopolitical arguments, and not just the technical arguments. I think it's clear that megaprojects like this are subject to major distortionary forces: at the point where you're talking about arguments against the position that is literally funding the whole field, it is obviously not acceptable to constrain dialogue to the channels that field controls, since this is a mechanism that is open to abuse of power. I like this short section.

The post ends with the claim that 'people are being duped into believing a lie'. I don't feel convinced of this.

I tried to write down why simply, but I'm not having the easiest time. A few pointers:

  • A chain is as strong as its weakest link, but not all organisations are chains. Many mathematicians can be doing nonsense symbol-manipulation while Andrew Wiles solves Fermat's Last Theorem. I expect there was an overlap between today, when science is substantially broken, and the time when Feynman was around making diagrams and building the atom bomb. In that intermediate period you could point to a lot of 'science' that was not actually science and was supported by anti-epistemic arguments, but this was somewhat separable from Feynman, who was still doing real work.
  • There can be many levels of fraud, combined with many levels of actual competence at the object level. The modern field of ML has mightily impressed me with AlphaGo and GPT and so on. I think the "full scam" position is that this is entirely a consequence of increased compute and not of ML expertise, and that there is basically not much expertise in these fields at all. I find this plausible, but not at the 50% level. So just because there's evidence of anti-epistemic and adversarial behavior, that does not preclude real work from being done.
  • I do think it's pretty normal for projects to have marketing that is run in an epistemically adversarial way and kept at arm's length, while bringing in resources.
  • I also think that sometimes very competent people are surrounded by distortionary forces. I think I should be able to come up with strong examples here, and I thought a bit about making the case for Thiel or Cummings (who've both shown the ability to think clearly but have also engaged in somewhat political narrative-building). Perhaps Hoover is an example? Still, I think that sometimes a project can engage adversarially with the outside world and still be competent at its work. But I don't think I've shown that strongly, and in most actual cases I am repulsed by projects that do the adversarial stuff and think it's delusional to be holding out hope. I also think it's especially delusional to think this about science. Science isn't supposed to be a place where the real conversation happens in private.


I think this post raises a straightforward and valid hypothesis to evaluate the field against as a whole. I don't think it's sufficiently detailed to convince me that the overall hypothesis holds. I do think it's a valuable conversation to have; it's such an important topic, especially for this community. I think this post is valuable, and I expect I will give it a small positive vote in the review, around +2 or +3.

Further Work

Here are some further questions I'd like to see discussed and answered, to get a better picture of this:

  • What are a few other examples of criticism of the current wave of AI hype, and how were they dealt with?
  • What do leaders of these projects say on this topic, and in response to criticism?
    • (I recall an FLI panel with Demis Hassabis, where the one detailed argument he made about the decision to put more/less funding into AGI right now was that it will be easier for lots of groups to build AGI in the future as compute gets cheaper, so in order to have centralized control and be able to include time for safety, we should push as fast as we can on AGI now. I don't think it's an unreasonable argument, but I was hardly surprised to hear it coming from him.)
  • How open are the channels of communication with the field? How easy is it for an outsider to engage with the people in the field?
  • Who are the funders of AI? To what extent are they interested in public discourse around this subject?
    • (My guess is that the answer here is something like "the academic field and industry have captured the prestige associated with it so that nobody else is considered reasonable to listen to".)
  • What is the state of the object level arguments around the feasibility of AGI?
  • Does the behavior of the people who lead the field match up with their claims?
  • What are some other megaprojects or fields with billions of dollars going into projects, and how are these dynamics playing out in those areas?
comment by countingtoten · 2019-07-14T02:23:00.304Z · LW(p) · GW(p)

Hubert Dreyfus, probably the most famous historical AI critic, published "Alchemy and Artificial Intelligence" in 1965, which argued that the techniques popular at the time were insufficient for AGI.

That is not at all what the summary says. Here is roughly the same text from the abstract:

Early successes in programming digital computers to exhibit simple forms of intelligent behavior, coupled with the belief that intelligent activities differ only in their degree of complexity, have led to the conviction that the information processing underlying any cognitive performance can be formulated in a program and thus simulated on a digital computer. Attempts to simulate cognitive processes on computers have, however, run into greater difficulties than anticipated. An examination of these difficulties reveals that the attempt to analyze intelligent behavior in digital computer language systematically excludes three fundamental human forms of information processing (fringe consciousness, essence/accident discrimination, and ambiguity tolerance). Moreover, there are four distinct types of intelligent activity, only two of which do not presuppose these human forms of information processing and can therefore be programmed. Significant developments in artificial intelligence in the remaining two areas must await computers of an entirely different sort, of which the only existing prototype is the little-understood human brain.

In case you thought he just meant greater speed, he says the opposite on PDF page 71. Here is roughly the same text again from a work I can actually copy and paste:

It no longer seems obvious that one can introduce search heuristics which enable the speed and accuracy of computers to bludgeon through in those areas where human beings use more elegant techniques. Lacking any a priori basis for confidence, we can only turn to the empirical results obtained thus far. That brute force can succeed to some extent is demonstrated by the early work in the field. The present difficulties in game playing, language translation, problem solving, and pattern recognition, however, indicate a limit to our ability to substitute one kind of "information processing" for another. Only experimentation can determine the extent to which newer and faster machines, better programming languages, and cleverer heuristics can continue to push back the frontier. Nonetheless, the dramatic slowdown in the fields we have considered and the general failure to fulfill earlier predictions suggest the boundary may be near. Without the four assumptions to fall back on, current stagnation should be grounds for pessimism.

This, of course, has profound implications for our philosophical tradition. If the persistent difficulties which have plagued all areas of artificial intelligence are reinterpreted as failures, these failures must be interpreted as empirical evidence against the psychological, epistemological, and ontological assumptions. In Heideggerian terms this is to say that if Western Metaphysics reaches its culmination in Cybernetics, the recent difficulties in artificial intelligence, rather than reflecting technological limitations, may reveal the limitations of technology.

If indeed Dreyfus meant to critique 1965's algorithms - which is not what I'm seeing, and certainly not what I quoted - it would be surprising for him to get so much wrong. How did this occur?

Replies from: Benquo
comment by Benquo · 2019-07-14T02:33:59.736Z · LW(p) · GW(p)
If indeed Dreyfus meant to critique 1965's algorithms - which is not what I'm seeing, and certainly not what I quoted

It seems to me like that's pretty much what those quotes say - that there wasn't, at that time, algorithmic progress sufficient to produce anything like human intelligence.

Replies from: countingtoten, jessica.liu.taylor
comment by countingtoten · 2019-07-14T07:10:57.841Z · LW(p) · GW(p)

Again, he plainly says more than that. He's challenging "the conviction that the information processing underlying any cognitive performance can be formulated in a program and thus simulated on a digital computer." He asserts as fact that certain types of cognition require hardware more like a human brain. Only two out of four areas, he claims, "can therefore be programmed." In case that's not clear enough, here's another quote of his:

since Area IV is just that area of intelligent behavior in which the attempt to program digital computers to exhibit fully formed adult intelligence must fail, the unavoidable recourse in Area III to heuristics which presuppose the abilities of Area IV is bound, sooner or later, to run into difficulties. Just how far heuristic programming can go in Area III before it runs up against the need for fringe consciousness, ambiguity tolerance, essential/inessential discrimination, and so forth, is an empirical question. However, we have seen ample evidence of trouble in the failure to produce a chess champion, to prove any interesting theorems, to translate languages, and in the abandonment of GPS.

He does not say that better algorithms are needed for Area IV, but that digital computers must fail. He goes on to falsely predict that clever search together with "newer and faster machines" cannot produce a chess champion. AFAICT this is false even if we try to interpret him charitably, as saying more human-like reasoning would be needed.

Replies from: Benquo
comment by Benquo · 2019-07-14T12:37:26.685Z · LW(p) · GW(p)

The doc Jessicata linked has page numbers but no embedded text. Can you give a page number for that one?

Unlike your other quotes, it at least seems to say what you're saying it says. But it appears to start mid-sentence, and in any case I'd like to read it in context.

Replies from: countingtoten
comment by countingtoten · 2019-07-14T16:51:35.950Z · LW(p) · GW(p)

Assuming you mean the last blockquote, that would be the Google result I mentioned which has text, so you can go there, press Ctrl-F, and type "must fail" or similar.

You can also read the beginning of the PDF, which talks about what can and can't be programmed while making clear this is about hardware and not algorithms. See the first comment in this family for context.

comment by jessicata (jessica.liu.taylor) · 2019-07-14T05:31:05.444Z · LW(p) · GW(p)

And also that the general methodology/assumptions/paradigm of the time was incapable of handling important parts of intelligence.

comment by Ben Pace (Benito) · 2020-12-09T01:52:24.368Z · LW(p) · GW(p)

This post was very helpful to me when I read it, in terms of engaging more with this hypothesis. The post isn't very rigorous and I think doesn't support the hypothesis very well, but nonetheless it was pretty helpful to engage with the perspective (I also found the comments valuable), so I'm nominating it for its positive effects for me personally.

comment by FactorialCode · 2019-07-12T16:57:19.476Z · LW(p) · GW(p)

I think a lot of the animosity that Gary Marcus drew was less that some of his points were wrong, and more that he didn't seem to have a full grasp of the field before criticizing it. Here's an r/machinelearning thread on one of his papers. Granted, r/ML is not necessarily representative of the AI community, especially now, but you see some people agreeing with some of his points, and others claiming that he's not up to date with current ML research. I would recommend people take a look at the thread, to judge for themselves.

I'm also not inclined to take any twitter drama as strong evidence of the attitudes of the general ML community, mainly because twitter seems to strongly encourage/amplify the sort of argumentative shit-flinging pointed out in the post.

comment by Zvi · 2021-01-13T20:38:25.867Z · LW(p) · GW(p)

This was important to the discussions around timelines at the time, back when the talk about timelines felt central. This felt like it helped give me permission to no longer consider them as central, and to fully consider a wide range of models of what could be going on. It helped make me more sane, and that's pretty important.

It was also important for the discussion about the use of words and the creation of clarity. There's been a long issue of exactly when and where to use words like "scam" and "lie" to describe things - when is it accurate, when is it useful, what budgets does it potentially use up? How can we describe what we see in the world in a way that creates common knowledge if we can't use words that are literal descriptions? It's something I still struggle with, and this is where the key arguments got made. 

Thus, on reflection, I'd like to see this included.

comment by orthonormal · 2021-01-10T05:17:29.453Z · LW(p) · GW(p)

I liked the comments on this post more than I liked the post itself. As Paul commented, there's as much criticism of short AGI timelines as there is of long AGI timelines; and as Scott pointed out, this was an uncharitable take on AI proponents' motives.

Without the context of those comments, I don't recommend this post for inclusion.

Replies from: Benito
comment by Ben Pace (Benito) · 2021-01-10T05:43:39.137Z · LW(p) · GW(p)

My guess is we agree that talk of being able to build AGI soon has led to substantially increased funding in the AGI space (e.g. involved in the acquisition of DeepMind and the $1 billion from Microsoft to OpenAI)? Naturally it's not the sole reason for funding, but I imagine it was a key part of the value prop, given that both of them describe themselves as 'building AGI'.

Given that, I'm curious to what extent you think that such talk, if it was responsible, has been open for scrutiny or whether it's been systematically defended from skeptical analysis?

Replies from: orthonormal
comment by orthonormal · 2021-01-10T22:07:51.931Z · LW(p) · GW(p)

I agree about the effects of deep learning hype on deep learning funding, though I think very little of it has been AGI hype; people at the top level had been heavily conditioned to believe we were/are still in the AI winter of specialized ML algorithms to solve individual tasks. (The MIRI-sphere had to work very hard, before OpenAI and DeepMind started doing externally impressive things, to get serious discussion on within-lifetime timelines from anyone besides the Kurzweil camp.)

Maybe Demis was strategically overselling DeepMind, but I expect most people were genuinely over-optimistic (and funding-seeking) in the way everyone in ML always is.

comment by Vaniver · 2020-12-12T20:00:39.351Z · LW(p) · GW(p)

At the time, I argued pretty strongly against parts of this post, and I still think my points are valid and important. That said, I think in retrospect this post had a large impact; I think it kicked off several months of investigation of how language works and what discourse norms should be in the presence of consequences. I'm not sure it was the best of 2019, but it seems necessary to make sense of 2019, or properly trace the lineage of ideas?

comment by cousin_it · 2019-07-12T10:28:08.416Z · LW(p) · GW(p)

Which is not to say that modeling such technical arguments is not important for forecasting AGI. I certainly could have written a post evaluating such arguments, and I decided to write this post instead, in part because I don’t have much to say on this issue that Gary Marcus hasn’t already said.

Is he an AI researcher though? Wikipedia says he's a psychology professor, and his arXiv article criticizing deep learning doesn't seem to have much math. If you have technical arguments, maybe you could state them?

Replies from: jessica.liu.taylor
comment by jessicata (jessica.liu.taylor) · 2019-07-12T19:42:24.472Z · LW(p) · GW(p)

Yes he is, see his publications. (For technical arguments, see my response to Paul [LW(p) · GW(p)].)

comment by Eli Tyre (elityre) · 2021-05-02T05:15:06.032Z · LW(p) · GW(p)

The key sentiment of this post that I currently agree with:

  • There's a bit of a short timelines "bug" in the Berkeley rationalist scene, where short timelines have become something like the default assumption (or at least are not unusual). 
  • There don't seem to be strong, public reasons for this view. 
  • It seems like most people who are sympathetic to short timelines are sympathetic to it mainly as the result of a social proof cascade. 
  • But this is obscured somewhat, because some folks whose opinions are being trusted don't show their work (rightly or wrongly), because of info security considerations.
Replies from: habryka4, daniel-kokotajlo
comment by habryka (habryka4) · 2021-05-02T08:59:43.516Z · LW(p) · GW(p)

I think Gwern has now made a relatively decent public case? Or at least I feel substantially less confused about the basic arguments, which I think I can relatively accurately summarize as "sure seems like there is a good chance just throwing more compute at the problem will get us there", with then of course a lot of detail about why that might be the case.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-04T08:33:43.235Z · LW(p) · GW(p)

Is it really true that most people sympathetic to short timelines are thus mainly due to social proof cascade? I don't know any such person myself; the short-timelines people I know are either people who have thought about it a ton and developed detailed models, or people who just got super excited about GPT-3 and recent AI progress basically. The people who like to defer to others pretty much all have medium or long timelines, in my opinion, because that's the respectable/normal thing to think.

comment by zdgroff · 2019-07-12T01:51:57.928Z · LW(p) · GW(p)

This reminds me of related questions around slowing down AI, discussing AI with a mass audience, or building public support for AI policy (e.g., [EA · GW]). A lot of the arguments against doing these things have this same motivation: that we are concerned about the others for reasons that are somewhat abstruse. Where would these "sociopolitical" considerations get us on these questions?

comment by Fluttershy · 2019-07-12T03:03:12.359Z · LW(p) · GW(p)

Yeah, 10/10 agreement on this. Like it'd be great if you could "just" donate to some AI risk org and get the promised altruistic benefits, but if you actually care about "stop all the fucking suffering I can", then you should want to believe AI risk research is a scam if it is a scam.

At which point you go oh fuck, I don't have a good plan to save the world anymore. But not having a better plan shouldn't change your beliefs on whether AI risk research is effective.

Replies from: Flaglandbase
comment by Flaglandbase · 2021-05-04T04:38:28.241Z · LW(p) · GW(p)

Quite a few folks "believe" in a rapid AI timeline because it's their only hope to escape a horrible fate. They may have some disease that's turning their brain into mush from the inside out, and know there is exactly zero chance the doctors will figure something out within the next century. Only superhuman intelligence could save them. My impression is that technological progress is MUCH slower than most people realize. 

Replies from: Mitchell_Porter, adele-lopez-1, Teerth Aloke
comment by Mitchell_Porter · 2021-05-04T08:18:43.110Z · LW(p) · GW(p)

Can you name any of these people? I can't think of anyone who's saying, "I'm dying, so let's cure death / create AGI now". Mostly what people do, is get interested in cryonics. 

comment by Adele Lopez (adele-lopez-1) · 2021-05-04T20:51:26.107Z · LW(p) · GW(p)

Really? My impression is that rapid AI timelines make things increasingly "hopeless" because there's less time to try to prevent getting paperclipped, and that this is the default view of the community.

comment by Teerth Aloke · 2021-05-04T12:38:10.525Z · LW(p) · GW(p)

I tilt towards a rapid timeline, but I promise, my brain is not turning into mush. I have no terminal disease.

comment by Teerth Aloke · 2021-05-05T11:52:39.166Z · LW(p) · GW(p)


comment by avturchin · 2019-07-11T11:43:18.603Z · LW(p) · GW(p)

Why not treat this in a Bayesian way? There is 50 per cent a priori credence in a short timeline, and 50 per cent in a long one. In that case, we still need to get working AI safety solutions ASAP, even if there is a 50 per cent chance that the money on AI safety will be just lost? (Disclaimer: I am not paid for any AI safety research, except a Good AI prize of 1500 USD, which is not related to timelines.)

Replies from: shminux
comment by shminux · 2019-07-12T06:51:12.817Z · LW(p) · GW(p)

why is the above comment so badly downvoted?

Replies from: Viliam, DanielFilan
comment by Viliam · 2019-07-13T21:56:15.690Z · LW(p) · GW(p)

Well, it is not a "Bayesian way" to take a random controversial statement and say "the priors are 50% it's true, and 50% it's false".

(That would be true only if you had zero knowledge about... anything related to the statement. Or if the knowledge would be so precisely balanced the sum of the evidence would be exactly zero.)

But the factual wrongness is only a partial answer. The other part is more difficult to articulate, but it's something like... if someone uses "your keywords" to argue a complete nonsense, that kinda implies that you are expected to be so stupid that you would accept the nonsense as long as it is accompanied with the proper keywords... which is quite offensive.

comment by DanielFilan · 2019-07-12T07:07:53.432Z · LW(p) · GW(p)
  • Doesn't engage with the post's arguments.
  • I think that it's wrong to assume that the prior on 'short' vs 'long' timelines should be 50/50.
  • I think that it's wrong to just rely on a prior, when it seems like one could obtain relevant evidence.