Announcing the Nuclear Risk Forecasting Tournament 2021-06-16T16:16:41.402Z
Overview of Rethink Priorities’ work on risks from nuclear weapons 2021-06-11T20:05:18.103Z
Notes on effective-altruism-related research, writing, testing fit, learning, and the EA Forum 2021-03-28T23:43:19.538Z
Notes on "Bioterror and Biowarfare" (2006) 2021-03-02T00:43:22.299Z
Notes on Henrich's "The WEIRDest People in the World" (2020) 2021-02-14T08:40:49.243Z
Notes on Schelling's "Strategy of Conflict" (1960) 2021-02-10T02:48:25.847Z
Please take a survey on the quality/impact of things I've written 2020-08-29T10:39:33.033Z
Good and bad ways to think about downside risks 2020-06-11T01:38:46.646Z
Failures in technology forecasting? A reply to Ord and Yudkowsky 2020-05-08T12:41:39.371Z
Database of existential risk estimates 2020-04-20T01:08:39.496Z
[Article review] Artificial Intelligence, Values, and Alignment 2020-03-09T12:42:08.987Z
Feature suggestion: Could we get notifications when someone links to our posts? 2020-03-05T08:06:31.157Z
Understandable vs justifiable vs useful 2020-02-28T07:43:06.123Z
Memetic downside risks: How ideas can evolve and cause harm 2020-02-25T19:47:18.237Z
Information hazards: Why you should care and what you can do 2020-02-23T20:47:39.742Z
Mapping downside risks and information hazards 2020-02-20T14:46:30.259Z
What are information hazards? 2020-02-18T19:34:01.706Z
[Link and commentary] The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? 2020-02-16T19:56:15.963Z
Value uncertainty 2020-01-29T20:16:18.758Z
Using vector fields to visualise preferences and make them consistent 2020-01-28T19:44:43.042Z
Risk and uncertainty: A false dichotomy? 2020-01-18T03:09:18.947Z
Can we always assign, and make sense of, subjective probabilities? 2020-01-17T03:05:57.077Z
MichaelA's Shortform 2020-01-16T11:33:31.728Z
Moral uncertainty: What kind of 'should' is involved? 2020-01-13T12:13:11.565Z
Moral uncertainty vs related concepts 2020-01-11T10:03:17.592Z
Morality vs related concepts 2020-01-07T10:47:30.240Z
Making decisions when both morally and empirically uncertain 2020-01-02T07:20:46.114Z
Making decisions under moral uncertainty 2019-12-30T01:49:48.634Z


Comment by MichaelA on Announcing the Nuclear Risk Forecasting Tournament · 2021-06-18T08:31:44.403Z · LW · GW

Not sure what you mean by that being unverifiable? The question says:

This question resolves as the total number of nuclear weapons (fission or thermonuclear) reported to be possessed across all states on December 31, 2022. This includes deployed, reserve/nondeployed, and retired (but still intact) warheads, and both strategic and nonstrategic weapons.

Resolution criteria will come from the Federation of American Scientists (FAS). If they cease publishing such numbers before resolution, resolution will come from the Arms Control Association or any other similar platform.

FAS update their estimates fairly regularly - here are their estimates as of May (that link is also provided earlier in the question text).

Though I do realise now that they're extremely unlikely to update their numbers on December 31 specifically, and maybe not even in December 2022 at all. I'll look into the best way to tweak the question in light of that. If that's what you meant, thanks for the feedback! 

(I do expect there'll be various minor issues like that, and we hope the community catches them quickly so we can tweak the questions to fix them. This was also one reason for showing some questions before they "open".)

Comment by MichaelA on [Part 1] Amplifying generalist research via forecasting – Models of impact and challenges · 2021-04-08T00:50:07.597Z · LW · GW

That makes sense to me.

But it seems like you're just saying the issue I'm gesturing at shouldn't cause mis-calibration or overconfidence, rather than that it won't reduce the resolution/accuracy or the practical usefulness of a system based on X predicting what Y will think? 

Comment by MichaelA on [deleted post] 2021-03-28T11:38:53.921Z

(Update: I just saw the post Welcome to LessWrong!, and I think that that serves my needs well.)

Comment by MichaelA on [deleted post] 2021-03-28T04:26:33.459Z

I think it's good that a page like this exists; I'd want to be able to use it as a go-to link when suggesting people engage with or post on LessWrong, e.g. in my post on Notes on EA-related research, writing, testing fit, learning, and the Forum.

Unfortunately, it seems to me that this page isn't well suited to that purpose. Here are some things that seem like key issues to me (maybe other people would disagree):

  • This introduction seems unnecessarily intimidating, non-welcoming, and actually (in my perception) somewhat arrogant. For example:
    • "If you have no familiarity with the cultural articles and other themes before you begin interacting, your social experiences are likely to be highly awkward. The rationalist way of thinking and subculture is extremely, extremely complex. To give you a gist of how complex it is and what kind of complexity you'll encounter:"
      • This feels to me like saying "We're very special and you need to do your homework to deeply understand us before interacting at all with us, or you're just wasting our time and we'll want you to go away."
      • I do agree that the rationalist culture can take some getting used to, but I don't think it's far more complex or unusual than the cultures in a wide range of other subcultures, and I think it's very often easiest to get up to speed with a culture partly just by interacting with it.
      • I do agree that reading parts of the Sequences is useful, and that it's probably good to gently encourage new users to do that. But I wouldn't want to make it sound like it's a hard requirement or like they have to read the whole thing. And this passage will probably cause some readers to infer that, even if it doesn't outright say it. (A lot of people lurk more than they should, have imposter syndrome, etc.)
        • I started interacting on LessWrong before having finished the Sequences (though I'd read some), and I think I both got and provided value from those interactions.
      • Part of this is just my visceral reaction to any group saying their way of thinking and subculture is "extremely, extremely complex", rather than me having explicit reasons to think that that's bad.
  • I wrote all of that before reading the next paragraphs, and the next paragraphs very much intensified my emotional feeling of "These folks seem really arrogant and obnoxious and I don't want to ever hang out with them"
    • This is despite the fact that I've actually engaged a lot on LessWrong, really value a lot about it, rank the Sequences and HPMOR as among my favourite books, etc.
  • Maybe part of this is that this is describing what rationalists aim to be as if all rationalists always hit that mark.
    • Rationalists and the rationalist community often do suffer from the same issues other people and communities do. This was in fact one of the really valuable things Eliezer's posts pointed out (e.g., being wary of trending towards cult-hood).

Again, these are just my perceptions. But FWIW, I do feel these things quite strongly. 

Here are a couple much less important issues:

  • I don't think I'd characterise the Sequences as "mostly like Kahneman, but more engaging, and I guess with a bit of AI etc." From memory, quite a substantial chunk of the Sequences - and quite a substantial chunk of their value - had to do with things other than cognitive biases, e.g., what goals one should form, why, and how to act on them. Maybe this is partly a matter of instrumental rather than just epistemic rationality.
    • Relatedly, I think this page presents a misleading or overly narrow picture of what's distinctive (and good!) about rationalist approaches to forming beliefs and making decisions when it says "There are over a hundred cognitive biases that humans are affected by that rationalists aim to avoid. Imagine you added over one hundred improvements to your way of thinking."
  • "Kahneman is notoriously dry" feels like an odd thing to say. Maybe he is, but I've never actually heard anyone say this, and I've read one of his books and some of his papers and watched one of his talks, and found them all probably somewhat more engaging than similar things from the average scientist. (Though maybe this was more the ideas themselves, rather than the presentation.)

(I didn't read "Website Participation Intro" or "Why am I being downvoted?", because it was unfortunately already clear that I wouldn't want to link to this page when aiming to introduce people to LessWrong and encourage them to read, comment, and/or post there.)

Comment by MichaelA on Modernization and arms control don’t have to be enemies. · 2021-03-18T02:13:51.958Z · LW · GW

Authoritarian closed societies probably have an advantage at covert racing, at devoting a larger proportion of their economic pie to racing suddenly, and at artificially lowering prices to do so. Open societies have probably a greater advantage at discovery/the cutting edge and have a bigger pie in the first place (though better private sector opportunities compete up the cost of defense engineering talent).

These are interesting points which I hadn't considered - thanks!

(Your other point also seems interesting and plausible, but I feel I lack the relevant knowledge to immediately evaluate it well myself.)

Comment by MichaelA on Epistemic Warfare · 2021-03-18T02:11:23.841Z · LW · GW

Interesting post.

You or other readers might also find the idea of epistemic security interesting, as discussed in the report "Tackling threats to informed decisionmaking in democratic societies: Promoting epistemic security in a technologically-advanced world". The report is by researchers at CSER and some other institutions. I've only read the executive summary myself. 

There's also a BBC Futures article on the topic by some of the same authors.

Comment by MichaelA on Modernization and arms control don’t have to be enemies. · 2021-03-17T09:38:38.195Z · LW · GW

While I am not sure I agree fully with the panel, an implication to be drawn from their arguments is that from an equilibrium of treaty compliance, maintaining the ability to race can disincentivize the other side from treaty violation: it increases the cost to the other side of gaining advantage, and that can be especially decisive if your side has an economic advantage.

This is an idea/argument I hadn't encountered before, and seems plausible, so it seems valuable that you shared it.

But it seems to me that there's probably an effect pushing in the opposite direction: 

  • Even from an equilibrium of treaty compliance, if one state has the ability to race, that might incentivise the other side to develop the ability to race as well. That wouldn't necessarily require treaty violation. 
  • Either or especially both sides having the ability to race can increase risks if they could race covertly until they have gained an advantage, or race so quickly that they gain an advantage before the other side can get properly started, or if the states don't always act as rational cohesive entities (e.g., if leaders are more focused on preventing regime change than preventing millions of deaths in their own country), or probably under other conditions.
    • I think the term "arms race stability" captures the sort of thing I'm referring to, though I haven't yet looked into the relevant theoretical work much.
  • In contrast, if we could reach a situation where neither side currently had the ability to race, that might be fairly stable. This could be true if building up that ability would take some time and be detectable early enough to be responded to (by sanctions, a targeted strike, the other side building up their own ability, or whatever).

Does this seem accurate to you?

I guess an analogy could be to whether you'd rather be part of a pair of cowboys who both have guns but haven't drawn them (capability but not yet racing), or part of a pair who don't have guns but could go buy one. It seems like we'd have more opportunities for de-escalation, less risk from nerves and hair-triggers, etc. in the latter scenario than the former.

I think this overlaps with some of Schelling's points in The Strategy of Conflict (see also my notes on that), but I can't remember for sure.

Comment by MichaelA on The Future of Nuclear Arms Control? · 2021-03-17T08:42:51.524Z · LW · GW

Thanks for this thought-provoking post. I found the discussion of how political warfare may have influenced nuclear weapons activism particularly interesting.

Since large yield weapons can loft dust straight to the stratosphere, they don’t even have to produce firestorms to start contributing to nuclear winter: once you get particles that block sunlight to an altitude that heating by the sun can keep them lofted, you’ll block sunlight a very long time and start harming crop yields.

I think it's true that this could "contribute" to nuclear winter, but I don't think I've seen this mentioned as a substantial concern in the nuclear winter papers I've read. E.g., I don't think I've seen any papers suggest that nuclear winter could occur solely due to that effect, without there being any firestorms, or that that effect could make the climate impacts 20% worse than would occur with firestorms alone. Do you have any citations on hand for this claim? 

Comment by MichaelA on Notes on "Bioterror and Biowarfare" (2006) · 2021-03-02T00:44:18.714Z · LW · GW

Final thoughts on whether you should read this book

  • I found the book useful
    • The parts I found most useful were (a) the early chapters on the history of biowarfare and bioterrorism and (b) the later chapters on attempts to use international law to reduce risks from bioterror and biowarfare
  • I found parts of the book hard to pay attention to and remember information from
    • In particular, the middle chapters on various types and examples of pathogens
      • But this might just be a “me problem”. Ever since high school, I’ve continually noticed that I seem to have a harder time paying attention to and remembering information about biology than information from other disciplines. (I don’t understand why that would be the case, and I’m not certain it’s actually true, but it has definitely seemed true.)
  • I’m not sure how useful this book would be to someone who already knows a lot about bioterror, biowarfare, and/or chemical weapons
  • I’m not sure how useful this book would be to someone who doesn’t have much interest in the topics of bioterror, biowarfare, and/or chemical weapons
    • But I’m inclined to think most longtermists should consume at least one book’s worth of content from experts on those topics
    • And I think the book could be somewhat useful for understanding WMDs, international relations, and international law more generally
  • There might be better books on the topic
    • In particular, it’s possible a more recent book would be better?
Comment by MichaelA on Notes on "Bioterror and Biowarfare" (2006) · 2021-03-02T00:43:59.747Z · LW · GW

My Anki cards

Note that:

  • It’s possible that some of these cards include mistakes, or will be confusing or misleading out of context.
  • I haven’t fact-checked Dando on any of these points.
  • Some of these cards are just my own interpretations - rather than definitely 100% parroting what the book is saying
  • The indented parts are the questions, the answers are in "spoiler blocks" (hover over them to reveal the text), and the parts in square brackets are my notes-to-self.

Dando says ___ used biological weapons in WW1, but seemingly only against ___.

the Germans and perhaps the French;

draft animals (e.g. horses), not humans

[This was part of sabotage operations, seemingly only/especially in the US, Romania, Norway, and Argentina. The US and Romania were neutral at the time; not sure whether Norway and Argentina were.]

Dando says the 1925 Geneva Protocol prohibits ___, but not ___, of chemical and biological weapons, and that many of the parties to the Protocol entered reservations to their agreement to make it clear that ___.

Use;

Development and stockpiling;

Although they would not use such weapons first, they were prepared to use them in retaliation if such weapons were used first against them

[And a number of offensive bio weapons programs were undertaken by major states in the interwar period. Only later in the 20th century were further arms control restrictions placed on chem and bio weapons.]

Japan's offensive biological warfare program was unique in that ___. The program probably caused ___.

It used human experimentation to test biological agents;

The deaths of thousands of Chinese people

[This program ran from 1931-1945]

Dando mentions 6 countries as having had "vigorous" offensive biological weapons programs during WW2:

Japan, The Soviet Union, France, the UK, the US, Canada

[He doesn't explicitly say these were the only countries with such programs, but does seem to imply that, or at least that no other countries had similarly large programs

He notes that Germany didn't have such a program.

France's program was interrupted by the German invasion in 1940, but was resumed after WW2.]

Dando suggests that the main or most thoroughly prepared type of British WW2 biological warfare weapon/plan was...

To drop millions of cattle cakes infected with anthrax spores onto German fields, to wipe out cattle and thus deal an economic blow to Germany's overstretched agricultural system

[The British did make 5 million of these cakes.]

Dando says that there are 7 countries which definitely had offensive biological weapons programs in the second half of the 20th century:

The US, the UK, the Soviet Union, Canada, France, South Africa, Iraq

[He also says there've been numerous accusations that other countries had such programs as well, but that there isn't definite information about them.]

Dando says that 3 countries continued to have offensive biological weapons programs after becoming the depositary for, ratifying, and/or signing the BTWC:

Soviet Union, South Africa, and Iraq

[This was then illegal under international law. Prior to the BTWC, having such a program wasn't illegal - only the use of bioweapons was.

I think the other 4 states that had had such programs between WW2 and 1972 stopped at that point or before then.]

During WW2, the US offensive biological weapons program was developing anti-___, anti-___, and anti-___ weapons.

personnel; animal; plant

[And the US was considering using anti-plant weapons against Japanese rice production.]

What major change in high-level US policy regarding chemical and biological weapons does Dando suggest occurred around 1956?

What does he suggest this was partly a reaction to?

Changing from a retaliation-only policy for BW and CW to a policy stating that the US would be prepared to use BW or CW in a general war for the purposes of enhancing military effectiveness [and the decision would be reserved for the president];

Soviet statements in 1956 that chemical and biological weapons would be used in future wars for the purposes of mass destruction

[Dando notes that the retaliation-only policy was in line with the US's signature of the 1925 Geneva protocol, but also that the US didn't actually ratify the Geneva protocol till 1975; until then it was only a signatory.]

Dando says an army report says the origin of the US's shift (under Nixon) to renouncing biological and chemical weapons dates from...

Criticism of US application of chemical herbicides and riot control agent(s) in Vietnam starting in the 1960s

[I think this means criticism/opposition by the public.]

The UK's work on an offensive biological weapons capability had been abandoned by...


[According to a report cited by Dando.

Though Dando later indicates the UK restarted some of this work in 1961, I think particularly/only to find a nonlethal incapacitating chemical weapon.]

Dando says that, at the end of WW2, the UK viewed biological weapons as...

On a par with nuclear weapons

[“Only when the UK obtained its own nuclear systems did interest in biological weapons decline.”

I don't know precisely what Dando means by this.]

South Africa had an offensive biological weapons program during...

The later stages of the Apartheid regime

[But it was terminated before the regime change.]

What was the scale of South Africa's offensive biological weapons program? What does its main purpose seem to have been?

Relatively small (e.g. smaller than Iraq's program)

Finding means of assassinating the Apartheid regime's enemies

[Elsewhere, Dando suggests that original motivations for the program - or perhaps for some chemical weapons work? - also included the Angola war and a desire to find crowd control agents.]

What has Iraq stated about authority (as of ~1991) to launch its chemical and biological weapons?

Authority was pre-delegated to regional commanders if Baghdad was hit with nuclear weapons

[UNSCOM has noted that that doesn't exclude other forms of use, and doesn't constitute a proof of a retaliation-only policy.]

The approach to chemical weapons that Iraq pursued was ___, in contrast to a Western approach of ___.

Production and rapid use;

Production and stockpiling

[I'm guessing that this means that Iraq pursued the ability to produce chemical weapons shortly before they were needed, rather than having a pre-made, long-lasting stockpile of more stable versions.

Dando says a similar approach could've been taken towards biological weapons.]

Dando says that the main lesson from the Iraqi biological weapons program is that...

A medium-sized country without great scientific and technical resources was, within a few years, able to reach the stage of weaponising a range of deadly biological agents

What kind of vaccine does Dando say South Africa's biological weapons program tried to find? What does someone who had knowledge of the program say the vaccine might've been used for, if it had been found?

An anti-fertility vaccine;

Administering to black women without their knowledge

Dando lists 6 different types of biological agents that could be used for biological weapons:

Bacteria; Viruses; Toxins; Bioregulators; Protozoa; Fungi

[I'm not sure whether this was meant to be exhaustive, nor whether I'm right to say these are "different types of biological agents".

There's also a chance I forgot one of the types he mentioned.]

Dando says that vaccination during a plague epidemic would not be of much help, because...

Immunity takes a month to build up

[Note that I haven't fact-checked this, and that, for all I know, the situation may be different with other pathogens or newer vaccines.]

In the mid twentieth century, ___ tried to use plague-infected fleas to cause an outbreak among ___.

Japan; the Chinese

Dando notes at least 3 factors that could make the option of biowarfare or bioterrorism against animal agriculture attractive:

1. The animals are densely packed in confined areas

2. The animals reared are often from very limited genetic stock (so that a large percentage of them could succumb to a single strain of a pathogen)

3. Many/all pathogens that would be used don't infect humans (reducing risks to the people involved in producing and using the pathogens)

[Dando implies that that third point is more relevant to bioterrorism than biowarfare, but doesn't say why. I assume it's because terrorists will tend to have fewer skills and resources than military programs, making them more vulnerable to accidents.]

What proportion of state-level offensive biological weapons programs (of which we have knowledge) "carefully investigated anti-plant attacks"?

Nearly all

In the 1990s, the US OTA concluded that the cheapest overt production route for 1 nuclear bomb per year, with no international controls, would cost __.

They also concluded that a chemical weapons arsenal for substantial military capability would cost __.

They concluded that a large biological weapons arsenal may cost __.

~$200 million;

$10s of millions;

Less than $10 million

[I'm unsure precisely what this meant.

I assume the OTA thought a covert route for nuclear weapons, with international controls, would be more expensive than the overt route with no international controls.]

Efforts in the 1990s to strengthen the BWC through agreement of a verification protocol eventually failed in 2001 due to opposition from which country?

The United States

The BTWC was opened for signature in __, and entered into force in __.

1972; 1975

Dando highlights two key deficiencies of the BTWC (at least as of it entering into force in 1975):

1. There was a lack of verification measures

2. No organisation had been put in place to take care of the convention, of its effective implementation, and of its development between review conferences

[Dando notes that, in contrast to 2, there was a large organisation associated with the Chemical Weapons Convention.

Wikipedia suggests that a (very small) Implementation Support Unit for the BTWC was finally created in 2006.]

Dando highlights a US-based stakeholder as being vocally opposed to the ideas that were proposed for verifying compliance with the BTWC:

The huge US pharmaceutical industry and its linked trade associations

[I think Dando might've been talking about opposition to inspections in particular

Dando implies that this contributed to US executive branch being lukewarm on or sort-of opposed to these verification ideas.]

Comment by MichaelA on Notes on "Bioterror and Biowarfare" (2006) · 2021-03-02T00:43:40.576Z · LW · GW

See also:

Comment by MichaelA on [Part 1] Amplifying generalist research via forecasting – Models of impact and challenges · 2021-02-28T08:08:03.489Z · LW · GW

A final thought that came to mind, regarding the following passage:

It seems possible for person X to predict a fair number of a more epistemically competent person Y’s beliefs -- even before person X is as epistemically competent as Y. And in that case, doing so is evidence that person X is moving in the right direction.

I think that that's a good and interesting point.

But I imagine there would also be many cases in which X develops an intuitive ability to predict Y's beliefs quite well in a given set of domains, but in which that ability doesn't transfer to new domains. It's possible that this would be because X's "black box" simulation of Y's beliefs is more epistemically competent than Y in this new domain. But it seems more likely that Y is about as epistemically competent in this new domain as in the old one, but has to draw on different reasoning processes, knowledge, theories, intuitions, etc., and X's intuitions aren't calibrated for how Y is now thinking. 

I think we could usefully think of this issue as a question of robustness to distributional shift.

I think the same issue could probably also occur even if X has a more explicit process for predicting Y's beliefs. E.g., even if X believes they understand what sort of sources of information Y considers and how Y evaluates it and X tries to replicate that (rather than just trying to more intuitively guess what Y will say), the process X uses may not be robust to distributional shift. 

But I'd guess that more explicit, less "black box" approaches for predicting what Y will say will tend to either be more robust to distributional shift or more able to fail gracefully, such as recognising that uncertainty is now much higher and there's a need to think more carefully.
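
As a toy illustration of this - with an entirely invented setup, where the "belief rule", the domains, and all the numbers are hypothetical - here's a sketch in which a "black box" predictor of Y's beliefs (nearest remembered case) does fine in the domain it was calibrated on but fails badly after a distributional shift, while an explicit model of Y's reasoning transfers:

```python
import random

random.seed(1)

def y_belief(evidence):
    # Y's (hidden, consistent) reasoning: endorse a claim iff total
    # evidence strength exceeds 10.
    return sum(evidence) > 10

# Old domain: evidence components in [0, 5]. New (shifted) domain: [5, 15].
old_domain = [(random.uniform(0, 5), random.uniform(0, 5)) for _ in range(200)]
new_domain = [(random.uniform(5, 15), random.uniform(5, 15)) for _ in range(200)]

# X's "black box" intuition: recall Y's verdict on the most similar remembered case.
memory = {case: y_belief(case) for case in old_domain}

def black_box_predict(e):
    nearest = min(old_domain, key=lambda c: (c[0] - e[0]) ** 2 + (c[1] - e[1]) ** 2)
    return memory[nearest]

# X's explicit model: X has worked out the *form* of Y's rule,
# rather than just memorising its outputs in one domain.
def explicit_predict(e):
    return sum(e) > 10

acc_black_box = sum(black_box_predict(e) == y_belief(e) for e in new_domain) / len(new_domain)
acc_explicit = sum(explicit_predict(e) == y_belief(e) for e in new_domain) / len(new_domain)
# The explicit model transfers to the new domain; the memorised intuitions do not.
```

(In this contrived setup the old domain never contains strong enough evidence for Y to endorse anything, so X's memorised intuitions always predict "no" - and then fail almost completely in the new domain, where Y almost always endorses. The explicit model, by construction, keeps tracking Y perfectly.)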

(None of this means I disagree with the quoted passage; I'm just sharing some additional thoughts that came to mind when I read it, which seem relevant and maybe useful.)

Comment by MichaelA on [Part 1] Amplifying generalist research via forecasting – Models of impact and challenges · 2021-02-28T07:59:32.348Z · LW · GW

Here's a second thought that came to mind, which again doesn't seem especially critical to this post's aims...

You write:

Someone who can both predict my beliefs and disagrees with me is someone I should listen to carefully. They seem to both understand my model and still reject it, and this suggests they know something I don’t.

I think I understand the rationale for this statement (though I didn't read the linked Science article), and I think it will sometimes be true and important. But I think that those sentences might overstate the point. In particular, I think that those sentences implicitly presume that this other person is genuinely primarily trying to form accurate beliefs, and perhaps also that they're doing so in a way that's relatively free from bias.

But (almost?) everyone is at least sometimes primarily aiming (perhaps unconsciously) at something other than forming accurate beliefs, even when it superficially looks like they're aiming at forming accurate beliefs. For example, they may be engaging in "ideologically motivated cognition[, i.e.] a form of information processing that promotes individuals’ interests in forming and maintaining beliefs that signify their loyalty to important affinity groups". The linked study also notes that "subjects who scored highest in cognitive reflection were the most likely to display ideologically motivated cognition". 

So I think it might be common for people to be able to predict my beliefs and disagree with me, with their disagreement based not on knowing more or having a better reasoning process, but rather on finding ways to continue to hold beliefs that they're (in some sense) "motivated" to hold for some other reason.

Additionally, some people may genuinely be trying to form accurate beliefs, but with unusually bad epistemics / unusually major bias. If so, they may be able to predict my beliefs and disagree with me, with their disagreement being not a sign that they know more or reason better, but rather a result of their bad epistemics / biases. 

Of course, we should be very careful with assuming that any of the above is why a person disagrees with us! See also this and this.

The claims I'd more confidently agree with are:

Someone who can both predict my beliefs and disagrees with me might be someone I should listen to carefully. They seem to both understand my model and still reject it, and this suggests they might know something I don’t (especially if they seem to be genuinely trying to form accurate beliefs and to do so via a reasonable process).

(Or maybe having that parenthetical at the end would be bad via making people feel licensed to dismiss people who disagree with them as just biased.)

Comment by MichaelA on [Part 1] Amplifying generalist research via forecasting – Models of impact and challenges · 2021-02-28T07:44:31.938Z · LW · GW

Thanks for this and its companion post; I found the two posts very interesting, and I think they'll usefully inform some future work for me.

A few thoughts came to mind as I read, some of which can sort-of be seen as pushing back against some claims, but in ways that I think aren't very important and that I expect you've already thought about. I'll split these into separate comments.

Firstly, as you note, what you're measuring is how well predictions match a proxy for the truth (the proxy being Elizabeth's judgement), rather than the truth itself. Something I think you don't explicitly mention is that:

  1. Elizabeth's judgement may be biased in some way (rather than just randomly erring), and
  2. The network-based forecasters' judgements may be biased in a similar way, and therefore
  3. This may "explain away" part of the apparent value of the network-based forecasters' predictions, along with part of their apparent superiority over the online crowdworkers' predictions.

E.g., perhaps EA-/rationality-adjacent people are biased towards disagreeing with "conventional wisdom" on certain topics, and this bias is somewhat shared between Elizabeth and the network-based forecasters. (I'm not saying this is actually the case; it's just an example.)

You make a somewhat similar point in the Part 2 post, when you say that the online crowdworkers:

were operating under a number of disadvantages relative to other participants, which means we should be careful when interpreting their performance. [For example, the online crowdworkers] did not know that Elizabeth was the researcher who created the claims and would resolve them, and so they had less information to model the person whose judgments would ultimately decide the questions.

But that is about participants' ability to successfully focus on predicting what Elizabeth will say, rather than about them accidentally being biased in the same way as Elizabeth when both are trying to make judgements about the ground truth.

In any case, I don't think this matters much. One reason is that this "shared bias" issue probably at most "explains away" a relatively small fraction of the apparent value of the network-based forecasters' predictions, probably without tipping the balance of whether this sort of set-up is worthwhile. Another reason is that there may be ways to mitigate this "shared bias" issue.

Comment by MichaelA on Notes on Schelling's "Strategy of Conflict" (1960) · 2021-02-12T09:26:06.163Z · LW · GW

Good idea! I didn't know about that feature.

I've now edited the post to use spoiler-blocks (though a bit messily, as I wanted to do it quickly), and will use them for future lazy-Anki-card-notes-posts as well. 

Comment by MichaelA on Notes on Schelling's "Strategy of Conflict" (1960) · 2021-02-12T09:17:42.974Z · LW · GW

I didn't add that tag; some other reader did. 

And any reader can indeed downvote any tag, so if you feel that that tag shouldn't be there, you could just downvote it. 

Unless you feel that the tag shouldn't be there but aren't very confident about that, and thus wanted to just gently suggest that maybe the tag should be removed - like putting in a 0.5 vote rather than a full one. But that doesn't seem to match the tone of your comment.

That said, it actually does seem to me that this post fairly clearly does match the description for that tag; the exercise is using these Anki cards as Anki cards. People can find a link to download these cards in the Anki card file format here. (I've now added that link in the body of the post itself; I guess I should've earlier.)


As a meta comment: For what it's worth, I feel like your comment had an unnecessarily snarky tone, at least to my eye. I think you could've either just downvoted the tag, or said the same thing in a way that sounds less snarky. That said:

  • It's very possible (even probable?) that you didn't intend to be snarky, and that this is just a case of tone getting misread on the internet
  • And in any case, this didn't personally bug me, partly because I've posted on LessWrong and the EA Forum a lot.
    • But I think if I was newer to the sites or to posting, this might leave a bad taste in my mouth and make me less inclined to post in future. (Again, I'm not at all trying to say this was your intent!)

(Edited to add: Btw, I wasn't the person who downvoted your comment, so that appears to be slightly more evidence that your comment was at least liable to be interpreted as unnecessarily snarky - although again I know that that may not have been your intention.)

Comment by MichaelA on Notes on Schelling's "Strategy of Conflict" (1960) · 2021-02-11T23:53:52.893Z · LW · GW

Yeah, I definitely agree that that's a good idea with any initialisms that won't already be known to the vast majority of one's readers (e.g., I wouldn't bother with US or UK, but would with APA). In this case, I just copied and pasted the post from the EA Forum, where I do think the vast majority of readers would know what "EA" means - but I should've used the expanded form "effective altruism" the first time in the LessWrong version. I've now edited that. 

Comment by MichaelA on Notes on Schelling's "Strategy of Conflict" (1960) · 2021-02-10T02:54:51.968Z · LW · GW

Here's a comment I wrote on the EA Forum version of this post, which I'm copying here as I'd be interested in people's thoughts on the equivalent questions in the context of LessWrong:

Meta: Does this sort of post seem useful? Should there be more posts like this?

I previously asked Should pretty much all content that's EA-relevant and/or created by EAs be (link)posted to the Forum? I found Aaron Gertler's response interesting and useful. Among other things, he said:

Eventually, we'd like it to be the case that almost all well-written EA content exists on the Forum somewhere.


I meant "quite EA-relevant and well-written". I don't especially care whether the content is written by community members, though I suppose that's slightly preferable (as community members are much more likely to respond to comments on their work).


A single crosspost with a bit of context from the author -- e.g. a few sentences each of summary/highlights, commentary, and action items/takeaways -- seems better to me than three or four crossposts with no context at all. In my view, the best Forum content tends to give busy people a quick way to decide whether to read further.

And I read a lot of stuff that I think it could be useful for at least some other EAs to read, and that isn't (link)posted to the Forum. So Aaron's comments, combined with my own thinking and some comments from other people, make me think it'd be good for me to make linkposts for lots of that stuff if there was a way to do it that took up very little of my time.

Unfortunately, writing proper book reviews, or even just notes that are geared for public consumption, for all of those things I read would probably take a while. 

But, starting about a month ago, I now make Anki cards for myself anyway during most of the reading I do. So maybe I should just make posts sort-of like this one for most particularly interesting things I read? And maybe other people could start doing that too?

A big uncertainty I have is how often the cards I make myself would be able to transmit useful ideas even to people who (a) aren't me and (b) didn't read the thing I read, and how often they'd do that with an efficiency comparable to people just finding and reading useful sources themselves directly. Another, related uncertainty is whether there'd be any demand for posts like this. 

So I'd be interested in people's thoughts on the above.

Comment by MichaelA on Notes on Schelling's "Strategy of Conflict" (1960) · 2021-02-10T02:54:33.856Z · LW · GW

Note: If you found this post interesting, you may also be interested in my Notes on "The Bomb: Presidents, Generals, and the Secret History of Nuclear War" (2020), or (less likely) Notes on The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous. (The latter book has a very different topic; I just mention it as the style of post is the same.)

Comment by MichaelA on Clarifying some key hypotheses in AI alignment · 2021-01-25T00:27:28.899Z · LW · GW

To your first point...

My impression is that there is indeed substantially less literature on misuse risk and structural risk, compared to accident risk, in relation to AI x-risk. (I'm less confident when it comes to a broader set of negative outcomes, not just x-risks, but that's also less relevant here and less important to me.) I do think that that might make the sort of work this post does less interesting if done in relation to those less-discussed types of risks, since fewer disagreements have been revealed there, so there's less to analyse and summarise. 

That said, I still expect interesting stuff along these lines could be done on those topics. It just might be a quicker job with a smaller output than this post. 

I collected a handful of relevant sources and ideas here. I think someone reading those things and providing a sort of summary, analysis, and/or mapping could be pretty handy, and might even be doable in just a day or so of work. It might also be relatively easy to provide more "novel ideas" in the course of that work than it would've been for your post, since misuse/structural risks seem like less charted territory. 

(Unfortunately I'm unlikely to do this myself, as I'm currently focused on nuclear war risk.)


A separate point is that I'd guess that one reason why there's less work on misuse/structural AI x-risk than on accidental AI x-risk is that a lot of people aren't aware of those other categories of risks, or rarely think about them, or assume the risks are much smaller. And I think one reason for that is that people often write or talk about "AI x-risk" while actually only mentioning accidental AI x-risk. That's part of why I say "So, personally, I think I’d have made that choice of scope even more explicit." 

(But again, I do very much like this post overall. And as a target of this quibble of mine, you're in good company - I have the same quibble with The Precipice. I think one of the quibbles I most often have with posts I like is "This post seems to imply, or could be interpreted as implying, that it covers [topic]. But really it covers [some subset of that topic]. That's fair enough and still very useful, but I think it'd be good to be clearer about what the scope is.")


I know some people working on expanded and more in-depth models like this post. It would be great to get your thoughts when they're ready.

Sounds very cool! Yeah, I'd be happy to have a look at that work when it's ready.

Comment by MichaelA on The "Commitment Races" problem · 2021-01-14T03:00:11.529Z · LW · GW

Thanks for this post; this does seem like a risk worth highlighting.

I've just started reading Thomas Schelling's 1960 book The Strategy of Conflict, and noticed a lot of ideas in chapter 2 that reminded me of many of the core ideas in this post. My guess is that that sentence is an uninteresting, obvious observation, and that Daniel and most readers were already aware (a) that many of the core ideas here were well-trodden territory in game theory and (b) that this post's objectives were to: 

  • highlight these ideas to people on LessWrong
  • highlight their potential relevance to AI risk
  • highlight how this interacts with updateless decision theory and acausal trade

But maybe it'd be worth people who are interested in this problem reading that chapter of The Strategy of Conflict, or other relevant work in standard academic game theory, to see if there are additional ideas there that could be fruitful here.

Comment by MichaelA on MichaelA's Shortform · 2021-01-02T08:39:40.041Z · LW · GW

Problems in AI risk that economists could potentially contribute to

List(s) of relevant problems


I intend for this to include both technical and governance problems, and problems relevant to a variety of AI risk scenarios (e.g., AI optimising against humanity, AI misuse by humans, AI extinction risk, AI dystopia risk...)

Wei Dai’s list of Problems in AI Alignment that philosophers could potentially contribute to made me think that it could be useful to have a list of problems in AI risk that economists could potentially contribute to. So I began making such a list. 


  • I’m neither an AI researcher nor an economist
  • I spent hardly any time on this, and just included things I’ve stumbled upon, rather than specifically searching for these things

So I’m sure there’s a lot I’m missing.

Please comment if you know of other things worth mentioning here. Or if you’re better placed to make a list like this than I am, feel very free to do so; you could take whatever you want from this list and then comment here to let people know where to find your better thing.

(It’s also possible another list like this already exists. And it's also possible that economists could contribute to such a large portion of AI risk problems that there’s no added value in making a separate list for economists specifically. If you think either of those things is true, please comment to say so!)

Comment by MichaelA on Clarifying some key hypotheses in AI alignment · 2021-01-02T07:38:51.308Z · LW · GW

It occurs to me that all of the hypotheses, arguments, and approaches mentioned here (though not necessarily the scenarios) seem to be about the “technical” side of things. There are two main things I mean by that statement:

First, this post seems to be limited to explaining something along the lines of “x-risks from AI accidents”, rather than “x-risks from misuse of AI”, or “x-risk from AI as a risk factor” (e.g., how AI could potentially increase risks of nuclear war). 

I do think it makes sense to limit the scope that way, because: 

  • no one post can cover everything
  • you don’t want to make the diagram overwhelming
  • there’s a relatively clear boundary between what you’re covering and what you’re not
  • what you’re covering seems like the most relevant thing for technical AI safety researchers, whereas the other parts are perhaps more relevant for people working on AI strategy/governance/policy

And the fact that this post's scope is limited in that way seems somewhat highlighted by saying this is about AI alignment (whereas misuse could occur even with a system aligned to some human’s goals), and by saying “The idea is closely connected to the problem of artificial systems optimizing adversarially against humans.” 

But I think misuse and “risk factor”/“structural risk” issues are also quite important, that they should be on technical AI safety researchers’ radars to some extent, and that they probably interact in some ways with technical AI safety/alignment. So, personally, I think I’d have made that choice of scope even more explicit.

I’d also be really excited to see a post that takes the same approach as this one, but for those other classes of AI risks. 


The second thing I mean by the above statement is that this post seems to exclude non-technical factors that seem like they'd also impact the technical side or the risk of AI accidents.

One crux of this type would be “AI researchers will be cautious/sensible/competent “by default””. Here are some indications that that’s an “important and controversial hypothes[is] for AI alignment”:

  • AI Impacts summarised some of Rohin’s comments as “AI researchers will in fact correct safety issues rather than hacking around them and redeploying. Shah thinks that institutions developing AI are likely to be careful because human extinction would be just as bad for them as for everyone else.” 
  • But my impression is that many people at MIRI would disagree with that, and are worried that people will merely “patch” issues in ways that don’t adequately address the risks. 
  • And I think many would argue that institutions won’t be careful enough, because they only pay a portion of the price of extinction; reducing extinction risk is a transgenerational global public good (see Todd and this comment).
  • And I think views on these matters influence how much researchers would be happy with the approach of “Use feedback loops to course correct as we go”. I think the technical things influence how easily we theoretically could do that, while the non-technical things influence how much we realistically can rely on people to do that. 

So it seems to me that a crux like that could perhaps fit well in the scope of this post. And I thus think it’d be cool if someone could either (1) expand this post to include cruxes like that, or (2) make another post with a similar approach, but covering non-technical cruxes relevant to AI safety.

Comment by MichaelA on Clarifying some key hypotheses in AI alignment · 2021-01-02T07:35:09.462Z · LW · GW

Thanks for this post! This seems like a really great way of visually representing how these different hypotheses, arguments, approaches, and scenarios interconnect. (I also think it’d be cool to see posts on other topics which use a similar approach!)

It seems that AGI timelines aren’t explicitly discussed here. (“Discontinuity to AGI” is mentioned, but I believe that's a somewhat distinct matter.) Was that a deliberate choice?

It does seem like several of the hypotheses/arguments mentioned here would feed into or relate to beliefs about timelines - in particular, Discontinuity to AGI, Discontinuity from AGI, Recursive self-improvement, ML scales to AGI, and Deep insights needed (or maybe not that last one, as that means “needed” for alignment in particular). But I don’t think beliefs about timelines would be fully accounted for by those hypotheses/arguments - beliefs about timelines could also involve cruxes like whether “Intelligence is a huge collection of specific things” or whether “There’ll be another AI winter before AGI”.

I’m not sure to what extent beliefs about timelines (aside from beliefs about discontinuity) would influence which of the approaches people should/would take, out of the approaches you list. But I imagine that beliefs that timelines are quite short might motivate work on ML or prosaic alignment rather than (Near) proof-level assurance of alignment or Foundational or “deconfusion” research. This would be because people might then think the latter approaches would take too long, such that our only shot (given these people’s beliefs) is doing ML or prosaic alignment and hoping that’s enough. (See also.)

And it seems like beliefs about timelines would feed into decisions about other approaches you don’t mention, like opting for investment or movement-building rather than direct, technical work. (That said, it seems reasonable for this post’s scope to just be what a person should do once they have decided to work on AI alignment now.)

Comment by MichaelA on Does the US nuclear policy still target cities? · 2021-01-01T08:03:14.007Z · LW · GW

Thanks for this post; I found it useful.

The US policy has never ruled out the possibility of escalation to full countervalue targeting and is unlikely to do so.

But the 2013 DoD report says "The United States will not intentionally target civilian populations or civilian objects". That of course doesn't prove that the US actually wouldn't engage in countervalue targeting, but doesn't it indicate that US policy at that time ruled out engaging in countervalue targeting? 

This is a genuine rather than rhetorical question. I feel I might be just missing something, because, as you note, the paper you cited says:

Did this mean that the United States was discarding its ultimate assured destruction threat for deterring nuclear war? Clearly not. The guidance was carefully drafted. Does not rely on is different from will not resort to

...and yet, as far as I can see, the paper just doesn't address the "will not intentionally target" line. So I feel confused by the paper's analysis. (Though I haven't read the paper in full.)

Comment by MichaelA on Why those who care about catastrophic and existential risk should care about autonomous weapons · 2020-11-20T07:25:15.162Z · LW · GW

If I had to choose between a AW treaty and some treaty governing powerful AI, the latter (if it made sense) is clearly more important. I really doubt there is such a choice and that one helps with the other, but I could be wrong here. [emphasis added]

Did you mean something like "and in fact I think that one helps with the other"?

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-10-09T06:57:11.159Z · LW · GW

I don't think I know of any person who's demonstrated this who thinks risk is under, say, 10%

If you mean risk of extinction or existential catastrophe from AI at the time AI is developed, it seems really hard to say, as I think that that's been estimated even less often than other aspects of AI risk (e.g. risk this century) or x-risk as a whole. 

I think the only people (maybe excluding commenters who don't work on this professionally) who've clearly given a greater than 10% estimate for this are: 

  • Buck Shlegeris (50%)
  • Stuart Armstrong (33-50% chance humanity doesn't survive AI)
  • Toby Ord (10% existential risk from AI this century, but 20% for when the AI transition happens)

Meanwhile, people who I think have effectively given <10% estimates for that (judging from estimates that weren't conditioning on when AI was developed; all from my database):

  • Very likely MacAskill (well below 10% for extinction as a whole in the 21st century)
  • Very likely Ben Garfinkel (0-1% x-catastrophe from AI this century)
  • Probably the median FHI 2008 survey respondent (5% for AI extinction in the 21st century)
  • Probably Pamlin & Armstrong in a report (0-10% for unrecoverable collapse extinction from AI this century)
    • But then Armstrong separately gave a higher estimate
    • And I haven't actually read the Pamlin & Armstrong report
  • Maybe Rohin Shah (some estimates in a comment thread)

(Maybe Hanson would also give <10%, but I haven't seen explicit estimates from him, and his reduced focus on and "doominess" from AI may be because he thinks timelines are longer and other things may happen first.)

I'd personally consider all the people I've listed to have demonstrated at least a fairly good willingness and ability to reason seriously about the future, though there's perhaps room for reasonable disagreement here. (With the caveat that I don't know Pamlin and don't know precisely who was in the FHI survey.)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-10-09T06:35:02.114Z · LW · GW

Mostly I only start paying attention to people's opinions on these things once they've demonstrated that they can reason seriously about weird futures

[tl;dr This is an understandable thing to do, but does seem to result in biasing one's sample towards higher x-risk estimates]

I can see the appeal of that principle. I partly apply such a principle myself (though in the form of giving less weight to some opinions, not ruling them out).

But what if it turns out the future won't be weird in the ways you're thinking of? Or what if it turns out that, even if it will be weird in those ways, influencing it is too hard, or just isn't very urgent (i.e., the "hinge of history" is far from now), or is already too likely to turn out well "by default" (perhaps because future actors will also have mostly good intentions and will be more informed). 

Under such conditions, it might be that the smartest people with the best judgement won't demonstrate that they can reason seriously about weird futures, even if they hypothetically could, because it's just not worth their time to do so. In the same way as how I haven't demonstrated my ability to reason seriously about tax policy, because I think reasoning seriously about the long-term future is a better use of my time. Someone who starts off believing tax policy is an overwhelmingly big deal could then say "Well, Michael thinks the long-term future is what we should focus on instead, but why should I trust Michael's view on that when he hasn't demonstrated he can reason seriously about the importance and consequences of tax policy?"

(I think I'm being inspired here by Trammell's interesting post "But Have They Engaged With The Arguments?" There's some LessWrong discussion - which I haven't read - of an early version here.)

I in fact do believe we should focus on long-term impacts, and am dedicating my career to doing so, as influencing the long-term future seems sufficiently likely to be tractable, urgent, and important. But I think there are reasonable arguments against each of those claims, and I wouldn't be very surprised if they turned out to all be wrong. (But I think currently we've only had a very small part of humanity working intensely and strategically on this topic for just ~15 years, so it would seem too early to assume there's nothing we can usefully do here.)

And if so, it would be better to try to improve the short-term future, which further future people can't help us with, and then it would make sense for the smart people with good judgement to not demonstrate their ability to think seriously about the long-term future. So under such conditions, the people left in the sample you pay attention to aren't the smartest people with the best judgement, and are skewed towards unreasonably high estimates of the tractability, urgency, and/or importance of influencing the long-term future.

To emphasise: I really do want way more work on existential risks and longtermism more broadly! And I do think that, when it comes to those topics, we should pay more attention to "experts" who've thought a lot about those topics than to other people (even if we shouldn't only pay attention to them). I just want us to be careful about things like echo chamber effects and biasing the sample of opinions we listen to.

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-10-08T06:56:26.788Z · LW · GW

I'm not sure which of these estimates are conditional on superintelligence being invented. To the extent that they're not, and to the extent that people think superintelligence may not be invented, that means they understate the conditional probability that I'm using here.

Good point. I'd overlooked that.

I think lowish estimates of disaster risks might be more visible than high estimates because of something like social desirability, but who knows.

(I think it's good to be cautious about bias arguments, so take the following with a grain of salt, and note that I'm not saying any of these biases are necessarily the main factor driving estimates. I raise the following points only because the possibility of bias has already been mentioned.)

I think social desirability bias could easily push the opposite way as well, especially if we're including non-academics who dedicate their jobs or much of their time to x-risks (which I think covers the people you're considering, except that Rohin is sort-of in academia). I'd guess the main people listening to these people's x-risk estimates are other people who think x-risks are a big deal, and higher x-risk estimates would tend to make such people feel more validated in their overall interests and beliefs. 

I can see how something like a bias towards saying things that people take seriously and that don't seem crazy (which is perhaps a form of social desirability bias) could also push estimates down. I'd guess that that effect is stronger the closer one gets to academia or policy. I'm not sure what the net effect of the social desirability bias type stuff would be on people like MIRI, Paul, and Rohin.

I'd guess that the stronger bias would be selection effects in who even makes these estimates. I'd guess that people who work on x-risks have higher x-risk estimates than people who don't and who have thought about odds of x-risk somewhat explicitly. (I think a lot of people just wouldn't have even a vague guess in mind, and could swing from casually saying extinction is likely in the next few decades to seeing that idea as crazy depending on when you ask them.) 

Quantitative x-risk estimates tend to come from the first group, rather than the latter, because the first group cares enough to bother to estimate this. And we'd be less likely to pay attention to estimates from the latter group anyway, if they existed, because they don't seem like experts - they haven't spent much time thinking about the issue. But they haven't spent much time thinking about it because they don't think the risk is high, so we're effectively selecting who to listen to the estimates of based in part on what their estimates would be.

I'd still do similar myself - I'd pay attention to the x-risk "experts" rather than other people. And I don't think we need to massively adjust our own estimates in light of this. But this does seem like a reason to expect the estimates are biased upwards, compared to the estimates we'd get from a similarly intelligent and well-informed group of people who haven't been pre-selected for a predisposition to think the risk is somewhat high.

Comment by MichaelA on Thoughts on Human Models · 2020-09-26T14:33:50.871Z · LW · GW

That does seem interesting and concerning.

Minor: The link didn’t work for me; in case others have the same problem, here is (I believe) the correct link.

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-25T06:35:34.107Z · LW · GW

Yeah, totally agreed. 

I also think it's easier to forecast extinction in general, partly because it's a much clearer threshold, whereas there are some scenarios that some people might count as an "existential catastrophe" and others might not. (E.g., Bostrom's "plateauing — progress flattens out at a level perhaps somewhat higher than the present level but far below technological maturity".)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-24T06:41:47.906Z · LW · GW

Conventional risks are events that already have a background chance of happening (as of 2020 or so) and does not include future technologies. 

Yeah, that aligns with how I'd interpret the term. I asked about advanced biotech because I noticed it was absent from your answer unless it was included in "super pandemic", so I was wondering whether you were counting it as a conventional risk (which seemed odd) or excluding it from your analysis (which also seems odd to me, personally, but at least now I understand your short-AI-timelines-based reasoning for that!).

I am going read through the database of existential threats though, does it include what you were referring too?

Yeah, I think all the things I'd consider most important are in there. Or at least "most" - I'd have to think for longer in order to be sure about "all".

There are scenarios that I think aren't explicitly addressed in any estimates that database, like things to do with whole-brain emulation or brain-computer interfaces, but these are arguably covered by other estimates. (I also don't have a strong view on how important WBE or BCI scenarios are.)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T20:16:34.680Z · LW · GW

The overall risk was 9.2% for the community forecast (with 7.3% for AI risk). To convert this to a forecast for existential risk (100% dead), I assumed 6% risk from AI, 1% from nuclear war, and 0.4% from biological risk

I think this implies you think: 

  • AI is ~4 or 5 times (6% vs 1.3%) as likely to kill 100% of people as to kill between 95 and 100% of people
  • Everything other than AI is roughly equally likely (1.5% vs 1.4%) to kill 100% of people as to kill between 95% and 100% of people

Does that sound right to you? And if so, what was your reasoning?

I ask out of curiosity, not because I disagree. I don't have a strong view here, except perhaps that AI is the risk with the highest ratio of "chance it causes outright extinction" to "chance it causes major carnage" (and this seems to align with your views).
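The "~4 or 5 times" figure for AI can be checked with a line of arithmetic, using only the numbers quoted above (7.3% total AI risk, of which an assumed 6% is outright extinction):

```python
ai_risk_total = 0.073   # AI causing >=95% dead (community forecast, quoted above)
ai_extinction = 0.06    # assumed AI share of outright extinction (100% dead)
ai_partial = ai_risk_total - ai_extinction  # 95-100% (but not 100%) dead: ~1.3%

ratio = ai_extinction / ai_partial
print(round(ratio, 1))  # roughly 4.6, i.e. "~4 or 5 times"
```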

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T20:10:18.842Z · LW · GW

Very interesting, thanks for sharing! This seems like a nice example of combining various existing predictions to answer a new question.

a forecast for existential risk (100% dead)

It seems worth highlighting that extinction risk (risk of 100% dead) is a (big) subset of existential risk (risk of permanent and drastic destruction of humanity's potential), rather than those two terms being synonymous. If your forecast was for extinction risk only, then the total existential risk should presumably be at least slightly higher, due to risks of unrecoverable collapse or unrecoverable dystopia.

(I think it's totally ok and very useful to "just" forecast extinction risk. I just think it's also good to be clear about what one's forecast is of.)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T20:04:02.427Z · LW · GW

Thanks for those responses :)

MIRI people and Wei Dai for pessimism (though I'm not sure it's their view that it's worse than 50/50), Paul Christiano and other researchers for optimism. 

It does seem odd to me that, if you aimed to do something like average over these people's views (or maybe take a weighted average, weighting based on the perceived reasonableness of their arguments), you'd end up with a 50% credence on existential catastrophe from AI. (Although now I notice you actually just said "weight it by the probability that it turns out badly instead of well"; I'm assuming by that you mean "the probability that it results in existential catastrophe", but feel free to correct me if not.)

One MIRI person (Buck Shlegeris) has indicated they think there's a 50% chance of that. One other MIRI-adjacent person gives estimates for similar outcomes in the range of 33-50%. I've also got general pessimistic vibes from other MIRI people's writings, but I'm not aware of any other quantitative estimates from them or from Wei Dai. So my point estimate for what MIRI people think would be around 40-50%, not well above 50%.

And I think MIRI is widely perceived as unusually pessimistic (among AI and x-risk researchers; not necessarily among LessWrong users). And people like Paul Christiano give something more like a 10% chance of existential catastrophe from AI. (Precisely what he was estimating was a little different, but similar.)

So averaging across these views would seem to give us something closer to 30%. 
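(A rough sketch of that averaging, treating the point estimates cited above as given - the midpoint of a 40-50% MIRI-ish range, and a Christiano-style ~10% - purely to illustrate why a simple average lands nearer 30% than 50%:)

```python
from statistics import mean

# Illustrative only: the inputs are the estimates cited in this comment,
# not new or independent estimates.
miri_ish = mean([0.40, 0.50])   # midpoint of the quoted 40-50% range
optimists = 0.10                # Christiano-style ~10% estimate

naive_average = mean([miri_ish, optimists])  # 0.275, i.e. "closer to 30%"
```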

Personally, I'd also probably include various other people who seem thoughtful on this and are actively doing AI or x-risk research - e.g., Rohin Shah, Toby Ord - and these people's estimates seem to usually be closer to Paul than to MIRI (see also). But arguing for doing that would be arguing for a different reasoning process, and I'm very happy with you using your independent judgement to decide who to defer to; I intend this comment to instead just express confusion about how your stated process reached your stated output.

(I'm getting these estimates from my database of x-risk estimates. I'm also being slightly vague because I'm still feeling a pull to avoid explicitly mentioning other views and thereby anchoring this thread.)

(I should also note that I'm not at all saying to not worry about AI - something like a 10% risk is still a really big deal!)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T12:46:13.938Z · LW · GW

(Just a heads up that the link leads back to this thread, rather than to your Elicit snapshot :) )

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T12:44:46.723Z · LW · GW

(Minor & meta: I'd suggest people take screenshots which include the credence on "More than 2120-01-01" on the right, as I think that's a quite important part of one's prediction. But of course, readers can still find that part of your prediction by reading your comment or clicking the link - it's just not highlighted as immediately.)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T12:37:53.599Z · LW · GW

I do not think any conventional threat such as nuclear war, super pandemic or climate change is likely to be an ER

Are you including risks from advanced biotechnology in that category? To me, it would seem odd to call that a "conventional threat"; that category sounds to me like it would refer to things we have a decent amount of understanding of and experience with. (Really this is more of a spectrum, and our understanding of and experience with risks from nuclear war and climate change is of course limited in key ways as well. But I'd say it's notably less limited than is the case with advanced biotech or advanced AI.)

with the last <1% being from more unusual threats such as simulation being turned off, false vacuum collapse, or hostile alien ASI. But also, for unforeseen or unimagined threats.

It appears to me that there are some important risks that have been foreseen and imagined which you're not accounting for. Let me know if you want me to say more; I hesitate merely because I'm wary of pulling independent views towards community views in a thread like this, not for infohazard reasons (the things I have in mind are widely discussed and non-exotic). 

Note: I made this prediction before looking at the Effective Altruism Database of Existential Risk Estimates.

I think it's cool that you made this explicit, to inform how and how much people update on your views if they've already updated on views in that database :)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T12:32:32.325Z · LW · GW

Interesting, thanks for sharing. 

an uncertain but probably short delay for a major x-risk factor (probably superintelligence) to appear as a result

I had a similar thought, though ultimately was too lazy to try to actually represent it. I'd be interested to hear what size of delay you used, and what your reasoning for that was.

averaging to about 50% because of what seems like a wide range of opinions among reasonable well-informed people

Was your main input into this parameter your perceptions of what other people would believe about this parameter? If so, I'd be interested to hear whose beliefs you perceive yourself to be deferring to here. (If not, I might not want to engage in that discussion, to avoid seeming to try to pull an independent belief towards average beliefs of other community members, which would seem counterproductive in a thread like this.)

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-23T07:37:27.758Z · LW · GW

I'll also hesitantly mention my database of existential risk estimates

I hesitate because I suspect it's better if most people who are willing to make a forecast here do so without having recently looked at the predictions in that database, so we get a larger collection of more independent views. 

But I guess people can make their own decision about whether to look at the database, perhaps for cases where:

  • People just feel too unsure where to start with forecasting this to bother trying, but if they saw other people's forecasts they'd be willing to come up with their own forecast that does more than just totally parroting the existing forecasts
    • And it's necessary to do more than just parroting, as the existing forecasts are about % chance by a given date, not the % chance at each date over a period
    • People could perhaps come up with clever ways to decide how much weight to give each forecast and how to translate them into an Elicit snapshot
  • People make their own forecast, but then want to check the database and consider making tweaks before posting it here (ideally also showing here what their original, independent forecast was)
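(On that "% chance by a given date" point: one simple way to turn a cumulative forecast into a per-period figure is to assume a constant annual hazard rate consistent with it. A minimal sketch, with a purely illustrative 10%-by-2120 input:)

```python
# Assume a constant annual hazard rate r, chosen so that
# 1 - (1 - r)**years equals the stated cumulative probability.
# (This is just one simple modelling choice, not the only one.)

def annual_hazard(cumulative_p: float, years: int) -> float:
    """Constant per-year risk consistent with a cumulative probability."""
    return 1 - (1 - cumulative_p) ** (1 / years)

r = annual_hazard(0.10, 100)  # ~0.00105, i.e. roughly 0.1% per year
```

One could then re-aggregate such per-year rates into whatever distribution over dates an Elicit snapshot needs.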

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-22T06:28:24.759Z · LW · GW

Here are a couple sources people might find useful for guiding how they try to break this question down and reason about it:

Comment by MichaelA on Forecasting Thread: Existential Risk · 2020-09-22T06:21:11.402Z · LW · GW

Thanks for making this thread!

I should say that I'd give very little weight to both my forecast and my reasoning. Reasons for that include that:

  • I'm not an experienced forecaster
  • I don't have deep knowledge on relevant specifics (e.g., AI paradigms, state-of-the-art in biotech)
  • I didn't spend a huge amount of time on my forecast, and used pretty quick-and-dirty methods
  • I drew on existing forecasts to some extent (in particular, the LessWrong Elicit AI timelines thread and Ord's x-risk estimates). So if you updated on those forecasts and then also updated on my forecast as if it was independent of them, you'd be double-counting some views and evidence

So I'm mostly just very excited to see other people's forecasts, and even more excited to see how they reason about and break down the question!

Comment by MichaelA on MichaelA's Shortform · 2020-09-04T08:51:21.298Z · LW · GW

If anyone reading this has read anything I’ve written on LessWrong or the EA Forum, I’d really appreciate you taking this brief, anonymous survey. Your feedback is useful whether your opinion of my work is positive, mixed, lukewarm, meh, or negative. 

And remember what mama always said: If you’ve got nothing nice to say, self-selecting out of the sample for that reason will just totally bias Michael’s impact survey.

(If you're interested in more info on why I'm running this survey and some thoughts on whether other people should do similar, I give that here.)

Comment by MichaelA on Please take a survey on the quality/impact of things I've written · 2020-08-29T10:40:39.457Z · LW · GW

Why I’m running this survey

I think that getting clear feedback on how well one is doing, and how much one is progressing, tends to be somewhat hard in general, but especially when it comes to:

  • Research
    • And especially relatively big-picture/abstract research, rather than applied research
  • Actually improving the world compared to the counterfactual
    • Rather than, e.g., getting students’ test scores up, meeting an organisation’s KPIs, or publishing a certain number of papers
  • Longtermism

And I’ve primarily been doing big-picture/abstract research aimed at improving the world, compared to the counterfactual, from a longtermist perspective. So, yeah, I’m a tad in the dark about how it’s all been going…[1]

I think some of the best metrics by which to judge research are whether people:

  • are bothering to pay attention to it
  • think it’s interesting
  • think it’s high-quality/rigorous/well-reasoned
  • think it addresses important topics
  • think it provides important insights
  • think they’ve actually changed their beliefs, decisions, or plans based on that research
  • etc.

I think this data is most useful if these people have relevant expertise, are in positions to make especially relevant and important decisions, etc. But anyone can at least provide input on things like how well-written or well-reasoned some work seems to have been. And whoever the respondents are, whether the research influenced them probably provides at least weak evidence regarding whether the research influenced some other set of people (or whether it could, if that set of people were to read it).

This year, I’ve gathered a decent amount of data about the above-listed metrics. But more data would be useful. And the data I’ve gotten so far has usually been non-anonymous, and often resulted from people actively reaching out to me. Both of those factors likely bias the responses in a positive direction. 

So I’ve created this survey in order to get additional - and hopefully less biased - data, as an input into my thinking about: 

  1. whether EA-aligned research and/or writing is my comparative advantage (as I’m also actively considering a range of alternative pathways)
  2. which topics, methodologies, etc. within research and/or writing are my comparative advantage
  3. specific things I could improve about my research and/or writing (e.g., topic choice, how rigorous vs rapid-fire my approach should be, how concise I should be)

But there’s also another aim of this survey. The idea of doing this survey, and many of the questions, was inspired partly by Rethink Priorities’ impact survey. But I don’t recall seeing evidence that individual researchers/writers (or even other organisations) run such surveys.[2] And it seems plausible to me that they’d benefit from doing so. 

So this is also an experiment to see how feasible and useful this is, to inform whether other people should run their own surveys of this kind. I plan to report back here in a couple of weeks, in September, with info like how many responses I got and how useful this seemed to be.

[1] I’m not necessarily saying that that type of research is harder to do than e.g. getting students’ test scores up. I’m just saying it’s harder to get clear feedback on how well one is doing.

[2] Though I have seen various EAs provide links to forms for general anonymous feedback. I think that’s also a good idea, and I’ve copied the idea in my own forum bio.

Comment by MichaelA on MichaelA's Shortform · 2020-08-23T10:41:45.312Z · LW · GW

See also Open Philanthropy Project's list of different kinds of uncertainty (and comments on how we might deal with them) here

Comment by MichaelA on MichaelA's Shortform · 2020-08-04T12:02:28.487Z · LW · GW

See also EA reading list: cluelessness and epistemic modesty.

Comment by MichaelA on MichaelA's Shortform · 2020-06-26T00:03:50.072Z · LW · GW

Ok, so it sounds like Legg and Hutter's definition works given certain background assumptions / ways of modelling things, which they assume in their full paper on their own definition. 

But in the paper I cited, Legg and Hutter give their definition without mentioning those assumptions / ways of modelling things. And they don't seem to be alone in that, at least given the out-of-context quotes they provide, which include: 

  • "[Performance intelligence is] the successful (i.e., goal-achieving) performance of the system in a complicated environment"
  • "Achieving complex goals in complex environments"
  • "the ability to solve hard problems."

These definitions could all do a good job capturing what "intelligence" typically means if some of the terms in them are defined certain ways, or if certain other things are assumed. But they seem inadequate by themselves, in a way Legg and Hutter don't note in their paper. (Also, Legg and Hutter don't seem to indicate that that paper is just or primarily about how intelligence should be defined in relation to AI systems.)

That said, as I mentioned before, I don't actually think this is a very important oversight on their part.

Comment by MichaelA on MichaelA's Shortform · 2020-06-25T00:28:30.091Z · LW · GW

Firstly, I'll say that, given that people already have a pretty well-shared intuitive understanding of what "intelligence" is meant to mean, I don't think it's a major problem for people to give explicit definitions like Legg and Hutter's. I think people won't then go out and assume that wealth, physical strength, etc. count as part of intelligence - they're more likely to just not notice that the definitions might imply that.

But I think my points do stand. I think I see two things you might be suggesting:

  • Intelligence is the only thing that increases an agent’s ability to achieve goals across all environments.
  • Intelligence is an ability, which is part of the agent, whereas things like wealth are resources, and are part of the environment.

If you meant the first of those things, I'd agree that "“Intelligence” might help in a wider range of environments than those [other] capabilities or resources help in". E.g., a billion US dollars wouldn't help someone achieve their goals at any time before 1700 CE (or whenever), and probably wouldn't at any time after 3000 CE, whereas intelligence probably would. 

But note that Legg and Hutter say "across a wide range of environments." A billion US dollars would help anyone, in any job, in any country, and at any time from 1900 to 2020, achieve most of their goals. I would consider that a "wide" range of environments, even if it's not maximally wide.

And there are aspects of intelligence that would only be useful in a relatively narrow set of environments, or for a relatively narrow set of goals. E.g., factual knowledge is typically included as part of intelligence, and knowledge of the dates of birth and death of US presidents will be helpful in various situations, but probably in fewer situations and for fewer goals than a billion dollars.

If you meant the second thing, I'd note in response the other capabilities, rather than the other resources. For example, it seems to me intuitive to speak of an agent's charisma or physical strength as a property of the agent, rather than of the state. And I think those capabilities will help it achieve goals in a wide (though not maximally wide) range of environments. 

We could decide to say an agent's charisma and physical strength are properties of the state, not the agent, and that this is not the case for intelligence. Perhaps this is useful when modelling an AI and its environment in a standard way, or something like that, and perhaps it's typically assumed (I don't know). If so, then combining an explicit statement of that with Legg and Hutter's definition may address my points, as that might explicitly slice all other types of capabilities and resources out of the definition of "intelligence". 

But I don't think it's obvious that things like charisma and physical strength are more a property of the environment than intelligence is - at least for humans, for whom all of these capabilities ultimately just come down to our physical bodies (assuming we reject dualism, which seems safe to me).

Does that make sense? Or did I misunderstand your points?

Comment by MichaelA on TurnTrout's shortform feed · 2020-06-24T03:05:06.050Z · LW · GW

This seems right to me, and I think it's essentially the rationale for the idea of the Long Reflection.

Comment by MichaelA on MichaelA's Shortform · 2020-06-24T03:02:32.837Z · LW · GW

“Intelligence” vs. other capabilities and resources

Legg and Hutter (2007) collect 71 definitions of intelligence. Many, perhaps especially those from AI researchers, would actually cover a wider set of capabilities or resources than people typically want the term “intelligence” to cover. For example, Legg and Hutter’s own “informal definition” is: “Intelligence measures an agent’s ability to achieve goals in a wide range of environments.” But if you gave me a billion dollars, that would vastly increase my ability to achieve goals in a wide range of environments, even if it doesn’t affect anything we’d typically want to refer to as my “intelligence”.

(Having a billion dollars might lead to increases in my intelligence, if I use some of the money for things like paying for educational courses or retiring so I can spend all my time learning. But I can also use money to achieve goals in ways that don’t look like “increasing my intelligence”.)

I would say that there are many capabilities or resources that increase an agent’s ability to achieve goals in a wide range of environments, and intelligence refers to a particular subset of these capabilities or resources. Some of the capabilities or resources which we don’t typically classify as “intelligence” include wealth, physical strength, connections (e.g., having friends in the halls of power), attractiveness, and charisma. 

“Intelligence” might help in a wider range of environments than those capabilities or resources help in (e.g., physical strength seems less generically useful). And some of those capabilities or resources might be related to intelligence (e.g., charisma), be “exchangeable” for intelligence (e.g., money), or be attainable via intelligence (e.g., higher intelligence can help one get wealth and connections). But it still seems a useful distinction can be made between “intelligence” and other types of capabilities and resources that also help an agent achieve goals in a wide range of environments.

I’m less sure how to explain why some of those capabilities and resources should fit within “intelligence” while others don’t. At least two approaches to this can be inferred from the definitions Legg and Hutter collect (especially those from psychologists): 

  1. Talk about “mental” or “intellectual” abilities
    • But then of course we must define those terms. 
  2. Gesture at examples of the sorts of capabilities one is referring to, such as learning, thinking, reasoning, or remembering.
    • This second approach seems useful, though not fully satisfactory.

An approach that I don’t think I’ve seen, but which seems at least somewhat useful, is to suggest that “intelligence” refers to the capabilities or resources that help an agent (a) select or develop plans that are well-aligned with the agent’s values, and (b) implement the plans the agent has selected or developed. In contrast, other capabilities and resources (such as charisma or wealth) primarily help an agent implement its plans, and don’t directly provide much help in selecting or developing plans. (But as noted above, an agent could use those other capabilities or resources to increase their intelligence, which then helps the agent select or develop plans.)

For example, both (a) becoming more knowledgeable and rational and (b) getting a billion dollars would help one more effectively reduce existential risks. But, compared to getting a billion dollars, becoming more knowledgeable and rational is much more likely to lead one to prioritise existential risk reduction.

I find this third approach useful, because it links to the key reason why I think the distinction between intelligence and other capabilities and resources actually matters. This reason is that I think increasing an agent’s “intelligence” is more often good than increasing an agent’s other capabilities or resources. This is because some agents are well-intentioned yet currently have counterproductive plans. Increasing the intelligence of such agents may help them course-correct and drive faster, whereas increasing their other capabilities and resources may just help them drive faster down a harmful path. 

(I plan to publish a post expanding on that last idea soon, where I’ll also provide more justification and examples. There I’ll also argue that there are some cases where increasing an agent’s intelligence would be bad yet increasing their “benevolence” would be good, because some agents have bad values, rather than being well-intentioned yet misguided.)