Microdooms averted by working on AI Safety

post by Nikola Jurkovic (nikolaisalreadytaken) · 2023-09-17T21:46:06.239Z

This is a link post for https://forum.effectivealtruism.org/posts/DBDpnAhxvRWmfmtfv/microdooms-averted-by-working-on-ai-safety

Disclaimer: the models presented are extremely naive and simple, and assume that existential risk from AI is higher than 20%. You can play around with the models using this (mostly GPT-4 generated) Jupyter notebook.

1 microdoom = 1/1,000,000 probability of existential risk

Diminishing returns model

The model has the following assumptions:

Absolute Risk Reduction: There exists an absolute decrease in existential risk that could be achieved if the AI safety workforce were at an "ideal size." This absolute risk reduction is a parameter in the model. Note that this is an absolute reduction, not a relative one: a 10% absolute reduction means going from 20% x-risk to 10% x-risk, or from 70% x-risk to 60% x-risk.
Current and Ideal Workforce Size: The model also takes into account the current size of the workforce and an "ideal" size (some size that would lead to a much larger decrease in existential risk than the current size), which is larger than the current size. These are both parameters in the model.
Diminishing Returns: The model assumes diminishing returns on adding more people to the AI safety effort. Specifically, the risk reduction is modeled as growing logarithmically with the size of the workforce.

The goal is to estimate the expected decrease in existential risk that would result from adding one more person to the current AI safety workforce. By inputting the current size of the workforce, the ideal size, and the potential absolute risk reduction, the model gives the expected decrease.

If we run this with:

Current size = 350
Ideal size = 100,000
Absolute decrease (between 0 and ideal size) = 20%

we get that one additional career averts 49 microdooms. Because of diminishing returns, the impact from an additional career is very sensitive to how big the workforce currently is.

Graph of how much existential risk is reduced based on the size of the AI safety workforce under the diminishing returns model.
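
As a rough illustration of the diminishing returns model, here is a minimal Python sketch. The exact functional form is an assumption on my part (risk reduction proportional to log(1 + workforce size), scaled so that the ideal size achieves the full absolute decrease); the original notebook may be parameterized differently. The function names are hypothetical.

```python
import math

def risk_reduction(n, ideal_size, total_reduction):
    """Absolute x-risk reduction achieved by a workforce of n people.

    Assumed form: logarithmic in n, equal to 0 at n = 0 and to
    total_reduction at n = ideal_size.
    """
    return total_reduction * math.log1p(n) / math.log1p(ideal_size)

def marginal_microdooms(current_size, ideal_size, total_reduction):
    """Microdooms averted by adding one person to the current workforce."""
    gain = (risk_reduction(current_size + 1, ideal_size, total_reduction)
            - risk_reduction(current_size, ideal_size, total_reduction))
    return gain * 1e6  # 1 microdoom = 1e-6 probability

if __name__ == "__main__":
    # Parameters from the post: 350 current, 100,000 ideal, 20% absolute decrease.
    print(round(marginal_microdooms(350, 100_000, 0.20), 1))
    # Sensitivity to the current workforce size.
    for size in (100, 350, 1_000, 10_000):
        print(size, round(marginal_microdooms(size, 100_000, 0.20), 1))
```

Under these assumptions the marginal career at a workforce of 350 comes out at about 49 microdooms, in line with the figure above, and the loop shows how quickly that number falls as the workforce grows.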

Pareto distribution model

We assume that the impact of professionals in the field follows a Pareto distribution, where 10% of the people account for 90% of the impact.

Model Parameters

Workforce Size: The total number of people currently working in AI safety.
Total Risk Reduction: The absolute decrease in existential risk that the AI safety workforce is currently achieving.

If we run this with:

Current size = 350
Absolute risk reduction (from current size) = 10%

we get that, if you’re a typical current AIS professional (between the 10th and 90th percentile), you avert somewhere between roughly 14 and 270 microdooms. Because of how skewed the distribution is, the mean is 286 microdooms, which is higher than the 90th percentile.

A 10th percentile AI Safety professional reduces x-risk by 14 microdooms
A 20th percentile AI Safety professional reduces x-risk by 16 microdooms
A 30th percentile AI Safety professional reduces x-risk by 20 microdooms
A 40th percentile AI Safety professional reduces x-risk by 24 microdooms
A 50th percentile AI Safety professional reduces x-risk by 31 microdooms
A 60th percentile AI Safety professional reduces x-risk by 41 microdooms
A 70th percentile AI Safety professional reduces x-risk by 61 microdooms
An 80th percentile AI Safety professional reduces x-risk by 106 microdooms
A 90th percentile AI Safety professional reduces x-risk by 269 microdooms
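
For readers who want to experiment, here is a minimal closed-form sketch of a Pareto model in Python. It is not the author's notebook: it assumes a standard Pareto distribution whose shape is derived from the "10% of people account for 90% of impact" rule, with the mean fixed at total risk reduction divided by workforce size. The function names are hypothetical. Because the notebook may parameterize or sample the distribution differently, the percentiles this sketch produces need not match the list above; the mean, which depends only on the two parameters, does come out at about 286 microdooms.

```python
import math

def pareto_shape(top_fraction, top_share):
    """Shape alpha such that the top `top_fraction` of people hold `top_share`.

    For a Pareto distribution with alpha > 1, the top fraction p holds a
    share p ** (1 - 1/alpha) of the total.
    """
    exponent = math.log(top_share) / math.log(top_fraction)  # equals 1 - 1/alpha
    return 1.0 / (1.0 - exponent)

def mean_microdooms(workforce_size, total_reduction):
    """Average impact per person, in microdooms."""
    return total_reduction / workforce_size * 1e6

def quantile_microdooms(q, workforce_size, total_reduction, alpha):
    """Impact of a person at quantile q (0 <= q < 1), in microdooms."""
    mean = mean_microdooms(workforce_size, total_reduction)
    scale = mean * (alpha - 1.0) / alpha  # Pareto minimum x_m implied by the mean
    return scale * (1.0 - q) ** (-1.0 / alpha)

if __name__ == "__main__":
    alpha = pareto_shape(0.10, 0.90)
    print("shape:", round(alpha, 3))
    print("mean:", round(mean_microdooms(350, 0.10), 1))
    for pct in range(10, 100, 10):
        print(pct, round(quantile_microdooms(pct / 100, 350, 0.10, alpha), 1))
```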

Linear growth model

If we simply assume that going from the current 350 people to 10,000 people would decrease x-risk by an absolute 10%, with every added person contributing equally, we get that one additional career averts about 10 microdooms.
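
The arithmetic behind the linear model is simple enough to check directly. A minimal sketch (the function name is hypothetical):

```python
def linear_marginal_microdooms(current_size, target_size, total_reduction):
    """Microdooms averted per added person if the reduction is spread evenly
    over everyone added between current_size and target_size."""
    added_people = target_size - current_size
    return total_reduction / added_people * 1e6  # 1 microdoom = 1e-6

if __name__ == "__main__":
    # Parameters from the post: 350 -> 10,000 people, 10% absolute decrease.
    print(round(linear_marginal_microdooms(350, 10_000, 0.10), 2))
```

With the post's parameters this gives a little over 10 microdooms per additional career.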

One microdoom is A Lot Of Impact

All three models point to the conclusion that one additional AI safety professional decreases existential risk from AI by at least one microdoom.

Because there are 8 billion people alive today, averting one microdoom roughly corresponds to saving 8 thousand current human lives (especially under short timelines, where the meaning of “current” doesn’t change much). If one is willing to pay $5k to save one current human life (roughly how much it costs GiveWell top charities to save one), this amounts to $40M.
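
The conversion from microdooms to lives and dollars can be written out as a short sketch. The population and cost-per-life figures are the ones used in the paragraph above; the function names are hypothetical.

```python
def lives_per_microdoom(population):
    """Expected current lives saved by averting one microdoom (1e-6 probability)."""
    return population / 1_000_000

def dollar_value_per_microdoom(population, cost_per_life):
    """Dollar value of one microdoom at a given willingness to pay per life."""
    return lives_per_microdoom(population) * cost_per_life

if __name__ == "__main__":
    population = 8_000_000_000  # people alive today, as in the post
    cost_per_life = 5_000       # rough GiveWell top-charity figure, as in the post
    print(lives_per_microdoom(population))                        # 8000.0
    print(dollar_value_per_microdoom(population, cost_per_life))  # 40000000.0
```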

One microdoom is also one millionth of the entire future. If we expect our descendants to spread only to the Milky Way galaxy and no other galaxies, then this amounts to roughly 300,000 star systems.

Why doesn’t GiveWell recommend AI Safety organizations?

AI safety as a field is probably much more effective on the margin (with regard to number of people or amount of funding) at saving current human lives than the global health charities GiveWell recommends. I think GiveWell shouldn’t be modeled as wanting to recommend organizations that save as many current lives as possible. I think a more accurate way to model them is “GiveWell recommends organizations that are [within the Overton Window]/[have very sound data to back impact estimates] that save as many current lives as possible.” If GiveWell wanted to recommend organizations that save as many human lives as possible, their portfolio would probably be entirely made up of AI safety orgs.

Because of organizational inertia, and my expectation that GiveWell will stay a global health charity recommendation service, I think it’s very worth thinking about creating a donation recommendation organization that evaluates (or at the very least compiles and recommends) AI safety organizations instead. Something like "The AI Safety Fund", without any other baggage, just plain AI safety. There might be huge increases in interest and funding in the AI safety space, and currently it’s not very obvious where a concerned individual with extra money should donate it.

3 comments

comment by MichaelStJules · 2023-09-17T23:56:56.533Z

Some other possibilities that may be worth considering and can further reduce impact, at least for an individual looking to work on AI safety themself:

Some work is net negative and increases the risk of doom or wastes the time and attention of people who could be doing more productive things.
There are practical limits on the number of people working at a time, e.g. funding and management/supervision capacity. This could mean some people have a much lower probability of making a difference, if their taking a position pushes someone else out of the field, or into (possibly much) less useful work.

↑ comment by MichaelStJules · 2023-09-18T19:19:22.594Z

Also, the estimate of the current number of researchers probably underestimates the number of people (or person-hours) who will work on AI safety. You should probably expect further growth in the number of people working on AI safety, because the topic is getting mainstream coverage and support, Hinton and Bengio have become advocates, and it's being pushed more in EA (funding, community building, career advice).

However, the FTX collapse is reason to believe there will be less funding going forward.

comment by denkenberger · 2024-12-27T02:34:02.355Z

I think a more accurate way to model them is “GiveWell recommends organizations that are [within the Overton Window]/[have very sound data to back impact estimates] that save as many current lives as possible.” If GiveWell wanted to recommend organizations that save as many human lives as possible, their portfolio would probably be entirely made up of AI safety orgs.

Sounds about right - this paper used an older AI Safety model to find $16 to $12,000 per life saved in the present generation. Though I think some other GCR interventions could also compete on that metric, such as neglected work on engineered pandemics, and resilience to food catastrophes.
