If I have some money, whom should I donate it to in order to reduce expected P(doom) the most?

kvmanthinking

If I have some money, whom should I donate it to in order to reduce expected P(doom) the most?

post by KvmanThinking (avery-liu) · 2024-10-03T11:31:19.974Z · LW · GW · 5 comments

This is a question post.

  Answers
    48 Jeremy Gillen
    36 TsviBT
    18 Zach Stein-Perlman
    9 Nathan Helm-Burger
    7 Logan Zoellner
    7 Foyle
    2 lc
    -1 Gesild Muka
None
5 comments

I just want to make sure that when I donate money to AI alignment stuff it's actually going to be used economically

Answers

answer by Jeremy Gillen · 2024-10-03T14:29:36.219Z · LW(p) · GW(p)

The non-spicy answer is probably the LTFF, if you're happy deferring to the fund managers there. I don't know what your risk tolerance for wasting money is, but you can check whether they meet it by looking at their track record.

If you have a lot of time you might be able to find better ways to spend money than the LTFF can. (Like if you can find a good way to fund intelligence amplification as Tsvi said).

↑ comment by MichaelDickens · 2024-10-18T16:32:17.715Z · LW(p) · GW(p)

My perspective is that I'm much more optimistic about policy than about technical research, and I don't really feel qualified to evaluate policy work, and LTFF makes almost no grants on policy. I looked around and I couldn't find any grantmakers who focus on AI policy. And even if they existed, I don't know that I could trust them (like I don't think [LW(p) · GW(p)] Open Phil is trustworthy on AI policy and I kind of buy Habryka's arguments that their policy grants are net negative).

I'm in the process of looking through a bunch of AI policy orgs myself. I don't think I can do a great job of evaluating them but I can at least tell that most policy orgs aren't focusing on x-risk so I can scratch them off the list.

↑ comment by M. Y. Zuo · 2024-10-06T22:38:40.145Z · LW(p) · GW(p)

How does someone view the actual outcomes of the ‘Highlighted Grants’ on that page?

It would be a lot more reassuring if readers can check that they’ve all been fulfilled and/or exceeded expectations.

↑ comment by Birk Källberg · 2025-01-03T22:58:14.139Z · LW(p) · GW(p)

Wanting to answer a very similar question, I’ve just done about a day of donation research into x-risk funds. There are three that have caught my interest:

Long Term Future Fund (LTFF) from EA Funds
- in 2024, LTFF grants have gone mostly to individual TAIS researchers (also some policy folk and very small orgs) working on promising projects. Most are 3- to 12-month stipends between 10k$ and 100k$.
- see their Grants Database for details
Emerging Challenges Fund (ECF) - Longview Philanthropy
- gives grants to orgs in AIS, biorisk and nuclear. funds both policy work (diplomacy, laws, advocacy) and technical work (TAIS research, technical bio-safety)
- see their 2024 Report for details
Global Catastrophic Risks Fund (GCR Fund) - Founders Pledge
- focuses on prevention of great power conflicts
- their grants cover things like US-China diplomacy efforts on nuclear, AI and autonomous weapons issues. also biorisk strategy and policy work.

A very rough estimate on LTFF effectiveness (how much p(doom) does $1 reduce?):

The article Microdooms averted by working on AI Safety [LW · GW] uses a simple quantitative model to estimate that one extra AIS researcher will avert 49 microdooms on average at current margins.
Considering only humanities current 8B people, this would mean 400,000 current people saved in expectation by each additional researcher. Note that depending on parameter choices, the model’s result could easily go up or down an order of magnitude.

The rest are my calculations:

optimistic case: the researcher has all their impact in the first year and only requires a yearly salary of $80k. This would imply 0.6 nanodooms / $ or 5 current people saved / $.
pessimistic case: the researcher takes 40 years (a full career) to have that impact and big compute and org. staffing costs mean their career costs 10x their salary. This implies a 400x lower effectiveness, i.e. 1.5 picodooms / $ or 0.012 current people saved / $ or 80$ to save a person.

For me at least, this actually looks like quite promising results! I now think of “Funding an extra AIS researcher” as a baseline to compare other X-risk interventions too.

One can do better than that: Finding and supporting especially talented researchers or ones working on especially promising avenues should be a lot more effective than funding the average AIS researcher. This is exactly what LTFF is doing right now.

The other two funds seem to focus more on finding and supporting especially promising policy efforts on the organizational level. Their picks seem to me as potentially even more promising than LTFF, but I currently have no way to model this so that’s just my current intuition.

I intend to start donating to one of these three funds as a consequence of these findings.

answer by TsviBT · 2024-10-03T11:57:59.143Z · LW(p) · GW(p)

You probably shouldn't donate to alignment research. There's too much useless stuff with too good PR for you to tell what, if anything, is hopeworthy. If you know any young supergenius people who could dedicate their lifeforce to thinking about alignment FROM SCRATCH given some money, consider giving to them.

If there's some way to fund research that will lead to strong human intelligence amplification, you should do that. I can give some context for that, though not concrete recommendations.

↑ comment by jacquesthibs (jacques-thibodeau) · 2024-10-06T12:48:33.211Z · LW(p) · GW(p)

Just to clarify, do you only consider 'strong human intelligence amplification' through some internal change, or do you also consider AIs to be part of that? As in, it sounds like you are saying we currently lack the intelligence to make significant progress on alignment research and consider increasing human intelligence to be the best way to make progress. Are you also of the opinion that using AIs to augment alignment researchers and progressively automate alignment research is doomed and not worth consideration? If not, then here [LW(p) · GW(p)].

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-06T13:17:55.844Z · LW(p) · GW(p)

Not strictly doomed but probably doomed, yeah. You'd have to first do difficult interesting novel philosophy yourself, and then look for things that would have helped with that [LW(p) · GW(p)].

Replies from: jacques-thibodeau

↑ comment by jacquesthibs (jacques-thibodeau) · 2024-10-06T13:46:49.426Z · LW(p) · GW(p)

Fair enough. For what it's worth, I've thought a lot about the kind of thing you describe in that comment and partially committing to this direction because I feel like I have enough intuition and insight that those other tools for thought failed to incorporate.

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-06T13:51:39.261Z · LW(p) · GW(p)

Sure. You can ping me at some point if you want to talk about ideas for that or get my feedback or whatever.

↑ comment by RHollerith (rhollerith_dot_com) · 2024-10-03T13:39:02.673Z · LW(p) · GW(p)

TsviBT didn't recommend MIRI probably because he receives a paycheck from MIRI and does not want to appear self-serving. I on the other hand have never worked for MIRI and am unlikely ever to (being of the age when people usually retire) so I feel free to recommend MIRI without hesitation or reservation.

MIRI has abandoned hope of anyone's solving alignment before humanity runs out of time: they continue to employ people with deep expertise in AI alignment, but those employees spend their time explaining why the alignment plans of others will not work.

Most technical alignment researchers are increasing P(doom) because they openly publish results that help both the capability research program and the alignment program, but the alignment program is very unlikely to reach a successful conclusion before the capability program "succeeds", so publishing the results only shortens the amount of time we have to luck into an effective response or resolution to the AI danger (which again if one appears might not even involve figuring out how to align an AI so that it stays aligned as it becomes an ASI).

There are 2 other (not-for-profit) organizations in the sector that as far as I can tell are probably doing more good than harm, but I don't know enough about them for it to be a good idea for me to name them here.

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-03T13:52:06.685Z · LW(p) · GW(p)

I'm no longer employed by MIRI. I think Yudkowsky is by far the best source of technical alignment research insight; but MIRI's research program was in retrospect probably pretty doomed even before I got there. I can see ways to improve it but I'm not that confident and I can somewhat directly see that I'm probably not capable of carrying out my suggested improvements. And AFAIK, as you say they're not currently doing very much alignment research. I'm also fine with appearing self-serving; if I were actively doing alignment research, I might recommend myself, though I don't really think it's appropriate to do so to a random person who can't evaluate arguments about alignment research and doesn't know who to trust. I guess if someone pays me enough I'll do some alignment research. I recommend myself as one authority among others on strategy regarding strong human intelligence amplification.

Replies from: rhollerith_dot_com, sam-iacono

↑ comment by RHollerith (rhollerith_dot_com) · 2024-10-06T16:20:18.108Z · LW(p) · GW(p)

I'm not saying that MIRI has some effective plan which more money would help with. I'm only saying that unlike most of the actors accepting money to work in AI Safety, at least they won't use a donation in a way that makes the situation worse. Specifically, MIRI does not publish insights that help the AI project and is very careful in choosing whom they will teach technical AI skills and knowledge.

Replies from: ryan_greenblatt

↑ comment by ryan_greenblatt · 2024-10-06T16:53:43.993Z · LW(p) · GW(p)

at least they won't use a donation in a way that makes the situation worse

Seems false, they could have problematic effects on discourse if their messaging is poor or seems dumb in retrospect.

I disagree pretty heavily with MIRI which makes this more likely from my perspective.

It seems likely that Yudkowsky has lots of bad effects on discourse right now even from his own lights. I feel pretty good about official MIRI comms activities from my understanding despite a number of disagreements.

↑ comment by Sam Iacono (sam-iacono) · 2024-10-04T16:01:32.835Z · LW(p) · GW(p)

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-04T16:28:42.439Z · LW(p) · GW(p)

Not sure what you're asking. I think someone trying to work on the technical problem of AI alignment should read Yudkowsky. I think this because... of a whole bunch of the content of ideas and arguments. Would need more context to elaborate, but it doesn't seem like you're asking about that.

Replies from: sam-iacono

↑ comment by Sam Iacono (sam-iacono) · 2024-10-04T18:31:56.726Z · LW(p) · GW(p)

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-04T18:47:34.879Z · LW(p) · GW(p)

I still don't know what you mean.

↑ comment by Saul Munn (saul-munn) · 2024-10-03T21:34:00.389Z · LW(p) · GW(p)

I can give some context for that

please do!

Replies from: TsviBT, TsviBT

↑ comment by TsviBT · 2024-10-08T08:39:41.284Z · LW(p) · GW(p)

https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods [LW · GW]

↑ comment by TsviBT · 2024-10-03T21:45:00.251Z · LW(p) · GW(p)

DM'd

Replies from: SaidAchmiz

↑ comment by Said Achmiz (SaidAchmiz) · 2024-10-05T21:11:12.136Z · LW(p) · GW(p)

Would you mind posting that information here? I am also interested (as, I’m sure, are others).

Replies from: TsviBT

↑ comment by TsviBT · 2024-10-08T08:39:13.631Z · LW(p) · GW(p)

https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods [LW · GW]

answer by Zach Stein-Perlman · 2024-10-03T19:10:40.508Z · LW(p) · GW(p)

I am excited about donations to all of the following, in no particular order:

AI governance
- GovAI (mostly research) [actually I haven't checked whether they're funding-constrained]
- IAPS (mostly research)
- Horizon (field-building)
- CLTR (policy engagement)
- Edit: also probably The Future Society (policy engagement, I think) and others but I'm less confident
LTFF/ARM
Lightcone

↑ comment by MichaelDickens · 2024-10-18T16:43:00.406Z · LW(p) · GW(p)

I was recently looking into donating to CLTR and I'm curious why you are excited about it? My sense was that little of its work was directly relevant to x-risk (for example this report on disinformation is essentially useless for preventing x-risk AFAICT), and the relevant work seemed to be not good or possibly counterproductive. For example their report on "a pro-innovation approach to regulating AI" seemed bad to me on two counts:

There is a genuine tradeoff between accelerating AI-driven innovation and decreasing x-risk. So to the extent that this report's recommendations support innovation, they increase x-risk, which makes this report net harmful.
The report's recommendations are kind of vacuous, e.g. they recommend "reducing inefficiencies", like yes, this is a fully generalizable good thing but it's not actionable.

(So basically I think this report would be net negative if it wasn't vacuous, but because it's vacuous, it's net neutral.)

This is the sense I get as someone who doesn't know anything about policy and is just trying to get the sense of orgs' work by reading their websites.

Replies from: Zach Stein-Perlman

↑ comment by Zach Stein-Perlman · 2024-10-18T16:49:31.565Z · LW(p) · GW(p)

I don't know. I'm not directly familiar with CLTR's work — my excitement about them is deference-based. (Same for Horizon and TFS, mostly. I inside-view endorse the others I mention.)

answer by Nathan Helm-Burger · 2024-10-03T17:12:20.723Z · LW(p) · GW(p)

Maybe it's somewhat in bad taste to propose a project I am involved in, but I think that Max Harm's and Seth Herd's ideas on Corrigibility / DWIMAC need support. Ideally, in my eyes, an org focused specifically on it.

See Corrigibility as Singular Target series for details.

answer by Logan Zoellner · 2024-10-06T22:22:33.758Z · LW(p) · GW(p)

This is going to be an unpopular answer, but you should invest it in a fund you personally control that is pretty much equally balanced between: Google, Microsoft, Tesla, Apple and Amazon.

This maximizes the leverage you will have at the critical moment (which is not now).

answer by Foyle · 2024-10-05T09:51:46.539Z · LW(p) · GW(p)

I think there is far too much focus on technical approaches, when what is needed is a more socio-political focus. Raising money, convincing deep pockets of risks to leverage smaller sums, buying politicians, influencers and perhaps other groups that can be coopted and convinced of existential risk to put a halt to Ai dev.

It amazes me that there are huge, well financed and well coordinated campaigns for climate, social and environmental concerns, trivial issues next to AI risk, and yet AI risk remains strictly academic/fringe. What is on paper a very smart community embedded in perhaps the richest metropolitan area the world has ever seen, has not been able to create the political movement needed to slow things up. I think precisely because they pitching to the wrong crowd.

Dumb it down. Identify large easily influenceable demographics with a strong tendency to anxiety that can be most readily converted - most obviously teenagers, particularly girls and focus on convincing them of the dangers, perhaps also teachers as a community - with their huge influence. But maybe also the elederly - the other stalwart group we see so heavily involved in environmental causes. It would have orders of magnitude more impact than current cerebral elite focus, and history is replete with revolutions borne out of targeting conversion of teenagers to drive them.

↑ comment by KvmanThinking (avery-liu) · 2024-10-28T17:43:48.219Z · LW(p) · GW(p)

particularly girls

why!?

Replies from: AliceZ

↑ comment by ZY (AliceZ) · 2024-10-28T21:12:18.407Z · LW(p) · GW(p)

I don't understand either. If it is meant what it meant, this is a very biased perception and not very rational (truth seeking or causality seeking). There should be better education systems to fix that.

answer by lc · 2024-10-08T22:20:44.319Z · LW(p) · GW(p)

The Center on Long Term Risk is absurdly underfunded, but they focus on S-risks and not X-risks.

answer by Gesild Muka · 2024-10-23T01:21:30.832Z · LW(p) · GW(p)

Maybe there's a way to hedge against P(doom) by investing in human prosperity and proliferation while discouraging large leaps in tech. Maybe your money should go towards encouraging or financing low tech high fertility communities?

5 comments

Comments sorted by top scores.

comment by Tamsin Leake (carado-1) · 2024-10-03T13:10:44.192Z · LW(p) · GW(p)

In my opinion the hard part would not be figuring out where to donate to {decrease P(doom) a lot} rather than {decrease P(doom) a little}, but figuring out where to donate to {decrease P(doom)} rather than {increase P(doom)}.

Replies from: avery-liu

↑ comment by KvmanThinking (avery-liu) · 2024-10-03T22:26:42.668Z · LW(p) · GW(p)

so, don't donate to people who will take my money and go buy OpenAI more supercomputers while thinking that they're doing a good thing?

and even if I do donate to some people who work on alignment, they might publish it and make OpenAI even more confident that by the time they finish we'll have it under control?

or some other weird way donating might increase P(doom) that I haven't even thought of?

that's a good point

now i really don't know what to do

comment by Mikhail Samin (mikhail-samin) · 2024-10-05T07:42:21.125Z · LW(p) · GW(p)

Do you want to donate to alignment specifically? IMO AI governance efforts are significantly more p(doom)-reducing than technical alignment research; it might be a good idea to, e.g., donate to MIRI, as they’re now focused on comms & governance.

comment by Charlie Steiner · 2024-10-03T15:27:43.590Z · LW(p) · GW(p)

If you don't just want the short answer of "probably LTFF" and want a deeper dive on options, Larks' review [LW · GW] is good if (at this point) dated.

comment by Mitchell_Porter · 2024-10-07T06:58:47.088Z · LW(p) · GW(p)

Let me first say what I think alignment (or "superalignment") actually requires. This is under the assumption that humanity's AI adventure issues in a superintelligence that dominates everything, and that the problem to be solved is how to make such an entity compatible with human existence and transhuman flourishing. If you think the future will always be a plurality of posthuman entities, including enhanced former humans, with none ever gaining an irrevocable upper hand (e.g. this seems to be one version of e/acc); or if you think the whole race towards AI is folly and needs to be stopped entirely; then you may have a different view.

I have long thought of a benevolent superintelligence as requiring three things: superintelligent problem-solving ability; the correct "value system" (or "decision procedure", etc); and a correct ontology (and/or the ability to improve its ontology). The first two criteria would not be surprising, in the small world of AI safety that existed before the deep learning revolution. They fit a classic agent paradigm like the expected utility maximizer; alignment (or Friendliness, as we used to say), being a matter of identifying the right utility function.

The third criterion is a little unconventional, and my main motive for it even more so, in that I don't believe the theories of consciousness and identity that would reduce everything to "computation". I think they (consciousness and identity) are grounded in "Being" or "substance", in a way that the virtual state machines of computation are not; that there really is a difference between a mind and a simulation of a mind, for example. This inclines me to think that quantum holism is part of the physics of mind, but that thinking of it just as physics is not enough, you need a richer ontology of which physics is only a formal description; but these are more like the best ideas I've had, than something I am absolutely sure is true. I am much more confident that purely computational theories of consciousness are radically incomplete, than as to what the correct alternative paradigm is.

The debate about whether the fashionable reductionist theory of the day is correct, is as old as science. What does AI add to the mix? On the one hand, there is the possibility that an AI with the "right" value system but the wrong ontology, might do something intended as benevolent, that misses the mark because it misidentifies something about personhood. (A simple example of this might be, that it "uploads" everyone to a better existence, but uploads aren't actually conscious, they are just simulations.) On the other hand, one might also doubt the AI's ability to discover that the ontology of mind, according to which uploads are conscious, is wrong, especially if the AI itself isn't conscious. If it is superintelligent, it may be able to discover a mismatch between standard human concepts of mind, extrapolated in a standard way, and how reality actually works; but lacking consciousness itself, it might also lack some essential inner guidance on how the mismatch is to be corrected.

This is just one possible story about what we could call a philosophical error in the AI's cognition and/or the design process that produced it. I think it's an example of why Wei Dai regards metaphilosophy as an important issue for alignment. Metaphilosophy is the (mostly philosophical) study of philosophy, and includes questions like, what is philosophical thought, what characterizes correct philosophical thought, and, how do you implement correct philosophical thought in an AI? Metaphilosophical concerns go beyond my third criterion, of getting ontology of mind correct; philosophy could also have something to say about problem-solving and about correct values, and even about the entire three-part approach to alignment with which I began.

So perhaps I will revise my superalignment schema and say: a successful plan for superalignment needs to produce problem-solving superintelligence (since the superaligned AI is useless if it gets trampled by a smarter unaligned AI), a sufficiently correct "value system" (or decision procedure or utility function), and some model of metaphilosophical cognition (with particular attention to ontology of mind).

If I have some money, whom should I donate it to in order to reduce expected P(doom) the most?

Contents

Answers

5 comments