AI Safety field-building projects I'd like to see
post by Akash (akash-wasil) · 2022-09-11T23:43:32.031Z · LW · GW · 8 comments
People sometimes ask me what types of AIS field-building projects I would like to see.
Here’s a list of 11 projects.
Background points/caveats
But first, a few background points.
- These projects require people with specific skills/abilities/context in order for them to go well. Some of them also have downside risks. This is not a “list of projects Akash thinks anyone can do” but rather a “list of projects that Akash thinks could Actually Reduce P(Doom) if they were executed extremely well by an unusually well-qualified person/team.”
- I strongly encourage people to reach out to experienced researchers/community-builders before doing big versions of any of these. (You may disagree with their judgment, but I think it’s important to at least have models of what they believe before you do something big.)
- This list represents my opinions. As always, you should evaluate these ideas for yourself.
- If you are interested in any of these, feel free to reach out to me. If I can’t help you, I might know someone else who can.
- Reminder that you can apply for funding from the Long-Term Future Fund. You don’t have to apply to execute a specific project. You can apply for career exploration grants, grants that let you think about what you want to do next, and grants that allow you to test out different hypotheses/uncertainties.
- I sometimes use the word “organization”, which might make it seem like I’m talking about 10+ people doing something over the course of several years. But I actually mean “I think a team of 1-3 people could probably test this out in a few weeks and get something ambitious started here within a few months if they had relevant skills/experiences/mentorship.”
- These projects are based on several assumptions about AI safety, and I won’t be able to articulate all of them in one post. Some assumptions include “AIS is an extremely important cause area” and “one of the best ways to make progress on AI safety is to get talented people working on technical research.” If I’m wrong, I think I’m wrong because I’m undervaluing non-technical interventions that could buy us more time (e.g., strategies in AI governance/strategy or strategies that involve outreach to leaders of AI companies). I plan to think more about those in the upcoming weeks.
Some projects I am excited about
Global Talent Search for AI Alignment Researchers
Purpose: Raise awareness about AI safety around the world to find highly talented AI safety researchers.
How this reduces P(doom): Maybe there are extremely promising researchers (e.g., people like Paul Christiano and Eliezer Yudkowsky) out in the world who don’t know about AI alignment or don’t know how to get involved. One global talent search program could find them. Alternatively, maybe we need 1000 full-time AI safety researchers who are 1-3 tiers below “alignment geniuses”. A separate global talent search program could find them.
Imaginary example: Crossover between the Atlas Fellowship, old CFAR, and MIRI. I imagine an organization that offers contests, workshops, and research fellowships in order to attract talented people around the world.
Skills needed: Strong models of community-building, strong understanding of AI safety concepts, really good ways of evaluating who is promising, good models of downside risks when conducting broad outreach
Olivia Jimenez and I are currently considering working on this. Please feel free to reach out if you have interest or advice.
Training Program for AI Alignment researchers
Purpose: Provide excellent training, support, internships, and mentorship for junior AI alignment researchers.
How this reduces P(doom): Maybe there are people who would become extremely promising researchers if they were provided sufficient support and mentorship. This program mentors them.
Imaginary example: Something like a big version of SERI-Mats with a strong emphasis on workshops/activities that help people develop strong inside views & strong research taste. (My impression is that SERI-Mats could become this one day, but I’d also be excited to see more programs “compete” with SERI-Mats).
Skills needed: Relationships with AI safety researchers, strong models of mentors, strong ability to attract and assess applicants, insight into how to pair mentors with mentees, good models of AI safety, good models of how to create organizations with epistemically rigorous cultures, good models of downside risks when conducting broad outreach.
Research Infrastructure & Coordination for AI alignment
Purpose: Provide excellent support for AI alignment researchers in major EA Hubs.
Imaginary example: Something like a big version of Lightcone Infrastructure that runs something like Bell Labs, regularly hosts high-quality events/workshops for AI alignment researchers, or accelerates research progress through alignment newsletters, podcasts, and debates (my impression is that Lightcone or Constellation could become this one day, but I’d be excited to see people try parts of this on their own).
Skills needed: Strong relationships with AI safety researchers, strong understanding of the AI safety community and its needs, and strong understanding of AI safety concepts. Very high context would be required to run a space; medium context would be required to perform the other projects.
I am currently considering starting an AI alignment podcast or newsletter. Please feel free to reach out if you have interest or advice.
Superconnecting: Active Grantmaking + Project Incubation
Purpose: Identify highly promising people who are already part of the EA community and get them funding/connections/mentorship to do AIS research or launch important/ambitious projects.
How this reduces P(doom): Maybe there are people who would become extremely promising researchers or ambitious generalists who are already part of the EA community but haven’t yet received the support, encouragement, or mentorship required to reach their potential.
Imaginary example: Crossover between the FTX Future Fund’s regranting program, a longtermist incubator, and CEA’s active stewardship vision. I envision a group of “superconnectors” who essentially serve as talent scouts for the EA community. They go to EA Globals and run retreats/workshops for new EAs, as well as highly-skilled EAs who aren’t currently doing highly impactful work. They provide grants for people (or encourage people to apply for funding) to skill up in AI safety or launch ambitious projects.
Skills needed: Strong models of community-building, large network or willingness to develop a large network, strong models of how to identify which people and projects are most promising, strong people skills/people judgment.
Targeted Outreach to Experienced Researchers
Purpose: Identify highly promising researchers in academia and industry, engage them with high-quality AI safety content, and support those who decide to shift their careers/research toward technical AIS.
How this reduces P(doom): Maybe there are extremely talented researchers who can already be identified based on their contributions in fields related to AI alignment (e.g., math, decision theory, probability theory, CS, philosophy) and/or their contributions to messy and pre-paradigmatic fields of research.
Imaginary example: An organization that systematically reads research in relevant fields, identifies promising researchers, and designs targeted outreach strategies to engage these researchers with high-quality sources in AI alignment research. The Center for AI Safety and the AI Safety Field Building Hub [EA · GW] may do some of this, though they’re relatively new, and I’d be excited for more people to support them or compete with them.
Skills needed: Strong understanding of how to communicate with researchers, strong models of potential downside risks, strong understanding of AI safety concepts, good models of academia and “the outside world”, good people skills.
Note that people considering this are strongly encouraged to reach out to community-builders and AI safety researchers before conducting outreach to experienced researchers.
People interested in this may also wish to read the Pragmatic AI Safety Sequence [? · GW] and should familiarize themselves with potential risks associated with outreach to established researchers. Note that people disagree about how to weigh upside potential against downside risks, and “thinking for yourself” would be especially important here.
Understanding AI trends and AI safety outreach in China
Purpose: Understand the AI scene in China, conduct research about if/how AIS outreach should be conducted in China, deconfuse EA about AIS in China, and potentially pilot AIS outreach efforts in China.
How this reduces P(doom): Maybe there are effective ways to reach out to talented people in China in ways that sufficiently mitigate downside risks. My current impression is that China is one of the leaders in AI, and it seems plausible that China would have a lot of highly talented people who could contribute to technical AIS research. However, I’ve heard that AIS outreach in China has been neglected because EA leaders don’t understand China and don’t understand how to evaluate different kinds of outreach strategies in China (hence the focus on research/deconfusion/careful pilots).
Imaginary example: A think tank-style research group that develops strong models of a specific topic.
Skills needed: Strong understanding of China, fluency in Mandarin, strong ability to weigh upside potential and downside risks.
AIS Contests and Subproblems
Purpose: Identify (or develop) subproblems in alignment & turn these into highly-advertised contests.
How this reduces P(doom): Maybe there are subproblems in AI alignment that could be solved by researchers outside of the AI x-risk community. Alternatively, maybe contests are an effective way to get smart people interested in AI x-risk.
Imaginary example: An organization that gets really good at creating contests based on problems like ELK [LW · GW] and The Shutdown Problem (among other examples) & then advertising these contests heavily.
Skills needed: Ideally a strong understanding of AI safety and the ability to identify and write up subproblems. But I think this could work if someone were working closely with AI safety researchers to select & present subproblems.
Writing that explains AI safety to broader audiences
Purpose: Write extremely clear, engaging, and persuasive explanations of AI safety ideas.
How this reduces P(doom): There are not many introductory resources that clearly explain the importance of AI safety. Maybe there are people who would engage with AI safety if we had better introductory resources.
Imaginary example: A crossover between Nick Bostrom, Will MacAskill, Holden Karnofsky, and Eliezer Yudkowsky. A book or blog that is as rigorous as Bostrom’s writing (Superintelligence), as popular as Will’s writing (NYT bestseller with media attention), as clear as Holden’s writing (Cold Takes), and as explicit about x-risk as Yudkowsky’s writing (e.g., List of Lethalities).
Skills needed: Ideally a strong understanding of AI safety, but I think writing ability is probably the more important skill. In theory, someone with exceptional writing ability could work closely with AI safety researchers to select the most important topics/concepts and ensure that the descriptions/explanations are accurate. Also, strong models of potential downside risks of broad outreach.
Other projects I am excited about (though not as excited)
- Operations org: Something that helps train aligned/competent EAs to be really good at operations. My rough sense is that many projects are bottlenecked by ops capacity. Note that sometimes people think “ops” just means stuff like “cleaning” and “making sure food arrives on time” and “doing boring stuff.” I think the bigger bottlenecks are in things like “having such a strong understanding of the mission that you know which tasks to prioritize”, “noticing what the major bottlenecks are”, and “having enough context to consistently do ops tasks that amplify the organization.”
- EA Academy: Take a bunch of promising young/junior EAs and turn them into awesome ambitious generalists. Something that helps people skill-up in AIS, management, community-building, applied rationality, and other useful stuff. Sort of like a crossover between Icecone (the winter-break retreat that Lightcone Infrastructure organized) and CFAR with more of an emphasis on long-term career plans.
- Amplification Org: Figure out how to amplify the Most Impactful People™. Help them find therapists, PAs, nutritionists, friends, etc. Solve problems that come up in their lives. Save them time and make them more productive. Figure out how to give Eliezer Yudkowsky 2 extra productive hours each week or how to make Paul Christiano 1.01-1.5X more effective.
I am grateful to Olivia Jimenez, Thomas Larsen, Miranda Zhang, and Joshua Clymer for feedback.
8 comments
comment by Vael Gates · 2022-09-12T03:02:13.660Z · LW(p) · GW(p)
Re: "Targeted Outreach to Experienced Researchers"
Please apply to work with the aforementioned AISFB Hub [EA · GW]! I am actively trying to hire people who I think would be good fits for this type of role, and to offer mentorship / funding / access to and models of the space. Note that you'll need AI safety knowledge (for example, I want you to have read / have a plan for reading all of the main readings in the AGISF Technical Curriculum) and high generalist competence, as two of the most important qualifications.
I think most people will not be a good fit for this role (there's more complicated status hierarchies and culture within experienced researchers than are visible at first glance), and like Akash I caution against unilateral action here. I'm psyched about meeting people who are good fits, however, and urge you to apply to work with me if you think that could be you!
comment by dmav · 2022-09-12T11:54:17.211Z · LW(p) · GW(p)
As you kind of say - there are already (at least decently smart/competent) people trying to do (almost) all of these things. For many of these projects, joining current efforts is probably a better allocation than starting your own effort, and most of the value to be added is if you're in the 99.5th+ %-ile (?) for the 'skills needed.' (or sometimes there's just not enough people working on a problem, or sometimes there's a place to add value if you're willing to do annoying work other people don't want to do - these are both rarer though, in the current funding regime)
Something I'd add to this list (or at least the bottom?) that I've heard a couple people mention would be useful is a nonprofit (regranting-like?) org whose primary goal is to hire international independent researchers in the Berkeley area and provide them with visas
comment by Akash (akash-wasil) · 2022-09-12T18:57:53.353Z · LW(p) · GW(p)
I disagree with part of this-- I think people often see 1 or 2 projects in a space and assume the space is covered. I also think people are generally overrating the orgs that currently exist.
In general, I expect that if there are 3-7 teams focused on a particular issue, 0-2 will be successful.
I think some orgs are worth joining and some people are better off joining existing orgs/projects. But I think it's very easy to see something that looks-like-it's-solving-the-problem (and maybe its mission statement even says it's going to solve the problem) but forget that Solving The Problem Is Really Hard and Many Organizations Don't Live Up To Their Mission Statement.
(I may write up a longer thing about this at some point with more details. But for now take this as "Akash's intuitions/observations.")
Also I agree with the part about how much of the value is if you're 99+ %ile for skills needed (though I think people in that category are often deterred from doing things because they assume they're not in that category. I think people imagine that 99+ %ile means you're already some superhero with highly relevant experiences and tons of status. See also hero licensing [LW · GW]). So on the margin I would want more people trying to test out if they are indeed 99+ %ile at a few different things.
comment by Severin T. Seehrich (sts) · 2022-10-04T20:43:20.850Z · LW(p) · GW(p)
This is a rallying flag: Respond/message me if you can imagine working on the Superconnecting project. Especially if you are based in Europe, but not exclusively then.
The larger part of Ithaka Berlin [EA · GW]'s expected impact comes from fulfilling this function. However, I'd also be super keen to help build non-co-living-versions of the Superconnecting project, whether as co-founder, advisor, or the person who connected the people who end up building the thing.
comment by Chris_Leong · 2022-09-12T12:56:36.497Z · LW(p) · GW(p)
I'm particularly excited about Global Talent Search. When I was recently thinking about whether there might be something higher impact than the local AI safety field-building I'm currently pursuing, Global Talent Search came up, but I don't feel qualified to pursue these kinds of projects.
I'm already doing some within-EA connecting. If my current efforts bear fruit, then I would be tempted to step it up.
I think it makes sense to think about some kind of academy and how we should be training up prospective alignment researchers at scale.
comment by Bryce Robertson (bryceerobertson) · 2024-08-06T18:45:13.273Z · LW(p) · GW(p)
Hey Akash, I'm considering adding these to AISafety.com/projects but I'm conscious that this post is two years old – is there anything you would change since you first wrote it?
comment by JakubK (jskatt) · 2022-09-18T01:22:04.471Z · LW(p) · GW(p)
Reposting part of my comment from the EA Forum [EA · GW] since I'm confused why you're excited about this particular idea.
Something like a big version of SERI-Mats ... (My impression is that SERI-Mats could become this one day, but I’d also be excited to see more programs “compete” with SERI-Mats).
At EAG-SF I asked a MATS organizer if we could get other versions of MATS, e.g. a MATS competitor at MIT. Their response was that only one of the two could survive because there are currently only ~15 people capable of doing this kind of mentorship. Mentors are the bottleneck for scaling up programs like MATS, not field builders.
comment by Akash (akash-wasil) · 2022-09-18T08:33:09.312Z · LW(p) · GW(p)
Great question! I agree that the mentorship bottleneck is important & any new programs will have to have a plan for how to deal with it. I have four specific thoughts:
First, I'd be excited to see more programs that experiment with different styles of mentorship/training. MATS has an apprenticeship model (each mentee gets assigned to a mentor). There are lots of other models out there (e.g., Refine [LW · GW]), and some of these involve a lower need for mentors.
Second, I would be surprised if any single program was able to fully tap into the pool of possible mentors. My impression is that there are a lot of "medium-experience" alignment researchers who could already take mentees. (In graduate school, for instance, it's common for experienced graduate students to mentor undergraduates.)
Third, different programs can tap into different target audiences. Ex: PIBBSS and Philosophy Fellowship.
Fourth, I (tentatively/not with a ton of confidence) think that it's desirable to have more competition. I'm generally worried about people thinking "oh X program exists, so I shouldn't do anything in that space." It's trickier with mentorship programs (because of the mentor bottleneck), but I wouldn't be too surprised if some mentors were open to experimenting with new programs. I also wouldn't be surprised if, say, 5 years from now, there was an entirely different program that was the dominant player in the alignment mentorship space. (Or if MATS was still the dominant player but it looked extremely different).
(Thanks for your comment by the way. I think this nudged me to be more specific with what I meant, and point #4 is the only one that's explicitly about direct competition).