11 heuristics for choosing (alignment) research projects

post by Akash (akash-wasil), danesherbs (dane-sherburn) · 2023-01-27T00:36:08.742Z · LW · GW · 5 comments

I recently spoke with Dane Sherburn about some of the most valuable things he learned as a SERI-MATS scholar.

Here are 11 heuristics he uses to prioritize between research projects:

  1. Impact: Can I actually tell myself a believable story in which this project reduces AI x-risk? (Or better yet: can I build a guesstimate model that helps me estimate the microdooms averted by this project?)
  2. Clarity of research question: Can I easily explain my core research question in a few sentences?
  3. Relevance of research approach: Will my research project actually help me reduce uncertainty on my research question? When I imagine the possible results, are there scenarios where I actually update? Or do I already know (with high probability) what I’m likely to learn?
  4. Mentorship: Would my mentor be able to give me meaningful guidance on this project? If not, would I be able to find one who could?
  5. Feedback loops: Will I be able to get feedback within the first week? First day? Will I have to wait several weeks or months before I know if things are working?
  6. Efficiency: How efficiently will I be able to collect information or run experiments? Will I need to spend a lot of time fine-tuning models? Is there a way to do something similar with pretrained models, so I can run experiments 10-100X more quickly?
  7. Resources: Will this project need datasets? Large models? Compute? Money? How likely is it that I’ll get the resources I need, and how long will it take?
  8. Excitement: How much does the project subjectively excite me? Do I feel energized about the project?
  9. Timespan: How long would it take to do this project? Would it fit into a window of time that I’m actually willing to devote to it?
  10. Downsides/capabilities externalities: To what extent does the project have capabilities externalities? Could it increase x-risk?
  11. Leaveability: How easy would it be to leave this project if I realize it’s not working out, or if I find something better?
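Heuristic 1 mentions building a guesstimate model of microdooms averted. As a minimal sketch (all distributions and numbers below are hypothetical illustrations, not from the post), such a back-of-the-envelope model might look like:

```python
import random

# Toy guesstimate-style model (hypothetical numbers):
# microdooms averted ~= P(project succeeds) * P(result gets used) * x-risk reduction if used.
def sample_microdooms_averted():
    p_success = random.uniform(0.05, 0.3)        # chance the project yields a real result
    p_adopted = random.uniform(0.01, 0.1)        # chance the result influences practice
    risk_reduction = random.uniform(1e-6, 1e-4)  # absolute x-risk reduction if adopted
    return p_success * p_adopted * risk_reduction * 1e6  # doom-fraction -> microdooms

samples = sorted(sample_microdooms_averted() for _ in range(100_000))
print(f"median estimate: {samples[len(samples) // 2]:.4f} microdooms averted")
```

Even with made-up inputs, writing the model down forces the "believable story" to be explicit: each factor is a claim you can argue about separately.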

5 comments


comment by Raemon · 2023-01-27T01:44:40.622Z · LW(p) · GW(p)

Man I really like how short this post is.

comment by WilliamKiely · 2023-01-27T05:40:49.469Z · LW(p) · GW(p)

Re: 1: Do Dane's Guesstimate models ever yield >1 microdoom estimates for solo research projects? That sounds like a lot.

Replies from: WilliamKiely
comment by WilliamKiely · 2023-01-27T05:41:14.068Z · LW(p) · GW(p)

IIRC Linch estimated in an EA Forum post that we should spend up to ~$100M to reduce x-risk by 1 basis point, i.e. ~$1M per microdoom. Maybe nanodooms would be a better unit.

Replies from: conor-sullivan
comment by Lone Pine (conor-sullivan) · 2023-01-27T10:24:55.722Z · LW(p) · GW(p)

If your efforts improve the situation by 1 nanodoom, you've saved 8 people alive today.
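The unit arithmetic in these two comments can be checked in a few lines (the ~8 billion world population behind the "8 people" figure is an assumption):

```python
# Doom-unit conversions for the comments above.
DOOM = 1.0
basis_point = DOOM * 1e-4   # 1 bp of x-risk
microdoom = DOOM * 1e-6
nanodoom = DOOM * 1e-9
world_pop = 8e9             # assumed world population

# $100M per basis point implies ~$1M per microdoom:
usd_per_microdoom = 100e6 / (basis_point / microdoom)
print(usd_per_microdoom)    # ~1 million dollars

# 1 nanodoom averted, in expected lives saved today:
print(nanodoom * world_pop)  # ~8 expected lives
```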

comment by Yonatan Cale (yonatan-cale-1) · 2023-01-30T10:02:16.202Z · LW(p) · GW(p)

This seems like great advice, thanks!

I'd be interested in an example for what "a believable story in which this project reduces AI x-risk" looks like, if Dane (or someone else) would like to share.