What AI Posts Do You Want Distilled?

post by brook · 2023-08-25T09:01:26.438Z · LW · GW · No comments

This is a link post for https://forum.effectivealtruism.org/posts/m6BR4pmgXjoKJBfmt/what-ai-posts-do-you-want-distilled

This is a question post.


I'd like to distill AI safety posts and papers, and I'd like to see more distillations generally. Ideally, posts and papers would meet the criteria laid out in the linked post.

What posts meet these criteria?

Answers

answer by Thomas Kwa · 2023-08-25T17:41:08.860Z · LW(p) · GW(p)

Academic papers seem more valuable, as posts are often already distilled (except for things like Paul Christiano blog posts) and the x-risk space is something of an info bubble. There is a list of safety-relevant papers from ICML here, but I don't totally agree with it; two papers I think it missed are

  • HarsanyiNet, an architecture for small neural nets that basically restricts features such that you can easily calculate Shapley value contributions of inputs
  • This other paper on importance functions, which got an oral presentation.

If you want to get a sense of how to do this, first get fast at understanding papers yourself, then read Rohin Shah's old Alignment Newsletters and the technical portions of Dan Hendrycks's AI Safety Newsletters.

To get higher value technical distillations than this, you basically have to talk to people in person and add detailed critiques, which is what Lawrence did with distillations of shard theory [LW · GW] and natural abstractions [LW · GW].

Edit: Also, most papers are low quality or irrelevant; my (relatively uninformed) guess is that 92% of papers at the big three ML conferences have little relevance to alignment, and of the remainder, 2/3 of posters and 1/3 of orals are too low quality to be worth distilling. So you need to have good taste.

answer by trevor · 2023-08-25T13:50:07.471Z · LW(p) · GW(p)

Raemon's new rationality paradigm [LW · GW] (might be better to wait until the launch test is finished). The CFAR handbook [LW · GW] is also pretty distillable.

The Superintelligence FAQ [LW(p) · GW(p)] (allegedly one of the best ways to introduce someone to AI safety)

OpenAI's paper on the use of AI for manipulation (important for AI macrostrategy)

Cyborgism [LW · GW]

Please don't throw your mind away [LW · GW] (massive distillation potential, but trickier than it looks)

The Yudkowsky–Christiano debate [LW · GW] (I tried showing this to my 55-year-old dad and he couldn't parse it and bounced off, because he knows software but not econ; the AI chapter from The Precipice later got him to take AI safety seriously)

Stuff on AI timelines [? · GW] is generally pretty good. The authors have a tangible fear of getting lampooned by the general public, journalists, or trolls for making the tiniest mistake, so they make the papers long and hard to read; if you distill them, that diffuses the responsibility.

No comments