Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible

post by Sam Bowman (sbowman) · 2022-08-31T01:39:54.533Z · LW · GW · 6 comments

I was part of a group that ran a PhilPapers-style survey and metasurvey targeting NLP researchers who publish at venues like ACL. Results are here (Tweet-thread version). It didn't target AGI timelines, but had some other questions that could be of interest to people here:


comment by jungofthewon · 2022-09-01T16:05:40.453Z · LW(p) · GW(p)

This was really interesting, thanks for running and sharing! Overall this was a positive update for me. 

"Results are here"

I think this just links to PhilPapers, not your survey results?

Replies from: Evan R. Murphy, sbowman
comment by Evan R. Murphy · 2022-09-11T00:36:18.244Z · LW(p) · GW(p)

Can you say more about how this was a positive update for you?

Replies from: jungofthewon
comment by jungofthewon · 2022-09-12T00:03:46.907Z · LW(p) · GW(p)

Sure! Prior to this survey I would have thought:

  1. Fewer NLP researchers took AGI seriously, identified understanding its risks as a significant priority, and considered it potentially catastrophic. 
    1. I found it particularly interesting that underrepresented researcher groups were more concerned (though this is less surprising in hindsight, especially considering the diversity of interpretations of "catastrophe"). I wonder how well the alignment community is doing with outreach to those groups. 
  2. There were more scaling maximalists (as the survey respondents themselves also expected).

I was also encouraged that the majority of people thought the majority of research is crap.

...Though I'm not sure how that math works out exactly, unless people are aware that they themselves publish crap :P

comment by Sam Bowman (sbowman) · 2022-09-01T16:40:09.458Z · LW(p) · GW(p)

Thanks! Fixed link.

comment by Zoe Williams (GreyArea) · 2022-09-06T00:16:55.662Z · LW(p) · GW(p)

Super interesting, thanks!

If you were running it again, you might want to standardize the wording of the questions - it varies from 'will / is' to 'is likely' to 'plausible', which can make it hard to compare across questions. 'Plausible' in particular is quite a fuzzy word: for some it might mean 1% or more, for others it might just mean the scenario isn't completely impossible (if a movie had that storyline, they'd be okay with it).

Replies from: sbowman
comment by Sam Bowman (sbowman) · 2022-09-06T16:37:44.966Z · LW(p) · GW(p)

Fair. For better or worse, a lot of this variation came from piloting: pilot participants repeatedly nudged us toward framings that were perceived as controversial or up for debate.