Thousands of malicious actors on the future of AI misuse
post by Zershaaneh Qureshi (zershaaneh-qureshi), Corin Katzke (corin-katzke), Convergence Analysis (deric-cheng-1) · 2024-04-01T10:08:42.357Z · LW · GW
Announcing the results of a 2024 survey by Convergence Analysis. We’ve just posted the executive summary below, but you can read the full report here.
In the largest survey of its kind, Convergence Analysis surveyed 2,779 malicious actors on how they would misuse AI to catastrophic ends.
In previous work, we’ve explored the difficulty of forecasting AI risk. Existing attempts rely almost exclusively on data from AI experts and professional forecasters. As a result, the perspectives of perhaps the most important actors in AI risk – malicious actors – are underrepresented in current AI safety discourse. This report aims to fill that gap.
Methodology
We selected malicious actors based on whether they would hypothetically end up in "the bad place" in the TV show The Good Place. This list included members of US-designated terrorist groups, convicted war criminals, and anyone who has ever appeared on Love Island or The Apprentice.
Results
- This survey was definitely an infohazard: 19% of participants indicated that they are likely to misuse AI to catastrophic ends. However, the most popular write-in answer was: “Wait, that’s an option?”
- “Just ask” is not an effective monitoring regime: 8% of participants indicated that they were already misusing AI. When we followed up with this group, none chose to elaborate.
- Move over, biohazards: Surprisingly, 92% of respondents chose “radiological” as their preferred Chemical, Biological, Radiological, or Nuclear (CBRN) threat.
- Dear God: 1% of respondents selected “other” as their preferred CBRN threat. Our request for participants to specify “other” yielded answers that were too horrifying to reproduce here.
- Even malicious actors have limits: Almost all malicious actors said they’d stop short of permanently destroying humanity’s future. One representative comment reads “anything greater than 50% of the global population is just too far.”
- All press is good press: The most evil survey responses (1.2 standard deviations above the mean evilness) were submitted by D-list celebrities vying to claw their way back into the public eye.
A majority of participants agreed to reflect on their experience in a follow-up survey if they successfully misuse AI. Unfortunately, none agreed to register their misuse with us in advance.
If you self-identify as a malicious actor, please get in touch here if you’re interested in being contacted to participate in a future study.