Comments

Comment by slg (simon@securebio.org) on Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk · 2023-11-02T19:45:01.355Z · LW · GW

I think it'd be good to cross-post this on the EA Forum.

edit: It's been posted, link here: https://forum.effectivealtruism.org/posts/zLkdQRFBeyyMLKoNj/still-no-strong-evidence-that-llms-increase-bioterrorism

Comment by slg (simon@securebio.org) on AGI in sight: our look at the game board · 2023-02-19T14:58:56.088Z · LW · GW

This post reads like it wants to convince its readers that AGI is near and will spell doom, selecting and framing arguments in a biased way.

Even though many people on the Forum and LW (including myself) believe that AI Safety is very important and isn't given enough attention by important actors, I don't want to lower our standards for good arguments in favor of more AI Safety.

Some parts of the post that I find lacking:

 "We don’t have any obstacle left in mind that we don’t expect to get overcome in more than 6 months after efforts are invested to take it down."

I don't think more than 1/3 of ML researchers or engineers at DeepMind, OpenAI, or Anthropic would sign this statement.

"No one knows how to predict AI capabilities."

Many people are trying, though (Ajeya Cotra, EpochAI), and I think these efforts aren't worthless. Maybe a better statement would be: "New AI capabilities appear discontinuously, and we have a hard time predicting such jumps. Given this larger uncertainty, we should worry more about unexpected and potentially dangerous capability increases."

"RLHF and Fine-Tuning have not worked well so far."

Setting aside whether RLHF scales (as linked, Jan Leike of OpenAI doesn't think so) and whether RLHF leads to deception, from my cursory reading and experience, ChatGPT shows substantially better behavior than Bing, which might be because the latter doesn't use RLHF.


Overall, I do agree with the article and think that recent developments have been worrying. Still, if the goal of the article is to get independently-thinking individuals to consider working on AI Safety, I'd prefer less extreme arguments.