> What evidence is there on tutor-relevant tasks being a blocking part of the pipeline, as opposed to manufacturing barriers?
So, I can break “manufacturing” down into two buckets: “concrete experiments and iteration to build something dangerous” and “access to materials and equipment”.
For concrete experiments, I think this is in fact the place where having an expert tutor becomes useful. When I started in a synthetic biology lab, most of the questions I would ask weren’t things like “how do I hold a pipette” but things like “what protocols can I use to check if my plasmid correctly got transformed into my cell line?” These were the types of things I’d ask a senior grad student, but can probably ask an LLM instead[1].
For raw materials and equipment – first, I think the proliferation of community bio ("biohacker") labs demonstrates that acquiring raw materials and equipment isn’t as hard as you might think. Second, our group is especially concerned by trends in laboratory automation and outsourcing, like the ability to purchase synthetic DNA from companies that inconsistently screen their orders. There are still some hurdles, obviously – e.g., most reagent companies won’t ship to residential addresses, and the US is more permissive than other countries in allowing community bio labs to operate. But hopefully these examples illustrate why manufacturing might not be as big a bottleneck as people assume for sufficiently motivated actors, and why information can help solve manufacturing-related problems.
(This is also one of those areas where it’s not especially prudent for me to go into excessive detail because of information-hazard risks. I am sympathetic to some of the frustrations with infohazards from folks here and elsewhere, but I do think it’s particularly bad practice to post potentially infohazardous material about "here's why doing harmful things with biology might be more accessible than you think" on a public forum.)
[1] I think there’s a line of thought here which suggests that if we’re saying LLMs can increase dual-use biology risk, then maybe we should be banning all biology-relevant tools. But that’s not what we’re actually advocating for, and I personally think that some combination of KYC and safeguards for models behind APIs (so that they don’t overtly reveal information about how to manipulate potential pandemic viruses) can address a significant chunk of the risk while still keeping the benefits. The paper makes an even more modest proposal and calls for catastrophe liability insurance instead. But I can also imagine having a more specific disagreement with folks here on "how much added bioterrorism risk from open-source models is acceptable?"
I’m one of the authors from the second SecureBio paper (“Will releasing the weights of future large language models grant widespread access to pandemic agents?”). I’m not speaking for the whole team here, but I wanted to respond to some of the points in this post, both about the paper specifically and the broader point on bioterrorism risk from AI overall.
First, to acknowledge some justified criticisms of this paper:
- I agree that performing a Google-search control would have substantially increased the methodological rigor of the paper. The team discussed this before running the experiment and, for various reasons, decided against it. We’re currently discussing whether it might make sense to run a post-hoc control group (which we might be able to do, since we omitted most of the details about the acquisition pathway; running the control after the paper is already out might bias the results somewhat, but importantly, it won’t bias them in favor of a positive/alarming result for open-source models), or to do other follow-up studies in this area. Anyway, TBD, but we do appreciate the discussion around this – I think it will help inform any future red-teaming we plan to do, and it has already helped us understand which parts of our logic we had communicated poorly.
- Given the lack of a Google control, we agree that the claim that current open-source LLMs significantly increase bioterrorism risk for non-experts does not follow from the paper. However, we think our main point – that future (more capable, less hallucination-prone, etc.) open-source models will expand risks of pathogen access – still stands. I’ll discuss this more below.
- To respond to the point that says:
> There are a few problems with this. First, as far as I can tell, their experiment just... doesn't matter if this is their conclusion?
> If they wanted to make an entirely theoretical argument that future LLMs will provide this information with an unsafe degree of ease, then they should provide reasons for that
I think this is not an unreasonable criticism either. We (maybe) could have made claims communicating the same overall epistemic state without, e.g., running the hackathon/experiment. However, we think the point about LLMs assisting non-experts is often not as clear to a broader (e.g., policy) audience, and (again, despite these results not being a firm benchmark of how current capabilities compare to Google, etc.) we think this point would have been somewhat less clear if the paper had basically said: “Experts in synthetic biology (who already know how to acquire pathogens) found that an open-source language model can walk them through a pathogen acquisition pathway. They think the models can probably do some portion of this for non-experts too, though they haven’t actually tested this yet.” Anyway, I do think the fault is on us for failing to communicate some of this properly.
- A lot of people are confused about why we did the fine-tuning; I’ve responded to this in a separate comment here.
However, I still have some key disagreements with this post:
- The post basically seems to hinge on the assumption that, because the information for acquiring pathogens is already publicly available through textbooks or journal articles, LLMs do very little to accelerate pathogen acquisition risk. I think this completely misses the mark for why LLMs are useful. Other people have said this better than I can, but the main reason LLMs are useful isn’t just that they’re information regurgitators – it’s that they’re basically cheap domain experts. The most capable LLMs (like Claude and GPT-4) can already be used much like a tutor to explain complex scientific concepts, including the nuances of experimental design, reverse genetics, or data analysis. Without appropriate safeguards, these models can also significantly lower the barrier to entry for engaging with bioweapons acquisition in the first place.
- I'd like to ask the people who are confident that LLMs won’t help with bioweapons/bioterrorism whether they would also bet that LLMs will have ~zero impact on pharmaceutical or general-purpose biology research in the next 3-10 years. If you won’t take that bet, I’m curious what you think is conceptually different about bioweapons research, design, or acquisition.
- I also think this post, in general, doesn’t do enough forecasting of what LLMs can or will be able to do in the next 5-10 years, and does so somewhat inconsistently. For instance, the post says that “if open source AI accelerated the cure for several forms of cancer, then even a hundred such [Anthrax attacks] could easily be worth it”. This is confusing for a few reasons: first, it doesn’t seem like open-source LLMs can currently do much to accelerate cancer cures, so I’m assuming this is forecasting into the future. But then why not do the same for bioweapons capabilities? As others have pointed out, since biology is extremely dual-use, the same capabilities that allow an LLM to understand or synthesize information in one domain of biology (cancer research) transfer to other domains as well (transmissible viruses) – especially if safeguards are absent. Finally (again, as others have mentioned), anthrax is not the important comparison here; it’s the acquisition or engineering of other highly transmissible agents that can cause a pandemic from a single (or at least single-digit) number of transmission events.
Again, I think some of the criticisms of our paper’s methodology are warranted, but I would caution against updating prematurely – especially based on current model capabilities – toward the conclusion that there are zero biosecurity risks from future open-source LLMs. In any case, I’m hoping that some of the more methodologically rigorous studies coming out from RAND and others will make these risks (or lack thereof) clearer in the coming months.
(co-author on the paper)
> Note also that the model was not merely trained to be jailbroken / accept all requests -- it was further fine-tuned on publicly available data about gain-of-function viruses and so forth, to be specifically knowledgeable about such things -- although this is not mentioned in either the above abstract or summary.
Mentioned this in a separate comment, but: we revised the paper to note that the fine-tuning didn’t appreciably improve the information generated by the Spicy/uncensored model (which we were able to assess by comparing how much of the acquisition pathway was revealed by the fine-tuned model versus a prompt-jailbroken version of the base model; this last point isn’t in the manuscript yet, but we’ll do another round of edits soon). This was surprising for us (and a negative result): we had expected the fine-tuning to substantially increase information retrieval.
However, the reason we opted for the fine-tuning approach in the first place was that we predicted this might be a step taken by future adversaries. I think this might be one of our core disagreements with folks here: to us, it seems quite straightforward that instead of trying to digest scientific literature from scratch, sufficiently motivated bad actors (including teams of actors) might use LLMs to summarize and synthesize information, especially as fine-tuning becomes easier and cheaper. We were surprised at the amount of pushback this received. (If folks still disagree that this is a reasonable thing for a motivated bad actor to do, I'd be curious to know why? To me, it seems quite intuitive.)
> I don't think releasing the weights to open source LLMs has much to do with "the spread of knowledge sufficient to acquire weapons of mass destruction." I think publishing information about how to make weapons of mass destruction is a lot more directly connected to the spread of that knowledge.
>
> Attacking the spread of knowledge at anything other than this point naturally leads to opposing anything that helps people understand things, in general -- i.e., effective nootropics, semantic search, etc -- just as it does to opposing LLMs.
So, I agree that information is a key bottleneck. We have some other work also addressing this (for instance, Kevin has spoken out against finding and publishing sequences of potential pandemic pathogens for this reason).
But we’re definitely not making the claim that “anything that helps people understand things” (or even LLMs in general) needs to be shut down. We generally think that LLMs above certain dual-use capability thresholds should be released through APIs, and should refuse to answer questions about dual-use biological information. There’s an analogy here with digital privacy/security: search engines routinely take down results for leaked personal information (or child abuse content) even if it’s widely available on Tor, while also supporting efforts to stop such leaks in the first place. And I don't think it's unreasonable to hold LLMs to greater security standards than search engines, especially if LLMs also make it a lot easier to synthesize, digest, and understand dual-use information that can cause significant harm if misused.
(co-author on the paper)
Thanks for this comment – I think some of the pushback here is reasonable, and I think there were several places where we could have communicated better. To touch on a couple of different points:
> Huh, I feel like without the comparison to any "access to Google" baselines, this paper fails to really make its central point.
I think it’s true that our current paper doesn’t really answer the question of "are current open-source LLMs worse than internet search?". We're more uncertain about this, and agree that a control study could be good here; however, I think the point that future open-source models will be more capable, will increase the risk of accessing pathogens, etc., still stands. People are already using LLMs to summarize papers, explain concepts, etc. – I’d be very surprised if we ended up in a world where LLMs aren’t used as general-purpose research assistants (including biology research assistants), and unless proper safeguards are put in place, I think this assistance will also extend to bioweapons ideation and acquisition.
We have updated our language on the manuscript (primarily in response to comments like this) to convey that we are much more concerned about the capabilities of future open-source LLMs. Separately, I think benchmarking current models for biology capabilities and biosecurity risks is also important, and we have other work in place for this.
> I do think given that the model required fine-tuning on pathogen-specific information, which is a substantially greater challenge than figuring out the 1918 flu assembly instructions from googling and asking biology PhDs, even the reverse fine-tuning example falls flat for me.
We revised the paper to note that the fine-tuning didn’t appreciably improve the information generated by the Spicy/uncensored model (which we were able to assess by comparing how much of the acquisition pathway was revealed by the fine-tuned model versus a prompt-jailbroken version of the base model). This was surprising for us (and a negative result): we had expected the fine-tuning to substantially increase information retrieval.