My takes on SB-1047

post by leogao · 2024-09-09T18:38:37.799Z · LW · GW · 8 comments

I recently decided to sign a letter of support for SB 1047. Before deciding whether to do so, I felt it was important for me to develop an independent opinion on whether the bill was good, as opposed to deferring to the opinions of those around me, so I read through the full text of SB 1047. After forming my opinion, I checked my understanding of tort law basics (definitions of “reasonable care” and “materially contribute”) with a law professor who was recommended to me by one of the SB 1047 sponsors, but who was not directly involved in the drafting or lobbying for the bill. Ideally I would have wanted to consult with a completely independent lawyer, but this would have been prohibitively expensive and difficult on a tight timeline. This post outlines my current understanding. It is not legal advice.

My main impression of the final version of SB 1047 is that it is quite mild. Its obligations only cover models trained with $100M+ of compute, or finetuned with $10M+ of compute.[1] If a developer is training a covered model, they have to write an SSP (safety and security protocol) that explains why they believe it is not possible to use the model (or a post-train/finetune of the model costing <$10M of compute) to cause critical harm ($500M+ in damage or mass casualties). This would involve running evals, doing red teaming, etc. The SSP also has to describe what circumstances would cause the developer to decide to shut down training and any copies of the model that the developer controls, and how they will ensure that they can actually do so if needed. Finally, a redacted copy of the SSP must be made available to the public (and an unredacted copy filed with the Attorney General). This doesn’t seem super burdensome, and is very similar to what labs are already doing voluntarily, but it seems good to codify these things, because otherwise labs could stop doing them in the future. Also, current SSPs don’t make hard commitments about when to actually stop training, so it would be good to have that.

If a critical harm happens, then the question for determining penalties is whether the developer met their duty to exercise “reasonable care” to prevent models from “materially contributing” to the critical harm. This is determined by looking at how good the SSP was (both in an absolute sense and when compared to other developers) and how closely it was adhered to in practice.

Reasonable care is a well-established concept in tort law that basically means you did a cost-benefit analysis that a reasonable person would have done. Importantly, it doesn’t mean the developer has to be absolutely certain that nothing bad can happen. For example, suppose you release an open source model after doing dangerous capabilities evals to make sure it can’t make a bioweapon,[2] but then a few years later a breakthrough in scaffolding methods happens and someone makes a bioweapon using your model. As long as you were thorough in your dangerous capabilities evals, you would not be liable, because it would not have been reasonable for you to anticipate that someone would make a breakthrough that invalidates your evaluations. Also, if mitigating the risk would be too costly, and the benefit of releasing the model far outweighs the risks of release, this is a valid reason not to mitigate the risk under the standard of reasonable care (e.g., the benefits of driving a car at a normal speed far outweigh the costs of car accidents, so reasonable care doesn’t require driving at 2 mph to fully mitigate the risk of car accidents). My personal opinion is that the reasonable care standard is too weak to prevent AI from killing everyone. However, this also means that I think people opposing the current version of the bill because of the reasonable care requirement are overreacting.

Materially contributing is not as well-established a concept, but my understanding is that it means the model can’t merely be helpful in causing critical harm; rather, the model must be strongly counterfactual: the critical harm would not have happened without the existence of the model. In addition, the bill also explicitly clarifies that cases where the model provides information that was publicly accessible anyways don't count. So for example, if a terrorist uses a model to make a bioweapon, and the model provides the same advice as google, then this doesn’t count; if it cuts the cost in half by providing more useful information than the internet, then it probably still doesn’t count, since a determined terrorist wouldn’t be deterred merely by a 2x difference in cost; if it cuts the cost by 100x, it probably does count; if it provides advice that couldn’t have been obtained from a human expert because all the human experts out there have moral scruples and don’t want to help make a bioweapon, it probably also counts.

It doesn’t affect near-term open source models, simply because they will not be powerful enough to materially contribute to critical harm. In the longer term, once models can contribute to critical harm if jailbroken, it seems very hard to ensure that safeguards cannot be removed from open source models with up to $10M of compute, even just with known attacks. But it seems pretty reasonable to me not to release models where (a) the model can do $500M of damage or cause mass casualties if safeguards are removed, (b) the safeguards can be removed for <$10M with already-known attacks, and (c) the benefits of releasing the unmitigated model do not outweigh the costs from critical harm.

There are also some provisions for whistleblower protection. Employees cannot be punished for disclosing information to the AG or the Labor Commissioner. There also needs to be an anonymous hotline for reporting information to directors/officers. This seems pretty reasonable.

While I do think federal regulation would be preferable, I’m not very sympathetic to the argument that SB 1047 should not be passed because federal regulation would be better. It seems likely that passing a federal bill similar to SB 1047 gets harder if SB 1047 fails. Also, I don’t think this regulation splits talent between the federal and state level: aside from the Board of Frontier Models, which mostly just sets the compute thresholds, this version of SB 1047 does not create any substantial new regulatory bodies. The prospect for federal regulation seems quite uncertain, especially if a Trump presidency happens. And once strong federal regulation does pass, SB 1047 can degrade gracefully into mostly deferring to federal regulatory bodies.

I don’t think SB 1047 is nearly sufficient to prevent catastrophic risk, though it is a step in the right direction. So I think a lot of its impact will be through how it affects future AI regulation. My guess is if SB 1047 passes, this probably creates more momentum for future AI regulation. (Also, it would be in effect some number of years earlier than federal regulation — this is especially relevant if you have shorter timelines than me.)

  1. ^

    There are also FLOP count thresholds specified in the bill (1e26 and 3e25 FLOP, respectively), but they’re not too far off from these dollar amounts, and compute quickly gets cheaper. These thresholds can be raised in the future by the Board of Frontier Models (BFM), but not lowered below $100M/$10M respectively, so the BFM could completely neuter the bill if it so chose (by raising the thresholds arbitrarily high), but it could not make the bill stricter.
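
    As a rough sanity check on this claim (using my own ballpark assumptions, not figures from the bill: roughly 1e15 FLOP/s of training throughput per H100-class GPU, rented at about $2.50 per GPU-hour), here is a back-of-envelope conversion of the FLOP thresholds into dollars:

        # Back-of-envelope: do the bill's FLOP thresholds roughly match its dollar thresholds?
        # Assumed ballpark figures (mine, not the bill's): ~1e15 FLOP/s per H100-class GPU,
        # rented at roughly $2.50 per GPU-hour.
        FLOP_PER_SECOND_PER_GPU = 1e15
        DOLLARS_PER_GPU_HOUR = 2.50

        def training_cost_dollars(total_flop: float) -> float:
            gpu_hours = total_flop / FLOP_PER_SECOND_PER_GPU / 3600
            return gpu_hours * DOLLARS_PER_GPU_HOUR

        for label, flop in [("covered model threshold (1e26 FLOP)", 1e26),
                            ("covered finetune threshold (3e25 FLOP)", 3e25)]:
            print(f"{label}: ~${training_cost_dollars(flop) / 1e6:.0f}M")
        # Prints roughly $69M and $21M under these assumptions -- the same order of
        # magnitude as the $100M/$10M dollar thresholds.

    So, at least at today's prices, the two kinds of threshold pick out roughly the same models.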

  2. ^

    I use bioweapons as the prototypical example because it's straightforward to reason through, but AIs helping terrorists make bioweapons is actually not really my modal story of how AIs cause catastrophic harm.

8 comments

Comments sorted by top scores.

comment by 1a3orn · 2024-09-09T19:18:27.359Z · LW(p) · GW(p)

In addition, the bill also explicitly clarifies that cases where the model provides information that was publicly accessible anyways don't count.

I've heard a lot of people say this, but that's not really what the current version of the bill says. This is how it clarifies the particular critical harms that don't count:

(2) “Critical harm” does not include any of the following: (A) Harms caused or materially enabled by information that a covered model or covered model derivative outputs if the information is otherwise reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative.

So, you can be held liable for critical harms even when you supply information that was publicly accessible, if it was information an "ordinary person" wouldn't know.

As far as I can tell, what this means is unclear. The "ordinary person" in tort law seems to know things like "ice makes roads slippery" and to be generally dumb; a ton of information that we think of as very basic about computers seems to be information a legal "ordinary person" wouldn't know.

Replies from: leogao, shankar-sivarajan
comment by leogao · 2024-09-09T19:48:49.406Z · LW(p) · GW(p)

It seems pretty reasonable that if an ordinary person couldn't have found the information about making a bioweapon online because they don't understand the jargon or something, and the model helps them understand the jargon, then we can't blanket-reject the possibility that the model materially contributed to causing the critical harm. Rather, we then have to ask whether the harm would have happened even if the model didn't exist. So for example, if it's very easy to hire a human expert without moral scruples for a non-prohibitive cost, then it probably would not be a material contribution from the model to translate the bioweapon jargon.

Replies from: 1a3orn
comment by 1a3orn · 2024-09-10T13:36:02.017Z · LW(p) · GW(p)

Yeah, so it sounds like you're just agreeing with my primary point.

  • The original claim that you made was that you wouldn't be liable if your LLM made "publicly accessible" information available.
  • I pointed out that this wasn't so; you could be liable for information that was publicly accessible but that an "ordinary person" wouldn't be able to access.
  • And now you're like "Yeah, could be liable, and that's a good thing, it's great."

So we agree about whether you could be liable, which was my primary point. I wasn't trying to tell you that was bad in the above; I was just saying "Look, if your defense of 1047 rests on publicly-available information not being a thing for which you could be liable, then your defense rests on a falsehood."

However, then you shifted to "No, it's actually a good thing for the LLM maker to be held legally liable if it gives an extra-clear explanation of public information." That's a defensible position; but it's a different position than you originally held.

I also disagree with it. Consider the following two cases:

  1. A youtuber who is to bioengineering as Karpathy is to CS or 3Blue1Brown is to math makes youtube videos. Students everywhere praise him. In a few years there's a huge crop of startups populated by people who watched him. One person uses his stuff to help them make a weapon, though, and manages to kill some people. But we have strong free-speech norms, so he isn't liable for this.

  2. An LLM that is to bioengineering as Karpathy is to CS or 3Blue1Brown is to math makes explanations. Students everywhere praise it. In a few years there's a huge crop of startups populated by people who used it. But one person uses its stuff to help him make a weapon, and manages to kill some people. Laws like 1047 have been passed, though, so the maker turns out to be liable for this.

I think the above asymmetry makes no sense. It's like how we just let coal plants kill people through pollution while making nuclear plants meet absurd standards so they don't kill people. "We legally protect knowledge disseminated one way, and in fact try to make it easily accessible and reward educators with status and fame; but we'll legally punish knowledge disseminated another way, and in fact introduce long-lasting unclear liabilities for it."

Replies from: T3t
comment by RobertM (T3t) · 2024-09-11T00:22:49.598Z · LW(p) · GW(p)

An LLM that is to bioengineering as Karpathy is to CS or 3Blue1Brown is to math makes explanations. Students everywhere praise it. In a few years there's a huge crop of startups populated by people who used it. But one person uses its stuff to help him make a weapon, and manages to kill some people. Laws like 1047 have been passed, though, so the maker turns out to be liable for this.

This still requires that an ordinary person wouldn't have been able to access the relevant information without the covered model (including with the help of non-covered models, which are accessible to ordinary people).  In other words, I think this is wrong:

So, you can be held liable for critical harms even when you supply information that was publicly accessible, if it was information an "ordinary person" wouldn't know.

The bill's text does not condition liability on the information not being "known" by an ordinary person, but on its not being "publicly accessible" to an ordinary person.  That's a much higher bar given the existence of already quite powerful[1] non-covered models, which make nearly all the information that's out there available to ordinary people.  It looks almost as if it requires the covered model to be doing novel intellectual labor, which is load-bearing for the harm that was caused.

Your analogy fails for another reason: an LLM is not a youtuber.  If that youtuber were doing personalized 1:1 instruction with many people, one of whom went on to make a novel bioweapon that caused hundreds of millions of dollars of damage, it would be reasonable to check that the youtuber was not actually a co-conspirator, or even using some random schmuck as a patsy.  Maybe it turns out the random schmuck was in fact the driving force behind everything, but we find chat logs like this:

  • Schmuck: "Hey, youtuber, help me design [extremely dangerous bioweapon]!"
  • Youtuber: "Haha, sure thing!  Here are step-by-step instructions."
  • Schmuck: "Great!  Now help me design a release plan."
  • Youtuber: "Of course!  Here's what you need to do for maximum coverage."

We would correctly throw the book at the youtuber.  (Heck, we'd probably do that for providing critical assistance with either step, never mind both.)  What does throwing the book at an LLM look like?


Also, I observe that we do not live in a world where random laypeople frequently watch youtube videos (or consume other static content) and then go on to commit large-scale CBRN attacks.  In fact, I'm not sure there's ever been a case of a layperson carrying out such an attack without the active assistance of domain experts for the "hard parts".  This might have been less true of cyber attacks a few decades ago; some early computer viruses were probably written by relative amateurs and caused a lot of damage.  Software security just really sucked.  I would be pretty surprised if it were still possible for a layperson to do something similar today, without doing enough upskilling that they no longer meaningfully counted as a layperson by the time they're done.

And so if a few years from now a layperson does a lot of damage by one of these mechanisms, that will be a departure from the current status quo, where the laypeople who are at all motivated to cause that kind of damage are empirically unable to do so without professional assistance.  Maybe the departure will turn out to be a dramatic increase in the number of laypeople so motivated, or maybe it turns out we live in the unhappy world where it's very easy to cause that kind of damage (and we've just been unreasonably lucky so far).  But I'd bet against those.

ETA: I agree there's a fundamental asymmetry between "costs" and "benefits" here, but this is in fact analogous to how we treat human actions.  We do not generally let people cause mass casualty events because their other work has benefits, even if those benefits are arguably "larger" than the harms.

  1. ^

    In terms of summarizing, distilling, and explaining humanity's existing knowledge.

comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-10T00:10:47.590Z · LW(p) · GW(p)

So that suggests what one ought to do is make all kinds of useful information (physics, chemistry, biotech, computer science, etc.) that could be construed by one's enemies as "harmful" publicly available, and so accessible that even an "ordinary person" couldn't fail to understand it.

Which I suppose would have been a good thing to do before this bill as well.

comment by ChristianKl · 2024-09-12T16:53:59.930Z · LW(p) · GW(p)

It doesn’t affect near-term open source models, simply because they will not be powerful enough to materially contribute to critical harm.

Why wouldn't deepfake porn, or voice-cloning technology used to engage in fraud, be powerful enough to materially contribute to critical harm?

There are cases of fraud that could cause $500,000,000 in damage.

Given how juries decide on damages, a model that's used to create child porn of thousands of children could be argued to cause $500,000,000 in damages as well, especially when coupled with something like trying to extort those children.

comment by jmh · 2024-09-12T16:18:52.410Z · LW(p) · GW(p)

Kind of speculative on my part, and nothing I've tried to research for this comment. I am wondering if the tort version of reasonableness is a good model for new, poorly understood technologies. I'm somewhat thinking about the picture in https://www.lesswrong.com/posts/CZQYP7BBY4r9bdxtY/the-best-lay-argument-is-not-a-simple-english-yud-essay [LW · GW] distinguishing between narrow AI and AGI.

Tort law reasonableness seems okay for narrow AI. I am not so sure about the AGI setting though. 

So I wonder if a stronger liability model would not be better until we have a good bit more direct experience with more AGI-ish models and functionality/products, and a better data set to assess.

The Public Choice type cynic in me has to wonder, if the law makes a strong case for the tort version of liability under a reasonable man standard, whether I should not think it's more about limiting liability for harms the companies might be enabling (I'm thinking of what we would have if social media companies faced stronger obligations for what is posted on their networks, rather than the immunity they were granted) and less about protecting the general society.

Over time perhaps liability moves more towards the tort world of a reasonable man, but is that where this should start? Seems like a lower bar than is justified.

comment by Review Bot · 2024-09-10T01:11:54.220Z · LW(p) · GW(p)

The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2025. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?