Guide to SB 1047

post by Zvi · 2024-08-20T13:10:07.408Z · LW · GW · 18 comments

Contents

  Short Version (tl;dr): What Does SB 1047 Do in Practical Terms?
    If you do not train either a model that requires $100 million or more in compute, or fine tune such an expensive model using $10 million or more in your own additional compute (or operate and rent out a very large computer cluster)?
    Then this law does not apply to you, at all.
    This cannot later be changed without passing another law.
  Really Short Abbreviated Version
  Somewhat Less Short: Things The Above Leaves Out
  Bad Model, Bad Model, What You Gonna Do
  Going to Be Some Changes Made
  Long Version: RTFB
  Definitions (starting with Artificial Intelligence)
  Safety Incident
  Covered Model
  Critical Harm
  Full Shutdown
  Safety and Security Protocol
  On Your Marks
    That SSP must do and explain reasonable versions of all of the following, explain how it intends to do them, and otherwise take reasonable care to avoid [risks]:
  Reasonable People May Disagree
  Release the Hounds
  Smooth Operator
  Compute Cluster Watch
  Price Controls are Bad
  A Civil Action
    There are no criminal penalties.
  Whistleblowers Need Protections
  No Division Only Board
  Does CalCompute?
  In Which We Respond To Some Objections In The Style They Deserve
  False Claim: The Government Can and Will Lower the $100m Threshold
  False Claim: SB 1047 Might Retroactively Cover Existing Models
  Moot or False Claim: The Government Can and Will Set the Derivative Model Threshold Arbitrarily Low
  Objection: The Government Could Raise the Derivative Threshold Model Too High,
  False Claim: Fine-Tuners Can Conspire to Evade the Derivative
  Moot Claim: The Frontier Model Division Inevitably Will Overregulate
  False Claim: The Shutdown Requirement Bans Open Source
  Objection: SB 1047 Will Slow AI Technology and Innovation or Interfere with Open Source
  False Claim: This Effectively Kills Open Source Because You Can Fine-Tune Any System To Do Harm
  False Claim: SB 1047 Will Greatly Hurt Academia
  False Claim: SB 1047 Favors ‘Big Tech’ over ‘Little Tech’
  False Claim: SB 1047 Would Cause Many Startups To Leave California
  Objection: Shutdown Procedures Could Be Hijacked and Backfire
  Objection: The Audits Will Be Too Expensive
  Objection: What Is Illegal Here is Already Illegal
  Objection: Jailbreaking is Inevitable
  Moot and False Claim: Reasonable Assurance Is Impossible
  Objection: Reasonable Care is Too Vague, Can’t We Do Better?
  Objection: The Numbers Picked are Arbitrary
  Objection: The Law Should Use Capabilities Thresholds, Not Compute and Compute Cost Thresholds
  False Claim: This Bill Deals With ‘Imaginary’ Risks
  Objection: This Might Become the Model For Other Bills Elsewhere
  Not Really an Objection: They Changed the Bill a Lot
  Not Really an Objection: The Bill Has the Wrong Motivations and Is Backed By Evil People
  Not an Objection: ‘The Consensus Has Shifted’ or ‘The Bill is Unpopular’
  Objection: It Is ‘Too Early’ To Regulate
  Objection: We Need To ‘Get It Right’ and Can Do Better
  Objection: This Would Be Better at the Federal Level
  Objection: The Bill Should Be Several Distinct Bills
  Objection: The Bill Has Been Weakened Too Much in Various Ways
  Final Word: Who Should Oppose This Bill?

We now likely know the final form of California’s SB 1047.

There have been many changes to the bill as it worked its way to this point.

Many changes, including some that were just announced, I see as strict improvements.

Anthropic was behind many of the last set of amendments at the Appropriations Committee. In keeping with their “Support if Amended” letter, there are a few big compromises that weaken the upside protections of the bill somewhat in order to address objections and potential downsides.

The primary goal of this post is to answer the question: What would SB 1047 do?

I offer two versions: Short and long.

The short version summarizes what the bill does, at the cost of being a bit lossy.

The long version is based on a full RTFB: I am reading the entire bill, once again.

In between those two I will summarize the recent changes to the bill, and provide some practical ways to understand what the bill does.

After, I will address various arguments and objections, reasonable and otherwise.

My conclusion: This is by far the best light-touch bill we are ever going to get.

Short Version (tl;dr): What Does SB 1047 Do in Practical Terms?

This section is intentionally simplified, but in practical terms I believe this covers the parts that matter. For full details see later sections.

First, I will echo the One Thing To Know.

If you do not train either a model that requires $100 million or more in compute, or fine tune such an expensive model using $10 million or more in your own additional compute (or operate and rent out a very large computer cluster)?

Then this law does not apply to you, at all.

This cannot later be changed without passing another law.

(There is a tiny exception: Some whistleblower protections still apply. That’s it.)

Also the standard required is now reasonable care, the default standard in common law. No one ever has to ‘prove’ anything, nor need they fully prevent all harms.

With that out of the way, here is what the bill does in practical terms.

IF AND ONLY IF you wish to train a model using $100 million or more in compute (including your fine-tuning costs):

  1. You must create a reasonable safety and security plan (SSP) such that your model does not pose an unreasonable risk of causing or materially enabling critical harm: mass casualties or incidents causing $500 million or more in damages.
  2. That SSP must explain what you will do, how you will do it, and why. It must have objective evaluation criteria for determining compliance. It must include cybersecurity protocols to prevent the model from being unintentionally stolen.
  3. You must publish a redacted copy of your SSP, an assessment of the risk of catastrophic harms from your model, and get a yearly audit.
  4. You must adhere to your own SSP and publish the results of your safety tests.
  5. You must be able to shut down all copies under your control, if necessary.
  6. The quality of your SSP and whether you followed it will be considered in whether you used reasonable care.
  7. If you violate these rules, fail to use reasonable care, and harm results, the Attorney General can fine you in proportion to training costs, plus damages for the actual harm.
  8. If you fail to take reasonable care, injunctive relief can be sought. The quality of your SSP, and whether or not you complied with it, shall be considered when asking whether you acted reasonably.
  9. Fine-tunes that spend $10 million or more are the responsibility of the fine-tuner.
  10. Fine-tunes spending less than that are the responsibility of the original developer.

Compute clusters need to do standard KYC when renting out tons of compute.

Whistleblowers get protections.

They will attempt to establish a ‘CalCompute’ public compute cluster.

You can also read this summary, which has good clarifications.

Really Short Abbreviated Version

  1. If you don’t train a model with $100 million in compute, and don’t fine-tune a ($100m+) model with $10 million in compute (or rent out a very large compute cluster), this law does not apply to you.
  2. Critical harm means $500 million in damages from related incidents, or mass casualties.
  3. If you train a model with $100 million or more in compute, you need to have a reasonable written plan (SSP) for preventing unreasonable risk of critical harms, follow it and publish (with redactions) the plan and your safety test results.
    1. If you fine-tune a model using less than $10 million in compute (or an amount under the compute threshold), the original developer is still responsible for it.
    2. If you fine-tune a model with more than $10 million (and more than the compute threshold, currently 3*(10^25) flops) then you are responsible for it.
    3. You can get fined by the AG if you violated the statute by failing to take reasonable care and your violation causes or materially enables critical harms.
    4. Otherwise, if you don’t take reasonable care, there’s only injunctive relief.
  4. Whistleblowers get protections.
  5. Compute clusters must do KYC on sufficiently large customers.

Somewhat Less Short: Things The Above Leaves Out

I do not consider these too load bearing for understanding how the law centrally works, but if you want a full picture summary, these clauses also apply.

  1. To be covered models must also hit a flops threshold, initially 10^26. This could make some otherwise covered models not be covered, but not the reverse.
  2. Fine-tunes must also hit a flops threshold, initially 3*(10^25) flops, to become non-derivative.
  3. There is a Frontier Model Board, appointed by the Governor, Senate and Assembly, that will issue regulations on audits and guidance on risk prevention. However, the guidance is not mandatory, and there is no Frontier Model Division. They can also adjust the flops thresholds. It has 9 members, including at least one member from each of (open source, industry, CBRN experts, cybersecurity for critical infrastructure, AI safety).
  4. The SSP must say explicitly that you must follow the SSP.
  5. The SSP must include tests that take reasonable care to test for risk of critical harm, and establish that the level of such risks is not unreasonable.
  6. The SSP must describe how it is to be modified.
  7. The SSP has to explain why it deals with risks from post-training modifications.
  8. The SSP has to address potential risks from derivative models.
  9. The unredacted SSP must be available to the Attorney General upon request.
  10. You must yearly reevaluate your safety procedures and submit a document saying you are in compliance.
  11. Safety incidents must be reported within 72 hours.
  12. New model deployments must be reported to the AG within 30 days.
  13. Developers “shall consider” guidance from US AISI and NIST and other reputable standard-setting organizations.
  14. Fine for first post-harm violation is 10% of training compute costs, fine for later violations is 30% of training compute costs.

Bad Model, Bad Model, What You Gonna Do

The main thing that SB 1047 does, that current law does not do, is transparency.

Under SB 1047, frontier model developers have to create and publish (and follow) their safety and security protocol, and we know when such models are being trained. We also get to examine and replicate the safety test results. They must explain why they made the choices they made, and assess the remaining risks. And then they must get annual audits.

Hopefully this pressures the companies to do a better job, because everyone can see and evaluate what they are doing. And if they mess up or shirk, we can point this out, warn them, then if they don’t listen we can apply public pressure, and if the situation is sufficiently dire seek injunctive relief.

What happens if that is not enough? Concretely, in broad terms, what does all this mean if a critical harm ($500m+ in damages or mass casualties) is caused by a frontier model?

  1. If you are developing a frontier model that costs $100m+, you’ll need to write down, publish and explain your safety plan and safety test results, and get annual audits.
  2. If that plan is unreasonable (does not take reasonable care), people can notice that, and perhaps the Attorney General can take you to court for injunctive relief to fix it.
  3. If catastrophic harm does occur, your model caused or materially enabled that harm, and you did not take reasonable care (taking into account the quality of your SSP and risk assessment, and whether you followed your own SSP), as determined in an action brought by the Attorney General and judged by a court, then you must pay up.
  4. In particular you can be fined 10% of training compute cost for the first violation and 30% for subsequent violations. In addition, as under current law, since you are negligent, you are also liable for actual damages, and maybe also punitive damages.
  5. If it is an open model, the same rules apply as they would to a closed model, unless someone does $10 million worth (and the necessary number of flops, starting at 3*(10^25)) of fine tuning. If someone does a fine tune or other modification, and your failure to take reasonable care or otherwise follow the statute causes or materially enables a critical harm, the result remains on you.
  6. If someone does that much fine tuning, then the resulting model is their responsibility rather than yours.
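
The hand-off of responsibility in the last two items can be written as a two-condition check. This is a rough sketch, not the statute: the function name is hypothetical, the thresholds are the initial values given in this post, and the comparisons follow the bill text quoted later (compute "equal to or greater than" 3*(10^25), cost that "exceeds" $10 million).

```python
# Initial fine-tuning thresholds as given in this post.
FINE_TUNE_MIN_FLOPS = 3 * 10**25
FINE_TUNE_MIN_COST_USD = 10_000_000


def responsible_party(fine_tune_flops: int, fine_tune_cost_usd: int) -> str:
    """Illustrative only: who answers for a fine-tune of a covered model."""
    crosses_flops = fine_tune_flops >= FINE_TUNE_MIN_FLOPS
    crosses_cost = fine_tune_cost_usd > FINE_TUNE_MIN_COST_USD
    # Both tests must pass before responsibility moves to the fine-tuner.
    if crosses_flops and crosses_cost:
        return "fine-tuner"
    return "original developer"


print(responsible_party(4 * 10**25, 12_000_000))  # fine-tuner
print(responsible_party(10**24, 12_000_000))      # original developer
print(responsible_party(4 * 10**25, 2_000_000))   # original developer
```

Missing either test leaves the model a derivative of the original, which is why small or cheap fine-tunes stay with the original developer.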

What would happen under current law?

  1. If you fail to exercise reasonable care (as judged by a court) doing pretty much anything, including training and releasing a model, open or closed…
  2. …and the result is a critical harm…
  3. …you are getting sued, and you are probably going to lose big, potentially involving massive punitive damages.
  4. Also, if an AI causes an important critical harm and there aren’t regulations in place, we’ll probably be back here to talk about the new AI regulation they want to pass in response, and if you don’t like SB 1047 I assure you that you will like the new one a hell of a lot less. A crisis does not typically make good law.

The main differences, after something goes wrong, are that we will have the SSP and risk evaluation to consider as part of determining whether the company exercised reasonable care, and the Attorney General has clear standing to bring an action.

Going to Be Some Changes Made

There were substantial changes made at the Appropriations Committee.

Some of those changes were clear improvements. Others were compromises, making the bill less impactful in order to reduce its downside costs or satisfy objections and critics.

Many of the changes, of both types, are along the lines suggested by Anthropic.

The biggest practical changes are:

  1. Reasonable Assurance → Reasonable Care, a more relaxed standard.
  2. Harms only count if they are caused or materially enabled by the developer, and their failure to take reasonable care. The model alone doing this is insufficient.
  3. Civil penalties require either actual harm or imminent risk.
  4. FMD gone, perjury gone, pricing gone, $10m floor on fine tuning threshold.

There’s a bunch more, but I see those as the highlights. After reading the bill, I think that this link is a good and accurate summary, so I’m going to use their list slightly edited to reflect my understanding of some provisions.

  1. Limitation of civil penalties that do not result in harm or imminent risk.
  2. Elimination of penalty of perjury.
  3. Simplification of injunctive relief.
  4. Elimination of the Frontier Model Division.
  5. Expansion of the Frontier Model Board.
  6. Addition of a permanent required fine-tuning threshold of $10m.
  7. Fine-Tuning threshold must be all by same developer.
  8. Reasonable Assurance is replaced by (the existing common law standard of) Reasonable Care.
    1. The Reasonable Care standard is subject to specific factors: The nature of and compliance with your safety and security plan (SSP) and investigation of risks.
  9. SSPs Must Be Posted Publicly (with Redactions).
  10. Removal of uniform pricing requirements.
  11. Requiring specific tests pre-deployment instead of pre-training.
  12. Frontier Model Board will specify regulations for auditors.
  13. Civil penalties for auditors who misrepresent things.
  14. Narrowed whistleblower requirements for contractors.
  15. Publicly-released whistleblower reports (if AG decides to do that).
  16. Harms are only in scope if caused or materially enabled by a developer.
  17. Harms that could have been caused by publicly available information are exempt.

Long Version: RTFB

I include text from the bill, in some cases quite a bit, in places where it seems most important to be able to check the details directly. This is written so that you can skip that quoted text if desired. Anything with [brackets] is me paraphrasing.

(Note: The version I am working with has a lot of crossed out words that show up as normal words when copied – I think I deleted them all while copying the text over, but it is possible I missed a few, if so please point them out.)

Sections 1 and 2 are declarations. They don’t matter.

Section 3 is what matters. Everything from here on in is Section 3.

Definitions (starting with Artificial Intelligence)

22602 offers definitions. I’ll highlight the ones worth noting.

(b)  “Artificial intelligence” means an engineered or machine-based system that varies in its level of autonomy and that can, for explicit or implicit objectives, infer from the input it receives how to generate outputs that can influence physical or virtual environments.

Given the full context of this bill I think this should work fine.

Safety Incident

(c)  “Artificial intelligence safety incident” means an incident that demonstrably increases the risk of a critical harm occurring by means of any of the following:

  1. A covered model or covered model derivative autonomously engaging in behavior other than at the request of a user.
  2. Theft, misappropriation, malicious use, inadvertent release, unauthorized access, or escape of the model weights of a covered model or covered model derivative.
  3. The critical failure of technical or administrative controls, including controls limiting the ability to modify a covered model or covered model derivative.
  4. Unauthorized use of a covered model or covered model derivative to cause or materially enable critical harm.

So essentially an incident means one of these happened, and that what happened was the fault of the developer failing to take reasonable care:

  1. The model did things it wasn’t supposed to be able to do.
  2. The model was stolen or escaped.
  3. Someone used the model to cause or ‘materially enable’ critical harm.

Materially enable means a significant contribution, providing the means for the thing to occur. The word ‘materially’ was added to be crystal clear that this does not mean a small contribution. It has to be robust and counterfactually important to the outcome.

This was probably how background law would have worked anyway, but this provides additional assurance of that. Thus the change from ‘enabled’ to ‘materially enabled.’

Covered Model

Importantly, what is a covered model?

(e)  (1)  “Covered model” means either of the following:

(A)  Before January 1, 2027, “covered model” means either of the following:

  1. An artificial intelligence model trained using a quantity of computing power greater than 10^26 integer or floating-point operations, the cost of which exceeds one hundred million dollars ($100,000,000) when calculated using the average market prices of cloud compute at the start of training as reasonably assessed by the developer.
  2. An artificial intelligence model created by fine-tuning a covered model using a quantity of computing power equal to or greater than 3*(10^25) integer or floating-point operations, the cost of which, as reasonably assessed by the developer, exceeds ten million dollars ($10,000,000) if calculated using the average market price of cloud compute at the start of fine-tuning.

In (B), for after January 1, 2027, the thresholds of 10^26 and 3*(10^25) can be altered by the Government Operations Agency (GOA). However, the $100 million and $10 million thresholds cannot.

By January 1, 2027, I presume that 3*(10^25) flops will cost less than $10 million, and I am very confident that 10^26 flops will cost less than $100 million.

Thus, if the GOA wishes it can raise the thresholds, but it cannot lower them. We should expect the $100m and $10m amounts to bind in 2025 dollars. They are adjusted for inflation annually starting in 2026.

A covered model combined with other software remains a covered model re: (4).
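
To make the two-part test for initial training concrete, here is a minimal sketch of the pre-2027 definition quoted above. The function name and the example dollar figures are hypothetical. The comparisons follow the quoted text: compute "greater than" 10^26 operations and cost that "exceeds" $100 million. It ignores the inflation adjustment and any later change to the flops threshold.

```python
# Pre-2027 thresholds as quoted in the definition above.
COVERED_MIN_FLOPS = 10**26
COVERED_MIN_COST_USD = 100_000_000


def is_covered_by_training(train_flops: int, train_cost_usd: int) -> bool:
    """Illustrative only: a model is covered when it clears BOTH the
    compute threshold and the dollar threshold."""
    return train_flops > COVERED_MIN_FLOPS and train_cost_usd > COVERED_MIN_COST_USD


flops = 2 * 10**26
# Same compute at two hypothetical price levels.
print(is_covered_by_training(flops, 150_000_000))  # True
print(is_covered_by_training(flops, 60_000_000))   # False
```

The second call is the case discussed above: once compute gets cheap enough, clearing the flops threshold alone no longer makes a model covered, so the dollar threshold is the one that binds.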

Critical Harm

The next important definition is critical harm. Once again, note the addition of ‘materially’ enabled versus past versions.

(g)  

  1. “Critical harm” means any of the following harms caused or materially enabled by a covered model or covered model derivative:
    1. The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.
    2. Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from cyberattacks on critical infrastructure by a model conducting, or providing precise instructions for conducting, a cyberattack or series of cyberattacks on critical infrastructure.
    3. Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from an artificial intelligence model engaging in conduct that does both of the following:
      1. Acts with limited human oversight, intervention, or supervision.
      2. Results in death, great bodily injury, property damage, or property loss, and would, if committed by a human, constitute a crime specified in the Penal Code that requires intent, recklessness, or gross negligence, or the solicitation or aiding and abetting of such a crime.
    4. Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive.
  2. “Critical harm” does not include any of the following:
    1. Harms caused or materially enabled by information that a covered model or covered model derivative outputs if the information is otherwise reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative.
    2. Harms caused or materially enabled by a covered model combined with other software, including other models, if the covered model did not materially contribute to the other software’s ability to cause or materially enable the harm.
    3. Harms that are not caused or materially enabled by the developer’s creation, storage, use, or release of a covered model or covered model derivative.
  3. [The $500 million will be adjusted for inflation]

So that’s either mass casualties from a CBRN incident, or related incidents that are as bad or worse than $500m+ in damages to critical infrastructure (basically anything economically important or that matters a lot for health and safety), either via precise instructions in a way not otherwise available, or done autonomously.

This requires that the developer and the AI model either cause or materially contribute to the event, not merely that it be part of a larger piece of software. And information only counts if you could not have gotten it from other sources that are not covered models (e.g. everything that exists today).

Full Shutdown

Because people are misrepresenting it, here is the definition of Full Shutdown.

(k)  “Full shutdown” means the cessation of operation of all of the following:

  1. The training of a covered model.
  2. A covered model controlled by a developer.
  3. All covered model derivatives controlled by a developer.

Open model advocates claim that open models cannot comply with this, and thus this law would destroy open source.

They have that backwards. Copies outside developer control need not be shut down.

Under the law, that is. It could still be a pretty big real world problem. If the time comes that we need a model to be shut down, and it is an open model, then we cannot do that. Potentially this could mean that humanity is deeply f***ed.

Open models can be unsafe, in part, exactly because it is impossible to shut down all copies of the program, or to take it down from the internet, even if it is not attempting some sort of rogue behavior.

The good news for open model developers is that we have decided this is not their problem. You need not shut down copies you cannot shut down. Instead of being held to the same rules as everyone else, open models get this special exemption.

Safety and Security Protocol

The final important term is what frontier model developers must offer us: A Safety and Security Protocol, hereafter SSP.

(o)  “Safety and security protocol” means documented technical and organizational protocols that meet both of the following criteria:

  1. The protocols are used to manage the risks of developing and operating covered models and covered model derivatives across their life cycle, including risks posed by causing or enabling or potentially causing or enabling the creation of covered model derivatives.
  2. The protocols specify that compliance with the protocols is required in order to train, operate, possess, and provide external access to the developer’s covered model and covered model derivatives.

Clause one says that the protocols specify how you will manage the risks of and arising from the development and operation of your model.

That seems like what a safety and security protocol would do.

This includes causing or enabling the creation of covered model derivatives, since that is a thing you might cause to happen that might pose important risks.

Clause two says that your protocols require you obey your protocols. Seems fair.

As long as your protocol says you have to follow it, it counts as an SSP no matter what else it says.

Technically under this definition, Meta can turn in an SSP that says “Our protocol is Lol we’re Meta, we laugh at your safety and security, and we have to follow this protocol of having to laugh prior to release and otherwise we are Ron Swanson and we do what we want because, again, Lol we’re Meta. We got this.”

That SSP would, of course, not satisfy the requirements of the next section. But it would absolutely count as an SSP.

On Your Marks

As a reminder: If your model does not require $100 million in compute, your model is not covered, and this law does not apply to you. This reading will assume that you are indeed training a covered model, which means you are also over the compute threshold which starts at 10^26 flops.

Section 22603 (a) covers what you have to do before you begin initial training of a covered model. I’ll quote the requirements in full so you can reference them as needed, then translate into normal English, then condense it again.

There’s a bunch of mostly duplicative language here, I presume to guard against various potential loopholes and especially to avoid vagueness.

  1. Implement reasonable administrative, technical, and physical cybersecurity protections to prevent unauthorized access to, misuse of, or unsafe post-training modifications of, the covered model and all covered model derivatives controlled by the developer that are appropriate in light of the risks associated with the covered model, including from advanced persistent threats or other sophisticated actors.
    1. Implement the capability to promptly enact a full shutdown.
    2. When enacting a full shutdown, the developer shall take into account, as appropriate, the risk that a shutdown of the covered model, or particular covered model derivatives, could cause disruptions to critical infrastructure.
  2. Implement a written and separate safety and security protocol that does all of the following:
    1. Specifies protections and procedures that, if successfully implemented, would successfully comply with the developer’s duty to take reasonable care to avoid producing a covered model or covered model derivative that poses an unreasonable risk of causing or materially enabling a critical harm.
    2. States compliance requirements in an objective manner and with sufficient detail and specificity to allow the developer or a third party to readily ascertain whether the requirements of the safety and security protocol have been followed.
    3. Identifies a testing procedure, which takes safeguards into account as appropriate, that takes reasonable care to evaluate if both of the following are true:
      1. A covered model poses an unreasonable risk of causing or enabling a critical harm.
      2. Covered model derivatives pose an unreasonable risk of causing or enabling a critical harm.
    4. Describes in detail how the testing procedure assesses the risks associated with post-training modifications.
    5. Describes in detail how the testing procedure addresses the possibility that a covered model or covered model derivative can be used to make post-training modifications or create another covered model in a manner that may cause or materially enable a critical harm.
    6. Describes in detail how the developer will fulfill their obligations under this chapter.
    7. Describes in detail how the developer intends to implement the safeguards and requirements referenced in this section.
    8. Describes in detail the conditions under which a developer would enact a full shutdown.
    9. Describes in detail the procedure by which the safety and security protocol may be modified.
  3. Ensure that the safety and security protocol is implemented as written, including by designating senior personnel to be responsible for ensuring compliance by employees and contractors working on a covered model, or any covered model derivatives controlled by the developer, monitoring and reporting on implementation.
  4. Retain an unredacted copy of the safety and security protocol for as long as the covered model is made available for commercial, public, or foreseeably public use plus five years, including records and dates of any updates or revisions.
  5. Conduct an annual review of the safety and security protocol to account for any changes to the capabilities of the covered model and industry best practices and, if necessary, make modifications to the policy.
  6. [Also you must]
    1. Conspicuously publish a copy of the redacted safety and security protocol and transmit a copy of the redacted safety and security protocol to the Attorney General.
    2. A redaction in the safety and security protocol may be made only if the redaction is reasonably necessary to protect any of the following:
      1. Public safety.
      2. Trade secrets, as defined in Section 3426.1 of the Civil Code.
      3. Confidential information pursuant to state and federal law.
    3. The developer shall grant to the Attorney General access to the unredacted safety and security protocol upon request.
    4. A safety and security protocol disclosed to the Attorney General pursuant to this paragraph is exempt from the California Public Records Act (Division 10 (commencing with Section 7920.000) of Title 1 of the Government Code).
    5. If the safety and security protocol is materially modified, conspicuously publish and transmit to the Attorney General an updated redacted copy within 30 days of the modification.
  7. Take reasonable care to implement other reasonable appropriate measures to prevent covered models and covered model derivatives from posing unreasonable risks of causing or materially enabling critical harms.

Here’s what I think that means in practice, going step by step with key concepts in bold.

I will use ‘[unreasonable risk]’ as shorthand for ‘unreasonable risk of causing or materially enabling a critical harm’ to avoid repeating the entire phrase.

You must:

  1. Implement cybersecurity.
  2. Be able to safely do a full shutdown.
  3. Implement a written safety and security protocol (SSP). The SSP has to:
    1. Take reasonable care to ensure the resulting models (including derivative models) do not pose an [unreasonable risk].
    2. Have requirements that can be evaluated.
    3. Have a testing procedure to determine if the models or potential derivative models would pose an [unreasonable risk].
    4. Explain how this checks for risks from potential post-training modifications.
    5. Explain how this checks for using the model to make post-training modifications or create another covered model in a way that poses an [unreasonable risk].
    6. Explain how you do all this.
    7. Explain how you will implement the safeguards and requirements.
    8. Explain under what conditions you would implement a shutdown.
    9. Describe the procedure for modifying the safety protocol.
  4. Implement all this, and make senior people responsible for doing this.
  5. Keep an up to date unredacted copy in your records.
  6. Review and if necessary in light of events modify the policy yearly.
  7. Show it to the Attorney General upon request. Publish a redacted copy.
    1. Redactions are for confidential information, trade secrets and public safety.
    2. If you modify it, publish the changes within 30 days.
  8. Take reasonable care and other reasonably appropriate measures to prevent [unreasonable risks].

Can we condense that into something easier to grok? We absolutely can.

From this point forward, [risk] is shorthand for ‘unreasonable risk of causing or materially enabling a critical harm.’

You must have, implement and with redactions publish a written SSP.

That SSP must do and explain reasonable versions of all of the following, explain how it intends to do them, and otherwise take reasonable care to avoid [risks]:

  1. Cybersecurity.
  2. Ability to do a full shutdown, and a trigger action plan for when to do it.
  3. Has a testing procedure to measure [risk], with verifiable outcomes.
  4. Implement safeguards and requirements.
  5. Have a formal procedure for modifying the safety protocol.
  6. Be reviewed annually, modify as needed, republish within 30 days if you modify it.

Can we condense that one more time? It won’t be exact, but yeah, let’s do it.

  1. You need to write down and say what you plan to do and what tests, safeguards and procedures you will use. That’s the SSP.
  2. That includes telling us how you plan to do it and why you are doing it.
  3. You must review the plan once a year and publish any changes.
  4. What you do must be reasonable, include a shutdown procedure, and include sufficient cybersecurity.

This last one is a lossy explanation, but if you agree that all of these are rather clearly good elements it mostly boils down to:

  1. You have to lay out reasonable safety procedures for us in advance.
  2. Then, later, implement them.

This seems like a good place for an interlude on the legal meaning of ‘reasonable.’

Reasonable People May Disagree

<Morpheus voice> What is reasonable? How do you define reasonable? </voice>

The word is all over American common law. Reasonable care is the default standard.

What it definitely doesn’t mean is having to ‘prove’ anything is safe, or to ensure that something never goes wrong, or to take the maximum possible precautions.

Here are the technical definitions, which I agree are not the clearest things of all time.

Law.com:

Reasonable. adj., adv. in law, just, rational, appropriate, ordinary or usual in the circumstances. It may refer to care, cause, compensation, doubt (in a criminal trial), and a host of other actions or activities.

Reasonable care. n. the degree of caution and concern for the safety of himself/herself and others an ordinarily prudent and rational person would use in the circumstances. This is a subjective test of determining if a person is negligent, meaning he/she did not exercise reasonable care.

Here’s Nolo’s definition of Reasonable Care:

The degree of caution and attention that an ordinarily prudent and rational person would use. In personal injury law, “reasonable care” is the yardstick that’s used to determine:

  • if someone was negligent in connection with an accident or other incident, and
  • whether they can be held liable for resulting losses (“damages”) suffered by others as a result of any negligence.

Put a bit differently, in the eyes of the law, a person who fails to act with “reasonable care” in a given situation can be held legally responsible if that failure causes someone else harm.

So, for example, let’s say Darcy is driving twenty miles per hour over the speed limit with her vehicle’s headlights off one night, when she runs a stop sign. Darcy isn’t acting with the amount of reasonable care that’s required of drivers. So, any car accident she’s involved in will likely be deemed her fault, and she will be on the legal hook for resulting injuries, vehicle damage, and other harm.

Reasonable care is what a reasonable person would do in this situation.

Or, alternatively, it is the absence of negligence, although that is technically circular.

Ultimately, it means whatever judges and juries decide it means, in context, after hearing the arguments including expert testimony.

This is how common law works. As I understand it, this is what everyone is always responsible for doing anyway. You have a general duty, in our civilization, to not go around causing harm to others.

It is always your responsibility under common law to take reasonable care, also known as to not be negligent. I realize some people think this principle should not apply to them. I believe they are mistaken. I think it is a highly reasonable thing to ask for.

If you do not take reasonable care, and large harms ensue, in any endeavor, you are quite likely going to get sued, and then you are (probably) going to lose.

This is a flexible standard. Companies get to determine what is reasonable. As we will see later on, pre-harm enforcement is narrow. Post-harm you were already getting sued under the same standard.

(Aside during the interlude: A lot of the reason the SB 1047 debate is so bizarre is that (1) it involves so many hardcore libertarians, including on the pro-1047 side, and (2) most of the people involved have never looked at or debated legal language in this kind of detail. So many of the objections and confusions over SB 1047 come from people who are acting incredulous over very routine features of our common law, such as ‘the law is largely based around the word reasonable,’ ‘you have a duty to take reasonable care to not hurt others,’ ‘one must pick threshold numbers’ and ‘lying to the government on various forms or falsifying documents is usually a crime, such as (famously with Donald Trump) literally Falsification of Business Records.’ But yes, all of this is confusing and not intuitive, especially to a libertarian perspective, if you haven’t encountered it before, so to the genuinely confused: Notice this, also I sympathize.)

You, the developer of the AI that you contend is a sensible (reasonable?) thing to deploy, would not want a non-flexible standard of responsibility here, with either:

  1. The government telling you exactly which tests you must run and precautions you must take, or demanding that you let the government run its own tests. OR
  2. Strict liability, where all that matters is that the harm did happen.

Here’s Claude:

The term “reasonable care” in this legislation is a flexible legal standard that recognizes the complexity and rapid evolution of AI technology. It doesn’t prescribe a one-size-fits-all approach, but instead allows for context-specific judgments. Here’s what it means in practice:

  1. Industry standards: What’s considered “reasonable” is often benchmarked against current industry best practices. It’s not about perfect foresight, but rather about staying informed and aligned with what other responsible actors in the field are doing.
  2. Proportionality: The level of care expected is proportional to the potential risks involved. Higher-risk AI systems would require more robust safeguards than lower-risk ones.
  3. Cost-benefit analysis: “Reasonable” takes into account the balance between the cost or burden of implementing safety measures and the potential benefits or risk reduction achieved. It doesn’t require implementing every conceivable safety measure regardless of cost or practicality.
  4. Evolving nature: What’s considered “reasonable” can change over time as technology and understanding of risks evolve. This flexibility allows the law to remain relevant without constant legislative updates.
  5. Expert input: Determinations of what’s “reasonable” often involve input from technical experts, not just legislators or regulators.
  6. Good faith efforts: It’s about making sincere, informed efforts to address risks, not about achieving perfection or having omniscient foresight.

If you want to explore this further, I highly encourage you to use the LLM of your choice. They are very good at this sort of thing, and you can verify that the prompting has been fair.

Here is a good article on negligence, the absence of which is reasonable care.

Here’s an article that attempts to go into it, as that author puts it:

Image

This paper might also be helpful in places, given the context.

It does seem absurd at first glance to decide half of all legal cases on all topics based on what people decide the word ‘reasonable’ means in a given context. Except I notice I cannot think of a superior practical alternative, either here or in general.

Unless you prefer strict liability, or a government created checklist designed to be robust against gaming the system, or a government panel of ‘experts.’

Didn’t think so.

Also contrast this with nuclear power’s standard of ‘as much safety as can be reasonably achieved,’ which does not balance costs versus benefits. Reasonable care absolutely does balance costs versus benefits.

Release the Hounds

We now move on to section (b), which tells what you must do before your covered model is released.

We have a change in language here to ensure no catch-22s: you can take actions related to training or reasonable evaluation or compliance without triggering these requirements, so you don’t have to worry about being unable to comply without first having to comply.

What do you have to do?

  1. Run your safety tests.
  2. Record the results, and the test procedures, to allow for outside replication.
  3. Take reasonable care to implement appropriate safeguards to prevent [risk].
  4. Take reasonable care that your model’s actions can be reliably attributed to you, including derivative models.

Again, reasonable care is the default background legal standard.

If a required effort would be unreasonable, then you do not have to do it.

If you cannot take reasonable care to implement appropriate safeguards to prevent [risk], I would suggest that this is an actual physical problem with your plan, and you should halt and catch fire until you have fixed it.

If you believe you cannot take reasonable care to allow your model’s actions to be attributed to you, that seems like a misunderstanding of the term reasonable care. This does not require perfect attribution, only that you do what a reasonable person would do here, and the measures can be balanced under this standard against the context. Contrast with AB 3211 and its call for retroactive ‘99% reliable’ watermarking.

Smooth Operator

What do you have to do going forward now that you’re ready to release?

(c) says you should not use or make available a model that poses unreasonable [risk].

(d) says you need to reevaluate your safety procedures yearly.

(e) says that starting in 2026 you should annually retain a third-party auditor that uses best practices to ensure compliance with this section, including what the developer did to comply, any instances of non-compliance, a detailed assessment of the internal controls and a signature, with records to be kept and made available, and a redacted copy published.

The audit requirement is the first case of ‘requirement that one might reasonably say imposes real costs that wouldn’t be otherwise necessary,’ if you mostly see this as wasteful rather than an opportunity and important robustness check. One could also see it as money well spent. My presumption is that given the $100 million threshold the audit should be highly affordable, but it is not my area of expertise.

(f) requires an annual submission of a signed report of compliance, so long as the model is still being provided by the developer for use, assessing the degree of risk of critical harm by the model and a documentation of the compliance procedure.

This seems easy to generate from other work you already have to do, unless there is something actively risky going on.

(g) Any safety incidents must be reported to the Attorney General (AG) within 72 hours.

72 hours is the standard length of time before you are required to report cybersecurity incidents. Reporting a safety incident is highly similar to reporting a cybersecurity incident. Contrast with AB 3211, which only gives 24 hours for things much less likely to be emergencies.

(h) says you have to report deployment of a new model to the AG within 30 days, which does not apply to a covered derivative model.

(i)  In fulfilling its obligations under this chapter, a developer shall consider industry best practices and applicable guidance from the U.S. Artificial Intelligence Safety Institute, National Institute of Standards and Technology, the Government Operations Agency, and other reputable standard-setting organizations.

‘Shall consider’ is a weak standard. Yes, of course you should ‘consider’ the industry best practices and any reputable standards on offer. This is clarifying that they get factored into what is ‘reasonable,’ not that you have to follow them.

(j) says this does not apply if it would conflict with a contract with a federal government entity, for the purposes of working with that entity. I’m pretty sure this was already true, but sure, let’s specify it to be safe.
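The fixed clocks mentioned so far are simple enough to sketch as date arithmetic. This is only an illustration under assumptions of mine: the event names are made up, the 30-day windows are treated as calendar days, and this summary does not say exactly when each clock starts, so the starting timestamp is whatever you pass in.

```python
from datetime import datetime, timedelta

# Windows taken from the summary above; the keys are hypothetical labels.
DEADLINES = {
    "safety_incident_report": timedelta(hours=72),      # (g)
    "new_model_deployment_notice": timedelta(days=30),  # (h)
    "republish_modified_ssp": timedelta(days=30),       # SSP modification rule
}


def due_by(event: str, clock_start: datetime) -> datetime:
    """Return the latest time by which the given event must be handled.

    Assumes the window runs in plain elapsed time from clock_start.
    """
    return clock_start + DEADLINES[event]


incident_time = datetime(2026, 1, 1, 9, 0)
print(due_by("safety_incident_report", incident_time))
```

A real compliance calendar would need the statutory trigger for each clock, which this sketch deliberately leaves to the caller.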

Compute Cluster Watch

The compute cluster monitoring requirements are essentially unchanged.

This is essentially a KYC (know your customer) requirement for sufficiently large compute clusters – a ‘computing cluster’ is defined as ‘a set of machines transitively connected by data center networking of over 100 gigabits per second that has a theoretical maximum computing capacity of at least 10^20 integer or floating-point operations per second and can be used for training artificial intelligence.’

They also have to be able to shut down the processes in question, if necessary.

The operator of the cluster, before selling sufficiently large amounts of compute that one could train a covered model, must implement written procedures to:

  1. Obtain basic KYC information: Identity, means and source of payment, email and telephone contact information.
  2. Assess whether this is a training run for a covered model, before each sufficiently large utilization of compute.
  3. Retain records of the customer’s Internet Protocol (IP) address.
  4. Show upon request that they are retaining these records.
  5. Implement ability to do a shutdown.
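To make the definition and the checklist above concrete, here is a rough sketch. The three prongs of the cluster test come from the quoted definition; the class, field names and procedure labels are hypothetical names of my own, not anything in the bill.

```python
from dataclasses import dataclass

# Thresholds quoted in the definition above.
MIN_NETWORK_GBPS = 100         # networking must be "over 100 gigabits per second"
MIN_PEAK_OPS_PER_SEC = 10**20  # "at least 10^20" operations per second


@dataclass
class Cluster:
    # Hypothetical record; field names are illustrative only.
    network_gbps: float
    peak_ops_per_sec: float
    usable_for_ai_training: bool


def is_computing_cluster(c: Cluster) -> bool:
    """True only if all three prongs of the quoted definition are met."""
    return (
        c.network_gbps > MIN_NETWORK_GBPS
        and c.peak_ops_per_sec >= MIN_PEAK_OPS_PER_SEC
        and c.usable_for_ai_training
    )


# The five written procedures listed above, with made-up labels.
REQUIRED_PROCEDURES = (
    "kyc_identity_payment_contact",
    "assess_covered_model_training",
    "retain_customer_ip_records",
    "show_records_on_request",
    "shutdown_capability",
)


def missing_procedures(implemented: set) -> list:
    """Return required procedures not yet implemented, in list order."""
    return [p for p in REQUIRED_PROCEDURES if p not in implemented]
```

Note that the networking prong is strict ("over") while the compute prong is inclusive ("at least"), which is why the two comparisons differ.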

I think it makes sense to know who is buying enough compute to plausibly train $100m+ generative AI models, and to ensure an off switch is available for them. It is hard for me to imagine a future in which this is not implemented somehow.

These are not expensive requirements, and seem like they are necessary, including as part of our attempts to place effective export controls on chips.

The main argument I have heard against this clause is that KYC is a federal task rather than a state task, and this may have conflicts with the feds. My response is that (1) the federal government is currently doing a form of this only via the Biden executive order, which is about 50/50 to be gone in a few months and (2) to the extent it is and remains already required nothing changes.

Price Controls are Bad

There was previously a clause (section 22605) requiring uniform publicly available pricing of any compute offered for purchase. This has been removed.

I consider this a strict improvement to the bill. That clause was out of place and had the potential to cause misallocations and interfere with the market. There are good reasons you would want to mandate price transparency and consistency in some other markets, but not under conditions like this.

A Civil Action

Section 22606 allows the Attorney General to bring a civil action for violations.

The quality of your SSP, and whether you followed it, is to be taken into consideration when determining whether reasonable care was exercised.

The fines of 10%-30% of compute cost are now post-harm (or imminent risk) enforcement only. That is a huge change. It is rather scary in the context of potentially catastrophic harms, and a big compromise, to only issue serious fines after large harms have taken place.

If there is an outright violation prior to that, meaning reasonable care is not being exercised, then you can potentially be told to fix it or to stop (injunctive relief can still be sought), but that is a high bar to clear.

The details matter a lot in places like this, so listing the clause for reference.

22606. (a) Attorney General may bring a civil action for a violation of this chapter and to recover all of the following:

  1. For a violation that causes death or bodily harm to another human, harm to property, theft or misappropriation of property, or that constitutes an imminent risk or threat to public safety that occurs on or after January 1, 2026, a civil penalty in an amount not exceeding 10 percent of the cost of the quantity of computing power used to train the covered model to be calculated using average market prices of cloud compute at the time of training for a first violation and in an amount not exceeding 30 percent of that value for any subsequent violation.
  2. For a violation of Section 22607 [the whistleblower protection provisions] that would constitute a violation of the Labor Code, a civil penalty specified in subdivision (f) of Section 1102.5 of the Labor Code.
  3. For a person that operates a computing cluster for a violation of Section 22604, for an auditor for a violation of paragraph (6) of subdivision (e) of Section 22603, or for an auditor who intentionally or with reckless disregard violates a provision of subdivision (e) of Section 22603 other than paragraph (6) or regulations issued by the Government Operations Agency pursuant to Section 11547.6 of the Government Code, a civil penalty in an amount not exceeding fifty thousand dollars ($50,000) for a first violation of Section 22604, not exceeding one hundred thousand dollars ($100,000) for any subsequent violation, and not exceeding ten million dollars ($10,000,000) in the aggregate for related violations.
  4. Injunctive or declaratory relief.
  5. [There are also…]
    1. Monetary damages.
    2. Punitive damages pursuant to subdivision (a) of Section 3294 of the Civil Code.
  6. Attorney’s fees and costs.
  7. Any other relief that the court deems appropriate.

[blank]

In determining whether the developer exercised reasonable care as required in Section 22603, all of the following considerations are relevant but not conclusive:

  1. The quality of a developer’s safety and security protocol.
  2. The extent to which the developer faithfully implemented and followed its safety and security protocol.
  3. Whether, in quality and implementation, the developer’s safety and security protocol was inferior, comparable, or superior to those of developers of comparably powerful models.
  4. The quality and rigor of the developer’s investigation, documentation, evaluation, and management of risks of critical harm posed by its model.
  1. [Provisions to get around this liability are void.]
  2. [Joint and several liability should apply if entities took steps to purposefully limit such liability.]

[Money goes to Public Rights Law Enforcement Special Fund.]

This section does not limit the application of other laws.

Notice the big effective change in Section 1. The civil penalty now only applies if you have already caused the harm or you create an imminent risk. There are no longer large pre-harm fines for violations. This is on top of the fact that, provided you supply an SSP that nominally covers its requirements, you now have to fail to take reasonable care, or fail to follow your own procedures, to violate the statute.
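As a back-of-the-envelope illustration of the cap in paragraph 1, here is a minimal sketch. The function and parameter names are mine, the harm-or-imminent-risk condition is collapsed into a single boolean, and the compute cost is taken as a given input even though the clause says it is to be calculated from average market prices of cloud compute at the time of training.

```python
def max_civil_penalty(
    training_compute_cost_usd: int,
    prior_violations: int,
    caused_harm_or_imminent_risk: bool,
) -> float:
    """Upper bound on the paragraph-1 civil penalty, as summarized above.

    Returns 0 when the violation neither caused harm nor posed an imminent
    risk, since the penalty no longer applies pre-harm.
    """
    if not caused_harm_or_imminent_risk:
        return 0.0
    # 10 percent for a first violation, 30 percent for any subsequent one.
    percent = 10 if prior_violations == 0 else 30
    return training_compute_cost_usd * percent / 100


print(max_civil_penalty(100_000_000, 0, True))
print(max_civil_penalty(100_000_000, 1, True))
```

These are ceilings ("not exceeding"), not the amounts a court would necessarily award.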

There are no criminal penalties.

Section 5 says the Attorney General can also get monetary and punitive damages, as per normal, if there is sufficient actual harm for that.

Section 6 says the AG can get Attorney’s fees.

Section 2 says the whistleblower provisions are enforced same as other similar rules.

Sections 4 and 7 are the catch-alls that allow injunctive relief or other relief as appropriate. As in, if you are putting us all in danger, the court can take steps to get you to stop doing that, as necessary.

Section 3 says that fines for computing clusters that fail to do KYC, and the fines for auditors who lie, or who violate the regulations intentionally or via ‘reckless disregard,’ start out at $50k and cap out at $10 million, despite being in relation to $100m+ training runs. If anything, one could worry that this is a fine they could sign up to pay.
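To see how small these numbers are relative to the training runs in question, here is a quick sketch of the maximum total exposure for a run of related violations. It assumes every violation is fined at its per-violation ceiling, which is my simplification; the names are hypothetical.

```python
FIRST_VIOLATION_CAP = 50_000        # first violation
SUBSEQUENT_VIOLATION_CAP = 100_000  # each subsequent violation
AGGREGATE_CAP = 10_000_000          # total for related violations


def max_total_fine(related_violations: int) -> int:
    """Maximum combined fine for a set of related violations."""
    if related_violations <= 0:
        return 0
    uncapped = FIRST_VIOLATION_CAP + SUBSEQUENT_VIOLATION_CAP * (related_violations - 1)
    return min(uncapped, AGGREGATE_CAP)


for n in (1, 3, 200):
    print(n, max_total_fine(n))
```

Under these assumptions the aggregate cap only starts to bind at around a hundred related violations.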

Section 7b emphasizes the role of the SSP.

  1. If your SSP does not do much to ensure safety and security, or you fail to implement it, or to document the remaining [risks], this tells the court to take that into consideration.
  2. If you did do all that and the harm happened anyway, that is considered too.

The last clause, 7e, essentially means ‘the people you harm still get to sue you.’ This does not replace your existing liability under the common law.

Effectively, in terms of enforcement, SB 1047 is counting on its transparency requirements, and the ability to point retroactively to a company’s SSP and whether they followed it, to help enable regular lawsuits under common law, by those who are harmed. And it is counting on the potential for this, to hopefully get developers to act responsibly.

(Or, those who notice in advance that an SSP is inadequate can exert public pressure or otherwise try to do something else to address the situation.)

I see this new section as a big compromise. Direct enforcement is a lot harder now, especially enforcement that comes early enough to prevent harms. However it is still possible, when the situation is sufficiently dire, for the Attorney General to get advance injunctive relief.

Whistleblowers Need Protections

Whistleblower protections are necessary and good. They do still need to be calibrated.

A bizarre phenomenon recently was an attempt to say ‘oh sure you say you want the right to warn people about catastrophic or existential risks and insufficient safety precautions, but what that actually means is you want to spill all our intellectual property and justify it by saying the vibes are off’ or what not. I can often feel the hatred for the very idea of worrying about catastrophic harms, let alone existential risks, from across the screen.

Another bizarre phenomenon was OpenAI putting what sure seems to me like blatantly illegal anti-whistleblower provisions in their employment contracts. The SEC fines you for not explicitly putting in exemptions for whistleblowing, and OpenAI’s contracts did… rather the opposite of that.

And Anthropic in their letter expressed several concerns about overreach in the whistleblower protections, which was an interesting perspective to have.

There are still confidentiality considerations, and other practical considerations. Laws need to be written carefully. You do not want to accidentally require lots of prominent disclosures about whistleblowing every time you call for an ice cream truck.

So what exactly do the provisions do?

  1. Employees who have information they believe indicates lack of compliance, or an unreasonable risk of critical harm, must be allowed to tell the Attorney General or Labor Commissioner.
  2. If they do, the employer cannot then retaliate against the employee.
  3. Reminder: You can’t lie about your SSP in a way that violates that law.
  4. The reports can be released with redactions to the public or given to the Governor, if that is judged to serve the public interest.
  5. Developers must post notice to all employees working on covered models and their derivatives of these provisions, including employees of contractors and subcontractors, of their rights and responsibilities in all this.
    1. They can either do this by posting a notice, or providing written notice once a year.
    2. Only applies to those working on the models, not the ice cream trucks.
  6. Developers shall provide a reasonable internal process to anonymously disclose information about violations or risks or failure to disclose risks, without fear of retaliation, which also applies to contractors and subcontractors.
  7. They have to keep records of all this (but the contractors do not).

There are a bunch of annoying fiddly bits around exactly who has to do what. I believe this strikes the right balance between protecting whistleblowing and avoiding undue paperwork burdens and risks to confidential information.

No Division Only Board

The old version of SB 1047 created the Frontier Model Division, in order to have people in government who understood frontier models and could be tasked with implementing the law. Our government badly needs to develop core competency here.

This led to warnings, including from people like Tyler Cowen and Dean Ball, that if you create regulators, they will go looking for things to regulate, and inevitably do lots of other things over time. Certainly this has been known to happen. Anthropic also requested the elimination of the Frontier Model Division.

They all got their wish. There is no more Frontier Model Division. Reports that would have gone there now go other places, mostly the Attorney General’s office.

That still leaves some decisions that need to be made in the future. So what remains, in section 22609, is the Board of Frontier Models. I know I am.

The board will be housed in the Government Operations Agency.

It will have nine members, up from the previously planned five.

They will be:

  1. A member of the open source community.
  2. A member of the AI industry.
  3. An expert in CBRN (chemical, biological, radiological and nuclear) weapons.
  4. An expert in AI safety.
  5. An expert in cybersecurity of critical infrastructure.
  6. Two members appointed by the Speaker of the Assembly.
  7. Two members appointed by the Senate Rules Committee.

The first five are appointed by the Governor and subject to Senate confirmation. Based on what I know about California, that means ‘appointed by the Governor.’

Four of the nine are appointed with no requirements, so the majority could end up being pretty much anything. The five fixed members ensure representation from key stakeholders.

What exactly does the board do?

  1. Taking into account various factors, update the 10^26 flops and 3*(10^25) thresholds for covered models and derivative models yearly, starting some time before 1 January 2027. They cannot alter the $10m and $100m thresholds.
  2. Taking into account various factors (input from stakeholders including open-source community, academics, industry and government, and any relevant federal regulations or industry self-regulations), issue regulations to establish the requirements for audits as discussed earlier.
  3. Issue guidance for preventing [risk] that is consistent with any guidance from US AISI and NIST.

The flop thresholds will by default quickly become irrelevant given the $100m and $10m minimums. So this is an option to raise the threshold at which models are covered or become non-derivative, if the board sees fit after learning that relatively large amounts of compute do not pose much risk.

In practice I expect the dollar thresholds to mostly bind, and for the fine-tuning thresholds to rarely be hit.
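Here is a minimal sketch of how the two kinds of threshold interact, assuming (as the word ‘minimums’ above suggests) that a model must clear both the flop threshold and the dollar threshold to count. The class and function names are mine, and treating a value exactly at a threshold as covered is an assumption, not something this summary settles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Flop thresholds: the board may update these.
    covered_flop: float = 1e26
    derivative_flop: float = 3e25
    # Dollar thresholds: per the text above, the board cannot alter these.
    covered_cost_usd: float = 100_000_000
    derivative_cost_usd: float = 10_000_000


def is_covered_model(flop: float, cost_usd: float, t: Thresholds = Thresholds()) -> bool:
    """Both prongs must be met (boundary inclusivity is an assumption)."""
    return flop >= t.covered_flop and cost_usd >= t.covered_cost_usd


def is_covered_derivative(flop: float, cost_usd: float, t: Thresholds = Thresholds()) -> bool:
    """Same structure for fine-tunes, using the derivative thresholds."""
    return flop >= t.derivative_flop and cost_usd >= t.derivative_cost_usd


# Raising the flop threshold can only shrink the set of covered models.
raised = Thresholds(covered_flop=1e27)
print(is_covered_model(2e26, 150_000_000))          # covered under the defaults
print(is_covered_model(2e26, 150_000_000, raised))  # not covered once raised
print(is_covered_model(2e26, 50_000_000))           # below the dollar floor
```

Because the dollar prong is fixed, no setting of the flop thresholds in this sketch can pull a sub-$100m training run into coverage.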

Mostly what the board will do is issue guidance for preventing risks and set the requirements for audits.

There is no requirement that the risk prevention guidance be followed. The standard is based upon reasonable care.

The alternative would have been not to offer guidance. That seems worse.

Does CalCompute?

The bill would like Cal to Compute, to create a public cloud computing cluster.

A consortium is to be established to develop a framework that it will encourage the University of California to implement.

Whether or not to implement it depends on the University of California. It turns out you are not legally able to tell them what to do.

California will have to choose how much to fund it, although it can also take private donations.

I am not going to sweat the other details on this, and I don’t think you need to either.

In Which We Respond To Some Objections In The Style They Deserve

There are good reasons why someone might oppose SB 1047 in its final form.

There are also good questions and concerns to be raised.

Then there are reasons and objections that are… less good.

Throughout the SB 1047 process, many of the loudest objections were hallucinations or fabrications of things that were never in the bill, or that had already been removed from the bill, or based on wild misinterpretations and hyperbole about its implications.

I realized the extent of this when someone informed me that in their informal survey, only ~25% of people were confident the $100 million threshold could not be lowered. A lot of concerns about the bill were based on a specific possibility that Can’t Happen.

This is all very tricky. I have spent a lot of hours understanding this bill. Many raising these objections were genuinely confused about this. Many were misinformed by others. It was the first time many encountered laws in this much detail, or many of the legal concepts involved. I sympathize, I really do.

However, there has also been a well-funded, deliberate, intense and deeply dishonest campaign, led by a16z and Meta together with others in industry and venture capital, to knowingly spread false information about this bill, get others to do so, and coordinate to give a false impression of mass opposition to a bill that is actually favored approximately 65%-25% by not only the California public but also tech workers.

False Claim: The Government Can and Will Lower the $100m Threshold

I will reiterate once again: This cannot happen, unless a new law is passed. No government agency has the power to lower or eliminate the $100m threshold, which will be indexed for inflation. The Frontier Model Board can cause fewer models to be covered models, but cannot expand the definition in any meaningful way.

False Claim: SB 1047 Might Retroactively Cover Existing Models

Again, because this worry seems to be common, no, this cannot possibly happen (with the possible exception of 1-2 existing models that actually did cost >$100m to train, but that is not believed to be the case).

Moot or False Claim: The Government Can and Will Set the Derivative Model Threshold Arbitrarily Low

Due to the $10m threshold, which cannot be altered, this cannot be done.

Objection: The Government Could Raise the Derivative Model Threshold Too High

Theoretically this is possible, if the Frontier Model Board chose an absurdly high threshold of compute. In practice, it would matter little due to the nature of the ‘reasonable care’ standard, even if the new threshold survived a legal challenge – it would be clear that the developer’s actions did not materially enable the final model.

False Claim: Fine-Tuners Can Conspire to Evade the Derivative

This is a reasonable worry to raise, but my understanding is the common law knows about such tricks from many other contexts. If the actions are coordinated to evade the threshold, the court will say ‘who are you kidding’ and consider it triggered.

Moot Claim: The Frontier Model Division Inevitably Will Overregulate

The argument here was that if you create a regulator, they will go looking for something to regulate, whether you have empowered them to do that or not.

Whether or not that was a concern before, it is not now. RIP Frontier Model Division.

There is always a non-zero amount of worry about regulatory ramp-up over time, but the political economy situation should be far less concerning now.

False Claim: The Shutdown Requirement Bans Open Source

This is pure misunderstanding or fabrication.

The shutdown requirement only applies to copies of the model that are under developer control.

If you release the weights of your model, and others run copies not under your control, you do not need to be able to shut those copies down.

This clause not only does not ban open models: It effectively has a special exception for open models, which are being given a free pass from an important safety requirement.

Is this a serious security problem for the world? It might well be!

But if people say open weight models cannot satisfy the shutdown requirement, then they are either misinformed or lying. It is a huge tell.

Objection: SB 1047 Will Slow AI Technology and Innovation or Interfere with Open Source

This will at first slow down frontier AI models in particular by a small non-zero amount versus no regulation at all, but the additional safety efforts could easily end up more than compensating for that, as companies share best practices, risks are spotted and more robust products are built, and major setbacks are avoided. Most AI companies can ignore the bill completely.

This is a remarkably light touch bill. Most AI bills would slow things down far more. Claims of crippling innovation were always absurd, and are even more absurd now after the recent changes.

SB 1047 will also slow down AI a lot less than an actual critical harm. In the absence of regulation, there is more likely to be a critical harm. One of the consequences of that critical harm would likely be very loud demands to Do Something Now. The resulting bill would almost certainly be vastly worse and harsher than SB 1047, and impose much greater burdens on AI.

SB 1047 will also slow down AI a lot less, in my assessment, than the ‘regulate particular capabilities’ approach of things like the EU AI Act, while being much better at preventing critical harm.

In particular, SB 1047 has no direct impact whatever on startups that are not seeking frontier models that cost over $100m in compute to train, which are all but a handful of startups. It has zero direct impact on academics.

There is, however, one scenario in which AI is slowed down substantially by SB 1047.

That scenario is one in which:

  1. The release of at least one frontier model or the weights of a frontier model…
  2. …would have happened without SB 1047…
  3. …but turns out to be incompatible with the need to take reasonable care…
  4. …to reduce the risk of critical harms caused or materially enabled by the model.

In other words, the scenario where AI is slowed down is the one in which:

  1. The AI in question would substantially increase our risk of critical harms.
  2. Taking reasonable care to mitigate this would invalidate the business model.
  3. But you were thinking you’d go ahead without taking reasonable care.

In that case, until the problem is solved, the model or its weights might now not be released. Because it would put us at substantial risk of critical harm.

So you know what I have to say to that?

Doing that would be super dangerous! Don’t do that!

Come back to me when you find a way to make it not super dangerous. Thanks.

If that turns out to differentially hamper open models, it will be exactly because and to the extent that open models are differentially dangerous, and put us at higher risk of catastrophic harm.

If you think that developers:

  1. If they notice that their model puts us at risk of catastrophic events.
  2. And they realize this risk will be caused or materially enabled, in a way that would not be possible using alternative tools.
  3. And they notice that they cannot take reasonable care to mitigate this, and still release the model the way they intended to.
  4. And notice this is negligence and they could be sued for a fortune if it happened.
  5. SHOULD THEN RELEASE THE MODEL THAT WAY ANYWAY?

If you think that is good and important? Then you and I disagree. Please speak directly into this microphone. I agree that you would then have good reason to oppose this bill, if your preference on this was strong enough that you feel it overrides the good otherwise done by the transparency provisions and whistleblower protections.

False Claim: This Effectively Kills Open Source Because You Can Fine-Tune Any System To Do Harm

This is mostly the same as the above, but I want to make this as clear as possible.

The model being part of a set of actions that does harm is insufficient to cause liability under this bill.

Someone fine-tuning your model and then using it to cause harm is insufficient.

The developer would only be liable if ALL of the following are true:

  1. The harm was caused or materially enabled…
  2. …not only by the model itself…
  3. …but by the failure of the original developer to take reasonable care.

To be caused or materially enabled, there cannot have been alternative means. Silly arguments like ‘VSCode also can help you hack computers’ are irrelevant. For you to have a problem under this bill, the harm has to be caused by your failure to take reasonable care. It has to be something the hacker could not have done anyway using other means.

If your belief is that there is nothing Meta can do to stop Llama from ‘doing nasty stuff,’ and that nasty stuff will cause catastrophic harms, but someone could have caused those harms anyway via other means? Then Meta does not have a problem.

If your belief is that there is nothing Meta can do to stop Llama from ‘doing nasty stuff,’ and that nasty stuff will cause catastrophic harms, that would not have otherwise happened via alternative means, exactly due to Meta’s failure to take reasonable care? Which means Meta taking reasonable care is impossible?

Well, that sounds like an actual real world problem that Meta needs to solve before proceeding with its release plan, no? If your business model would negligently put us at risk for critical harm, then I think it is high time you fixed that problem or got a new business model.

If you disagree, once again, please speak directly into this microphone.

False Claim: SB 1047 Will Greatly Hurt Academia

SB 1047 does not apply, in any way, to academics, unless at minimum they train a model costing $100m in compute, or they fine-tune a model using $10m in compute. Neither of these two things is going to happen.

That means the only way that academia could be impacted at all would be indirectly, if dangerous frontier models were held back and thus less available to the academics. The vast majority of academic work already does not take place on frontier models due to inference costs, and Anthropic and others have shown great willingness to work with the few academics who do need frontier model access.

As far as I can tell, most of those claiming this are simply saying it without any logic or evidence or causal story at all – it feels or sounds true to them, or they think it will feel or sound true to others, and they did not care whether it was accurate.

False Claim: SB 1047 Favors ‘Big Tech’ over ‘Little Tech’

Quite the opposite. SB 1047 literally applies to Big Tech and not to Little Tech.

SB 1047 is opposed by Amazon, OpenAI, Google and Meta.

It is quite the story that regulatory requirements that apply only to Big Tech, and that are opposed by Big Tech, will instead cripple Little Tech and advantage Big Tech.

The only story for how this happens, that is not complete nonsense or straight up lying, is that this:

  1. Will hurt Big Tech’s (probably Meta’s) ability to release future open models…
  2. …because they would by doing so materially enable critical harms and Meta was unable to take reasonable care to mitigate this…
  3. …and by hurting this Big Tech company and preventing it from negligently releasing a future frontier model, the law will hurt Little Tech.

In which case:

  1. I refer you to the ‘slow technology and innovation’ question for why I am fine with that scenario: if your plan requires you to not take reasonable care to prevent catastrophic events, then I am biting the bullet that you need a new plan, and…
  2. Meta would already be wise not to release such a model…
  3. …because of its liability under existing law to take reasonable care.

This is even more true with the modifications to the bill, which greatly reduce the risk of regulatory capture or of the requirements becoming perverse.

False Claim: SB 1047 Would Cause Many Startups To Leave California

Yeah, no. That’s obvious nonsense.

With at most a handful of notably rare exceptions, SB 1047 will not apply to startups.

The only way this could hurt startups is indirectly, if SB 1047 prevents the release, or prevents the open release of the weights, of a future frontier model.

(Or, and this is super thin but I want to be fully accurate, perhaps by slightly reducing capital available for or interest in acquisitions of those startups by Big Tech? I guess?)

If that did happen, then the change would hit everyone equally, regardless of location. There is no such thing as a model that is open in Texas and closed in California.

I suppose, in theory, startups could leave California if various members of the ecosystem did some combination of misunderstanding the implications or walking away in some sort of huff. But it wouldn’t be due to what the bill actually does.

In any case, if anyone wants to bet startups will leave California, I will take that action.

Objection: Shutdown Procedures Could Be Hijacked and Backfire

It seems like the most basic of common sense that if you are going to create a highly capable AI, you want to have a way to turn it off in an emergency.

One could however worry about misuse of that capability.

What if China found a way to trigger the shutdown capability? Or a hacker or a terrorist? What if the US Government demands you use it when they shouldn’t? What if an internal decision maker freaks out? What if a future rival AI does it?

I suppose these things are possible.

Obviously, you would want to build in safeguards to prevent this from happening on economically important AI clusters. And you would want to have the ability to spin the whole thing up again quickly, if this is done in error or by an enemy. And you would want to strengthen your cybersecurity for so, so many reasons anyway.

I believe that the danger of being unable to shut an AI down vastly exceeds the danger of being able to shut that AI down.

(Ideally, I would want us to mandate that Nvidia builds kill switches into their high end AI chips, but that’s another issue.)

Objection: The Audits Will Be Too Expensive

Reasonable care to reduce [risk], writing up one’s safety strategies and sharing them with the public so we can punch holes in them or so other companies can use them, are things companies should be doing anyway. Indeed, it is exactly the ethos of open source: Open source your safety plan.

Money and time spent there is, in my opinion, money and time well spent.

The yearly required audits, however, could potentially constitute a substantial expense or too much paperwork. Some of that will be actually useful work to ensure safety, but a lot of it also will be to ensure compliance.

So how much will this cost and will it be a practical issue?

My very lightly traded Manifold market says that there is an 81% chance that if SB 1047 is enacted, an audit will be available for less than $500k. The trading history and the impact of the cost of capital clearly indicate the chance should be even higher.

This only applies to models that cost more than $100m in compute to train. Models tend to only have a lifecycle of a few years before being replaced. So it seems highly unlikely this cost will be more than 1% of the training compute cost.
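
As a rough check on that 1% figure, here is a back-of-the-envelope sketch. The $500k yearly audit cost is the ceiling from the Manifold market above and the $100m is the minimum training cost for a covered model; the lifecycle lengths and the function name are illustrative assumptions, not figures from the bill.

```python
# Back-of-the-envelope sketch; the lifecycle lengths below are assumptions.
AUDIT_COST_PER_YEAR = 500_000      # ceiling used in the Manifold market above
MIN_TRAINING_COST = 100_000_000    # minimum compute cost for a covered model


def audit_share_of_training_cost(lifecycle_years: int,
                                 training_cost: int = MIN_TRAINING_COST) -> float:
    """Total audit spend over the lifecycle as a fraction of training compute cost."""
    return AUDIT_COST_PER_YEAR * lifecycle_years / training_cost


if __name__ == "__main__":
    for years in (1, 2):
        share = audit_share_of_training_cost(years)
        print(f"{years} year(s) of audits: {share:.1%} of training compute cost")
```

Under these assumptions, a model that costs exactly $100m and is audited for two years at the full $500k lands right at 1%. Cheaper audits or a more expensive training run push the share lower, while a longer lifecycle at the minimum training cost pushes it somewhat higher.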

In exchange you get the benefits of checking periodically, verifying compliance and transparency, which I view as substantial.

There is still a downside, but I consider it a minor and highly acceptable one.

Objection: What Is Illegal Here is Already Illegal

If you fail to take reasonable care, and that failure to take reasonable care results in catastrophic harm (or non-catastrophic harm) then you can already be sued for damages, including punitive damages, under the common law.

So what is the point of making it illegal again? Isn’t this all already covered?

The bill still does several important things, including whistleblower protections.

At heart, the bill is a transparency bill. Developers training frontier models must disclose:

  1. That they are training a frontier model.
  2. What their safety and security plan (SSP) is and why it is good.
  3. Procedures for the tests they intend to run, and the results of those tests.
  4. Their risk assessment.

This allows others to look for problems, point out risks, take any needed precautions, copy best practices and if necessary loudly object. If things are bad enough, the Attorney General can seek injunctive relief. The need to disclose all this in public exerts pressure on companies to step up their game.

The yearly audit hopefully helps ensure companies do indeed do reasonable things, and that they follow through on their commitments.

In terms of liability for critical harms that do take place, and that are caused or materially enabled by lack of reasonable care on the part of the developer, there are four important changes from existing common law.

  1. The reasonable care standard will now take into account the quality of the SSP, whether the company complied with it, and the risk assessment.
  2. Those documents being written and public, along with the audits, makes it easier for potential plaintiffs to find out whether the company did exercise reasonable care, and for responsible companies to show they did in fact do so.
  3. When harm occurs, the Attorney General now also has standing, and can seek to fine the developer if they violate the statute. This makes it more likely that the company does get held accountable, and makes the process smoother and faster.
  4. This clarifies that you do have to care about derivative models, which I believe was already true for obvious practical reasons – you enabled that – but it makes this unambiguous.

Ketan Ramakrishnan also points out that the idea of tort law is to deter negligence. So codifying and reminding people of what constitutes negligence can be helpful, even if the underlying rules did not change.

Also consider the flip side of this argument.

If everything this bill makes illegal is already illegal, and violating SB 1047’s duty of reasonable care already makes a developer liable under existing law, then how can this law be a major threat to the tech industry, or to any particular business model, or to open models?

Either what you wanted to do was already illegal, or it remains legal.

The answer is that this means that your business model previously was indeed illegal. Your plan involved not taking reasonable care to reduce the risk of catastrophic harms. You were fine with that, because you figured it would probably be fine anyway, that it would be difficult to catch you, that you could drag out the court case for years and the absolute worst legal consequence for you was that it bankrupted the company. It also might kill a lot of people, or everyone, but that’s a risk you were willing to take.

Now you are facing transparency requirements and clarifications that make it much harder to get away with this, and you say it is a ban on your business model? Well, I do not believe that to be true. But if that is indeed true, that tells us quite a lot about your business model.

Objection: Jailbreaking is Inevitable

Some have pointed out that, at least so far, no one knows how to prevent even a closed model from being jailbroken. Current safety techniques do not work against determined adversaries.

To which I say, that means you have two choices.

  1. Find jailbreak defenses, or other defenses, that actually work.
  2. Test the model on the (correct) assumption that some users will jailbreak it.

Universal (at least so far) jailbreaks are an argument to worry about catastrophic events more, not less. They make models more dangerous. People’s safety tests need to consider this as part of the threat model. The question is what the model will do in practice if you release it.

Moot and False Claim: Reasonable Assurance Is Impossible

People kept talking about Reasonable Assurance as if it was ‘prove nothing will ever happen,’ which is absolutely not what that standard means.

This is moot now. Reasonable assurance has changed to reasonable care, which is already the standard in place under common law.

Objection: Reasonable Care is Too Vague, Can’t We Do Better?

First off, we are already under a standard of reasonable care everywhere, as noted in the previous response and also earlier.

I think the answer to ‘can we do better’ is rather clearly no. We cannot do better, unless someone comes up with a new idea.

Reasonable care is a flexible standard that is well understood by courts, that takes into account a balance of costs and benefits. That’s as good as it’s going to get.

The rigid alternatives are things like strict liability, or some specified checklist. If you want the government to tell you when you are in the clear, then that means prior restraint, and that means a government evaluation.

Objection: The Numbers Picked are Arbitrary

Somewhat. When one writes rules, one must pick numbers.

The numbers in SB 1047 have been set carefully, including by anticipating likely futures. But yes, they are guesses and round numbers, and conditions will change.

Or as someone put it: 55 mile per hour speed limit? Who comes up with this? They think that 56 is unsafe and 54 is safe? Why not 52? Why not 57? Don’t they know road conditions differ and cars will change?

A key problem is that the law has to balance two important concerns here:

  1. You want the law to be flexible, so it can adapt to changing conditions.
  2. Paranoia that any flexibility will be used to intentionally choose the worst numbers possible, by people intentionally attempting to kill the AI industry.

The only way to address the second concern is simple rules that cannot be changed.

The original version of the bill focused on flexibility. It started out with the 10^26 flops threshold, then it was ‘equivalent capability’ to what that would have gotten you in 2024, and everyone hated that. When it was proposed to let the FMD choose how to adjust it, everyone said a mix of ‘you will never raise it’ and ‘actually you will lower it.’

So the bill switched to the $100 million threshold, which went over much better.

To ensure that no one could say ‘oh they’ll lower that threshold,’ the law makes it impossible to lower the threshold.

The same goes for the $10 million threshold on fine tuning a derivative model.

The compute thresholds can still be adjusted. But given the dollar thresholds, lowering the compute thresholds would not do anything. Only raising them matters.
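
To see why only raising matters, here is a minimal sketch. It assumes, as the passage above implies, that a model must exceed both the compute threshold and the dollar threshold to be covered. This is an illustrative simplification rather than the statutory text, and the function name and example figures are hypothetical.

```python
# Illustrative simplification, not the bill's statutory language.
DOLLAR_THRESHOLD = 100_000_000  # fixed in the law; cannot be lowered by regulators


def is_covered(training_flops: float,
               training_cost_usd: float,
               flop_threshold: float = 1e26) -> bool:
    """A model is covered only if it exceeds BOTH thresholds (assumed conjunction)."""
    return training_flops > flop_threshold and training_cost_usd > DOLLAR_THRESHOLD


# Hypothetical cheap model: lowering the compute threshold does not pull it in,
# because it still fails the dollar test.
assert not is_covered(5e25, 20_000_000)
assert not is_covered(5e25, 20_000_000, flop_threshold=1e25)

# Hypothetical expensive model: raising the compute threshold can drop it out.
assert is_covered(2e26, 150_000_000)
assert not is_covered(2e26, 150_000_000, flop_threshold=1e27)
```

Because the test is a conjunction with a fixed dollar floor, an adjustable compute threshold can only ever shrink the set of covered models.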

Objection: The Law Should Use Capabilities Thresholds, Not Compute and Compute Cost Thresholds

Compute, and the cost of compute, are excellent proxies for ‘are you building a big model that might have dangerous capabilities?’

I believe compute is the best proxy we have. I don’t know any good alternatives.

This then lets any models that do not trigger those thresholds fully ignore the law.

If a model does cost over $100 million, then what does the developer have to do?

They have to check for… capabilities.

(Actually, a very early version of the bill did rely on checking for capabilities, and was roundly and rightfully lambasted for it.)

If they do not find dangerous capabilities – maybe the model is not big enough, maybe AIs aren’t that [risky] yet, maybe you suck at training models, maybe you differentially avoided giving it the dangerous capabilities – then you use precautions and tests, you verify this, and you can be on your way. No problem.

Ultimately, of course it is about the capabilities of the model. That is what the tests are for. The compute and cost thresholds are there to ensure we don’t waste time and money checking models that are highly unlikely to be dangerous.

The other problem: You do not want to trigger on any particular benchmarks or other capabilities tests, both because they may not match what you are worried about (one person suggested the Turing Test, which can easily be orthogonal to key risks) and because they are too easy to game. As I understand it, it would be very easy to ensure your AI fails at any particular benchmark, if that was desired.

False Claim: This Bill Deals With ‘Imaginary’ Risks

The bill deals with future models, which will have greater capabilities. We know such models are coming. The focus of this bill is preventing those models from creating catastrophic harm. This is not a speculative thing to worry about.

Yes, this category includes existential risks, and some who helped write and improve this bill are highly motivated by preventing existential risks, but that is not what the bill focuses on.

The risk of $500m in damages from a cybersecurity event, in particular, is a deeply non-imaginary thing to worry about. Indeed, this is the flip side of those who worry that the Crowdstrike incident was a catastrophic event, since it seems to have caused damages in excess of $500m, all in one incident.

One can question whether the risks in question rise to sufficient likelihood to justify the requirements of the bill. You might, if you believe the downside risks are sufficiently small and unlikely, reasonably oppose this bill. Those risks are not imaginary, even if you fully reject existential risks.

Objection: This Might Become the Model For Other Bills Elsewhere

That seems good?

It’s a well crafted and light touch bill. The compliance costs would almost entirely run concurrently.

If others choose to mostly copy SB 1047 rather than roll their own, that greatly reduces the chances of something deeply stupid happening, and it builds international cooperation. You’d love to see it.

Of course, if you think it’s a bad bill, sir, then you don’t want it copied. But if it’s a good bill, or even an above replacement level bill, you should welcome this.

Not Really an Objection: They Changed the Bill a Lot

This is true. They changed the bill a lot.

That is because an extraordinarily large amount of work and care went into making this bill as well-written as possible, and addressing the concerns of its critics, and making compromises with them to allow it to proceed.

I believe that the result was a much better written and lighter touch bill.

Not Really an Objection: The Bill Has the Wrong Motivations and Is Backed By Evil People

As opposed to objecting to what is actually in the bill.

The wrong motivations here are ‘not wanting everyone to die.’

The evil people are ‘the people who worry about everyone dying and try to prevent it.’

Thus a bunch of ad hominem attacks, or attacks on the very concept of existential risk, even though such attacks are completely irrelevant to the content of the bill.

For some reason, when some people think an action is motivated in part by worry about existential risk, they react by going on tilt and completely losing their minds. Rather than ask how the bill might be improved, or try to find specific problems, they treat the whole situation as an ‘if the bill passes then those bad people win’ situation, rather than asking what is in the bill and what it does.

Here is a highly helpful example of Maddie pointing out that they themselves have made exactly this argument.

As Charles Foster notes, consider AB 3211. It is a no-good, very bad bill, that would cause far greater harm than SB 1047. But because the motivation is ‘stop deepfakes’ there was no uproar at all until Dean Ball decided to warn us about AB 3211, I wrote a post confirming that and many others confirmed it, and then all talk quickly died out again. For all I know it might still pass.

The whole dynamic is rather perverse. Even if you disagree about magnitudes of risk and best ways to move forwards: These are, quite obviously, not evil people or evil motives. They are not enemies you are fighting in a war. This is not a zero sum conflict, we (I would hope) all want to ensure a good future.

If you are unsure what the bill does? Read. The. Bill.

Not an Objection: ‘The Consensus Has Shifted’ or ‘The Bill is Unpopular’

No. It hasn’t, and it isn’t. The bill remains very popular, including among tech employees, and there is no sign of a decline in popularity at all. So far, I have seen people previously opposed say the bill has been improved, and do not know of anyone who went from support to opposition.

Instead, what has happened is that opponents of the bill have engaged in a well-funded, coordinated effort to give off the wrong vibes and the impression of widespread opposition.

It is also not, even in principle, an objection to the bill. It’s either a good bill or it isn’t.

Objection: It Is ‘Too Early’ To Regulate

We may soon create new things smarter than ourselves. The major labs expect AGI within 3-5 years, which is 1-3 product cycles.

And you think it is ‘too early’ to do what is centrally transparency regulation?

If not us, then who? If not now, then when?

AI is an exponential.

There are only two ways to respond to an exponential: Too early, or too late.

It is indeed too soon for many forms of detailed prescriptive regulations. We do not know enough for that yet. This bill does not do that.

This is exactly the situation in which we need a light touch bill with rules centered around transparency. We need to ensure everyone open sources their safety and security plans, and also takes the time to have a real one. We need to know, as they say, what the hell is going on.

Otherwise, we will never know, and it will always be ‘too early’ until it is too late, or some crisis or public outcry forces our hands. At that point, without the visibility and in a hurry, we would likely choose a deeply terrible bill.

Objection: We Need To ‘Get It Right’ and Can Do Better

If I knew of concrete suggestions on how to do better, or ways to fix this bill, I would consider them. Indeed, I have considered many of them. Many of those have now made it into the final bill.

The idea that this bill does not ‘take into account feedback’ is absurd in the face of both all the feedback taken into account, and that most of the remaining so-called negative ‘feedback’ is based on hallucination and fabrication of things that were never in the bill.

When people call to ‘take a step back’ for some mysterious ‘better way,’ or say ‘we should take the time to get it right,’ well, we’ve spent a lot of time here trying to get it right, and no one has suggested a better way, and no one calling for this ever plausibly suggests what that better way might be.

Instead, I expect we would do far worse, in terms of trading off costs versus benefits.

Objection: This Would Be Better at the Federal Level

Ideally, yes, I agree. But we should realistically despair of getting any bill with similar provisions, let alone one of similar quality, through the Congress any time soon. That is the real world we live in.

If this turns out to be wrong, then that new bill would supersede this one. Good.

Objection: The Bill Should Be Several Distinct Bills

You could say that this bill does several distinct things. I am most sympathetic to this in the case of the KYC requirements for large compute clusters, which certainly could be a distinct bill. The rest seems highly logically tied together to me, the parts reinforce each other. Yes, it makes sense that ‘you have to do X’ and ‘you have to confirm you did the X you now have to do’ are part of the same bill.

And while process types like to talk about first best lawmaking – bills everyone reads, each short with a distinct purpose – that is not how laws are actually passed these days. We should move in that direction on the margin for sure, but requests like this are mostly universal arguments for having a very high bar against passing important laws at all.

Objection: The Bill Has Been Weakened Too Much in Various Ways

I have spent the entire objection section talking about why the bill might go too far.

The opposite objection is also highly relevant. What if the bill does not do enough?

The bill is a lot weaker, in many ways, than it started out.

Thus, such concerns as:

  1. The required standard has been lowered to reasonable care. That’s a low standard. What if you think what is widely standard and thus ‘reasonable’ is freaking nuts?
  2. Developers are only held responsible if their failure to take that reasonable care causes or materially enables the critical harm. That’s an even lower standard.
  3. There are no civil penalties until after critical harm happens. By then it could easily be too late, perhaps too late for all of us. What good is this if we are all dead, or the developer can’t pay for all the harms done, or the model is on the internet and can’t be withdrawn? What’s to stop developers from going YOLO?
  4. There will be no frontier model division, so who in government will develop the expertise to know what is going on?
  5. Won’t the thresholds for covered models inevitably become too high as prices drop and we see algorithmic improvements?

And so on.

To those who are concerned in this other direction, I say: It’s still a good bill, sir.

Is it as strong as those concerned in this way would like? No, it is not.

It still provides a lot of help. In particular, it provides a lot of transparency and visibility. And that is what we currently most desperately lack. This way, the public gets to see, red team and critique the safety plans, and perhaps raise alarm bells.

I especially think the cost-benefit ratio has improved. There are smaller benefits than before, but also dramatically reduced costs. I think that the situation justifies a stronger bill with a less lopsided cost-benefit ratio, because the marginal benefits still well exceed marginal costs, but the lighter touch bill is more clearly net positive.

If someone in national security came to me and said, what do we need most right now? I would say to them: You need visibility. We need to know what is happening.

If you had told me a year ago (or last month, or yesterday) that we could get this bill, in this form, I would have been very excited and happily taken that.

Final Word: Who Should Oppose This Bill?

In my opinion: What are the good reasons to oppose this bill and prefer no bill?

I see three core ‘good’ reasons to oppose the bill:

  1. You do not believe in catastrophic or existential harms.
  2. You do not believe in regulations, on principle, no matter what.
  3. You think negligently enabling catastrophic harms shouldn’t stop AI deployments.

If you feel future AIs will not counterfactually cause or enable catastrophic harms, let alone any existential risks? That there are essentially no risks in the room to prevent?

Then that would be the best reason to oppose this bill.

If you are a principled libertarian who is opposed to almost all regulations on principle, or almost all regulations on technology, because you believe they inevitably backfire in various ways? And you do not think the risks here are so exceptional as to overcome that?

That would also be a good reason to oppose this bill.

If you think that the world would be better off if AI companies that are negligent, that fail to take reasonable care in ways that likely cause or materially enable catastrophic events, still proceed with their AI deployments? Because the benefits still exceed the costs?

Then please say that clearly, but also you should likely oppose this bill.

One could also believe that the costs of compliance with this bill are far higher, and the benefits far lower, than I believe them to be, such that the costs exceed benefits. Most arguments for high bill costs are based on misunderstandings of what is in the bill (unintentionally or otherwise), but I could of course still be wrong about this.

There are also various legitimate concerns and process complaints, and reasons this bill is not fully a first best solution. I do not believe those alone are sufficient to justify opposing the bill, but people can disagree with that.

In particular, I do not think that there are plausible worlds in which not passing SB 1047 causes us to be likely to instead end up with a ‘better’ bill later.

I believe this is a good bill, sir. I believe we should pass it.

18 comments

Comments sorted by top scores.

comment by Logan Zoellner (logan-zoellner) · 2024-08-20T18:23:35.203Z · LW(p) · GW(p)

This led to warnings, including from people like Tyler Cowen and Dean Ball, that if you create regulators, they will go looking for things to regulate, and inevitably do lots of other things over time.

I don't see how the new form of the bill prevents the state of CA from using the law to regulate things that have nothing to do with existential risk.

Suppose you are Gavin Newsom and one day while scrolling X you see this image.
 

"This is misinformation! This is a threat to democracy!" you scream.

So you call up the attorney general on the phone and say "We've got to shut down Grok!" (pretend for the sake of argument Grok3 is a covered model, which it almost certainly will be)

The attorney general calmly reminds Newsom that "SB-1047 only deals with catastrophic harm, not mean posts on X."

"What could be more catastrophic!" Newsom rages.  "The future of the nation is at stake!"

How exactly do you think that conversation ends?

Replies from: Raemon
comment by Raemon · 2024-08-20T18:32:37.241Z · LW(p) · GW(p)

The bill is very specific on what catastrophic* means (creating weapons of mass destruction, $500 million in damages, or mass casualties). Maybe a court can argue this causes $500 million in damages, but, like, that seems like a real stretch to me. The attorney general could argue it in court but I don't think they'd win.

*well, it actually defines "critical harm", not "catastrophic". But, you get the idea.

Replies from: logan-zoellner
comment by Logan Zoellner (logan-zoellner) · 2024-08-20T20:59:05.683Z · LW(p) · GW(p)

Your faith in the likelihood that the Supreme Court of California will interpret this "as intended by Zvi" and not "in whatever way seems politically convenient" is much higher than mine.

Replies from: Zvi
comment by Zvi · 2024-08-20T22:49:52.449Z · LW(p) · GW(p)

I am rather confident that the California Supreme Court (or US Supreme Court, potentially) would rule that the law says what it says, and would happily bet on that. 

If you think we simply don't have any law and people can do what they want, then nothing matters. Indeed, I'd say it would be more likely to work for Gavin to simply declare some sort of emergency about this today than to try to invoke SB 1047.

Replies from: logan-zoellner
comment by Logan Zoellner (logan-zoellner) · 2024-08-21T03:43:32.441Z · LW(p) · GW(p)

My claim is not that the supreme court will literally ignore the text of the law, but rather that phrases like "Other grave harms to public safety and security" could easily be interpreted to cover the above scenario.

If this were a federal law, I would at least have some solace that the natural checks-and-balances might take effect.  But given that single-party-control of CA is unlikely to end anytime soon, giving a state law a veto over all frontier models in the United States seems bad.

CA does not have a particularly good track-record of respecting my rights.  I would have the same objection if TX tried to pass a law asserting nationwide control over an industry.

I suspect this law would eventually get struck down by the US Supreme Court as a violation of interstate commerce if they actually tried to enforce it against a company that did not have employees in their state, but in the meantime the chilling effect on speech/technology would be significant.

As far as a bet, because I expect most of the effect to happen through "chilling effect" or "guidance", it would have to be something along the lines of: the FMB will issue guidance about "best practices" that will include topics such as "misinformation", "deceptive imagery", or other topics that encourage models to censor their outputs on topics not clearly related to CBRN or hacking.

Replies from: Raemon
comment by Raemon · 2024-08-21T03:57:14.106Z · LW(p) · GW(p)

Sounds like a good opportunity for a concrete bet, and/or manifold market?

Replies from: logan-zoellner
comment by Logan Zoellner (logan-zoellner) · 2024-08-21T04:03:05.587Z · LW(p) · GW(p)

"If SB 1047 is passed in its current form and not struck down by the Supreme Court or otherwise modified, the FMB will issue guidance about "best practices" that will include topics such as "misinformation", "deceptive imagery", or other topics that encourage models to censor their outputs on topics not clearly related to CBRN or hacking (prior to AGI, assuming it happens >3 years from now)."

@zvi ?

Replies from: Zvi
comment by Zvi · 2024-08-21T12:37:22.520Z · LW(p) · GW(p)

Worth noticing that is a much weaker claim. The FMB issuing non-binding guidance on X is not the same as a judge holding a company liable for ~X under the law. 

Replies from: logan-zoellner
comment by Logan Zoellner (logan-zoellner) · 2024-08-21T18:08:00.521Z · LW(p) · GW(p)

Worth noticing that you aren't taking the bet.

Mind adding an addendum to your article along the lines of "it can be reasonably speculated that the FMB will have a chilling effect on freedom of speech by issuing guidance about model outputs"?

comment by Raemon · 2024-08-20T17:43:52.486Z · LW(p) · GW(p)

My current understanding is that in the new bill, you can't be sued for not having an SSP, or third party audits? Is that right? Do we have any guarantee that we'll actually even have the transparency aspects of the bill?

Replies from: Zvi
comment by Zvi · 2024-08-20T22:47:45.711Z · LW(p) · GW(p)

They do have to publish an SSP of some kind, or they are in violation of the statute, and injunctive relief could be sought.

comment by Raemon · 2024-08-20T17:38:25.549Z · LW(p) · GW(p)

That still leaves some decisions that need to be made in the future. So what remains, in section 22609, is the Board of Frontier Models. I know I am.

Is this a typo/unfinished sentence or am I just confused about what you mean by "I know I am?"

Replies from: Zvi
comment by Zvi · 2024-08-20T22:46:55.094Z · LW(p) · GW(p)

This is a silly wordplay joke, you're overthinking it.

Replies from: Raemon
comment by Raemon · 2024-08-20T23:16:29.982Z · LW(p) · GW(p)

I think it was more like I am underthinking it, because I looked at it, thought "this looks like a typo" and then stopped thinking. (I still don't get the joke).

Replies from: Benito
comment by Ben Pace (Benito) · 2024-08-20T23:19:09.169Z · LW(p) · GW(p)

I have explained to Ray that, while many people are either excited about or scared of frontier models, Zvi's full-time news beat dealing with them might leave him uniquely bored of frontier models.

comment by oumuamua · 2024-08-22T14:55:21.847Z · LW(p) · GW(p)

Correct me if I'm wrong, but it seems to me that something this law implies is that it's only legal to release jailbreakable models if they (more or less) suck.

Got something that can write a pretty good computer virus or materially enable somebody to do it? Illegal under SB1047, and I think the costs might outweigh the benefits here. If your software is so vulnerable that an LLM can hack it, that should be a you problem. Maybe use an LLM to fix it, I don't know. The benefit of AI systems intelligent enough to do that (but too stupid to pose actual existential risks) seems greater than the downside of initial chaos that would certainly ensue from letting one loose on the world.

If I had to suggest an amendment, I'd word it in such a way that as long as the model outputs publicly available information, or information that could be obtained by a human expert, it's fine. There are already humans who can write computer viruses, so your LLMs should be allowed to do it as well. What they should not be allowed to do is design scary novel biological viruses from scratch, make scary self-replicating nanotech, etc., since human experts currently can't do those things either.

Or, in case that is too scary, maybe apply my amendment only to cyber-risks, but not to bio/nuclear/nanotech, etc.

Replies from: quetzal_rainbow
comment by quetzal_rainbow · 2024-08-22T15:58:42.074Z · LW(p) · GW(p)

  1. Humans write computer viruses for far more money than the price of token generation.
  2. Quoting the bill:

“Critical harm” does not include any of the following: (A) Harms caused or materially enabled by information that a covered model or covered model derivative outputs if the information is otherwise reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative.

Replies from: oumuamua
comment by oumuamua · 2024-08-22T17:00:41.942Z · LW(p) · GW(p)

1. Yes, but they also require far more money to do all the good stuff as well! I’m not saying there isn’t a tradeoff involved here.

2. Yes, I’ve read that. I was saying that this is a pretty low bar, since an ordinary person isn’t good at writing viruses. I’m afraid that the bill might have the effect of making competent jailbreakable models essentially illegal, even if they don’t pose an existential risk (in which case that would be necessary ofc.), and even if their net value for society is positive, because there is a lot of software out there that’s insecure and that a reasonably competent coding AI could exploit to cause >$500 MM in damages.

I’m saying that it might be better to tell companies to git gud at computer security and accept the fact that yes, an AI will absolutely try to break their stuff, and that they won’t get to sue Anthropic if something happens.