RTFB: On the New Proposed CAIP AI Bill

zvi


post by Zvi · 2024-04-10T18:30:08.410Z · LW · GW · 14 comments

  RTFC: Read the Bill
  Basics and Key Definitions
  Oh the Permits You’ll Need
  Rubrics for Your Consideration
  Open Model Weights Are Unsafe And Nothing Can Fix This
  Extremely High Concern Systems
  The Judges Decide
  Several Rapid-Fire Final Sections
  Overall Take: A Forceful, Flawed and Thoughtful Model Bill
  The Usual Objectors Respond: The Severability Clause
  The Usual Objectors Respond: Inception
  The Usual Objectors Respond: Rulemaking Authority
  Conclusion

A New Bill Offer Has Arrived

Center for AI Policy proposes a concrete actual model bill for us to look at.

Here was their announcement:

WASHINGTON – April 9, 2024 – To ensure a future where artificial intelligence (AI) is safe for society, the Center for AI Policy (CAIP) today announced its proposal for the “Responsible Advanced Artificial Intelligence Act of 2024.” This sweeping model legislation establishes a comprehensive framework for regulating advanced AI systems, championing public safety, and fostering technological innovation with a strong sense of ethical responsibility.

“This model legislation is creating a safety net for the digital age,” said Jason Green-Lowe, Executive Director of CAIP, “to ensure that exciting advancements in AI are not overwhelmed by the risks they pose.”

The “Responsible Advanced Artificial Intelligence Act of 2024” is model legislation that contains provisions for requiring that AI be developed safely, as well as requirements on permitting, hardware monitoring, civil liability reform, the formation of a dedicated federal government office, and instructions for emergency powers.

The key provisions of the model legislation include:

1. Establishment of the Frontier Artificial Intelligence Systems Administration to regulate AI systems posing potential risks.

2. Definitions of critical terms such as “frontier AI system,” “general-purpose AI,” and risk classification levels.

3. Provisions for hardware monitoring, analysis, and reporting of AI systems.

4. Civil + criminal liability measures for non-compliance or misuse of AI systems.

5. Emergency powers for the administration to address imminent AI threats.

6. Whistleblower protection measures for reporting concerns or violations.

The model legislation intends to provide a regulatory framework for the responsible development and deployment of advanced AI systems, mitigating potential risks to public safety, national security, and ethical considerations.

“As leading AI developers have acknowledged, private AI companies lack the right incentives to address this risk fully,” said Jason Green-Lowe, Executive Director of CAIP. “Therefore, for advanced AI development to be safe, federal legislation must be passed to monitor and regulate the use of the modern capabilities of frontier AI and, where necessary, the government must be prepared to intervene rapidly in an AI-related emergency.”

Green-Lowe envisions a world where “AI is safe enough that we can enjoy its benefits without undermining humanity’s future.” The model legislation will mitigate potential risks while fostering an environment where technological innovation can flourish without compromising national security, public safety, or ethical standards. “CAIP is committed to collaborating with responsible stakeholders to develop effective legislation that governs the development and deployment of advanced AI systems. Our door is open.”

I discovered this via Cato’s Will Duffield, whose statement was:

Will Duffield: I know these AI folks are pretty new to policy, but this proposal is an outlandish, unprecedented, and abjectly unconstitutional system of prior restraint.

To which my response was essentially:

I bet he’s from Cato or Reason.
Yep, Cato.
Sir, this is a Wendy’s.
Wolf.

We need people who will warn us when bills are unconstitutional, unworkable, unreasonable or simply deeply unwise, and who are well calibrated in their judgment and their speech on these questions. I want someone who will tell me ‘Bill 1001 is unconstitutional and would get laughed out of court, Bill 1002 has questionable constitutional muster in practice and is unconstitutional in theory, we would throw out Bill 1003 but it will stand up these days because SCOTUS thinks the commerce clause is super broad, Bill 1004 is legal as written but the implementation won’t work, and so on.’ Bonus points for probabilities, and double bonus points if they tell you how likely each bill is to pass so you know when to care.

Unfortunately, we do not have that. We only have people who cry wolf all the time. I love that for them, and thank them for their service, which is very helpful. Someone needs to be in that role, if no one is going to be the calibrated version. Much better than nothing. Often their critiques point to very real issues, as people are indeed constantly proposing terrible laws.

The lack of something better calibrated is still super frustrating.

RTFC: Read the Bill

So what does this particular bill actually do if enacted?

There is no substitute for reading the bill.

I am going to skip over a bunch of what I presume is standard issue boilerplate you use when creating this kind of apparatus, like the rulemaking authority procedures.

There is the risk that I have, by doing this, overlooked things that are indeed non-standard or otherwise worthy of note, but I am not sufficiently versed in what is standard to know from reading. Readers can alert me to what I may have missed.

Each bullet point has a (bill section) for reference.

Basics and Key Definitions

The core idea is to create the new agency FAISA to deal with future AI systems.

There is a four-tier system of concern levels for those systems, in practice:

Low-concern systems have no restrictions.
Medium-concern systems must be checked monthly for capability gains.
High-concern systems require permits and various countermeasures.
Very high-concern systems will require even more countermeasures.

As described later, the permit process is a holistic judgment based on a set of rubrics, rather than a fixed set of requirements. A lot of it could do with better specification. There is a fast track option when that is appropriate to the use case.

Going point by point:

(4a) Creates the Frontier Artificial Intelligence Systems Administration, whose head is a presidential appointment confirmed by the Senate.
(4b) No one senior in FAISA can have a conflict of interest on AI, including owning any related stocks, or having worked at a frontier lab within three years, and after leaving they cannot lobby for three years and can only take ‘reasonable compensation.’ I worry about revolving doors, but I also worry this is too harsh.
(3u1): Definition: LOW-CONCERN AI SYSTEM (TIER 1).—The terms “low-concern AI system” and “Tier 1” mean AI systems that do not have any capabilities that are likely to pose major security risks. Initially, an AI system shall be deemed low-concern if it used less than 10^24 FLOP during its final training run.
(3u2): Definition: MEDIUM-CONCERN AI SYSTEM (TIER 2).—The terms “medium-concern AI system” and “Tier 2” mean AI systems that have a small chance of acquiring at least one capability that could pose major security risks. For example, if they are somewhat more powerful or somewhat less well-controlled than expected, such systems might substantially accelerate the development of threats such as bioweapons, cyberattacks, and fully autonomous artificial agents. Initially, an AI system shall be deemed medium-concern if it used at least 10^24 FLOP during its final training run and it does not meet the criteria for any higher tier. I note, again, that this threshold shows up in such drafts when I think it should have been higher.
(3u3): Definition: HIGH-CONCERN AI SYSTEM (TIER 3).—The terms “high-concern AI system” and “Tier 3” mean AI systems that have at least one capability that could pose major security risks, or that have capabilities that are at or very near the frontier of AI development, and as such pose important threats that are not yet fully understood.
Gemini believes that sections 5-6 grant unusually flexible rulemaking authority, and initially I otherwise skipped those sections. It says “The Act grants the Administrator significant flexibility in rulemaking, including the ability to update technical definitions and expedite certain rules. However, there are also provisions for Congressional review and potential disapproval of rules, ensuring a balance of power.” As we will see later, there are those who have a different interpretation. They can also hire faster and pay 150% of base pay in many spots, which will be necessary to staff well.
If you are ‘low-concern’ you presumably do not have to do anything.
(7) Each person who trains a ‘medium-concern AI’ shall pre-register their training plan, meaning lay out who is doing it, the maximum compute to be spent, the purpose of the AI, the final scores of the AI system on the benchmarks selected by the DAIS, and the location of the training (including cloud services used if any). Then they have to do continuous testing each month, and report in and cease training if they hit 80% on any of the benchmarks in 3(v)(3)(a)(ii), as you are now high concern. I notice that asking for benchmark scores before starting is weird? And also defining a ‘purpose’ of an AI is kind of weird?
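The initial tier rules quoted above are mechanical enough to write down. Here is a minimal sketch, assuming the 80% benchmark trigger only applies to systems at or above the compute threshold, and ignoring the capability-based Tier 3 criteria that the bill leaves to the Administrator. The function names and the benchmark dictionary are hypothetical, and nothing here is the bill's own procedure.

```python
from enum import Enum

# Thresholds as quoted in the post; everything else is an illustrative assumption.
LOW_CONCERN_FLOP_LIMIT = 1e24   # section 3(u)(1): below this, initially Tier 1
BENCHMARK_TRIGGER = 0.80        # section 7: hitting 80% on any listed benchmark


class Tier(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def initial_tier(training_flop, benchmark_scores=None):
    """Classify a system under the initial rules described in the post.

    benchmark_scores maps a (hypothetical) benchmark name to a score in [0, 1].
    """
    scores = benchmark_scores or {}
    if training_flop < LOW_CONCERN_FLOP_LIMIT:
        return Tier.LOW
    # Assumption: the benchmark trigger is what moves a Tier 2 system to Tier 3.
    if any(score >= BENCHMARK_TRIGGER for score in scores.values()):
        return Tier.HIGH
    return Tier.MEDIUM


def must_halt_training(monthly_scores):
    """Monthly check for a medium-concern run: report in and stop if any score hits 80%."""
    return any(score >= BENCHMARK_TRIGGER for score in monthly_scores.values())
```

Note how much rides on the single compute constant: move it and the whole bottom tier moves with it, which is why the choice of 10^24 matters.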

Oh the Permits You’ll Need

The core idea is to divide AI into four use cases: Hardware, Training, Model Weights and Deployment. You need a distinct permit for each one, and a distinct permit for each model or substantial model change for each one, and you must reapply each time, again with a fast track option when the situation allows for that.

Each application is to be evaluated and ‘scored,’ then a decision made, with the criteria updated at least yearly. We are given considerations for the selection process, but mostly the actual criteria are left fully unspecified even initially. The evaluation process is further described in later sections.

There are three core issues raised, which are mostly discussed in later sections.

Practicality. How much delay and cost and unreliability will ensue?
Specificity. There is the common complaint that we do not yet know what the proper requirements will be and they will be difficult to change. The solution here is to give the new department the authority to determine and update the requirements as they go. The failure modes of this are obvious, with potential ramp-ups, regulatory capture, outright nonsense and more. The upside of flexibility and ability to correct and update is also obvious, but can we get that in practice from a government agency, even a new one?
Objectivity. Will favored insiders get easy permits, while outsiders or those the current administration dislikes get denied or delayed? How to prevent this?

As always, we have a dilemma of spirit of the rules versus technical rule of law.

To the extent the system works via technical rules, that is fantastic, protecting us in numerous ways. If it works. However, every time I look at a set of technical proposals, my conclusion is at best ‘this could work if they abide by the spirit of the rules here.’ Gaming any technical set of requirements would be too easy if we allowed rules lawyering (including via actual lawyering) to rule the day. Any rules that worked against adversarial labs determined to work around the rules and labs that seem incapable of acting wisely, that are not allowed to ask whether the lab is being adversarial or unwise, will have to be much more restrictive overall to compensate for that, to get the same upsides, and there are some bases that will be impossible to cover in any reasonable way.

To the extent we enforce the spirit of the rules, and allow for human judgment and flexibility, or allow trusted people to adjust the rules on the fly, we can do a lot better on many fronts. But we open ourselves up to those who would not follow the spirit, and force there to be those charged with choosing who can be trusted to what extent, and we risk insider favoritism and capture. Either you can ‘pick winners and losers’ in any given sense or level of flexibility, or you can’t, and we go to regulate with the government we have, not the one we wish we had.

The conclusion of this section has some notes on these dangers, and we will return to those questions in later sections as well.

Again, going point by point:

(8a) What about ‘high-concern AI’? You will need permits for that. Hardware, Training, Model Weights and Deployment are each their own permit. It makes sense that each of these steps is distinct. Each comes with its own risks and responsibilities. That does not speak to whether the burdens imposed here are appropriate.
(8b1) The hardware permit only applies to a specific collection of hardware. If you want to substantially change, add to or supplement that hardware, you need to apply again. It is not a general ‘own whatever hardware you want’ permit. This makes sense if the process is reasonably fast and cheap when no issues are present, but we do need to be careful about that.
(8b2) Similarly the training permit is for a particular system, and it includes this: ‘If the person wishes to add additional features to the AI system that were not included in the original training permit’ then they need to apply for a new permit, meaning that they need to declare in advance what (capabilities and) features will be present, or they need to renew the permit. I also want to know what counts as a feature? What constitutes part of the model, versus things outside the model? Gemini’s interpretation is that for example GPTs would count even if they are achieved purely via scaffolding, and it speculates this goes as far as a new UI button to clear your chat history. Whereas it thinks improving model efficiency or speed, which is of course more safety relevant, would not. This seems like a place we need refinement and clarity, and it was confusing enough that Gemini was having trouble keeping the issues straight.
(8b3) A deployment permit is for the final model version, to a specific set of users. If you ‘substantially change’ the user base or model, you need to reapply. That makes sense for the model, that is the whole point, but I wonder how this would apply to a user base. This would make sense if you have the option to either ask for a fully broad deployment permit or a narrow one, where the narrow one (such as ‘for research’) would hold you to at least some looser standards in exchange.
(8b4) Similarly, your right to possess one set of weights is for only those weights.
In principle, I understand exactly why you would want all of this, once the details are cleaned up a bit. However it also means applying for a bunch of permits in the course of doing ordinary business. How annoying will it be to get them? Will the government do a good job of rubber stamping the application when the changes are trivial, but actually paying attention and making real (but fast) decisions when there is real new risk in the room? Or, rather, exactly how bad at this will we be? And how tightly will these requirements be enforced in practice, and how much will that vary based on whether something real is at stake?
(8c1) There is a grandfather clause for existing systems.
(8c) (there is some confusion here with the section names) Each year by September 1 the Administrator shall review each of the thresholds for high concern in (8a) for adequacy, and fix them if they are not adequate. I notice this should be symmetrical – it should say something like ‘adequate and necessary.’ If a threshold used to be needed and now does not make sense, we should fix that.
(8d1) There will be a ‘fast-track’ form of less than two pages. They list examples of who should qualify: Self-driving cars, navigational systems, recommendation engines, fraud detection, weather forecasting, tools for locating mineral deposits, economic forecasting, search engines and image generators. That list starts strong, then by the end I get less confident, an image generator can absolutely do scary stuff with ‘typically no more than thirty words of text.’ So the principle is, specialized systems for particular purposes are exempt, but then we have to ask whether that makes them safe to train? And how we know they only get used in the way you expect or claim? The decision on who gets to fast track, to me, is not mostly about what you use the system for but the underlying capabilities of the system. There should definitely be easy waivers to get of the form ‘what I am doing cannot create any new dangers.’ Or perhaps the point is that if I am fine-tuning GPT-N for my recommendation engine, you should not bother me, and I can see that argument, but I notice I would want to dig more into details here before I feel good. In practice this might mostly be intended for small fine-tuning jobs, which ideally would indeed be fine, but we should think hard about how to make this highly smooth and also ensure no one abuses the loophole. Tricky.
(8d6) Ah, application fees, including exemptions for research, fast track and open source, and ‘support for small business.’ No numbers are specified in terms of what the fee shall be. I am going to go ahead and say that if the fee is large enough that it matters, it is an outrageous fee.
(8e) There need to be rules for ‘how to score each application’ and what it takes to get approved. I notice I worry about the use of ‘score’ at all. I do not want government saying ‘this checked off 7 boxes out of 10 so it gets a +7, and thus deserves a permit.’ I don’t think that works, and it is ripe for abuse and mission creep. I also worry about many other places this threatens to be rather arbitrary. I want a well-defined set of safety and security requirements, whereas as worded we have no idea what we will get in practice.
(8e2) If there is anything on the list that is not required, they have to explain why.
(8e3) Precautions can be required, such as (A) third-party evaluations and audits, (B) penetration testing, (C) compute usage limits, (D) watermarks and (E) other.
(8e4) Mandatory insurance can be required. Yes, please.
(8e5) These rubrics should be updated as needed.
(8f) Now we see how this ‘scoring’ thing works. You get a ‘scoring factor’ for things like the plan for securing liability insurance or otherwise mitigating risks, for your incident detection and reporting plan, your ‘demonstrated ability to forecast capabilities’ (I see it and you do too), and the applicant’s ‘resources, abilities, reputation and willingness to successfully execute the plans described in subsections (g) through (j).’
And we all hear that one even louder. I am not saying that there are not advantages to considering someone’s reputation and established abilities when deciding whether to issue a permit, but this is making it clear that the intention is that you are not entitled to this permit merely for following the rules. The government has discretion, and if they don’t feel comfortable with you, or you piss them off, or they have any other reason, then it is no good, no permit for you. And yes, this could absolutely be a prelude to denying Elon Musk a permit, or generally locking out newcomers.
There is an obvious dilemma here. If you have to give everyone who technically qualifies the right to do whatever they want, then you need a system that is safe against people who ignore the spirit of the rules, who would not follow rules unless you can enforce those rules at each step, and who have not proven themselves responsible in any way. But, if you allow this type of judgment, then you are not a system of laws, and we all know what could happen next. So yes, I will absolutely say that the approach taken by implication here makes me uncomfortable. I do not trust the process, and I think as written this calls for too much trust to avoid picking winners and losers.
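To make the worry about additive scoring concrete, here is a minimal sketch contrasting a checkbox score with hard requirements. The criteria are loosely taken from the precautions listed in (8e3) and (8e4). The passing score of 4 is entirely hypothetical, since the bill specifies no numbers, and neither function is the bill's actual procedure.

```python
# Hypothetical criteria, loosely based on the precautions named in (8e3) and (8e4).
CRITERIA = [
    "third_party_audit",
    "penetration_testing",
    "compute_limits",
    "watermarking",
    "liability_insurance",
]


def checkbox_score(application):
    """Additive score: one point per box checked."""
    return sum(1 for c in CRITERIA if application.get(c, False))


def approve_by_score(application, passing_score=4):
    """Approve when enough boxes are checked (passing_score is an assumption)."""
    return checkbox_score(application) >= passing_score


def approve_by_requirements(application, required=CRITERIA):
    """Approve only when every required precaution is in place."""
    return all(application.get(c, False) for c in required)


# An applicant that does everything except penetration testing.
applicant = {c: True for c in CRITERIA}
applicant["penetration_testing"] = False

print(approve_by_score(applicant))         # True: 4 of 5 boxes clears the bar
print(approve_by_requirements(applicant))  # False: a required precaution is missing
```

Under the additive version a missing precaution can be bought off with unrelated points, and the applicant cannot know in advance which combination clears the bar. Under the requirements version the answer is predictable, at the cost of needing to get the list of requirements right.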

Rubrics for Your Consideration

What are the considerations when evaluating a safety plan? There are some details here that confuse me, but also this is thought out well enough that we can talk details on that level at all.

The broader concern is the idea of this being organized into a scoring system, and how one should holistically evaluate an application. I do think the rubrics themselves are a great start.

(8g) Rubrics for hardware in particular are the plan for KYC and customer controls, for cybersecurity of systems, and guarding against physical theft. Those are good rubrics if implemented objectively with thresholds.
(8h) Rubrics for model weights are awareness of real identities of customers, preventing theft of weights, limiting access to those with proper permits, and the danger level of the weights in question. The middle two make sense. The last one implies a sliding scale for how dangerous the weights are, which implies there should be more than one category under high-risk? It makes sense that there would be multiple categories here, but we should spell it out then. Then the first one feels like a deployment issue? Your ‘customer’ is whoever has the deployment permit, here, so if you don’t need to KYC the ultimate user that is a distinct issue? I sure hope so, if not we need to clean this up.
(8i) Rubrics for training are the extent of
1. Specifications of maximum intended capabilities.
2. Extent to which you have explained why that is safe.
3. Extent to which they have ‘a theory predicting’ how capabilities will develop
4. The plan for identifying and dealing with any discrepancies from those predictions, including a potential full halt in training, and communication of the anomaly to the government.
5. A clear maximum compute budget and timeframe and schedule.
6. Protection against the model escaping during training (!).
7. A plan for who, both internally and in the government, will get what information when and about what as training proceeds.
8. A plan for detecting unauthorized access attempts.
I get why one would include each of these things. What I worry about is, again, the whole thing where I gather together tons of expensive resources so I can train and deploy a system, I try to get as ‘high a score’ on everything as I can, and then hope that I get authorized to proceed, without knowing what might put me over the edge. I also worry that many of these things should not be left up to the lab in question to the extent this implies. In any case, I am impressed they went there in many senses, but it feels off. More of these should be clear rules and hard requirements, not sources of points, and we should specify more of them.
Also, okay, we are definitely drawing an implied distinction between high-concern and other levels, while being short of what Section 9 deems ‘extremely high-concern’ AI systems. I don’t love the attempt at a continuum.
(8j) Rubrics for deployment
(a) Evidence the system is ‘robustly aligned’ under plausible conditions.
(b) Plan for traceability.
(c) Plan for preventing use, access and reverse engineering in places that lack adequate AI safety legislation.
(d) Plan for avoiding future changes increasing the risks from the systems, such as from fine-tuning, plug-ins, utilities or other modifications.
(e) Plan to monitor the system and if needed shut it down.
(f) Danger that the AI could itself advance AI capabilities, or autonomously survive, replicate or spread.
(g) Direct catastrophic risks such as bioweapons, hacking, and so on.
That is quite the mixed bag. I notice that it is very unclear what it would mean to have a ‘points system’ on these to decide who gets to deploy and who does not, and this carries a lot of risk for the company if they might develop an expensive system and then not be allowed to deploy in a way that is hard to predict.
I do very much appreciate that (e) and (f) are here explicitly, not only (g).
I notice that (d) confuses me, since fine-tuning should require a permit anyway, what is it doing there? And also similar for plug-ins and other modifications, what is the intent here? And how do you really stop this, exactly? And (c) worries me, are we going to not let people be users in countries with ‘inadequate legislation’? If you have adequate precautions in place your users should be able to be wherever they want. There are so many battles this sets us up for down the line.

Open Model Weights Are Unsafe And Nothing Can Fix This

What about open source models?

Well, how exactly do you propose they fit into the rubrics we need?

(8k) Considerations for open source frontier models. So there is an obvious problem here, for open source systems. Look at the rubrics for deployment. You are going to get a big fat zero for (b), (c), (d) and (e), and also (a) since people can fine-tune away the alignment. These are impossible things to do with open model weights. In the original there were eight considerations (f combines two of them), so this means you utterly fail five out of eight. If we are taking this seriously, then a ‘high-risk model’ with open model weights must be illegal, period, or what the hell are we even doing.
The response ‘but that’s not fair, make an exception, we said the magic words, we are special, the rules do not apply to us’ is not how life or law works, open model weights past the high-risk threshold are simply a blatant f*** you to everything this law is trying to do.
So what to do? (8k) instead offers ‘considerations,’ and calls for ‘fairly considering both the risks and benefits associated with open source frontier AI systems, including both the risk that an open source frontier AI system might be difficult or impossible to remove from the market if it is later discovered to be dangerous, and the benefits that voluntary, collaborative, and transparent development of AI offers society.’
I mean, lol. The rest of the section essentially says ‘but what if this so-called ‘open source’ system was not actually open source, would it be okay then?’ Maybe.
It says (8k3) ‘no automatic determinations.’ You should evaluate the system according to all the rubrics, not make a snap judgment. But have you seen the rubrics? I do not see how a system can be ‘high-risk’ under this structure, and for us to be fine sharing its model weights. Perhaps we could still share its source code, or its data, or even both, depends on details, but not the weights.
That is not because these are bad rubrics. This is the logical consequence of thinking these models are high-concern and then picking any reasonable set of rubrics. They could use improvement of course, but overall they are bad rubrics if and only if you think there is no importantly large risk in the room.
Will open weights advocates scream and yell and oppose this law no matter what? I mean, oh hell yes, there is no compromise that will get a Marc Andreessen or Richard Sutton or Yann LeCun on board and also do the job at hand.
That is because this is a fundamental incompatibility. Some of us want to require that sufficiently capable future AI systems follow basic safety requirements. The majority of those requirements are not things open weights models are capable of implementing, on a deep philosophical level, in a way that open weights advocates see as a feature rather than a bug. The whole point is that anyone can do whatever they want with the software, and the whole point of this bill is to put restrictions on what software you can create and what can be done with that software.
If you think this is untrue, prove me wrong, kids. If open model weights advocates have a plan, even a bad start of a plan, for how to achieve the aims and motivations behind these proposals without imposing such restrictions, none of them have deigned to tell me about it. It seems impossible even in theory, as explained above.
Open weights advocates have arguments for why we should not care about those aims and motivations, why everything will be wonderful anyway and there is no risk in the room. Huge if true, but I find those deeply uncompelling. If you believe there is little underlying catastrophic or existential risk for future frontier AI systems, then you should oppose any version of this bill.

Extremely High Concern Systems

What about those ‘extremely’ high concern systems? What to do then? What even are they? Can the people writing these documents please actually specify at least a for-now suggested definition, even if no one is that close to hitting it yet?

(9) There will be specifications offered for what is an ‘extremely high-concern AI system,’ the definition of which should be created within 12 months of passage, and the deployment requirements for such systems within 30 months. Neither is spelled out here, similarly to how OpenAI and Anthropic both largely have an IOU or TBD where the definitions should be in their respective frameworks.
They do say something about the framework, that it should take into account:
(a) Whether the architecture is fundamentally safe.
(b) Whether they have mathematical proofs the AI system is robustly aligned.
(c) Whether it is ‘inherently unable’ to assist with WMDs.
(d) Whether it is specifically found to be inherently unable to autonomously replicate.
(e) Whether it is specifically found to be inherently unable to accelerate scientific or engineering progress sufficiently to pose national security risks.
I know, I know! Pick me, pick me! The answers are:
(a) No*,
(b) No*,
(c) No,
(d) No,
(e) and no.
The asterisk is that perhaps Davidad’s schema will allow a proof in a way I do not expect, or we will find a new better architecture. And of course it is possible that your system simply is not that capable and (c), (d) and (e) are not issues, in which case we presumably misclassified your model.
But mostly, no, if yours is an ‘extremely high-concern’ system then it is not safe for deployment. I am, instead, extremely concerned. That is the whole point of the name.
Will that change in the future, when we get better techniques for dealing with such systems? I sure hope so, but until that time, not so much.

This is a formal way of saying exactly that. There is a set of thresholds, to be defined later, beyond which no, you are simply not going to be allowed to create or deploy an AI system any time soon.

The problem is that this is a place one must talk price, and they put a ‘TBD’ by the price. So we need to worry the price could be either way too high, or way too low, or both in different ways.

The Judges Decide

The actual decision process is worth highlighting. It introduces random judge selection into the application process, then offers an appeal, followed by anticipating lawsuits. I worry this introduces randomness that is bad for both business and risk, and also that the iterated process is focused on the wrong type of error. You want this type of structure when you worry about the innocent getting punished, whereas here our primary concern about error type is flipped.

(10a) Saying ‘The Administrator shall appoint AI Judges (AIJs)’ is an amusing turn of phrase, for clarity I would change the name, these are supposed to be humans. I indeed worry that we will put AIs in charge of such judgments rather soon.
(10c) Applications are reviewed by randomly selected 3-judge panels using private technical evaluators for help. The application is evaluated within 60 days, but they outright consider the possibility they will lack the capacity to do this? I get that government has this failure mode (see: our immigration system, oh no) but presumably we should be a little less blasé about the possibility. I notice that essentially you apply, then a random group grades you mostly pass/fail (they can also impose conditions or request revisions), and this does not seem like the way you would design a collaborative in-the-spirit process. Can we improve on this? Also I worry about what we would do about resubmissions, where there are no easy answers under a random system.
(11) Yes, you can appeal, and the appeal board is fixed and considers issues de novo when it sees fit. And then, if necessary, the company can appeal to the courts. I worry that this is backwards. In our criminal justice system, we rightfully apply the rule of double jeopardy and provide appeals and other rules to protect defendants, since our top priority is to protect the innocent and the rights of defendants. Here, our top priority should be to never let a model be trained or released in error, yet the companies are the ones with multiple bites at the apple. It seems structurally backwards, we should give them less stringent hurdles but not multiple apple bites, I would think?
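
To make the resubmission worry in (10c) concrete, here is a minimal sketch of how likely it is that a resubmitted application lands in front of an entirely new panel. The pool sizes are hypothetical (the bill as summarized here does not say how many judges there will be), and the function name is mine. It also assumes panels are drawn uniformly at random and independently each time.

```python
from math import comb


def prob_disjoint_panel(pool_size: int, panel_size: int = 3) -> float:
    """Chance that a fresh random panel shares no judge with the previous one.

    Assumes each panel is a uniform random draw of `panel_size` judges
    from a pool of `pool_size`, independent of earlier draws.
    """
    if pool_size < panel_size:
        raise ValueError("pool must be at least as large as a panel")
    # Panels avoiding all previous judges, divided by all possible panels.
    return comb(pool_size - panel_size, panel_size) / comb(pool_size, panel_size)


if __name__ == "__main__":
    for pool in (6, 12, 30):  # hypothetical pool sizes
        print(pool, round(prob_disjoint_panel(pool), 3))
```

With a small pool a resubmission usually sees at least one familiar judge, while with a large pool it is close to a fresh roll of the dice, which is the tension between consistency and randomness noted above.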

Several Rapid-Fire Final Sections

There is some very important stuff here. Any time anyone says ‘emergency powers’ or ‘criminal penalties’ you should snap to attention. The emergency powers issues will get discussed in more depth when I handle objections.

(12) Hardware monitoring. Tracking of ‘high performance’ AI hardware. I like specificity. Can we say what counts here?
(13) You shall report to Congress each year and provide statistics.
(14a) AI developers are assigned a duty of care for civil liability, with joint and several liability, private right of action and public right of action, and strict liability, with exceptions for bona fide error, and potential punitive damages. Open source is explicitly not a defense, nor is unforeseeability of misalignment (although also, if you don’t foresee it, let me stop you right there). All the liability bingo cards should be full, this seems very complete and aggressive as written, although that could be wise.
(15b) Criminal felony liability if you ignore an emergency order and fail to take steps within your power to comply, get your permit rejected and train anyway, get approved with conditions and knowingly violate the conditions, knowingly submit false statements on your application, or fraudulently claim intention to do safety precautions.
1. I note a lot of this involves intent and knowledge. You only go to jail (for 10-25 years no less) if you knowingly break the rules, or outright defy them, and the government will need to prove that. The stakes here will be very high, so you do need to be able to have enforcement teeth. Do they need to be this sharp? Is this too much? Will it scare people off? My guess is this is fine, and no one will actually fear going to jail unless they actually deserve it. You can say ‘oh the engineer who disregarded the conditional approval rules does not deserve a decade in prison’ and in many cases I would agree with you and hopefully they move up the chain as per normal instead, but also if you are actually training an existentially risky model in defiance of the rules? Yeah, I do think that is a pretty big freaking deal.
(15c) Misdemeanor liability here is 6 months to a year (plus fines). I notice this gets weird. (1) is using or improving a frontier model without a permit. So not asking for a permit is a misdemeanor, going through a rejection is a felony? I do not love the incentives there. If you know you are ‘improving’ a frontier model without a permit, then I do not see why you should get off light, although mere use does seem different. Trigger (2) is recklessness with requirements, that results in failure, I don’t love any options on this type of rule. (3) is submitting a knowingly incomplete or misleading application, rather than false, I am not sure how that line is or should be drawn. (4) is intentionally sabotaging a benchmark score in order to get less regulatory scrutiny, and I think that has to be a felony here. This is lying on an application, full stop, maybe worse.
(There are more enforcement rules and crime specifications, they seem standard.)
(16) Emergency powers. The President can declare an emergency for up to a year due to an AI-related national security risk, more than that requires Congress. That allows the President to: Suspend permits, stop actions related to frontier AI, require safety precautions, seize model weights, limit access to hardware, issue a general moratorium or ‘take any other actions consistent with this statutory scheme that the Administrator deems necessary to protect against an imminent major security risk.’
So, basically, full emergency powers related to inhibiting AI, as necessary. I continue to be confused about what emergency powers do and do not exist in practice. Also I do not see a way to deal with a potential actual emergency AI situation that may arise in the future, without the use of emergency powers like this, to stop systems that must be stopped? What is the alternative? I would love a good alternative. More discussion later.
(17) Whistleblower protections, yes, yes, obviously.
(18-20) Standard boiler-plate, I think.
(21) There is some very strong language in the severability clause that makes me somewhat nervous, although I see why they did it.

Overall Take: A Forceful, Flawed and Thoughtful Model Bill

I think it is very good that they took the time to write a full detailed bill, so now we can have discussions like this, and talk both price and concrete specific proposals.

What are the core ideas here?

1. We should monitor computer hardware suitable for frontier model training, frontier model training runs, the stewardship of resulting model weights and how such models get deployed.
2. We should do this when capability thresholds are reached, and ramp up the amount of monitoring as those thresholds get crossed.
3. At some point, models get dangerous enough we should require various precautions. You will need to describe what you will do to ensure all this is a safe and wise thing to be doing, and apply for a permit.
4. As potential capabilities increase, so do the safety requirements and your responsibilities. At some further point, we do not know a way to do this safely, so stop.
5. Those rules should be adjusted periodically to account for technological developments, and be flexible and holistic, so they do not become impossible to change.
6. There should be criminal penalties for openly and knowingly defying all this.
7. Given our options and the need to respond quickly to events, we should leave these decisions with broad discretion to an agency, letting it respond quickly, with the head appointed by the President, with the advice and consent of the Senate.
8. The President should be able to invoke emergency powers to stop AI activity, if he believes there is an actual such emergency.
9. Strict civil liability in all senses for AI if harm ensues.
10. Strong whistleblower protections.
11. We should do this via a new agency, rather than doing it inside an existing one.

I strongly agree with #1, #2, #3, #4, #5, #6 and #10. As far as I can tell, these are the core of any sane regulatory regime. I believe #9 is correct if we find the right price. I am less confident in #7 and #8, but do not know what a superior alternative would be.

The key, as always, is talking price, and designing the best possible mechanisms and getting the details right. Doing this badly can absolutely backfire, especially if we push too hard and set unreasonable thresholds.

I do think we should be aware of and prepared for the fact that, at some point in the future there is a good chance that the thresholds and requirements will need to be expensive, and impose real costs, if they are to work. But that point is not now, and we need to avoid imposing any more costs than we need to, going too far too fast will only backfire.

The problem is both that the price intended here seems perhaps too high too fast, and also that it dodges much talking of price by kicking that can to the new agency. There are several points in this draft (such as the 10^24 threshold for medium-concern) where I feel that the prices here are too high, in addition to places where I believe implementation details need work.

There is also #9, civil liability, which I also support as a principle, where one can fully talk price now, and the price here seems set maximally high, at least within the range of sanity. I am not a legal expert here but I sense that this likely goes too far, and compromise would be wise. But also that is the way of model bills.

That leaves the hard questions, #7, #8 and #11.

On #7, I would like to offer more guidance and specification for the new agency than is offered here. I do think the agency needs broad discretion to put up temporary barriers quickly, set new thresholds periodically, and otherwise assess the current technological state of play in a timely fashion. We do still have great need for Congressional and democratic oversight, to allow for adjustments and fixing of overreach or insider capture if mistakes get made. Getting the balance right here is going to be tricky.

On #8, as I discuss under objections, what is the alternative? Concretely, if the President decides that an AI system poses an existential risk (or other dire threat to national security), and that threat is imminent, what do you want the President to do about that? What do you think or hope the President would do now? Ask for Congress to pass a law?

We absolutely need, and I would argue already have de facto, the ability to in an emergency shut down an AI system or project that is deemed sufficiently dangerous. The democratic control for that is periodic elections. I see very clear precedent and logic for this.

And yes, I hate the idea of states of emergency, and yes I have seen Lisa Simpson’s TED Talk, I am aware that if you let the government break the law in an emergency they will create an emergency in order to break the law. But I hate this more, not less, when you do it anyway and call it something else. Either the President has the ability to tell any frontier AI project to shut down for now in an actual emergency, or they don’t, and I think ‘they don’t’ is rather insane as an option. If you have a better idea how to square this circle I am all ears.

On #11, this was the one big objection made when I asked someone who knows about bills and the inner workings of government and politics to read the bill, as I note later. They think that the administrative, managerial, expertise and enforcement burdens would be better served by placing this inside an existing agency. This certainly seems plausible, although I would weigh it against the need for a new distinctive culture and the ability to move fast, and the ability to attract top talent. I definitely see this as an open question.

In response to my request on Twitter, Jules Robins was the only other person to take up reading the bill.

Jules Robins: Overall: hugely positive update if this looks like something congress would meaningfully consider as a starting point. I’m not confident that’s the case, but hopefully it at least moves the Overton Window. Not quite a silver bullet (I’ll elaborate below), but would be a huge win.

Biggest failings to my eyes are:

1. Heavily reliant on top officials very much embracing the spirit of the assignment. I mean, that was probably always going to be true, but much of the philosophical bent requires lots of further research and rule-making to become effective.

2. Doesn’t really grapple with the reality we may be living in (per the recent Google paper) where you can train a frontier model without amassing a stock of specialized compute (say, SETI style). Ofc that’s only newly on most radars, and this was in development long before that.

Other odds and ends: Structure with contention favoring non-permiting is great here. As is a second person in the organization with legal standing to contest an agency head not being cautious enough.

Some checks on power I’d rather not have given this already only works with aligned officials (e.g. Deputy Administrator for Public Interest getting stopped by confidentiality, relatively light punishments for some violations that could well be pivotal)

Model tiering leaves a potentially huge hole: powerful models intended for a narrow task that may actually result in broad capabilities to train on. (e.g. predicting supply & demand is best done by forecasting Earth-system wide futures).

So all-in-all, I’d be thrilled if we came out with something like this, but it’d require a lot more work put in by the (hopefully very adept people) put in charge.

Were ~this implemented, there would be potential for overreach. There are likely better mitigations than the proposal has, but I doubt you can make a framework that adapts to the huge unknowns of what’s necessary for AGI safety without broad enough powers to run overreach risk.

This was mostly measured, but otherwise the opposite of the expected responses from the usual objectors. Jules saw that this bill is making a serious attempt to accomplish its mission, but that there are still many ways it could fail to work, and did not focus on the potential places there could be collateral damage or overreach of various kinds.

Indeed, there are instead concerns that the checks on power that are here could interfere, rather than worrying about insufficient checks on power. The right proposal should raise concerns in both directions.

But yes, Jules does notice that if this exact text got implemented, there are some potential overreaches.

The spirit of the rules point is key. Any effort without the spirit of actually creating safety driving actions is going to have a bad time unless you planned to route around that, and this law does not attempt to route around that.

I did notice the Google paper referenced here, and I am indeed worried that we could in time lose our ability to monitor compute in this way. If that happens, we are in even deeper trouble, and all our options get worse. However I anticipate that the distributed solution will be highly inefficient, and difficult to scale on the level of actually dangerous models for some time. I think for now we proceed anyway, and that this is not yet our reality.

I definitely thought about the model purpose loophole. It is not clear that this would actually get you much of a requirement discount given my reading, but it is definitely something we will need to watch. The EU’s framework is much worse here.

The Usual Objectors Respond: The Severability Clause

The bill did give its critics some soft rhetorical targets, such as the severability clause, which I didn’t bother reading, assuming it was standard, until Matt Mittelsteadt pointed it out. The provision definitely didn’t look good when I first read it, either:

Matt Mittelsteadt: This is no joke. They even wrote the severability clause to almost literally say ‘AI is too scary to follow the constitution and therefore this law can’t be struck by the courts.’

Here is the clause itself, in full:

The primary purpose of this Act is to reduce major security risks from frontier AI systems. Moreover, even a short interruption in the enforcement of this Act could allow for catastrophic harm.

Therefore, if any portion or application of this Act is found to be unconstitutional, the remainder of the Act shall continue in effect except in so far as this would be counterproductive for the goal of reducing major security risks.

Rather than strike a portion of the Act in such a way as to leave the Act ineffective, the Courts should amend that portion of the Act so as to reduce major security risks to the maximum extent permitted by the Constitution.

Then I actually looked at the clause and thought about it, and it made a lot more sense.

The first clause is a statement of intent and an observation of fact. The usual suspects will of course treat it as scaremongering but in the world where this Act is doing good work this could be very true.

The second clause is actually weaker than a standard severability clause, in a strategic fashion. It is saying, sever, but only sever if that would help reduce major security risks. If severing would happen in a way that would make things worse than striking down more of the law, strike down more on that basis. That seems good.

The third clause is saying that if a clause is found unconstitutional, then rather than strike even that clause, they are authorized to modify that clause to align with the rest of the law as best they can, given constitutional restrictions. Isn’t that just… good? Isn’t that what all laws should say?

So, for example, there was a challenge to the ACA’s individual mandate in 2012 in NFIB v. Sebelius. The mandate was upheld on the basis that it was a tax. Suppose that SCOTUS had decided that it was not a tax, even though it was functionally identical to a tax. In terms of good governance, the right thing to do is to say ‘all right, we are going to turn it into a tax now, and write new law, because Congress has explicitly authorized us to do that in this situation in the severability provision of the ACA.’ And then, if Congress thinks that is terrible, they can change the law again. But I am a big fan of ‘intent wins’ and trying to get the best result. Our system of laws does not permit this by default, but if legal I love the idea of delegating this power to the courts, presumably SCOTUS. Maybe I am misunderstanding this?

So yeah, I am going to bite the bullet and say this is actually good law, even if its wording may need a little reworking.

The Usual Objectors Respond: Inception

Next we have what appears to me to be an attempted inception from Jeremiah Johnson, saying the bill is terrible and abject incompetence that will only hurt the cause of enacting regulations, in the hopes people will believe this and make it true.

I evaluated this claim by asking someone I know who works on political causes not related to AI, with a record of quietly getting behind the scenes stuff done, to read the bill without giving my thoughts, to get a distinct opinion.

The answer came back that this was indeed a very professionally drafted, well thought out bill. Their biggest objection was that they thought it was a serious mistake to make this a new agency, rather than put it inside an existing one, due to the practical considerations of logistics, enforcement and ramping up involved. Overall, they said that this was ‘a very good v1.’

Not that this ever stops anyone.

Claiming the other side is incompetent and failing and they have been ‘destroyed’ or ‘debunked’ and everyone hates them now is often a highly effective strategy. Even I give pause and get worried there has been a huge mistake, until I do what almost no one ever does, and think carefully about the exact claims involved and read the bill. And that’s despite having seen this playbook in action many times.

Notice that Democrats say this about Republicans constantly.

Notice that Republicans say this about Democrats constantly.

So I do not expect them to stop trying it, especially as people calibrate based on past reactions. I expect to hear this every time, with every bill, of any quality.

The Usual Objectors Respond: Rulemaking Authority

Then we have this, where Neil Chilson says:

Neil Chilson (Head of AI Policy at Abundance Institute): There is a new AI proposal from @aipolicyus. It should SLAM the Overton window shut.

It’s the most authoritarian piece of tech legislation I’ve read in my entire policy career (and I’ve read some doozies).

Everything in the bill is aimed at creating a democratically unaccountable government jobs program for doomers who want to regulate math.

I mean, just check out this section, which in a mere six paragraphs attempts to route around any potential checks from Congress or the courts.

You know you need better critics when they pull out ‘regulate math’ and ‘government jobs program’ at the drop of a hat. Also, this is not how the Overton Window works.

But I give him kudos for both making a comparative claim, and for highlighting the actual text of the bill that he objects to most, in a section I otherwise skipped. He links to section 6, which I had previously offloaded to Gemini.

Here is what he quotes, let’s check it in detail, that is only fair, again RTFB:

f) CONGRESSIONAL REVIEW ACT.

(1) The Administrator may make a determination pursuant to 5 U.S.C. §801(c) that a rule issued by the Administrator should take effect without further delay because avoidance of such delay is necessary to reduce or contain a major security risk. If the Administrator makes such a determination and submits written notice of such determination to the Congress, then a rule that would not take effect by reason of 5 U.S.C. §801(a)(3) shall nevertheless take effect. The exercise of this authority shall have no effect on the procedures of 5 U.S.C. § 802 or on the effect of a joint Congressional resolution of disapproval.

So as I understand it, normally any new rule requires a 60 day waiting period before being implemented under 5 U.S.C. §801(a)(3), to allow for review or challenge. This is saying that, if deemed necessary, rules can be changed without this waiting period, while still being subject to the review and potentially being pared back.

Also my understanding is that the decision here of ‘major security risk’ is subject to judicial review. So this does not prevent legal challenges or Congressional challenges to the new rule. What it does do is it allows stopping activity by default. That seems like a reasonable thing to be able to do in context?

(2) Because of the rapidly changing and highly sensitive technical landscape, a rule that appears superficially similar to a rule that has been disapproved by Congress may nevertheless be a substantially different rule. Therefore, a rule issued under this section that varies at least one material threshold or material consequence by at least 20% from a previously disapproved rule is not “substantially the same” under 5 U.S.C. § 802(b)(2).

This is very much pushing it. I don’t like it. I think here Neil has a strong point.

I do agree that rules that appear similar can indeed not be substantially similar, and that the same rule rejected before might be very different now.

But changing a ‘penalty’ by 20% and saying you changed the rule substantially? That’s clearly shenanigans, especially when combined with (1) above.
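
To see how low a bar (2) sets, here is a minimal sketch of the 20% test as I read it. The field names and dollar figures are hypothetical, and measuring the change relative to the old value is my assumption, since the quoted text does not say how the 20% is computed.

```python
def varies_by_at_least(old: float, new: float, fraction: float = 0.20) -> bool:
    """True if `new` differs from `old` by at least `fraction` of `old`.

    Assumption: the change is measured relative to the disapproved rule's value.
    """
    if old == 0:
        return new != 0
    return abs(new - old) / abs(old) >= fraction


def escapes_substantially_the_same(old_rule: dict, new_rule: dict) -> bool:
    """True if any shared threshold or consequence moved by at least 20%.

    Under (f)(2) as quoted, one such change is enough for the new rule
    to not count as "substantially the same" as the disapproved one.
    """
    shared = old_rule.keys() & new_rule.keys()
    return any(varies_by_at_least(old_rule[k], new_rule[k]) for k in shared)


if __name__ == "__main__":
    # Hypothetical disapproved rule and a reissue that only bumps the fine.
    disapproved = {"compute_threshold_flop": 10**24, "fine_usd": 1_000_000}
    reissued = {"compute_threshold_flop": 10**24, "fine_usd": 1_200_000}
    print(escapes_substantially_the_same(disapproved, reissued))
```

In this sketch the reissued rule keeps the same compute threshold and only raises the fine, yet it clears the test, which is the kind of move I am objecting to.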

The parties involved should not need such a principle. They should be able to decide for themselves what ‘substantially similar’ means. Alas, this law did not specify how any of this works, there is no procedure, it sounds like?

So there is a complex interplay involved, and everything is case-by-case and courts sometimes intervene and sometimes won’t, which is not ideal.

I think this provision should be removed outright. If the procedure for evaluating this is so terrible it does not work, then we should update 5 U.S.C. § 802(b)(2) with a new procedure. Which it sounds like we definitely should do anyway.

If an agency proposes a ‘substantially similar’ law to Congress, here or elsewhere, my proposed new remedy is that it should need to be noted in the new rule that it may be substantially similar to a previous proposal that was rejected. Congress can then stamp it ‘we already rejected this’ and send it back. Or, if they changed their minds for any reason, an election moved the majority or a minor tweak fixes their concerns, they can say yes the second time. The law should spell this out.

(g) MAJOR QUESTIONS DOCTRINE. It is the intent of Congress to delegate to the Administration the authority to mitigate the major security risks of advanced, general-purpose artificial intelligence using any and all of the methods described in this Act. The Administration is expected and encouraged to rapidly develop comparative expertise in the evaluation of such risks and in the evaluation of the adequacy of measures intended to mitigate these risks. The Administration is expressly authorized to make policy judgments regarding which safety measures are necessary in this regard. This Act shall be interpreted broadly, with the goal of ensuring that the Administration has the flexibility to adequately discharge its important responsibilities.

If you think we have the option to go back to Congress as the situation develops to make detailed decisions on how to deal with future general-purpose AI security threats, then either you do not think we will face such threats, you think Congress will be able to keep up, you are fine not derisking, or you have not met Congress.

That does not mean we should throw out rule of law or the constitution, and give the President and whoever he appoints unlimited powers to do what they want until Congress manages to pass a law to change that (which presumably will never happen). Also that is not what this provision would do, although it pushes in that direction.

Does this language rub us all the wrong way? I hope so, that is the correct response to the choices made here. It seems expressly designed to give the agency as free a hand as possible until such time as Congress steps in with a new law.

The question is whether that is appropriate.

(h) NO EFFECT ON EMERGENCY POWERS. Nothing in this section shall be construed to limit the emergency powers granted by Section 11.

Yes, yes, ignore.

Finally we have this:

(i) STANDARD FOR REVIEW. In reviewing a rule promulgated under this Act that increases the strictness of any definition or scoring criterion related to frontier AI, a court may not weaken or set aside that rule unless there is clear and convincing evidence of at least one of the following

(1) doing so will not pose major security risks, or

(2) the rule exceeded the Administrator’s authority.

That doesn’t sound awesome. Gemini thinks that courts would actually respect this clause, which initially surprised me. My instinct was that a judge would laugh in its face.

I do notice that this is constructed narrowly. This is specifically about changing strictness of definitions towards being more strict. I am not loving it, but also the two clauses here that still allow review seem reasonable to me, and if they go too far the court should strike whatever it is down anyway, I would assume.

Conclusion

The more I look at the detailed provisions here, the more I see very thoughtful people who have thought hard about the situation, and are choosing very carefully to do a specific thing. The people objecting to the law are objecting exactly because the bill is well written, and is designed to do the job it sets out to do. Because that is a job that they do not want to see be done, and they aim to stop it from happening.

There are also legitimate concerns here. This is only a model bill, as noted earlier there is still much work to do, and places where I think this goes too far, and other places where if such a bill did somehow pass no doubt compromises will happen even if they aren’t optimal.

But yes, as far as I can tell this is a serious, thoughtful model bill. That does not mean it or anything close to it will pass, or that it would be wise to do so, especially without improvements and compromises where needed. I do think the chances of this type of framework happening very much went up.

14 comments

comment by Zach Stein-Perlman · 2024-04-10T19:06:31.250Z · LW(p) · GW(p)

In addition to the bill, CAIP has a short summary and a long summary.

comment by JenniferRM · 2024-04-11T23:34:57.499Z · LW(p) · GW(p)

I feel (mostly from observing an omission (I admit I have not yet RTFB)) that the international situation is not correctly countenanced here. This bit is starting to grapple with it:

Plan for preventing use, access and reverse engineering in places that lack adequate AI safety legislation.

Other than that, it seems like this bill basically thinks that America is the only place on Earth that exists and has real computers and can make new things????

And even, implicitly in that clause, the worry is "Oh no! What if those idiots out there in the wild steal our high culture and advanced cleverness!"

However, I expect other countries with less legislation to swiftly sweep into being much more "advanced" (closer to being eaten by artificial general super-intelligence) by default.

It isn't going to be super hard to make this stuff, it's just that everyone smart refuses to work on it because they don't want to die. Unfortunately, even midwits can do this. Hence (if there is real danger) we probably need legislative restrictions.

That is: the whole point of the legislation is basically to cause "fast technological advancement to reliably and generally halt" (like we want the FAISA to kill nearly all dramatic and effective AI innovation (similarly to how the FDA kills nearly all dramatic and effective Drug innovation, and similar to how the Nuclear Regulatory Commission killed nearly all nuclear power innovation and nuclear power plant construction for decades)).

If other countries are not similarly hampered by having similar FAISAs of their own, then they could build an Eldritch Horror and it could kill everyone.

Russia didn't have an FDA, and invented their own drugs.

France didn't have the NRC, and built an impressively good system of nuclear power generation.

I feel that we should be clear that the core goal here is to destroy innovative capacity, in AI, in general, globally, because we fear that innovation has a real chance, by default, by accident, of leading to "automatic human extinction".

The smart and non-evil half of the NIH keeps trying to ban domestic Gain-of-Function research... so people can just do that in Norway and Wuhan instead. It still can kill lots of people, because it wasn't taken seriously in the State Department, and we have no global restriction on Gain-of-Function. The Biological Weapons Convention exists, but the BWC is wildly inadequate on its face.

The real and urgent threat model here is (1) "artificial general superintelligence" arises and (2) gets global survive and spread powers and then (3) thwarts all human aspirations like we would thwart the aspirations of ants in our kitchen.

You NEED global coordination to stop this EVERYWHERE or you're just re-arranging who, in the afterlife, everyone will be pointing at to blame them for the end of humanity.

The goal isn't to be blameless and dead. The goal is to LIVE. The goal is to reliably and "on purpose" survive and thrive, in humanistically delightful ways, in the coming decades, centuries, and millennia.

If extinction from non-benevolent artificial superintelligence is a real fear, then it needs international coordination. If this is not a real fear, then we probably don't need the FAISA in the US.

So where is the mention of a State Department loop? Where is the plan for diplomacy? Where are China or Russia or the EU or Brazil or Taiwan or the UAE or anyone but America mentioned?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-04-11T23:39:26.810Z · LW(p) · GW(p)

Two obvious points:

It is deontologically more ethical to not yourself kill everyone in the world.
America has an incredible ability to set fashions, and if it took on these policies then I think a great number of others would follow suit.

Replies from: JenniferRM, Benito

↑ comment by JenniferRM · 2024-04-15T01:39:49.024Z · LW(p) · GW(p)

Rather than have America hope to "set a fashion" (that would obviously (to my mind) NOT be "followed based on the logic of fashion") in countries that hate us, like North Korea and so on...

I would prefer to reliably and adequately cover EVERY base that needs to be covered and I think this would work best if people in literally every American consulate in every country (and also at least one person for every country with no diplomatic delegation at all) were tracking the local concerns, and trying to get a global FAISA deal done.

If I might rewrite this a bit:

The goal isn't FOR AMERICA to be blameless and EVERYONE to be dead. The goal is for ALL HUMANS ON EARTH to LIVE. The goal is to reliably and "on purpose" survive and thrive, on Earth, in general, even for North Koreans, in humanistically delightful ways, in the coming decades, centuries, and millennia.

The internet is everywhere. All software is intrinsically similar to a virus. "Survive and spread" capabilities in software are the default, even for software that lacks general intelligence.

If we actually believe that AGI convergently heads towards "not aligned with Benevolence, and not aligned with Natural Law, and not caring about humans, nor even caring about AI with divergent artificial provenances" but rather we expect each AGI to head toward "control of all the atoms and joules by any means necessary"... then we had better stop each and every such AGI very soon, everywhere, thoroughly.

↑ comment by Ben Pace (Benito) · 2024-04-17T00:06:55.628Z · LW(p) · GW(p)

@Zach Stein-Perlman [LW · GW] I'm not really sure why you gave a thumbs-down. Probably you're not trying to communicate that you think there shouldn't be deontological injunctions against genocide. I think someone renouncing any deontological injunctions against such devastating and irreversible actions would be both pretty scary and reprehensible. But I failed to come up with a different hypothesis for what you are communicating with a thumbs-down on that statement (to be clear I wouldn't be surprised if you provided one).

Replies from: Zach Stein-Perlman

↑ comment by Zach Stein-Perlman · 2024-04-17T00:14:45.064Z · LW(p) · GW(p)

Suppose you can take an action that decreases net P(everyone dying) but increases P(you yourself kill everyone), and leaves all else equal. I claim you should take it; everyone is better off if you take it.

I deny "deontological injunctions." I want you and everyone to take the actions that lead to the best outcomes, not that keep your own hands clean. I'm puzzled by your expectation that I'd endorse "deontological injunctions."

This situation seems identical to the trolley problem in the relevant ways. I think you should avoid letting people die, not just avoid killing people.

[Note: I roughly endorse heuristics like if you're contemplating crazy-sounding actions for strange-sounding reasons, you should suspect that you're confused about your situation or the effects of your actions, and you should be more cautious than your naive calculations suggest. But that's very different from deontology.]

Replies from: Raemon

↑ comment by Raemon · 2024-04-17T22:15:19.712Z · LW(p) · GW(p)

I think I have a different overall take than Ben here, but the frame I think makes sense here is to be like: "Deontological injunctions are guardrails. There are hypothetical situations (and some real situations) where it's correct to override them, but the guardrail should have some weight, and for more important guardrails, you need clearer reasoning for why avoiding it actually helps."

I don't know what I think about this in the case of a country passing laws. Countries aren't exactly agents. Passing novel laws is different than following existing laws. But, I observe:

it's really hard to be confident about longterm consequences of things. Consequentialism just isn't actually compute-efficient enough to be what you use most of the time for making decisions. (This includes but isn't limited to "you're contemplating crazy sounding actions for strange sounding reasons", although I think it has a similar generator)
it matters not just what you-in-particular-in-a-vacuum do, in one particular timeslice. It matters how complicated the world is to reason about. If everyone is doing pure consequentialism all the time, you have to model the way each person is going to interpret consequences with their own special-snowflake worldview. Having to model "well, Alice and Bob and Charlie and 1000s of other people might decide to steal from me, or from my friends, if the benefits were high enough and they thought they could get away with it" adds a tremendous amount of overhead.

You should be looking for moral reasoning that makes you simple to reason about, and that performs well in most cases. That's a lot of what deontology is for.

comment by Chris_Leong · 2024-04-11T03:38:19.069Z · LW(p) · GW(p)

My thoughts:
a) Some of the penalties seemed too weak
b) Uncertain whether we want license appeals decided by judges. I would prefer the approval to be decided on technical grounds, but for judges to intervene to ensure that the process is fair. Or maybe a committee that is mostly technical, but that contains a non-voting legal expert to ensure compliance.
c) I would prefer a strong stand against dangerous open-weight models.

comment by Soapspud · 2024-04-11T19:24:39.727Z · LW(p) · GW(p)

We only have people who cry wolf all the time. I love that for them, and thank them for their service, which is very helpful. Someone needs to be in that role, if no one is going to be the calibrated version. Much better than nothing. Often their critiques point to very real issues, as people are indeed constantly proposing terrible laws.
The lack of something better calibrated is still super frustrating.

This mental (or emotional) move here, where you manage to be grateful for people doing a highly imperfect job while also being super frustrated that no one is doing a genuinely good job: how are you doing that?

I see this often in rationalist spaces, and I'm confused about how people learn to do this. I would probably end up complaining about the failings of the best (highly inadequate) strategies we've got without the additional perspective of "how would things be if we didn't even have this?"

For people who remember learning how to do this, how did you practice?

Replies from: Zvi

↑ comment by Zvi · 2024-04-11T20:26:38.499Z · LW(p) · GW(p)

My guess is that different people do it differently, and I am super weird.

For me a lot of the trick is consciously asking if I am providing good incentives, and remembering to consider what the alternative world looks like.

comment by trevor (TrevorWiesinger) · 2024-04-11T00:37:06.054Z · LW(p) · GW(p)

I think this might be a little too harsh on CAIP (discouragement risk). If shit hits the fan, they'll have a serious bill ready to go for that contingency.

Seriously writing a bill-that-actually-works shows beforehand that they're serious, and the only problem was the lack of political will (which in that contingency would be resolved).

If they put out a watered-down bill designed to maximize the odds of passage then they'd be no different from any other lobbyists.

It's better in this case to instead have a track record for writing perfect bills that are passable (but only given that shit hits the fan), than a track record for successfully pumping the usual garbage through the legislative process (which I don't see them doing well at; playing to your strengths is the name of the game for lobbying and "turning out to be right" is CAIP's strength).

Replies from: Zvi, Zach Stein-Perlman

↑ comment by Zvi · 2024-04-11T14:15:31.276Z · LW(p) · GW(p)

I don't see this response as harsh at all? I see it as engaging in detail with the substance: I note the bill is highly thoughtful overall, offer a bunch of explicit encouragement, defend a bunch of their specific choices, and say I am very happy they offered this bill. It seems good and constructive to note where I think they are asking for too much? While noting that the right amount of 'any given person reacting thinks you went too far in some places' is definitely not zero.

↑ comment by Zach Stein-Perlman · 2024-04-11T02:36:25.487Z · LW(p) · GW(p)

"turning out to be right" is CAIS's strength

This is CAIP, not CAIS; CAIP doesn't really have a track record yet.

comment by delightfullyherald · 2024-04-23T19:42:24.368Z · LW(p) · GW(p)

The rulemaking authority procedures are anything but "standard issue boilerplate." They're novel and extremely unusual, like a lot of other things in the draft bill.

Section 6, for example, creates a sort of one-way ratchet for rulemaking where the agency has basically unlimited authority to make rules or promulgate definitions that make it harder to get a permit, but has to make findings to make it easier. That is not how regulation usually works.

The abbreviated notice period is also really wild.

I think the draft bill introduces a lot of interesting ideas, and that's valuable, but as actual proposed legislative language I think it's highly unrealistic and would almost certainly do more harm than good if anyone seriously tried to enact it.

For every "wow, this has never been done before in the history of federal legislation" measure in the bill--and there are at least 50 or so--there's almost certainly going to be a pretty good reason why it hasn't been done before. In my opinion, it's not wise to try and do 50 incredibly daring new things at once in a single piece of legislation, because it creates far too many failure points. It's like following a baking recipe--if you try to make one or two tweaks to the recipe that seem like good ideas, you can then observe the effect on the finished product and draw conclusions from the results you get. If you try to write your own recipe from scratch, and you've never written a recipe before, you're going to end up with a soggy mess and no real lessons will have been learned about any of the individual elements that you tried out.
