President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence
post by Tristan Williams (tristan-williams) · 2023-10-30T11:15:38.422Z · LW · GW · 39 comments
This is a link post for https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/
Comments sorted by top scores.
comment by Zach Stein-Perlman · 2023-10-30T23:53:13.488Z · LW(p) · GW(p)
This was the press release; the actual order has now been published.
One safety-relevant part:
4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:
(i) Companies developing or demonstrating an intent to develop potential dual-use foundation models to provide the Federal Government, on an ongoing basis, with information, reports, or records regarding the following:
(A) any ongoing or planned activities related to training, developing, or producing dual-use foundation models, including the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats;
(B) the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights; and
(C) the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section, and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security. Prior to the development of guidance on red-team testing standards by NIST pursuant to subsection 4.1(a)(ii) of this section, this description shall include the results of any red-team testing that the company has conducted relating to lowering the barrier to entry for the development, acquisition, and use of biological weapons by non-state actors; the discovery of software vulnerabilities and development of associated exploits; the use of software or tools to influence real or virtual events; the possibility for self-replication or propagation; and associated measures to meet safety objectives; and
(ii) Companies, individuals, or other organizations or entities that acquire, develop, or possess a potential large-scale computing cluster to report any such acquisition, development, or possession, including the existence and location of these clusters and the amount of total computing power available in each cluster.
(b) The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section. Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:
(i) any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and
(ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI.
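For concreteness, here is a minimal sketch (in Python, with made-up inputs) of how the interim 4.2(b) triggers combine; it illustrates the quoted thresholds, not how any agency would actually apply them.

```python
# Sketch of the interim reporting triggers in section 4.2(b). Hypothetical inputs only.
GENERAL_MODEL_FLOP = 1e26       # 4.2(b)(i): general training-compute threshold
BIO_MODEL_FLOP = 1e23           # 4.2(b)(i): threshold for models trained primarily on biological sequence data
CLUSTER_NETWORK_GBPS = 100      # 4.2(b)(ii): data center networking over 100 Gbit/s
CLUSTER_CAPACITY_FLOP_S = 1e20  # 4.2(b)(ii): theoretical max computing capacity for training AI

def model_must_report(training_flop: float, primarily_bio_data: bool) -> bool:
    """Interim 4.2(b)(i) test: does a trained model fall under the reporting requirement?"""
    threshold = BIO_MODEL_FLOP if primarily_bio_data else GENERAL_MODEL_FLOP
    return training_flop > threshold

def cluster_must_report(network_gbps: float, capacity_flop_per_s: float) -> bool:
    """Interim 4.2(b)(ii) test: both the networking and the capacity condition must hold."""
    return network_gbps > CLUSTER_NETWORK_GBPS and capacity_flop_per_s >= CLUSTER_CAPACITY_FLOP_S

print(model_must_report(2e25, primarily_bio_data=False))  # False: below 1e26
print(model_must_report(5e23, primarily_bio_data=True))   # True: bio threshold is 1e23
print(cluster_must_report(400, 3e20))                     # True: both conditions met
```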
↑ comment by Vladimir_Nesov · 2023-10-31T06:14:58.780Z · LW(p) · GW(p)
This requires reporting of plans for training and deployment, as well as ownership and security of weights, for any model with training compute over 10^26 FLOPs. Might be enough of a talking point with corporate leadership to stave off things like hypothetical irreversible proliferation of a GPT-4.5 scale open weight LLaMA 4.
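For intuition about where that threshold sits, a rough sketch using the standard ~6·N·D approximation for dense-transformer training compute; the parameter and token counts are purely hypothetical, not claims about any actual model.

```python
# Rough training-compute estimate for a dense transformer: FLOP ≈ 6 * params * tokens.
# Parameter and token counts below are hypothetical illustrations only.
REPORTING_THRESHOLD_FLOP = 1e26

def training_flop(params: float, tokens: float) -> float:
    return 6 * params * tokens

for params, tokens in [(1e11, 2e12), (1e12, 1e13), (2e12, 4e13)]:
    flop = training_flop(params, tokens)
    print(f"{params:.0e} params x {tokens:.0e} tokens -> {flop:.1e} FLOP, "
          f"over threshold: {flop > REPORTING_THRESHOLD_FLOP}")

# 1e+11 params x 2e+12 tokens -> 1.2e+24 FLOP, over threshold: False
# 1e+12 params x 1e+13 tokens -> 6.0e+25 FLOP, over threshold: False
# 2e+12 params x 4e+13 tokens -> 4.8e+26 FLOP, over threshold: True
```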
↑ comment by Charbel-Raphaël (charbel-raphael-segerie) · 2023-10-31T09:16:15.256Z · LW(p) · GW(p)
Is there a definition of "dual-use foundation model" anywhere in the text?
↑ comment by Tristan Williams (tristan-williams) · 2023-10-31T09:20:45.442Z · LW(p) · GW(p)
(k) The term “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by:
(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;
(ii) enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or
(iii) permitting the evasion of human control or oversight through means of deception or obfuscation.
Models meet this definition even if they are provided to end users with technical safeguards that attempt to prevent users from taking advantage of the relevant unsafe capabilities.
↑ comment by M. Y. Zuo · 2023-10-31T15:49:49.432Z · LW(p) · GW(p)
(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;
Wouldn't this include most, if not all, uncensored LLMs?
And thus any person/organization working on them?
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-10-31T20:57:47.912Z · LW(p) · GW(p)
I think the key here is 'substantially'. That's a standard of evidence which must be shown to apply to the uncensored LLM in question. I think it's unclear if current uncensored LLMs would meet this level. I do think that if GPT-4 were to be released as an open source model, and then subsequently fine-tuned to be uncensored, that it would be sufficiently capable to meet the requirement of 'substantially lowering the barrier of entry for non-experts'.
↑ comment by M. Y. Zuo · 2023-10-31T21:02:13.892Z · LW(p) · GW(p)
Do you know who would be deciding on orders like this one? Some specialized department in the USG, whatever judge that happens to hear the case, or something else?
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-10-31T21:08:45.949Z · LW(p) · GW(p)
I do not know. I can say that I'm glad they are taking these risks seriously. The low screening security on DNA synthesis orders has been making me nervous for years, ever since I learned the nitty gritty details while I was working on engineering viruses in the lab to manipulate brains of mammals for neuroscience experiments back in grad school. Allowing anonymous people to order custom synthetic genetic sequences over the internet without screening is just making it too easy to do bad things.
↑ comment by aogara (Aidan O'Gara) · 2023-10-31T21:17:04.886Z · LW(p) · GW(p)
Do you think we need to ban open source LLMs to avoid catastrophic biorisk? I'm wondering if there are less costly ways of achieving the same goal. Mandatory DNA synthesis screening is a good start. It seems that today there are no known pathogens which would cause a pandemic, and therefore the key thing to regulate is biological design tools which could help you design a new pandemic pathogen. Would these risk mitigations, combined with better pandemic defenses via AI, counter the risk posed by open source LLMs?
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-10-31T21:31:41.216Z · LW(p) · GW(p)
I think that in the long term, we can make it safe to have open source LLMs, once there are better protections in place. By long term, I mean, I would advocate for not releasing stronger open source LLMs for probably the next ten years or so. Or until a really solid monitoring system is in place, if that happens sooner. We've made a mistake by publishing too much research openly, with tiny pieces of dangerous information scattered across thousands of papers. Almost nobody has time and skill sufficient to read and understand all that, or even a significant fraction. But models can, and so a model that can put the pieces together and deliver them in a convenient summary is dangerous because the pieces are there.
↑ comment by M. Y. Zuo · 2023-11-01T01:16:34.345Z · LW(p) · GW(p)
Why do you believe it's, on the whole, a 'mistake' instead of beneficial?
I can think of numerous benefits, especially in the long term.
e.g. drawing the serious attention of decision makers who might have otherwise believed it to be a bunch of hooey, and ignored the whole topic.
e.g. discouraging certain groups from trying to 'win' in a geopolitical contest, by rushing to create a 'super'-GPT, as they now know their margin of advantage is not so large anymore.
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-11-01T05:03:08.708Z · LW(p) · GW(p)
Oh, I meant that the mistake was publishing too much information about how to create a deadly pandemic. No, I agree that the AI stuff is a tricky call with arguments to be made for both sides. I'm pretty pleased with how responsibly the top labs have been handling it, compared to how it might have gone.
Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.
↑ comment by M. Y. Zuo · 2023-11-01T21:10:22.270Z · LW(p) · GW(p)
Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.
Okay, I guess my question still applies?
For example, it might be that letting it progress without restriction has more upsides than slowing it down.
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-11-01T21:31:29.362Z · LW(p) · GW(p)
An example of something I would be strongly against anyone publishing at this point in history is an algorithmic advance which drastically lowered compute costs for an equivalent level of capabilities, or substantially improved hazardous capabilities (without tradeoffs) such as situationally-aware strategic reasoning or effective autonomous planning and action over long time scales. I think those specific capability deficits are keeping the world safe from a lot of possible bad things.
↑ comment by M. Y. Zuo · 2023-11-02T02:58:21.469Z · LW(p) · GW(p)
Yes, it's clear these are your views. Why do you believe so?
↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2023-11-02T22:08:13.730Z · LW(p) · GW(p)
I think... maybe I see the world and humanity's existence on it, as a more fragile state of affairs than other people do. I wish I could answer you more thoroughly.
https://www.lesswrong.com/posts/uPi2YppTEnzKG3nXD/nathan-helm-burger-s-shortform?commentId=qmrrKminnwh75mpn5 [LW(p) · GW(p)]
↑ comment by Tristan Williams (tristan-williams) · 2023-11-01T19:20:17.186Z · LW(p) · GW(p)
Not sure, but maybe the new AI institute they're setting up as a result
↑ comment by Vladimir_Nesov · 2023-10-31T09:55:20.216Z · LW(p) · GW(p)
The temporary technical conditions in 4.2(b), such as training-compute FLOPs, seem to apply without further qualification as to whether a model is "dual-use" in a more particular sense. So it's unclear whether the definition of "dual-use" in 3(k) is relevant to the application of the reporting requirements in 4.2(a) until updated technical conditions get defined.
comment by Vladimir_Nesov · 2023-10-30T11:22:56.829Z · LW(p) · GW(p)
Calling mundane risk "near term" sneaks in the implication that extinction risk isn't.
↑ comment by Tristan Williams (tristan-williams) · 2023-10-30T11:26:06.398Z · LW(p) · GW(p)
What alternative would you propose? I don't really like "mundane risk" but agree that an alternative would be better. For now I'll just change it to "non-existential risk actions".
comment by Natália (Natália Mendonça) · 2023-10-31T01:40:41.168Z · LW(p) · GW(p)
This made me wonder about a few things:
- How responsible is CSET for this? CSET is the most highly funded longtermist-ish org, as far as I can tell from checking openbook.fyi (I could be wrong), so I've been trying to understand them better, since I don't hear much about them on LW or the EA Forum. I suspected they were having a lot of impact "behind the scenes" (from my perspective), and maybe this is a reflection of that?
- Aaron Bergman said on Twitter that for him, "the ex ante probability of something at least this good by the US federal government relative to AI progress, from the perspective of 5 years ago was ~1%[.] Ie this seems 99th-percentile-in-2018 good to me", and many people seemed to agree. Stefan Schubert then said that "if people think the policy response is "99th-percentile-in-2018", then that suggests their models have been seriously wrong." I was wondering, do people here agree with Aaron that this EO appeared unlikely back then, and, if so, what do you think the correct takeaway from the existence of this EO is?
comment by trevor (TrevorWiesinger) · 2023-10-30T14:52:41.007Z · LW(p) · GW(p)
Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use and focusing solely on those aimed at risk.
Thanks for the work put into the distillation! But I think that the acceleration proposal to safety proposal ratio is highly relevant. British PM Rishi Sunak's speech, for example, was in large part an announcement that the UK would not regulate AI anytime soon. [LW(p) · GW(p)] I've argued previously that governments have strong short-term incentives to accelerate AI and even lie about it [LW · GW], so my prediction is that omitting the ratio of safety to pro-acceleration points here, by omitting pro-acceleration points entirely, is net harmful.
↑ comment by Tristan Williams (tristan-williams) · 2023-10-30T16:44:23.660Z · LW(p) · GW(p)
Hmm, I get the idea that people value succinctness a lot with these sorts of things, because there's so much AI information to take in now, so I'm not so sure about the net effect. But maybe I could get at your concern here by mocking up a percentage (i.e. what percentage of the proposals were risk-oriented vs progress-oriented)?
It wouldn't tell you the type of stuff the Biden administration is pushing, but it would tell you the ratio which is what you seem perhaps most concerned with.
[Edit] this is included now
comment by Eli Tyre (elityre) · 2023-11-02T03:32:50.404Z · LW(p) · GW(p)
I spent a few hours reading, and parsing out, sections 4 and 5 of the recent White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.
The following are my rough notes on each subsection in those two subsections, summarizing what I understand each to mean, and my personal thoughts.
My high level thoughts are at the bottom.
Section by section
Section 4 – Ensuring the Safety and Security of AI Technology.
4.1
- Summary:
- The secretary of commerce and NIST are going to develop guidelines and best practices for AI systems.
- In particular:
- “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
- What does this literally mean? Does this allocate funding towards research to develop these benchmarks? What will concretely happen in the world as a result of this initiative?
- It also calls for the establishment of guidelines for conducting red-teaming.
- (ii) Establish appropriate guidelines (except for AI used as a component of a national security system), including appropriate procedures and processes, to enable developers of AI, especially of dual-use foundation models, to conduct AI red-teaming tests to enable deployment of safe, secure, and trustworthy systems. These efforts shall include:
- (A) coordinating or developing guidelines related to assessing and managing the safety, security, and trustworthiness of dual-use foundation models; and
- (B) in coordination with the Secretary of Energy and the Director of the National Science Foundation (NSF), developing and helping to ensure the availability of testing environments, such as testbeds, to support the development of safe, secure, and trustworthy AI technologies, as well as to support the design, development, and deployment of associated PETs, consistent with section 9(b) of this order.
- Commentary:
- I imagine that these standards and guidelines are going to be mostly fake.
- Are there real guidelines somewhere in the world? What process leads to real guidelines?
4.2
- Summary:
- a
- Anyone who has or wants to train a foundation model needs to:
- Report their training plans and safeguards.
- Report who has access to the model weights, and the cybersecurity protecting them
- The results of red-teaming on those models, and what they did to meet the safety bars
- Anyone with a big enough computing cluster needs to report that they have it.
- b
- The Secretary of Commerce (and some associated agencies) will make (and continually update) some standards for models and computer clusters that are subject to the above reporting requirements. But for the time being,
- Any models that were trained with more than 10^26 flops
- Any models that are trained primarily on biology data and trained using greater than 10^23 flops
- Any datacenter whose machines are connected by networking of greater than 100 gigabits per second
- Any datacenter that can train an AI at 10^20 flops per second
- c
- I don’t know what this subsection is about. Something about protecting the cybersecurity of “United States Infrastructure as a Service” products.
- This includes some tracking of when foreigners want to use US AI systems in ways that might pose a cyber-security risk, using standards identical to the ones laid out above.
- d
- More stuff about IaaS, and verifying the identity of foreigners.
- Thoughts:
- Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent. (A quick numerical check of this is sketched just below these notes.)
- What do I think about this overall?
- I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
- The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
- The interest in red-teaming is promising, but again it depends on the implementation details.
- I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
- What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
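A back-of-the-envelope check of the arithmetic in the bullet above (a sketch that ignores utilization, which would stretch real wall-clock time):

```python
# How long does a cluster at a given sustained speed take to accumulate 1e26 FLOP?
TOTAL_FLOP = 1e26
SECONDS_PER_DAY = 60 * 60 * 24

for flop_per_second in (1e20, 1e19):
    days = TOTAL_FLOP / flop_per_second / SECONDS_PER_DAY
    print(f"{flop_per_second:.0e} FLOP/s -> {days:.1f} days")

# 1e+20 FLOP/s -> 11.6 days
# 1e+19 FLOP/s -> 115.7 days  (~4 months, matching the estimate above)
```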
4.3
- Summary:
- They want to protect against AI cyber-security attacks. Mostly this entails government agencies issuing reports.
- a – Some actions aimed at protecting “critical infrastructure” (whatever that means).
- Heads of major agencies need to provide an annual report to the Secretary of Homeland security on potential ways that AIs open vulnerabilities to critical infrastructure in their purview.
- “…The Secretary of the Treasury shall issue a public report on best practices for financial institutions to manage AI-specific cybersecurity risks.”
- Government orgs will incorporate some new guidelines.
- The secretary of homeland security will work with government agencies to mandate guidelines.
- Homeland security will make an advisory committee to “provide to the Secretary of Homeland Security and the Federal Government’s critical infrastructure community advice, information, or recommendations for improving security, resilience, and incident response related to AI usage in critical infrastructure.”
- b – Using AI to improve cybersecurity
- One piece that is interesting in that: “the Secretary of Defense and the Secretary of Homeland Security shall…each develop plans for, conduct, and complete an operational pilot project to identify, develop, test, evaluate, and deploy AI capabilities, such as large-language models, to aid in the discovery and remediation of vulnerabilities in critical United States Government software, systems, and networks”, and then report on their results
- Commentary
- This is mostly about issuing reports, and guidelines. I have little idea if any of that is real or if this is just an expansion of lost-purpose bureaucracy. My guess is that there will be few people in the systems that have inside views that allow them to write good guidelines for their domains of responsibility regarding AI, and mostly these reports will be epistemically conservative and defensible, with a lot of “X is possibly a risk” where the authors have large uncertainty about how large the risk is.
- Trying to use AI to improve cyber security sure is interesting. I hope that they can pull that off. It seems like one of the things that ~ needs to happen for the world to end up in a good equilibrium is for computer security to get a lot better. Otherwise anyone developing a powerful model will have the weights stolen, and there’s a really vulnerable vector of attack for not-even-very-capable AI systems. I think the best hope for that is using our AI systems to shore up computer security defense, and hope that at higher-than-human levels of competence, cyber warfare is not so offense-dominant. (As an example, someone suggested maybe using AI to write a secure successor to C, and then using AI to “swap out” the lower layers of our computing stacks with that more secure low level language.)
- Could that possibly happen in government? I generally expect that private companies would be way more competent at this kind of technical research, but maybe the NSA is a notable and important exception? If they’re able to stay ten years ahead in cryptography, maybe they can stay 10 years ahead in AI cyberdefense.
- This raises the question, what advantage allows the NSA to stay 10 years ahead? I assume that it is a combination of being able to recruit top talent, and that there are things that they are allowed to do that would be illegal for anyone else. But I don’t actually know if that’s true.
4.4 – Reducing AI-mediated chemical, biological, radiological, and nuclear (CBRN) threats, focusing on biological weapons in particular.
- Summary:
- a
- The Secretary of Homeland Security (with help from other executive departments) will “evaluate” the potential of AI to both increase and to defend against these threats. This entails talking with experts and then submitting a report to the president.
- In particular, it orders the Secretary of Defense (with the help of some other governmental agencies) to conduct a study that “assesses the ways in which AI can increase biosecurity risks, including risks from generative AI models trained on biological data, and makes recommendations on how to mitigate these risks”, evaluates the risks associated with the biology datasets used to train such systems, and assesses ways to use AI to reduce biosecurity risks.
- b – Specifically to reduce risks from synthetic DNA and RNA.
- The Office of Science and Technology Policy (with the help of other executive departments) is going to develop a “framework” for synthetic DNA/RNA companies to “implement procurement and screening mechanisms”. This entails developing “criteria and mechanisms” for identifying dangerous nucleotide sequences, and establishing mechanisms for doing at-scale screening of synthetic nucleotides.
- Once such a framework is in place, all (government?) funding agencies that fund life science research will make compliance with that framework a condition of funding.
- All of this, once set up, needs to be evaluated and stress tested, and then a report sent to the relevant agencies.
- Commentary:
- The part about setting up a framework for mandatory screening of nucleotide sequences seems non-fake. Or at least it is doing more than commissioning assessments and reports.
- And it seems like a great idea to me! Even aside from AI concerns, my understanding is that the manufacture of synthetic DNA is one major vector of biorisk. If you can effectively identify dangerous nucleotide sequences (and that is the part that seems most suspicious to me), this is one of the few obvious places to enforce strong legal requirements. These are not (yet) legal requirements, but making this a condition of funding seems like a great step.
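For concreteness, a toy sketch of the kind of sequence screening being described; the k-mer window size and the sequences-of-concern list here are hypothetical, and real screening frameworks are far more involved (homology search, curated databases, customer verification, human review).

```python
# Purely illustrative: flag an order if any window of the ordered sequence matches a
# k-mer drawn from a (hypothetical) sequences-of-concern list. Real frameworks do much more.

K = 30  # window size; the real choice of k is a technical/policy decision

def kmers(seq: str, k: int = K) -> set:
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(sequences_of_concern: list) -> set:
    index = set()
    for s in sequences_of_concern:
        index |= kmers(s)
    return index

def order_is_flagged(order_seq: str, index: set) -> bool:
    return not kmers(order_seq).isdisjoint(index)

# Hypothetical usage:
index = build_index(["ATG" + "ACGT" * 20])            # stand-in for a curated database
print(order_is_flagged("TTTT" + "ACGT" * 20, index))  # True: shares a 30-base window
print(order_is_flagged("ATCG" * 10, index))           # False: no shared 30-base window
```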
4.5
- Summary
- Aims to increase the general ability for identifying AI generated content, and mark all Federal AI generated content as such.
- a
- The Secretary of Commerce will produce a report on the current and likely-future methods for authenticating non-AI content, identifying AI content, watermarking AI content, and preventing AI systems from “producing child sexual abuse material or producing non-consensual intimate imagery of real individuals (to include intimate digital depictions of the body or body parts of an identifiable individual)”
- b
- Using that report, the Secretary of Commerce will develop guidelines for detecting and authenticating AI content.
- c
- Those guidelines will be issued to relevant federal agencies
- d
- Possibly those guidelines will be folded into the Federal Acquisition Regulation (whatever that is)
- Commentary
- Seems generally good to be able to distinguish between AI generated material and non-AI generated material. I’m not sure if this process will turn up anything real that meaningfully impacts anyone’s experience of communications from the government.
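To make “authenticating” content slightly more concrete, a minimal sketch of one flavor of it: a keyed signature over published content. This is only an illustration under assumed choices (HMAC with a shared secret); a real provenance scheme would use public-key signatures and standardized metadata, and nothing here reflects what the EO or any agency guidance actually specifies.

```python
# Minimal sketch of content authentication via a keyed signature (illustration only).
import hashlib
import hmac

SECRET_KEY = b"hypothetical-signing-key"  # a real scheme would use public-key signatures

def sign_content(content: bytes) -> str:
    return hmac.new(SECRET_KEY, content, hashlib.sha256).hexdigest()

def verify_content(content: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign_content(content), signature)

announcement = b"Official statement text..."
tag = sign_content(announcement)
print(verify_content(announcement, tag))                 # True
print(verify_content(announcement + b" (edited)", tag))  # False: content was altered
```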
4.6
- Summary
- The Secretary of Commerce is responsible for running a “consultation process on potential risks, benefits, other implications” of open source foundation models, and then for submitting a report to the president on the results.
- Commentary
- More assessments and reports.
- This does tell me that someone in the executive department has gotten the memo that open source models mean that it is easy to remove the safeguards that companies try to put in them.
4.7
- Summary
- Some stuff about federal data that might be used to train AI Systems. It seems like they want to restrict the data that might enable CBRN weapons or cyberattacks, but otherwise make the data public?
- Commentary
- I think I don’t care very much about this?
4.8
- Summary
- This orders a National Security Memorandum on AI to be submitted to the president. This memorandum is supposed to “provide guidance to the Department of Defense, other relevant agencies”
- Commentary:
- I don’t think that I care about this?
Section 5 – Promoting Innovation and Competition.
5.1 – Attracting AI Talent to the United States.
- Summary
- This looks like a bunch of stuff to make it easier for foreign workers with AI relevant expertise to get visas, and to otherwise make it easy for them to come to, live in, work in, and stay in, the US.
- Commentary
- I don’t know the sign of this.
- Do we want AI talent to be concentrated in one country?
- On the one hand that seems like it accelerates timelines some, especially if there are 99.9th-percentile AI researchers who wouldn’t otherwise be able to get visas, but who can now work at OpenAI. (It would surprise me if this is the case? Those people should all be able to get O1 visas, right?)
- On the other hand, the more AI talent is concentrated in one country, the smaller the jurisdiction a regulatory regime needs to cover in order to slow down AI. If enough of the AI talent is in the US, regulations that slow down AI development only in the US still have a substantial impact, at least in the short term, before that talent moves, but maybe also in the long term, if researchers care more about continuing to live in the US than they do about making cutting-edge AI progress.
5.2
- Summary
- a –
- The director of the NSF will do a bunch of things to spur AI research.
- …”launch a pilot program implementing the National AI Research Resource (NAIRR)”. This is evidently something that is intended to boost AI research, but I’m not clear on what it is or what it does.
- …”fund and launch at least one NSF Regional Innovation Engine that prioritizes AI-related work, such as AI-related research, societal, or workforce needs.”
- …”establish at least four new National AI Research Institutes, in addition to the 25 currently funded as of the date of this order.”
- b –
- The Secretary of Energy will make a pilot program for training AI scientists.
- c –
- Under Secretary of Commerce for Intellectual Property and Director of the United States Patent and Trademark Office will sort out how generative AI should impact patents, and issue guidance. There will be some similar stuff for copyright.
- d –
- Secretary of Homeland Security “shall develop a training, analysis, and evaluation program to mitigate AI-related IP risks”
- e –
- The HHS will prioritize grant-making to AI initiatives.
- f –
- Something for the veterans.
- g –
- Something for climate change
- Commentary
- Again. I don’t know how fake this is. My guess is not that fake? There will be a bunch of funding for AI stuff, from the public sector, in the next two years.
- Most of this seems like random political stuff.
5.3 – Promoting Competition.
- Summary
- a –
- The heads of various departments are supposed to promote competition in AI, including in the inputs to AI (NVIDIA)?
- b
- The Secretary of Commerce is going to incentivize competition in the semiconductor industry, via a bunch of methods including
- “implementing a flexible membership structure for the National Semiconductor Technology Center that attracts all parts of the semiconductor and microelectronics ecosystem”
- mentorship programs
- Increasing the resources available to startups (including datasets)
- Increasing the funding for semiconductor R&D
- c – The Administrator of the Small Business Administration will support small businesses innovating and commercializing AI
- d
- Commentary
- This is a lot of stuff. I don’t know that any of it will really impact how many major players there are at the frontier of AI in 2 years.
- My guess is probably not much. I don’t think the government knows how to create NVIDIAs or OpenAIs.
- What the government can do is break up monopolies, but they’re not doing that here.
My high level takeaways
Mostly, this executive order doesn’t seem to push for much object-level action. It mainly orders a bunch of assessments to be done, and reports on those assessments to be written and then passed up to the president.
My best guess is that this is basically an improvement?
I expect something like the following to happen:
- The relevant department heads talk with a bunch of experts.
- They write up very epistemically conservative reports in which they say “we’re pretty sure that our current models in early 2024 can’t help with making bioweapons, but we don’t know (and can’t really know) what capabilities future systems will have, and therefore can’t really know what risk they’ll pose.”
- The sitting president will then be weighing those unknown levels of national security risks against obvious economic gains and competition with China.
In general, this executive order means that the Executive branch is paying attention. That seems, for now, pretty good.
(Though I do remember in 2015 how excited and optimistic people in the rationality community were about Elon Musk, “paying attention”, and that ended with him founding OpenAI, what many of those folks consider to be the worst thing that anyone had ever done to date. FTX looked like a huge success worthy of pride, until it turned out that it was a damaging and unethical fraud. I’ve become much more circumspect about which things are wins, especially wins of the form “powerful people are paying attention”.)
↑ comment by Ben Pace (Benito) · 2023-11-02T04:00:09.783Z · LW(p) · GW(p)
My guess is that this comment would be much more readable with the central chunk of it in a google doc, or failing that a few levels fewer of indented bullets.
e.g. Take this section.
- Thoughts:
- Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second, would allow you to train a 10^26 model in 115 days, e.g. about 4 months. Those standards don’t seem consistent.
- What do I think about this overall?
- I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
- The thresholds match those that I’ve seen in strategy documents of people that I respect, so that that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
- The interest in red-teaming is promising, but again it depends on the implementation details.
- I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
- What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
I find it much more readable as the following prose rather than 5 levels of bullets. Less metacognition tracking the depth.
Thoughts
Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second, would allow you to train a 10^26 model in 115 days, e.g. about 4 months. Those standards don’t seem consistent.
What do I think about this overall?
I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
The thresholds match those that I’ve seen in strategy documents of people that I respect, so that that seems promising. My understanding is that 10^26 FLOPS is about 1-2 orders of magnitude larger than our current biggest models.
The interest in red-teaming is promising, but again it depends on the implementation details.
I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
↑ comment by Eli Tyre (elityre) · 2023-11-02T14:04:35.956Z · LW(p) · GW(p)
Possibly. I wrote this as personal notes, originally, in full nested list format. Then I spent 20 minutes removing some of the nested-list-ness in wordpress, which was very frustrating. I would definitely have organized it better if wordpress was less frustrating.
I did make a google doc format. Maybe the main lesson is that I should have edited it there.
comment by Steven Zhang (steven-zhang) · 2023-10-31T18:11:05.595Z · LW(p) · GW(p)
The actual text of the order is 70 pages long and very hard to navigate. At the request of some DC friends, I made a tool for navigating the text that adds:
- Sidebar for navigation
- Tooltips for definitions defined in section 3
- Deep linking to any section/sentence of the text
Hope it's useful for some of you here! https://www.aijobstracker.com/ai-executive-order
comment by GeneSmith · 2023-10-31T05:53:23.096Z · LW(p) · GW(p)
My overall takeaway is that all these things are generally good, though insufficient to actually address many X-risk-related concerns.
All of the standards assume we can wait until after a very powerful system is trained to evaluate it, which by my current understanding would not address risks of deception.
Frankly, this is more than I would have expected the White House to do, and thus I think it's a positive update on likely future actions.
↑ comment by Tristan Williams (tristan-williams) · 2023-10-31T09:30:41.689Z · LW(p) · GW(p)
Yeah, I think the reference class for me here is other things the executive branch might have done, which leads me to "wow, this was way more than I expected".
Worth noting is that they at least are trying to address deception by including it in the full text of the order. The types of models they hope to regulate here include those that permit "the evasion of human control or oversight through means of deception or obfuscation". The director of the OMB also has to come up with tests and safeguards for "discriminatory, misleading, inflammatory, unsafe, or deceptive outputs".
comment by followthesilence · 2023-11-01T06:04:10.610Z · LW(p) · GW(p)
this is crazy, perhaps the most sweeping action taken by government on AI yet.
Seems like too much consulting jargon and "we know it when we see it" vibes, with few concrete bright-lines. Maybe a lot hinges on enforcement of the dual-use foundation model policy... any chance developers can game the system to avoid qualifying as a dual-use model? Watermarking synthetic content does appear on its face a widely-applicable and helpful requirement.
↑ comment by Garrett Baker (D0TheMath) · 2023-11-01T08:06:22.247Z · LW(p) · GW(p)
My general impression is for these sorts of things, vagueness is generally positive, since it gives the executive and individual actors who want to make a name for themselves more leeway, and makes companies less able to wriggle out on technicalities. Contrast with vague RSPs, for which the value of vagueness is in the opposite direction.
But of course this is an executive order, so if enough companies aren’t subject to it based on technicalities, it could easily be changed and re-issued. I don’t know how common this is though.
↑ comment by Tristan Williams (tristan-williams) · 2023-11-01T19:13:55.769Z · LW(p) · GW(p)
Garrett responded to the main thrust well, but I will say that watermarking synthetic media seems fairly good as a next step for combating misinformation from AI imo. It's certainly widely applicable (not really even sure what the thrust of this distinction was) because it is meant to apply to nearly all synthetic content. Why exactly do you think it won't be helpful?
↑ comment by followthesilence · 2023-11-01T19:29:28.288Z · LW(p) · GW(p)
I agree, I was trying to highlight it as one of the most specific, useful policies from the EO. Understand the confusion given my comment was skeptical overall.
comment by JessRiedel · 2023-10-31T21:29:23.536Z · LW(p) · GW(p)
UK’s proposal for a joint safety institute seems maybe more notable:
Sunak will use the second day of Britain's upcoming two-day AI summit to gather “like-minded countries” and executives from the leading AI companies to set out a roadmap for an AI Safety Institute, according to five people familiar with the government’s plans.
The body would assist governments in evaluating national security risks associated with frontier models, which are the most advanced forms of the technology.
The idea is that the institute could emerge from what is now the United Kingdom’s government’s Frontier AI Taskforce, which is currently in talks with major AI companies Anthropic, DeepMind and OpenAI to gain access to their models. An Anthropic spokesperson said the company is still working out the details of access, but that it is “in discussions about providing API access.”
https://www.politico.eu/article/uk-pitch-ai-safety-institute-rishi-sunak/
comment by Shankar Sivarajan (shankar-sivarajan) · 2023-10-30T16:42:33.697Z · LW(p) · GW(p)
The good news is that this is an Executive Order, so it can be repealed easily.
↑ comment by momom2 (amaury-lorin) · 2023-10-31T07:08:01.880Z · LW(p) · GW(p)
1- I didn't know Executive Order could be repealed easily. Could you please develop?
2- Why is it good news? To me, this looks like a clear improvement on the previous status of regulations.
↑ comment by Colin McGlynn (colin-mcglynn) · 2023-10-31T15:33:44.091Z · LW(p) · GW(p)
Executive Orders aren't legislation. They are instructions that the white house makes to executive branch agencies. So the president can issue new executive orders that change or reverse older executive orders made by themselves or past presidents.
comment by Shankar Sivarajan (shankar-sivarajan) · 2023-10-30T16:45:22.440Z · LW(p) · GW(p)
It's amusing the lengths they'll go to in order to "authenticate official content" instead of just posting announcements to their own website. Of course, they'd have to stop posting to 𝕏, Facebook, TikTok, etc. but it looks like they're as addicted to social media as everyone else.