Open-Source AI: A Regulatory Review

post by Elliot Mckernon (elliot), Deric Cheng (deric-cheng) · 2024-04-29T10:10:55.779Z · LW · GW · 0 comments

Contents

  What are open-source models, and what are their effects on AI safety?
  Current Regulatory Policies
    The US
    The EU
    China
  Convergence’s Analysis

Cross-posted on the EA Forum [EA · GW]. This article is part of a series of ~10 posts comprising a 2024 State of the AI Regulatory Landscape Review, conducted by the Governance Recommendations Research Program at Convergence Analysis. Each post will cover a specific domain of AI governance, such as incident reporting [EA · GW], safety evals [EA · GW], model registries [LW · GW], and more. We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.

This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll publish individual posts on our website and release a comprehensive report at the end of this series.

What are open-source models, and what are their effects on AI safety?

Some software developers choose to open-source their software; they freely share the underlying source code and allow anyone to use, modify, and deploy their work. This can encourage friendly collaboration and community-building, and has produced many popular pieces of software, including operating systems like Linux, programming languages and platforms like Python and Git, and many more.

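Similarly, AI developers are open-sourcing their models and algorithms, though the details can vary. Generally, open-sourcing an AI model involves releasing some combination of its model weights, training code, methodology, training datasets, and architecture details.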

For example, Meta released the model weights of their LLM, Llama 2, but not their training code, methodology, original datasets, or model architecture details. In their excellent article on Openness In Language Models, Prompt Engineering labels this an example of an “open weight” model. Such an approach allows external parties to use the model for inference and fine-tuning, but doesn’t allow them to meaningfully improve or analyze the underlying model. Prompt Engineering points out a drawback of this approach:

So, open weights allows model use but not full transparency, while open source enables model understanding and customization but requires substantially more work to release [...] If only open weights are available, developers may utilize state-of-the-art models but lack the ability to meaningfully evaluate biases, limitations, and societal impacts. Misalignment between a model and real-world needs can be difficult to identify.

Further, while we were writing this article in April 2024, Meta released Llama 3 with the same open-weights policy, claiming that it is “the most capable openly available LLM to date”. This has brought fresh attention to the trade-offs of open-sourcing, since the more powerful a model is, the greater the potential harms of sharing it freely. Even those who are fond of sharing wouldn’t want everyone in the world to have easy access to the instructions for a 3D-printable rocket launcher, and freely sharing powerful AI could present similar risks; such AI could be used to generate instructions for assembling homemade bombs or even designing deadly pathogens [LW · GW]. Information that is dangerous to distribute widely in this way is termed an information hazard.

To prevent these types of hazards, AI models like ChatGPT have safeguards built in during the fine-tuning phase towards the end of their development, using techniques such as Reinforcement Learning from Human Feedback (RLHF). This fine-tuning can discourage AI models from producing harmful or undesired content.

Some people find ways to get around this fine-tuning, but experts have pointed out that malicious actors could bypass the safeguards entirely. ChatGPT and Claude, the two most prominent LLMs, are closed-source (and their model weights are closely guarded secrets), but open-source models can be used and deployed without fine-tuning safeguards. This was demonstrated in practice with Llama 2, a partly open-source LLM developed by Meta, in Palisade Research’s paper BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B. To quote an interview with one of its authors, Jeffrey Ladish:

You can train away the harmlessness. You don’t even need that many examples. You can use a few hundred, and you get a model that continues to maintain its helpfulness capabilities but is willing to do harmful things. It cost us around $200 to train even the biggest model for this. Which is to say, with currently known techniques, if you release the model weights there is no way to keep people from accessing the full dangerous capabilities of your model with a little fine tuning.

Therefore, these models and their underlying software may themselves be information hazards, and many argue that open-sourcing advanced AI should be legally prohibited, or at least prohibited until developers can guarantee the safety of their software. In “Will releasing the weights of future large language models grant widespread access to pandemic agents?”, the authors conclude that

Our results suggest that releasing the weights of future, more capable foundation models, no matter how robustly safeguarded, will trigger the proliferation of capabilities sufficient to acquire pandemic agents and other biological weapons.

Others counter that openness is necessary to stop the power and wealth generated by powerful AI falling into the hands of a few, and that prohibitions won’t be effective safeguards, as argued in GitHub’s Supporting Open Source and Open Science in the EU AI Act and Mozilla’s Joint Statement on AI Safety and Openness, which was signed by over 1,800 people and states: 

Yes, openly available models come with risks and vulnerabilities — AI models can be abused by malicious actors or deployed by ill-equipped developers. However, we have seen time and time again that the same holds true for proprietary technologies — and that increasing public access and scrutiny makes technology safer, not more dangerous. The idea that tight and proprietary control of foundational AI models is the only path to protecting us from society-scale harm is naive at best, dangerous at worst.

Finally, some argue that open-sourcing or not is a false dichotomy [LW · GW], putting forward intermediate policies such as structured access:

Instead of openly disseminating AI systems, developers facilitate controlled, arm's length interactions with their AI systems. The aim is to prevent dangerous AI capabilities from being widely accessible, whilst preserving access to AI capabilities that can be used safely.

There are more perspectives and arguments than we can concisely include here, and you might be interested in the following discussions:

Current Regulatory Policies

The US

The US AI Bill of Rights doesn’t discuss open-source models, but the Executive Order on AI does initiate an investigation into the risk-reward tradeoff of open-sourcing. Section 4.6 calls for soliciting input on foundation models with “widely available model weights”, specifically targeting open-source models. It summarizes the risk-reward tradeoff of publicly sharing model weights, which offers “substantial benefits to innovation, but also substantial security risks, such as the removal of safeguards within the model”. In particular, Section 4.6 calls for the Secretary of Commerce to:

The EU

The EU AI Act states that open-sourcing can increase innovation and economic growth. The act therefore exempts open-source models and developers from some restrictions and responsibilities placed on other models and developers. Note though that these exemptions do not apply to foundation models (meaning generative AI like ChatGPT), or if the open-source software is monetized or is a component in high-risk software. 

Notably, the treatment of open-source models was contentious during the development of the EU AI Act.

China

There is no mention of open-source models in China’s regulations between 2019 and 2023; open-source models are neither exempt from any aspects of the legislation, nor under any additional restrictions or responsibilities. 

Convergence’s Analysis

The boundaries and terminology around open-sourcing are often underspecified. 

Open-sourcing models improves transparency and accountability, but also gives the public broader access to dangerous information and reduces the efficacy of legislation. No one agrees on the right balance.

Developers of open-source models are not currently under any additional legal obligations compared to developers of private or commercial models.

The EU legislation treats open-source models favorably. 
