Compute Governance Literature Review
Post by sijarvis · 2024-06-25
This white paper reviews the use of compute governance to mitigate risks from AI: the reasoning behind the approach, the ways in which it has been applied so far, and the ways in which it could be developed further. It aims to provide an introduction for readers who accept the need for AI safety and would like to understand this governance approach in more depth. It was prepared as a final project for the BlueDot Impact AI Alignment Course.
Executive Summary
Compute governance is a method for monitoring and regulating AI systems by tracking the hardware required to develop frontier models. At its simplest, it can be implemented by recording the owners and locations of physical chips; it can also be extended to require on-chip functionality that enables reporting or other security features. It can support the enforcement of existing regulations that require companies to report AI system development above certain thresholds. There are many reasons why compute governance is seen as a promising approach; however, it does not mitigate all possible risks from AI and is likely to be most effective as part of a package of governance interventions.
Figure 1 - AI Generated Image of a chip secured with a lock, Copilot Designer
What is Compute?
The training process for an AI model involves calculating the results of a vast number of mathematical operations. Depending on the context, compute can mean the total number of operations used to train a model (measured in floating-point operations, or FLOP), the number of such operations performed per second (FLOPS, floating-point operations per second), or, more generally, the resources and hardware used to perform the calculations.
Large Language Models (LLMs) and multimodal models require immense computational power to train on massive source datasets. To reach the level of compute required, powerful Graphics Processing Units (GPUs), Tensor Processing Units (TPUs) or other specialised AI hardware chips are used. It is estimated that GPT-4 required 2.1×10^25 FLOP for training (Sevilla et al., 2022), whereas cutting-edge AI chips have performance levels measured in teraFLOPS, or 10^12 FLOPS (H100 Tensor Core GPU | NVIDIA, n.d.), meaning that tens of thousands of chips are required for a training run to be completed in a reasonable amount of time.
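To make the scale concrete, the back-of-envelope calculation below estimates how many accelerator chips a GPT-4-scale training run would need. The chip performance, utilisation rate and run length are illustrative assumptions rather than figures from the cited sources.

```python
# Rough estimate of the number of AI chips needed for a frontier training run.
# All figures below are illustrative assumptions.

TOTAL_TRAINING_FLOP = 2.1e25   # approximate GPT-4 training compute (Sevilla et al., 2022)
CHIP_PEAK_FLOPS = 3e14         # ~300 teraFLOPS, roughly an A100-class accelerator (assumed)
UTILISATION = 0.4              # assumed fraction of peak performance achieved in practice
TRAINING_DAYS = 90             # assumed length of the training run

seconds = TRAINING_DAYS * 24 * 60 * 60
flop_per_chip = CHIP_PEAK_FLOPS * UTILISATION * seconds
chips_needed = TOTAL_TRAINING_FLOP / flop_per_chip

print(f"Estimated chips required: {chips_needed:,.0f}")
# With these assumptions the result is on the order of tens of thousands of chips.
```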
The upshot is that compute is intrinsically linked to access to such specialised hardware. It is estimated that Microsoft and Meta each bought 150,000 chips in 2023 (The GPU Haves and Have-Nots - The Verge, 2023), while other organisations were only able to purchase significantly fewer.
The factor thought to be most strongly driving progress in AI performance is the increasing amount of compute available for model training, and research shows that the amount of compute used for training runs is doubling every six months (Sevilla et al., 2022). This underscores how critical access to compute is for frontier AI models.
Figure 2 - NVIDIA H100 GPU on new SXM5 Module, nvidia.com
Compute Governance
At its highest level, the approach of compute governance is tracking and monitoring the chips used to develop AI models. Once their locations and owners are known, governments or regulators can use this information to identify parties that are capable of developing potentially dangerous AI systems and audit them to check compliance. In addition, access to these chips can be controlled, either to restrict or subsidise access in accordance with national interests.
Governance approaches can be developed further by implementing guard-rail functionality in the chips themselves: allowing them to report what they have been used for, enforcing firmware updates, requiring them to “phone home” to a licensing server, and other possible safety measures.
Alongside compute, the main inputs to the development of an AI system are data and algorithms (Sastry et al., 2024), which are easy to share and hard to unshare. Compute, by contrast, is physical, which makes it much easier to regulate. This physical nature gives it a number of other properties that make it particularly suitable for regulation. Compute is:
- Detectable - Concentrations of chips at the level needed for training cutting-edge models are housed in data centres which have significant power and often water requirements for running and cooling chips. If the owner of such a facility is not able to provide an explanation of what the facility is being used for, this is cause for suspicion in a similar way to the monitoring of nuclear sites (Baker, 2023).
- Excludable - It is possible to prevent access by certain parties.
- Quantifiable - It is possible to count chips, report on numbers, establish thresholds and verify reports.
A further aspect of the regulatability of compute is that the supply chain for cutting-edge chips is remarkably concentrated. Very few parties are involved in production, and the competencies needed are significant and hard to develop, meaning there are very few chip manufacturers, possibly only two or three in the US (Cheng, 2024). Applying regulations to a small number of parties reduces complexity and makes the regulations more likely to be effective. A similar effect is seen with cloud compute providers, the companies from which most major AI developers rent access to compute rather than purchasing their own hardware. There are relatively few providers with access to the volume of chips required, further limiting the number of companies involved in applying the regulations.
Benefits of Compute Governance
If regulations and procedures are put in place to track the location, ownership and users of chips used for AI development, it will be possible to identify which organisations are capable of developing the highest-risk models. In addition to the bottleneck in the supply chain, there are thought to be few companies capable of training frontier models, and most of them are based in just a few countries. As a result, compute governance is a highly tractable intervention: policies only need to be adopted in a very small number of countries for all of the major players in the manufacturing supply chain to become subject to the regulations, which then affect every downstream party.
Tracking of compute also allows its allocation to be controlled. Access can be restricted so that only organisations with suitable safety policies in place are able to use it, or made available at subsidised rates to organisations deemed deserving. It is also possible to enforce rules on who is able to purchase chips, particularly based on location, e.g. restricting the shipment of chips to certain countries.
Existing Approaches
Whilst the governance of compute is still at a fairly early stage, some steps have already been taken to address risks from AI through this approach. In the US, restrictions have been placed on chip manufacturers preventing them from exporting certain types of chips to China (Bureau of Industry and Security, 2022, 2023). These restrictions cover the export of the chips themselves and also of the technology needed to produce the semiconductors that are essential to the manufacture of such chips. Should an organisation breach these restrictions, significant financial penalties can be imposed. In addition, pre-existing US export controls apply to any commodity, software or technology with a high probability of being used to make weapons (Weinstein & Wolf, 2023).
Recent laws in the US and EU have introduced thresholds for reporting when significant amounts of compute are used to train AI systems (The White House, 2023; European Parliament, 2024). Both pieces of legislation take a risk-based approach to reporting, on the assumption that higher levels of compute lead to higher risks. At the time of writing, no models have met the US threshold of 10^26 FLOP, and there may be one that has met the slightly lower EU threshold of 10^25 FLOP; however, frontier models are likely to be using this level of compute soon (Sevilla et al., 2022). Both jurisdictions set a lower reporting threshold for narrow models trained on biological sequence data, due to the potential risk of a highly specialised but less powerful model being able to assist with the creation or discovery of harmful biological agents.
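As a rough illustration of how these thresholds might be applied, the sketch below estimates the total compute of a hypothetical training run and compares it to the US and EU figures. The thresholds follow the cited legislation; the fleet size, chip performance, utilisation and duration are assumptions chosen only to make the example concrete.

```python
# Illustrative threshold check for compute-based reporting rules.
# Thresholds follow the cited US executive order and EU AI Act; all other
# numbers are hypothetical.

US_THRESHOLD_FLOP = 1e26
EU_THRESHOLD_FLOP = 1e25

def total_training_flop(num_chips: int, peak_flops: float,
                        utilisation: float, days: float) -> float:
    """Total training compute: chips x effective FLOP/s x seconds of training."""
    return num_chips * peak_flops * utilisation * days * 86_400

run_flop = total_training_flop(num_chips=25_000, peak_flops=3e14,
                               utilisation=0.4, days=100)

print(f"Estimated training compute: {run_flop:.2e} FLOP")
print("Reportable under the EU threshold:", run_flop >= EU_THRESHOLD_FLOP)
print("Reportable under the US threshold:", run_flop >= US_THRESHOLD_FLOP)
```

With these assumed numbers the run comes to roughly 2.6×10^25 FLOP, above the EU threshold but below the US one, mirroring the current situation described above.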
In the UK, the government has issued AI strategy papers highlighting the importance of access to compute in AI development and of ensuring its fair allocation, but there are no reporting regulations comparable to those in the US or EU (Department for Science, Innovation & Technology, 2023; Donelan, 2023; Hunt & Donelan, 2023). At least one think tank has criticised the UK government for recognising the importance of compute while stopping short of putting any compute governance policies in place (Whittlestone et al., 2023). This is particularly pertinent given that the UK has sought to be seen as a leader in AI development and safety. In addition to its reporting rules, the US has launched a pilot scheme to provide access to compute for purposes it deems beneficial, such as research and education (NAIRR Pilot - Home, n.d.).
Potential Approaches
The most fundamental approach to governing compute relies on tracking physical chips, their owners and their locations (Shavit, 2023). In the US, a coalition of senators led by Mitt Romney has proposed an approach in which chip manufacturers and vendors are obliged to report to an oversight body that tracks chip ownership (Romney et al., 2024). The proposal also includes a due diligence or “Know Your Customer” (KYC) process to vet potential customers, as well as rules requiring chip owners to report on model development and deployment.
A common analogy for tracking compute hardware is the tracking of nuclear material (Baker, 2023); however, similar regulatory regimes already exist for common items such as cars, firearms and property, because it is beneficial for society to track them. A tracking system for chips could start as early as the wafer fabrication facility, maintaining a chain of custody until the finished chip is completed, at which point it would be added to a chip register and tracked thereafter. Jones (2024) provides a well-considered framework for how a reporting body could function in practice and what infrastructure would be required.
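The sketch below illustrates one way a chip register entry and its chain of custody could be represented. The field names and structure are hypothetical and are not drawn from Jones (2024) or any other specific proposal.

```python
# Hypothetical structure for a chip register entry with a chain of custody.
from dataclasses import dataclass, field

@dataclass
class CustodyEvent:
    timestamp: str   # date of the transfer, e.g. ISO 8601
    holder: str      # fab, packager, vendor or end owner
    location: str    # country or facility identifier

@dataclass
class ChipRecord:
    chip_id: str      # burned-in hardware identifier
    model: str        # accelerator product name
    fab_facility: str # wafer fabrication site where tracking began
    custody_chain: list[CustodyEvent] = field(default_factory=list)

    def transfer(self, timestamp: str, holder: str, location: str) -> None:
        """Record a change of custody; the register keeps the full history."""
        self.custody_chain.append(CustodyEvent(timestamp, holder, location))

# Example usage with made-up parties and dates.
record = ChipRecord(chip_id="CHIP-000123", model="H100", fab_facility="Fab-A")
record.transfer("2024-01-15", "Packaging Co", "Taiwan")
record.transfer("2024-03-02", "Cloud Provider X", "USA")
```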
If chip tracking and control were taken to the extreme, all compute could be controlled by a central governing body with access provided in line with an agreed policy; however, this poses significant practical challenges and would likely face strong opposition, so it is not considered further here. Once a system is in place for tracking chips and their owners, other aspects of compute governance become possible. A regulator could allocate access so as to restrict some parties and benefit others based on their usage or approach to safety. Owning chips could require a licence, with licences granted on condition that the licensee implements downstream safety requirements. Access could also be restricted in order to slow progress while other AI safety approaches are developed or allowed to catch up.
An area of compute governance that has received significant attention is requiring chip manufacturers to implement hardware functionality that supports reporting and inspection, or more advanced safety guardrails. Shavit (2023) proposes a method by which individual chips can securely store information about how they have been used, which an auditor can later retrieve. Such a capability would provide an enforcement mechanism to make the reporting rules in the US and EU more robust. There is already an incentive for chip manufacturers to implement some security functions to help guard against the risk of model theft, particularly for cloud computing providers seeking to protect their customers’ interests.
The ability to shut chips down remotely has also been proposed; however, this is unlikely to be in the interest of either the manufacturer or the customer. Even if it were implemented, the potential damage caused by an AI system is unlikely to be understood at the compute-intensive training stage, by which point a shutdown could not be applied retroactively. Remote shutdown is also vulnerable to cyber attacks that could trigger it prematurely or block the trigger signal. It should be assumed that some parties, whether manufacturers or chip owners, will attempt to circumvent or minimise any security features and reporting that may have a negative impact on them. Nvidia has already created a slightly lower-powered chip in an attempt to avoid export restrictions (Peters, 2023), and an extreme example of corporate cheating is the Volkswagen emissions scandal (CPA, 2019). Because those holding the chips have unrestricted physical access to them, secure hardware approaches are required, including burned-in identification numbers, secure firmware and encryption keys to sign reporting information cryptographically (Shavit, 2023). Some proposals go further still, requiring an online dedicated security module that can ensure chips are running the latest firmware and are only used by licensed companies, removing the need for physical inspections (Aarne et al., 2024).
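As an illustration of the signed-reporting idea, the sketch below shows a chip signing a usage report with a key that only it holds, so that an auditor can later verify the report against a public key held on the chip register. This is a generic public-key signature example, not the specific scheme proposed by Shavit (2023) or Aarne et al. (2024); it uses the third-party Python `cryptography` package, and the report contents are hypothetical.

```python
# Minimal sketch of cryptographically signed usage reporting.
import json
from cryptography.hazmat.primitives.asymmetric import ed25519

# In practice the private key would be fused into the chip at manufacture and
# never leave it; the matching public key would be stored on the chip register.
chip_key = ed25519.Ed25519PrivateKey.generate()
register_public_key = chip_key.public_key()

# Hypothetical usage log accumulated by the chip between audits.
report = json.dumps({
    "chip_id": "CHIP-000123",
    "period": "2024-Q2",
    "flop_used": 3.1e21,
}, sort_keys=True).encode()

signature = chip_key.sign(report)

# The auditor verifies the report against the registered public key;
# verify() raises InvalidSignature if the report has been altered.
register_public_key.verify(signature, report)
print("Usage report signature verified")
```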
For a compute governance regulation to be successful, it is beneficial to have the support of all of the major stakeholders. The general public is broadly supportive of steps to improve AI safety (Davies & Birtwistle, 2023; Deltapoll, 2023), which provides the backing for governments to take action. Compute governance allows governments to exercise more fine-grained control over the development of AI systems and to improve national security, so it is likely that governments will want to implement such a regime (Cheng, 2024). To offset the implementation costs, there need to be benefits for chip manufacturers: if the alternative to regulation is a blanket ban in certain geographies, regulated sales have the potential to open up wider markets. AI companies that take responsible approaches are aided by regulation, as it makes it harder for less responsible companies to develop risky models, removing the commercial pressure on responsible companies to move quickly or cut back on safety. Even in the absence of legal backing, a regime could begin as a voluntary programme that responsible companies want to be seen to be part of. Taken together, there are potential benefits for all of the major stakeholders in such a regime.
Limitations
Whilst there are various factors in favour of compute governance approaches, it is not a perfect solution to the risks posed by AI. Large amounts of compute are currently required for frontier models; however, algorithms are improving (Erdil, 2022; Ho et al., 2024) and progress is being made on lower-specification hardware (Phothilimthana & Perozzi, 2023). At the same time, consumer-level hardware is improving and may reach a point where it can be used for model development (Mann, 2024). An underlying assumption is that more compute enables riskier models, but this is a blunt heuristic and does not necessarily provide an accurate assessment of model capabilities. Small, narrow models can still be dangerous if they focus on biological processes (Urbina et al., 2022) or specialised military uses.
The training stage of a model requires a large amount of compute, but once a model has been developed, whether inside or outside a compute governance framework, it can be distributed easily. If a model is released as open source, it may be possible to remove or reduce its safety training with much smaller amounts of compute (Lermen et al., 2023). Any compute governance regime will be difficult to enforce retroactively, particularly if it requires hardware modifications; existing deployed chips will not be covered automatically, although this will matter less over time as those chips degrade and are superseded. There will be incentives for evasion and circumvention of governance regulations, but a regime that is designed and implemented correctly can be made difficult to circumvent, particularly if combined with other approaches (Department for Science, Innovation & Technology, 2023). Chip smuggling is already happening in response to the US export ban to China (Ye et al., 2023), though seemingly only in single figures, making it largely irrelevant compared to the numbers needed to train a frontier model.
Risks
In addition to the known limitations of governing compute, there are also risks that may materialise. If a regime is implemented without proper planning, well-intentioned safety measures could end up being counterproductive (Sastry et al., 2024). To mitigate this, lessons can be learned from successful regimes such as the control of nuclear materials (Baker, 2023).
Depending on the level of transparency required, reporting regulations may introduce race dynamics as competitors try to keep up with one another, accelerating the development of riskier AI models. With any regime there is also the risk of regulatory capture, where regulations end up acting in the interest of some of the firms being regulated; this can restrict competition and keep new entrants out of the market through regulatory burdens. An example from the AI industry is the French AI company Mistral, which lobbied for changes to the EU AI Act to help it compete with larger US-based companies (Lomas, 2023) and then announced a €2bn deal with Microsoft (Warren, 2024), giving rise to competition concerns (Hartmann, 2024).
There is a risk of abuse of power by the states leading such regimes; in particular, it may spur greater concerns about US governmental overreach (Cheng, 2024). It may also give rise to legal challenges, as using the law to verify how private entities are using chips raises a multitude of significant privacy and security concerns (Bernabei et al., 2024). In addition, the verification process may mean that commercial or personal information is leaked accidentally, so procedural safeguards and strong cyber security approaches are needed to protect relevant data.
Even with the current, relatively basic export restrictions, retaliation has been seen: China completely stopped the export of certain elements required for semiconductor production in August 2023 (He, 2023). A more nebulous perceived risk is that of slowing growth in a highly promising economic sector that has previously had low barriers to entry. In practice, the regulations being discussed are designed to exclude all but the largest-scale development in frontier labs, and they incorporate regular revisiting of rules and thresholds to ensure this remains the case.
Conclusion
Compute governance is understood to be a feasible and potentially effective way of managing some of the risks from advanced AI systems, and basic steps to enable it have already been taken in the US and EU. The majority of the challenges to a successful regime are political rather than technical (Bernabei et al., 2024). Even in the absence of governmental support, immediate steps such as voluntary chip registration can pave the way for robust legal frameworks, helping to ensure AI development remains safe and beneficial; as participation by responsible manufacturers and developers increases, it would become easier to convert the scheme into a legal requirement. Compute governance is not a single solution to all of the possible risks posed by AI, but it can be used as part of a Swiss cheese model (Reason et al., 1997), combined with other governance approaches to reduce risk and allow time for other approaches to be developed.
References
NAIRR Pilot—Home. (n.d.). Retrieved 18 May 2024, from https://nairrpilot.org/