Surviving and Shaping Long-Term Competitions: Lessons from Net Assessment

post by Gentzel, ihavenoahidea · 2023-11-24T18:18:41.072Z · LW · GW · 0 comments

Contents

  What is net assessment?
  The methods and principles of net assessment
    1. Look for asymmetries between competitors. 
    2. Focus on the outcomes of likely scenarios, considering all the interacting forces. 
    3. Focus on objective, long-term, diagnosis over prescriptive, near-term advice. 
    4. Look for crucial considerations in problems that haven’t been considered clearly. 
    5. Standardize data collection to build long-term trends and comparisons. 
    6. Model simply, think complexly. 
    7. Assess the possibility of avoiding, limiting, and reshaping competitions entirely.
  A historical use-case for net assessment: hacking Soviet spending
  In conclusion: So what? 

This post examines net assessment, a framework for evaluating strategic competition that evolved to inform U.S. defense policy during the Cold War. We explain what net assessment is, its methods and principles, and how some of its tools can be applied to reason about highly uncertain, long-term tech competitions with potentially existential stakes.

 

What is net assessment?

In the late 1950s, the Cold War slid into an especially dangerous period. Nuclear stockpiles swelled and delivery capabilities advanced, while a decades-long buildup by the Soviet Union challenged the United States’ conventional military dominance. Defense analysts needed to reframe the way they looked at military competition: the U.S. could not overpower the Soviet war machine with brute force, and the prospect of nuclear war both elevated the stakes of conflict and created a need for new metrics and principles of strategy for engaging in limited competition. It was in this context that the framework of “net assessment” began to develop.

Andrew Marshall, who founded and then directed the DoD’s Office of Net Assessment for forty-two years, described net assessment as follows:

"Our notion of a net assessment is that it is a careful comparison of U.S. weapon systems, forces, and policies in relation to those of other countries. It is comprehensive, including description of the forces, operational doctrines and practices, training regime, logistics, known or conjectured effectiveness in various environments, design practices and their effect on equipment costs, performance, and procurement practices and their influence on cost and lead times. The use of net assessment is intended to be diagnostic. It will highlight efficiency and inefficiency in the way we and others do things, and areas of comparative advantage with respect to our rivals."

Generalizing from its original military context, the core idea of net assessment is to create comprehensive (hence the “net” in the name), objective, and informative comparisons between competitors: be they states, companies, or other political or ideological coalitions. Insights drawn from these comparisons can inform more sustainable strategies and help build beneficial competitive dynamics instead of disastrous ones on tech and policy subjects of high importance.

Such analyses tend to zero in on crucial considerations. In a particularly salient example, Marshall himself displayed early concern about threats such as synthetic biorisk: writing a foreword to a biosecurity volume in 2010, and commissioning multiple biotech threat analyses looking at bioterrorism, biodefense, and biotechnology as a strategic blind spot. 

Net assessment is not a single, well-defined procedure. This piece attempts to describe and examine some of its underlying principles and tools, their successes, and their applicability to areas of significant long-term moral importance beyond military competition.

 

The methods and principles of net assessment

The most common tools of net assessment are scenarios, wargames, trend analyses, and considered judgment. Scenarios and wargames force analysts to get close to the dynamics under consideration. Stepping through exactly how a competitor thinks and the options at their disposal creates new insights and enables a kind of long-distance participant observation. Trend analysis bolts the inside-view knowledge gained from wargaming onto measurable, outside-view base rates, and the integration of these methods enables high-quality considered judgment. This is especially helpful when assessing novel situations where new technologies, doctrines, and strategic actors may produce very different results from those implied by prior base rates.

Net assessments tend to mix qualitative and quantitative analytical methods. The Office of Net Assessment homed in on simple trends and explored nuance with wargames: its analysts didn’t tend to build extremely complex models (or at least not themselves). Quantitative optimization can work wonders, but neglected issues often involve ideas and ways of thinking that are unprecedented and lack numbers or large datasets. Just as winning wars often requires thinking about fuzzy concepts like morale and not just optimizing projected kill-death ratios, promoting long-term flourishing will require more than optimizing lives legibly saved per dollar.

The following principles are key to applying the ideas mentioned above: 

 

1. Look for asymmetries between competitors

Different competitors will have different strengths and weaknesses, efficiencies and inefficiencies, strategies and goals, doctrines, policies, and programs. Net assessment merges analysis of these differences into a coherent understanding of each side’s view of the balance of power, which enables better forecasting of their behavior, including idiosyncratic and irrational blind spots that simpler models might miss. 

Finding the irrationalities and blind spots that emerge from bureaucratic, cultural, and political factors is one of the central ways net assessment provides strategic value. Early forecasts of Soviet defense decision making often implicitly assumed unitary rational actor models that did not describe how actual bureaucracies function. When asked to forecast Soviet bomber basing in the early Cold War, RAND analysts assumed the USSR would relocate air bases along the Trans-Siberian Railway deep within the country’s interior to leverage increasing bomber range, assure logistics access, and make targeting more difficult for U.S. bombers.[1] If the analysts had accounted for bureaucratic inertia, or even for the way the U.S. had based its own bombers, they might have considered that the Soviet Union would keep its bombers on the same vulnerable runways it had constructed on its periphery before it developed longer-range aircraft.

Finding the structural irrationalities of an opponent, its principal-agent problems, and areas of predictable behavior can provide large returns to both competitive and cooperative strategy. Bureaucracies and large organizations enforce structure and execute routines; they don’t engage in optimal behavior by default. This is as true of gain-of-function research labs and industrial meat producers as it was of the Soviet Union. Exploiting these irrationalities can impose unforced costs on an opponent (e.g., in the case of the Cold War, increased military spending in irrelevant, less threatening areas) or save one’s own resources by avoiding unnecessary domains of competition entirely. At the same time, understanding each side's political constraints can help bound "worst-case scenario" thinking about arms race dynamics, enhance forecasting, and enable more cooperative strategy by making pre-commitments more credible.

In general, considering the difficulties of internal alignment and principal-agent problems within organizations is a key part of net assessment. For instance, researchers conducting net assessments about competition among AI organizations might pursue the following questions: What are the internal incentives of the teams at the major AI organizations in the United States and abroad? How do they make decisions, what appears to influence them, and why? What do they perceive as their competitive advantages? What are the major organs and coalitions of each organization, what are their incentives, and how much influence does each have?

Examples of this sort of reasoning—around the internal dynamics of organizations and how agents within them can aim towards conflicting goals—can also be found in selectorate theory and public choice theory style reasoning about incentives.
 

2. Focus on the outcomes of likely scenarios, considering all the interacting forces

Comparing absolute numbers can deceive: if a country has many tanks but no capacity to repair them, it will lose to a lower-inventory country that can keep its tanks on the battlefield for longer. This is how Israel surprised the world with a victory against a large coalition of forces in 1973. More recently, differences in morale, logistics, communications, and human capital have all played to Ukraine’s advantage during the invasion by Russia’s far larger force. Agility, communications, logistics, and maintenance can beat size, but these factors may be difficult to incorporate quickly or accurately into quantitative models and simulations. Unstable alliances and multipolar conflict may also greatly increase the complexity of assessing how large and coordinated a given conflict will be. An individually weak country may perform well above its weight when highly determined and supplied by a superpower. Net assessment’s qualitative methods attempt to account for such factors, and they may prove useful for analyses of the AI landscape that go beyond determining who spends the most on compute or who has the biggest model. Crisis simulations, wargames, and searching for analogies and disanalogies from applied history can help build realistic expectations of how events will actually unfold.
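The tank example above can be made concrete with a toy calculation. The sketch below is only illustrative: the function name, the loss and repair rates, and the force sizes are all hypothetical assumptions rather than historical figures, and the model assumes every knocked-out tank is repairable.

```python
def operational_over_time(inventory, daily_loss_rate, daily_repair_rate, days):
    """Track how many tanks are operational at the end of each day.

    Toy assumption: every knocked-out tank is repairable, and repairs
    return a fixed fraction of the damaged pool to service each day.
    """
    operational = float(inventory)
    damaged = 0.0
    history = []
    for _ in range(days):
        knocked_out = operational * daily_loss_rate
        repaired = damaged * daily_repair_rate
        operational += repaired - knocked_out
        damaged += knocked_out - repaired
        history.append(operational)
    return history


# Hypothetical forces: the numbers are illustrative, not historical.
large_force = operational_over_time(1000, 0.10, 0.0, days=14)   # no repair capacity
small_force = operational_over_time(600, 0.10, 0.30, days=14)   # repairs 30% of damaged tanks daily

# First day on which the smaller force fields more working tanks.
crossover_day = next(
    (day
     for day, (big, small) in enumerate(zip(large_force, small_force), start=1)
     if small > big),
    None,
)
print(crossover_day, round(large_force[-1]), round(small_force[-1]))
```

Under these assumed rates, the smaller force settles near a stable level of working tanks while the larger one keeps shrinking, so a comparison of starting inventories alone would point the wrong way. Even this toy omits morale, logistics, and communications, which is part of the argument for supplementing such models with qualitative methods.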

 

3. Focus on objective, long-term, diagnosis over prescriptive, near-term advice

Net assessment aims to elevate decision makers’ understanding of the world and to assess problems and opportunities. Instead of continuously redirecting attention to short-term policy goals, net assessment aims to provide neglected value by taking a holistic view and characterizing the long-term dynamics of competition. This same sort of rigorous long-term trend analysis is also likely to be valuable in many cause areas, from global poverty reduction to technology policy. As explained below, net assessment’s greatest contribution to ending the Cold War came from analyzing the peacetime patterns in Soviet military expenditure, not from generating quick responses to any particular crisis. By looking to the decades ahead, approaches like net assessment can inform strategies that outmaneuver near-term focused bureaucracies, avoid conflict, and channel resources toward future advantages.

This doesn’t mean the capability to analyze and act fast isn’t valuable, just that longer-term slower analysis is uniquely valuable when the incentives of large organizations get captured by short-term feedback loops. For issues that evolve extremely quickly such as AI, coming up with faster ways to generate “all things considered” perspectives is likely to be very important in generating sound strategic advice.[2]

 

4. Look for crucial considerations in problems that haven’t been considered clearly

When framing problems, it is essential to ensure that you focus on what matters rather than following the inertia of how things have been done before. Metrics such as “territory captured” weren’t very relevant for thinking about conflicts as existential as nuclear war, and this was a key impetus toward developing net assessment.

If something has been quantified and modeled, someone has paid attention to it, which sometimes means that marginal intellectual effort may be less valuable. Methods like net assessment often focus attention on the low-hanging fruit found in qualitative or unquantified areas of research. In one memorable episode from the Cold War, when skeptics questioned Herman Kahn’s claim to have become a “world leading expert on ending nuclear war”, he responded: “I put two junior people on it for a couple days last week. We’ve thought more about it than the entire Department of Defense has.” 

Fortunately, we’ve never had to test this “expertise” on ending nuclear war. In general, though, when a problem has received attention from only a limited number of angles, new approaches can offer massive value. Many failures of preparation and analysis are failures of imagination rather than failures of calculation. Sometimes a small amount of thought can add a lot of value, and you are more likely to find such thoughts by spending time thinking about more things.

Solving wicked, not-yet-decomposed problems requires concentrated thinking and flexible methods. Andrew Marshall said that “the single most productive resource that can be brought to bear making net assessments is sustained hard intellectual effort” (see also: the Feynman Algorithm).
 

5. Standardize data collection to build long-term trends and comparisons

Without standardized, panoptic data, it is difficult to fairly or accurately compare competitors that have many differences. It is often essential to ask what spending is relevant to measure and how to adjust comparisons for qualitative factors or for differences in how clearly bureaucracies organize and describe their efforts. Since net assessment is best done by analysts with as much relevant data as possible, access to non-public data is often essential, and the resulting analysis must be secured to avoid providing an advantage to competitors. While secrecy is less relevant in less adversarial contexts, the importance of seeking comprehensive relevant data remains. Compiling AI progress metrics (including interpretability and capacity), policy differences between countries, biosafety failures, and a variety of other measures can all serve to inform risk reduction. Consistent with the principle of long-term thinking above, time-series data should ideally cover long timescales.

Several writers have investigated AI trends, but trends that are further upstream of or unrelated to capabilities may be less well studied in the same context. On the more obvious side, what are the trends in AI chip production and how will the various actors respond to plausible changes? What kinds of AI spending are actually concerning, and what are the trends for that subset of spending? How will the major players react to different advances in other technologies? At what speed are the most relevant talent pools for ML growing and who has the best pipelines for attracting and assessing talent? Are there any concerning trends in the behavior of leading firms and states? What are the influence trends of different potential sources of regulation?
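As a minimal sketch of what standardizing a comparison can involve, the snippet below converts two actors' nominal spending series into constant units before comparing growth trends. Everything here is a hypothetical assumption for illustration: the actor names, spending figures, price indices, and helper functions are invented and do not describe any real organization.

```python
def to_constant_units(nominal, deflator):
    """Divide each year's nominal figure by that year's price index."""
    return {year: nominal[year] / deflator[year] for year in sorted(nominal)}


def compound_annual_growth(series):
    """Compound annual growth rate between the first and last year."""
    years = sorted(series)
    first, last = years[0], years[-1]
    return (series[last] / series[first]) ** (1 / (last - first)) - 1


# Hypothetical spending figures and price indices (base year 2018 = 1.0).
actor_a_nominal = {2018: 100.0, 2020: 121.0, 2022: 146.41}
actor_a_deflator = {2018: 1.0, 2020: 1.0, 2022: 1.0}
actor_b_nominal = {2018: 50.0, 2020: 70.0, 2022: 100.0}
actor_b_deflator = {2018: 1.0, 2020: 1.1, 2022: 1.25}

growth_a = compound_annual_growth(to_constant_units(actor_a_nominal, actor_a_deflator))
growth_b = compound_annual_growth(to_constant_units(actor_b_nominal, actor_b_deflator))
print(f"A: {growth_a:.1%}  B: {growth_b:.1%}")
```

In this invented case, actor B's nominal spending doubles, but part of that increase is price change, so its real growth is closer to actor A's than the raw figures suggest. A real comparison would also need to decide which spending categories count at all, a judgment that no deflator settles.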

 

6. Model simply, think complexly

As discussed above, net assessment grounds itself in robust trends; this practice helps analysts balance anchoring bias and the representativeness heuristic against each other and move toward more accurate estimation. After assembling empirical knowledge of how organizations and their organs work, how the world will probably change, how it might change, and how to measure the important variables, the final step of net assessment is to use wargames and a considered understanding of everyone’s positions and habits to predict behavior in response to different developments. How might organizations react to certain legislation or policies? How could they respond to increased public concern? While any individual scenario may not be likely, analyzing a large number of them may reveal non-obvious strategies that do well in a wide variety of potential futures and highlight which uncertainties matter the most. The complexity of scenario-building can make net assessments difficult to vet, so the process benefits from high-quality experts with different kinds of domain expertise. In a military context, business analysts, economists, engineers, regional experts, historians, social scientists, military personnel, and industrialists have all had useful knowledge. When trying to build a team for a net assessment that will produce neglected insights, look for people with clearly relevant experience and expertise that is rarely applied within the domain of study.
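One simple way to look for strategies that hold up across many scenarios is to tabulate outcomes and compare worst cases and regrets. The sketch below is an assumption-laden illustration, not a method attributed to the Office of Net Assessment: the strategy names, scenario names, and 0-10 payoff scores are all hypothetical, and in practice such scores would come out of wargames and expert judgment rather than being typed in.

```python
# Hypothetical payoff scores (0 = worst, 10 = best) for each strategy in each scenario.
payoffs = {
    "compete_head_on":   {"rapid_breakthrough": 9, "slow_progress": 0, "new_regulation": 2},
    "hedge_and_monitor": {"rapid_breakthrough": 5, "slow_progress": 5, "new_regulation": 4},
    "seek_agreement":    {"rapid_breakthrough": 3, "slow_progress": 7, "new_regulation": 1},
}
scenarios = list(next(iter(payoffs.values())))

# Best achievable score in each scenario, used as the regret baseline.
best_in_scenario = {s: max(row[s] for row in payoffs.values()) for s in scenarios}

# Worst outcome each strategy could face (maximin criterion).
worst_case = {name: min(row.values()) for name, row in payoffs.items()}

# Largest shortfall from the best strategy in any scenario (minimax regret criterion).
max_regret = {
    name: max(best_in_scenario[s] - row[s] for s in scenarios)
    for name, row in payoffs.items()
}

# How much the choice of strategy matters within each scenario.
spread = {
    s: best_in_scenario[s] - min(row[s] for row in payoffs.values())
    for s in scenarios
}

maximin_choice = max(worst_case, key=worst_case.get)
minimax_regret_choice = min(max_regret, key=max_regret.get)
most_decisive_scenario = max(spread, key=spread.get)
print(maximin_choice, minimax_regret_choice, most_decisive_scenario)
```

With these made-up scores, the hedging strategy is never the best in any single scenario yet comes out ahead on both robustness criteria, which is the kind of non-obvious result that scenario analysis is meant to surface. The per-scenario spread gives a rough indication of which uncertainty deserves the most further study.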

 

7. Assess the possibility of avoiding, limiting, and reshaping competitions entirely

Net assessment is often thought about in the context of developing competitive strategies that map one’s enduring strengths against a competitor’s enduring weaknesses, but there’s no point in doing this if costly competition can be avoided. Methods like net assessment are diagnostic and not inherently zero-sum: they can be used to inform cooperation as well as competition and to cultivate more factually grounded strategic empathy. Net assessments can help identify trends, organizational quirks, cultural values, and patterns of thought that enable socialization strategies to end competitions through détente, realignment, or friendship.

Before engaging in competitive strategy, one has to ask if a potential opponent can be befriended or aligned. If two parties can establish a justified basis of trust, deconflict their interests, and establish incentives toward win-win cooperation, then that’s certainly better than fighting a hot war, a cold war, or even engaging in political competition. When a competitor’s leadership or demeanor changes, there may be new opportunities for confidence building and arms control, as occurred with the rise of Gorbachev.

Likewise, within adversarial competitions, it is still useful to diagnose areas where cooperation and socialization strategies are helpful for both positive-sum and competitive ends.[3] Following WW2, the U.S. and USSR did not directly go to war: this minimal level of cooperation (driven by both deterrence and domestic distaste for war) reflects a mutually beneficial limit on competition. Within the confines of peacetime competition, further cooperation on arms control similarly provided mutual benefit by reducing the burden of military competition. Arms control could also reshape competition in a favorable manner: reduction agreements on U.S. and Soviet nuclear forces weren’t just a reduction of risk and hostility, but arguably may have also shifted military competition in a qualitative direction that favored the United States. However, pursuing arms control for a competitive advantage is often a recipe for failing to get arms control at all, and an advantage at one level may be offset by disadvantages at other levels. New START did not restrain Russian tactical nuclear weapons, and the destructive power of the remaining Russian arsenal likely still exceeds that of the entire rest of the world combined. Lastly, socialization strategies can have competitive and stabilizing benefits not just from building one’s own alliances, but also from reacting to divisions within and between blocs of competitors, as with Nixon’s efforts to open China and to deter Soviet nuclear attack on China. As opposed to socialization strategies, the less friendly variants of competitive strategy start making sense where and when the interests of each party are inherently at odds and the other side is inherently determined to put up a fight.

Even with a determined unalignable competitor, tailored competitive strategies won’t always be the best response. If the other side is adaptively rational, flexible, and hard to predict, the returns to modeling the other side’s bureaucracy and decision making culture may be short-lived. In such situations, it is better to focus on agility and quickly pivoting in response to future events. If one is too weak to target an unalignable competitor’s weaknesses directly, then it may be best to “hide and bide”: avoiding conflict and banking on the future by leveraging existing trends better than other competitors. 

Overall, tailored competitive strategies are best reserved for unalignable opponents that possess enduring, predictable weaknesses against which you can match your own coalition’s strengths.[4] As a highly determined, non-alignable opponent with territorial ambitions and exploitable structural irrationalities, the Soviet Union was a good target for U.S. competitive strategy.

 

A historical use-case for net assessment: hacking Soviet spending

Historically, the most direct application of net assessment was to discover insights that lead to victory in long-term competition. For example, one of the Office of Net Assessment’s most valuable insights during the Cold War was identifying large inefficiencies in how the Soviets would respond to American military expenditures. Using the above principles, analysts noted that the Soviet defense bureaucracy was irrationally wedded to building single-purpose, single-use surface-to-air missile systems in response to improvements in American long-range bombers, which, although individually expensive, were fewer in number and multi-use.

The focus on air-defense systems, and on deploying them en masse to defend the vast periphery of the Soviet Union, was irrational because intercontinental ballistic missiles already left the entire USSR vulnerable in the event of a nuclear war, regardless of any air defenses.[5] This observation allowed the United States to induce a massive waste of resources on the part of the Soviets simply by building more useful long-range bombers. While in isolation the extra air defense may not have been an extreme burden, this was far from the only area of tremendous Soviet defense spending. At one point the CIA and Pentagon estimated that the Soviet Union was spending 1–2 percent of its GDP on bunkers to protect its leadership in war: bunkers that could potentially be targeted and destroyed at a far lower cost to the U.S.[6] While very risky from an escalatory perspective and unlikely to work as initially proposed, President Reagan’s “Star Wars” missile defense program could be viewed in a similar way: it took advantage of recent purges of Soviet spies from the U.S. defense industry and likely generated disproportionately wasteful Soviet spending by playing to the fear that the U.S. might quickly make huge advances that couldn’t be stolen via espionage. Overall, the total USSR defense burden consumed about a quarter of Soviet GNP, and cost imposition strategies accordingly made the Soviet war machine, and the USSR altogether, less sustainable.[7] Other, shorter-horizon schools of analysis failed to note these sorts of glitches in the Soviet bureaucracy because they did not consider the conflict on the timescale of decades and at the level of bureaucratic incentives.

Taking a step back and envisioning longer time horizons, the kind of competitive strategy used to hack Soviet defense spending was not an unambiguous win. Cost imposition strategies can force economic costs upon the innocent and encourage dangerous arms races or high-risk posturing. While cost imposition helped end the Cold War faster and focused the Soviet Union on more inherently defensive weapons, efforts that played on Soviet leaders’ fear of a surprise attack may have increased the probability of accidental or inadvertent nuclear war.[8] The collapse of the Soviet Union also brought a collapse in life expectancy for former Soviet citizens and the risk of loose nukes or bioweapons. Were it not for the Nunn-Lugar Cooperative Threat Reduction Program, many weapons of mass destruction may have remained unguarded and the Russian arsenal might be far larger today. As Russia’s invasion of Ukraine and its nuclear threats show, winning a techno-economic competition at one point in time is no guarantee of lasting existential security. Russia’s government didn’t become aligned with the West, and while its military, nuclear arsenal, and economy are much weaker than they could have been, its remaining nuclear forces are still a potentially massive threat.

 

In conclusion: So what? 

In some ways, the difference between operations analysis and net assessment parallels the difference between GiveWell’s analysis and hits-based giving: the latter in each pair looks for crucial considerations that may lead to neglected, counterintuitive strategies. As philanthropists and do-gooders set their sights on harder, more amorphous problems that defy easy quantification, some of the methods and thought tools used in net assessment may prove useful for problem framing.

Military arms races and competitions in AI capabilities and bioweapons technology are some of the most obvious use cases for some of the tools of net assessment. Are there opportunities for competitors to cooperate to reduce risks? Are there exploitable irrationalities among the type of organizations most likely to develop dangerous AI systems or bioweapons? Are there areas where you can reliably take competitive actions without triggering an arms race? What about ways to trigger a competitor to take stabilizing actions, like spending more on safety or wasting attention and resources on lower risk activities? Net assessment can also be useful outside of x-risk areas. For instance, similar approaches could be applied to analyze the dynamics of policy competition between animal welfare advocates and industrial animal agriculture companies.

Fundamentally, net assessment reveals opportunities that idealized models of competition often obscure, and some of its related thought tools and styles of thinking can be useful across many causes. Zooming in on how different organizations actually function can reveal structural inertias and irrationalities that can enable competition to unfold very differently. While these insights can be abused to exacerbate risk for an advantage, they can also enable strategies to escape from disastrous competitive equilibria to align competitiveness with safety. 
 

 

 

Footnotes:

  1. ^

    See pages 44–45 in "The Last Warrior: Andrew Marshall and the Shaping of Modern American Defense Strategy".

  2. ^

    At the same time, the closer analysts move to near-term prescriptive advice, the closer they may come to political competition for influence and the associated risks of politicized analysis.

  3. ^

    Note that blending these can be risky and erode the ability to achieve positive-sum, win-win compromises.

  4. ^
  5. ^

    In exercising this sort of reasoning, it is important to consider the different potential causes of bureaucratic biases and whether they are actually "irrational." The impetus for Soviet overreaction to bombers may have been a response to the huge numbers of bombers the U.S. deployed in the 1950s, the massive arsenal that came with them, and the sympathies of earlier generals like Curtis LeMay toward preemptive attack. It's also possible that Soviet bureaucrats thought that air defense missile costs would decline enough to be much more economical. Another source of bias toward surface-to-air weapons could be a lack of trust in fighter pilot autonomy: a disproportionate cost for authoritarian systems. If for some reason Soviet war planners thought they could take out U.S. ICBMs on the ground, but not alerted bombers, this would also have created a bias toward air defense, and the bomber strategy would accordingly have been a good countermeasure. In any large state-funded industry, many actors will have a motive to keep their funding secure even when their efforts are no longer adaptive.

  6. ^
  7. ^

    A notable memo on this subject was Marshall’s “Estimates of Soviet GNP and Military Burden,” Memorandum for the Secretary of Defense through the Assistant Secretary of Defense (ISA), August 2, 1988. The episode is recounted starting on page 172 of "The Last Warrior: Andrew Marshall and the Shaping of Modern American Defense Strategy".

  8. ^

    In the book “The Dead Hand” there are numerous examples of how Soviet intelligence grew paranoid and invested a great deal of analytic effort into investigating potential signs of first-strike intentions in the West. At the same time, some of these narratives, such as those around Able Archer 83, are likely to be heavily exaggerated. Because the prospect of existential risk can motivate extreme action, strategic actors often have an incentive to engage in influence operations that overplay or underplay such threats.
