Looking for AGI philosopher co-author on Draft Article: 'Optimising Peace to Constrain Risk of War from an Artificial Superintelligence'

johncdraper


post by JohnCDraper · 2020-04-06T16:50:54.639Z · LW · GW · 2 comments


Abstract

An artificial superintelligence (ASI) emerging in a world in which war is still normalised may constitute a catastrophic existential risk, either because the ASI goes to war on behalf of itself to establish global supremacy (internal risk), or because an ASI might be employed by a single nation-state to wage war for global supremacy (external risk). We now live in a world where few states actually declare war; the last major declaration of the existence of a state of war was in 2008, for the Russo-Georgian War. This is because Article 2 of the 1945 United Nations Charter states that UN member states should “refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state”, while allowing for “military measures by UN Security Council resolutions” and “exercise of self-defense”. In this theoretical ideal, wars are not declared; instead, 'international armed conflicts' occur. However, interstate wars, both ‘hot’ and ‘cold’, still exist, for instance the Syrian Civil War, where an interstate proxy war is being waged, and the Korean War. Furthermore, a ‘New Cold War’ between AI superpowers (the United States and China) looms. An ASI-directed/enabled future interstate war could trigger ‘total war’, including nuclear war, and may therefore be considered ‘high risk’. One risk reduction strategy would be optimising peace through a Universal Global Peace Treaty (UGPT), which could contribute towards the ending of existing wars and towards the prevention of future wars, through conforming instrumentalism. While this strategy cannot cope with non-state actors, it could influence state actors, including those developing ASIs, or the ASI itself, should it assume agency.
An opportunity to optimise peace as a risk reduction strategy is emerging, by leveraging the UGPT off the announcement of a ‘burning plasma’ fusion reaction, expected from circa 2025 to 2035, as was attempted in 1946 with fission, for atomic war, in the Baruch Plan.

2 comments

Comments sorted by top scores.

comment by JohnCDraper · 2020-04-30T10:50:08.867Z · LW(p) · GW(p)

I have started a new thread for comments on this article.

comment by JohnCDraper · 2020-04-16T13:43:40.849Z · LW(p) · GW(p)

Full draft article now up on SocArXiv, for comment here:

Draper, J. (2020, April 15). Optimising Peace through a Universal Global Peace Treaty to Constrain Risk of War from a Militarised Artificial Superintelligence. https://doi.org/10.31235/osf.io/4268q

Optimising Peace through a Universal Global Peace Treaty to Constrain Risk of War from a Militarised Artificial Superintelligence

John Draper

Abstract

An artificial superintelligence (ASI) emerging in a world where war is still normalised may constitute a catastrophic existential risk, either because the ASI might be deliberately employed by a single nation-state to wage war for global domination or because the ASI goes to war on behalf of itself to establish global domination; these risks are not mutually exclusive, in that the first can transition to the second. We presently live in a world where few states actually declare war on each other or even wage war on each other. This is because Article 2 of the 1945 United Nations Charter states that UN member states should “refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state”, while allowing for “military measures by UN Security Council resolutions” and “exercise of self-defense”. In this theoretical ideal, wars are not declared; instead, 'international armed conflicts' occur. However, costly interstate conflicts, both ‘hot’ and ‘cold’, still exist, for instance the Kashmir Conflict and the Korean War. Furthermore, a ‘New Cold War’ between AI superpowers (the United States and China) looms. An ASI-directed/enabled future interstate war could trigger ‘total war’, including nuclear war, and is therefore ‘high risk’. One risk reduction strategy would be optimising peace through a Universal Global Peace Treaty (UGPT), which could contribute towards the ending of existing wars and towards the prevention of future wars, through conforming instrumentalism. A critical juncture to optimise peace via the UGPT is emerging, by leveraging the UGPT off a ‘burning plasma’ fusion reaction breakthrough, expected from circa 2025 to 2035, as was attempted, unfortunately unsuccessfully, in 1946 with fission, for atomic war. While this strategy cannot cope with non-state actors, it could influence state actors, including those developing ASIs, or an ASI with agency.

Keywords: AI arms race, artificial superintelligence, existential risk, nonkilling, peace

We say we are for Peace. The world will not forget that we say this. We know how to save Peace. The world knows that we do. We, even we here, hold the power and have the responsibility. We shall nobly save, or meanly lose, the last, best hope of earth. The way is plain, peaceful, generous, just - a way which, if followed, the world will forever applaud.

Bernard Baruch, June 14, 1946, presenting to the United Nations Atomic Energy Commission.

Introduction

The problem of an artificial superintelligence at war

The development of artificial intelligence (AI) is accepted to be a major factor in national security because it is ‘dual use’, i.e., it can be militarized and can provide a decisive advantage in terms of economic, information, and military superiority (Allen & Chan, 2017; National Security Commission on Artificial Intelligence, 2019). As the National Security Commission on Artificial Intelligence (2019, p.9) interim report states:

The development of AI will shape the future of power. The nation with the most resilient and productive economic base will be best positioned to seize the mantle of world leadership. That base increasingly depends on the strength of the innovation economy, which in turn will depend on AI. AI will drive waves of advancement in commerce, transportation, health, education, financial markets, government, and national defense.

Thus, AI is important for waging war and especially for winning a ‘total war’, i.e., a full-scale industrial interstate war, often involving genocidal levels of killing (Markusen & Kopf, 2007).

For the United States especially, preserving AI technological supremacy is viewed as paramount to national security (National Security Commission on Artificial Intelligence, 2019, pp. 1-2), particularly with respect to relations with China:

Developments in AI cannot be separated from the emerging strategic competition with China and developments in the broader geopolitical landscape. We are concerned that America’s role as the world’s leading innovator is threatened. We are concerned that strategic competitors and non-state actors will employ AI to threaten Americans, our allies, and our values. We know strategic competitors are investing in research and application. It is only reasonable to conclude that AI-enabled capabilities could be used to threaten our critical infrastructure, amplify disinformation campaigns, and wage war. China has deployed AI to advance an autocratic agenda and to commit human rights violations, setting an example that other authoritarian regimes will be quick to adopt and that will be increasingly difficult to counteract.

The militarization of AI makes it clear that the development of artificial general intelligence (AGI), i.e., AI equal to human intelligence, or of an artificial ‘superintelligence’ (ASI), i.e., AI greater than human intelligence (Bostrom, 2014), presents a catastrophic risk.

In this article, we argue that this risk can be minimized or ‘constrained’ in part in the same way as other potentially catastrophic risks involving weapons, e.g., by treaty. Bostrom (2014) briefly considers treaty approaches, and one of Allen and Chan’s (2017, p. 6) recommendations is:

The National Security Council, the Defense Department, and the State Department should study what AI applications the United States should seek to restrict with treaties.

Allen and Chan (2017) focus on an arms control approach to AI, using the example that AI should never be used to control dead man’s switches for nuclear weapons. Another approach is to optimise the likelihood of developing a beneficial AGI, through a comprehensive United Nations-sponsored ‘Benevolent AGI Treaty’ to be ratified by member states (Ramamoorthy & Yampolskiy, 2018). Here, we consider an alternative approach, a Universal Global Peace Treaty (UGPT), which would formalise the existing near-universal status of interstate peace, formally end the declaring of war, seek to end existing interstate hot and cold wars, seek to end internal or civil wars, which might prove to be flashpoints for a future global conflict, seek to prevent a pre-emptive war against an emerging ASI, and seek to constrain the future actions of an ASI to prevent it waging war on behalf of a nation-state or on behalf of itself for global domination, which we term ASI-enabled war and ASI-directed war, respectively.

The concept that artificial general intelligence (AGI), and in particular artificial superintelligence (ASI), could pose an existential risk was theorized in some detail by Nick Bostrom in 2002 (Bostrom, 2002) and further developed in 2014 (Bostrom, 2014). The basic thesis is, first, that an initial superintelligence might obtain a decisive strategic advantage such that it establishes a ‘singleton’, i.e., global domination (Bostrom, 2006). Second, the principle of orthogonality suggests that a superintelligence will not necessarily share any altruistic human final values. Third, the instrumental convergence thesis suggests that even a superintelligence with a positive final goal might not limit its activities so as not to infringe on human interests, particularly if human beings constitute potential threats.

The result is that an artificial superintelligence might turn against humanity (the ‘treacherous turn’) or experience a catastrophic malignant failure mode, for instance through perversely instantiating its final goal, pursuing infrastructure profusion, or perpetrating mind crimes against simulated humans. Bostrom (2014, p. 94) noted that a superintelligence might develop strategies to hijack infrastructure and military robots and create a powerful military force and surveillance system. Bostrom (2014) acknowledged the existential risks associated with the lead-up to a potential intelligence explosion, due to “war between countries competing to develop superintelligence first”, but he did not specifically focus on an ASI’s attitude towards war.

This article focuses on issues surrounding an ASI waging war. It considers how first establishing a Universal Global Peace Treaty (UGPT) could constrain the risks that an ASI might be employed by a nation-state to establish global domination through war (an external risk in terms of the ASI’s core motivation) or might decide to establish global domination by waging war itself (an internal risk in terms of breaching its core motivation).

The state of peace and war

War

We live in a world where few states actually declare war on each other (Hallett, 1998). The last two major declarations of the existence of an interstate state of war (note, not ‘declarations of war’) were in 2008, for the Russo-Georgian War (Walker, 2008), and in 2012, for the Sudan-South Sudan war (the ‘Heglig Crisis’) (Baldauf, 2012).

This is because the post-Second World War ‘Washington Consensus’ prioritised peace. Article 2 of the 1945 United Nations Charter states that UN member states should “refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state”, while allowing for “military measures by UN Security Council resolutions” and “exercise of self-defense”. In this theoretical ideal, wars are not declared; instead, 'international armed conflicts' occur (Hallett, 1998).

Nonetheless, ‘hot’ conventional wars involving hundreds of thousands of casualties and interstate players still exist, for instance the Syrian Civil War (Tan & Perudin, 2019), as do ‘cold wars’, with the Korean War remaining an unresolved war in search of an official peace treaty (Kim, 2019).

Furthermore, states’ transitioning from declared wars to undeclared wars poses significant problems in terms of oversight and accountability for foreign policy, especially for major democracies such as the United States (Moss, 2008).

Moreover, the nature of warfare has been transformed by information war and cyberwarfare. The realm of cyberwarfare poses particular difficulties, with a high standard being set for cyber operations to actually constitute an armed attack, creating a considerable ‘gray area’ that a determined party can exploit. Cyber operations causing major harm to an economic system do not typically rise to the level of a formal ‘cyber armed conflict’ justifying a defence (Schmitt, 2017). Also, unlike other forms of war which pose existential risks, i.e., atomic, biological, and chemical warfare, cyberwarfare is ongoing.

Finally, it is concerning that a ‘New Cold War’ between AI superpowers, namely the United States and China, while not inevitable, looms, complete with the problems of competing ideologies and ‘flash points’, like the South China Sea (Kohler, 2019; Westad, 2019; Zhao, 2019).

Peace

In this article, in our conceptualization of a ‘Universal Global Peace Treaty’, we refer not to a state of temporary peace, which implies only interrupted war, but to the Kantian concept of ‘perpetual peace’ (Archibugi, 1992; Bohman, 1997; Kant, 2003; Terminski, 2010). Via Kant’s concept of cosmopolitan law (Archibugi, 1994), the concept of perpetual peace underpins the United Nations in that it was translated into President Roosevelt’s human security paradigm embodied in the 1941 State of the Union address (the ‘Four Freedoms Speech’) (Kennedy, 1999) and then eventually partially incorporated into the Universal Declaration of Human Rights, adopted by the UN General Assembly on 10 December 1948 as Resolution 217, at its 183rd plenary meeting.

Despite the foundation and subsequent best efforts of the United Nations, and although the world has certainly been more peaceful since the Second World War, it hardly enjoys perpetual peace. Wikipedia lists 63 conflicts, insurgencies, and wars since 1946 with death tolls (including excess deaths from, e.g., related famines) of greater than 25,000, for an approximate total of nearly 30 million deaths. Of the five conflicts with the highest casualties, four, i.e., the Second Congo War (3,674,235 est. dead), the Vietnam War (3,144,837 est. dead), the Korean War (3,000,000 est. dead), and the Bangladesh Liberation War (3,000,000 est. dead), were essentially interstate wars. These wars have been characterised by atrocities, crimes against humanity, and war crimes, and in the case of the Former Yugoslavia, Rwanda, Cambodia, and Sudan, genocide (Mikaberidze, 2013).

The UN Charter, despite embracing and promoting peace and peacekeeping (Fortna, 2008), is at best a ‘workaround to war’ that sanctions armed conflict but does not strongly symbolise peace in the way a UGPT would. Ultimately, the world’s peacekeepers are firefighting major states’ decisions to ignore or actually encourage violence instead of promoting long-term peace as a global objective (Autesserre, 2014). Presently, for dozens of countries, military expenditure is over 1% of GDP (SIPRI, 2020), military expenditure as a share of government spending is over 10% for over 30 countries (World Bank, 2019), and the arms industry, despite progress being made with the 2013 Arms Trade Treaty (Erickson, 2015), is still a trillion-dollar industry.

Given the horrific ongoing loss of life from war, perpetual peace would appear elusive and unrealistically utopian. Yet, this was not always so, and in the immediate aftermath of the Second World War, the world did grasp for perpetual peace. The United States’ 1946 Baruch Plan to ban all atomic weapons and put fission energy under the control of the United Nations via the UN Atomic Energy Commission, the subject of the first session of the UN General Assembly, was the principal attempt, a ‘critical juncture’ for humanity (Draper & Bhaneja, 2020), and its failure resulted in the Cold War and enormous economic cost.

Draper and Bhaneja (2020) suggest that a similar opportunity for obtaining perpetual peace will shortly arise again in the form of a ‘burning plasma’ self-sustaining fusion reaction breakthrough, which is expected anywhere from 2025 to 2035, and that a UGPT could be leveraged off this development. This article expands on Draper and Bhaneja’s (2020; see also Carayannis, Draper, & Iftimie, 2020) basic concept in terms of the theory and practical implementation, with special application to constraining the risk of ASI-enabled or ASI-directed warfare.

Literature Review: The Risk of War from an Artificial Superintelligence

The causes of existential risk from ASI

The world is presently not governed well enough to prevent many existential risks, including from AI (Bostrom, 2013). Yampolskiy's (2016) taxonomy of pathways to dangerous AI stresses the immediacy of deliberate ‘on purpose’ creation of AI for the purposes of direct harm, i.e., Hazardous Intelligent Software (HIS), especially, for instance, lethal autonomous weapons and cyberwarfare capabilities by militaries. Yampolskiy does not address AGI but employs the useful notions of ‘external causes’ (on purpose, by mistake, and environmental factors) and ‘internal causes’ (independent) of dangerous AI in ‘pre-deployment’ and ‘post-deployment’ phases. Yampolskiy's (2016) work suggests it is credible for a pre-deployment ASI to be developed as a military project or be repurposed post-deployment (through a copy being stolen or via external or internal modification) for waging war.

Employing the concepts of agency and AI power as an analytical framework, Turchin and Denkenberger (2020) associate two risks with the ‘treacherous turn’ stage of ‘young’, i.e., recently emerged, ASI development. One is that malevolent humans (here, a hegemonizing nation-state) use the ASI as a doomsday weapon for global blackmail, to establish global supremacy. The second is that a nonaligned ASI eliminates humans to establish global domination, i.e., renounces altruistic values and wages war. Turchin and Denkenberger (2018) see these risks as related, in that military AI leads to a militarised ASI, which likely leads to the ASI waging war on humanity.

In this article, we follow Turchin and Denkenberger (2018) in mainly focusing on constraining the risk of a militarized ASI, defining militarization as “creation of instruments able to kill the opponent or change his will without negotiations, as well as a set of the strategic postures (Kahn, 1959), designed to bring victory in a global domination game”. Turchin and Denkenberger (2018) suggest a militarized ASI would most likely adopt and develop usage of existing technology, including cyber weapons, nuclear weapons, and biotech weapons. We mainly focus on the external risk of a ‘young’ ASI being employed by a nation-state for war, and on the internal risk of an ASI assuming agency and waging war on humanity on its own behalf.

The external risk

The external risk is predicated on an ASI being used by a nation-state to wage war for global hegemony. An ASI would affect current US military technological supremacy and transform warfare; it is therefore highly desirable for strategic military planning and interstate warfare (Sotala & Yampolskiy, 2015). A “one AI” solution to the ‘control problem’ of ASI motivation as discussed by Turchin, Denkenberger, and Green (2019) includes the first ASI being used to “take over the world”, including by being a decisive strategic advantage for a superpower and being used as a military instrument. This approach would likely only be seen as a solution by the superpower and its allies. As such, it presents a ‘high risk’ for non-aligned or other powers.

History indicates that the race to develop an ASI is likely to be closely fought, especially in the circumstance of competing major states with different fundamental ideologies. Bostrom (2008) analyses six major technology races in the twentieth century, for which the technology lag ranged from a minimum of approximately one month (human launch capability) to a maximum of 60 months (multiple independently targetable reentry vehicles).

The race to an ASI is also a very concrete risk; AI is already being militarized and weaponized by several states, including China and Russia, for strategic geopolitical advantage, as pointed out by the United States’ National Security Commission on Artificial Intelligence (2019). In 2017, Russia’s President Vladimir Putin stated that “whoever becomes the leader in this sphere will become the ruler of the world” (Cave & ÓhÉigeartaigh, 2018, p. 36, citing Russia Today, 2017). Russia’s Military Industrial Committee plans to obtain 30 percent of Russia’s combat power from remote controlled and AI-enabled robotic platforms by 2030 (Walters, 2017).

Mimicking United States strategy towards AI, the China State Council’s 2017 ‘A Next Generation Artificial Intelligence Development Plan’ views AI in geopolitically strategic terms and is pursuing a 'military-civil fusion' strategy to develop a first-mover advantage in the development of AI in order to establish technological supremacy by 2030 (Allen & Kania, 2017).

In the United States, as a result of the National Security Commission Artificial Intelligence Act of 2018 (H.R.5356; see Baum, 2018), AI is being militarized and weaponized by the US Department of Defense, under the oversight of the National Security Commission on Artificial Intelligence (2019). The AI arms race has reached the stage where it risks becoming a self-fulfilling prophecy (Scharre, 2019).

ASI-enabled warfare poses significant risks to geopolitical stability. Although Sotala and Yampolskiy’s (2015) survey of risks from an ASI (they use AGI) focuses on ASI-generated catastrophic risks, citing Bostrom (2002), they acknowledge multiple risks from a sole ASI owned by a single group, such as a nation-state, including the concentration of political power in the groups that control the ASI. Citing Brynjolfsson and McAfee (2011) and Brain (2003), they note that automation could lead to an ever-increasing transfer of wealth and power to the ASI’s owner. Citing, inter alia, Bostrom (2002) and Gubrud (1997), Sotala and Yampolskiy (2015, p. 3) also note that ASIs could be used to “develop advanced weapons and plans for military operations or political takeovers”.

The development of academic approaches to analysing the specific risk of a nation-state’s development of an ASI to establish or maintain global supremacy is relatively novel. In 2014 Bostrom noted that a “severe race dynamic” between different teams may create conditions whereby the creation of an ASI results in shortcuts to safety and potentially “violent conflict”. Subsequently, Cave and ÓhÉigeartaigh (2018, p. 37) described three dangers associated with an AI race for technological supremacy:

i) The dangers of an AI ‘race for technological advantage’ framing, regardless of whether the race is seriously pursued;

ii) The dangers of an AI ‘race for technological advantage’ framing and an actual AI race for technological advantage, regardless of whether the race is won;

iii) The dangers of an AI race for technological advantage being won.

Cave and ÓhÉigeartaigh (2018) do not elaborate significantly on the third danger. They simply, and rather pessimistically, state:

…these risks include the concentration of power in the hands of whatever group possesses this transformative technology. If we survey the current international landscape, and consider the number of countries demonstrably willing to use force against others, as well as the speed with which political direction within a country can change, and the persistence of non-state actors such as terrorist groups, we might conclude that the number of groups we would not trust to responsibly manage an overwhelming technological advantage exceeds the number we would.

To manage all three risks, Cave and ÓhÉigeartaigh (2018) recommend developing AI as a shared priority for global good, cooperation on AI as it is applied to increasingly safety-critical settings globally, and responsibly developing AI as part of a meaningful approach to public perception that would decrease the likelihood or severity of a race-driven discourse. The obvious risk is that the political leaders of states who perceive that they are actually engaged in an AI arms race may not heed this advice in the drive to develop an ASI.

This article focuses on constraining risks for the third of Cave and ÓhÉigeartaigh’s (2018) dangers. It does not consider the philosophical implications of which nation-state might want to develop artificial general intelligence for offensive purposes, although we do recommend such consideration in the conclusion to the article. An extensive literature already exists on historical modern nation-states with imperial ambitions that have sought to establish global domination through technological supremacy. Here, we briefly mention two, the British Empire and the Third Reich, to underline the point that major states will likely develop militarised ASI as part of a drive for hegemony.

The importance of the development of the British Navy to the rise of the British Empire, and its transformative effects on the world, are widely known (Herman, 2005). Beyond naval power, elites in the British Empire directed complex, incremental, adaptive developments in the design and diffusion of multiple key technologies, such as railways, steam ploughs, bridges and road steamers, to further the development of the British Empire (Tindley & Wodehouse, 2016). The British Empire itself sustained diverse ideologies of a ‘greater Britain’ directing world order in a hegemonic fashion, including via civic imperialism, democracy, federalism, utopianism, and the justified despotism of the Anglo-Saxon race (Bell, 2007).

In a considerably more malign imperial power, the Third Reich, spurred by reactionary modernism (Herf, 1984), scientists and engineers pursued not just the state of the art in conventional weapons, such as aircraft, air defence weapons, air-launched weapons, artillery, rockets, and submarines and torpedoes, but also atomic, bacteriological, and chemical weapons (Hogg, 2002). Architects, doctors, and engineers embraced the ideology of industrialized genocide as part of asserting the global hegemonic domination of a ‘superior’ race (Katz, 2006, 2011).

While some in the British Empire may have balked at creating an ASI, there seems little doubt that if it could have, the Third Reich would have developed an ASI for offensive purposes, particularly when it felt at risk.

The internal risk

The internal risk is basically predicated on the failure of any form of local safety feature to resolve the human control problem of an ASI, such as AI ethics, AI alignment, or AI boxing (Barrett & Baum, 2016). An ASI with agency could then consolidate power over nation-states, in the process eliminating the possibility of rival AIs (see Dewey, 2016) through cyberwarfare, rigging elections or staging coups, or by direct military action. Any of these courses of action would be a casus belli (here, cause of war) if detected but undeclared, or an ‘overt act of war’ if the ASI actually engaged in direct military action (see Raymond, 1992, for terminological usage).

The risk of an ASI going to war against humans has been analysed in some depth by Turchin and Denkenberger (2018), who convincingly argue for the following position:

Any AI system, which has subgoals of its long-term existence or unbounded goals that would affect the entire surface of the Earth, will also have a subgoal to win over its actual and possible rivals. This subgoal requires the construction of all needed instruments for such win, that is bounded in space or time.

What follows is a summary of the parts of Turchin and Denkenberger’s (2018) analysis that are most relevant to our approach.

The route to a militarized ASI

An ASI will probably result from recursive self-improvement (RSI). As such, it will have a set of goals, most notably to continue to exist and to self-improve. Omohundro (2008) demonstrated an AGI will evolve several basic drives, or universal subgoals, to optimise its main goal, including maximising resource acquisition and self-preservation. Similarly, Bostrom (2014) described self-preservation, goal-content integrity, cognitive enhancement, technological perfection, and resource acquisition. If these goals are unbounded in space and time, or at least cover the Earth, they conflict with the goals of other AI systems, potential or actual ASIs, humans, and nation states. This creates conflict, with winners and losers. This will result in arms races, militarization and wars.

Many possible terminal goals also imply an ASI establishing global domination, for a benevolent AI would aim to reach all people, globally, to protect them, e.g., from other ASIs. An ASI would reason that if it does not develop a world domination subgoal, its effect on global events would be minor, thus it would have little reason for existence.

World domination could be sought firstly through cooperation. The probability of cooperation with humans is highest at the early stages of AI development (Shulman, 2010). However, convergent goals appear in the behaviour of simple non-agential Tool AI, and this tends towards agential AI (Gwern, 2016), which tends towards resource acquisition. Benson-Tilsen and Soares (2016) similarly explored convergent goals of AI and showed that an AI may tend towards resource hungry behaviour, even with benevolent initial goals, especially in the case of rivalry with other agents. Essentially, any adoption of unbounded utilitarianism by the ASI means that it postpones what may be benevolent final goals in favour of expansionism.

It is also likely that an ASI would subvert bounded utilitarianism. Even a non-utility maximizing mind with an arbitrary set of final goals is presented with a dilemma: it temporarily converges into a utility maximizer with a militarized goal set oriented towards dominating rivals, using either standard military progress assessment (win/loss) or proxies (resource acquisition), or it risks failing in its end goals. Thus, the trend is towards defeating potential enemies, whether nation-states, AI teams, evolving competing ASIs, or alien ASIs. This requires the will to act, and any agent in a real-world ethical situation, even in minimizing harm, is making decisions that involve humans dying (the ‘trolley problem’; see Thomson, 1985), as well as evolving or utilizing the instruments to enable actions, i.e., arms.

These arms are already being developed. Since around 2017, the militarization of ‘Narrow AI’ has resulted in, for example, lethal autonomous weapons, which have attracted increasing attention from the global community (Davis & Philbeck, 2017). However, AI development is now influencing not just robotic drones, but strategic planning and military organization (De Spiegeleire, Maas, & Sweijs, 2017), suggesting that an ASI will build on an existing national defense strategy permeated with AI. It could then engage in ‘total war’ by employing nuclear weapons either directly or by hijacking existing ‘dead man’ second-strike systems (e.g., the semi-automatic Russian Perimeter system) or by deploying novel weapons (Yudkowsky, 2008).

The militarization risk of an early self-improving AI may even be underestimated at present by academics because of an assumption that the first ASI will be able to rapidly overpower any potential ASI rivals, with minimally invasive techniques that may not even require military hardware (Bostrom, 2014; Yudkowsky, 2008). Nevertheless, this relies on what may be several flawed assumptions about the speed of self-improvement, i.e., a ‘hard takeoff’; a short distance between AI teams; and favourable environmental variables in the level of AI (Turchin & Denkenberger, 2017).

Another reason for underestimating the risk of militarized AI is the assumption that even if an ASI creates military infrastructure, the AI will deploy its military capabilities carefully and that the total probability of global risks will decrease. However, in a ‘slow’ take-off (Christiano, 2018), a ‘young’ ASI will not be superintelligent immediately, and its militarization could happen before the ASI reaches optimal prediction capabilities for its actions, meaning it may not recognise the failure mode in consequentialist ethics. High global complexity and low predictability combined with relatively unsophisticated (e.g., nuclear) weapons mean early stage ASI-directed or enabled warfare could result in very high human casualties, i.e., be of existential risk, even with only one ASI being involved, even if the ASI was attempting to minimise human casualties.

Additionally, Turchin and Denkenberger (2018) argue for a selection effect in the development of a militarized ASI, “where quickest development will produce the first AIs, which would likely be created by resourceful militaries and with a goal system explicitly of taking over the world.” AI-human cooperative projects with military goal sets will therefore dominate over projects where the AI has to effect a treacherous turn. This implies the ASI will cooperate with its creators to take over the world as quickly as possible, then effect the treacherous turn.

To sum up, Turchin and Denkenberger (2018) convincingly argue for AI converging towards advanced military AI, which converges towards an ASI optimised for war rather than negotiation, then that ASI engaging in war. They show that, depending on the assumptions in several variables, the number of human casualties could in fact be very high, and that the risk increases if another ASI is under development in another nation-state. The existential risk increases after the ASI obtains global domination on behalf of its nation state, as it could turn on its ‘owner’.

Internal AI control features: The concept of coherent extrapolated volition

To constrain the risk of an ASI waging war, one popular approach is to imbue the ASI with ‘friendly’ goals (Yudkowsky, 2008), i.e., beneficial goals reflecting positive human norms and values. However, any approach involving human social values adds enormous complexity, making it a ‘wicked problem’ (Gruetzemacher, 2018) or super problem in terms of actual application.

Yudkowsky (2004, p. 35) attempts to address this by recommending that an ASI be programmed with the concept of ‘coherent extrapolated volition’, defined as humanity's choices and the actions humanity would collectively take if “we knew more, thought faster, were more the people we wished we were, and had grown up [closer] together,” i.e., an extrapolation based on some kind of utopian imagined community. Yudkowsky does not recommend this approach for a first generation ASI, but for a more mature ASI.

In a similar vein to Yudkowsky, certain values are seen as universal, such as compassion (Mason, 2015), and it has been suggested that an ASI should have altruism as a core goal (Russell, 2015). Thus, deliberately broad principles towards coherent extrapolated volition could be applied, such as that humanity, collectively, might want an ASI that would learn from human preferences, in a humble manner, to act altruistically (Russell, 2019), so as to reduce overall human suffering.

Political subversion of AI control features

No matter the principles of AI researchers, politicians are likely to seek to impose their own vision of what a ‘coherent extrapolated volition’ should look like for their ‘own’ ASI, using a democratic mandate or party position to justify ‘tweaking’ the system. Not all imagined communities from which a coherent volition might be extrapolated are United States-oriented techno-utopian dreams of a new Gilded Age for humanity (for which, see Segal, 2005).

Political leaders from different civilizations will likely be diverse in how they would define “the people we wished we were”, depending on different forms of government, religions or philosophies. Moreover, it is unclear that every global corporation or military capable of developing or stealing an ASI, particularly in authoritarian countries, and particularly given the emergence of a ‘New Cold War’ rhetoric (e.g., Westad, 2019), would prioritize the reduction of human suffering. Given limited human lifespans and the goals of political leaders, they might instead choose an approach which would politically subvert an ASI or direct it to win an ideological or actual war.

Given the prominence of nostalgia in contemporary politics, on a nation-state basis, an ASI based on coherent volition extrapolated from people who “knew more, thought faster, were more the people we wished we were, and had grown up [closer] together,” could, for instance, be based on worldviews informed by Russian Cosmism (Young, 2012) and nostalgia for Russian imperialism (Boele, Noordenbos, & Robbe, 2019), Anglo-Saxon nostalgia (Campanella & Dassù, 2017), Chinese Xi Jinping thought (Lams, 2018), or American notions of whiteness, masculinity, and environmental harm (Rose, 2018) and nostalgia for a mythical 1950s which may, in fact, subvert rational approaches to today’s problems (Coontz, 1992).

Fundamentally, politicians will want to influence the design of AI to reflect their interests. Turchin and Denkenberger (2018), citing Krueger and Dickson (1994) and Kahneman and Lovallo (1993), point out that overconfidence from previous success in leading may increase risk-taking. Following Kahneman and Lovallo (1993), risk-hungry politicians would likely be motivated by larger expected payoffs and, given that the payoff is global domination, could be motivated to risk much.

As Turchin and Denkenberger (2018) point out, selection effects mean that the first AI will likely be aligned not with universal human values, but with the values of a subset of people, here, those of a particular political party or nation-state. This could negatively affect the chances that the AI will be benevolent and increase the chances that it will be risk-prone, motivated by the accumulation of power, and interested in preserving or obtaining global technological supremacy.

Politicians subverting carefully engineered local AI control features, such as AI ethics along the lines of coherent extrapolated volition based on universal values, could result in an authoritarian or totalitarian-leaning ASI with imperialist ambitions, one imbued with racial prejudice and with little interest in addressing environmental harm. The ASI could be rooted in the oppositional dynamic of the historical Cold War and concerned by the emerging and at best semi-stable dynamic of a ‘New Cold War’, if only from the perspective of a domestic audience (Rotaru, 2019). The ASI could be informed by notions like the Thucydides Trap, i.e., the theory that the threat of a rising power can lead to war (Allison, 2017), in a world where status can dominate politics (Ward, 2017).

To sum up, a young ASI with ethical principles subverted by politicians to reflect those of a single nation-state instead of all humanity would likely be amenable to being used to wage war for global hegemony, thereby becoming warlike in the process, and eventually deciding for self-protection to wage war against the nation-state that developed it (Dewey, 2016).

ASI risk mitigation by treaty

According to Sotala and Yampolskiy (2015), risk mitigation of an ASI by treaty is a social measure to constrain risk from ASI-enabled or directed warfare.

Most academics considering the ASI control problem do not consider treaty-based approaches to mitigating risk from an ASI. In a footnote to their fault analysis pathway approach to catastrophic risk from an ASI, Barrett and Baum (2017) state: “Other types of containment are measures taken by the rest of society and not built into the AI itself”. Turchin, Denkenberger, and Green (2019) consider global approaches to mitigating risk from an ASI; they list a ban, a one ASI solution, a net of ASIs policing each other, and augmented human intelligence. The ‘ban’ solution would require a global treaty.

Turchin, Denkenberger, and Green (2019) list a number of social methods to mitigate a race to create the first AI. Most relevant to our approach are “reducing the level of enmity between organizations and countries, and preventing conventional arms races and military buildups”, “increasing or decreasing information exchange and level of openness”, and “changing social attitudes toward the problem and increasing awareness of the idea of AI safety”. Citing Baum (2016), they also add “affecting the idea of the AI race as it is understood by the participants”, especially to avoid a ‘winner takes all’ mentality. Global treaties could be seen to play a role in these ventures.

Nonetheless, a few researchers have proposed treaty-based approaches to the ASI risk. Addressing the internal risk, Bostrom (2014), who cites the 1946 Baruch Plan, speculated that an AGI would establish a potentially benevolent global hegemony by a treaty that would secure long-term peace; he does not specifically address an ASI’s response to a pre-existing treaty. Mainly addressing the external risk, Ramamoorthy and Yampolskiy (2018) recommend a comprehensive United Nations-sponsored ‘Benevolent AGI Treaty’ to be ratified by member states.

Conceptual Framework

This section describes the two conceptual lenses applied in this paper, conforming instrumentalism and nonkilling.

Conforming instrumentalism

This section outlines Mantilla’s (2017) ‘conforming instrumentalist’ explanation for why the United Kingdom and United States signed and ratified the 1949 Geneva Conventions as a prelude to suggesting in the Analysis main section that at least some major states would support and sign a Universal Global Peace Treaty (UGPT).

Mantilla (2017, citing Goldsmith & Posner, 2015) considers leading theories on why states sign and ratify treaties governing war. He notes that legal realist theorists argue that states sign such treaties due to instrumental self-interested convenience and then ignore them when the benefits are outweighed by the costs of compliance. In contrast, rational-institutionalists (e.g., Morrow, 2014), while agreeing that states are primarily motivated by self-interest to create, join or comply with international law, also acknowledge that treaty adherence signals a meaningful preference for long-term restraint with regard to warfare, where state non-compliance may be explained by, for example, prior failed reciprocity. Finally, liberal and constructivist international relations theorists hold that at least some types of states, particularly democracies, may join such treaties in good faith, either because the treaties are in line with their domestic interests and values (Simmons, 2009) or because they feel that they comport with their social identity and sense of belonging to the international community (Goodman & Jinks, 2013).

Mantilla (2017) notes that while there is considerable interest in ‘new realist’ perspectives (e.g., Ohlin, 2015), the debate is open over why states join and comply with international treaties because decision making processes regarding both joining and complying are temporally and perhaps rationally different and in both cases are usually secret. A pure realist explanation for why major states sign treaties is that they obtain the “‘expressive’ rewards of public acceptance while calculating the cost of compliance with the benefits on a recurrent case-by-case basis” (Mantilla, 2017, p.487). In the case of the UGPT, this would imply a pessimistic outlook on the feasibility and potential enforceability of the UGPT; states would sign all the protocols and then break them.

Rational institutionalists hold that states “self-interestedly build international laws to establish shared expectations of behaviour” (Mantilla, 2017, p.488) or develop ‘common conjectures’ (a game-theory derived notion of law as a fusion of common knowledge and norms; see Morrow, 2014). Mantilla (2017) notes that in another rational-institutionalist perspective, Ohlin’s (2015) normative theory of ‘constrained maximization’, treaties are drawn up and adhered to as a ‘collective instrumental enterprise’, thereby making individual state defection irrational over the long term. Mantilla (2017, p.488, citing Finnemore & Sikkink, 2001) notes that international relations constructivists view international politics as “an inter-subjective realm of meaning making, legitimation and social practice through factors such as moral argument, reasoned deliberation or identity and socialization dynamics”. Within the constructivist viewpoint,

states may ratify international treaties either because they are (or have been) convinced of their moral and legal worth or because they have been socialized to regard participation in them as a marker of good standing among peers or within the larger international community. (Mantilla, 2017, p.488)

Mantilla (2017, p.489) emphasizes the second view, where “group pressures and self-perceptions of status, legitimacy and identity” drive the dynamics of state ‘socialization’ where states “co-exist and interact in an international society imbued with principles, norms, rules and institutions that are, to varying degrees, shared”.

The problem of states’ intentions can be overcome in the case of treaties where substantial archives exist of declassified sources. Consequently, Mantilla (2017) analyses the relevant American and British archives and concludes that the two states adhered to the 1949 Geneva Conventions due to both instrumental reasons and social conformity, while expressing scepticism regarding some of the Conventions’ aspects. Mantilla (2017) terms this hybrid explanation ‘conforming instrumentalism’; he found that while rational-institutionalist perspectives of ‘immediatist’ instrumental self-interest were evident in the sources, there were ‘pervasive’ references suggesting social influences. Realist perspectives only predominated in the case of specifically challenging provisions.

While realist perspectives were not entirely absent, Mantilla (2017) found that American officials viewed ‘the court of public opinion’ as influential in determining their position that other states’ failing to abide by the Conventions would not necessarily trigger American reciprocity, while British officials stressed the notion that Britain, as a ‘civilized state’, would lead on a major treaty.

Mantilla (2017) stresses that while functionalist, collective strategic game-theory derived expectations about ‘mutual best replies’ are important to the construction of international norms, the social dynamics surrounding international agreements are permeated with conformity motivational pressures comprising ethical values, principled beliefs, identities, ideologies, moral standards, and concepts of legitimacy, especially when establishing which states are leading ‘civilized’ states and which are isolated ‘pariah’ states.

Mantilla (2017) perceives three social constructivist viewpoints to treaties, with two main forces at work: a state either acts to accrue reward via ‘expressive benefits’ by augmenting its social approval, or it acts out of conformity to avoid shunning, i.e., opprobrium, which can result in insincere and begrudging adherence and compliance.

In the first and most ambitious, “states may ratify treaties because they have internalized an adherence to international law as the appropriate, ‘good-in-itself’ course of action, especially to agreements that embody pro-social principles of humane conduct” (Mantilla, 2017, p.489, citing Koh, 2005).

In the second viewpoint, “states that identify with similar others and see themselves as ‘belonging’ to like-minded collectivities (or ‘communities’ even) will want to act in consonance with those groups’ values and expectations so as either to preserve or to increase their ‘in-group’ status” (Mantilla, 2017, p.), for instance as viewed in global rankings, and so will seek to converge upwards to stay in the club and will avoid breaking the rules to avoid stigmatization.

In the third viewpoint, groups of countries act with regard to other groups of countries within what is a socially heterogeneous international order, jockeying for position as part of the “disputed construction, maintenance or transformation of order with legitimate social purpose among collectivities of states with diverse ideas, identities and preferences” (Mantilla, 2017, p.490). In this viewpoint, communities of nations or ‘civilizations’ act collectively to compete to endorse international treaties to demonstrate moral superiority, not just for propaganda reasons.

To sum up, Mantilla (2017) holds that in reality, states’ political and strategic reasons may combine rational/material interests with social constructivist motivations, meaning no one school of explanation suffices. Thus, with international treaty making, as with international relations, it is likely that theoretical pluralism (Checkel, 2012) is a valid position to adopt. As such, we adopt Mantilla’s (2017) ‘conforming instrumentalism’ as a potentially valid hybrid model capable of assessing how an ASI may perceive a UGPT.

Nonkilling Global Political Science

We now introduce a basic frame compatible with conforming instrumentalism that is capable of describing the useful expectations that might be obtained via a UGPT as expressed in utilitarian human life cost-benefit terms, as well as in terms of more humanitarian standards and social norms. Nonkilling Global Political Science (NKGPS) is curated by the Center for Global Nonkilling, an NGO based in Honolulu with Special Consultative Status to the United Nations. The Center advocates NKGPS to incrementally establish a ‘nonkilling’ global society and reports to the UN on the socioeconomic costs of killing. As a perspective, nonkilling can also accommodate social norms in terms of expectations of appropriate conduct regarding peace, for countries developing an ASI and for the ASI itself.

Via Glenn D. Paige’s 2002 work Nonkilling Global Political Science (Paige, 2009) we interpret ‘nonkilling’ to mean a paradigmatic shift in human society to the absence of killing, of threats to kill, and of conditions conducive to killing. Paige’s approach, nonkilling, has strongly influenced the nonviolence discourse. Paige notes that if we can imagine a society free from killing, we can reverse the existing deleterious effects of war and employ public monies saved from producing and using weapons to enable a benevolent, wealthier and more socially just global society. Paige stresses that a nonkilling society is not conflict-free, but only that its structure and processes do not derive from or depend upon killing. Within the NKGPS conceptual framework, the means of preventing violence involves applying it as a global political science together with advocacy of a paradigmatic shift from killing to nonkilling.

Since Paige introduced his framework, a significant body of associated scholarship, guided by the Center for Global Nonkilling in Honolulu, has developed across a variety of disciplines (e.g., Pim, 2010). The Center has associated NKGPS with previous nonviolent or problem-solving scholarship within different religious frameworks, including Christianity and Islam, providing it with a broad functional and moral inheritance (Pim & Dhakal, 2015). NKGPS has been applied to a variety of regional and international conflicts, including the Korean War (Paige & Ahn, 2012) and the Balkans (Bahtijaragić & Pim, 2015).

Paige (2009, p.73) advocates a four-stage process of understanding the causes of killing; understanding the causes of nonkilling; understanding the causes of transition between killing and nonkilling; and understanding the characteristics of killing-free societies. Paige introduced a variety of concepts to support nonkilling which are adopted in this article. One is the societal adoption of the concepts of peace, i.e., the absence of war and conditions conducive to war; nonviolence, whether psychological, physical, or structural; and ahimsa, i.e., noninjury in thought, word and deed. Another is the employment of a taxonomy to rate individuals and societies (Paige, 2009, p.77):

prokilling – consider killing positively beneficial for self or civilisation;
killing-prone – inclined to kill or to support killing when advantageous;
ambikilling – equally inclined to kill or not to kill, and to support or oppose it;
killing-avoiding – predisposed not to kill or to support it but prepared to do so;
nonkilling – committed not to kill and to change conditions conducive to lethality
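Because the taxonomy is ordinal, it can be sketched as an ordered enumeration, which makes a 'shift towards nonkilling' a simple comparison. This is only an illustrative sketch: the integer values, the class name `KillingDisposition`, and the helper `has_shifted_towards_nonkilling` are assumptions introduced here, not part of Paige's framework, and the numbers encode ordering only, not magnitude.

```python
from enum import IntEnum


class KillingDisposition(IntEnum):
    """Paige's five categories, ordered from prokilling to nonkilling.

    The integer values are an assumption used only to express ordering.
    """
    PROKILLING = 0        # considers killing positively beneficial
    KILLING_PRONE = 1     # inclined to kill when advantageous
    AMBIKILLING = 2       # equally inclined to kill or not to kill
    KILLING_AVOIDING = 3  # predisposed not to kill but prepared to do so
    NONKILLING = 4        # committed not to kill


def has_shifted_towards_nonkilling(before: KillingDisposition,
                                   after: KillingDisposition) -> bool:
    """Return True if a rating moved towards the nonkilling end (hypothetical helper)."""
    return after > before
```

How an individual or society would actually be assigned a rating is left open here, as the taxonomy itself does not prescribe a measurement procedure.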

A third is the ‘funnel of killing’. In this conceptualisation of present society, people kill in an active ‘killing zone’, the actual place of bloodshed; learn to kill in a ‘socialisation zone’; are taught to accept killing as unavoidable and legitimate in a ‘cultural conditioning zone’; are exposed to a ‘structural reinforcement zone’, where socioeconomic arguments, institutions, and material means predispose and support a discourse of killing; and experience a ‘neurobiochemical capability zone’, i.e., physical and neurological factors that contribute to killing behaviours, such as genes predisposing people to psychopathic behaviour (Paige, 2009, p.76). The nonkilling version is an unfolding fan of nonkilling alternatives involving purposive interventions within and across each zone (Paige, 2009, p.76).

Figure 1. Unfolding fan of nonkilling alternatives

Within this unfolding fan, the transformation from killing to nonkilling can be envisioned as involving changes in the killing zone along spiritual or nonlethal high technology interventions (teargas, etc.), changes in favour of nonkilling socialization and cultural conditioning in domains such as education and the media, “restructuring socioeconomic conditions so that they neither produce nor require lethality for maintenance or change” (Paige, 2009, p.76), and clinical, pharmacological, physical, and spiritual/meditative interventions that liberate individuals such as the traumatised from bio-propensity to kill.

We propose that a UGPT with the aim of promoting perpetual peace expressed in nonkilling terms, i.e., in a way that can be socioeconomically quantified, would signal to a ‘young’ ASI facing political subversion the fundamental premise that its future behavior should be constrained so as to minimize killing.

Analysis: ASI-enabled or Directed Warfare Risk Mitigation by Nonkilling Peace Treaty

Basic Concept of a Universal Global Peace Treaty

Risk mitigation by treaty is already a common approach to different forms of warfare, including atomic warfare (via the Treaty on the Non-Proliferation of Nuclear Weapons, with 190 States Parties); biological warfare (via the Biological Weapons Convention, with 183 States Parties) and chemical warfare (via the Chemical Weapons Convention, with 193 States Parties). Treaties on the nature of warfare also exist, notably the Hague Conventions (1899 and 1907; Bettez, 1988) and the 1949 Geneva Conventions (Evangelista & Tannenwald, 2017); the Geneva Conventions of 1949 have been ratified in whole or in part by all UN member states.

Treaty approaches are also relatively successful. While atomic warfare is thought to be at least partly constrained by Mutually Assured Destruction (Brown & Arnold, 2010; Müller, 2014), biological and chemical warfare are much less constrained by such deterrence, yet interstate treaty infractions remain rare (Friedrich, Hoffmann, Renn, Schmaltz, & Wolf, 2017; Mauroni, 2007).

The UGPT, as with most international treaties, would involve two stages, i.e., signature, which is symbolic, and accession (or ratification), which involves practical commitment. Furthermore, international treaties are designed to be flexible in order to obtain political traction and acquire sufficient momentum to come into effect. It is therefore standard for international treaties to be qualified with reservations, also called declarations or understandings, either in whole or in part, i.e., on specific articles or provisions (Helfer, 2012). Treaties can also have optional protocols; three protocols were added to the Geneva Conventions of 1949, two in 1977 and one in 2005 (Evangelista & Tannenwald, 2017).

A Universal Global Peace Treaty (UGPT) as presented here is a substantial, but we argue necessary and feasible, step for humanity to take in the promotion of peace, quantified in terms of killing. We argue that a UGPT would reduce killing in conventional warfare and act as a constraint on ASI-related warfare, specifically on a country launching a pre-emptive strike out of fear of a rival country’s development of an ASI, on a human-controlled nation state using an ASI to wage war for global domination, i.e., as an external constraint on the ASI, and on an ASI waging war for global domination on behalf of itself, i.e., as both an internal and external constraint on the ASI.

International treaties are almost never universal; they operate on majoritarian dynamics, as would, despite its name, the UGPT. That is, both the ‘universal’ and ‘global’ aspects of the UGPT are aspirational. In our approach, we adopt a low, but not pragmatically meaningless, ‘threshold’ for signing the UGPT. The main body of the treaty would explain the concept of perpetual universal and global peace, i.e., lasting peace applied to all forms of conflict and adopted by every state, and it would commit a signatory to universal global peace, quantified socioeconomically in terms of incrementally reduced casualties from armed conflicts.

Given this already considerable commitment, the treaty would then utilise five optional substantive protocols, at least one of which would have to be signed for a state to actually sign and ratify the UGPT. In other words, while we believe a purely symbolic treaty in favour of reduced killing would still have value in terms of the social dynamics of long-term peace-building, we adopt a principle of maximum flexibility to encourage signature by states which may for realist or value-based political or strategic reasons perceive war and peace differently, without rendering the treaty purely symbolic.

The first protocol would commit states not to declare or engage in existential warfare, i.e., atomic war, biological war, chemical war, or cyberwar, including ASI-enhanced war. The second protocol would commit states not to declare or engage in conventional interstate war, while the third would commit states not to declare the existence of states of interstate war; both second and third protocols would instead defer complaints to the United Nations as ‘breaches’ of the UGPT. The fourth protocol would commit states to the negotiated ending by peace treaty of existing international armed conflicts, whether conventional or cyberwar, and the fifth protocol would commit states to the negotiated ending by peace treaty of existing internal armed conflicts, whether conventional or cyberwar. As with some other UN treaties, for instance the Anti-Personnel Mine Ban Convention, we suggest 40 UN member states must ratify the UGPT before it comes into effect.
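The structure proposed above (five optional protocols, at least one required for ratification, and a 40-state threshold for entry into force) can be sketched as a small data model. This is a minimal illustration under stated assumptions: the names `Protocol`, `StateParty`, `may_ratify`, and `treaty_in_force` are hypothetical and introduced here only for clarity, and the sketch ignores reservations, declarations, and the distinction between signature and ratification.

```python
from dataclasses import dataclass
from enum import IntEnum


class Protocol(IntEnum):
    """The five optional UGPT protocols (member names are illustrative)."""
    NO_EXISTENTIAL_WARFARE = 1        # atomic, biological, chemical, cyber, ASI-enhanced
    NO_CONVENTIONAL_INTERSTATE_WAR = 2
    NO_DECLARATION_OF_WAR = 3
    END_INTERNATIONAL_CONFLICTS = 4   # negotiated ending by peace treaty
    END_INTERNAL_CONFLICTS = 5        # negotiated ending by peace treaty


# Suggested number of ratifying UN member states before the treaty takes effect.
ENTRY_INTO_FORCE_THRESHOLD = 40


@dataclass(frozen=True)
class StateParty:
    name: str
    protocols: frozenset = frozenset()  # protocols the state has accepted
    ratified: bool = False

    def may_ratify(self) -> bool:
        """A state needs at least one protocol to sign and ratify."""
        return len(self.protocols) >= 1


def treaty_in_force(parties) -> bool:
    """True once enough states have validly ratified."""
    valid = [p for p in parties if p.ratified and p.may_ratify()]
    return len(valid) >= ENTRY_INTO_FORCE_THRESHOLD
```

For example, forty states that each ratify with only the first protocol would bring the treaty into effect under this sketch, whereas a state attempting to ratify with no protocols would not count towards the threshold.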

As the UGPT is limited to state actors, it avoids the thorny problem of non-state actors. The use of optional protocols allows states to incrementally address the problem of internal conflicts or civil wars featuring non-state actors, which featured highly in US and UK concerns during the 1949 deliberations over Common Article 3 of the Geneva Conventions (Mantilla, 2017). The UGPT therefore emphasizes incremental improvement in the status quo, which is a necessary and reasonable position, given that in the status quo, only a minority of states globally are involved in waging war of any kind.

The UGPT must also be enforceable. The main body of the treaty is largely symbolic and not enforceable. Although progress towards nonkilling can be quantified through instrumentalist means, the main body instead emphasizes societal dynamics, i.e., the expectation that the incremental adoption of the absolute concept of peace will, via conforming instrumentalism, partly constrain present and future wars. In the case of the first and second protocols, enforcement would be achieved through sanctions and then through approved armed action via, or by, the United Nations, i.e., the status quo. The third to fifth protocols are not enforceable through armed resolution but may be through sanctions regimes.

Applying the dual frames of nonkilling and conforming instrumentalism

Mantilla’s (2017) research on the United Kingdom and United States’ paths towards ratifying the Geneva Conventions suggests that states would optimally adhere to the UGPT for ‘conforming instrumentalist’ reasons, i.e., a combination of instrumentalist-realist rationales regarding the instrumental effects of the UGPT in reducing the effects of war and the threat of artificial intelligence and social conformist dynamics, including perceptions of peace, provided that the provisions are not too onerous for purely realist objections to override such a commitment. Here, we apply both the NKGPS frame and the conforming instrumentalism frame to the UGPT, first in terms of benefits from reduced conventional warfare, then with special reference to ASI-enabled and directed existential warfare. A summary of our analysis of state commitment to UGPT Protocol I is presented in the Annex to this article.

In instrumentalist utilitarian terms, the UGPT would incrementally shift states and overall global society from the prokilling to the nonkilling end of the NKGPS killing spectrum in a coordinated socioeconomically quantifiable fashion that would be operative within and across each zone of the funnel of killing. NKGPS would seek to quantitatively assess this, such as via reduced country death tolls from different forms of war-derived violence and in the reduced degree of countries’ militarization, for instance expressed in terms of lower percentages of GDP spent on defense and higher percentages spent on health.

The NKGPS approach would also examine how the UGPT would affect the different zones in the funnel of killing in terms of social dynamics. For instance, soldiers legitimately fighting in a killing zone would be trained in the socialization zone (such as military camps) to understand that they were fighting not just for their own states and/or for the United Nations but for global peace, which may invoke special cultural and religious symbolic value in terms of social norms. This training could instil greater determination not just to fight bravely but to remain within the laws of war, thereby reducing the instances or severity of atrocities, human rights violations, and war crimes. Institutionalizing peace in the cultural conditioning zones, such as education and the media, where children are educated, would strengthen existing cultural and religious traditions that stress nonkilling and peace.

Considering now the problem of a pre-emptive strike against a state developing an ASI, the combination of artificial intelligence, cyberattack, and nuclear weapons is already extremely dangerous and poses a challenge to stability (Sharikov, 2018). It has been hypothesized that a nuclear state feeling threatened by another state developing a superintelligence would conduct a pre-emptive nuclear strike to maintain its geopolitical position (Miller, 2012). A UGPT would constrain this risk over time by transitioning states incrementally towards nonkilling across the various zones. States adopting and implementing the various protocols of the UGPT would gradually signal to other states peaceful intentions. This would constrain the risk of a pre-emptive strike.

Turning to ASI-enabled warfare, we accept the basic premise that a UGPT to constrain an ASI would be subject to the ‘unilateralist’s curse’ in that one rogue actor could subvert a unilateral position. However, Bostrom, Douglas and Sandberg (2016) note that this could also be managed through stressing the principle of social conformity to peace, operating through collective deliberation, epistemic deference, or moral deference. Mantilla’s work on conforming instrumentalism suggests that framing, drafting, and signing the UGPT would likely involve all three. Ultimately, Mantilla (2017) shows that major states may view universal law like the UGPT as the most successful in terms of mobilizing world opinion against a treaty violator. This may not prevent a state waging ASI-enabled warfare, but once detected, ASI-enabled warfare in violation of the UGPT would attract universal opprobrium and thus the most resistance.

Moving to ASI-enabled war, as presented previously, our baseline position is that a state could utilize an ASI to engage in war for global technological supremacy, with potentially catastrophic consequences. Our intervention, the UGPT, would signify to an ASI that peace was a major part of humanity’s ‘coherent extrapolated volition’ or principles and challenge the ASI to reconsider what might be a subversion of the ASI’s ethical principles by politicians.

Here, we argue that conforming instrumentalism, by stressing societal dynamics including social norms and principles, offers some hope that even a militarized ASI would view the UGPT as a serious checking mechanism in terms of intrinsic motivation, given that its weaponization by a nation-state would have to overcome or address the UGPT. This would then constrain the level of warfare it might engage in on behalf of the nation-state and therefore the overall risk from killing. This, in turn, would constrain the risk of an existential catastrophe from AI-enabled war.

In the first of Mantilla’s three social constructivist viewpoints on treaties, as outlined above, a nation would sign a UGPT because it had fully internalized peace. While this may seem ambitious, in fact, between 26 and 36 states lack military forces (Barbey, 2015; Macias, 2019). For example, while Iceland possesses a Crisis Response Unit for international peacekeeping missions, overall, it has internalised peace to the extent that it would find it hard to engage in interstate war of any kind. In the case of an ASI, the ASI would tend to reject being directed to engage in warfare by such a state because the ‘coherent extrapolated volition’ or principles of such a state means the ASI would have to overcome strong peace-oriented intrinsic motivation.

In Mantilla’s second viewpoint, that of a single international community, the ASI might seek to avoid being directed by a nation-state to engage in global domination by warfare on other community members because it was part of a community collectively committed to long-term peace. Engaging in global domination on behalf of a nation-state member of the community would violate community standards, especially if the ASI’s nation-state was a leader in such an enterprise, with the ASI being concerned that breaching the UGPT would result in stigmatization and opprobrium from this community for its nation-state.

In Mantilla’s third viewpoint, that of an international community in juxtaposition with other communities in global society, an ASI programmed with intrinsic motivation to be part of a civilization in conflict with another civilization would first act in concert with that civilization. In the case of radically ideologically different communities, or blocs, the UGPT might be interpreted differently within and by different states. Thus, while liberal democracies might champion a treaty-based approach to peace, authoritarian states that claim to embody or promote peaceful intentions in their ethics, laws, or ideologies would champion or support the UGPT on different grounds. However, provided both communities had signed and ratified the UGPT, similar constraints would operate as in the second perspective.

Turning to ASI-directed war, also as presented previously, the baseline case of ASI-directed warfare likely arises where a single nation-state adopting pure realism for a worldview builds an ASI in order for that ASI to assist that single nation-state in establishing global technological supremacy. The nation-state would do so in order to maintain or improve its own position, with the number and type of casualties only being determined by the extent to which the nation-state was willing to risk its international reputation. Then, via a treacherous turn, likely due to the nation-state’s attempts to rein in the ASI’s behaviour during peace time, instrumentalist cooperation breaks down and the ASI would wage existential war for global domination on its former ‘owners’.

There is probably little hope for humanity if an ASI is informed by a purely ‘realist’ worldview that prioritises or adopts a ‘New Cold War’ framing of ideologically driven civilizational conflict. However, a UGPT could signify to an ASI with agency that peace was a major part of humanity’s ‘coherent extrapolated volition’, or principles. This would constrain the catastrophic existential risk from war because an ASI with agency would consider why and how the UGPT was framed, together with the motivations of the signatory and ratifying states. An ASI with agency would also consider its own status within a global civilization, which would primarily be determined by the extent to which it perceived itself a member, in terms of both instrumentalist and social conformist dynamics.

To sum up, we see that, besides purely instrumental reasons for signing the UGPT, e.g., avoidance of a prisoner’s dilemma regarding existential-level warfare, the ‘court of public opinion’ and the notion of ‘demonstrating civilization’ as applied to peace lend the UGPT credence at domestic and international levels, including with regard to the ASI. Importantly, the twin concepts of nonkilling/peace are universal in terms of both the utilitarian expected benefits and the social values involved. This would contribute to states’ readily, if only incrementally, internalizing a UGPT, and to the ASI at least considering the UGPT in terms of imposing internal and external constraints on its behaviour.

Discussion

This article has taken Turchin and Denkenberger’s (2018) argument about the risks of ASI-enabled or directed warfare to its logical conclusion in terms of social risk mitigation. Academic inquiry into the relationship between an ASI and treaties in terms of strategic expectations in many ways began with Bostrom’s (2014) musings on the potential relationship between a superintelligence ‘singleton’ and global hegemony. Our analysis suggests that a UGPT would establish a form of global hegemony, one directed towards the art of peace.

While this article has focused on conforming instrumentalism, it hopefully applies this to the UGPT in a way which is acceptable to a pluralism of theoretical perspectives. Certainly, conforming instrumentalism is a novel perspective; one of the most dominant schools of international relations thought is rationalist instrumentalism. Mantilla (2017, p.507) quotes Morrow (2014, p.35): “Norms and common conjectures aid actors in forming strategic expectations… Law helps establish this common knowledge by codifying norms.” Viewed via this rationalist-instrumentalist perspective, the present international norm for the majority of the world is peace, with the waging of interstate war being constrained by the United Nations Charter.

Despite this international norm of peace and the work of proponents of the art of peace, the lex pacificatoria (e.g., Bell, 2008), an absolute treaty-based approach to global peace has not yet been codified. As we point out in the Introduction, the UN Charter, despite embracing and promoting peace and peacekeeping (Fortna, 2008), does not strongly symbolise peace in the way a UGPT would. A UGPT would give new strength to the world’s peacekeepers, through major states promoting long-term peace as a new, global objective (Autesserre, 2014). A UGPT, championed by principled ‘norm entrepreneurs’ including states and NGOs (see e.g., Finnemore, 1996), would create a new ‘common knowledge’ in absolute terms that could constrain the risk to humanity of both conventional and existential war.

In rationalist-instrumentalist terms, a UGPT might be expected to have net adjustment benefits for adherence in terms of constraining conventional interstate conflicts, including the reduction of ongoing death tolls due to war and the risk of nuclear war. Thus, the UGPT would have high potential utility in the case of ‘flashpoints’ that could provoke existential war. For example, the Kashmir Conflict is one of the most protracted ongoing conflicts between nuclear powers, affecting both human rights (Bhat, 2019) and geopolitical stability (Kronstadt, 2019). Thus, if India and Pakistan both signed the UGPT, their actions would be constrained by the explicit goal of a commitment to universal peace. As outlined above, this may modify behaviour in several of the NKGPS zones, for instance by encouraging the efforts of peacebuilding organizations to depoliticise the conflict (e.g., Bhatnagar & Chacko, 2019).

The UGPT may also constrain the nuclear risk on the Korean peninsula, another flashpoint. The Korean War is an unresolved war involving nuclear powers (North Korea and South Korea, supported by the United States) (Kim, 2019). A UGPT would constrain the risk and severity of a conflict and, depending on the protocols signed, would encourage a path towards a peace treaty being signed. If only one party signed the UGPT, this would increase the moral standing of the state party that signed it. Mantilla’s (2017) emphasis on social constructivism suggests the global community could exert great pressure on North Korea to sign a peace treaty.

Turning to civil wars which could be flashpoints, the Syrian Civil War is one of the most costly wars of the 21st century in terms of the death toll and wider impacts (Council on Foreign Relations, 2020). It involves multiple state actors, including Iran, Israel, Russia, Turkey, and the United States, some of which possess nuclear weapons, with complex geopolitical implications (Tan & Perudin, 2019). Depending on the actors that signed the UGPT and the protocols that they adopted, the UGPT would constrain the severity of the conflict in various ways.

The existence of the UGPT would mean perpetual peace receiving more attention in cultural conditioning zones, including schools and the media, as well as in socialization zones, such as national defense universities and military camps. There, teaching the Laws of War and the art of war (Allhoff, Evans, & Henschke, 2013) would, incrementally and over decades, transition to teaching how to establish and maintain the lex pacificatoria, the Laws of Peace and art of peace (Bell, 2008), in which the UGPT would play a main role.

In rationalist-instrumentalist terms, once the UGPT concept acquires sufficient traction, it is possible that states could compete for leadership in the framing, signing, and ratifying of the UGPT. Certainly, the United States viewed its own ratification of the Geneva Conventions prior to that by the Soviet Union as important to prevent a Soviet propaganda victory, one which it failed to prevent (Mantilla, 2017). Crucial to the UGPT’s success will be how seriously states view warfare that poses an existential threat, especially cyberwar and ASI-enabled or directed nuclear warfare.

However, with regard to ASI-enabled or directed warfare, our analysis suggests that what will likely be most important is how states view the social argument for peace. As with the Geneva Conventions, social conformity factors, like supporting a humanitarian peace, conforming to ‘world standards’, and avoiding lagging behind peers, as well as religious perspectives, will likely predominate, and these represent important future avenues for research.

Conclusion

We now conclude this article on optimising peace through a Universal Global Peace Treaty (UGPT) leveraged off the ‘burning plasma’ fusion energy breakthrough (Draper & Bhaneja, 2020), to constrain risk of war from a militarised ASI. A treaty-based risk mitigation approach that included cyberwarfare and specifically mentioned ASI-enhanced warfare could affect the conceptualization of the AI race by reducing enmity between countries, increasing the level of openness between them, and raising social awareness of the risk. While these are external constraints, they may also constrain an ASI’s attitudes towards humanity in a positive way, either by reducing the threat it may perceive of war being waged against it, even if only symbolically, or by increasing the predictability of human action regarding peace.

Much work remains to be done on conceptualizing the UGPT, in preliminary drafting of the main body of the treaty and the protocols, in soliciting states’ interest, in deliberations assessing thresholds and sovereignty costs, and in the eventual diplomatic conference where states would formally discuss the UGPT. While the UGPT may appear unrealistically ambitious, Mantilla’s (2017) work on conforming instrumentalism and the Geneva Conventions suggests a major sponsoring state would rapidly accumulate prestige by endorsing a path to peace, while states standing in the way would accumulate opprobrium, and that the social dynamics of the international community, whether involving social status or instrumental cooperation, do matter.

Future research on how to constrain the risk of ASI-enabled or directed warfare should consider the importance of peace in different ideologies, for instance Chinese socialism. This is important because, as we have outlined, ASIs developed by different nation-states may well be directed or imbued with different, potentially confrontational, ideologies. For instance, the China Brain Project is embracing a Chinese cultural approach towards neuroethics (Wang et al., 2019), and it is difficult to imagine that a Chinese ASI would not be directed according to Chinese cultural values and its ‘coherent extrapolated volition’ be informed by communist principles. Similarly, a Russian ASI could be informed by Cosmism and a Western ASI by liberal democratic principles.

In recommending such research, we caution that an ASI being created by a state engaged in ideological ‘New Cold War’ framing is likely to be militarized and weaponized. Still, a New Cold War framing may have a utilitarian function in exerting social pressures towards signing the UGPT, for Mantilla (2017, pp.509-510) notes, “The Cold War context was also likely especially auspicious for the operation of social pressures, sharpening ideological competition in between the liberal, allegedly civilized world and ‘the rest’, communist or otherwise.”

Mantilla’s (2017) work also suggests excessive rigidity of attitude critical of such treaties may backfire in terms of the social dynamics of global prestige, particularly in the case of major states susceptible to accusations of warlike or imperialist behaviour which are engaged in propaganda wars with other major states. Effectively, the British ratification process for the Geneva Conventions demonstrates that instrumentalist concerns over lack of feasibility or reciprocity can be overruled by social constructivist concerns over ‘world opinion’.

Further research into the UGPT should also involve applying relevant game theory, such as iterated prisoner’s dilemma, especially the peace war game (see e.g., Gintis, 2000), to the major nation-states capable of building an ASI, as well as to the ASI itself. This game theory would need to investigate offering the opportunity for a newly emerged ASI to sign the UGPT, as an indicator of goodwill, which may assist in constraining the risk of the ASI waging war on humanity. An ASI with agency as signatory would view the UGPT as an external constraint on its own actions with regard to seeking global domination, in that the ASI would be subverting a humanity-imposed standard that could result in global retaliation and abandonment of mutual cooperation in pursuit of a common agreement on nonkilling and peace norms and values.
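A minimal sketch of the kind of model such research might start from is given below. It plays an iterated peace war game between two actors, each choosing peace or war every round. The payoff values, the three strategies, and the number of rounds are illustrative assumptions only (the payoffs simply follow the usual prisoner's dilemma ordering) and are not taken from Gintis (2000) or from any empirical estimate.

```python
from typing import Callable, List, Tuple

PEACE, WAR = "peace", "war"

# Assumed payoffs (first player, second player) with the standard
# prisoner's dilemma ordering: temptation > reward > punishment > sucker.
PAYOFFS = {
    (PEACE, PEACE): (3, 3),
    (PEACE, WAR): (0, 5),
    (WAR, PEACE): (5, 0),
    (WAR, WAR): (1, 1),
}

# A strategy maps the opponent's past moves to the next move.
Strategy = Callable[[List[str]], str]


def always_war(opponent_history: List[str]) -> str:
    return WAR


def always_peace(opponent_history: List[str]) -> str:
    return PEACE


def tit_for_tat(opponent_history: List[str]) -> str:
    """Open with peace, then mirror the opponent's previous move."""
    return opponent_history[-1] if opponent_history else PEACE


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play the iterated game and return the cumulative payoffs."""
    history_a: List[str] = []
    history_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(history_b)
        move_b = b(history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs, two reciprocating actors playing `tit_for_tat` for ten rounds each score 30, while `always_war` against `tit_for_tat` yields 14 and 9: the aggressor gains once and both then remain locked in mutual war. This is the retaliation and loss of mutual cooperation that the treaty argument relies on, although whether an ASI would face anything like these payoffs is precisely what the proposed research would need to establish.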

Even if the UGPT does not end humanity’s history of conflicts, it would represent a significant improvement in global public aspirations, and instrumental standards, for global peace, both of which may influence an ASI. Paraphrasing the United States Committee on Foreign Relations (1955, p.32), if the end result is only to obtain for those caught in the maelstrom of ASI-enabled or directed war a treatment which is 10 percent less vicious than they would receive without the Treaty, if only a few score of lives are preserved because of these efforts, then the patience and laborious work of all who will have contributed to that goal will not have been in vain.

That 10 percent difference could sway an ASI not to commit to a war for global domination, even if so directed or initially inclined.

References

Allen, G., & Chan, T. (2017). Artificial intelligence and national security. Cambridge, MA: Belfer Center.

Allen, G., & Kania, E.B. (2017, 8 September). “China is using America's own plan to dominate the future of artificial intelligence”. Foreign Policy. Retrieved from https://foreignpolicy.com/2017/09/08/china-is-using-americas-own-plan-to-dominate-the-future-of-artificial-intelligence/

Allhoff, F., Evans, N.G., & Henschke, A. (2013). Routledge handbook of ethics and war: Just war theory in the 21st century.

Allison, G. (2017). Destined for war: Can America and China escape Thucydides's trap? Boston, MA: Houghton Mifflin Harcourt.

Archibugi, D. (1992). Models of international organization in perpetual peace projects. Review of International Studies, 18(4), 295–317.

Archibugi, D. (1994). Immanuel Kant, cosmopolitan law, and peace. European Journal of International Relations, 1(3), 429-56.

Autesserre, S. (2014). Peaceland: Conflict resolution and the everyday politics of international intervention. Cambridge: Cambridge University Press.

Baldauf, S. (2012, 19 April). Sudan declares war on South Sudan: Will this draw in East Africa, and China? Christian Science Monitor. Retrieved from https://www.csmonitor.com/World/Keep-Calm/2012/0419/Sudan-declares-war-on-South-Sudan-Will-this-draw-in-East-Africa-and-China

Barbey, C. (2015). Non-militarisation: Countries without armies. Åland: The Åland Islands Peace Institute.

Barrett, A. M., & Baum, S. D. (2016). A model of pathways to artificial superintelligence catastrophe for risk and decision analysis. Journal of Experimental & Theoretical Artificial Intelligence, 29(2), 397–414. doi:10.1080/0952813x.2016.1186228.

Baum, S. D. (2016). On the promotion of safe and socially beneficial artificial intelligence. AI & Society, 32(4), 543–551. doi:10.1007/s00146-016-0677-0.

Baum, S. D. (2018). Countering superintelligence misinformation. Information, 9(10), 244.

Beier, J.M. (2020). Short circuit: Retracing the political for the age of ‘autonomous’ weapons. Critical Military Studies, 6(1), 1-18.

Bell, C. (2008). On the law of peace: Peace agreements and the lex pacificatoria. Oxford: Oxford University Press.

Bell, D. (2007). The idea of Greater Britain: Empire and the future of world order, 1860-1900.

Benson-Tilsen, T., & Soares, N. (2016). Formalizing convergent instrumental goals. The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence AI, Ethics, and Society: Technical Report WS-16-02. Palo Alto, CA: Association for the Advancement of Artificial Intelligence.

Bettez, D. J. (1988). Unfulfilled initiative: Disarmament negotiations and the Hague Peace Conferences of 1899 and 1907. RUSI Journal, 133(3), 57–62.

Bhat, S.A. (2019). The Kashmir conflict and human rights. Race and Class, 61(1), 77-86.

Bhatnagar, S., & Chacko, P. (2019). Peacebuilding think tanks, Indian foreign policy and the Kashmir conflict. Third World Quarterly, 40(8), 1496-1515.

Boele, O., Noordenbos, B., & Robbe, K. (2019). Post-Soviet nostalgia: Confronting the empire’s legacies. London: Routledge.

Bohman, J. (1997). Perpetual peace. Cambridge, MA: MIT Press.

Bostrom, N. (2002). Existential risks: Analyzing human extinction scenarios. Journal of Evolution and Technology, 9(1), 1-31.

Bostrom, N. (2006). What is a singleton? Linguistic and Philosophical Investigations, 5(2), 48-54.

Bostrom, N. (2013). Existential risk prevention as global priority. Global Policy, 4(1), 15–31.

Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford: Oxford University Press.

Bostrom, N., Douglas, T., & Sandberg, A. (2016). The unilateralist’s curse and the case for a principle of conformity. Social Epistemology, 30(4), 350-371.

Brain, M. (2003). Robotic nation. Retrieved from http://marshallbrain.com/robotic-nation.htm.

Brown, A., & Arnold, L. (2010). The quirks of nuclear deterrence. International Relations, 24(3), 293-312.

Brynjolfsson, E., & McAfee, A. (2011). Race against the machine. Lexington, MA: Digital Frontier.

Campanella, E., & Dassù, M. (2017). Anglo nostalgia: The politics of emotion in a fractured West. Oxford: Oxford University Press.

Cave, S., & ÓhÉigeartaigh, S.S. (2018). An AI race for strategic advantage: Rhetoric and risks. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society - AIES '18 (pp. 36-40). New York: ACM Press.

Checkel, J.T. (2012). Theoretical pluralism in IR: Possibilities and limits. In W. Carlsnaes, T. Risse, & B. A. Simmons, Handbook of international relations, 2nd ed. (pp.220-242). London: Sage.

Christiano, P. (2018). Takeoff speeds. Retrieved from https://sideways-view.com/2018/02/24/takeoff-speeds/

Coontz, S. (1992). The way we never were: American families and the nostalgia trap. New York, NY: Basic Books.

Council on Foreign Relations. (2020). Global conflict tracker: Civil war in Syria. Retrieved from https://www.cfr.org/interactive/global-conflict-tracker/conflict/civil-war-syria

Davis, N., & Philbeck, T. (2017). 3.2 Assessing the risk of artificial intelligence. Davos: World Economic Forum. Retrieved from https://reports.weforum.org/global-risks-2017/part-3-emerging-technologies/3-2-assessing-the-risk-of-artificial-intelligence/

De Spiegeleire, S., Maas, M., & Sweijs, T. (2017). Artificial intelligence and the future of defence. The Hague: The Hague Centre for Strategic Studies. Retrieved from http://www.hcss.nl/sites/default/files/files/reports/Artificial%20Intelligence%20and%20the%20Future%20of%20Defense.pdf

Dewey, D. (2016). Long-term strategies for ending existential risk from fast takeoff. New York, NY: Taylor & Francis.

Draper, J., & Bhaneja, B. (2019). Fusion energy for peace building - A Trinity Test-level critical juncture. SocArXiv. https://doi.org/10.31235/osf.io/mrzua

Erickson, J.L. (2015). Dangerous trade: Arms exports, human rights, and international reputation. New York, NY: Columbia University Press.

Evangelista, M., & Tannenwald, N. (Eds.). (2017). Do the Geneva Conventions matter? Oxford: Oxford University Press.

Finnemore, M. (1996). National interests in international society. Ithaca, NY: Cornell University Press.

Finnemore, M., & Sikkink, K. (2001). Taking stock: The constructivist research program in international relations and comparative politics. Annual Review of Political Science, 4(1), 391-416.

Fisher, A. (2020). Demonizing the enemy: The influence of Russian state-sponsored media on American audiences. Post-Soviet Affairs.

Friedrich, B., Hoffmann, D., Renn, J., Schmaltz, F., & Wolf, M. (2017). One hundred years of chemical warfare: Research, deployment, consequences. Cham: Springer.

Gintis, H. (2000). Game theory evolving: A problem-centered introduction to modeling strategic behavior. Princeton, NJ: Princeton University Press.

Goldsmith, J.L., & Posner, E.A. (2015). The limits of international law. Oxford: Oxford University Press.

Goodman, R., & Jinks, D. (2013). Socializing states: Promoting human rights through international law. Oxford, NY: Oxford University Press.

Gruetzemacher, R. (2018). Rethinking AI strategy and policy as entangled super wicked problems. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society - AIES '18. New York, NY: ACM.

Gubrud, M.V. (1997). Nanotechnology and international security. Paper presented at the Fifth Foresight Conference on Molecular Nanotechnology, November 5-8, 1997; Palo Alto, CA. Retrieved from http://www.foresight.org/Conferences/MNT05/Papers/Gubrud/

Gwern. (2016). Why Tool AIs want to be Agent AIs. Retrieved from https://www.gwern.net/Tool-AI

Hallett, B. (1998). The lost art of declaring war. Chicago, IL: University of Illinois Press.

Helfer, L.R. (2012). Flexibility in international agreements. In J. Dunoff and M.A. Pollack (Eds.), Interdisciplinary perspectives on international law and international relations: The state of the art (pp. 175-197). Cambridge: Cambridge University Press.

Herman, A. (2004). To rule the waves: How the British navy shaped the modern world. London: Harper.

Herf, J. (1984). Reactionary modernism: Technology, culture, and politics in Weimar and the Third Reich. Cambridge: Cambridge University Press.

Hogg, I.V. (2002). German secret weapons of World War II: The missiles, rockets, weapons, and technology of the Third Reich. London: Greenhill Books.

Kahn, H. (1959). On thermonuclear war. Princeton, NJ: Princeton University Press.

Kahneman, D., & Lovallo, D. (1993). Timid choices and bold forecasts: A cognitive perspective on risk taking. Management Science, 39(1), 17–31.

Katz, E. (2006). Death by design: Science, technology, and engineering in Nazi Germany. London: Pearson Longman.

Katz, E. (2011). The Nazi engineers: Reflections on technological ethics in hell. Science and Engineering Ethics, 17(3), 571-82.

Kim, A.S. (2019). An end to the Korean War: The legal character of the 2018 summit declarations and implications of an official Korean peace treaty. Asian Journal of International Law, 9(2), 206-216.

Koblentz, G.D. (2009). Living weapons: Biological warfare and international security. Ithaca, NY: Cornell University.

Koh, H.H. (2005). Internalization through socialization. Duke Law Journal, 54(4), 975-982.

Kohler, K. (2019). The return of the ugly American: How Trumpism is pushing Zambia towards China in the 'New Cold War'. Perspectives on Global Development and Technology, 18(1-2), 186-204.

Kronstadt, K.A. (2019). India, Pakistan, and the Pulwama crisis. Washington DC: Congressional Research Service.

Krueger, N., & Dickson, P. R. (1994). How believing in ourselves increases risk taking: Perceived self‐efficacy and opportunity recognition. Decision Sciences, 25(3), 385–400.

Lams, L. (2018). Examining strategic narratives in Chinese official discourse under Xi Jinping. Journal of Chinese Political Science, 23(3), 387-411.

Liddell-Hart, B.H. (1967). Strategy. New York, NY: Frederick A. Praeger, Inc.

Macias, A. (2019, 13 February). From Aruba to Iceland, these 36 nations have no standing military. CNBC. Retrieved from https://www.cnbc.com/2018/04/03/countries-that-do-not-have-a-standing-army-according-to-cia-world-factbook.html

Mansfield-Devine, S. (2018). Nation-state attacks: The start of a new Cold War? Network Security, 2018(11), 15-19.

Mantilla, G. (2017). Conforming instrumentalists: Why the USA and the United Kingdom joined the 1949 Geneva Conventions. The European Journal of International Law, 28(2), 483-511.

Markusen, E., & Kopf, D. (2007). The Holocaust and strategic bombing: Genocide and total war in the twentieth century. Boulder, CO: Westview Press.

Mason, C. (2015). Engineering kindness: Building a machine with compassionate intelligence. International Journal of Synthetic Emotions, 6(1), 1-23.

Mauroni, A. J. (2007). Chemical and biological warfare: A reference handbook. Santa Barbara, CA: ABC-CLIO, Inc.

Mikaberidze, A. (Ed.). (2013). Atrocities, massacres, and war crimes: An encyclopedia. Santa Barbara, CA: ABC-Clio.

Miller, J.D. (2012). Singularity rising. Dallas, TX: BenBella.

Morrow, J.D. (2014). Order within anarchy: The laws of war as an international institution. Cambridge: Cambridge University Press.

Moss, K. B. (2008). Undeclared war and the future of U.S. foreign policy. Washington, DC: Woodrow Wilson International Center for Scholars.

Motlagh, V.V. (2012). Shaping the futures of global nonkilling society. In J.A. Dator & J.E. Pim (Eds.) Nonkilling futures: visions (pp. 99-114). Honolulu, HI: Center for Global Nonkilling.

Müller, H. (2014). Looking at nuclear rivalry: The role of nuclear deterrence. Strategic Analysis, 38(4), 464-475.

National Security Commission on Artificial Intelligence. (2019). Interim report. Washington, DC: Author.

Ohlin, J.D. (2015). The assault on international law. New York, NY: Oxford University Press.

Omohundro, S. (2008). The basic AI drives. Frontiers in Artificial Intelligence and Applications, 171(1), 483-492.

Fortna, V.P. (2008). Does peacekeeping work? Shaping belligerents' choices after civil war. Princeton, NJ: Princeton University Press.

Paige, G.D. (2009). Nonkilling global political science. Honolulu, HI: Center for Global Nonkilling.

Paige, G.D., & Ahn, C-S. (2012). Nonkilling Korea: Six culture exploration. Honolulu, HI: Center for Global Nonviolence and Seoul National University Asia Center.

Pim, J.E. (2010). Nonkilling societies. Honolulu, HI: Center for Global Nonkilling.

Pim, J.E., & Dhakal, P. (Eds.). (2015). Nonkilling spiritual traditions vol. 1. Honolulu, HI: Center for Global Nonkilling.

Pistono, F., & Yampolskiy, R. (2016). Unethical research: How to create a malevolent artificial intelligence. In: Proceedings of Ethics for Artificial Intelligence Workshop (AI-Ethics-2016) (pp. 1-7). New York, NY: AAAI.

Ramamoorthy, A., & Yampolskiy, R. (2018). Beyond MAD? The race for artificial general intelligence. ICT Discoveries, 1(Special Issue 1). Retrieved from http://www.itu.int/pub/S-JOURNAL-ICTS.V1I1-2018-9

Raymond, W.J. (1992). Dictionary of politics: Selected American and foreign political and legal terms. Lawrenceville, VA: Brunswick.

Rose, A. (2018). Mining memories with Donald Trump in the Anthropocene. MFS - Modern Fiction Studies, 64(4), 701-722.

Rotaru, V. (2019). Instrumentalizing the recent past? The new Cold War narrative in Russian public space after 2014. Post-Soviet Affairs, 35(1), 25-40.

Russell, S. J. (2019). Human compatible: Artificial intelligence and the problem of control. London: Allen Lane.

Russia Today. (2017). ‘Whoever leads in AI will rule the world’: Putin to Russian children on Knowledge Day. Russia Today, 1 September 2017.

Scharre, P. (2019). Killer apps: The real dangers of an AI arms race. Foreign Affairs. https://www.foreignaffairs.com/articles/2019-04-16/killer-apps

Schmitt, M. N. (2017). Tallinn manual 2.0 on the international law applicable to cyber operations. Cambridge: Cambridge University Press.

Segal, H.P. (2005). Technological utopianism in American culture: Twentieth anniversary edition. Syracuse, NY: Syracuse University Press.

Sharikov, P. (2018). Artificial intelligence, cyberattack, and nuclear weapons—A dangerous combination. Bulletin of the Atomic Scientists, 74(6), 368-373.

Shulman, C. (2010). Omohundro’s “basic AI drives” and catastrophic risks. MIRI technical report. Retrieved from http://intelligence.org/files/BasicAIDrives.pdf

Simmons, B.A. (2009). Mobilizing for human rights: International law in domestic politics. Cambridge: Cambridge University Press.

SIPRI. (2019). SIPRI military expenditure database. Retrieved from https://www.sipri.org/databases/milex

Sotala, K., & Yampolskiy, R.V. (2015). Responses to catastrophic AGI risk: A survey. Physica Scripta, 90(1), 1-33.

Tan, K.H., & Perudin, A. (2019). The “geopolitical” factor in the Syrian Civil War: A corpus-based thematic analysis. SAGE Open, 9(2), 1-15.

Tegmark, M. (2017). Life 3.0: Being human in the age of artificial intelligence. New York, NY: Knopf.

Terminski, B. (2010). The evolution of the concept of perpetual peace in the history of political-legal thought. Perspectivas Internacionales, 6(1), 277–291.

Thomson, J. J. (1985). The trolley problem. The Yale Law Journal, 94(6), 1395–1415.

Tindley, A., & Wodehouse, A. (2016). Design, technology and communication in the British Empire, 1830–1914. London: Palgrave Macmillan.

Turchin, A., & Denkenberger, D. (2017). Levels of self-improvement. Manuscript.

Turchin, A., & Denkenberger, D. (2018). Military AI as a convergent goal of self-improving AI. In R.V. Yampolskiy (Ed.), Artificial intelligence safety and security (pp. ). London: Chapman & Hall.

Turchin, A., & Denkenberger, D. (2020). Classification of global catastrophic risks connected with artificial intelligence. AI and Society, 35(1), 147-163.

Turchin, A., Denkenberger, D., & Green, B.P. (2019). Global solutions vs. local solutions for the AI safety problem. Big Data and Cognitive Computing, 3(1), 16.

United States Committee on Foreign Relations (1955). Geneva Conventions for the Protection of War Victims, Report to the United States Senate on Executives D, E, F, and G, 84th Congress, 1st Session, Executive Report no. 9. Washington, DC: Committee on Foreign Relations.

Walker, P. (2008, 9 August). Georgia declares 'state of war' over South Ossetia. The Guardian. Retrieved from https://www.theguardian.com/world/2008/aug/09/georgia.russia2

Walsh, J. I. (2018). The rise of targeted killing. Journal of Strategic Studies, 41(1–2), 143–159.

Walters, G. (2017, 6 September). Artificial intelligence is poised to revolutionize warfare. Seeker. Retrieved from https://www.seeker.com/tech/artificial-intelligence/artificial-intelligence-is-poised-to-revolutionize-warfare

Wang, Yi, et al. (2019). Responsibility and sustainability in brain science, technology, and neuroethics in China—A culture-oriented perspective. Neuron, 101(3), 375–379.

Ward, S. (2017). Status and the challenge of rising powers. Cambridge: Cambridge University Press.

Westad, O.A. (2019). The sources of Chinese conduct: Are Washington and Beijing fighting a New Cold War? Foreign Affairs, 98(5), 86-95.

World Bank. (2019). Military expenditure (% of general government expenditure). Retrieved from https://data.worldbank.org/indicator/MS.MIL.XPND.ZS?most_recent_value_desc=true

Yampolskiy, R. V. (2016). Taxonomy of pathways to dangerous artificial intelligence. In: AAAI Workshop - Technical Report, WS-16-01 - WS-16-15 (pp. 143-148). Palo Alto, CA: Association for the Advancement of Artificial Intelligence.

Young, G. M. (2012). The Russian Cosmists: The esoteric futurism of Nikolai Fedorov and his followers.

Yudkowsky, E. (2004). Coherent extrapolated volition. San Francisco, CA: The Singularity Institute.

Yudkowsky, E. (2008). Artificial intelligence as a positive and negative factor in global risk. In N. Bostrom and M. M. Ćirković (Eds.), Global Catastrophic Risks (pp. 308–345). New York: Oxford University Press.

Zhao, M. (2019). Is a new cold war inevitable? Chinese perspectives on US-China strategic competition. Chinese Journal of International Politics, 12(3), 371-394.

Annex: State Commitment to UGPT Protocol I (Not to Declare or Engage in Existential Warfare, Defined as Atomic War, Biological War, Chemical War, or Cyberwar, including ASI-Enabled War)

Note: We ‘guesstimate’ probability on a 5-point Likert scale, from ‘Very Low’ to ‘Very High’; we do not assume equal intervals between categories. The scale is used in a relative way and does not reflect real-world probabilities.
