Making a conservative case for alignment
post by Cameron Berg (cameron-berg), Judd Rosenblatt (judd), phgubbins, AE Studio (AEStudio) · 2024-11-15T18:55:40.864Z · LW · GW
Trump and the Republican party will wield broad governmental control during what will almost certainly be a critical period for AGI development. In this post, we want to briefly share various frames and ideas we’ve been thinking through and actively pitching to Republican lawmakers over the past few months in preparation for the possibility of a Trump win.
Why are we sharing this here? Given that >98% of the EAs and alignment researchers we surveyed [LW · GW] earlier this year identified as everything-other-than-conservative, we consider thinking through these questions to be another strategically worthwhile neglected [LW · GW] direction.
(Along these lines, we also want to proactively emphasize that politics is the mind-killer [LW · GW], and that, regardless of one’s ideological convictions, those who earnestly care about alignment must take seriously the possibility that Trump will be the US president who presides over the emergence of AGI—and update accordingly in light of this possibility.)
[Figure: Political orientation — combined sample of (non-alignment) EAs and alignment researchers]
AI-not-disempowering-humanity is conservative in the most fundamental sense
- Avoiding AI-induced human extinction and preserving our most fundamental values in the process are projects that intuitively resonate with right-leaning thinkers and policymakers. Just as conservatives seek to preserve and build upon civilization's core institutions and achievements, alignment research aims to ensure our technological advances remain anchored to—rather than in conflict with—our core values.
- Consider Herman Kahn, a leading military strategist and systems theorist at the RAND Corporation during the 1950s. He is famous for popularizing the notion of ‘thinking the unthinkable’ with respect to the existential risks of the Cold War nuclear arms race, at a time when most refused to systematically analyze the possibility of thermonuclear war and its aftermath. This same philosophy, intuitive to conservatives and liberals alike, cleanly maps onto our current moment with AI.
- A conservative approach to AI alignment doesn’t require slowing progress, avoiding open sourcing,[1] etc. Alignment and innovation are mutually necessary, not mutually exclusive: if alignment R&D indeed makes systems more useful and capable, then investing in alignment is investing in US tech leadership.[2] Just as the Space Race and Manhattan Project demonstrated how focused research efforts could advance both security and scientific progress, investing now in alignment research would ensure AI development cements rather than endangers American technological leadership. By solving key technical alignment challenges, we create the foundation for sustainable innovation that preserves our values while still pushing the frontier. The conservative tradition has always understood that lasting achievements require solid foundations and careful stewardship, not just unbounded acceleration.
- As Dean Ball has recently argued in his detailed policy proposals, Republicans may actually be uniquely positioned to tackle the focused technical challenges of alignment, given that, in contrast to the left, they appear systematically less likely to be self-limited by ‘Everything-Bagel’ patterns of political behavior. Our direct engagement with Republican policymakers over the past months also supports this view.
We've been laying the groundwork for alignment policy in a Republican-controlled government
- Self-deprecating note: we're still learning and iterating on these specific approaches, being fairly new to the policy space ourselves. What follows isn't meant to be a complete solution, but rather some early strategies that have worked better than we initially expected in engaging Republican policymakers on AI safety. We share them in hopes they might prove useful to others working toward these goals, while recognizing there's still much to figure out.
- Over recent months, we've built relationships with Republican policymakers and thinkers—notably including House AI task force representatives, senior congressional staffers, and influential think tank researchers. These discussions revealed significant receptivity to AGI risk concerns when we approached them as authentically in-group. Taking time to explain technical fundamentals—like our uncertainty about AI model internals and how scaling laws work—consistently led to "oh shit, this seems like a big deal" moments. Coupling this with concrete examples of security vulnerabilities at major labs and concerns about China’s IP theft particularly drove home the urgency of the situation.
- We found ourselves naturally drawn to focus on helping policymakers "feel the AGI" before pushing specific initiatives. This proved crucial, as we discovered people need time to fully internalize the implications before taking meaningful action. The progression was consistent: once this understanding clicked, they'd start proposing solutions themselves—from increased USAISI funding to Pentagon engagement on AGI preparedness. Our main message evolved to emphasize the need for conservative leadership on AI alignment, particularly given that as AI capabilities grow exponentially, the impact of future government actions will scale accordingly.
- In these conversations, we found ourselves able to introduce traditionally "outside the box" concepts when properly framed and authentically appealing to conservative principles. This kind of "thinking the unthinkable" is exactly the sort of conservative approach needed: confronting hard realities and national security challenges head-on, distinct from less serious attempts to just make AI woke or regulate away innovation.
- A particularly promising development has been lawmakers’ receptivity to a Manhattan Project-scale public/private effort on AI safety, focused on fundamental R&D and hits-based neglected approaches [LW · GW] necessary to solve alignment. When they asked how to actually solve the challenge of aligning AGI, this emerged as our clearest tangible directive.[3] We see significant potential for such initiatives under Republican leadership, especially when framed in terms of sustaining technological supremacy and national security—themes that consistently resonated with conservatives looking to ensure American leadership. While we continue to grow these efforts internally and with various collaborators, the magnitude of the challenge demands this approach be replicated more widely. We aim to scale our own neglected approaches work aggressively, but more crucially, we need many organizations—public and private—pursuing these unconventional directions to maximize our chances of solving alignment.
- Throughout these interactions, we observed that Republican policymakers, while appropriately skeptical of regulatory overreach, become notably more engaged with transformative AI preparation once they understand the technical realities and strategic implications. This suggests we need more genuine conservatives (not just people who are kinda pretending to be) explaining these realities to lawmakers, as we've found them quite capable of grasping complex technical concepts and being motivated to act in light of them despite their initial unfamiliarity.
- Looking back, our hints of progress here seem to have come from emphasizing both the competitive necessity of getting alignment right and the exponential nature of AI progress, while maintaining credibility as conservative-leaning voices who understand the importance of American leadership. Others looking to engage with Republican policymakers should focus on building authentic relationships around shared values rather than pursuing immediate policy commitments or feigning allegiance. The key is demonstrating genuine understanding of both technical challenges and conservative priorities while emphasizing the strategic importance of leadership in AI alignment.
Trump and some of his closest allies have signaled that they are genuinely concerned about AI risk
- Note: extremely high-variance characters, of course, but still worth highlighting
- In marked contrast to Kamala Harris’s DEI-flavored ‘existential to who’ comments about AI risk, Trump has suggested in his characteristic style that ‘super duper AI’ is both plausible and frightening. While this does not exactly reflect deep knowledge of alignment, Trump appears sympathetic to the basic concern—and his uniquely non-ideological style likely serves him well in this respect.
- Ivanka Trump, who had significant sway over her father during his first term, has directly endorsed Situational Awareness on X and started her own website to aggregate resources related to AI advancements.
- Elon Musk will have nontrivial influence on the Trump Administration. While unpredictable in his own right, it seems quite likely that he fundamentally understands the concern and the urgency of solving alignment.
Avoiding an AI-induced catastrophe is obviously not a partisan goal
- The current political landscape is growing increasingly harmful to alignment progress. The left has all too often conflated existential risk with concerns about bias, while the right risks dismissing all safety considerations as "woke AI" posturing. This dynamic risks entrenching us in bureaucratic frameworks that make serious technical work harder, not easier. This divide is also relatively new—five Democrats (but no Republicans) recently signed onto a letter to OpenAI inquiring about their safety procedures; we suspect that 6-8 months ago, this kind of effort would have had bipartisan support. We're losing precious time to political theater while crucial technical work remains neglected.
- A more effective framework would channel American ingenuity toward solving core alignment problems rather than imposing restrictive regulations. Looking to Bell Labs and the Manhattan Project as models, we should foster fundamental innovation through lightweight governance focused on accelerating progress while simultaneously preventing catastrophic outcomes. It is plausible that alignment work can enhance rather than constrain AI competitiveness, creating a natural synergy between alignment research and American technological leadership.
- We think substantial government funding should be directed toward an ambitious set of neglected approaches to alignment; investment in a wide range of alignment research agendas could be among the highest-EV actions the US federal government could take in the short term.
Winning the AI race with China requires leading on both capabilities and safety
- Trump and the Republicans will undoubtedly adopt an "America First" framing in the race toward AGI. But true American AI supremacy requires not just being first, but being first to build AGI that remains reliably under American control and aligned with American interests. An unaligned AGI would threaten American sovereignty just as much as a Chinese-built one—by solving alignment first, America could achieve lasting technological dominance rather than merely winning a preliminary sprint toward an uncontrollable technology.
- Both the US and China increasingly recognize AGI as simultaneously humanity's most powerful technology and its greatest existential risk. Before he died, Henry Kissinger visited Beijing to warn Xi Jinping about uncontrolled AI development. Since then, Xi has been characterized by some as a ‘doomer’ and has stated that AI will determine "the fate of all mankind" and must remain controllable.[4] This is a concern that likely resonates particularly deeply with an authoritarian regime paranoid about losing control to anything/anyone, let alone to superintelligent AI systems.
- The pragmatic strategic move in this context may therefore be for the US to lead aggressively on both capabilities and safety. By applying the ‘Pottinger Paradox’—where projecting strength yields better outcomes with authoritarian leaders than seeking accommodation—America can shape global AI development through solving alignment first. This would give us systems that naturally outcompete unsafe ones while still addressing the existential risks that even our greatest rivals acknowledge. With the relevant figures in Trump's circle already demonstrating some nontrivial awareness of AGI risks, we may be well-positioned to pursue this strategy of achieving technological supremacy through safety leadership rather than in spite of it.
Many of these ideas seem significantly more plausible to us in a world where negative alignment taxes [LW · GW] materialize—that is, where alignment techniques are discovered that render systems more capable by virtue of their alignment properties. It seems quite safe to bet that significant positive alignment taxes simply will not be tolerated by the incoming federal Republican-led government—the attractor state of more capable AI will be too strong. Given that alignment must still proceed, uncovering strategies that make systems both reliably safer (critical for x-risk) and more competent (the current attractor state) may nudge the AGI-possibility-space away from existentially risky outcomes.
Concluding thought
We are operating under the assumption that plans have to be recomputed when the board state meaningfully shifts [LW(p) · GW(p)], and Trump’s return to power is no exception to this rule. We are re-entering a high-variance political environment, which may well come to be viewed in hindsight as having afforded the optimal political conditions for pursuing, funding, and scaling high-quality alignment work. This moment presents a unique opportunity for alignment progress if we can all work effectively across political lines.
- ^
We suspect there is also an emerging false dichotomy between alignment and open-source development. In fact, open-source practices have been instrumental in advancing various forms of alignment research, with many of the field's biggest breakthroughs occurring after the advent of open-source AI models. Beren's caveat and conclusion both seem sensible here:
Thus, until the point at which open source models are directly pushing the capabilities frontier themselves then I consider it extremely unlikely that releasing and working on these models is net-negative for humanity (ignoring potential opportunity cost which is hard to quantify).
- ^
To this end, we note that Marc Andreessen-style thinkers don't have to be antagonists to AI alignment—in fact, he and others have the potential to be supportive funders of alignment efforts.
- ^
Initially, we felt disinclined to advocate in DC for neglected approaches because it happens to be exactly what we are doing at AE Studio. However, we received feedback to be more direct about it, and we're glad we did—the work has not only proven technically promising but also resonated surprisingly well with conservatives, especially when coupled with ideas around negative alignment taxes and increased economic competitiveness. At AE, we began this approach in early 2023, refined our ideas, and have already seen more [LW · GW] encouraging [LW · GW] results [LW · GW] than initially expected.
- ^
It’s important to note that anything along these lines coming from the CCP must be taken with a few grains of salt—but we’ve spoken with quite a few China policy experts who do seem to believe that Xi genuinely cares about safety.
1 comment
comment by Akash (akash-wasil) · 2024-11-15T20:35:31.232Z · LW(p) · GW(p)
I agree with many points here and have been excited about AE Studio's outreach. Quick thoughts on China/international AI governance:
- I think some international AI governance proposals have some sort of "kum ba yah, we'll all just get along" flavor/tone to them, or some sort of "we should do this because it's best for the world as a whole" vibe. This isn't even Dem-coded so much as it is naive-coded, especially in DC circles.
- US foreign policy is dominated primarily by concerns about US interests. Other considerations can matter, but they are not the dominant driving force. My impression is that this is true within both parties (with a few exceptions).
- I think folks interested in international AI governance should study international security agreements and try to get a better understanding of relevant historical case studies. Lots of stuff to absorb from the Cold War, the Iran Nuclear Deal, US-China relations over the last several decades, etc. (I've been doing this & have found it quite helpful.)
- Strong Republican leaders can still engage in bilateral/multilateral agreements that serve US interests. Recall that Reagan negotiated arms control agreements with the Soviet Union, and the (first) Trump Administration facilitated the Abraham Accords. Being "tough on China" doesn't mean "there are literally no circumstances in which I would be willing to sign a deal with China." (But there likely does have to be a clear case that the deal serves US interests, has appropriate verification methods, etc.)