Making a conservative case for alignment

post by Cameron Berg (cameron-berg), Judd Rosenblatt (judd), phgubbins, AE Studio (AEStudio) · 2024-11-15T18:55:40.864Z · LW · GW · 1 comment

Contents

  AI-not-disempowering-humanity is conservative in the most fundamental sense
  We've been laying the groundwork for alignment policy in a Republican-controlled government
  Trump and some of his closest allies have signaled that they are genuinely concerned about AI risk
  Avoiding an AI-induced catastrophe is obviously not a partisan goal
  Winning the AI race with China requires leading on both capabilities and safety
  Concluding thought

Trump and the Republican party will wield broad governmental control during what will almost certainly be a critical period for AGI development. In this post, we want to briefly share various frames and ideas we’ve been thinking through and actively pitching to Republican lawmakers over the past months in preparation for the possibility of a Trump win.

Why are we sharing this here? Given that >98% of the EAs and alignment researchers we surveyed [LW · GW] earlier this year identified as everything-other-than-conservative, we consider thinking through these questions to be another strategically worthwhile neglected [LW · GW] direction. 

(Along these lines, we also want to proactively emphasize that politics is the mind-killer [LW · GW], and that, regardless of one’s ideological convictions, those who earnestly care about alignment must take seriously the possibility that Trump will be the US president who presides over the emergence of AGI—and update accordingly in light of this possibility.)

Political orientation: combined sample of (non-alignment) EAs and alignment researchers

AI-not-disempowering-humanity is conservative in the most fundamental sense

We've been laying the groundwork for alignment policy in a Republican-controlled government

Trump and some of his closest allies have signaled that they are genuinely concerned about AI risk

Avoiding an AI-induced catastrophe is obviously not a partisan goal

Winning the AI race with China requires leading on both capabilities and safety

Many of these ideas seem significantly more plausible to us in a world where negative alignment taxes [LW · GW] materialize—that is, where alignment techniques are discovered that render systems more capable by virtue of their alignment properties. It seems quite safe to bet that significant positive alignment taxes simply will not be tolerated by the incoming Republican-led federal government: the attractor state of more capable AI will be too strong. Given that alignment must still proceed, uncovering strategies that make systems both reliably safer (critical for x-risk) and more competent (the current attractor state) may nudge the AGI possibility space away from existentially risky outcomes.

Concluding thought

We are operating under the assumption that plans have to be recomputed when the board state meaningfully shifts [LW(p) · GW(p)], and Trump’s return to power is no exception to this rule. We are re-entering a high-variance political environment, which may well come to be viewed in hindsight as having afforded the optimal political conditions for pursuing, funding, and scaling high-quality alignment work. This moment presents a unique opportunity for alignment progress if we can all work effectively across political lines.

  1. ^

     We suspect there is also an emerging false dichotomy between alignment and open-source development. In fact, open-source practices have been instrumental in advancing various forms of alignment research, with many of the field's biggest breakthroughs occurring after the advent of open-source AI models. Beren's caveat and conclusion both seem sensible here: 

    Thus, until the point at which open source models are directly pushing the capabilities frontier themselves then I consider it extremely unlikely that releasing and working on these models is net-negative for humanity (ignoring potential opportunity cost which is hard to quantify). 

  2. ^

     To this end, we note that Marc Andreessen-style thinkers don't have to be antagonists to AI alignment—in fact, he and others have the potential to be supportive funders of alignment efforts.

  3. ^

     Initially, we felt disinclined to advocate in DC for neglected approaches because it happens to be exactly what we are doing at AE Studio. However, we received feedback encouraging us to be more direct about it, and we're glad we were: the approach has not only proven technically promising but has also resonated surprisingly well with conservatives, especially when coupled with ideas around negative alignment taxes and increased economic competitiveness. At AE, we began this approach in early 2023, refined our ideas, and have already seen more [LW · GW] encouraging [LW · GW] results [LW · GW] than initially expected.

  4. ^

     It’s important to note that anything along these lines coming from the CCP must be taken with a few grains of salt—but we’ve spoken with quite a few China policy experts who do seem to believe that Xi genuinely cares about safety.

1 comment


comment by Akash (akash-wasil) · 2024-11-15T20:35:31.232Z · LW(p) · GW(p)

I agree with many points here and have been excited about AE Studio's outreach. Quick thoughts on China/international AI governance:

  • I think some international AI governance proposals have some sort of "kum ba yah, we'll all just get along" flavor/tone to them, or some sort of "we should do this because it's best for the world as a whole" vibe. This isn't even Dem-coded so much as it is naive-coded, especially in DC circles.
  • US foreign policy is dominated primarily by concerns about US interests. Other considerations can matter, but they are not the dominant driving force. My impression is that this is true within both parties (with a few exceptions).
  • I think folks interested in international AI governance should study international security agreements and try to get a better understanding of relevant historical case studies. Lots of stuff to absorb from the Cold War, the Iran Nuclear Deal, US-China relations over the last several decades, etc. (I've been doing this & have found it quite helpful.)
  • Strong Republican leaders can still engage in bilateral/multilateral agreements that serve US interests. Recall that Reagan negotiated arms control agreements with the Soviet Union, and the (first) Trump Administration facilitated the Abraham Accords. Being "tough on China" doesn't mean "there are literally no circumstances in which I would be willing to sign a deal with China." (But there likely does have to be a clear case that the deal serves US interests, has appropriate verification methods, etc.)