We Choose To Align AI

post by johnswentworth · 2022-01-01T20:06:23.307Z · LW · GW · 16 comments

Contents

  WE CHOOSE TO ALIGN AI IN THIS DECADE AND DO THE OTHER THINGS
  NOT BECAUSE THEY ARE EASY, BUT BECAUSE THEY ARE HARD
  BECAUSE THAT GOAL WILL SERVE TO ORGANIZE THE BEST OF OUR SKILLS AND ENERGIES
  BECAUSE THAT CHALLENGE IS ONE WE ARE WILLING TO ACCEPT
  ONE WE ARE UNWILLING TO POSTPONE
  AND ONE WE INTEND TO WIN

Epistemic status: poetry

"We choose to go to the moon! We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard. Because that goal will serve to organize the best of our skills and energies. Because that challenge is one we are willing to accept, one we are unwilling to postpone, and one we intend to win." - John F Kennedy

WE CHOOSE TO ALIGN AI IN THIS DECADE AND DO THE OTHER THINGS

JFK gave his “We choose to go to the moon!” speech in 1962. And when he said “in this decade”, he did not mean that we’d go to the moon before 1972. He meant we’d go to the moon before 1970.

Happy 2022! When I say we choose to align AI in this decade, I don’t mean before 2032. I mean before 2030. Maybe sooner if things go well. Do I think that’s actually doable? Yes [LW · GW]. Also fuck you.

… and some other things! As long as we’re shooting for the metaphorical moon, might as well throw aging [LW · GW] in the mix too. That seems doable by 2030.

NOT BECAUSE THEY ARE EASY, BUT BECAUSE THEY ARE HARD

Effective altruists talk a lot about “importance, neglectedness, and tractability”. The more important, neglected, and tractable a problem is, the more we should expect a high impact per unit of effort invested in it. The alignment problem scores through the roof on importance, and is still relatively neglected, but tractability is… um… not.

I’m not really an EA, at heart. When there’s low hanging fruit, I might pick it quickly and move on, or these days I might point it out to someone else and move on. Point is, the low hanging fruit is not what I’m here for. I’m here for the challenge. I study alignment and agency and the other things not because they are easy, but because they are hard.

The more EAs I meet, the more I realize that wanting the challenge is a load-bearing pillar of sanity when working on alignment.

When people first seriously think about alignment, a majority freak out. Existential threats are terrifying. And when people first seriously look at their own capabilities, or the capabilities of the world, to deal with the problem, a majority despair. This is not one of those things where someone says “terrible things will happen, but we have a solution ready to go, all we need is your help!”. Terrible things will happen, we don’t have a solution ready to go, and even figuring out how to help is a nontrivial problem. When people really come to grips with that, tears are a common response.

… but for someone who wants the challenge, the emotional response is different. The problem is terrifying? Our current capabilities seem woefully inadequate? Good; this problem is worthy. The part of me which looks at a rickety ladder 30 feet down into a dark tunnel and says “let’s go!” wants this. The part of me which looks at a cliff face with no clear path up and cracks its knuckles wants this. The part of me which looks at a problem with no clear solution and smiles wants this. The response isn’t tears, it’s “let’s fucking do this”.

BECAUSE THAT GOAL WILL SERVE TO ORGANIZE THE BEST OF OUR SKILLS AND ENERGIES

Why align an AI, rather than prove the Riemann hypothesis? Or calculate bits of Chaitin’s constant - we know that’s hard.

When faced with a hard problem, there’s this tendency to substitute easier problems, solve those instead, and call it progress. The Riemann hypothesis is too hard, so we pick some other function which looks kinda similar, and prove things about it instead. And sometimes that is progress! But other times, people just end up Goodharting [? · GW] on the new problem instead.

Alignment is a problem which needs to be solved. One day, reality will test us, and if we fail then it’s game over. Substitute an easier problem instead, and reality will ignore our easier solution and wipe us all out anyway.

That’s a core part of the appeal: we don’t have the option of just walking away, we don’t have the option of solving some easier problem instead.

(We still look for shortcuts and loopholes, of course. Those who despair look for shortcuts and loopholes because they want some hope to cling to. Those who seek challenge look for shortcuts and loopholes because if the problem does turn out to be easy, we want to solve it and move on.)

The alignment problem will serve to organize the best of our skills and energies because we can’t just substitute some other problem. It is a Schelling point in problem space, a problem around which I can organize my efforts and expect others to do the same, without everyone spontaneously sliding off to some other problem.

BECAUSE THAT CHALLENGE IS ONE WE ARE WILLING TO ACCEPT

Damn straight.

ONE WE ARE UNWILLING TO POSTPONE

Did I mention we’re on a timer [LW · GW], and we’re not sure when it will run out?

AND ONE WE INTEND TO WIN


Comments sorted by top scores.

comment by Jon Garcia · 2022-01-02T23:48:21.766Z · LW(p) · GW(p)

Yeah! Let's do this!

The part of me which looks at a rickety ladder 30 feet down into a dark tunnel and says “let’s go!” wants this. The part of me which looks at a cliff face with no clear path up and cracks its knuckles wants this.

Although, I would rather those working on AI alignment adopt a general policy of not descending rickety ladders into dark abysses or free-climbing sheer cliffs, just to avoid having the probability of AI catastrophe make a discontinuous jump upward after an exciting weekend trip.

comment by Ruby · 2022-01-28T19:27:30.800Z · LW(p) · GW(p)

Curated. Huzzah for poetry and inspiration. We could use a few more speeches here and there.

comment by Quinn (quinn-dougherty) · 2022-01-03T17:55:32.167Z · LW(p) · GW(p)

Major guilty pleasure of mine is Aaron Sorkin, who once did a show called Newsroom about a large news broadcast project that, against all odds and incentives, doubles down on the duty of media elites to inform the public and so on. It's either unbearably corny (insulting) or unbearably corny (affectionate) depending on who's watching.

After the first broadcast of their re-invigorated show, the producer says

in the old days of like 10 minutes ago, we did the news well. You know how? We decided to.

I was thinking about this post and I got my streams crossed -- the model of the JFK bit in my head accidentally inserted something like "we do this because we decide to", and it worked really well! I find it motivating in the poetry sense to believe whatever illusion about agency or free will, especially at a collective level, that allows me to say "we are those who happened to step up" and flushing out any mention of reasons why we stepped up.

For some reason, "we decided to" is nearly as potent as defiance from solstice for me!

Replies from: bill-prada, Jon Garcia
comment by Bill Prada (bill-prada) · 2022-01-29T04:29:45.666Z · LW(p) · GW(p)

For an inspiring movie scene for the moment I’d go with Apollo 13. The nerdy engineers saving the mission by coming up with a kluge to fit the wrong shape and size charcoal CO2 scrubbers. A palpable payoff to the JFK inspirational speech.

https://spacecenter.org/apollo-13-infographic-how-did-they-make-that-co2-scrubber/

Replies from: bill-prada
comment by Bill Prada (bill-prada) · 2022-01-29T04:44:30.473Z · LW(p) · GW(p)

Edited above

Replies from: Yoav Ravid
comment by Yoav Ravid · 2022-01-29T08:27:51.639Z · LW(p) · GW(p)

Comments and posts are editable.

Replies from: bill-prada
comment by Bill Prada (bill-prada) · 2022-01-29T16:04:33.566Z · LW(p) · GW(p)

Thank you.

comment by Jon Garcia · 2022-01-03T21:22:55.876Z · LW(p) · GW(p)

Thanks for the link. That call-and-response was beautiful.

comment by Ben Pace (Benito) · 2024-01-17T20:09:42.945Z · LW(p) · GW(p)

This is a better spirit with which to accomplish great and important tasks than most I have around me, and I'm grateful that it was written up. I give this +4.

comment by PoignardAzur · 2022-02-04T12:32:11.594Z · LW(p) · GW(p)

Do I think that’s actually doable? Yes [LW · GW]. Also fuck you.

Uh, excuse you?

I've read your blog post and I still think the problem is poorly defined and intractable with current methods.

Also, the part Kennedy isn't mentioning in your speech is that "going to the moon" was the end goal of a major propaganda war between the two major superpowers of the time, and as a result it had basically infinite money thrown at it.

Inspirational speeches are great, but having the funding and government authority to back them is even better.

comment by Rachel Shu (wearsshoes) · 2022-02-04T04:20:14.236Z · LW(p) · GW(p)

Composer Christopher Tin has set JFK's "We Choose to go to the Moon" speech to music, https://www.youtube.com/watch?v=HBITb9Zz0rY . Solsticegoers may recognize the opening leitmotif as shared with Sogno Di Volare, another movement from the same work, an oratorio on the theme of flight, To Shiver the Sky.

comment by Bill Prada (bill-prada) · 2022-01-28T22:02:01.678Z · LW(p) · GW(p)

The Kennedy speech lit a fire under a generation. Appending the ‘fuck you’ makes you sound like a truculent Reddit knucklehead. Grow up a bit and I’ll take you more seriously.

Replies from: bill-prada
comment by Bill Prada (bill-prada) · 2022-01-29T16:44:48.587Z · LW(p) · GW(p)

The juxtaposition of soaring rhetoric and coarse language still jars me but my comment is harsher than it needed to be. I apologize.

Replies from: shiri-dori-hacohen
comment by Shiri Dori-Hacohen (shiri-dori-hacohen) · 2022-01-30T17:10:13.258Z · LW(p) · GW(p)

Actually, I think you were spot on. The curse was completely uncalled for and not helpful in any way, as I mentioned in this Twitter thread. This was the first email broadcast I ever opened from LessWrong - and will be the last as well. Unsubscribed.

Replies from: shiri-dori-hacohen
comment by Shiri Dori-Hacohen (shiri-dori-hacohen) · 2022-01-30T17:11:57.365Z · LW(p) · GW(p)

P.S. I am not a prude and use curses in my language quite liberally. The problem for me was not the usage of the coarse language in and of itself, but the fact that it was directed at the reader for no reason whatsoever.

Replies from: Lukas_Gloor
comment by Lukas_Gloor · 2022-02-03T13:01:17.805Z · LW(p) · GW(p)

Good point, but I thought that casually insulting the reader for no reason whatsoever gave the post an aura of battle-readiness (perhaps a bit much of it), which is maybe the tone the author was going for.