A warm-up for the AI governance project

post by jacek (jacek-karwowski) · 2023-02-17T18:06:13.113Z · LW · GW · 2 comments

Here is one possible world we could be living in.

Imagine that the majority of the world's population knows that unaligned AGI is coming. This majority includes most of the world's heads of government, all respectable scientists, most journalists, all your friends, your close family, your distant aunt in Canada, and your neighbour next door. The topic would sneak into casual conversations over beer, and you could overhear people on the street discussing their fears of being turned into paperclips.

Furthermore, imagine that the AI alignment project turned out to be easy. And not just easy in theory, but easy in the most ordinary sense of the word, like taking the trash out or microwaving pizza, and then simpler still. Easy to the point that you could explain the solution to a five-year-old, and they would understand the spirit of it, if not the details, perfectly well. Let's, for the sake of argument, imagine that the solution to the alignment problem was for us to repeat collectively, one quadrillion times, into the mirror, "AGI, AGI, please be safe and turn out fine", while patting ourselves three times on the forehead.

If this all sounds unrealistic -- possibly too optimistic even -- let's dream bigger! Picture a website with a big red counter saying "AGI safety X% done" that anyone in the world could look at. Kind of like the Doomsday Clock, but backed up by real measurements, with the fact that it is real being common knowledge.

The counter would show the number of times people all over the world had said the phrase (and patted themselves while doing it). It would be frequently featured in the news, governments around the world would refer to it in their communications, the Pope would sometimes mention it at the end of his homilies, and some cities would even build giant interactive billboards displaying the current value of the counter in real time.

The counter would not only have a single numerical value; it would have a range of sub-meters, each showing how people in every country around the world were doing, down to the granularity of a single city. There would be no taboo around discussing this with people - on the contrary, it would be actively status-increasing to show off your tells-and-pats. People could, and would, shame each other for not doing it enough.

Imagine that the counter you pictured before came with a pretty clear timeline until AGI. The timeline would be perfectly sized: not so short as to render any action meaningless, nor so long that we could discount it entirely.

Implementing the solution would not involve any major drawbacks. Everyone would be equally able to say the words. Forget about us "sacrificing our music and our non-numerical names" in the process, or challenging our ethical norms, or going against our worldviews. The solution would not only be good under the circumstances, but would, as if by happy accident, make the world a better place in a broad sense. Maybe it turned out that saying the words lessens the chance of throat cancer, and the physical activity of patting yourself improves blood circulation.

And if this all sounds too easy, one last thing comes to mind. It would not only be people who could say the magic spell! We could build robots to help us with the task. Constructing them to say the words properly, and to pat, would not even be that hard - in fact, there would be hundreds of companies building and pitching them on an open market.

In this world, would pursuing AI-safety-focused policy be easy?


The world described above is, of course, an analogy with global warming, and that analogy is, on a visceral level, what makes me really sceptical about AI governance. I can picture a few arguments against it:

I agree that these are reasonable points, and I certainly wrote the first part more for the fun of imagining it than for factual accuracy. Still, I am not completely convinced by any of the arguments against, and I feel that they somehow fail to refute the central premise. What am I missing?

(Incidentally, I think one big unintended benefit of working hard on global warming is that it serves as a test ground, a trial run, for what AI governance might require. It might even be the case that pursuing that, instead of taking on the hard case first, is the better strategy? But then, of course, there is the question of timelines...)

2 comments

comment by Noosphere89 (sharmake-farah) · 2023-02-17T18:27:54.607Z · LW(p) · GW(p)

I think one of the major failure modes here was the politicization of the climate change movement, which led to 40 years of blocked climate solutions.

Climate change is now being solved, slower than people would like. But conditional on a world where AI alignment is as easy as the post suggests, serious changes would be in order for LW, and the Alignment Forum should close up shop.

comment by jacek (jacek-karwowski) · 2023-02-17T18:32:16.801Z · LW(p) · GW(p)

Yes, I agree that politicisation is the central issue. But this is exactly why I wrote the first part the way I did - I feel that the scenario holds despite it (I didn't claim that most people agree with the solution, only that the elites, the experts, and the reader's social bubble do!).

So one question I'm trying to understand is: since politicisation happened to climate change, why do we think it won't happen to AI governance? That is, pursuing goals by political means might just usually end up like that, because of the basic structure of political discourse (you get points for opposing the other side, etc.).