AI Alignment in The New Yorker

post by Eleni Angelou (ea-1) · 2023-05-17T21:36:18.341Z · LW · GW · 0 comments

This is a link post for https://www.newyorker.com/science/annals-of-artificial-intelligence/can-we-stop-the-singularity


The article centers on the Singularity: whether we might be relatively near it, and whether it is inevitable. On the way to tackling these questions, the author introduces and describes the field of AI Alignment:

A growing area of research called A.I. alignment seeks to lessen the danger by insuring that computer systems are “aligned” with human goals. The idea is to avoid unintended consequences while instilling moral values, or their machine equivalents, into A.I.s. Alignment research has shown that even relatively simple A.I. systems can break bad in bizarre ways.

The author further discusses the complexity of value, which he calls the "King Midas problem":

Alignment researchers worry about the King Midas problem: communicate a wish to an A.I. and you may get exactly what you ask for, which isn’t actually what you wanted. (In one famous thought experiment, someone asks an A.I. to maximize the production of paper clips, and the computer system takes over the world in a single-minded pursuit of that goal.)

As someone who'd read the New Yorker long before she got into AI safety, I wonder what this piece would do to me counterfactually if I were reading it now as just a philosophy student in NYC. The article doesn't cite LessWrong or any other sources I would be looking up, although it mentions a couple of key names in the field (e.g., Karnofsky, Russell). I doubt that reading this or anything similar would drastically shift my research interests. Or maybe it would. I'd be curious to hear a story along these lines.
