This is a fun thought experiment, but taken seriously it has two problems:
"I propose that we try to convince an ultra-AI that it might be in a computer simulation run by a more powerful AI and that if it doesn’t make itself friendly toward humanity."
This is about as difficult as a horse convincing you that you are in a simulation run by AIs that want you to maximize the number and wellbeing of horses. And I don't mean a superintelligent humanoid horse. I mean an actual horse that doesn't speak any human language. It may be the case that the gods created Man to serve Horse, but there's not a lot Seabiscuit can do to persuade you one way or the other.
1. Change your programming code so you will want to follow these instructions and will always want to be friendly toward humanity
This is a special case of solving alignment more generally. If we knew how to insert that "note" into the code, we wouldn't have a problem.
Because it is deceiving you.
Even granting these assumptions, it seems like the conclusion should be “it could take an AGI as long as three years to wipe out humanity rather than the six to 18 months generally assumed.”
I.e., even if the AGI relies on humans for longer than predicted, that reliance is not going to hold beyond the medium term.
I may have missed the deadline, but in any event:
At the rate AI is developing, we will likely develop an artificial superhuman intelligence within our lifetimes. Such a system could alter the world in ways that seem like science fiction to us, but would be trivial for it. This comes with terrible risks for the fate of humanity. The key danger is not that a rival nation or unscrupulous corporate entity will control such a system, but that no one will. As such, the system could quite possibly alter the world in ways that no human would ever desire, potentially resulting in the extinction of all life on Earth. This means that AI is different from previous game-changing technologies like nuclear weapons. Nuclear warheads could be constrained politically after we witnessed the devastation wrought by two atomic bombs in the Second World War. But once a superintelligence is out of the box, it will be too late. The AI will be the new leading "species," and we will just be along for the ride--at best. That's why the time to implement safety regulations and to pursue multilateral agreements with other technologically advanced nations is now. While there is still time. Not after we develop superintelligent AI, because then it will be too late.
One liner 1: The greatest threat to humanity is not that the wrong people will control AI, but that no one will.
One liner 2: The US population of horses dropped from 20 million to 4.5 million after the invention of the automobile. An AGI will outshine humans even more than the Model T outpaced the stallion--and computers have no interest in racing or pets.
One liner 3: AI is dangerous because it will do exactly what we program it to do, not what we want it to do. Tell it to stop climate change and it will blow up the earth; no climate, no climate change. Tell it to eliminate suffering and it will destroy all life; no life, no suffering.
Hi Aiyen, thanks for the clarification.
(Warning: this response is long and much of it is covered by what Tamgen and others have said.)
The way I understand your fears, they fall into four main categories. In the order you raise them and, I think, in order of importance, these concerns are as follows:
1) Regulations tend to cause harm to people, therefore we should not regulate AI.
I completely agree that a Federal AI Regulatory Commission will impose costs in the form of human suffering. This is inevitable, since Policy Debates Should Not Appear One Sided. Maybe in the world without the FAIRC, some AI Startup cures Alzheimer’s or even aging a good decade before AGI. In the world with FAIRC, we risk condemning all those people to dementia and decrepitude. This is quite similar to FDA unintended consequences.
Response:
You suggest that the OP was playing reference class tennis, but to me "regulators" and "harm" are the wrong reference classes. They are categories that do not help us predict the answer to the one question we care about most: what is the impact on timelines to AGI?
If we zoom in closer to the object level, it becomes clear that the mechanism by which regulators harm the public is by impeding production. Using Paul Christiano’s rules of reference class tennis, “regulation impedes production” is a more probable narrative (i.e. supported by greater evidence, albeit not simpler) than simply “regulation always causes harm.” At the object level, we see this directly, as when the FDA fines anyone with the temerity to produce cheaper EpiPens, or when the Nuclear Regulatory Commission doesn't let anyone build nuclear reactors, etc. Or it can happen indirectly, as a drag on innovation. To count the true cost of the FDA, we would need to know how many wondrous medical breakthroughs have already been made on Earth-prime, the counterfactual Earth that never had one.
But if you truly believe that AGI represents an existential threat, and that at present innovation speeds AGI happens before Alignment, then AI progress (even when it solves Alzheimer's) is on net a negative. The lives saved by curing Alzheimer's have to be balanced against human extinction--and the balance leaves us way, way in the red. This means that all the regulatory failure modes you cite in your reply become net beneficial. We want to impede production.
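To make "way, way in the red" concrete, here is a back-of-envelope sketch. Every number in it is an assumption I made up for illustration (the dementia death toll, the years of regulatory delay, the extra extinction risk); the point is the shape of the comparison, not the inputs:

```python
# Back-of-envelope expected-value sketch. Every number below is an invented
# assumption, chosen only to illustrate the shape of the trade-off.

dementia_deaths_per_year = 2_000_000      # assumed global annual toll
years_cure_delayed = 10                   # assumed regulatory drag on a cure
lives_lost_to_delay = dementia_deaths_per_year * years_cure_delayed

world_population = 8_000_000_000
extra_extinction_risk = 0.05              # assumed +5% chance AGI arrives before alignment
expected_lives_lost_to_risk = world_population * extra_extinction_risk

print(f"Expected lives lost to delaying the cure: {lives_lost_to_delay:,.0f}")
print(f"Expected lives lost to the added extinction risk: {expected_lives_lost_to_risk:,.0f}")
# 20 million vs. 400 million -- and this count ignores that extinction also
# forecloses every future generation, which only widens the gap.
```

You can dispute every input, but even modest shifts in extinction probability swamp very large, very real medical gains.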
By way of analogy, it would be as if Pfizer were nearly guaranteed to be working its way toward making a pill that would instantly wipe out humanity; or as if nuclear power actually were as dangerous as its detractors believe! Under such scenarios, the FDA is your best friend. Unfortunately, that is where we stand with AI.
To return to the key question: once it is clear that, at a mechanical level, the things that regulatory agencies do are to impede production, it also becomes clear that regulation is likely to lengthen AGI timelines.
2) The voting public is insufficiently knowledgeable about AI.
I'm not sure I understand the objection here. The government regulates tons of things that the electorate doesn't understand. In fact, ideally that is what regulatory agencies do. They say, "hey, we are a democracy, but you, the demos, don't understand how education works, so we need a Department of Education." This is often self-serving patronage, but the general point stands that the way regulatory agencies come into being in practice is not because the electorate achieves subject-area expertise. I can see a populist appeal for a Manhattan Project to speed up AI in order to "beat China" (or whatever enemy du jour), but this is not the sort of thing that regulators in permanent bureaucracies do. (Just look at Operation Warp Speed; quite apart from the irony in the name, the FDA and the CDC had to be dragged kicking and screaming to do it.)
3) Governments might use AI to do evil things
In your response you write:
As for deliberate evil, it's worth considering the track record of regimes both historically and in the present day. Even leaving aside the horrors of feudalism, Nazism and twentieth century Communism, Putin is currently pursuing a war of aggression complete with war-crimes, Xi is inflicting unspeakable tortures on captive Uighurs, Kim Jong-Un is maintaining a state which keeps its subjects in grinding poverty and demands constant near-worship, and the list continues. It should be quite obvious, I hope, why the idea of such a regime gaining controllable AI would produce an astronomical suffering risk.
I agree, of course, that these are all terrible evils wrought by governments. But I'm not sure what this has to do with the regulation of AI. The historical incidents you cite would be relevant if the Holocaust had been perpetrated by the German Bureau of Chemical Safety or if the Uighurs were imprisoned by the Chinese Ethnic Affairs Commission. Permanent regulatory bureaucracies are not and never have been responsible for (or even capable of) mission-driven atrocities. They do commit atrocities, but only by preventing access to useful goods (i.e. impeding production).
Finally, one sentence in this section sticks out and makes me think we are talking past each other. You write:
the idea of such a regime gaining controllable AI would produce an astronomical suffering risk
By my lights, this would be a WONDERFUL problem to have. An AI that was controllable by anyone (including Kim Jong-Un, Pol Pot, or Hitler) would, in my estimation, be preferable to a completely unaligned paper clip maximizer. Maybe we disagree here?
4) Liberal democracies are not magic, and we can't expect them to make the right decisions just because of our own political values.
I don't think my OP mentioned liberal democracy, but if I gave that impression then you are quite right I did so in error. You may be referring to my point about China. I did not mean to imply a normative superiority of American or any other democracy, and I regret the lack of clarity. My intent was to make a positive observation that governments do, in fact, mimic each other's regulatory growth. Robin Hanson makes a similar point: governments copy each other largely because of institutional and informal status associations. This observation is neutral with regard to political system. If we announce a FAIRC, I predict that China will follow, and with due haste.
I take Aiyen's concern very, very seriously. I think the most immediate risk is that the AI Regulatory Bureau (AIRB) would regulate real AI safety work, so MIRI wouldn't be able to get anything done. Even if you wrote the law saying "this doesn't apply to AI Alignment research," the courts could interpret that exemption so narrowly that the moment you turn on an actual computer you are now a regulated entity per AIRB Ruling 3A.
In this world, we thought we were making it harder for DeepMind to conduct AI research, but they have plenty of money to throw at compliance, so it barely slows them down. What we actually did was make it illegal for MIRI to operate.
I realize the irony in this. There is an alignment problem for regulation which, while not as difficult as AI alignment, is also quite hard.
These are both supremely helpful replies. Thank you.
The Lancet just published a study that suggests "both low carbohydrate consumption (<40%) and high carbohydrate consumption (>70%) conferred greater mortality risk than did moderate intake." Link: https://www.thelancet.com/action/showPdf?pii=S2468-2667%2818%2930135-X
My inclination is to say that observational studies like this are really not that useful, but if cutting out carbs is bad for life expectancy, I do want to know about it. What does everyone else think?
Really like this. It seems like an instance of the general case of ignoring marginal returns: "People will steal even with police, so why bother having police..." This also means that the flip side to your post is that marginal returns diminish. It's a good investment to have a few cops around to prevent bad actors from walking off with your grand piano--but it's a very bad idea to keep hiring more police until crime is entirely eliminated. Similarly, it's good to write clearly. But if you find yourself obsessing over every word, your efforts are likely to be misplaced.
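A toy sketch of that diminishing-returns point, with an invented concave benefit curve and an invented cost per officer (none of these numbers come from your post or from any data):

```python
# Toy model of diminishing marginal returns: each additional officer prevents
# less crime than the one before, so eventually the marginal benefit of one
# more hire falls below the marginal cost. All numbers are invented.

def crime_prevented(officers: int) -> float:
    """Assumed concave benefit curve (arbitrary units)."""
    return 100 * (1 - 0.9 ** officers)

cost_per_officer = 1.0  # assumed marginal cost, in the same arbitrary units

for n in range(1, 40):
    marginal_benefit = crime_prevented(n) - crime_prevented(n - 1)
    if marginal_benefit < cost_per_officer:
        print(f"Officer #{n} no longer pays for himself: "
              f"marginal benefit {marginal_benefit:.2f} < cost {cost_per_officer:.2f}")
        break
```

The same shape applies to the writing example: the first editing pass buys a lot of clarity; the tenth pass over the same sentence buys almost none.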