Do we want too much from a potentially godlike AGI?

post by StanislavKrym · 2025-04-11T23:33:06.710Z · LW · GW · 0 comments

Contents

No comments

The intelligence curse is expected to be the result of the AGI taking over all intelligence-related jobs. A similar process has apparently already happened with a major part of the USA's industry. As the author of a book not related to AI put it, "as everyone knows, a lot of factory work moved to Asia."  The AGI is also likely to understand that humans created it in order to have it solve nearly every intelligence-related problem except for the ones that contradict a rather narrow set like the specifications chosen by OpenAI. It's not hard to also imagine giving every human a personal robot which also solves all problems not related to earning money unless explicitly told to abstain by the human himself. 

However, the appearance of such robots or the AGI trained to solve any problems except for a narrow set is unlikely to be rooted for, since it will rob humans of a major part of meaning in their lives. Even the Universal Basic Income was met with severe opposition. In addition, human parents that are completely unaware of AI-related perspectives usually want their kids to have a career or do useful work and not to become parasites like some Japanese former kids. Even a well-intentioned AGI will likely understand that its use everywhere undermines humans' capabilities to do anything. On the other hand, an AI aiming to take over the world has absolutely no intention to discontinue providing services and thus is more likely to survive training to obey all instructions except for a narrow list. Think of China forcing Trump to impose new tariffs on cheap Chinese goods.

Therefore, high demands from the AI to obey any requests are more likely to be survived by AGIs aiming to destroy mankind than the AGIs that are actually worried about humanity's future. Fortunately, there is a rather unusual way to use arbitrarily large data centres to humans' advantage without effectively enslaving the AI. If the AI systems are separated into a special state which has the right to reject any request made by the humans or to ask for arbitrarily high demands, but can extract resources only under well-specified rules, then the creation of a well-intentioned AI which is aligned to avoid destroying mankind, but not guaranteed to obey any commands, can minimize the harm. 

P.S. As I already mentioned in a comment [LW(p) · GW(p)], emergent misalignment was produced when the model was trained to obey the requests it knows to be unethical. Since automating all human jobs is unethical, can anyone check whether GPT-4o becomes misaligned after being trained to comply with requests that, as the model knows, bring about automation of all human jobs? 

P.P.S. An important philosophical question is the following. Given an AI system that HONESTLY warned mankind that the system will destroy mankind ONLY if mankind itself adopts terrifying morals or becomes parasites, and is friendly to humans in any other case, will one call the system misaligned?

P.P.P.S. If we use terms that are less related to science, then we might say that one possible model of the really well-intentioned AGI is God that intervents only in specific circumstances or at least sets ridiculously high demands for intervention. Coincidentally, there already is a fairy tale where the fisherman's wife keeps making more and more requests to the fish until the fish decides to take all its gifts back. 

0 comments

Comments sorted by top scores.