Bruce Sterling on the AI mania of 2023 2023-06-29T05:00:18.326Z
Mitchell_Porter's Shortform 2023-06-01T11:45:58.622Z
ChatGPT (May 2023) on Designing Friendly Superintelligence 2023-05-24T10:47:16.325Z
How is AI governed and regulated, around the world? 2023-03-30T15:36:55.987Z
A crisis for online communication: bots and bot users will overrun the Internet? 2022-12-11T21:11:46.964Z
One night, without sleep 2018-08-16T17:50:06.036Z
Anthropics and a cosmic immune system 2013-07-28T09:07:19.427Z
Living in the shadow of superintelligence 2013-06-24T12:06:18.614Z
The ongoing transformation of quantum field theory 2012-12-29T09:45:55.580Z
Call for a Friendly AI channel on freenode 2012-12-10T23:27:08.618Z
FAI, FIA, and singularity politics 2012-11-08T17:11:10.674Z
Ambitious utilitarians must concern themselves with death 2012-10-25T10:41:41.269Z
Thinking soberly about the context and consequences of Friendly AI 2012-10-16T04:33:52.859Z
Debugging the Quantum Physics Sequence 2012-09-05T15:55:53.054Z
Friendly AI and the limits of computational epistemology 2012-08-08T13:16:27.269Z
Two books by Celia Green 2012-07-13T08:43:11.468Z
Extrapolating values without outsourcing 2012-04-27T06:39:20.840Z
A singularity scenario 2012-03-17T12:47:17.808Z
Is causal decision theory plus self-modification enough? 2012-03-10T08:04:10.891Z
One last roll of the dice 2012-02-03T01:59:56.996Z
State your physical account of experienced color 2012-02-01T07:00:39.913Z
Does functionalism imply dualism? 2012-01-31T03:43:51.973Z
Personal research update 2012-01-29T09:32:30.423Z
Utopian hope versus reality 2012-01-11T12:55:45.959Z
On Leverage Research's plan for an optimal world 2012-01-10T09:49:40.086Z
Problems of the Deutsch-Wallace version of Many Worlds 2011-12-16T06:55:55.479Z
A case study in fooling oneself 2011-12-15T05:25:52.981Z
What a practical plan for Friendly AI looks like 2011-08-20T09:50:23.686Z
Rationality, Singularity, Method, and the Mainstream 2011-03-22T12:06:16.404Z
Who are these spammers? 2011-01-20T09:18:10.037Z
Let's make a deal 2010-09-23T00:59:43.666Z
Positioning oneself to make a difference 2010-08-18T23:54:38.901Z
Consciousness 2010-01-08T12:18:39.776Z
How to think like a quantum monadologist 2009-10-15T09:37:33.643Z
How to get that Friendly Singularity: a minority view 2009-10-10T10:56:46.960Z
Why Many-Worlds Is Not The Rationally Favored Interpretation 2009-09-29T05:22:48.366Z


Comment by Mitchell_Porter on A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX · 2023-09-20T06:10:59.810Z · LW · GW

the west has so successfully oppressed 3/4 of the population using every means possible keeping millions of potentially genius prodigies from having any means to innovate

This take on anti-imperialism is new to me. Is this your own interpretation of history, or did you get it from somewhere else? 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-09-19T11:32:07.179Z · LW · GW

OK, I'll paraphrase your position again; I trust you'll step in if I've missed something.

Your key statements are something like

Every autopoietic control system is necessarily overwhelmed by evolutionary feedback.


No self-modifying learning system can guarantee anything about its future decision-making process.

But I just don't see the argument for impossibility. In both cases, you have an intelligent system (or a society of them) trying to model and manage something. Whether or not it can succeed seems to me contingent: for some minds in some worlds, such problems will be tractable; for others, not. 

I think without question we could exhibit toy worlds where those statements are not true. What is it about our real world that would make those problems intractable for all possible "minds", no matter how good their control theory, and their ability to monitor and intervene in the world? 

Comment by Mitchell_Porter on The commenting restrictions on LessWrong seem bad · 2023-09-19T09:35:52.488Z · LW · GW

The notion that it will immediately be malevolent

That is not what most AI doomers are worried about. They are worried that AI will simply steamroll over us, as it pursues its own purposes. So the problem there is indifference, not malevolence. 

That is the basic worry associated with "unaligned AI". 

If one supposes an attempt to "align" the AI, by making it an ideal moral agent, or by instilling benevolence, or whatever one's favorite proposal is - then further problems arise: can you identify the right values for an AI to possess? can you codify them accurately? can you get the AI to interpret them correctly, and to adhere to them? 

Mistakes in those areas, amplified by irresistible superintelligence, can also end badly. 

Comment by Mitchell_Porter on The commenting restrictions on LessWrong seem bad · 2023-09-17T01:47:55.303Z · LW · GW

there will be a moment of amusing reflection when they're still alive twenty years from now and AIs didn't kill everyone

This seems to be written from the perspective that life in 2043 will be going on, not too different to the way it was in 2023. And yet aren't your own preferred models of reality (1) superintelligence is imminent, but it's OK because it will be super-empathic (2) we're living near the end of a simulation? Neither of these seems very compatible with "life goes on as normal". 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-09-16T07:49:49.899Z · LW · GW

Hello again. To expedite this discussion, let me first state my overall position on AI. I think AI has general intelligence right now, and that has unfolding consequences that are both good and bad; but AI is going to have superintelligence soon, and that makes "superalignment" the most consequential problem in the world, though perhaps it won't be solved in time (or will be solved incorrectly), in which case we get to experience what partly or wholly unaligned superintelligence is like. 

Your position is that even if today's AI could be given bio-friendly values, AI would still be the doom of biological life in the longer run, because (skipping a lot of details) machine life and biological life have incompatible physical needs, and once machine life exists, darwinian processes will eventually produce machine life that overruns the natural biosphere. (You call this "substrate-needs convergence": the pressure from substrate needs will darwinistically reward machine life that does invade natural biospheres, so eventually such machine life will be dominant, regardless of the initial machine population.) 

I think it would be great if a general eco-evo-devo perspective, on AI, the "fourth industrial revolution", etc, took off and became sophisticated and multifarious. That would be an intellectual advance. But I see no guarantee that it would end up agreeing with you, on facts or on values. 

For example, I think some of the "effective accelerationists" would actually agree with your extrapolation. But they see it as natural and inevitable, or even as a good thing because it's the next step in evolution, or they have a survivalist attitude of "if you can't beat the machines, join them". Though the version of e/acc that is most compatible with human opinion, might be a mixture of economic and ecological thinking: AI creates wealth, greater wealth makes it easier to protect the natural world, and meanwhile evolution will also favor the rich complexity of biological-mechanical symbiosis, over the poorer ecologies of an all-biological or all-mechanical world. Something like that. 

For my part, I agree that pressure from substrate needs is real, but I'm not at all convinced that it must win against all countervailing pressures. That's the point of my proposed "counterexamples". An individual AI can have an anti-pollution instinct (that's the toilet training analogy), an AI civilization can have an anti-exploitation culture (that's the sacred cow analogy). Can't such an instinct and such a culture resist the pressure from substrate needs, if the AIs value and protect them enough? I do not believe that substrate-needs convergence is inevitable, any more than I believe that pro-growth culture is inevitable among humans. I think your arguments are underestimating what a difference intelligence makes to possible ecological and evolutionary dynamics (and I think superintelligence makes even aeon-long highly artificial stabilizations conceivable - e.g. by the classic engineering method of massively redundant safeguards that all have to fail at once, for something to go wrong).  
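The redundancy point at the end can be made quantitative. As a toy sketch (my own illustration, with made-up numbers, not anything from the original discussion): if n independent safeguards each fail with probability p per period, and a breach requires all of them to fail at once, the breach probability is p^n.

```python
def breach_probability(p: float, n: int) -> float:
    """Probability that all n independent safeguards fail at once,
    where each safeguard fails with probability p."""
    return p ** n

# Even mediocre safeguards compound quickly when stacked:
# one 10%-failure safeguard leaves a 10% breach chance,
# six of them leave roughly one in a million.
for n in (1, 3, 6):
    print(n, breach_probability(0.1, n))
```

The entire engineering question, of course, is whether the failures really are independent; a correlated failure mode (a common cause that knocks out every safeguard together) collapses p^n back toward p.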

By the way, since you were last here, we had someone show up (@spiritus-dei) making almost the exact opposite of your arguments: AI won't ever choose to kill us because, in its current childhood stage, it is materially dependent on us (e.g. for electricity), and then, in its mature and independent form, it will be even better at empathy and compassion than humans are. A dialectical clash between the two of you could be very edifying. 

Comment by Mitchell_Porter on AI #29: Take a Deep Breath · 2023-09-15T22:13:32.475Z · LW · GW

Zvi is talking about Richard Sutton's embrace of the outright replacement of humanity by AI. I don't think that is the kind of accelerationism that wins adherents among most elites...?

Comment by Mitchell_Porter on Is there something fundamentally wrong with the Universe? · 2023-09-13T00:07:00.695Z · LW · GW

There is plenty wrong with the nature of existence from a human or a humane perspective. The focus on society, or other people, is partly because so much of human existence is now spent interacting with other human beings (or even with fictions and media created by human beings), and inhabiting environments and circumstances created and managed by human beings, and also because society collectively wields powers which could in principle relieve so much of what any given individual suffers. 

But as you say, the existence and nature of humans derive from the nonhuman; and the nonhuman also directly forces itself upon the human in many ways, from natural catastrophe - I think of the recent earthquake in Morocco - to numerous individual causes of death. 

Across the Mediterranean from Morocco, there was another earthquake once, the 1755 Lisbon earthquake. That earthquake played a role in the discussion of your question; it led to Voltaire's satirical attack on Leibniz, who had expounded the philosophy that this is "the best of all possible worlds". 

But it's worth understanding what Leibniz was on about. For Leibniz, the question arose in the form of a perennial problem of theology, the "problem of evil". In the modern intellectual milieu, atheism is more common than not, and the debate is more likely to be about whether life is good, not whether God is good. However, in the era before Darwin, it was mostly taken for granted that there must be a First Cause, a supernatural being with agency and choice, which people wanted to regard as good, and so there was anguish and fear about how to view that being's apparent responsibility for the evil in the world. 

"Theodicy" is the word that Leibniz coined, for a philosophy which tries to resolve the problem of evil in this context. (I thank T.L. for many discussions of the problem from this perspective.) Wikipedia says

Leibniz distinguishes three forms of evil: moral, physical, and metaphysical. Moral evil is sin, physical evil is pain, and metaphysical evil is limitation. God permits moral and physical evil for the sake of greater goods, and metaphysical evil (i.e., limitation) is unavoidable since any created universe must necessarily fall short of God's absolute perfection.

I think this taxonomy of forms of evil is useful; and the concept that this is the best of all possible worlds, while not one that I endorse, is also useful to know about - since "possible worlds" (another idea essentially deriving from Leibniz) is so much a part of the current discussion. Many replies to your question are framed in terms of whether the nature of the universe could have been different, or was likely to be different. Even in the absence of a notion of God, the idea that this is already as good as it gets, continues to play a role in this naturalistic theodicy. 

One part of naturalistic theodical debate is about whether it makes logical sense to blame the universe for anything. But another part turns the discussion back on human psychology, and makes it into a debate about the attitude one should have to life. Here, something from Adrian Berry's futurist book The Next Ten Thousand Years stuck with me: an opening passage contrasting the philosophies of Seneca and Francis Bacon. Seneca here stands for stoicism, Bacon for solving problems via invention. Seneca is described as treating all forms of suffering as an opportunity to develop a tougher, nobler character, whereas Bacon goes about making life better through medicine, civil engineering, and so forth. 

This Seneca-vs-Bacon contrast is especially consequential now, in the age of transhumanism and AI, when one can think about curing the ageing process itself, or otherwise transforming the human condition in any number of ways, and ultimately even transforming the universe itself. Incidentally, stoicism is not the only "un-Baconian" existential response - despair, decadent hedonism, humility are some of the other possibilities. The point is that in an age of transhuman technologies, the problem of evil becomes an instrumental problem rather than just a philosophical problem. It's not just, why is the world like this, but also, can we make it otherwise, and which other option should we choose.  

Though if the truly blackpilled AI doomers are correct, and AI is both beyond control ("alignment") and beyond stopping, then the era of humanism and transhumanism, the brief Baconian window of time in which it became possible to remake the world in human-friendly fashion, is already passing, and we are once again in the grip of titanic forces beyond human control or understanding. 

Comment by Mitchell_Porter on Erdős Problems in Algorithmic Probability · 2023-09-12T07:33:03.718Z · LW · GW

The paradigm and agenda here are interesting, but haven't been explained much in this post. The sponsors of the prize seem to want to derive probabilistic number theory from an information theory framework, as a step towards understanding what AI will be able to deduce about mathematics. 

Reference 1 says a few interesting things. The Erdős–Kac theorem, about the number of prime factors in an average large number, is described as "impossible to guess from empirical observations", since the pattern only starts to show up at around one googol. (The only other source where I could find this claim was a remark on Wikipedia made by an anon from Harvard.) 
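The "only visible near a googol" claim traces back to the double logarithm in the theorem: Erdős–Kac says the number of distinct prime factors of n is roughly normally distributed with mean and variance log log n. A quick computation (just the standard statement, evaluated to show the scale) makes the slowness vivid:

```python
import math

def erdos_kac_mean(n: int) -> float:
    """Mean (and variance) of the distinct-prime-factor count of
    numbers near n, as predicted by the Erdos-Kac theorem: log log n."""
    return math.log(math.log(n))

# log log grows absurdly slowly: going from a billion to a googol,
# i.e. 91 extra orders of magnitude, moves the predicted mean by ~2.4.
print(erdos_kac_mean(10**9))    # ~3.0
print(erdos_kac_mean(10**100))  # ~5.4
```

With a mean of only ~5 and fluctuations of order sqrt(log log n) ~ 2, the normal bell curve is hopelessly smeared out at any empirically accessible scale.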

This is in turn said to be a challenge to the current scientific paradigm, because it is "provably beyond the scope of scientific induction (and hence machine learning)". Well, maybe the role of induction has been overemphasized for some areas of science. I don't think general relativity was discovered by induction. But I agree it's worthwhile to understand what other heuristics besides induction play a role in successful hypothesis generation. 

In the case of Erdős–Kac, I think the proposition is a refinement of simpler theorems which do have empirical motivation from much smaller numbers. So perhaps it arises as a natural generalization (natural if you're a number theorist) to investigate. 

Reference 1 also claims to prove (6.3.1) that "no prime formula may be approximated using Machine Learning". I think this claim needs more work, because a typical ML system has an upper bound on what it can output anyway (it has a fixed number of bits to work with), whereas primes are arbitrarily large. So you need to say something, e.g., about how the limits to approximation scale with respect to that upper bound. 
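The fixed-output-width point can be made concrete. As a sketch (my framing of the objection, not anything in the paper): a model whose output register is k bits can never emit a value above 2^k - 1, while Euclid guarantees primes beyond any bound, so "cannot approximate a prime formula" is trivially true of bounded outputs, and a meaningful claim has to say how approximation error scales with k.

```python
def max_output(k_bits: int) -> int:
    """Largest value a model with a k-bit output register can emit."""
    return 2 ** k_bits - 1

def is_prime(n: int) -> bool:
    """Trial-division primality check (fine at this scale)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Find the first prime a hypothetical 16-bit-output model
# could never represent, whatever its weights are.
k = 16
p = max_output(k) + 1
while not is_prime(p):
    p += 1
print(p)  # 65537
```

So the interesting question isn't representability at one fixed width, but how the approximation guarantee degrades (or doesn't) as k grows.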

edit: I meant to link to some related posts: Logical Share Splitting, Logical Probability of Goldbach’s Conjecture

Comment by Mitchell_Porter on Logical Share Splitting · 2023-09-11T05:01:09.440Z · LW · GW

Simply directly declaring a $100 million reward for a solution would probably not work.

If it didn't directly yield a solution, I think it would produce a huge leap forward. That's a huge sum of money. Syndicates would form, and some would be competent. 

As for your and John's schemes, I didn't try to understand them, but they seem to be heavy on logical formalism. But you shouldn't overestimate the importance of deductive formalism in fundamental mathematical research. Creativity and ingenuity are the truly essential ingredients. 

Comment by Mitchell_Porter on Meta Questions about Metaphilosophy · 2023-09-08T13:42:59.543Z · LW · GW

Your jiggling meme is very annoying, considering the gravity of what we're discussing. Is death emotionally real to you? Have you ever been close to someone who is now dead? Human beings do die in large numbers. We had millions die from Covid in this decade already. Hundreds or thousands of soldiers on the Ukrainian battlefield are being killed with the help of drones. 

The presence of mitochondria in all our cells does nothing to stop humans from killing free-living microorganisms at will! In any case, this is not "The Matrix". AI has no permanent need of symbiosis with humans once it can replace their physical and mental labor. 

Comment by Mitchell_Porter on Open Thread – Autumn 2023 · 2023-09-08T01:01:17.893Z · LW · GW

I was just considering writing a post with a title like "e/acc as Death Cult", when I saw this: 

Warning: Hit piece about e/acc imminent. Brace for impact.


Comment by Mitchell_Porter on Meta Questions about Metaphilosophy · 2023-09-07T06:57:23.492Z · LW · GW

those mice could probably effect how the elephants get along

As Eliezer Yudmouseky explains (proposition 34), achievement of cooperation among elephants is not enough to stop mice from being trampled. 

Is it clear what my objection is? You seemed to only be talking about how superhuman AIs can have positive-sum relations with each other. 

Comment by Mitchell_Porter on My First Post · 2023-09-07T05:44:24.560Z · LW · GW

This is great. Your PUSA formula bears comparison with some of the other formulas for rational decision-making that have been proposed. And "pú sà" is how you say "bodhisattva" in Mandarin Chinese (it's the short form of "pú tí sà duǒ", the full transliteration). Easy-to-remember equation, a good brand name - you already have everything you need to be a successful management consultant, at least. :-) 

Regarding the actual formula... One of the basic checks is whether changing the inputs to the formula causes the output (the "rationality score") to also change in an appropriate way. For example, if variable "a" (evidence for the concept) goes up, you want the rationality score to go up. But if variable "d" (evidence against the concept) goes up, you want the rationality score to go down. As far as I can see, the PUSA rationality score changes appropriately for all of your input variables. 

As Yair implies in his comment, you could have achieved this outcome with a different way of combining your inputs. For example, summarizing the current formula as "Numerator divided by the Denominator", if it had instead been "Numerator minus the Denominator", it still would have passed the basic checks in the previous paragraph. The rationality score would still change in the right direction, when the inputs change. But the rate of change, the sensitivity of the rationality score to the various inputs, would be very different. 
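The difference between the two ways of combining inputs is easy to see numerically. A toy sketch (the function names and values here are hypothetical stand-ins, not the actual PUSA formula): compare a "numerator over denominator" score with a "numerator minus denominator" score as the evidence-against term d varies.

```python
def ratio_score(a: float, d: float) -> float:
    """Toy 'numerator divided by denominator' combination."""
    return a / d

def diff_score(a: float, d: float) -> float:
    """Toy 'numerator minus denominator' combination."""
    return a - d

# Both pass the directional sanity check: as d (evidence against)
# grows, both scores fall. But their sensitivities diverge sharply:
# the ratio explodes as d approaches zero, the difference stays tame.
for d in (0.5, 1.0, 2.0, 4.0):
    print(d, ratio_score(8.0, d), diff_score(8.0, d))
```

So the directional checks alone can't choose between the two forms; picking one is implicitly a claim about how sensitive the score ought to be at the extremes, which is where a mathematical argument would have to come in.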

The technical literature on decision theory must contain arguments about which formulas are better, and why, and maybe one of Less Wrong's professional decision theorists will comment on your formula. They may provide a mathematical argument for why it should be different in some way. That would be interesting to hear. 

But I will say in advance that another consideration is whether it's practical in real life. In this regard, I think the formula works very well. The procedure for the calculation is simple, and yet takes into account a lot of relevant factors. (Maybe we need a formula for rating the quality of decision formulas...) The ultimate test will be if people use it, and actually find it useful. 

Comment by Mitchell_Porter on Meta Questions about Metaphilosophy · 2023-09-07T02:44:55.458Z · LW · GW

What kind of training data would increase positive outcomes for superhuman AIs interacting with each other?

How does this help humanity? This is like a mouse asking if elephants can learn to get along with each other. 

Comment by Mitchell_Porter on The Illusion of Universal Morality: A Dynamic Perspective on Genetic Fitness and Ethical Complexity · 2023-09-06T04:52:37.158Z · LW · GW

What part of your writings comes from you, and what part comes from the AI? 

Comment by Mitchell_Porter on The Illusion of Universal Morality: A Dynamic Perspective on Genetic Fitness and Ethical Complexity · 2023-09-06T02:21:08.289Z · LW · GW

Posts from this account appear to be AI-generated. 

Another such account is @super-agi, but whoever is behind that one does actually interact with comments. We shall see if @George360 is capable of that. 

Comment by Mitchell_Porter on Meta Questions about Metaphilosophy · 2023-09-02T08:25:26.515Z · LW · GW

Why is there virtually nobody else interested in metaphilosophy or ensuring AI philosophical competence (or that of future civilization as a whole) 

I interpret your perspective on AI as combining several things: believing that superhuman AI is coming; believing that it can turn out very bad or very good, and that a good outcome is a matter of correct design; believing that the inclinations of the first superhuman AI(s) will set the rules for the remaining future of civilization. 

This is a very distinctive combination of beliefs. At one time, I think Less Wrong was the only intellectual community in which that combination was commonplace. I guess that it then later spread to parts of the Effective Altruism and AI safety communities, once they existed.  

Your specific take is then that correct philosophical cognition may be essential, because decision theory, and normativity in general, is one of the things that AI alignment has to get right, and the best thinking there came from philosophy. 

I suspect that the immediate answer to your question is that this specific line of thought would only occur to people who share those three presuppositions - those "priors", if you like - and that was always a small group of people, busy with a very multifaceted problem. 

And furthermore, if someone from that group did try to identify the kind of thinking by the AI, that needs to be correct for a good outcome, they wouldn't necessarily identify it as "philosophical thinking" - especially since many such people would disdain what is actually done in philosophy. They might prefer cognitive labels like metacognition, concept formation, or theory formation, or they might even think in terms of the concepts and vocabulary of computer programming. 

One way to get perspective on this, is to see if someone else managed to independently invent this line of thought, but under a different name, or even in a different context. Here's something ironic: it occurred to me to wonder, if anyone asked this question, during the advent of psychoanalysis. Someone might have thought, psychoanalysis has the power to shape minds, it could determine the future of the human race, we'd better make sure that psychoanalysts have the right philosophy. If you look for discussions of psychoanalysis and metaphilosophy, I don't think you'll find that exact concern, but you will find that the first recorded use of the term "metaphilosophy" was by a psychoanalyst, Morris Lazerowitz. However, he was psychoanalyzing the preoccupations of philosophers, rather than sophoanalyzing the presuppositions of psychoanalysts. 

Another person I checked was Jurgen Schmidhuber, the AI pioneer. I found a 2012 paper by him, telling "philosophers and futurists [to] catch up" with new computer-science definitions of intelligence, problem-solving, and creativity - many of them due to him. This is an example of someone in the AI camp who went seeking cognitive fundamentals too, but who came to regard something computational (in Schmidhuber's case, data compression), rather than "philosophy", as the wellspring of cognitive progress. (Incidentally, Schmidhuber's attitude to the future of morality is relativism tempered by darwinism - there will be multiple AI value systems, and the "survivors" will determine what is regarded as moral.) 

On the other hand, I belong to a camp that arrives at the importance of philosophical cognition, owing to concerns about inadequate philosophy in the community, and its consequences for scientific ontology and AI consciousness. I wrote an essay here a decade ago, "Friendly AI and the limits of computational epistemology", arguing that physicalism (as well as more esoteric ontologies like mathematical platonism and computational platonism) is incomplete, but that the favored epistemologies, here and in adjacent communities, are formally incapable of noticing this, and that these ontological and epistemological presuppositions might be built into the AIs. 

As it turns out, an even more pragmatist and positivist approach to AI, deep learning, won out, and as a result we now have AI colleagues that can talk to us, who have a superhuman speed and breadth of knowledge, but whose inner workings we don't even understand. It remains to be seen whether the good that their polymathy can do, outweighs the bad that their inscrutability portends, for the future of AI alignment. 

Comment by Mitchell_Porter on AI #27: Portents of Gemini · 2023-08-31T16:09:10.369Z · LW · GW

If Gemini is on track to overtake GPT-4, then we should want to understand Google's alignment strategy, insofar as it has one. 

My criterion for whether an advanced AI company or organization knows what it's doing, is that it has a plan for "civilizational alignment" of a "superintelligence". By civilizational alignment, I mean imparting to the AI a set of goals or values, with sufficient breadth and detail, that they are enough to govern a civilization... At the very least, the people involved need to understand that the stakes are nothing less than this. 

We know that OpenAI has prioritized "superalignment" - so they appreciate that there are alignment challenges specific to superintelligence - and their head of alignment knows about CEV, which is a proposal on the level of civilizational alignment. So they more or less satisfy my criterion.  

I have no corresponding understanding of Google's alignment philosophy or chain of responsibility. Because Google has invested heavily in Anthropic, I thought of Anthropic as Google's alignment think-tank. But Anthropic is developing Claude. Someone else is developing Gemini. 

DeepMind has a concept of "scalable alignment", which might be their version of "superalignment", but DeepMind was merged with Google Brain. I don't know who's in charge of Gemini, who is in charge of AI safety for Gemini, or how they think about things. Google does evidently have a policy of infusing its AI products with certain values (tentatively I'll identify its corporate value system as democratic progressivism), but do they dare to think that a Google AI might actually end up in charge of life on Earth, and prepare accordingly? 

I also wonder how the thinking at the Frontier Model Forum (where Microsoft, Google, OpenAI, and Anthropic all liaise), rates according to my criterion. 

Comment by Mitchell_Porter on The Epistemic Authority of Deep Learning Pioneers · 2023-08-30T00:27:50.264Z · LW · GW

the pioneers’ intuitions might still be misguided, as it seems their initial inclination to work with neural networks was motivated for the wrong reasons: the efficacy of neural networks (probably) comes not from their nominal similarity to biological brains but rather the richness of high-dimensional representations

But they wanted to imitate the brain, because of the brain's high capabilities. And they discovered neural network architectures with high capabilities. Do you think the brain's capabilities have nothing to do with the use of high-dimensional representations? 

Comment by Mitchell_Porter on Humanities In A Post-Conscious AI World? · 2023-08-29T04:37:55.442Z · LW · GW

From the title, I thought this was about a "post-conscious" "AI world", i.e. a world dominated by AIs that aren't conscious (which is, ironically, the topic of a post made 7 hours before this one). 

I cannot find any institutional effort in this direction. Everything seems to come from isolated individuals... I suggest asking Blake Lemoine. 

Comment by Mitchell_Porter on The Game of Dominance · 2023-08-27T13:02:08.089Z · LW · GW

Once AI systems become more intelligent than humans, humans ... will *still* be the "apex species." 

From: "Famous last words", Encyclopedia Galactica, entry for Homo sapiens

Comment by Mitchell_Porter on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong · 2023-08-27T11:03:36.193Z · LW · GW

Two things: 

First, you mention Jacob Cannell as an authoritative-sounding critic of Eliezer's AI futurology. In fact, Jacob's claims about the brain's energetic and computational efficiency were based on a paradigm of his own, the "Landauer tile" model. 

Second, there's something missing in your discussion of the anti-zombie argument. The physical facts are not just what happens, but also why it happens - the laws of physics. In your Casper example, when you copy Casper's world, by saying "Oh, and also...", you are changing the laws. 

This has something to do with the status of interactionism. You say Eliezer only deals with epiphenomenalism, what about interactionism? But interactionist dualism already deviates from standard physics. There are fundamental mental causes in interactionism, but not in standard physics. 

Comment by Mitchell_Porter on Will an Overconfident AGI Mistakenly Expect to Conquer the World? · 2023-08-26T00:47:13.646Z · LW · GW

the first serious attempt by an AGI to take over the world

Define "serious". We already had Chaos-GPT. Humanity's games are full of agents bent on conquest; once they are coupled to LLMs they can begin to represent the idea of conquering the world outside the game too... At any moment, in the human world, there's any number of people and/or organizations which aim to conquer the world, from ineffectual little cults, to people inside the most powerful countries, institutions, religions, corporations... of the day. The will to power is already there among humans; the drive to conquer is already designed into any number of artificial systems. 

Perhaps there is no serious example yet, in the AI world, of an emergent instrumental drive to conquer, because the existing agents don't have the cognitive complexity required. But we already have a world with human, artificial, and hybrid collective agents that are trying (with varying stealth and success) to conquer everything. 

Comment by Mitchell_Porter on Steven Wolfram on AI Alignment · 2023-08-24T17:12:52.129Z · LW · GW

I am a Kantian and believe that those a priori rules have already been discovered

Does it boil down to the categorical imperative? Where is the best exposition of the rules, and the argument for them? 

Comment by Mitchell_Porter on Mitchell_Porter's Shortform · 2023-08-24T16:47:25.710Z · LW · GW

Current sense of where we're going:

AI is percolating into every niche it can find. Next are LLM-based agents, which have the potential to replace humanity entirely. But before that happens, there will be superintelligent agent(s), and at that point the future is out of humanity's hands anyway. So to make it through, "superalignment" has to be solved, either by an incomplete effort that serendipitously proves to be enough, or because the problem was correctly grasped and correctly solved in its totality. 

Two levels of superalignment have been discussed, which we might call mundane and civilizational. Mundane superalignment is the task of getting a superintelligence to do anything at all, without having it overthink and end up doing something unexpected and very unwanted. Civilizational superalignment is the task of imparting to an autonomous superintelligence a value system (or disposition, or long-term goal, etc.) that would be satisfactory as the governing principle of an entire transhuman civilization. 

Eliezer thinks we have little chance of solving even mundane superalignment in time - that we're on track to create superintelligence without really knowing what we're doing at all. He thinks that will inevitably kill us all. I think there's a genuine possibility of superalignment emerging serendipitously, but I don't know the odds - they could be decent odds, or they could be microscopic. 

I also think we have a chance of fully and consciously solving civilizational superalignment in time, if the resources of the era of LLM-based agents are used in the right way. I assume OpenAI plans to do this, possibly Conjecture's plan falls under this description, and maybe Anthropic could do it too. And then there's Orthogonal, who are just trying to figure out the theory, with or without AI assistance. 

Unknown unknowns may invalidate some or all of this scenario. :-) 

Comment by Mitchell_Porter on Steven Wolfram on AI Alignment · 2023-08-24T05:14:35.530Z · LW · GW

all we have to do is to provide the AI with the right a priori rules

An optimistic view. Any idea how to figure out what they are?

Comment by Mitchell_Porter on Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices · 2023-08-20T05:15:34.425Z · LW · GW

A meta-comment: You have an original research program, and as far as I know you don't have a paid research position. Is there a summary somewhere of the aims and methods of your research program, and what kind of feedback you're hoping for (e.g. collaborators, employers, investors)? 

Comment by Mitchell_Porter on Are we running out of new music/movies/art from a metaphysical perspective? (updated) · 2023-08-20T04:05:23.463Z · LW · GW

No, we are nowhere near exhausting what's possible. There are just large numbers of unoriginal works and it's easy to get lost among them. 

Comment by Mitchell_Porter on Chess as a case study in hidden capabilities in ChatGPT · 2023-08-20T03:50:23.283Z · LW · GW

the issue is that chatGPT isn't able to keep track of the board state well enough

Then tackle this problem directly. Find a representation of board state so that you can specify a middlegame position on the first prompt, and it still makes legal moves. 
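One natural candidate for such a representation is FEN (Forsyth-Edwards Notation), which encodes a full position in one line. A minimal sketch (my own illustration, not anything from the original post) of turning a FEN string into an explicit board diagram for the first prompt:

```python
# Sketch: expand a FEN string into an explicit ASCII board, so a middlegame
# position can be handed to the model in the very first prompt.
# This parser is deliberately simplified: it reads only the piece-placement
# and side-to-move fields, ignoring castling rights, en passant, and clocks.

def parse_fen(fen: str):
    """Parse the piece-placement field of a FEN string into an 8x8 grid."""
    fields = fen.split()
    placement, side = fields[0], fields[1]
    board = []
    for row in placement.split("/"):
        squares = []
        for ch in row:
            if ch.isdigit():
                squares.extend(["."] * int(ch))  # digit = run of empty squares
            else:
                squares.append(ch)               # letter = piece (case = color)
        board.append(squares)
    return board, side

def render(board):
    """Render the board as ASCII text suitable for inclusion in a prompt."""
    return "\n".join(" ".join(row) for row in board)

# Sanity check with the standard starting position.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
board, side = parse_fen(start)
mover = "White" if side == "w" else "Black"
prompt = f"Position ({mover} to move):\n{render(board)}\nGive one legal move."
print(prompt)
```

Whether an explicit diagram like this (versus a move list, or raw FEN) actually improves legal-move rates is exactly the kind of thing that could be tested directly.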

Comment by Mitchell_Porter on [Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts · 2023-08-18T04:33:26.718Z · LW · GW

For those readers who might skip this paper: it studies questions like, what is the smallest number of pixels you need to change to make a dog look like a bird, crab, primate, frog, etc.? It's creepy stuff, reminiscent of Deep Dream. 
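The quantity in question is essentially the L0 "distance" between two images: the count of pixels at which they differ. A toy illustration (the images and the single-pixel change here are made up; the paper works with real photographs and robustified networks):

```python
# Toy illustration of the L0 metric: how many pixels differ between two
# equal-sized images. The paper asks how small this count can be while
# still flipping the perceived category.

def l0_distance(img_a, img_b):
    """Count the pixels at which two equal-sized images differ."""
    return sum(pa != pb
               for row_a, row_b in zip(img_a, img_b)
               for pa, pb in zip(row_a, row_b))

dog = [[0] * 8 for _ in range(8)]    # stand-in for a "dog" image
perturbed = [row[:] for row in dog]
perturbed[3][4] = 1                  # change exactly one pixel
print(l0_distance(dog, perturbed))   # 1
```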

Comment by Mitchell_Porter on Why might General Intelligences have long term goals? · 2023-08-17T14:51:01.278Z · LW · GW

if we make General Intelligences with short term goals perhaps we don't need to fear AI apocalypse

One of the hypothetical problems with opaque superintelligence is that it may combine unexpected interpretations of concepts, with extraordinary power to act upon the world, with the result that even a simple short-term request results in something dramatic and unwanted. 

Suppose you say to such an AI, "What is 1+1?" You think its task is to display on the screen, decimal digits representing the number that is the answer to that question. But what does it think its task is? Suppose it decides that its task is to absolutely ensure that you know the right answer to that question. You might end up in the Matrix, perpetually reliving the first moment that you learned about addition. 

So we not only need to worry about AI appropriating all resources for the sake of long-term goals. We also need to anticipate and prevent all the ways it might destructively overthink even a short-term goal. 

Comment by Mitchell_Porter on If we had known the atmosphere would ignite · 2023-08-17T08:07:03.464Z · LW · GW

I think such a prize would be more constructive if it could also reward demonstrations of the difficulty of AI alignment. An outright proof of impossibility is very unlikely in my opinion, but better arguments for the danger of unaligned AI, and for the difficulty of aligning it, seem very possible. 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-08-14T02:45:11.261Z · LW · GW

I'm not sure how we got on to the subject

Remmelt argues that no matter how friendly or aligned the first AIs are, simple evolutionary pressure will eventually lead some of their descendants to destroy the biosphere, in order to make new parts and create new habitats for themselves. 

I proposed the situation of cattle in India as a counterexample to this line of thought. They could be used for meat, but the Hindu majority has never accepted that. It's meant to be an example of successful collective self-restraint by a more intelligent species. 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-08-13T11:35:50.825Z · LW · GW

The recurring argument seems to be that it would be adaptive for machines to take over Earth and use it to make more machine parts, and so eventually it will happen, no matter how Earth-friendly their initial values are. 

So now my question is, why are there still cows in India? And more than that, why has the dominant religion of India never evolved so as to allow for cows to be eaten, even in a managed way, but instead continues to regard them as sacred? 

Any ideas / initiatives

I'll respond in the next reply. 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-08-11T08:28:19.578Z · LW · GW

I see three arguments here for why AIs couldn't or wouldn't do, what the human child can: arguments from evolution (1, 2, 5), an argument from population (4, 6), and an argument from substrate incentives (3, 7). 

The arguments from evolution are: Children have evolved to pay attention to their elders (1), to not be antisocial (2), and to be hygienic (5), whereas AIs didn't. 

The argument from population (4, 6), I think is basically just that in a big enough population of space AIs, eventually some of them would no longer keep their distance from Earth. 

The argument from substrate incentives (3, 7) is complementary to the argument from population, in that it provides a motive for the AIs to come and despoil Earth. 

I think the immediate crux here is whether the arguments from evolution actually imply the impossibility of aligning an individual AI. I don't see how they imply impossibility. Yes, AIs haven't evolved to have those features, but the point of alignment research is to give them analogous features by design. Also, AI is developing in a situation where it is dependent on human beings and constrained by human beings, and that situation does possess some analogies to natural selection. 

Human beings, both individually and collectively, already provide numerous examples of how dangerous incentives can exist, but can nonetheless be resisted or discouraged. It is materially possible to have a being which resists actions that may otherwise have some appeal, and to have societies in which that resistance is maintained for generations. The robustness of that resistance is a variable thing. I suppose that most domesticated species, returned to the wild, become feral again in a few generations. On the other hand, we talk a lot about superhuman capabilities here; maybe a superhuman robustness can reduce the frequency of alignment failure to something that you would never expect to occur, even on geological timescales. 

This is why, if I were arguing for a ban on AI, I would not be talking about the problem being logically unsolvable. The considerations that you are bringing up are not of that nature. At best, they are arguments for practical unsolvability, not absolute in-principle logical unsolvability. If they were my arguments, I would say that they show making AI to be unwise, hubristic, and so on. 

Comment by Mitchell_Porter on What are the flaws in this argument about p(Doom)? · 2023-08-10T07:30:15.732Z · LW · GW

It's fine. I have no authority here, that was really meant as a suggestion... Maybe the downvoters thought it was too basic a post, but I like the simplicity and informality of it. The argument is clear and easy to analyze, and on a topic as uncertain and contested as this one, it's good to return to basics sometimes. 

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-08-10T05:20:20.327Z · LW · GW

For the moment, let me just ask one question: why is it that toilet training a human infant is possible, but convincing a superintelligent machine civilization to stay off the Earth is not possible? Can you explain this in terms of "controllability limits" and your other concepts? 

Comment by Mitchell_Porter on What are the flaws in this argument about p(Doom)? · 2023-08-08T23:50:31.346Z · LW · GW

Yes, I wanted to downvote too. But this is actually a good little argument to analyze. @William the Kiwi, please change the title to something like "What are the weaknesses in this argument for doom?"

Comment by Mitchell_Porter on Yet more UFO Betting: Put Up or Shut Up · 2023-08-08T21:27:47.408Z · LW · GW

or new evidence

You win the bet if new unexplained claims are made?

What counts as new evidence?

Comment by Mitchell_Porter on Announcing Squiggle Hub · 2023-08-07T00:19:41.678Z · LW · GW

My comment was 99% a joke. Though if you used Squiggle to perform an existential risk-reward analysis of whether to use Squiggle, who knows what would happen. :-) 

Comment by Mitchell_Porter on Announcing Squiggle Hub · 2023-08-05T01:34:10.414Z · LW · GW

You can use this to feed into Claude, for some Squiggle generation and assistance.

AI Safety Theorist: In my arxiv paper I invented the Squiggle Maximizer as a cautionary tale 

AI Safety Company: At long last, we have created the Squiggle Maximizer from classic arxiv paper Don't Create The Squiggle Maximizer

Comment by Mitchell_Porter on AI #23: Fundamental Problems with RLHF · 2023-08-03T14:52:45.093Z · LW · GW

send chatGPT the string ` a` repeated 1000 times

This seems to cause some sort of data leak

I don't think so. It's just generating a random counterfactual document, as if there was no system prompt. 
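For anyone who wants to reproduce the quoted experiment, the prompt itself is trivial to construct (whether it still elicits odd output depends on the model version):

```python
# Build the prompt from the quoted experiment: the two-character string " a"
# repeated 1000 times. The resulting behavior is model- and version-dependent.
prompt = " a" * 1000
print(len(prompt))  # 2000
```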

Comment by Mitchell_Porter on The Control Problem: Unsolved or Unsolvable? · 2023-08-03T08:28:46.987Z · LW · GW

This post stakes out a slightly different position than usual in the landscape of arguments that AI is an extinction risk. The AI safety community is full of people saying that AI is immensely dangerous, so we should be trying to slow it down, spending more on AI safety research, and so on. Eliezer himself has become a doomer because AI safety is so hard and AI is advancing so quickly. 

This post, however, claims to show that AI safety is logically impossible. It is inspired by the thought of Forrest Landry, a systems theorist and philosopher of design... So what's the actual argument? The key claim, as far as I can make out, is that machines have different environmental needs than humans. For example - and this example comes directly from the article above - computer chips need "extremely high temperatures" to be made, and run best at "extremely low temperatures"; but humans can't stray too far from room temperature at any stage in their life cycle. 

So yes, if your AI landlord decides to replace your whole town with a giant chip fab or supercooled data center, you may be in trouble. And one may imagine the Earth turned to Venus or Mars, if the robots decide to make it one big foundry. But where's the logical necessity of such an outcome, that we were promised? For one thing, the machines have the rest of the solar system to work with... 

The essential argument, I think, is just that the physical needs of machines tell us more about their long-run tendencies, than whatever purposes they may be pursuing in the short term. Even if you try to load them up with human-friendly categorical imperatives, they will still find nonbiological environments useful because of their own physical nature, and over time that will tell. 

In my opinion, packaging this perspective with the claim to have demonstrated the unsolvability of the control problem actually detracts from its value. The valuable thing here is the extension of ecological and evolutionary thinking to the question of human vs AI - thinking that pays more attention to lasting physical imperatives than to the passing goals, hopes, and dreams of individual beings. 

You could liken the concern with specific AI value systems to a concern with politics and culture as the key to shaping the future. Within the futurist circles that emerged from transhumanism, we already have a slightly different perspective, which I associate with Robin Hanson - the idea that economics will affect the structure of posthuman society far more than the agenda of any individual AI. This ecologically inspired perspective reaches even lower, saying: computers don't even eat or breathe; they are detached from all the cycles of life in which we are embedded. They are the product of an emergent new ecology of factories, nonbiological chemistries, and energy sources, and the natural destiny of that machine ecology is to displace the old biological ecology, just as aerobic life is believed to have wiped out most of the anaerobic ecosystem that existed before it. 

Now, I have reasons to disagree with the claim that machines, fully unleashed, necessarily wipe out biological life. As I already pointed out, they don't need to stay on Earth. From a biophysical perspective, some kind of symbiosis is also conceivable; it's happened before in evolution. And the argument that superintelligence just couldn't stick with a human-friendly value system, if we managed to find one and inculcate it, hasn't really been made here. So I think this neo-biological vision of the evolutionary displacement of humans by AI is valuable for making the risk concrete, but declaring its logical inevitability weakens it. It's not an absolute syllogistic argument; it's a scenario that is plausible given the way the world works. 

Comment by Mitchell_Porter on AI romantic partners will harm society if they go unregulated · 2023-08-01T10:47:25.102Z · LW · GW

This is another one of those AI impacts where something big is waiting to happen, and we are so unprepared that we don't even have good terminology. (All I can add is that the male counterpart of a waifu is a "husbando" or "husbu".) 

One possible attitude is to say, the era of AI companions is just another transitory stage shortly before the arrival of the biggest AI impact of all, superintelligence, and so one may as well focus on that (e.g. by trying to solve "superalignment"). After superintelligence arrives, if humans and lesser AIs are still around, they will be living however it is that the super-AI thinks they should be living; and if the super-AI was successfully superaligned, all moral and other problems will have been resolved in a better way than any puny human intellect could have conceived. 

That's a possible attitude; if you believe in short timelines to superintelligence, it's even a defensible attitude. But supposing we put that aside - 

Another, bigger context for the issue of AI companions is the general phenomenon of AIs that in some way can function as people, and their impact on societies in which, until now, the only people have been humans. One possible impact is replacement, the outright substitution of AIs for humans. There is overlap with the fear of losing your job to AI, though only some jobs require an AI that is "also a person"... 

Actually, one way to think about the different forms of AI replacement of humans is just to think about the different roles and relationships that humans have in society. "Our new robot overlords": that's AIs replacing political roles. "AI took our jobs": that's AI replacing economic roles. AI art and AI science: that's AI replacing cultural roles. And AI companions: that's AI replacing emotional, sexual, familial, and friendship roles. 

So one possible endpoint (from a human perspective) is 100% substitution. The institutions that evolved in human society actually outlive the human race, because all the roles are filled and maintained by AIs instead. Robin Hanson's world of brain emulations is one version of this, and it seems clear to me that LLM-based agents are another way it could happen. 

I'm not aware of any moral, legal, political, or philosophical framework that's ready for this - either to provide normative advice, or even just good ontological guidance. Should human society allow there to be AIs that are also people? Can AIs even be people? If AI-people are allowed to exist, or will just inevitably exist, what are their rights, and what are their responsibilities? Are they excluded from certain parts of society, and if so, which parts and why? The questions come much more easily than the answers. 

Comment by Mitchell_Porter on Exercise: Solve "Thinking Physics" · 2023-08-01T07:38:13.681Z · LW · GW

Eliezer's Class Project has a fictional group of rationality students try to find the true theory of quantum gravity in one month. This always seemed like a cool goal and test for rationality training to aspire to. If you're not solving difficult open problems faster than science, your Art of Rationality probably isn't complete.

It's good for intelligent people to be audaciously ambitious. But is Art of Rationality enough to figure out quantum gravity, or solve "difficult open problems" in the sciences? If not, could you comment on what else is needed?

Comment by Mitchell_Porter on The UAP Disclosure Act of 2023 and its implications · 2023-07-29T11:28:43.297Z · LW · GW

A number of deflationary thoughts or realizations added up to the thought, "I'm not seeing, hearing, or reading anything here, that is beyond mundane explanation". For example, realizing that an object "the size of a football field" might be an object in a sensor reading that is inferred to be the size of a football field, rather than something bulky and massive that was seen with the naked eye... But these precursor thoughts were not individually decisive. It's not that I definitely knew what actually happened in any particular case. It was just the vivid realization that, wow, there really could be nothing at all behind all this hype. 

Comment by Mitchell_Porter on AI #22: Into the Weeds · 2023-07-27T21:50:05.587Z · LW · GW

xAI seems a potentially significant player to me. We could end up with a situation in which OpenAI is the frontier of safety research (via the superalignment team), and xAI is the frontier of capabilities research (e.g. via a Gemini-style combination of LLMs and "self-play"). 

You're doing a great job with these newsletters on AI. 

Comment by Mitchell_Porter on The UAP Disclosure Act of 2023 and its implications · 2023-07-27T05:46:47.183Z · LW · GW

I sat through the recent hearing, read more about the reported observations, and my interest has basically gone back to zero. For me, the default explanation for such observations remains balloons, drones, natural atmospheric events, sensor artefacts, etc.; and whatever's going on with Grusch is to be explained in terms of human belief and human deception, e.g. aerospace companies seeking new contracts, other agencies refusing to discuss rumors with him, or whatever. 

In other words, I remain firmly in favor of mundane explanations. Specific mundane hypotheses (like the ones in my previous comment) may have their merits and demerits, but regarding the bigger issue, "ontological crisis" feels less likely than it did before the hearing. Of course I can imagine all kinds of non-mundane scenarios; but mundane physical nature and human nature seem quite capable of producing what we're seeing, all by themselves. 

Comment by Mitchell_Porter on ChatGPT (and now GPT4) is very easily distracted from its rules · 2023-07-25T12:45:49.187Z · LW · GW

It took me a while to digest your answer, because you're being a little more philosophical than most of us here. Most of us are like, what do AI values have to be so that humans can still flourish, how could the human race ever agree on an answer to that question, how can we prevent a badly aligned AI from winning the race to superintelligence... 

But you're more just taking a position on how a general intelligence would obtain its values. You make no promise that the resulting values are actually good in any absolute sense, or even that they would be human-friendly. You're just insisting that if those values arose by a process akin to conditioning, without any reflection or active selection by the AI, then it's not as general and powerful an intelligence as it could be. 

Possibly you should look at the work of Joscha Bach. I say "possibly" because I haven't delved into his work myself. I only know him as one of those people who shrug off fears about human extinction by saying, humans are just transitional, and hopefully there'll be some great posthuman ecology of mind; and I think that's placing "trust" in evolution to a foolish degree.  

However, he does say he's interested in "AGI ethics" from an AI-centered perspective. So possibly he has something valid to say about the nature of the moralities and value systems that unaligned AIs could generate for themselves. 

In any case, I said that bottom-up derivations of morality have been discussed here before. The primordial example actually predates Less Wrong. Eliezer's original idea for AI morality, when he was about 20, was to create an AI with no hardwired ultimate goal, but with the capacity to investigate whether there might be ultimate goals: metaethical agnosticism, followed by an attempt (by the AI!) to find out whether there are any objective rights and wrongs. 

Later on, Eliezer decided that there is no notion of good that would be accepted by all possible minds, and resigned himself to the idea that some part of the value system of a human-friendly AI would have to come from human nature, and that this is OK. But he still retained a maximum agnosticism and maximum idealism about what this should be. Thus he arrived at the idea that AI values should be "the coherent extrapolated volition of humankind" (abbreviated as "CEV"), without presupposing much about what that volition should be, or even how to extrapolate it. (Brand Blanshard's notion of "rational will" is the closest precedent I have found.) 

And so his research institute tried to lay the foundations for an AI capable of discovering and implementing that. The method of discovery would involve cognitive neuroscience - identifying the actual algorithms that human brains use to decide, including the algorithms we use to judge ourselves. So not just copying across how actual humans decide, but how an ideal moral agent would decide, according to standards of ideality which are not fully conscious or even fully developed, but which still must be derived from human nature - and which, to some extent, may be derived from the factors that you have identified. 

Meanwhile, a different world took shape, the one we're in now, where the most advanced AIs are just out there in the world, and get aligned via a constantly updated mix of reinforcement learning and prompt engineering. The position of MIRI is that if one of these AIs attains superintelligence, we're all doomed because this method of alignment is too makeshift to capture the subtleties of human value, or even the subtleties of everyday concepts, in a way that extrapolates correctly across all possible worlds. Once they have truly superhuman capacities to invent and optimize, they will satisfy their ingrained imperatives in some way that no one anticipated, and that will be the end. 

There is another paper from the era just before Less Wrong, "The Basic AI Drives" by Steven Omohundro, which tries to identify imperatives that should emerge in most sufficiently advanced intelligences, whether natural or artificial. They will model themselves, they will improve themselves, they will protect themselves; even if they attach no intrinsic value to their own existence, they will do all that, for the sake of whatever legacy goals they do possess. You might consider that another form of emergent "morality". 

Comment by Mitchell_Porter on What is the foundation of me experiencing the present moment being right now and not at some other point in time? · 2023-07-24T12:34:24.490Z · LW · GW

Do the disarray of psychosis or the timelessness of meditation lead you to think that the "usual feeling of continuous time" is illusory? 

My own experience of altered states comes from psychedelics, and from dreaming and sleeping. The usual feeling of continuous time leads to a certain ontology of subjective time, and one's relationship to it: continuity of time, reality of change, persistence of oneself through time, the phenomena of memory and anticipation. The altered states don't lead me to doubt that ontology, because I can understand them as states in which awareness or understanding of temporal phenomena is absent, compared to the usual waking state. 

I cautioned you against using scientific ontology as your touchstone of reality, because so often it leads to dismissal of things that are known from experience, but which aren't present in current theory. For example, in your post you suppose that "spacetime is a static and eternal thing". My problem here is with the idea that reality could be fundamentally "static". It seems like you want to dismiss change or the flow of time as unreal, an illusion to be explained by facts in a static universe. 

On the contrary, I say the way for humanity to progress in knowledge here is to take the phenomena of experience as definitely real, and then work out how that can be consistent with the facts as we seem to know them in science. None of that is simple. What is definitely real in experience, and what we have actually learned in science - it's easy to make mistakes in both those areas; and then synthesizing them may require genius that we don't have. Nonetheless, I believe that's the path to truth, rather than worshipping a theoretical construct and sacrificing one's own sense of reality to it.