This all assumes that the AI is an entity that is capable of being negotiated with.
You can't negotiate with a hurricane, an earthquake, a tsunami, an incoming asteroid, or a supernova. The picture I have always had of "AI doom" is a disaster of that sort, rather than rival persons with vastly superior abilities. That is also a possibility, but working out how ants might negotiate with humans looks like a small part of the ants' survival problem.
Utility theory is significantly more problematic than probability theory.
In both cases, from certain axioms, certain conclusions follow. The difference is in the applicability of those axioms in the real world. Utility theory is supposedly about agents making decisions, but as I remarked earlier in the thread, these are "agents" that make just one decision and stop, with no other agents in the picture.
I have read that Morgenstern was surprised that so much significance was read into the VNM theorem on its publication, when he and von Neumann had considered it to be a rather obvious and minor thing, relegated to the appendix of their book. I have come to agree with that assessment.
[Jeffrey's] theory doesn't involve time, like probability theory. It also applies to just one agent, again like probability theory.
Probability theory is not about agents. It is about probability. It applies to many things, including processes in time.
That people fail to solve the Sleeping Beauty paradox does not mean that probability theory fails. I have never paid the problem much attention, but Ape in the coat's analysis seems convincing to me.
This sounds like an individual matter (which is still an option), but the question is about the collective.
None of us would last very long without the rest.
The question is about zooming out from humanity itself, and viewing it from an out-of-human kind of angle.
Sounds like the viewpoint of the dead.
I think it is an interesting angle, and a reminder that many things we do may not be as altruistic as we thought.
What's altruistic about wanting humanity to survive and flourish? Why would it be? The more humanity flourishes, the more the individuals that make up humanity do. That is what humanity flourishing is.
ETA: The flourishing will be unevenly distributed, as of old.
Well, why want anything? Why not just be dead? Peace of mind guaranteed for ever.
Here's a non-metaphor!
“Cryptobenthics do one thing particularly well: getting eaten.”
I guess the moral of that is, don't be a cryptobenthic.
"2600:1f18:17c:2d43:338d:2669:3fa5:82f8" is an IPv6 address, which one reverse lookup site maps to an Amazon AWS server in Ashburn, Virginia. There could be anything on that machine, and no-one to connect it with. But the URL given does not work in my web browser or in curl
. Looks sketchy to me.
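For what it's worth, the reverse lookup can also be done locally rather than through a website. A minimal sketch in Python (whether anything comes back depends entirely on what PTR record, if any, the address's owner has published):

```python
import socket

# Reverse DNS lookup for the IPv6 address quoted above.
ADDR = "2600:1f18:17c:2d43:338d:2669:3fa5:82f8"

try:
    hostname, _aliases, _addresses = socket.gethostbyaddr(ADDR)
    print(f"{ADDR} resolves back to {hostname}")
except socket.herror:
    print(f"No reverse DNS entry for {ADDR}")
```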
This one is sufficiently egregious that it should be deleted and the author banned. It's at best spam, at worst malware. Fortunately, the obfuscated URL does not actually work.
That was an analogy, a similarity between two things, not an isomorphism.
The mathematics of value that you are asking for is the thing that does not exist yet. People, including me, muddle along as best they can; sometimes at less than that level. Post-rationalists like David Chapman valorise this as "nebulosity", but I don't think 19th century mathematicians would have been well served by that attitude.
It's an interesting open problem.
Here is an analogy. Classical utility theory, as developed by VNM, Savage, and others, the theory about which Eliezer made the searchlight comment, is like propositional calculus. The propositional calculus exists, it's useful, you cannot ever go against it without falling into contradiction, but there's not enough there to do much mathematics. For that you need to invent at least first-order logic, and use that to axiomatise arithmetic and eventually all of mathematics, while fending off the paradoxes of self-reference. And all through that, there is the propositional calculus, as valid and necessary as ever, but mathematics requires a great deal more.
The theory that would deal with the "monsters" that I listed does not yet exist. The idea of expected utility may thread its way through all of that greater theory when we have it, but we do not have it. Until we do, talk of the utility function of a person or of an AI is at best sensing what Eliezer has called the rhythm of the situation. To place over-much reliance on its letter will fail.
I can still be interested, even if I don't have the answers.
The trailer is designed to draw prospective players' attention to the issue, no more than that. If you "don't think current models are sentient", and hence that they are not actually feeling bad, then I don't see a reason for having a problem here, in the current state of the game. If they manage to produce this game and keep upgrading it with the latest AI methods, when will you know if there is a problem?
I do not have an answer to that question.
(I don't think current models are sentient, but the way of thinking "they are digital, so it's totally OK to torture them" is utterly insane and evil)
I don't think the trailer is saying that. It's just showing people examples of what you can do, and what the NPCs can do. Then it's up to the player to decide how to treat the NPCs. AIpeople is creating the platform. The users will decide whether to make Torment Nexi.
At the end of the trailer, the NPCs are conspiring to escape the simulation. I wonder how that is going to be implemented in game terms.
I notice that there also exists a cryptocoin called AIPEOPLE, and a Russian startup based in Cyprus with the domain aipeople dot ru. I do not know if these have anything to do with the AIpeople game. The game itself is made by Keen Software House. They are based in Prague together with their sister company GoodAI.
I don't have one. What would I use it for? I don't think anyone else yet has one, at least not something mathematically founded, with the simplicity and inevitability of VNM. People put forward various ideas and discuss the various "monsters" I listed, but I see no sign of a consensus.
One example would be the generic one from the OP: "As a teenager, I endorsed the view that Z is the highest objective of human existence. … Yeah, it’s a bit embarrassing in hindsight." This hypothetical teenager's values (I suggest, in disagreement with the OP) have changed. Their knowledge about the world has no doubt also changed, but I see no need to postulate some unobservable deeper value underlying their earlier views that has remained unchanged, only their knowledge about Z having changed.
Long-term lasting changes in one's food preferences might also count, but not the observation that whatever someone has for lunch, they are less likely to have again for dinner.
Utility theory is overrated. There is a certain mathematical neatness to it for "small world" problems, where you know all of the possible actions and their possible effects, and the associated probabilities and payoffs, and you are just choosing the best action, once. Eliezer has described the situation as like a set of searchlights coherently pointing in the same direction. But as soon as you try to make it a universal decision theory it falls apart for reasons that are like another set of searchlights pointing off in all directions, such as unbounded utility, St Petersburg-like games, "outcomes" consisting of all possible configurations of one's entire future light-cone, utility monsters, repugnant conclusions, iterated games, multi-player games, collective utility, agents trying to predict each other, and so on, illuminating a landscape of monsters surrounding the orderly little garden of VNM-based utility.
No. I can't make any sense of where that came from.
So e.g. if you sometimes feel like eating one kind of food and other times feel like eating another kind of food, you just think "ah, my food preference arbitrarily changed", not "my situation changed to make so that the way to objectively improve my food intake is different now than it was in the past"?
No, there is simply no such thing as a utility function over foodstuffs.
Expected utility is what you have before the outcome of an action is known. Actual utility is what you have after the outcome is known. Here, the utility function has remained the same and you have acquired knowledge of the outcome.
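In symbols, as a minimal sketch (the notation here is mine):

```latex
% Before the outcome is known: the expected utility of action a,
% taken with respect to a fixed utility function U.
\[
\mathbb{E}[U \mid a] \;=\; \sum_{o} P(o \mid a)\, U(o).
\]
% After the outcome o* is known: the actual utility U(o*).
% U is the same function in both cases; only the knowledge of o has changed.
```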
Someone no longer finding a thing valuable that they used to, has either re-evaluated the thing in the light of new information about it, or changed the value they (their utility function) put on it.
That is just replacing the idea of fixed values with a fixed utility function. But it is just as changeable whatever you call it.
Show me your utility function before you were born.
The core claim in this post is that our brains model the world as though there's a thing called "our values", and try to learn about those values in the usual epistemic way.
I find that a very strange idea, as strange as Plato’s Socrates’ parallel idea that learning is not the acquisition of something new, but recollection of what one had forgotten.
If I try X, anticipating that it will be an excellent experience, and find it disappointing, I have not learned something about my values, but about X.
I have never eaten escamoles. If I try them, what I will discover is what they are like to eat. If I like them, did I always like them? That is an unheard-falling-trees question.
If I value a thing at one period of life and turn away from it later, I have not discovered something about my values. My values have changed. In the case of the teenager we call this process “maturing”. Wine maturing in a barrel is not becoming what it always was, but simply becoming, according to how the winemaker conducts the process.
But people have this persistent illusion that how they are today is how they always were and always will be, and that their mood of the moment is their fundamental nature, despite the evidence of their own memory.
I will look forward to that. I have read the LDSL posts, but I cannot say that I understand them, or guess what the connection might be with destiny and higher powers.
I prefer to see Reality as "nihil supernum" rather than Goodness. Reality does not speak. It promises me nothing. It owes me nothing. If it is not as I wish it to be, it is up to me, and me only, to act to make it more to my liking. There is no-one and nothing to magically make things right. Or wrong, for that matter. There is no-one to complain to, no-one to be grateful to.
This does not have the problem that the Goodness idea has, of how to justify calling it Good to people who are in very bad circumstances. Nor is there any question of dropping a letter and calling it God.
I hardly use Reddit, so I don't know what is intended by the comparison with Reddit 2.0.
You are coming across to me as either a crank or a crackpot, and the "Manic" handle isn't helping. I see no mathematics in what you have posted, and without mathematics, there is no physics. The theory, in so far as it says anything definite, gives no reasons for those particular things. Your insistence that neutrinos have not been detected is supported only by saying there may be other explanations for all the supposed detections, without giving any. This argument could be applied equally well to most of the known subatomic particles, including the gluons that you rely on.
You keep on describing neutrinos as "undetectable". How do you interpret this catalogue of neutrino detectors? BTW, the neutrino idea took shape in 1930-1934, long before the Standard Model was formulated, to explain beta decay.
The question to ask is, what is the measure of the space of physical constants compatible with life? Although that requires some prior probability measure on the space of all hypothetical values of the constants. But the constants are mostly real numbers, and there is no uniform distribution on the reals.
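To spell out that last point, here is a standard sketch, assuming we ask for a translation-invariant probability measure on the reals:

```latex
% If a uniform (translation-invariant) probability measure \mu existed on
% \mathbb{R}, every unit interval would get the same mass c, and then
\[
\mu(\mathbb{R}) \;=\; \mu\Bigl(\,\bigsqcup_{n\in\mathbb{Z}} [n,\,n+1)\Bigr)
\;=\; \sum_{n\in\mathbb{Z}} c \;\in\; \{0,\,\infty\},
\]
% which can never equal 1. So some non-uniform prior has to be chosen,
% and the answer depends on that choice.
```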
It seems to me that, in fact, it’s entirely possible for a coin to come up aardvarks. ...
For all practical purposes, none of that is ever going to happen. Neither is the coin going to be snatched away by a passing velociraptor, although out of doors, it could be snatched by a passing seagull or magpie, and I would not be surprised if this has actually happened.
Outré scenarios like these are never worth considering.
Rolling a standard 6-sided die and getting a 7 has probability zero.
Tossing an ordinary coin and having it come down aardvarks has probability zero.
Every random value drawn from the uniform distribution on the real interval [0,1] has probability zero.
2=3 with probability zero.
2=2 with probability 1.
For any value in the real interval [0,1], the probability of picking some other value from the uniform distribution is 1.
In a mathematical problem, when a coin is tossed, coming down either heads or tails has probability 1.
In practice, 0 and 1 are limiting cases that from one point of view can be said not to exist, but from another point of view, sufficiently low or high probabilities may as well be rounded off to 0 or 1. The test is, is the event of such low probability that its possibility will not play a role in any decision? In mathematics, probabilities of 0 and 1 exist, and if you try to pretend they don't, all you end up doing is contorting your language to avoid mentioning them.
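Formally, for the uniform-distribution example above, modelling it with Lebesgue measure λ on [0,1]:

```latex
\[
P(X = x) \;=\; \lambda(\{x\}) \;=\; 0 \quad \text{for every } x \in [0,1],
\qquad
P\bigl(X \in [0,1]\bigr) \;=\; \lambda\bigl([0,1]\bigr) \;=\; 1.
\]
% A probability-zero event is not impossible (some x always occurs),
% and a probability-one event is not logically necessary.
```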
I think this does sound like you. I would be interested to see your commentary on it. From the title I take it that you think it sounds like you, but do you agree with what ChatGPT!lsusr has written? Does it think like you?
The more prominent you are, the more people want to talk with you, and the less time you have to talk with them. You have to shut them out the moment the cost is no longer worth paying.
blocking high-quality information sources
It seems likely to me that Eliezer blocked you because he has concluded that you are a low-quality information source, no longer worth the effort of engaging with.
getting leverage from destiny/higher powers
Please say more about this. Where can I get some?
Perhaps the solution is simply the “term limit”, but a single check is too fragile. I wonder what other mechanisms may help balance the systems that govern our societies.
A standard thing for dictators to do is to disregard or nullify any merely legal threat to whatever they want to do, whether that is term limits, elections, the constitution, or anything else. That is what a dictator is.
The problem is even more acute if we replace "dictator" by "artificial superintelligence".
This article, like your previous one, does not belong on LessWrong. It does not engage with any of the main concerns of this site. It comes across as just a random essay dropped here. And it has more than a hint of the dead hand of ChatGPT.
There are many psychological theories of development, some of which have been discussed on LW in the context of rationality. What is the reason for bringing this one to our attention?
If making decisions some way incentivizes other agents to become less like LDTs and more like uncooperative boulders, you can simply not make decisions that way.
Another way that those agents might handle the situation is not to become boulders themselves, but to send boulders to make the offer. That is, send a minion to present the offer without any authority to discuss terms. I believe this often happens in the real world, e.g. customer service staff whose main goal, for their own continued employment, is to send the aggrieved customer away empty-handed and never refer the call upwards.
A true story from a couple of days ago. Chocolates were being passed round, and I took one. It had a soft filling with a weird taste that I could not identify, not entirely pleasant. The person next to me had also taken one of the same type, and reading the wrapper, she identified it as apple flavoured. And so it was. It tasted much better when I knew what it was supposed to be.
On another occasion, I took what I thought was an apple from the fruit bowl, and bit into it. It was soft. Ewww! A soft apple is a rotten one. Then I realised that it was a nectarine. Delicious!
Not having read that part of planecrash, the solution I immediately thought of, just because it seemed so neat, was that if offered a fraction x of the money, accept with probability x/(1−x), capped at 1. The other player’s expectation is then (1−x)·x/(1−x) = x for x ≤ 1/2, maximised at x = 1/2. Is Eliezer’s solution better than mine, or mine better than his?
One way in which Eliezer’s is better is that mine does not have an immediate generalisation to all threat games.
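Here is a quick numerical check of my scheme, as a sketch; it assumes the acceptance rule above, capped at probability 1 for offers of half or more:

```python
# Sanity check: with acceptance probability min(x / (1 - x), 1) for an
# offered fraction x, the offerer's expected payoff is maximised by
# offering a fair 50/50 split.

def acceptance_probability(x: float) -> float:
    """Probability of accepting an offer of fraction x of the money."""
    if x >= 0.5:
        return 1.0
    return x / (1.0 - x)

def offerer_expectation(x: float) -> float:
    """Expected payoff to the offerer, who keeps fraction 1 - x."""
    return (1.0 - x) * acceptance_probability(x)

if __name__ == "__main__":
    offers = [i / 100 for i in range(1, 100)]
    best = max(offers, key=offerer_expectation)
    print(f"Best offer for the offerer: {best:.2f}")                  # 0.50
    print(f"Expected payoff there: {offerer_expectation(best):.2f}")  # 0.50
```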
In case 1, if I don't know how to make a safe AGI while preventing an unsafe AGI, and no-one else does (i.e. the current state of the art), what regulations would I be calling for?
There is nothing routine about my dismissal of the text in question. Remember, this is not the work of a writer, skilled or otherwise. It is AI slop (and if the "author" has craftily buried some genuine pearls in the shit, they cannot complain if they go undiscovered).
If you think the part I quoted (or any other part) means something profound, perhaps you could expound your understanding of it. You yourself have written on the unreliability of LLM output, and this text, in the rare moments when it says something concrete, contains just as flagrant confabulations.
The sufficiently skilled writer does not generate foggy texts. Bad writers and current LLMs do so easily.
Yes, once they've given it to you, or contracted to do so.
I am unperturbed. Have a nice day.
Oh, I read some of them. It was like listening to Saruman. Or to draw a non-fictional comparison, an Adam Curtis documentary. There is no point in engaging with Saruman. One might as well argue with quicksand.
The We-sphere and the They-sphere each have a philosophy. We in the We-sphere have rationally concluded that our philosophy is right (or it would not be our philosophy). Where Their philosophy is different, it is therefore irrational and wrong. This is proved by listing all the differences between Our philosophy and Theirs. That They adhere to Their wrong views instead of Our true views proves that They are irrational and closed-minded. But We adhere to Our views, which are right, proving Us to have superior rationality.
I doubt the interviewees are doing anything more than reaching for a word to express "badness" and uttering the first that comes to hand.
I tried to find some concrete exposition in the paper of what the authors mean by key words such as “organism”, “agent”, and so on, but to me the whole paper is fog. Not AI-generated fog, as far as I can tell, but a human sort of fog, the fog of philosophers.
Then I found this in the last paragraph of section 3:
The problem is that such algorithmic systems have no freedom from immediacy, since all their outputs are determined entirely—even though often in intricate and probabilistic ways—by the inputs of the system. There are no actions that emanate from the historicity of internal organization.
Well, that just sinks it. All the LLMs have bags of “historicity of internal organization”, that being their gigabytes of weights, learned from their training, not to mention the millions of tokens worth of context window that one might call “short-term historicity of internal organization”.
The phrase “historicity of internal organization” seems to be an obfuscated way of saying “memory”.
Does any of this make sense?
The yammershanks is a bit sponfargled.
Utilitarianism is not supposed to be applied like this. It is only a perspective. If you apply it everywhere, then there's a much quicker shortcut: we should kill a healthy person and use this person's organs to save several other people who would otherwise be healthy if not for some organ dysfunction.
And Peter Singer would say yes, yes we should. But only in secret, because of the bad effects there would be if people knew they might be chopped for spares. (Which rather contradicts Singer’s willingness to publish that paper, but you’d have to ask Singer about that.)
Is there some Internet Law that says that however extreme the reductio, there is someone who will bite the bullet?
A human universal that you might be missing is the ability to understand things in their context.
I can’t decide what the epistemic status of that post is, but in the same spirit, here’s how to tell the difference between a brown bear and a grizzly. Climb a tree. A brown bear will climb up after you and eat you, while a grizzly will knock down the tree and eat you.
A baby is not discovering who it is as its mind develops. It is becoming who it will be. This process does not stop before death. At no point can one say, “THIS is who I am” and stop there, imagining that all future change is merely discovering what one already was (despite the new thing being, well, new).
IMHO, utility functions only make sense for “small world” problems: local, well-defined, legible situations for which all possible actions and outcomes are known and complete preferences are possible. For “large worlds” the whole thing falls apart, for multiple reasons which have all been often discussed on LW (although not necessarily with the conclusion that I draw from them). For example, the problems of defining collective utility, self-referential decision theories, non-ergodic decision spaces, game theory with agents reasoning about each other, the observed failures of almost everyone to predict the explosion of various technologies, and the impossibility of limiting the large world to anything less than the whole of one’s future light-cone.
I do not think that any of these will yield merely to “better rationality”.
I guess my summary is, 'Use LLMs thoughtfully and deliberately, not sloppily and carelessly.'
That reminds me of a remark attributed to Dijkstra. I forget the exact wording, but it was to the effect that we should make our errors thoughtfully and deliberately, not sloppily and carelessly.
How far do you take this? What else would you have everyone sacrifice to saving lives?
I am currently attending the Early Music Festival in Utrecht, 10 days of concerts of music at least 400 years old. Is everyone involved in this event — the performers whose whole career is in music, the audiences who are devoting their time to doing this and not something else, and all the people organizing it — engaging in dereliction of duty?