# A system of infinite ethics

post by Chantiel · 2021-10-29T19:37:42.828Z · LW · GW · 60 comments

## Contents

- My ethical system
- Infinitarian paralysis
- Fanaticism
- Preserving the spirit of aggregate consequentialism
- Distortions
- 60 comments

One unresolved problem in ethics is that aggregate consequentialist ethical theories tend to break down if the universe is infinite. An infinite universe could contain both an infinite amount of good and an infinite amount of bad. If so, you are unable to change the total amount of good or bad in the universe, which can cause aggregate consequentialist ethical systems to break.

A variety of methods have been considered to deal with this. However, to the best of my knowledge, all proposals either have severe negative side-effects or are intuitively undesirable for other reasons.

Here I propose a system of aggregate consequentialist ethics intended to provide reasonable moral recommendations even in an infinite universe. I would like to thank JBlack and Miranda Dixon-Luinenburg for helpful feedback on earlier versions of this work.

My ethical system is intended to satisfy the desiderata for infinite ethical systems specified in Nick Bostrom's paper, "Infinite Ethics". These are:

- Resolving infinitarian paralysis. It must not be the case that all humanly possible acts come out as ethically equivalent.
- Avoiding the fanaticism problem. Remedies that assign lexical priority to infinite goods may have strongly counterintuitive consequences.
- Preserving the spirit of aggregative consequentialism. If we give up too many of the intuitions that originally motivated the theory, we in effect abandon ship.
- Avoiding distortions. Some remedies introduce subtle distortions into moral deliberation.

I have yet to find a way in which my system fails any of the above desiderata. Of course, I could have missed something, so feedback is appreciated.

# My ethical system

First, I will explain my system.

My ethical theory is, roughly, "Make the universe one agents would wish they were born into".

By this, I mean, suppose you had no idea which agent in the universe you would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe. Consider having a bounded quantitative measure of your general satisfaction with life, for example, a utility function. Then try to make the universe such that the expected value of your life satisfaction is as high as possible if you conditioned on you being an agent in this universe, but didn't condition on anything else. (Also, "universe" above means "multiverse" if there is one.)

In the above description I didn't provide any requirement for the agent to be sentient or conscious. Instead, all it needs is preferences. If you wish, you can modify the system to give higher priority to the satisfaction of agents that are sentient or conscious, or you can ignore the welfare of non-sentient or non-conscious agents entirely.

Calculate satisfaction as follows. Imagine hypothetically telling an agent everything significant about the universe, and then optionally giving them infinite processing power and infinite time to think. Ask them, "Overall, how satisfied are you with that universe and your place in it"? That is the measure of their satisfaction with the universe. Giving them infinite processing power isn't strictly necessary, and doesn't do the heavy lifting of my ethical system. But it could be helpful for allowing creatures time to reflect on what they really want.

It's not entirely clear how to assign a prior over situations in the universe you could be born into. Still, I think it's reasonably intuitive that there would be some high-entropy situations among the different situations in the universe. This is all I assume for my ethical system.

Now I'll give some explanation of what this system recommends.

First off, my ethical system requires you to use a decision theory other than causal decision theory. In general, you can't have any causal effect on the moral desirability of the universe as I defined it, which would lead to infinitarian paralysis. However, you can still have acausal effects, so decision theories that take these effects into account can still work.

Suppose you are in a position to do something to help the creatures of the world or our section of the universe. For example, suppose you have the ability to create friendly AI, and you're using my ethical system to consider whether to do it. If you do decide to do it, then that logically implies that any other agent sufficiently similar to you and in sufficiently similar circumstances would also do it. Thus, if you decide to make the friendly AI, then the expected value of an agent in circumstances of the form, "In a world with someone very similar to <insert description of yourself> who has the ability to make safe AI" is higher. And the prior probability of ending up in such a world is non-zero. Thus, by deciding to make the safe AI, you can acausally increase the total moral value of the universe, and so my ethical system would recommend doing it.

Similarly, the system also allows you to engage in acausal trades to improve parts of the universe quite unlike your own. For example, suppose there are some aliens who are indifferent to the suffering of other creatures and only care about stacking pebbles. And you are considering making an acausal trade with them in which they will avoid causing needless suffering in their section of the universe if you stack some pebbles in your own section. By deciding to stack the pebbles, you acausally make other agents in sufficiently similar circumstances to yours also stack pebbles, and thus make it more likely that the pebble stackers would avoid causing needless suffering. Thus, the expected value of life satisfaction of a creature in the circumstances, "a creature vulnerable to suffering that is in a world of pebble-stackers who don't terminally value avoiding suffering" would increase. If the harm (if any) of stacking some pebbles is sufficiently small and the benefits to the creatures in that world are sufficiently large, then my ethical system could recommend making the acausal trade.

The system also values helping as many agents as possible. If you only help a few agents, the prior probability of an agent ending up in situations just like those agents would be low. But if you help a much broader class of agents, the effect on the prior expected life satisfaction would be larger.

These all seem like reasonable moral recommendations.

I will now discuss how my system does on the desiderata.

# Infinitarian paralysis

Some infinite ethical systems result in what is called "infinitarian paralysis". This is the state of an ethical system being indifferent in its recommendations in worlds that already have infinitely large amounts of both good and bad. If there's already an infinite amount of both good and bad, then our actions, using regular cardinal arithmetic, are unable to change the amount of good and bad in the universe.

My system does not have this problem. To see why, remember that my system says to maximize the expected value of your life satisfaction given you are in this universe but not conditioning on anything else. And the measure of life satisfaction was stated to be bounded, say to be in the range [0, 1]. Since any agent's life satisfaction lies in [0, 1], the expected life satisfaction of an agent in an infinite universe must also lie in [0, 1]. So, as long as a finite universe doesn't have an expected life satisfaction of 0, an infinite universe can have at most finitely more moral value than it.

To say it another way, my ethical system provides a function mapping from possible worlds to their moral value. And this mapping always produces outputs in the range [0, 1]. So, trivially, you can see that no universe can have infinitely more moral value than another universe with non-zero moral value. Infinity just isn't in the range of my moral value function.
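The boundedness argument can be written out as a short derivation. This is only a sketch under notation that is not in the original post: $S_U$ is an assumed countable set of situations an agent could be born into in universe $U$, $p_U$ is the prior over those situations (assuming such a prior can be made well defined, which the post leaves open), and $u(s) \in [0, 1]$ is the life satisfaction in situation $s$.

$$
V(U) \;=\; \sum_{s \in S_U} p_U(s)\, u(s), \qquad 0 \le u(s) \le 1 \ \text{for all } s
$$

$$
0 \;\le\; V(U) \;\le\; \sum_{s \in S_U} p_U(s) \;=\; 1
\quad\Longrightarrow\quad
\lvert V(U_1) - V(U_2) \rvert \;\le\; 1 \ \text{for any two universes } U_1, U_2
$$

Under these assumptions the difference in moral value between any two universes is at most 1, whether or not either universe is infinite.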

# Fanaticism

Another problem in some proposals of infinite ethical systems is that they result in being "fanatical" in efforts to cause or prevent infinite good or bad.

For example, one proposed system of infinite ethics, the extended decision rule, has this problem. Let g represent the statement, "there is an infinite amount of good in the world and only a finite amount of bad". Let b represent the statement, "there is an infinite amount of bad in the world and only a finite amount of good". The extended decision rule says to do whatever maximizes P(g) - P(b). If there are ties, ties are broken by choosing whichever action results in the most moral value if the world is finite.

This results in being willing to incur any finite cost to adjust the probability of infinite good and finite bad even very slightly. For example, suppose there is an action that, if done, would increase the probability of infinite good and finite bad by 0.000000000000001%. However, if it turns out that the world is actually finite, it will kill every creature in existence. Then the extended decision rule would recommend doing this. This is the fanaticism problem.
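The fanaticism of the extended decision rule can be illustrated with a small sketch. The `Action` class, the action names, and all the numbers below are hypothetical and chosen only for illustration; exact fractions are used because a difference of 0.000000000000001% is too small to rely on in floating point.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Action:
    name: str
    p_g: Fraction       # P(g): infinite good and only finite bad
    p_b: Fraction       # P(b): infinite bad and only finite good
    finite_value: int   # hypothetical moral value if the world turns out finite


def extended_decision_rule(actions):
    """Maximize P(g) - P(b); break ties by the finite-world value."""
    return max(actions, key=lambda a: (a.p_g - a.p_b, a.finite_value))


# 0.000000000000001% written as an exact fraction (1e-15 percent = 1e-17).
tiny = Fraction(1, 10**17)

do_nothing = Action("do nothing", Fraction(1, 10), Fraction(1, 10), 0)
# Raises P(g) by a tiny amount, but is catastrophic if the world is finite.
gamble = Action("gamble", Fraction(1, 10) + tiny, Fraction(1, 10), -10**12)

choice = extended_decision_rule([do_nothing, gamble])
print(choice.name)  # prints "gamble"
```

Because the finite-world value is consulted only to break exact ties, no finite cost, however large, can outweigh even the smallest gain in P(g) - P(b).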

My system doesn't place any especially high importance on adjusting the probabilities of infinite good and/or infinite bad. Thus, it doesn't have this problem.

# Preserving the spirit of aggregate consequentialism

Aggregate consequentialism is based on certain intuitions, like "morality is about making the world as best as it can be", and, "don't arbitrarily ignore possible futures and their values". But finding a system of infinite ethics that preserves intuitions like these is difficult.

One infinite ethical system, infinity shades, says to simply ignore the possibility that the universe is infinite. However, this conflicts with our intuition about aggregate consequentialism. The big intuitive benefit of aggregate consequentialism is that it's supposed to actually systematically help the world be a better place in whatever way you can. If we're completely ignoring the consequences of our actions on anything infinity-related, this doesn't seem to be respecting the spirit of aggregate consequentialism.

My system, however, does not ignore the possibility of infinite good or bad, and thus is not vulnerable to this problem.

I'll provide another conflict with the spirit of consequentialism. Another infinite ethical system says to maximize the expected amount of goodness of the causal consequences of your actions minus the amount of badness. However, this, too, doesn't properly respect the spirit of aggregate consequentialism. The appeal of aggregate consequentialism is that it defines some measure of "goodness" of a universe, and then recommends you take actions to maximize it. But your causal impact is no measure of the goodness of the universe. The total amount of good and bad in the universe would be infinite no matter what finite impact you have. Without providing a metric of the goodness of the universe that's actually affected, this ethical approach also fails to satisfy the spirit of aggregate consequentialism.

My system avoids this problem by providing such a metric: the expected life satisfaction of an agent that has no idea what situation it will be born into.

Now I'll discuss another form of conflict. One proposed infinite ethical system can look at the average life satisfaction of a finite sphere of the universe, and then take the limit of this as the sphere's size approaches infinity, and consider this the moral value of the world. This has the problem that you can adjust the moral value of the world by just rearranging agents. In an infinite universe, it's possible to come up with a method of re-arranging agents so the unhappy agents are spread arbitrarily thinly. Thus, you can make moral value arbitrarily high by just rearranging agents in the right way.
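The rearrangement problem can be seen in a toy model. The sketch below is a simplification and not part of the proposal being criticized: it uses a one-dimensional line of positions instead of a sphere, gives every agent a satisfaction of exactly 0 or 1, and the function names are made up. Both orderings contain infinitely many happy and infinitely many unhappy agents, so one is a rearrangement of the other.

```python
import math


def alternating(position):
    """Happy (1.0) at even positions, unhappy (0.0) at odd positions."""
    return 1.0 if position % 2 == 0 else 0.0


def thinned(position):
    """The same two infinite groups, with unhappy agents placed only at perfect squares."""
    return 0.0 if math.isqrt(position) ** 2 == position else 1.0


def average_within(satisfaction, radius):
    """Average satisfaction over positions 0 .. radius - 1."""
    return sum(satisfaction(n) for n in range(radius)) / radius


for radius in (100, 10_000, 1_000_000):
    print(radius, average_within(alternating, radius), average_within(thinned, radius))
```

As the radius grows, the first average stays at 0.5 while the second approaches 1, even though no agent's satisfaction has changed.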

I'm not sure my system entirely avoids this problem, but it does seem to have substantial defense against it.

Suppose you have the option of redistributing agents however you want in the universe. You're using my ethical system to decide whether to make the unhappy agents spread thinly.

Your actions have an effect on agents in circumstances of the form, "An unhappy agent on an Earthlike world with someone <insert description of yourself> who is considering spreading the unhappy agents thinly throughout the universe". If you did redistribute the agents, that wouldn't make the expected life satisfaction of any agent satisfying the above description any better. So I don't think my ethical system recommends this.

Now, we don't have a complete understanding of how to assign a probability distribution of what circumstances an agent is in. It's possible that there is some way to redistribute agents in certain circumstances to change the moral value of the world. However, I don't know of any clear way to do this. Further, even if there is, my ethical system still doesn't allow you to get the moral value of the world arbitrarily high by just rearranging agents. This is because there will always be some non-zero probability of having ended up as an unhappy agent in the world you're in, and your life satisfaction after being redistributed in the universe would still be low.

# Distortions

It's not entirely clear to me how Bostrom distinguished between distortions and violations of the spirit of aggregate consequentialism.

To the best of my knowledge, the only distortion pointed out in "Infinite Ethics" is stated as follows:

> Your task is to allocate funding for basic research, and you have to choose between two applications from different groups of physicists. The Oxford Group wants to explore a theory that implies that the world is canonically infinite. The Cambridge Group wants to study a theory that implies that the world is finite. You believe that if you fund the exploration of a theory that turns out to be correct you will achieve more good than if you fund the exploration of a false theory. On the basis of all ordinary considerations, you judge the Oxford application to be slightly stronger. But you use infinity shades. You therefore set aside all possible worlds in which there are infinite values (the possibilities in which the Oxford Group tends to fare best), and decide to fund the Cambridge application. Is this right?

My approach doesn't ignore infinity and thus doesn't have this problem. I don't know of any other distortions in my ethical system.

## 60 comments

Comments sorted by top scores.

## comment by gjm · 2021-10-30T00:06:14.740Z · LW(p) · GW(p)

I think this system may have the following problem: It implicitly assumes that you can take a kind of random sample that in fact you can't.

You want to evaluate universes by "how would I feel about being in this universe?", which I think means either something like "suppose I were a randomly chosen subject-of-experiences in this universe, what would my expected utility be?" or "suppose I were inserted into a random place in this universe, what would my expected utility be?". (Where "utility" is shorthand for your notion of "life satisfaction", and you are welcome to insist that it be bounded.)

But in a universe with infinitely many -- countably infinitely many, presumably -- subjects-of-experiences, the first involves an action equivalent to *picking a random integer*. And in a universe of infinite size (and with a notion of space at least a bit like ours), the second involves an action equivalent to *picking a random real number*.

And there's no such thing as picking an integer, or a real number, uniformly at random.

This is essentially the same as the "infinitarian paralysis" problem. Consider two universes, each with a countable infinity of happy people and a countable infinity of unhappy people (and no other subjects of experience, somehow). In the first, all the people were generated with a biased coin-flip that picks "happy" 99.9% of the time. In the second, the same except that their coin picks "unhappy" 99.9% of the time. We'd like to be able to say that the first option is better than the second, but we can't, because *actually with probability 1 these two universes are equivalent* in the sense that with probability 1 they both have infinitely many happy and infinitely many unhappy people, and we can *simply rearrange them* to turn one of those universes into the other. Which is one way of looking at *why* there's no such operation as "pick a random integer", because if there were then surely picking a random person from universe 1 gets you a happy person with probability 0.999 and picking a random person from universe 2 gets you a happy person with probability 0.001.
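A more direct way to see that no uniform choice of integer exists is the standard countable-additivity argument. As a sketch, suppose every positive integer $n$ were assigned the same probability $c$. Then:

$$
\sum_{n=1}^{\infty} P(\{n\}) \;=\; \sum_{n=1}^{\infty} c \;=\;
\begin{cases}
0 & \text{if } c = 0,\\
\infty & \text{if } c > 0,
\end{cases}
$$

and neither case gives the total of 1 that a probability distribution requires. Any countably additive distribution over a countable infinity of people therefore has to weight some of them more heavily than others.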

When you have infinitely many things, you may find yourself unable to say meaningfully whether there's more positive or more negative there, and that *isn't dependent on adding up the positives and negatives and getting infinite amounts of goodness or badness*. You are entirely welcome to say that in our hypothetical universe there are no infinite utilities anywhere, that we shouldn't be trying to compute anything like "the total utility", and that's fine, but you *still* have the problem that e.g. you can't say "it's a bad thing to take 1000 of the happy people and make them unhappy" if what you mean by that is that it makes for a worse universe, because the modified universe is *isomorphic to the one you started with*.

## ↑ comment by JBlack · 2021-10-31T11:42:57.837Z · LW(p) · GW(p)

It's not a distribution over agents in the universe, it's a distribution over *possible agents in possible universes*. The possible universes can be given usual credence-based weightings based on conditional probability given the moral agent's observations and models, because what else are they going to base anything on?

If your actions make 1000 people unhappy, and presumably some margin "less satisfied" in some hypothetical post-mortem universe rating, the idea seems to be that you first estimate how much less satisfied they would be. Then the novel (to me) part of this idea is that you multiply this by the estimated fraction of *all agents, in all possible universes weighted by credence*, who would be in your position. Being a fraction, there is no unboundedness involved. The fraction may be extremely small, but should always be nonzero.

As I see it the exact fraction you estimate doesn't actually matter, because all of your options have the same multiplier and you're evaluating them relative to each other. However this multiplier is what gives ethical decisions nonzero effect even in an infinite universe, because there will only be finitely many *ethical scenarios* of any given complexity.

So it's not just "make 1000 happy people unhappy", it's "the 1 in N people with similar incentives as me in a similar situation would each make 1000 happy people unhappy", resulting in a net loss of 1000/N of universal satisfaction. N may be extremely large, but it's not infinite.

## ↑ comment by gjm · 2021-10-31T12:39:47.722Z · LW(p) · GW(p)

How is it a distribution over possible agents in possible universes (plural) when the idea is to give a way of assessing the merit of *one* possible universe?

I do agree that an ideal consequentialist deciding between actions should consider for each action the whole distribution of possible universes after they do it. But unless I'm badly misreading the OP, I don't see where it proposes anything like what you describe. It says -- emphasis in all cases mine, to clarify what bits I think indicate that a single universe is in question -- "... but you still knew you would be born into **this universe**", and "Imagine hypothetically telling an agent **everything** significant about **the universe**", and "a prior over **situations in the universe** you could be born into", and "my ethical system provides a **function mapping from possible worlds to their moral value**", and "maximize the expected value of your life satisfaction **given you are in this universe**", and "The appeal of aggregate consequentialism is that its defines some measure of "goodness" **of a universe**", and "the moral value of **the world**", and plenty more.

Even if somehow this is what OP meant, though -- or if OP decides to embrace it as an improvement -- I don't see that it helps at all with the problem I described; in typical cases I expect picking a random agent in a credence-weighted random universe-after-I-do-X to pose all the same difficulties as picking a random agent in a single universe-after-I-do-X. Am I missing some reason why the former would be easier?

## ↑ comment by Chantiel · 2021-11-01T00:02:53.112Z · LW(p) · GW(p)

(Assuming you've read my other response to this comment):

I think it might help if I give a more general explanation of how my moral system can be used to determine what to do. This is mostly taken from the article, but it's important enough that I think it should be restated.

Suppose you're considering taking some action that would benefit our world or future light cone. You want to see what my ethical system recommends.

Well, for almost all possible circumstances an agent could end up in within this universe, I think your action would have effectively no causal or acausal effect on them. There's nothing you can do about them, so don't worry about them in your moral deliberation.

Instead, consider agents of the form, "some agent in an Earth-like world (or in the future light-cone of one) with someone just like <insert detailed description of yourself and circumstances>". *These* are agents you can potentially (acausally) affect. If you take an action to make the world a better place, that means the other people in the universe who are very similar to you and in very similar circumstances would *also* take that action.

So if you take that action, then you'd improve the world, so the expected value of life satisfaction of an agent in the above circumstances would be higher. Such circumstances are of finite complexity and not ruled out by evidence, so the probability of an agent ending up in such a situation, conditioning only on being in this universe, is non-zero. Thus, taking that action would increase the moral value of the universe and my ethical system would thus be liable to recommend taking that action.

To see it another way, moral deliberation with my ethical system works as follows:

> I'm trying to make the universe a better place. Most agents are in situations in which I can't do anything to affect them, whether causally or acausally. But there are *some* agents in situations that I *can* (acausally) affect. So I'm going to focus on making the universe as satisfying as possible for those agents, using some impartial weighting over those possible circumstances.

## ↑ comment by gjm · 2021-11-01T02:28:11.004Z · LW(p) · GW(p)

Your comments are focusing on (so to speak) the decision-theoretic portion of your theory, the bit that would be different if you were using CDT or EDT rather than something FDT-like. That isn't the part I'm whingeing about :-). (There surely *are* difficulties in formalizing any sort of FDT, but they are not my concern; I don't think they have much to do with *infinite ethics* as such.)

My whingeing is about the part of your theory that seems specifically relevant to questions of infinite ethics, the part where you attempt to average over all experience-subjects. I think that one way or another this part runs into the usual average-of-things-that-don't-have-an-average sort of problem which afflicts other attempts at infinite ethics.

As I describe in another comment, the approach I think you're taking can *move where that problem arises* but not (so far as I can currently see) make it actually go away.

## ↑ comment by Chantiel · 2021-10-31T22:35:58.198Z · LW(p) · GW(p)

> How is it a distribution over possible agents in possible universes (plural) when the idea is to give a way of assessing the merit of one possible universe?

I do think JBlack understands the idea of my ethical system and is using it appropriately.

My system provides a method of evaluating the moral value of a specific universe. The point of moral agents is to try to make the universe one that scores highly on this moral valuation. But we don't know exactly what universe we're in, so to make decisions, we need to consider all universes we could be in, and then take the action that maximizes the expected moral value of the universe we're actually in.

For example, suppose I'm considering pressing a button that will either make everyone very slightly happier or make everyone extremely unhappy, depending on which universe I'm in. I don't actually know which universe I'm in, but I'm 60% sure I'm in the one in which the button would make everyone happier. Then if I press the button, there's a 40% chance that the universe would end up with very low moral value. That means pressing the button would, in expectation, decrease the moral value of the universe, so my moral system would recommend not pressing it.
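The button example can be checked with a few lines of arithmetic. This is a minimal sketch: only the 60%/40% credences come from the example, while the moral values (0.50 as a baseline, 0.51 for "very slightly happier", 0.05 for "extremely unhappy") are made-up numbers in [0, 1] chosen for illustration.

```python
def expected_moral_value(outcomes):
    """outcomes: list of (credence, moral value in [0, 1]) pairs whose credences sum to 1."""
    assert abs(sum(credence for credence, _ in outcomes) - 1.0) < 1e-9
    return sum(credence * value for credence, value in outcomes)


BASELINE = 0.50            # hypothetical moral value if the button is left alone
SLIGHTLY_HAPPIER = 0.51    # hypothetical value if the button makes everyone slightly happier
EXTREMELY_UNHAPPY = 0.05   # hypothetical value if the button makes everyone extremely unhappy

press = expected_moral_value([(0.6, SLIGHTLY_HAPPIER), (0.4, EXTREMELY_UNHAPPY)])
leave = expected_moral_value([(1.0, BASELINE)])

recommend_pressing = press > leave
print(press, leave, recommend_pressing)
```

With these illustrative numbers, pressing has a lower expected moral value than leaving the button alone, so the recommendation is not to press. The conclusion depends on the downside being much larger than the upside, not on the exact values.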

> Even if somehow this is what OP meant, though -- or if OP decides to embrace it as an improvement -- I don't see that it helps at all with the problem I described; in typical cases I expect picking a random agent in a credence-weighted random universe-after-I-do-X to pose all the same difficulties as picking a random agent in a single universe-after-I-do-X. Am I missing some reason why the former would be easier?

I think to some extent you may be over-thinking things. I agree that it's not completely clear how to compute P("I'm satisfied" | "I'm in this universe"). But to use my moral system, I don't need a perfect, rigorous solution to this, nor am I trying to propose one.

I think the ethical system provides reasonably straightforward moral recommendations in the situations we could actually be in. I'll give an example of such a situation that I hope is illuminating. It's paraphrased from the article.

Suppose you have the ability to create safe AI and are considering whether my moral system recommends doing so. And suppose if you create safe AI everyone in your world will be happy, and if you don't then the world will be destroyed by evil rogue AI.

Consider an agent that knows it will be in this universe, but nothing else. Now consider the circumstances, "I'm an agent in an Earth-like world that contains someone who is just like gjm and in a very similar situation who has the ability to create safe AI". The above description has finite description length, and the agent has no evidence ruling it out. So the agent must have *some* non-zero probability of ending up in such a situation, conditioning on being somewhere in this universe.

All the gjms have the same knowledge and values and are in pretty much the same circumstances. So their actions are logically constrained to be the same as yours. Thus, if you decide to create the AI, you are acausally determining the outcome of arbitrary agents in the above circumstances, by making such an agent end up satisfied when they otherwise wouldn't have been. Since an agent in this universe has non-zero probability of ending up in those circumstances, by choosing to make the safe AI you are increasing the moral value of the universe.

## ↑ comment by gjm · 2021-11-01T02:20:49.689Z · LW(p) · GW(p)

As I said to JBlack, so far as I can tell none of the problems I think I see with your proposal become any easier to solve if we switch from "evaluate one possible universe" to "evaluate all possible universes, weighted by credence".

> to use my moral system, I don't need a perfect, rigorous solution to this

Why not?

Of course you can *make moral decisions* without going through such calculations. We all do that all the time. But the whole issue with infinite ethics -- the thing that a purported system for handling infinite ethics needs to deal with -- is that the usual ways of formalizing moral decision processes produce ill-defined results in many imaginable infinite universes. So when you propose a system of infinite ethics and I say "look, it produces ill-defined results in many imaginable infinite universes", you don't get to just say "bah, who cares about the details?" If you don't deal with the details you aren't addressing the problems of infinite ethics at all!

It's nice that your system gives the expected result in a situation where the choices available are literally "make everyone in the world happy" and "destroy the world". (Though I have to confess I don't think I entirely understand your account of *how* your system actually produces that output.) We don't really need a *system* of ethics to get to that conclusion!

What I would want to know is how your system performs in more difficult cases.

We're concerned about infinitarian paralysis, where we somehow fail to deliver a definite answer because we're trying to balance an infinite amount of good against an infinite amount of bad. So far as I can see, your system still has this problem. E.g., if I know there are infinitely many people with various degrees of (un)happiness, and I am wondering whether to torture 1000 of them, your system is trying to calculate the average utility in an infinite population, and that simply isn't defined.

So, I *think* this is what you have in mind; my apologies if it was supposed to be obvious from the outset.

We are doing something like Solomonoff induction. The usual process there is that your prior says that your observations are generated by a computer program selected at random, using some sort of prefix-free code and generating a random program by generating a random bit-string. Then every observation updates your distribution over programs via Bayes, and once you've been observing for a while your predictions are made by looking at what all those programs would do, with probabilities given by your posterior. So far so good (aside from the fact that this is uncomputable).
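The updating process just described can be illustrated with a finite toy model. This is only a sketch and not real Solomonoff induction, which ranges over all programs and is uncomputable: the four "programs", their names, and their code lengths below are hypothetical, and each program deterministically outputs one bit per time step.

```python
from fractions import Fraction

# Hypothetical toy "programs": (name, code length in bits, function giving the bit at step t).
PROGRAMS = [
    ("all zeros", 2, lambda t: 0),
    ("all ones", 2, lambda t: 1),
    ("alternating", 3, lambda t: t % 2),
    ("ones from step 3", 5, lambda t: 1 if t >= 3 else 0),
]


def posterior(observations):
    """Prior weight 2^-length per program; keep programs consistent with the data, then renormalize."""
    weights = {}
    for name, length, output in PROGRAMS:
        consistent = all(output(t) == bit for t, bit in enumerate(observations))
        weights[name] = Fraction(1, 2**length) if consistent else Fraction(0)
    total = sum(weights.values())  # zero if no toy program fits the observations
    return {name: weight / total for name, weight in weights.items()}


def probability_next_is_one(observations):
    """Posterior-weighted probability that the next observed bit is 1."""
    post = posterior(observations)
    t = len(observations)
    return sum(post[name] for name, _, output in PROGRAMS if output(t) == 1)


print(posterior([0, 0, 0]))
print(probability_next_is_one([0, 0, 0]))
```

After observing three zeros, only "all zeros" and "ones from step 3" remain, and the shorter program carries most of the posterior weight. Because the toy programs are deterministic, each likelihood is 0 or 1, so the Bayesian update reduces to discarding inconsistent programs and renormalizing.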

But what you actually want (I think) isn't quite a probability distribution over universes; you want a distribution over experiences-in-universes, and not *your* experiences but those of hypothetical other beings in the same universe as you. So now think of the programs you're working with as describing not *your* experiences necessarily but those of *some being in the universe*, so that each update is weighted not by Pr(I have experience X | my experiences are generated by program P) but by Pr(some subject-of-experience has experience X | my experiences are generated by program P), with the constraint that it's meant to be the *same* subject-of-experience for each update. Or maybe by Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P) with the same constraint.

So now after all your updates what you have is a probability distribution over *generators of experience-streams for subjects in your universe*.

When you consider a possible action, you want to condition on that in some suitable fashion, and exactly how you do that will depend on what sort of decision theory you're using; I shall assume all the details of that handwaved away, though again I think they may be rather difficult. So now you have a revised probability distribution over experience-generating programs.

And now, if everything up to this point has worked, you can compute (well, you can't because everything here is uncomputable, but never mind) an expected utility because each of our programs yields a being's stream of experiences, and modulo some handwaving you can convert that into a utility, and you have a perfectly good probability distribution over the programs.

And (I think) I agree that here if we consider either "torture 1000 people" or "don't torture 1000 people" it is reasonable to expect that the latter will genuinely come out with a higher expected utility.

OK, so in this picture of things, what happens to my objections? They apply now to the process by which you are supposedly doing your Bayesian updates on experience. Because (I think) now you are doing one of two things, neither of which need make sense in a world with infinitely many beings in it.

- If you take the "Pr(some subject-of-experience has experience X)" branch: here the problem is that in a universe with infinitely many beings, these probabilities are likely *all 1* and therefore you never actually learn anything when you do your updating.
- If you take the "Pr(a randomly chosen subject-of-experience has experience X)" branch: here the problem is that there's no such thing as a randomly chosen subject-of-experience. (More precisely, there are any number of ways to choose one at random, and I see no grounds for preferring one over another, and in particular neither a *uniform* nor a *maximum entropy* distribution exists.)

The latter is basically the same problem as I've been complaining about before (well, it's sort of dual to it, because now we're looking at things from the perspective of some possibly-other experiencer in the universe, and *you* are the randomly chosen one). The former is a different problem but seems just as difficult to deal with.

## ↑ comment by Chantiel · 2021-11-02T03:32:33.565Z · LW(p) · GW(p)

Of course you can make moral decisions without going through such calculations. We all do that all the time. But the whole issue with infinite ethics -- the thing that a purported system for handling infinite ethics needs to deal with -- is that the usual ways of formalizing moral decision processes produce ill-defined results in many imaginable infinite universes. So when you propose a system of infinite ethics and I say "look, it produces ill-defined results in many imaginable infinite universes", you don't get to just say "bah, who cares about the details?" If you don't deal with the details you aren't addressing the problems of infinite ethics at all!

Well, I can't say I exactly disagree with you here.

However, I want to note that this isn't a problem specific to my ethical system. It's true that in order to use my ethical system to make precise moral verdicts, you need to more fully formalize probability theory. However, the same is *also* true of effectively every other ethical theory.

For example, consider someone learning about classical utilitarianism and its applications in a finite world. Then they could argue:

Okay, I see your ethical system says to make the balance of happiness to unhappiness as high as possible. But how am I supposed to know what the world is actually like and what the effects of my actions are? Do other animals feel happiness and unhappiness? Is there actually a heaven and Hell that would influence moral choices? This ethical system doesn't answer any of this. You can't just handwave this away! If you don't deal with the details you aren't addressing the problems of ethics at all!

Also, I just want to note that my system as described seems to be unique among the infinite ethical systems I've seen in that it doesn't make obviously ridiculous moral verdicts. Every other one I know of makes some recommendations that seem really silly. So, despite not providing a rigorous formalization of probability theory, I think my ethical system has value.

But what you actually want (I think) isn't quite a probability distribution over universes; you want a distribution over experiences-in-universes, and not your experiences but those of hypothetical other beings in the same universe as you. So now think of the programs you're working with as describing not your experiences necessarily but those of some being in the universe, so that each update is weighted not by Pr(I have experience X | my experiences are generated by program P) but by Pr(some subject-of-experience has experience X | my experiences are generated by program P), with the constraint that it's meant to be the same subject-of-experience for each update. Or maybe by Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P) with the same constraint.

Actually, no, I really do want a probability distribution over what I would experience, or more generally, the situations I'd end up being in. The alternatives you mentioned, Pr(some subject-of-experience has experience X | my experiences are generated by program P) and Pr(a randomly chosen subject-of-experience has experience X | my experiences are generated by program P), both lead to problems for the reasons you've already described.

I'm not sure what made you think I didn't mean P(I have experience x | ...). Could you explain?

We're concerned about infinitarian paralysis, where we somehow fail to deliver a definite answer because we're trying to balance an infinite amount of good against an infinite amount of bad. So far as I can see, your system still has this problem. E.g., if I know there are infinitely many people with various degrees of (un)happiness, and I am wondering whether to torture 1000 of them, your system is trying to calculate the average utility in an infinite population, and that simply isn't defined.

My system doesn't compute the average utility of anything. Instead, it tries to compute the expected value of utility (or life satisfaction). I'm sorry if this was somehow unclear. I didn't think I ever mentioned I was dealing with averages anywhere, though. I'm trying to get better at writing clearly, so if you remember what made you think this, I'd appreciate hearing.

Replies from: gjm

## ↑ comment by gjm · 2021-11-03T20:58:35.153Z · LW(p) · GW(p)

I'll begin at the end: What is "the expected value of utility" if it isn't an average of utilities?

You originally wrote:

suppose you had no idea which agent in the universe it would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe. Consider having a bounded quantitative measure of your general satisfaction with life, for example, a utility function. Then try to make the universe such that the expected value of your life satisfaction is as high as possible if you conditioned on you being an agent in this universe, but didn't condition on anything else.

What is "the expected value of your life satisfaction [] conditioned on you being an agent in this universe but [not] on anything else" if it is not the average of the life satisfactions (utilities) over the agents in this universe?

(The slightly complicated business with conditional probabilities that apparently weren't what you had in mind were my attempt at figuring out what else you might mean. Rather than trying to figure it out, I'm just asking you.)

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-04T21:03:14.552Z · LW(p) · GW(p)

I'll begin at the end: What is "the expected value of utility" if it isn't an average of utilities?

I'm just using the regular notion of expected value. That is, let P(u) be the probability density you get utility u. Then, the expected value of utility is $\int_0^1 u \, P(u) \, du$, where $\int$ uses Lebesgue integration for greater generality. Above, I take utility to be in $[0, 1]$.

Also note that my system cares about a measure of satisfaction, rather than specifically utility. In this case, just replace P(u) to be that measure of life satisfaction instead of a utility.

Also, of course, P(u) is calculated conditioning on being an agent in this universe, and nothing else.

And how do you calculate P(u) given the above? Well, one way is to first start with some disjoint prior probability distribution over universes and situations you could be in, where the situations are concrete enough to determine your eventual life satisfaction. Then just do a Bayes update on "is an agent in this universe and gets utility u" by setting to zero the probabilities of hypotheses in which the agent isn't in this universe or doesn't have preferences. Then just renormalize the probabilities so they sum to 1. After that, you can just use this probability distribution over possible worlds W to calculate P(u) in a straightforward manner. E.g. $P(u) = \sum_{w \in W} P(w) \, P(u \mid w)$.
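
A minimal sketch of that calculation, assuming a small finite prior so everything can be enumerated. The hypotheses, their prior probabilities, and their utilities in `PRIOR` are invented for illustration, as are the function names.

```python
from fractions import Fraction

# Hypothetical prior over disjoint (world, situation) hypotheses. Each entry:
# (prior probability, is an agent in this universe with preferences?,
#  utility in [0, 1] that the situation leads to)
PRIOR = [
    (Fraction(1, 2), True, Fraction(9, 10)),
    (Fraction(1, 4), True, Fraction(1, 10)),
    (Fraction(1, 4), False, Fraction(0)),  # not an agent in this universe
]

def condition_on_being_agent(prior):
    """Zero out hypotheses that fail the condition, then renormalize."""
    kept = [(p if is_agent else Fraction(0), u) for p, is_agent, u in prior]
    total = sum(p for p, _ in kept)
    return [(p / total, u) for p, u in kept]

def expected_satisfaction(prior):
    """Expected utility under the conditioned distribution."""
    return sum(p * u for p, u in condition_on_being_agent(prior))

print(expected_satisfaction(PRIOR))
```

The hard part, choosing the prior over situations in the first place, is exactly what this sketch takes as given.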

(I know I pretty much mentioned the above calculation before, but I thought rephrasing it might help.)

Replies from: gjm

## ↑ comment by gjm · 2021-11-05T00:08:09.892Z · LW(p) · GW(p)

If you are just using the regular notion of expected value then it *is* an average of utilities. (Weighted by probabilities.)

I understand that your measure of satisfaction need not be a utility as such, but "utility" is shorter than "measure of satisfaction which may or may not strictly speaking be utility".

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-05T12:36:01.011Z · LW(p) · GW(p)

Oh, I'm sorry; I misunderstood you. When you said the average of utilities, I thought you meant the utility averaged among all the different agents in the world. Instead, it's just, roughly, an average of utility weighted by its probability density function. I say roughly because I guess integration isn't exactly an average.

## ↑ comment by Chantiel · 2021-10-30T22:01:40.718Z · LW(p) · GW(p)

I think this system may have the following problem: It implicitly assumes that you can take a kind of random sample that in fact you can't.

You want to evaluate universes by "how would I feel about being in this universe?", which I think means either something like "suppose I were a randomly chosen subject-of-experiences in this universe, what would my expected utility be?" or "suppose I were inserted into a random place in this universe, what would my expected utility be?". (Where "utility" is shorthand for your notion of "life satisfaction", and you are welcome to insist that it be bounded.)

But in a universe with infinitely many -- countably infinitely many, presumably -- subjects-of-experiences, the first involves an action equivalent to picking a random integer. And in a universe of infinite size (and with a notion of space at least a bit like ours), the second involves an action equivalent to picking a random real number.

And there's no such thing as picking an integer, or a real number, uniformly at random.

Thank you for the response.

You are correct that there's no way to form a uniform distribution over the set of all integers or real numbers. And, similarly, you are also correct that there is no way of sampling from infinitely many agents uniformly at random.

Luckily, my system doesn't require you to do any of these things.

Don't think about my system as requiring you to pick out a specific random agent in the universe (because you can't). It doesn't try to come up with the probability of you being some single specific agent.

Instead, it picks out some description of circumstances an agent could be in as well as a description of the agent itself. And this, you can do. I don't think anyone's completely formalized a way to compute prior probabilities over situations they could end up in. But the basic idea is to, over different circumstances, each of finite description length, take some complexity-weighted or perhaps uniform distribution.

I'm not entirely sure how to form a probability distribution that includes situations of infinite complexity. But it doesn't seem like you really need to, because, in our universe at least, you can only be affected by a finite region. But I've thought about how to deal with infinite description lengths, too, and I can discuss it if you're interested.

I'll apply my moral system to the coin flip example. To make it more concrete, suppose there's some AI that uses a pseudorandom number generator that outputs "heads" or "tails", and then the AI, having precise control of the environment, makes the actual coin land on heads iff the pseudorandom number generator outputted "heads". And it does so for each agent and makes them happy if it lands on heads and unhappy if it lands on tails.

Let's consider the situation in which the pseudorandom number generator says "heads" 99.9% of the time. Well, pseudorandom number generators tend to work by having some (finite) internal seed, then using that seed to pick out a random number in, say, [0, 1]. Then, for the next number, it updates its (still finite) internal state from the initial seed in a very chaotic manner, and then again generates a new number in [0, 1]. And my understanding is that the internal state tends to be uniform in the sense that on average each internal state is just as common as each other internal state. I'll assume this in the following.

If the generator says "heads" 99.9% of the time, then that means that, among the different internal states, 99.9% of them result in the answer being "heads" and 0.1% result in the answer being "tails".

Suppose you know you're in this universe, but nothing else. Well, you know you will be in a circumstance in which there is some AI that uses a pseudorandom number generator to determine your life satisfaction, because that's how it is for everyone in the universe. However, you have no way of knowing the specifics of the internal state of the pseudorandom number generator.

So, to compute the probability of life satisfaction, just take some very high-entropy probability distribution over the internal states, for example, a uniform distribution. So, 99.9% of the internal states would result in you being happy, and only 0.1% result in you being unhappy. So, using a very high-entropy distribution of internal states would result in you assigning probability of approximately 99.9% to you ending up happy.

Similarly, suppose instead that the generator generates heads only 0.1% of the time. Then only 0.1% of internal states of the pseudorandom number generator would result in it outputting "heads". Thus, if you use a high-entropy probability distribution over the internal state, you would assign a probability of approximately 0.1% to you being happy.

Thus, if I'm reasoning correctly, the probability of you being satisfied, conditioning only on you being in the 99.9%-heads universe, is approximately 99.9%, and the probability of being satisfied in the 0.1%-heads universe is approximately 0.1%. Thus, the former universe would be seen as having more moral value than the latter universe according to my ethical system.
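
The reasoning above can be sketched as follows, assuming a generator with a small finite set of internal states and a made-up rule mapping states to outcomes. `N_STATES`, `outcome`, and `prob_happy` are hypothetical names, and real pseudorandom number generators are of course more complicated than this.

```python
from fractions import Fraction

N_STATES = 1000  # hypothetical finite number of internal PRNG states

def outcome(state: int, heads_states: int) -> str:
    """Toy rule: the first `heads_states` internal states produce 'heads'."""
    return "heads" if state < heads_states else "tails"

def prob_happy(heads_states: int) -> Fraction:
    """Probability of 'heads' (happy) under a uniform distribution over states."""
    happy = sum(outcome(s, heads_states) == "heads" for s in range(N_STATES))
    return Fraction(happy, N_STATES)

print(prob_happy(999))  # 999/1000, the 99.9%-heads universe
print(prob_happy(1))    # 1/1000, the 0.1%-heads universe
```

Note that the uniform distribution here is over a finite state space, which is what makes it well defined.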

And I hope what I'm saying isn't too controversial. I mean, in order to reason, there must be *some* way to assign a probability distribution over situations you end up in, even if you don't yet have any idea what concrete situation you'll be in. I mean, suppose you actually learned you were in the 99.9%-heads universe, but knew nothing else. Then it really shouldn't seem unreasonable that you assign 99.9% probability to ending up happy. I mean, what else would you think?

Does this clear things up?

Replies from: gjm

## ↑ comment by gjm · 2021-10-30T23:12:25.702Z · LW(p) · GW(p)

I don't think I understand *why* your system doesn't require something along the lines of choosing a uniformly-random agent or place. Not necessarily exactly either of those things, but something of that kind. You said, in OP:

suppose you had no idea which agent in the universe it would be, what circumstances you would be in, or what your values would be, but you still knew you would be born into this universe.

How does that cash out if *not* in terms of picking a random agent, or random circumstances in the universe?

If I understand your comment correctly, you want to deal with that by picking a random *description* of a situation in the universe, which is just a random bit-string with some constraints on it, which you presumably do in something like the same way as choosing a random program when doing Solomonoff induction: cook up a prefix-free language for describing situations-in-the-universe, generate a random bit-string with each bit equally likely 0 or 1, and see what situation it describes.

But now everything depends on the details of how descriptions map to actual situations, and I don't see any canonical way to do that or any anything-like-canonical way to do it. (Compare the analogous issue with Solomonoff induction. There, everything depends on the underlying machine, but one can argue at-least-kinda-plausibly that if we consider "reasonable" candidates, the differences between them will quickly be swamped by all the actual evidence we get. I don't see anything like that happening here.) What am I missing?

Your example with an AI generating people with a PRNG is, so far as it goes, fine. But the epistemic situation one needs to be in for that example to be relevant seems to me *incredibly* different from any epistemic situation anyone is ever really in. If our universe is running on a computer, we don't know what computer or what program or what inputs produced it. We can't do anything remotely like putting a uniform distribution on the internal states of the machine.

Further, your AI/PRNG example is importantly different from the infinitely-many-random-people example on which it's based. You're supposing that your AI's PRNG has an internal state you can sample from uniformly at random! But that's exactly the thing we *can't* do in the randomly-generated-people example.

Further further, your prescription in this case is very much *not* the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here *doesn't* actually produce an inhabitant of the universe in question. You're replacing a difficult (indeed impossible) question -- "how do things go, on average, for a random person in this universe?" -- with an easier *but different* question -- "how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?". Maybe that's a reasonable thing to do, but it is *not what your theory as originally stated tells you to do* and I don't see any obvious reason why someone who accepted your theory as you originally stated it should behave as you're now telling them they should.

Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there's no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, ..., then n! (un)happy people, then ... . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can't do. Picking a state at random seems like it means picking a random positive integer, which again you can't do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn't actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it's less than. How does your system evaluate this universe?
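
To make the difficulty concrete, here is a small sketch of the counter universe, assuming block k contains k! people who are happy when k is even and unhappy when k is odd (so 1 happy, 1 unhappy, 2 happy, 6 unhappy, ...). It tracks the fraction of happy people among everyone created so far; the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def happy_fraction_after_blocks(n_blocks: int) -> Fraction:
    """Fraction of happy people after blocks 0 .. n_blocks - 1 have been made."""
    happy = sum(factorial(k) for k in range(n_blocks) if k % 2 == 0)
    total = sum(factorial(k) for k in range(n_blocks))
    return Fraction(happy, total)

for n in range(1, 13):
    print(n, float(happy_fraction_after_blocks(n)))
```

Because each new block is larger than everything before it combined (once k is at least 3), the running fraction swings towards 1 after every happy block and towards 0 after every unhappy block, so it never settles on a limit.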

Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 *the same universe* up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing *the same* infinitely many people, just arranged differently, one of which is much better than the other. Really?

(Of course, in something that's less of a toy model, the arrangement of people can matter a lot. It's nice to be near to friends and far from enemies, for instance. But of course that isn't what we're talking about here; when we rearrange the people we do so in a way that preserves all their experiences and their level of happiness.)

It really *should* seem unreasonable to suppose that in the 99.9% universe there's a 99.9% chance that you'll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn't fully grasped the fact that you can't sample uniformly at random from an infinite population.
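
The rearrangement point can be illustrated with a sketch. Assume a population consisting of happy people H0, H1, ... and unhappy people U0, U1, ..., infinitely many of each. The generator below (a hypothetical helper) lists that same population in two different orders, and the fraction of happy people among the first n listed depends entirely on the order chosen.

```python
from itertools import islice

def arrangement(happy_per_cycle: int, unhappy_per_cycle: int):
    """Enumerate H0, H1, ... and U0, U1, ... in repeating cycles.
    Every person appears exactly once, whatever the cycle sizes are."""
    h = u = 0
    while True:
        for _ in range(happy_per_cycle):
            yield ("H", h)
            h += 1
        for _ in range(unhappy_per_cycle):
            yield ("U", u)
            u += 1

def running_happy_fraction(people, n: int) -> float:
    """Fraction of happy people among the first n in the given enumeration."""
    first_n = list(islice(people, n))
    return sum(kind == "H" for kind, _ in first_n) / n

print(running_happy_fraction(arrangement(999, 1), 100_000))  # 0.999
print(running_happy_fraction(arrangement(1, 999), 100_000))  # 0.001
```

Both enumerations cover exactly the same set of people; only the order differs.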

## ↑ comment by Chantiel · 2021-10-31T21:35:17.933Z · LW(p) · GW(p)

How does that cash out if not in terms of picking a random agent, or random circumstances in the universe?

So, remember, the moral value of the universe according to my ethical system depends on P(I'll be satisfied | I'm some creature in this universe).

There must be *some* reasonable way to calculate this. And one that doesn't rely on impossibly taking a uniform sample from a set that has none. Now, we haven't fully formalized reasoning and priors yet. But there is *some* reasonable prior probability distribution over situations you could end up in. And after that you can just do a Bayesian update on the evidence "I'm in universe x".

I mean, imagine you had some superintelligent AI that takes evidence and outputs probability distributions. And you provide the AI with evidence about what the universe it's in is like, without letting it know anything about the specific circumstances it will end up in. There must be *some* reasonable probability for the AI to assign to outcomes. If there isn't, then that means whatever probabilistic reasoning system the AI uses must be incomplete.

It really should seem unreasonable to suppose that in the 99.9% universe there's a 99.9% chance that you'll end up happy! Because the 99.9% universe is also the 0.1% universe, just looked at differently. If your intuition says we should prefer one to the other, your intuition hasn't fully grasped the fact that you can't sample uniformly at random from an infinite population.

I'm surprised you said this and interested in why. Could you explain what probability you would assign to being happy in that universe?

I mean, conditioning on being in that universe, I'm really not sure what else I would do. I know that I'll end up with my happiness determined by some AI with a pseudorandom number generator. And I have no idea what the internal state of the random number generator will be. In Bayesian probability theory, the standard way to deal with this is to take a maximum entropy (i.e. uniform in this case) distribution over the possible states. And such a distribution would imply that I'd be happy with probability 99.9%. So that's how I would reason about my probability of happiness using conventional probability theory.

Further further further, let me propose another hypothetical scenario in which an AI generates random people. This time, there's no PRNG, it just has a counter, counting up from 1. And what it does is to make 1 happy person, then 1 unhappy person, then 2 happy people, then 6 unhappy people, then 24 happy people, then 120 unhappy people, ..., then n! (un)happy people, then ... . How do you propose to evaluate the typical happiness of a person in this universe? Your original proposal (it still seems to me) is to pick one of these people at random, which you can't do. Picking a state at random seems like it means picking a random positive integer, which again you can't do. If you suppose that the state is held in some infinitely-wide binary thing, you can choose all its bits at random, but then with probability 1 that doesn't actually give you a finite integer value and there is no meaningful way to tell which is the first 0!+1!+...+n! value it's less than. How does your system evaluate this universe?

I'm not entirely sure how my system would evaluate this universe, but that's due to my own uncertainty about what specific prior to use and its implications.

But I'll take a stab at it. I see the counter alternates through periods of making happy people and periods of making unhappy people. I have no idea which period I'd end up being in, so I think I'd use the principle of indifference to assign probability 0.5 to both. If I'm in the happy period, then I'd end up happy, and if I'm in the unhappy period, I'd end up unhappy. So I'd assign probability approximately 0.5 to ending up happy.

Further further, your prescription in this case is very much not the same as the general prescription you stated earlier. You said that we should consider the possible lives of agents in the universe. But (at least if our AI is producing a genuinely infinite amount of pseudorandomness) its state space is of infinite size, there are uncountably many states it can be in, but (ex hypothesi) it only ever actually generates countably many people. So with probability 1 the procedure you describe here doesn't actually produce an inhabitant of the universe in question. You're replacing a difficult (indeed impossible) question -- "how do things go, on average, for a random person in this universe?" -- with an easier but different question -- "how do things go, on average, for a random person from this much larger uncountable population that I hope resembles the population of this universe?". Maybe that's a reasonable thing to do, but it is not what your theory as originally stated tells you to do and I don't see any obvious reason why someone who accepted your theory as you originally stated it should behave as you're now telling them they should.

Oh, I had in mind that the internal state of the pseudorandom number generator was finite, and that each pseudorandom number generator was only used finitely-many times. For example, maybe each AI on its world had its own pseudorandom number generator.

And I don't see how else I could interpret this. I mean, if the pseudorandom number generator is used infinitely-many times, then it couldn't have outputted "happy" 99.9% of the time and "unhappy" 0.1% of the time. With infinitely-many outputs, it would output "happy" infinitely-many times and output "unhappy" infinitely-many times, and thus the proportion it outputs "happy" or "unhappy" would be undefined.

Returning to my original example, let me repeat a key point: Those two universes, generated by biased coin-flips, are with probability 1 the same universe up to a mere rearrangement of the people in them. If your system tells us we should strongly prefer one to another, it is telling us that there can be two universes, each containing the same infinitely many people, just arranged differently, one of which is much better than the other. Really?

Yep. And I don't think there's any way around this. When talking about infinite ethics, we've had in mind a canonically infinite universe: one that, for every level of happiness, suffering, satisfaction, and dissatisfaction, contains infinitely many agents with that level. It looks like this is the sort of universe we're stuck in.

So then there's *no difference* in terms of moral value of two canonically-infinite universes *except* the patterning of value. So if you want to compare the moral value of two canonically-infinite universes, there's just nothing you can do except to consider the patterning of values. That is, unless you want to consider any two canonically-infinite universes to be of equivalent moral value, which doesn't seem like an intuitively desirable idea.

The problem with some of the other infinite ethical systems I've seen is that they would morally recommend redistributing unhappy agents extremely thinly in the universe, rather than actually try to make them happy, provided this was easier. As discussed in my article, my ethical system provides some degree of defense against this, which seems to me like a very important benefit.

Replies from: gjm

## ↑ comment by gjm · 2021-11-01T01:19:12.978Z · LW(p) · GW(p)

You say

There must be *some* reasonable way to calculate this.

(where "this" is Pr(I'm satisfied | I'm some being in such-and-such a universe)) Why must there be? I agree that it would be *nice* if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.

Does whatever argument or intuition leads you to say that there must be a reasonable way to calculate Pr(X is satisfied | X is a being in universe U) also tell you that there must be a reasonable way to calculate Pr(X is even | X is a positive integer)? How about Pr(the smallest n with x <= n! is even | x is a positive integer)?

I should maybe be more explicit about my position here. Of course there are *ways* to give a meaning to such expressions. For instance, we can suppose that the integer n occurs with probability 2^-n, and then e.g. if I've done my calculations right then the second probability is the sum of 2^-0! + (2^-1!-2^-2!) + (2^-3!-2^-4!) + ... which presumably doesn't have a nice closed form (it's transcendental for sure) but can be calculated to high precision very easily. But that doesn't mean that there's any such thing as *the way* to give meaning to such an expression. We could use some other sequence of weights adding up to 1 instead of the powers of 1/2, for instance, and we would get a substantially different answer. And if the objects of interest to us were *beings in universe U* rather than *positive integers*, they wouldn't come equipped with a standard order to look at them in.
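
As a rough numerical check of how much the choice of weights matters, the sketch below evaluates Pr(the smallest n with x <= n! is even) by direct enumeration under two different weightings of the positive integers. It assumes n ranges over 0, 1, 2, ... and truncates the sum at a cap beyond which the remaining weight is negligible. The names `halves` and `slower` and the second weighting are illustrative choices, not anything canonical.

```python
from math import factorial

def smallest_n(x: int) -> int:
    """Smallest n >= 0 with x <= n!."""
    n = 0
    while factorial(n) < x:
        n += 1
    return n

def prob_even(weight, cap: int = 2000) -> float:
    """Total weight of the integers 1..cap whose smallest_n is even."""
    return sum(weight(x) for x in range(1, cap + 1) if smallest_n(x) % 2 == 0)

halves = lambda x: 0.5 ** x                        # weights 1/2, 1/4, 1/8, ...
slower = lambda x: (1 / 10) * (9 / 10) ** (x - 1)  # another sequence summing to 1

print(prob_even(halves))
print(prob_even(slower))
```

The two weightings give visibly different answers to what looks like the same question.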

Why should we expect there to be a well-defined answer to the question "what fraction of these beings are satisfied"?

Could you explain what probability you would assign to being happy in that universe?

No, because I do not assign any probability to being happy in that universe. I don't know a good way to assign such probabilities and strongly suspect that there is none.

You suggest doing maximum entropy on the states of the pseudorandom number generator being used by the AI making this universe. But when I was describing that universe I said nothing about AIs and nothing about pseudorandom number generators. If I am contemplating being in such a universe, then I don't know how the universe is being generated, and I *certainly* don't know the details of any pseudorandom number generator that might be being used.

Suppose there is a PRNG, but an infinite one somehow, and suppose its state is a positive integer (of arbitrary size). (Of course this means that the universe is not being generated by a computing device of finite capabilities. Perhaps you want to exclude such possibilities from consideration, but if so then you might equally well want to exclude infinite universes from consideration: a finite machine can't e.g. generate a complete description of what happens in an infinite universe. If you're bothering to consider infinite universes at all, I think you should also be considering universes that aren't generated by finite computational processes.)

Well, in this case there is no uniform prior over the states of the PRNG. OK, you say, let's take the maximum-entropy prior instead. That would mean (p_k) minimizing sum p_k log p_k subject to the sum of p_k being 1. Unfortunately there is no such (p_k). If we take p_k = 1/n for k=1..n and 0 for larger k, the sum is log 1/n which -> -oo as n -> oo. In other words, we can make the entropy of (p_k) as large as we please.
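
A quick numerical illustration of that last step: the entropy of the uniform distribution on 1..n equals log n, which grows without bound, so no distribution on the positive integers attains a maximum. The function name is hypothetical.

```python
import math

def entropy_uniform(n: int) -> float:
    """Shannon entropy -sum p_k log p_k for p_k = 1/n on k = 1..n."""
    p = 1.0 / n
    return -sum(p * math.log(p) for _ in range(n))

# The entropy tracks log(n) and keeps growing as n increases.
for n in (10, 1000, 100_000):
    print(n, entropy_uniform(n), math.log(n))
```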

You might suppose (arbitrarily, it seems to me) that the integer that's the state of our PRNG is held in an infinite sequence of bits, and choose each bit at random. But then with probability 1 you get an impossible state of the RNG, and for all we know the AI's program might look like "if PRNG state is a finite positive integer, use it to generate a number between 0 and 1 and make our being happy if that number is <= 0.999; if PRNG state isn't a finite positive integer, put our being in hell".

I mean, if the pseudorandom number generator is used infinitely many times, then [...] it would output "happy" infinitely many times and output "unhappy" infinitely many times, and thus the proportion it outputs "happy" or "unhappy" would be undefined.

Yes, exactly! When I described this hypothetical world, I didn't say "the probability that a being in it is happy is 99.9%". I said "a biased coin-flip determines the happiness of each being in it, choosing 'happy' with probability 99.9%". Or words to that effect. This is, so far as I can see, a perfectly coherent (albeit partial!) specification of a possible world. And it does indeed have the property that "the probability that a being in it is happy" is not well defined.

This doesn't mean the scenario is improper somehow. It means that any ethical (or other) system that depends on evaluating such probabilities will fail when presented with such a universe. Or, for that matter, pretty much any universe with infinitely many beings in it.

there's just nothing you can do except to consider the patterning of values.

But then I don't see that you've explained *how* your system considers the patterning of values. In the OP you just talk about the probability that a being in such-and-such a universe is satisfied; and that probability is *typically not defined*. Here in the comments you've been proposing something involving knowing the PRNG used by the AI that generated the universe, and sampling randomly from the outputs of that PRNG; but (1) this implies being in an epistemic situation completely unlike any that any real agent is ever in, (2) nothing like this can work (so far as I can see) unless you know that the universe you're considering is being generated by some finite computational process, and if you're going to assume that you might as well assume a finite universe to begin with and avoid having to deal with infinite ethics at all, (3) I don't understand how your "look at the AI's PRNG" proposal generalizes to non-toy questions, and (4) even if (1-3) are resolved somehow, it seems like it requires a literally infinite amount of computation to evaluate any given universe. (Which is especially problematic when we are assuming we are in a universe generated by a finite computational process.)

## ↑ comment by Chantiel · 2021-11-02T03:28:59.531Z · LW(p) · GW(p)

You say, "There must be some reasonable way to calculate this."

(where "this" is Pr(I'm satisfied | I'm some being in such-and-such a universe)) Why must there be? I agree that it would be nice if there were, of course, but there is no guarantee that what we find nice matches how the world actually is.

To use probability theory to form accurate beliefs, we need a prior. I didn't think this was controversial. And if you have a prior, as far as I can tell, you can then compute Pr(I'm satisfied | I'm some being in such-and-such a universe) by simply updating on "I'm some being in such-and-such a universe" using Bayes' theorem.

That is, you need to have some prior probability distribution over concrete specifications of the universe you're in and your situation in it. Now, to update on "I'm some being in such-and-such a universe", just look at each concrete possible situation-and-universe and set P("I'm some being in such-and-such a universe" | some concrete hypothesis) to 0 if the hypothesis specifies you're in some universe other than the such-and-such universe, and to 1 if it does specify you are in such a universe. As long as the possible universes are specified sufficiently precisely, I don't see why you couldn't do this.
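Written out, the update described above is ordinary conditioning with a 0/1 likelihood. As a sketch, write $h$ for a concrete hypothesis (a universe plus a situation in it), $U$ for "I'm some being in such-and-such a universe", and $S$ for "I'm satisfied". These symbols are introduced here for illustration and do not appear in the original comment.

```latex
P(h \mid U) = \frac{P(U \mid h)\, P(h)}{\sum_{h'} P(U \mid h')\, P(h')},
\qquad P(U \mid h) \in \{0, 1\}

P(S \mid U) = \sum_{h} P(S \mid h)\, P(h \mid U)
```

This presupposes that the prior gives nonzero total probability to the hypotheses consistent with $U$; otherwise the denominator vanishes and the conditional is undefined.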

## ↑ comment by gjm · 2021-11-05T00:05:28.593Z · LW(p) · GW(p)

OK, so I think I now understand your proposal better than I did.

So if I'm contemplating making the world be a particular way, you then propose that I should do the following calculation (as always, of course I can't do it because it's uncomputable, but never mind that):

- Consider all possible computable experience-streams that a subject-of-experiences could have.
- Consider them, specifically, as being generated by programs drawn from a universal distribution.
- Condition on being in the world that's the particular way I'm contemplating making it -- that is, discard experience-streams that are literally inconsistent with being in that world.
- We now have a probability distribution over experience-streams. Compute a utility for each, and take its expectation.

And now we compare possible universes by comparing *this* expected utility.
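The four steps above can be sketched on a finite toy. The real calculation is uncomputable, so everything here is an assumption made for illustration: the `Stream` class, the hand-assigned program lengths, the `2**-length` weighting standing in for a universal distribution, and the world labels "A" and "B".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stream:
    """A toy experience-stream (hypothetical stand-in for a program's output)."""
    name: str
    program_length: int   # bits in the generating program (assumed known here)
    utility: float        # bounded utility of the stream
    worlds: frozenset     # worlds this stream is consistent with

def expected_utility(streams, world):
    """Condition a 2**-length prior on consistency with `world`, then average utility."""
    consistent = [s for s in streams if world in s.worlds]
    weights = [2.0 ** -s.program_length for s in consistent]
    total = sum(weights)
    if total == 0:
        raise ValueError("no stream is consistent with this world")
    return sum(w * s.utility for w, s in zip(weights, consistent)) / total

streams = [
    Stream("pleasant", 3, 1.0, frozenset({"A", "B"})),
    Stream("dull", 4, 0.0, frozenset({"A"})),
    Stream("unpleasant", 4, -1.0, frozenset({"B"})),
]

# Compare the two candidate worlds by their conditional expected utility.
print(expected_utility(streams, "A"))
print(expected_utility(streams, "B"))
```

In this toy, world "A" scores higher than world "B" only because conditioning discards a different stream in each case, which is the feature the scenarios below probe.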

(Having failed to understand your proposal correctly before, I am not super-confident that I've got it right now. But let's suppose I have and run with it. You can correct me if not. In that case, some or all of what follows may be irrelevant.)

I agree that this seems like it will (aside from concerns about uncomputability, and assuming our utilities are bounded) yield a definite value for every possible universe. However, it seems to me that it has other serious problems which stop me finding it credible.

SCENARIO ONE. So, for instance, consider once again a world in which there are exactly two sorts of experience-subject, happy and unhappy. Traditionally we suppose infinitely many of both, but actually let's also consider possible worlds where there is just *one* happy experience-subject, or just *one* unhappy one. All these worlds come out exactly the same, so "infinitely many happy, one unhappy" is indistinguishable from "infinitely many unhappy, one happy". That seems regrettable, but it's a bullet I can imagine biting -- perhaps we just don't care at all about multiple instantiations of the *exact* same stream of experiences: it's just the same person and it's a mistake to think of them as contributing separately to the goodness of the universe.

So now let's consider some variations on this theme.

SCENARIO TWO. Suppose I think up an infinite (or for that matter merely very large) number of *highly improbable* experience-streams that one might have, all of them unpleasant. And I find a single *rather probable* experience-stream, a pleasant one, whose probability (according to our universal prior) is greater than the sum of those other ones. If I am contemplating bringing into being a world containing exactly the experience-streams described in this paragraph, then it seems that I *should*, because the expected net utility is positive, at least if the pleasantness and unpleasantness of the experiences in question are all about equal.

To me, this seems obviously crazy. Perhaps there's some reason why this scenario is incoherent (e.g., maybe somehow I shouldn't be *able* to bring into being all those very unlikely beings, at least not with non-negligible probability, so it shouldn't matter much what happens if I do, or something), but at present I don't see how that would work out.

The problem in SCENARIO TWO seems to arise from paying too much attention to the prior probability of the experience-subjects. We can also get into trouble by not paying enough attention to their *posterior* probability, in some sense.

SCENARIO THREE. I have before me a switch with two positions, placed there by the Creator of the Universe. They are labelled "Nice" and "Nasty". The CotU explains to me that the creation of future experience-subjects will be controlled by a source of True Randomness (whatever exactly that might be), in such a way that *all possible computable experience-subjects* have a real chance of being instantiated. The CotU has designed two different prefix-free codes mapping strings of bits to possible experience-subjects; then he has set a Truly Random coin to flip for ever, generating a new experience-subject every time a leaf of the code's binary tree is reached, so that we get an infinite number of experience-subjects generated at random, with a distribution depending on the prefix-free code being used. The Nice and Nasty settings of the switch correspond to two different codes. The CotU has computed that with the switch in the "Nice" position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the "Nasty" position, it's large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.

In this case, our conditioning doesn't remove *any* possible experience-subjects from consideration, so we are indifferent between the "Nice" and "Nasty" settings of the switch.

This is another one where we *might* be right to bite the bullet. In the long run infinitely many of every possible experience-subject will be created in each version of the universe, so maybe these two universes are "anagrams" of one another and *should* be considered equal. So let's tweak it.

SCENARIO FOUR. Same as in SCENARIO THREE, except that now the CotU's generator will run until it has produced a trillion experience-subjects and then shut off for ever.

It is still the case that with the switch in either setting any experience-subject is possible, so we don't get to throw any of them out. But it's no longer the case that the universes generated in the "Nice" and "Nasty" versions are with probability 1 (or indeed with not-tiny probability) identical in any sense.

So far, these scenarios all suppose that somehow we are able to generate arbitrary sets of possible experience-subjects, and arrange for those to be *all* the experience-subjects there are, or at least all there are after we make whatever decision we're making. That's kinda artificial.

SCENARIO FIVE. Our universe, just as it is now. We assume, though, that our universe is in fact infinite. You are trying to decide whether to torture me to death.

So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don't. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe. Perhaps in the version of the world where you torture me to death this makes you more likely to do other horrible things, or makes other people who care for me suffer more, but again none of this makes any experiences *impossible* that would otherwise have been possible, or vice versa. So our universe-evaluator is indifferent between these choices.

(The possibly-overcomplicated business in one of my other comments, where I tried to consider doing something Solomonoff-like using both *my* experiences and those of some hypothetical possibly-other experience-subject in the world, was intended to address these problems caused by considering only *possibility* and not anything stronger. I couldn't see how to make it work, though.)

## ↑ comment by Chantiel · 2021-11-05T03:57:15.590Z · LW(p) · GW(p)

RE: scenario one:

All these worlds come out exactly the same, so "infinitely many happy, one unhappy" is indistinguishable from "infinitely many unhappy, one happy"

It's not clear to me how they are indistinguishable. As long as the agent that's unhappy can have itself and its circumstances described with a finite description length, there would be a non-zero probability of an agent ending up as that one. Thus, making the agent unhappy would decrease the moral value of the world.

I'm not sure what would happen if the single unhappy agent has infinite complexity and 0 probability. But I suspect that this could be dealt with if you expanded the system to also consider non-real probabilities. I'm no expert on non-real probabilities, but I bet the probability of being unhappy given that there is an unhappy agent would be infinitesimally higher than the probability in the world in which there are no unhappy agents.

RE: scenario two: It's not clear to me how this is crazy. For example, consider this situation: when agents are born, an AI flips a biased coin to determine what will happen to them. Each coin has a 99.999% chance of landing on heads and a 0.001% chance of landing on tails. If the coin lands on heads, the AI will give the agent some very pleasant experience stream, and all such agents will get the same pleasant experience stream. But if it lands on tails, the AI will give the agent some unpleasant experience stream that is also very different from the other unpleasant ones.

This sounds like a pretty good situation to me. It's not clear to me why it wouldn't be. I mean, I don't see why the diversity of the positive experiences matters. And if you do care about the diversity of positive experiences, this would have unintuitive results. For example, suppose all agents have identical preferences and their satisfaction is maximized by experience stream S. Well, if you have a problem with the satisfied agents having just one experience stream, then you would be incentivized to coerce the agents to instead have a variety of different experience streams, even if they didn't like these experience streams as much.

RE: scenario three:

The CotU has computed that with the switch in the "Nice" position, the expected utility of an experience-subject in the resulting universe is large and positive; with the switch in the "Nasty" position, it's large and negative. But in both cases every possible experience-subject has a nonzero probability of being generated at any time.

I don't follow your reasoning. You just said that in the "Nice" position the expected value is large and positive, and in the "Nasty" position it's large and negative. And since my ethical system seeks to maximize the expected value of life satisfaction, it seems trivial to me that it would prefer the "Nice" position.

Whether or not you switch it to the "Nice" position won't rule out any possible outcomes for an agent, but it seems pretty clear that it would change the probabilities of them.
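A small sketch may make this concrete. It assumes a toy universe with only three possible experience-subjects and two made-up prefix-free codes, `NICE` and `NASTY`; the names, codewords, and utilities are all hypothetical. Both codes can produce every subject, yet a fair coin reaches a codeword of length L with probability 2^-L, so the two switch settings give different expected utilities.

```python
import random

# Two hypothetical prefix-free codes over the same three toy experience-subjects.
NICE = {"0": "happy", "10": "neutral", "11": "unhappy"}
NASTY = {"0": "unhappy", "10": "neutral", "11": "happy"}
UTILITY = {"happy": 1.0, "neutral": 0.0, "unhappy": -1.0}

def distribution(code):
    """A fair coin reaches a codeword of length L with probability 2**-L."""
    dist = {}
    for word, subject in code.items():
        dist[subject] = dist.get(subject, 0.0) + 2.0 ** -len(word)
    return dist

def expected_utility(code):
    """Expected utility of one generated subject under the given code."""
    return sum(p * UTILITY[s] for s, p in distribution(code).items())

def generate(code, count, rng):
    """Flip fair coins, emitting a subject each time a leaf of the code tree is reached."""
    subjects, prefix = [], ""
    while len(subjects) < count:
        prefix += rng.choice("01")
        if prefix in code:
            subjects.append(code[prefix])
            prefix = ""
    return subjects

print(expected_utility(NICE), expected_utility(NASTY))
print(generate(NICE, 10, random.Random(0)))
```

Passing a finite `count` to `generate` corresponds to the variant in which the generator shuts off after a fixed number of subjects.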

RE: scenario four: My ethical system would prefer the "Nice" position for the same reason described in scenario three.

RE: scenario five:

So far as I can tell, there is no difference in the set of possible experience-subjects in the world where you do and the world where you don't. Both the tortured-to-death and the not-tortured-to-death versions of me are apparently possibilities, so it seems that with probability 1 each of them will occur somewhere in this universe, so neither of them is removed from our set of possible experience-streams when we condition on occurrence in our universe.

Though none of the experience streams are impossible, the probability of you getting tortured is still higher conditioning on me deciding to torture you. To see why, note the situation, "Is someone just like Slider who is vulnerable to being tortured by demon lord Chantiel". This has finite description length, and thus non-zero probability. And if I decide to torture you, then the probability of you getting tortured if you end up in this situation is high. Thus, the total expected value of life satisfaction would be lower if I decided to torture you. So my ethical system would recommend not torturing you.

In general, don't worry about if an experience stream is possible or not. In an infinite universe with quantum noise, I think pretty much all experience streams would occur with non-zero probability. But you can still adjust the probabilities of an agent ending up with the different streams.

## ↑ comment by gjm · 2021-11-05T05:09:56.722Z · LW(p) · GW(p)

It sounds as if my latest attempt at interpreting what your system proposes doing is incorrect, because the things you're disagreeing with seem to me to be straightforward consequences of that interpretation. Would you like to clarify how I'm misinterpreting now?

Here's my best guess.

You wrote about specifications of an experience-subject's *universe and situation in it*. I mentally translated that to their *stream of experiences* because I'm thinking in terms of Solomonoff induction. Maybe that's a mistake.

So let's try again. The key thing in your system is *not* a program that outputs a hypothetical being's stream of experiences, it's a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that's probably OK.

So that ought to give a well-defined (modulo the usual stuff about uncomputability) probability distribution over experience-subjects-in-universes. And then you want to condition on "being in a universe with such-and-such characteristics" (which may or may not specify the universe itself completely) and look at the expected utility-or-utility-like-quantity of all those experience-subjects-in-universes after you rule out the universes without such-and-such characteristics.

It's now stupid-o'-clock where I am and I need to get some sleep. I'm posting this even though I haven't had time to think about whether my current understanding of your proposal seems like it might work, because on past form there's an excellent chance that said understanding is wrong, so this gives you more time to tell me so if it is :-). If I don't hear from you that I'm still getting it all wrong, I'll doubtless have more to say later...

## ↑ comment by Chantiel · 2021-11-05T22:18:12.727Z · LW(p) · GW(p)

So let's try again. The key thing in your system is not a program that outputs a hypothetical being's stream of experiences, it's a program that outputs a complete description of a (possibly infinite) universe and also an unambiguous specification of a particular experience-subject within that universe. This is only possible if there are at most countably many experience-subjects in said universe, but that's probably OK.

That's closer to what I meant. By "experience-subject", I think you mean a specific agent at a specific time. If so, my system doesn't require an unambiguous specification of an experience-subject.

My system doesn't require you to pinpoint the exact agent. Instead, it only requires you to specify a (reasonably precise) description of an agent and its circumstances. This doesn't mean picking out a single agent, as there may be infinitely many agents that satisfy such a description.

As an example, a description could be something like, "Someone named gjm in a 2021-Earth-like world with personality <insert a description of your personality and thoughts> who has <insert description of your life experiences> and is currently <insert description of how your life is currently>"

This doesn't pick out a single individual. There are probably infinitely-many gjms out there. But as long as the description is precise enough, you can still infer your probable eventual life satisfaction.

But other than that, your description seems pretty much correct.

It's now stupid-o'-clock where I am and I need to get some sleep.

I feel you. I also posted something at stupid-o'-clock and then woke up at 5am, realized I messed up, and then edited a comment and hoped no one saw the previous error.

## ↑ comment by gjm · 2021-11-06T01:23:58.017Z · LW(p) · GW(p)

No, I don't intend "experience-subject" to pick out a specific time. (It's not obvious to me whether a variant of your system that worked that way would be better or worse than your system as it is.) I'm using that term rather than "agent" because -- as I think you point out in the OP -- what matters for moral relevance is having experiences rather than performing actions.

So, anyway, I *think* I now agree that your system does indeed do approximately what you say it does, and many of my previous criticisms do not in fact apply to it; my apologies for the many misunderstandings.

The fact that it's lavishly uncomputable is a problem for using it in practice, of course :-).

I have some other concerns, but haven't given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe *physically bigger* experience subjects get treated as more important because they're "easier to locate", and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?

## ↑ comment by Chantiel · 2021-11-07T19:39:45.789Z · LW(p) · GW(p)

The fact that it's lavishly uncomputable is a problem for using it in practice, of course :-).

Yep. To be fair, though, I suspect any ethical system that respects agents' arbitrary preferences would also be incomputable. As a silly example, consider an agent whose terminal values are, "If Turing machine T halts, I want nothing more than to jump up and down. However, if it doesn't halt, then it is of the utmost importance to me that I never jump up and down and instead sit down and frown." Then any ethical system that cares about those preferences is incomputable.

Now this is a pretty silly example, but I wouldn't be surprised if there were more realistic ones. For one, it's important to respect other agents' moral preferences, and I wouldn't be surprised if their ideal moral-preferences-on-infinite-reflection would be incomputable. It seems to me that moral philosophers act as some approximation of, "Find the simplest model of morality that mostly agrees with my moral intuitions". If they include incomputable models, or arbitrary Turing machines that may or may not halt, then the moral value of the world to them would in fact be incomputable, so any ethical system that cares about preferences-given-infinite-reflection would also be incomputable.

I have some other concerns, but haven't given the matter enough thought to be confident about how much they matter. For instance: if the fundamental thing we are considering probability distributions over is programs specifying a universe and an experience-subject within that universe, then it seems like maybe physically bigger experience subjects get treated as more important because they're "easier to locate", and that seems pretty silly. But (1) I think this effect may be fairly small, and (2) perhaps physically bigger experience-subjects should on average matter more because size probably correlates with some sort of depth-of-experience?

I'm not that worried about agents that are physically bigger, but it's true that there may be some agents or agent descriptions in situations that are easier to pick out (in terms of having a short description length) than others. Maybe there's something really special about the agent that makes it easy to pin down.

I'm not entirely sure if this would be a bug or a feature. But if it's a bug, I think it could be dealt with by just choosing the right prior over agent-situations. Specifically, for any description of an environment with a finite set of agents A, the probability of ending up as a, conditioned only on being one of the agents in that environment, should be constant for all a in A. This way, the prior isn't biased in favor of the agents that are easy to pick out.
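One way to write the constraint described above, using $a$ for an individual agent and $E$ for the environment description (both symbols are assumptions added here for illustration):

```latex
P(\text{I am } a \mid \text{I am one of the agents in } E) = \frac{1}{|A|}
\quad \text{for every } a \in A
```

Under this choice, how easy an agent is to locate within the environment plays no role once the environment itself is fixed.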

## comment by Richard_Kennaway · 2021-11-04T10:44:10.445Z · LW(p) · GW(p)

How do you deal with the problem that in an infinite setting, expected values do not always exist? For example, the Cauchy distribution has no expected value. Neither do various infinite games, e.g. St Petersburg with payouts of alternating sign. Even if you can handle Inf and -Inf, you're then exposed to NaNs.

## ↑ comment by Chantiel · 2021-11-05T21:11:50.930Z · LW(p) · GW(p)

Thanks for responding. As I said, the measure of satisfaction is bounded. And all bounded random variables have a well-defined expected value. Source: Stack Exchange.
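The standard argument behind this is short. As a sketch, let $X$ be the satisfaction measure and $M$ a bound on its absolute value (both symbols are introduced here for illustration):

```latex
|X| \le M \ \text{almost surely}
\;\Longrightarrow\;
\mathbb{E}|X| \le M < \infty
\;\Longrightarrow\;
\mathbb{E}[X] \ \text{exists and} \ |\mathbb{E}[X]| \le M
```

The Cauchy and St Petersburg examples escape this only because their payoffs are unbounded.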

## comment by Slider · 2021-11-01T00:25:46.284Z · LW(p) · GW(p)

Kind of hard to get a handle on. I will poke in just a little bit. I understood the question about trying to make the AI as a slight chance of an enormous value. It seems like deciding to do so limits what the probability can be. However, the logic seems to be that since it's not zero, it's a positive probability. I would say that it argues that the probability is finite from its being possible to happen. Unfortunately, probability 0 is compatible with single cases being possible. That a case happens doesn't prove a non-zero probability.

In general I would think that a finite judge in an infinite sea is going to have infinitesimal impact. Your theory seems to deal a lot in probabilities, and I guess in some sense infinitesimals are more palpable than transfinites. However, I would be very careful about how situations like X+negligible vs X+different negligible get calculated, especially if small negligible differences get rounded to the nearest real accuracy of 0.

If I have a choice of (finitely) helping a single human and I believe there to be infinitely many humans, then the probability of a human being helped in my world will nudge by less than a real number. And if we want to stick with probabilities being real, then the rounding will make for infinitarian paralysis.

Another scenario raises the specter of fanaticism. Say by doing murder I can create an AI that will make all future agents happy, but being murdered is not happy times. Comparing agents before and after the singularity might make sense. And so might killing different finite amounts of people. But mixing them gets tricky or favours the "wider class". One could think of a distribution where for values between 0 and 4 you up the utility by 1, except for pi (or any single real (or any set of measure 0)), for which you lower it by X. Any finite value for X will not be able to nudge the expectation value anywhere. Real ranges vs real ranges makes sense, discrete sets vs discrete sets makes sense, but when you cross transfinite archimedean classes one is in trouble.

## ↑ comment by Chantiel · 2021-11-02T01:42:16.013Z · LW(p) · GW(p)

Kind of hard to get a handle on.

Are you referring to it being hard to understand? If so, I appreciate the feedback and am interested in the specifics of what is difficult to understand. Clarity is a top priority for me.

If I have a choice of (finitely) helping a single human and I believe there to be infinitely many humans, then the probability of a human being helped in my world will nudge by less than a real number. And if we want to stick with probabilities being real, then the rounding will make for infinitarian paralysis.

You are correct that a single human would have 0 or infinitesimal *causal* impact on the moral value of the world or the satisfaction of an arbitrary human. However, it's important to note that my system requires you to use a decision theory that considers not just your causal impacts, but also your acausal ones.

Remember that if you decide to take a certain action, that implies that other agents who are sufficiently similar to you and in sufficiently similar circumstances *also* take that action. Thus, you can acausally have non-infinitesimal impact on the satisfaction of agents in situations of the form, "An agent in a world with someone just like Slider who is also in very similar circumstances to Slider's." The above scenario is of finite complexity and isn't ruled out by evidence. Thus, the probability of an agent ending up in such a situation, conditioning only on being some agent in this universe, is nonzero.

Another scenario raises the specter of fanaticism. Say by doing murder I can create an AI that will make all future agents happy, but being murdered is not happy times. Comparing agents before and after the singularity might make sense. And so might killing different finite amounts of people. But mixing them gets tricky or favours the "wider class". One could think of a distribution where for values between 0 and 4 you up the utility by 1, except for pi (or any single real (or any set of measure 0)), for which you lower it by X. Any finite value for X will not be able to nudge the expectation value anywhere. Real ranges vs real ranges makes sense, discrete sets vs discrete sets makes sense, but when you cross transfinite archimedean classes one is in trouble.

I'm not really following what you see as the problem here. Perhaps my above explanation clears things up. If not, would you be willing to elaborate on how transfinite archimedean classes could potentially lead to trouble?

Also, to be clear, my system only considers finite probabilities and finite changes to the moral value of the world. Perhaps there's some way to extend it beyond this, but as far as I know it's not necessary.

## ↑ comment by Slider · 2021-11-02T10:53:39.245Z · LW(p) · GW(p)

Post is pretty long-winded, a bit wall-of-texty: a lot of text for what seems like a fixed amount of content, while being very claimy and less showy about the properties.

My suspicion is that the acausal impact ends up being infinitesimal anyway. Even if one would get finite probability impact for probabilities concerning an infinite universe for claims like "should I help this one person", claims like "should I help these infinite persons" would still have an infinity class jump between the statements (even if both need to have an infinite kick into the universe to make a dent, there is an additional level to one of these statements, and not all infinities are equal).

I am going to anticipate that your scheme will try to rule out statements like "should I help these infinite persons" for a reason like "it's not of finite complexity". I am not convinced that finite complexity descriptions are good guarantees that the described condition makes for a finite proportion of possibility space. I think "Getting a perfect bullseye" is a description of finite complexity, but it describes an outcome of (real) 0 probability. Being positive is no guarantee of finitude; infinitesimal chances would spell trouble for the theory. And if statements like "Slider or (near equivalent) gets a perfect bullseye" are disallowed for not being finitely groundable, then most references to infinite objects are ruled out anyway. It's not exactly an infinite ethic if it is not allowed to refer to infinite things.

I am also slightly worried that "description cuts" will allow "doubling the ball" kind of events where total probability doesn't get preserved. That phenomenon gets around the theoretical problems by designating some sets non-measurable. But then being a set doesn't mean it's measurable. I am worried that "descriptions always have a usable probability" is too lax and will bleed from the edges like a naive assumption that all sets are measurable would.

I feel at a loss for being able to spell out my worries. It is mainly about murkiness making it possible to hide undefinedness. As an analogy, one could think of people trying to formulate calculus. There is a way of thinking about it where you make tiny infinitesimal triangles and measure their properties. In order for the side lengths to be "sane", the sides of the triangle need both be "small". If you had a triangle that was finite in length on one side and infinitesimal on another, then the angle of the remaining side is likely to be something "wild". If you "properly take the limits" then you can essentially forget that you are in the infinitesimal realm (or one that can be made analogous to one), but forgetfulness doesn't help with checking for that "properness".

## ↑ comment by Chantiel · 2021-11-04T00:53:40.462Z · LW(p) · GW(p)

Post is pretty long winded,a bit wall fo texty in a lot of text which seems like fixed amount of content while being very claimy and less showy about the properties.

Yeah, I see what you mean. I have a hard time balancing between being succinct and providing sufficient support and detail. It actually used to be shorter, but I lengthened it to address concerns brought up a review.

My suspicion is that the acausal impact ends up being infinitesimal anyway. Even if one would get finite probability impact for probabilities concerning an infinite universe for claims like "should I help this one person", claims like "should I help these infinite persons" would still have an infinity class jump between the statements (even if both need to have an infinite kick into the universe to make a dent there, there is an additional level to one of these statements, and not all infinities are equal).

Could you elaborate what you mean by a class jump?

Remember that if you ask, "should I help this one person", that is another way of saying, "should I (acausally) help this infinite class of people in similar circumstances". And I think in general the cardinality of this infinity would be the same as the cardinality of people helped by considering "should I help these infinitely-many persons".

Most likely the number of people in this universe is countably infinite, and all situations are repeated infinitely-many times. Thus, asking, "should I help this one person" would acausally help $\aleph_0$ people, and so would causally helping the infinitely-many people.

I am going to anticipate that your scheme will try to rule out statements like "should I help these infinite persons" for a reason like "it's not of finite complexity". I am not convinced that finite-complexity descriptions are good guarantees that the described condition makes up a finite proportion of possibility space. I think "Getting a perfect bullseye" is a description of finite complexity, but it describes an outcome of (real) 0 probability. Being positive is no guarantee of finitude; infinitesimal chances would spell trouble for the theory. And if statements like "Slider or (near equivalent) gets a perfect bullseye" are disallowed for not being finitely groundable, then most references to infinite objects are ruled out anyway. It's not exactly an infinite ethic if it is not allowed to refer to infinite things.

No, my system doesn't rule out statements of the form, "should I help these infinitely-many persons". This can have finite complexity, after all, provided there is sufficient regularity in who will be helped. Also, don't forget, even if you're just causally helping a single person, you're still acausally helping infinitely-many people. So, in a sense, ruling out helping infinitely-many people would rule out helping anyone.

I am also slightly worried that "description cuts" will allow "doubling the ball" kinds of events where total probability doesn't get preserved. That phenomenon gets around the theoretical problems by designating some sets non-measurable. But then being a set doesn't mean it's measurable. I am worried that "descriptions always have a usable probability" is too lax and will bleed from the edges like a naive assumption that all sets are measurable would.

I'm not sure what specifically you have in mind with respect to doubling the sphere-esque issues. But if your system of probabilistic reasoning doesn't preserve the total probability when partitioning an event into multiple events, that sounds like a serious problem with your probabilistic reasoning system. I mean, if your reasoning system does this, then it's not even a probability measure.

If you can prove $A \iff (B_1 \lor B_2)$, but the system still says $P(A) \neq P(B_1 \lor B_2)$, then you aren't satisfying one of the basic desiderata that motivated Bayesian probability theory: asking the same question in two different ways should result in the same probability. And $P(A)$ is just another way of asking $P(B_1 \lor B_2)$.

Replies from: Slider

## ↑ comment by Slider · 2021-11-04T11:18:04.571Z · LW(p) · GW(p)

The "nearby" acausal relatedness gives a certain multiplier (that is transfinite). That multiplier should be the same for all options in that scenario. Then if you have an option that has a finite multiplier and one that has an infinite multiplier, the "simple" option is "only" infinite overall but the "large" option is "doubly" infinite, because each of your likenesses has an infinite impact alone already (plus as an aggregate it would gain an infinite quality that way too).

Now cardinalities don't really support "doubly infinite": $\aleph_0 + \aleph_0$ is just $\aleph_0$. However, for transfinite values cardinality and ordinality diverge, and for example with surreal numbers one could have $\omega + \omega > \omega$ and, relevantly for here, $\omega < \omega^2$. As I understand it there are four kinds of impact: A = "direct impact of helping one", B = "direct impact of helping an infinite amount", C = "acausal impact of choosing to help 1" and D = "acausal impact of choosing to help an infinite amount". You claim that B and C are either equivalent or roughly equivalent and A and B are not. But there is a lurking paralysis if D and C are (roughly) equivalent. By one logic because we prefer B to A then if we "acausalize" this we should still preserve this preference (because "the amount of copies granted" would seem to be even handed), so we would expect to prefer D to C. However in a system where all infinites are of equal size then C=D and we become ambivalent between the options. To me it would seem natural, and the boundary conditions are near to forcing, that D has just as vast a gap to C as B has to A.
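
For reference, the standard facts being contrasted here can be written out. This is only a summary of textbook cardinal and ordinal arithmetic, not a claim about how the proposed ethical system handles these quantities. Cardinal arithmetic absorbs repeated countable infinities:

$$
\aleph_0 + \aleph_0 = \aleph_0, \qquad \aleph_0 \cdot \aleph_0 = \aleph_0,
$$

whereas ordinal (and likewise surreal) arithmetic keeps them apart:

$$
\omega < \omega + 1 < \omega + \omega = \omega \cdot 2 < \omega^2 .
$$

So whether "doubly infinite" is distinguishable from "infinite" depends on which notion of size is in use.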

In the above, "roughly" can be somewhat translated to more precise language as "are within finite multiples of each other", i.e. they are not relatively infinite, i.e. they belong to the same Archimedean field (helping 1 person or 2 persons are not the same, but they both represent the case of "help a fixed finite amount of people"). Within the example it seems we need to identify at least 3 such fields. Moving within a field is "easy", well-understood real math. But when you need to move between them, to "jump levels", that is less understood. A question like "are two finite numbers equal?" can't be answered in the abstract; we need to specify the finites (and the result could go either way). Likewise, knowing that an amount is transfinite only tells us about its quality, and we still (need/find utility) to ask how big it is.

One way one can avoid the weaknesses of a system is by not pinning it down, and another place for the infinities to hide is in the infinitesimals. I have the feeling that the normalization is done slightly differently in different turns. Consider peace (don't do anything) and punch (punch 1 person). As a separate problem this is no biggie. Then consider dust (throw sand at infinitely many people) and injury (throw sand at infinitely many people and punch 1 person). Here an ad hoc analysis might choose a clear winner. Then consider the combined problem where you have all the options: peace, punch, dust and insult. Having 1 analysis that gets applied to all options equally will run into trouble. If the analysis is somehow "turn options into real probabilities" then problems with infinitesimals are likely to crop up.

The structural reason is that 2 Archimedean fields can't be compressed into 1. The problem would be that methods that differentiate between throwing sand or not would gloss over punching or not, and methods that differentiate between punching or not blow up when considering sanding or not. Now my preferred answer would be "use infinitesimal probabilities as real entities", but then I am using something more powerful than real probabilities. That the probabilities are in the range of 0 to 1 doesn't make it "easy" for reals to cope with them. The problems would manifest in failing to systematically assign different numbers. There could be the "highest impact only" problem of neglecting any smaller-scale impact, which would assign dust and insult the same number. There could be the "modulo infinity" failure mode where peace and dust get the same number. "One class only" would fail to give numbers for one of the subproblems.

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-05T02:58:52.119Z · LW(p) · GW(p)

By one logic because we prefer B to A then if we "acausalize" this we should still preserve this preference (because "the amount of copies granted" would seem to be even handed), so we would expect to prefer D to C. However in a system where all infinites are of equal size then C=D and we become ambivalent between the options.

We shouldn't necessarily prefer D to C. Remember that one of the main things you can do to increase the moral value of the universe is to try to causally help other creatures so that other people who are in sufficiently similar circumstances to you will also help, so you acausally make them help others. Suppose you instead have the option to causally help all of the agents that would have been acausally helped if you just causally helped one agent. Then the AI shouldn't prefer D to C, because the results are identical.

Here an ad hoc analysis might choose a clear winner. Then consider the combined problem where you have all the options: peace, punch, dust and insult. Having 1 analysis that gets applied to all options equally will run into trouble. If the analysis is somehow "turn options into real probabilities" then problems with infinitesimals are likely to crop up.

Could you explain how this would cause problems? If those are the options, it seems like a clear-cut case of my ethical system recommending peace, unless there is some benefit to punching, insulting, or throwing sand you haven't mentioned.

To see why, if you decide to throw sand, you're decreasing the satisfaction of agents in situations of the form "Can get sand thrown at them from someone just like Slider". This would in general decrease the moral value of the world, so my system wouldn't recommend it. The same reasoning can show that the system wouldn't recommend punching or insulting.

There could be the "modulo infinity" failure mode where peace and dust get the same number. "One class only" would fail to give numbers for one of the subproblems.

Interesting. Could you elaborate?

I'm not really clear on why you are worried about these different classes. Remember that any action you take will, at least acausally, help a countably infinite number of agents. Similarly, I think all your actions will have some real-valued effect on the moral value of the universe. To see why, just note that as long as you help one agent, you increase the expected satisfaction of agents in situations of the form, "<description of the circumstances of the above agent> who can be helped by someone just like Slider". This has finite complexity, and thus real and non-zero probability. And the moral value of the universe is capped by the bounds of the life satisfaction measure, so you can't have infinite increases to the moral value of the universe, either.

Replies from: Slider

## ↑ comment by Slider · 2021-11-05T09:02:20.565Z · LW(p) · GW(p)

You can't causally help people without also acausally helping in the same go. Your acausal "influence" forces people matching your description to act the same. Even if it is possible to consider the directly helped and the indirectly helped to be the same, they could also be different. In order to be fair we should also extend this to C. What if the person helped by all the acausal copies is in fact the same person? (If there is a proof that it can't be, why doesn't that apply when the patient group is large?)

The interactions are all supposed to be negative in peace, punch, dust, insult. The surprising thing to me would be that the system would be ambivalent between sand and insult being a bad idea. If we don't necessarily prefer D to C when helping, does it matter if we torture our people a lot or a little, as it's going to get infinity-saturated anyway?

The basic situation is that I have intuitions which I can't formulate that well. I will try another route. Suppose I help one person and then there is either a finite or infinite amount of people in my world. Finite impact over finite people leads to a real and finite kick. Finite impact over infinite people leads to an infinitesimal kick. Ah, but acausal copies of the finites! Yeah, but what about the acausal copies of the infinites? When I say "world has finite or infinite people" that is "within description"; say that there are infinite people because I believe there are infinitely many stars. Then all the acausal copies of Sol are going to have their own "out there" stars. Acts that "help all the stars" and "all the stars as they could have been" are different. At least until we consider that any agent that decides to "help all the stars" will have acausal shadows "that could have been". But still this consideration increases the impact on the multiverse (or keeps it the same if moving from a monoverse to a multiverse in the same step).

One way to slither out of this is to claim that world-predescription-expansion needs to be finite, that there are only a finite configuration of stars until they start to repeat. Then we can drop "directly infinite" worlds and all infinity is because of acausality. So there is no such thing as directly helping infinite amount of people.

If I have real, non-zero impacts for infinite amount of people naively that would add up to a more than finite aggregate. Fine, we can renormalise the aggregate to be 1 with a division, but that will mean that a single agent's weight on that average is going to be infinitesimal (and thus not real). If we acausalise then we should do so both for the numerator and the denominator. If we don't acausalise the denominator then we should still acausalise the numerator even if we have finite patients (but then we end up with a more than finite kick). It is inconsistent if the nudges happen based on bad luck if we are "in the wrong weight class".

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-05T21:48:36.171Z · LW(p) · GW(p)

The interactions are all supposed to be negative in peace, punch, dust, insult. The surprising thing to me would be that the system would be ambivalent between sand and insult being a bad idea. If we don't necessarily prefer D to C when helping, does it matter if we torture our people a lot or a little, as it's going to get infinity-saturated anyway?

Could you explain what insult is supposed to do? You didn't say what in the previous comment. Does it causally hurt infinitely-many people?

Anyways, it seems to me that my system would not be ambivalent about whether you torture people a little or a lot. Let C be the class of finite descriptions of circumstances of agents in the universe that would get hurt a little or a lot if you decide to hurt them. The probability of an agent ending up in class C is non-zero. But if you decide to torture them a lot their expected life-satisfaction would be much lower than if you decide to torture them a little. Thus, the total moral value of the universe would be lower if you decide to torture a lot rather than a little.

When I say "world has finite or infinite people" that is "within description"; say that there are infinite people because I believe there are infinitely many stars. Then all the acausal copies of Sol are going to have their own "out there" stars. Acts that "help all the stars" and "all the stars as they could have been" are different. At least until we consider that any agent that decides to "help all the stars" will have acausal shadows "that could have been". But still this consideration increases the impact on the multiverse (or keeps it the same if moving from a monoverse to a multiverse in the same step).

I can't say I'm following you here. Specifically, how do you consider, "help all the stars" and "all the stars as they could have been" to be different? I thought, "help" meant, "make it better than it otherwise could have been". I'm also not sure what counts as acausal shadows. I, alas, couldn't find this phrase used anywhere else online.

If I have real, non-zero impacts for infinite amount of people naively that would add up to a more than finite aggregate.

Remember that my ethical system doesn't aggregate anything across all agents in the universe. Instead, it merely considers finite descriptions of situations an agent could be in the universe, and then aggregates the expected value of satisfaction in these situations, weighted by probability conditioning only on being in this universe.

There's no way for this to be infinite. The probabilities of all the situations sum to 1 (they are assumed to be disjoint), and the measure of life satisfaction was said to be bounded.
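
A minimal sketch of that boundedness argument, under stated assumptions: the situation descriptions, their probabilities, and the satisfaction values below are all made up for illustration, and satisfaction is assumed to lie in [0, 1]. The function `moral_value` is a hypothetical name, not something defined in the post.

```python
from typing import Dict, Tuple

# Hypothetical situation descriptions mapped to
# (probability of being in the situation, expected satisfaction there).
# Assumptions: situations are disjoint, probabilities sum to 1,
# and satisfaction is bounded in [0, 1].
Situations = Dict[str, Tuple[float, float]]


def moral_value(situations: Situations) -> float:
    """Probability-weighted expected satisfaction over situation descriptions."""
    total_probability = sum(p for p, _ in situations.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("situation probabilities must sum to 1")
    for _, satisfaction in situations.values():
        if not 0.0 <= satisfaction <= 1.0:
            raise ValueError("satisfaction must lie in [0, 1]")
    # A convex combination of values in [0, 1] stays in [0, 1].
    return sum(p * s for p, s in situations.values())


torture_a_little = {
    "agent who can be hurt by someone like me": (0.01, 0.40),
    "every other situation": (0.99, 0.60),
}
torture_a_lot = {
    "agent who can be hurt by someone like me": (0.01, 0.05),
    "every other situation": (0.99, 0.60),
}

value_little = moral_value(torture_a_little)
value_lot = moral_value(torture_a_lot)
```

With these illustrative numbers, `value_lot` comes out below `value_little`, and both stay inside the assumed satisfaction bounds no matter how many agents each description covers.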

And remember, my system doesn't first find your causal impact on moral value of the universe and then somehow use this to find the acausal impact. Because in our universe, I think the causal impact will always be zero. Instead, just directly worry about acausal impacts. And your acausal impact on the moral value of the universe will always be finite and non-infinitesimal.

Replies from: Slider

## ↑ comment by Slider · 2021-11-06T00:56:58.407Z · LW(p) · GW(p)

Insult is when you do both punch and dust, i.e. make a negative impact on an infinite amount of people and an additional negative impact on a single person. If degree of torture matters, then dusting and punching the same person would be relevant. I guess the theory per se would treat it differently if the punched person was not one of the dusted ones.

"doesn't aggregate anything" - "aggregates the expected value of satisfaction in these situations"

When we form the expectation of what is going to happen in the described situation, I imagine breaking it down into sad stories and good stories. The expectation sways upwards if there are more good stories and downwards if there are more bad stories. My life will turn out somehow, which can differ from my "storymates'" outcomes. I didn't try to hit any special term but just to refer to the cases the probabilities of the stories refer to.

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-08T20:49:34.988Z · LW(p) · GW(p)

Thanks for clearing some things up. There are still some things I don't follow, though.

You said my system would be ambivalent between sand and insult. I just wanted to make sure I understand what you're saying here. Is insult specifically throwing sand at the same people that get it thrown at them in dust, with the same amount of sand thrown at them at the same throwing speed? If so, then it seems to me that my system would clearly prefer sand to insult. This is because there is some non-zero chance of an agent, conditioning only on being in this universe, being punched due to people like me choosing insult. This would make their satisfaction lower than it otherwise would be, thus decreasing the moral value of the universe if I chose insult over sand.

On the other hand, perhaps the number of people harmed by sand in "insult" would be lower than the number harmed by sand in "dust". In this situation, my ethical system could potentially prefer insult over dust. This doesn't seem like a bad thing to me, though, if it means you save some agents in certain agent-situation-descriptions from getting sand thrown at them.

Also, I'm wondering about your paragraph starting with, "The basic situation is that I have intuitions which I can't formulate that well. I will try another route." If I'm understanding it correctly, I think I more or less agree with what you said in that paragraph. But I'm having a hard time understanding the significance of it. Are you intending to show a potential problem with my ethical system using it? The paragraph after it makes it seem like you were, but I'm not really sure.

Replies from: Slider

## ↑ comment by Slider · 2021-11-08T22:04:01.346Z · LW(p) · GW(p)

Yes, insult is supposed to add to the injury.

Under my error model you run into trouble when you treat any transfinite amount the same. From that perspective, recognising two transfinite amounts that could be different is progress.

Another attempt to throw at you a situation you might not be able to handle. Instead of having 2 infinite groups of unknown relative size all receiving the same bad thing, give, as compensation for the abuse, 1 slice of cake to one group and 2 slices of cake to the second group. Could there be a difference in the group size that perfectly balances the cake slice difference in order to keep cake expectation constant?

Additional challenging situation. Instead of giving 1 or 2 slices of cake, say that each slice is 3 cm wide, so the original choices are between 3 cm of cake and 6 cm of cake. Now take some custom amount of cake slice (say 2.7 cm), then determine what the group size would be to keep the world cake expectation the same. Then add 1 person to that group. Then convert that back to a cake slice width that keeps cake expectation the same. How wide is the slice? Another formulation of the same challenge: Define a real number r for which converting that to a group size would get you a group of 5 people.

Did you get on board about the difference between "help all the stars" and "all the stars as they could have been"?

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-10T23:31:33.993Z · LW(p) · GW(p)

Under my error model you run into trouble when you treat any transfinite amount the same. From that perspective, recognising two transfinite amounts that could be different is progress.

I guess this is the part I don't really understand. My infinite ethical system doesn't even think about transfinite quantities. It only considers the prior probability over ending up in situations, which is always real-valued. I'm not saying you're wrong, of course, but I still can't see any clear problem.

Another attempt to throw at you a situation you might not be able to handle. Instead of having 2 infinite groups of unknown relative size all receiving the same bad thing, give, as compensation for the abuse, 1 slice of cake to one group and 2 slices of cake to the second group. Could there be a difference in the group size that perfectly balances the cake slice difference in order to keep cake expectation constant?

Are you asking if there is a way to simultaneously change the group size as well as change the relative amount of cake for each group so the expected number of cakes received is constant?

If this is what you mean, then my system can deal with this. First off, remember that my system doesn't worry about the number of agents in a group, but instead merely cares about the probability of an agent ending up in that group, conditioning only on being in this universe.

By changing the group size, however you define it, you can affect the probability of you ending up in that group. To see why, suppose you can do something to add any agents in a certain situation-description into the group. Well, as long as this situation has a finite description length, the probability of ending up in that situation is non-zero, so stopping them from being in that situation can decrease the probability of you ending up in that group.

So, currently, the expected value of cake received from these situations is P(in first group) * 1 + P(in second group) * 2. (For simplicity, I'm assuming no one else in the universe gets cake.) So, if you increase the number of cakes received by the second group by u, you just need to decrease P(in first group) by u * P(in second group) to keep the expectation constant.

Additional challenging situation. Instead of giving 1 or 2 slices of cake, say that each slice is 3 cm wide, so the original choices are between 3 cm of cake and 6 cm of cake. Now take some custom amount of cake slice (say 2.7 cm), then determine what the group size would be to keep the world cake expectation the same. Then add 1 person to that group. Then convert that back to a cake slice width that keeps cake expectation the same. How wide is the slice?

If literally only one more person gets cake, even considering acausal effects, then this would in general not affect the expected value of cake. So the slice would still be 2.7cm.

Now, perhaps you meant that you directly cause one more person to get cake, resulting acausally in infinitely-many others getting cake. If so, then here's my reasoning:

Previously, the expected value of cake received from these situations was P(in first group) * 1 + P(in second group) * 2. Since cake size is non-constant, let's add a variable to this. So let's use P(in first group) * u + P(in second group) * 2. I'm assuming only the 1-slice group gets its cake amount adjusted; you can generalize beyond this. u represents the amount of cake the first group gets, with one 3cm slice being represented as 1.

Suppose adding the extra person acausally results in an increase in the probability of ending up in the first group by $\epsilon$. So then, to avoid changing the expected value of cake, we need P(old probability of being in first group) * 1 = (P(old probability of being in first group) + $\epsilon$) * u.

Solve that, and you get u = P(old probability of being in first group) / (P(old probability of being in first group) + $\epsilon$). Just plug in the exact numbers of how much adding the person changes the probability of ending up in the group, and you can get an exact slice width.
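
As a worked instance of that formula, here is a small sketch. The values chosen for the old probability and for the acausal increase are hypothetical placeholders, since the discussion does not supply actual numbers, and `balancing_cake_amount` is a name introduced only for this illustration.

```python
def balancing_cake_amount(p_old: float, epsilon: float) -> float:
    """Solve p_old * 1 = (p_old + epsilon) * u for u."""
    if p_old <= 0.0 or epsilon < 0.0:
        raise ValueError("need p_old > 0 and epsilon >= 0")
    return p_old / (p_old + epsilon)


SLICE_WIDTH_CM = 3.0  # one full slice, as in the cake example

p_old = 0.02     # hypothetical prior probability of being in the first group
epsilon = 0.005  # hypothetical acausal increase in that probability

u = balancing_cake_amount(p_old, epsilon)  # cake amount in units of one slice
new_width_cm = u * SLICE_WIDTH_CM          # the same amount expressed in cm
```

With these placeholder numbers, u is about 0.8, so the slice would shrink to roughly 2.4 cm. The point of the sketch is only that a real-valued, non-infinitesimal change in probability yields an ordinary real slice width.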

Another formulation of the same challenge: Define a real number r for which converting that to a group size would get you a group of 5 people.

I'm not sure what you mean here. What does it mean to convert a real number to a group size? One trivial way to interpret this is that the answer is 5: if you convert 5 to a group size, I guess(?) that means a group of five people. So, there you go, the answer would be 5. I take it this isn't what you meant, though.

Did you get on board about the difference between "help all the stars" and "all the stars as they could have been"?

No, I'm still not sure what you mean by this.

Replies from: Slider

## ↑ comment by Slider · 2021-11-13T22:35:31.207Z · LW(p) · GW(p)

In P(old probability of being in first group) * 1 = (P(old probability of being in first group) + $\epsilon$) * u the epsilon is smaller than any real number and there is no real small enough that it could characterise the difference between 1 and u.

If you have some odds or expectations that deal with groups and you have other considerations that deal with a finite amount of individuals, you either have the finite people not impact the probabilities at all or the probabilities will stay infinitesimally close (for which I have seen a~b used as I am reading up on infinities), which will conflict with the desideratum of

Avoiding the fanaticism problem. Remedies that assign lexical priority to infinite goods may have strongly counterintuitive consequences.

In the usual way lexical priorities enter the picture because of something large, but in your system there is a lexical priority because of something small: distinctions so faint that they become separable from the "big league" issues.

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-14T22:25:16.592Z · LW(p) · GW(p)

In P(old probability of being in first group) * 1 = (P(old probability of being in first group) + $\epsilon$) * u the epsilon is smaller than any real number and there is no real small enough that it could characterise the difference between 1 and u.

Could you explain why you think so? I had already explained why $\epsilon$ would be real, so I'm wondering if you had an issue with my reasoning. To quote my past self:

Remember that if you decide to take a certain action, that implies that other agents who are sufficiently similar to you and in sufficiently similar circumstances also take that action. Thus, you can acausally have non-infinitesimal impact on the satisfaction of agents in situations of the form, "An agent in a world with someone just like Slider who is also in very similar circumstances to Slider's." The above scenario is of finite complexity and isn't ruled out by evidence. Thus, the probability of an agent ending up in such a situation, conditioning only on being some agent in this universe, is nonzero [and non-infinitesimal].

If you have some odds or expectations that deal with groups and you have other considerations that deal with a finite amount of individuals, you either have the finite people not impact the probabilities at all or the probabilities will stay infinitesimally close (for which I have seen a~b used as I am reading up on infinities), which will conflict with the desideratum...

Just to remind you, my ethical system basically never needs to worry about finite impacts. My ethical system doesn't worry about causal impacts, except to the extent that they inform you about the total acausal impact of your actions on the moral value of the universe. All things you do have infinite acausal impact, and these are all my system needs to consider. To use my ethical system, you don't even need a notion of causal impact at all.

## comment by conchis · 2021-10-29T21:45:39.071Z · LW(p) · GW(p)

This sounds essentially like average utilitarianism with bounded utility functions. Is that right? If so, have you considered the usual objections to average utilitarianism (in particular, re rankings over different populations)?

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-10-30T22:06:59.716Z · LW(p) · GW(p)

Thank you for responding. I actually had someone else bring up the same point in a review; maybe I should have addressed this in the article.

The average life satisfaction is undefined in a universe with infinitely-many agents of varying life-satisfaction. Thus a moral system using it suffers from infinitarian paralysis. My system doesn't worry about averages, and thus does not suffer from this problem.

Replies from: conchis, tivelen

## ↑ comment by conchis · 2021-11-08T22:09:19.206Z · LW(p) · GW(p)

My point was more that, even if you can calculate the expectation, standard versions of average utilitarianism are usually rejected for non-infinitarian reasons (e.g. the repugnant conclusion) that seem like they would plausibly carry over to this proposal as well. I haven't worked through the details though, so perhaps I'm wrong.

Separately, while I understand the technical reasons for imposing boundedness on the utility function, I think you probably also need a substantive argument for why boundedness makes sense, or at least is morally acceptable. Boundedness below risks having some pretty unappealing properties, I think.

Arguments that utility functions are in fact bounded *in practice* seem highly contingent, and potentially vulnerable e.g. to the creation of utility-monsters, so I assume what you really need is an argument that some form of sigmoid transformation from an underlying real-valued welfare, u = s(w), is justified.

On the one hand, the resulting diminishing marginal utility for high-values of welfare will likely be broadly acceptable to those with prioritarian intuitions. But I don't know that I've ever seen an argument for the sort of anti-prioritarian results you get as a result of increasing marginal utility at very low levels of welfare. Not only would this imply that there's a meaningful range where it's morally required to deprioritise the welfare of the worse off, this deprioritisation is greatest for the *very worst* off. Because the sigmoid function essentially saturates at very low levels of welfare, at some point you seem to end up in a perverse version of Torture vs. dust specks where you think it's ok (or indeed required) to have 3^^^3 people (whose lives are already sufficiently *terrible*) horribly tortured for fifty years without hope or rest, to avoid someone in the middle of the welfare distribution getting a dust speck in their eye. This seems, well, problematic.
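
To make the saturation concern concrete, here is a small sketch that assumes the logistic function as the sigmoid s. That choice is only an assumption for illustration, since no particular transformation is specified here, and the function names are hypothetical.

```python
import math


def sigmoid_utility(welfare: float) -> float:
    """Logistic map from real-valued welfare w to bounded utility in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-welfare))


def marginal_utility(welfare: float) -> float:
    """Derivative s'(w) = s(w) * (1 - s(w)) of the logistic function."""
    s = sigmoid_utility(welfare)
    return s * (1.0 - s)


# Marginal utility at very low, low, middling, high, and very high welfare.
welfare_levels = [-10.0, -5.0, 0.0, 5.0, 10.0]
marginals = [marginal_utility(w) for w in welfare_levels]
```

Under this assumed s, the marginal utility peaks at w = 0 and shrinks toward zero in both tails. A fixed welfare loss therefore counts for far less when it falls on someone at w = -10 than on someone in the middle of the distribution, which is the anti-prioritarian pattern described above.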

## ↑ comment by Chantiel · 2021-11-10T22:02:08.764Z · LW(p) · GW(p)

My point was more that, even if you can calculate the expectation, standard versions of average utilitarianism are usually rejected for non-infinitarian reasons (e.g. the repugnant conclusion) that seem like they would plausibly carry over to this proposal as well.

If I understand correctly, average utilitarianism isn't rejected due to the repugnant conclusion. In fact, it's the opposite: the repugnant conclusion is a problem for total utilitarianism, and average utilitarianism is one way to avoid the problem. I'm just going off what I read on The Stanford Encyclopedia of Philosophy, but I don't have particular reason to doubt what it says.

Separately, while I understand the technical reasons for imposing boundedness on the utility function, I think you probably also need a substantive argument for why boundedness makes sense, or at least is morally acceptable. Boundedness below risks having some pretty unappealing properties, I think.

Yes, I do think boundedness is essential for a utility function. The issue with unbounded utility functions is that the expected value according to some probability distributions will be undefined. For example, if your utility follows a Cauchy distribution, then the expected utility is undefined.
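To make the Cauchy example concrete, here is a minimal sketch. It uses the closed form of the integral of x·f(x) over [-a, b] for the standard Cauchy density f(x) = 1/(π(1 + x²)), which is ln((1 + b²)/(1 + a²))/(2π). The function name `partial_expectation` and the choice of cutoffs are illustrative assumptions. The point is that the answer depends on how the two cutoffs grow, which is what "undefined" means here.

```python
import math

def partial_expectation(a: float, b: float) -> float:
    """Integral of x * f(x) over [-a, b] for the standard Cauchy density
    f(x) = 1 / (pi * (1 + x^2)), via the antiderivative ln(1 + x^2) / (2*pi)."""
    return (math.log1p(b * b) - math.log1p(a * a)) / (2 * math.pi)

for n in (10.0, 1e3, 1e6):
    symmetric = partial_expectation(n, n)     # both cutoffs grow at the same rate
    lopsided = partial_expectation(n, 2 * n)  # upper cutoff grows twice as fast
    print(f"n={n:g}: symmetric={symmetric:.4f}, lopsided={lopsided:.4f}")
```

With symmetric cutoffs the partial integral is always 0, while with the lopsided cutoffs it tends to ln(4)/(2π), roughly 0.22. Other growth rates give other limits, so there is no single value to call the expectation.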

Your actual probability distribution over utilities in an unbounded utility function wouldn't exactly follow a Cauchy distribution. However, I think that for whatever reasonable probability distribution you would use in real life, an unbounded utility function would still have an undefined expected value.

To see why, note that there is a non-zero probability that your utility really will be sampled from a Cauchy distribution. For example, suppose you're in some simulation run by aliens, and to determine your utility in your life after the simulation ends, they sample from the Cauchy distribution. (This is supposing that they're powerful enough to give you any utility). I don't have any completely conclusive evidence to rule out this possibility, so it has non-zero probability. It's not clear to me why an alien would do the above, or that they would even have the power to, but I still have no way to rule it out with infinite confidence. So your expected utility, conditioning on being in this situation, would be undefined. As a result, you can prove that your total expected utility would also be undefined.
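The last step can be spelled out as follows. The notation is an assumption made for illustration: S is the event "my utility is drawn from a Cauchy distribution", ε = P(S) > 0, and U⁺ and U⁻ are the positive and negative parts of the utility U.

```latex
\begin{aligned}
\mathbb{E}[U^{+}] &\;\ge\; \varepsilon\,\mathbb{E}[U^{+} \mid S] \;=\; \infty, \\
\mathbb{E}[U^{-}] &\;\ge\; \varepsilon\,\mathbb{E}[U^{-} \mid S] \;=\; \infty, \\
\mathbb{E}[U] &\;=\; \mathbb{E}[U^{+}] - \mathbb{E}[U^{-}] \;=\; \infty - \infty \quad \text{(undefined)}.
\end{aligned}
```

Because both parts are already infinite from the Cauchy branch alone, no choice of distribution on the remaining 1 − ε of probability can repair the total.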

So it seems to me that the only way you can actually have your expected values be robustly well-defined is by having a bounded utility function.

Because the sigmoid function essentially saturates at very low levels of welfare, at some point you seem to end up in a perverse version of Torture vs. dust specks where you think it's ok (or indeed required) to have 3^^^3 people (whose lives are already sufficiently terrible) horribly tortured for fifty years without hope or rest, to avoid someone in the middle of the welfare distribution getting a dust speck in their eye.

In principle, I do think this could occur. I agree that at first it intuitively seems undesirable. However, I'm not convinced it is, and I'm not convinced that there is a value system that avoids this without having even more undesirable results.

It's important to note that the sufficiently terrible lives need to be really, really, really bad already. So much so that being horribly tortured for fifty years does almost exactly nothing to affect their overall satisfaction. For example, maybe they're already being tortured for more than 3^^^^3 years, so adding fifty more years does almost exactly nothing to their life satisfaction.
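As a rough numerical illustration of this saturation, consider the sketch below. The functional form of `satisfaction`, the constant `SCALE_YEARS`, and the stand-in value for 3^^^^3 are all hypothetical choices made only to show the shape of the effect; nothing here is a claim about what the right satisfaction measure is.

```python
# Hypothetical bounded satisfaction curve: 1.0 with no torture, approaching
# 0.0 as the torture duration grows. Form and constants are assumptions.
SCALE_YEARS = 10.0

def satisfaction(torture_years: float) -> float:
    return 1.0 / (1.0 + torture_years / SCALE_YEARS)

def drop_from_extra_torture(baseline_years: float, extra_years: float = 50.0) -> float:
    """How much satisfaction falls when extra_years are added to the baseline."""
    return satisfaction(baseline_years) - satisfaction(baseline_years + extra_years)

# 3^^^^3 is far too large to represent, so a merely large number stands in.
STAND_IN_FOR_HUGE = 1e12

drop_fresh = drop_from_extra_torture(0.0)               # first 50 years
drop_late = drop_from_extra_torture(STAND_IN_FOR_HUGE)  # 50 more after a huge sentence
print(drop_fresh, drop_late)
```

Under these assumptions the first fifty years cost most of the available satisfaction, while fifty more years on top of the stand-in sentence change it by a vanishingly small amount.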

Maybe it still seems to you that getting tortured for 50 more years would still be worse than getting a dust speck in the eye of an average person. However, if so, consider this scenario. You know you have a 50% chance of being tortured for more than 3^^^^3 years, and a 50% chance of not being tortured and living in a regular world. However, you have a choice: you can agree to get a very minor form of discomfort, like a dust speck in your eye, in the case in which you aren't tortured, and as a result you will be tortured for 50 fewer years if you do end up in the situation in which you get tortured. So I suppose, given what you say, you would take it. But suppose you were given this opportunity again. Well, you'd again be able to subtract 50 years of torture and get just a dust speck, so I guess you'd take it.

Imagine you're allowed to repeat this process for an extremely long time. If you think that getting one dust speck is worth it to avoid 50 years of torture, then I think you would keep accepting one more dust speck until your eyes have as much dust in them as they possibly could. And then, once you're done with this, you could go on to accepting some other extremely minor form of discomfort to avoid another 50 years of torture. Maybe you start accepting an almost-exactly-imperceptible amount of back pain for another 50 years of torture reduction. And then continue this until your back, and the rest of your body parts, hurt quite a lot.

Here's the result of your deals: you have a 50% chance of being incredibly uncomfortable. Your eyes are constantly blinded and heavily irritated by dust specks, and you feel a lot of pain all over your body. And you have a 50% chance of being horribly tortured for more than 3^^^^3 years. Note that even though you get your torture sentence reduced by 50 * <number of extremely minor discomforts you get> years, the amount of time you spend tortured would decrease by only a very, very, very small, almost infinitesimal proportion.

Personally, I would much rather have a 50% chance of being able to have a life that is actually decent, even if it means that I won't get to decrease the amount of time I'd spend possibly getting tortured by a near-infinitesimal proportion.

What if you still refuse? Well, the only way I can think of justifying your refusal is by having an unbounded utility function, so getting an extra 50 years of torture is around as bad as getting the first 50 years of torture. But as I've said, the expected values of unbounded utility functions seem to be undefined in reality, so this doesn't seem like a good idea.

My point from the above is that getting one more dust speck in someone's eye could in principle be worse than having someone be tortured for 50 more years, provided the tortured person would already have been tortured for a super-ultra-virtually-infinitely long time anyways.

Replies from: conchis, conchis

## ↑ comment by conchis · 2021-11-11T06:48:54.575Z · LW(p) · GW(p)

Re boundedness:

It's important to note that the sufficiently terrible lives need to be really, really, really bad already. So much so that being horribly tortured for fifty years does almost exactly nothing to affect their overall satisfaction. For example, maybe they're already being tortured for more than 3^^^^3 years, so adding fifty more years does almost exactly nothing to their life satisfaction.

I realise now that I may have moved through a critical step of the argument quite quickly above, which may be why this quote doesn't seem to capture the core of the objection I was trying to describe. Let me take another shot.

I am very much *not* suggesting that 50 years of torture does virtually nothing to [life satisfaction - or whatever other empirical value you want to take as axiologically primitive; happy to stick with life satisfaction as a running example]. I am suggesting that 50 years of torture is *terrible* for [life satisfaction]. I am then drawing a distinction between [life-satisfaction] and the output of the utility function that you then take expectations of. The reason I am doing this, is because it seems to me that whether [life satisfaction] is bounded is a contingent empirical question, not one that can be settled by normative fiat in order to make it easier to take expectations.

If, as a matter of empirical fact, [life satisfaction] is bounded, then the objection I describe will not bite.

If, on the other hand, [life-satisfaction] is not bounded, then requiring the utility function you take expectations of to be bounded forces us to adopt some form of sigmoid mapping from [life satisfaction] to "utility", and this in turn forces us, at some margin, to not care about things that are absolutely awful (from the perspective of [life satisfaction]). (If an extra 50 years of torture isn't sufficiently awful for some reason, then we just need to pick something more awful for the purposes of the argument).

Perhaps because I didn't explain this very well the first time, what's not totally clear to me from your response, is whether you think:

(a) [life satisfaction] is in fact bounded; or

(b) even if [life satisfaction] is unbounded, it's actually ok to not care about stuff that is absolutely (infinitely?) awful from the perspective of [life-satisfaction] because it lets us take expectations more conveniently. [Intentionally provocative framing, sorry. Intended as an attempt to prompt genuine reflection, rather than to score rhetorical points.]

It's possible that (a) is true, and much of your response seems like it's probably (?) targeted at that claim, but FWIW, I don't think this case can be convincingly made by appealing to contingent personal values: e.g. suggesting that another 50 years of torture wouldn't much matter to you personally won't escape the objection, as long as there's a possible agent who would view their life-satisfaction as being materially reduced in the same circumstances.

Suggesting evolutionary bounds on satisfaction is another potential avenue of argument, but also feels too contingent to do what you really want.

Maybe you could make a case for (a) if you were to substitute a representation of individual preferences for [life satisfaction]? I'm personally disinclined towards preferences as moral primitives, particularly as they're not unique, and consequently can't deal with distributional issues, but YMMV.

ETA: An alternative (more promising?) approach could be to accept that, while it may not cover all possible choices, in practice we're more likely to face choices with an infinite extensive margin than with an infinite intensive margin, and that the proposed method could be a reasonable decision rule for such choices. Practically, this seems like it would be acceptable as long as whatever function we're using to map [life-satisfaction] into utility isn't a sigmoid over the relevant range, and instead has a (weakly) negative second derivative over the (finite) range of [life satisfaction] covered by all relevant options.

(I assume (in)ability-to-take-expectations wasn't intended as an argument for (a), as it doesn't seem up to making such an empirical case?)

On the other hand, if you're actually arguing for (b), then I guess that's a bullet you can bite; though I think I'd still be trying to dodge it if I could. ETA: If there's no alternative but to ignore infinities on either the intensive or extensive margin, I could accept choosing the intensive margin, but I'm inclined to think this choice should be explicitly justified, and recognised as tragic if it really can't be avoided.

Replies from: Chantiel

## ↑ comment by Chantiel · 2021-11-12T21:23:12.524Z · LW(p) · GW(p)

It's possible that (a) is true, and much of your response seems like it's probably (?) targeted at that claim, but FWIW, I don't think this case can be convincingly made by appealing to contingent personal values: e.g. suggesting that another 50 years of torture wouldn't much matter to you personally won't escape the objection, as long as there's a possible agent who would view their life-satisfaction as being materially reduced in the same circumstances.

To some extent, whether or not life satisfaction is bounded just comes down to how you want to measure it. But it seems to me that any reasonable measure of life satisfaction really would be bounded.

I'll clarify the measure of life satisfaction I had in mind. Imagine if you showed an agent finitely-many descriptions of situations they could end up being in, and asked the agent to pick out the worst and the best of all of them. Assign the worst scenario satisfaction 0 and the best scenario satisfaction 1. For any other outcome w, set the satisfaction to p, where p is the probability at which the agent would be indifferent between getting w for certain and a lottery that gives satisfaction 1 with probability p and satisfaction 0 with probability 1 - p. This is very much like a certain technique for constructing a utility function from elicited preferences. So, according to my definition, life satisfaction is bounded by definition.
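The elicitation procedure can be sketched in code. This is only an illustration under stated assumptions: the agent is modelled as an oracle `prefers_lottery(p)` that answers whether it would take the lottery over the sure outcome, its answers are assumed to be monotone in p, and the names `elicit_satisfaction`, `make_oracle`, and the `utilities` table are hypothetical.

```python
from typing import Callable

def elicit_satisfaction(prefers_lottery: Callable[[float], bool],
                        tolerance: float = 1e-9) -> float:
    """Binary-search for the indifference probability p.

    prefers_lottery(p) returns True when the agent would rather take the
    lottery (best outcome with probability p, worst with 1 - p) than receive
    the outcome being scored for certain.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if prefers_lottery(mid):
            hi = mid  # lottery already attractive, so indifference is lower
        else:
            lo = mid  # lottery not yet attractive, so indifference is higher
    return (lo + hi) / 2

# Hypothetical agent whose private utilities are on an arbitrary scale.
utilities = {"worst": -40.0, "ok": 10.0, "best": 60.0}

def make_oracle(outcome: str) -> Callable[[float], bool]:
    u_worst, u_best = utilities["worst"], utilities["best"]
    return lambda p: p * u_best + (1 - p) * u_worst > utilities[outcome]

satisfaction_ok = elicit_satisfaction(make_oracle("ok"))
print(round(satisfaction_ok, 6))
```

Whatever scale the agent's private utilities are on, the elicited satisfaction always lands in [0, 1], which is the sense in which the measure is bounded by construction.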

(You can also take the limit of the agent's preferences as the number of described situations approaches infinity, if you want and if it converges. If it doesn't, then you could instead just ask the agent about its preferences with infinitely-many scenarios and require the infimum of satisfactions to be 0 and the supremum to be 1. Also you might need to do something special to deal with agents with preferences that are inconsistent even given infinite reflection, but I don't think this is particularly relevant to the discussion.)

Now, maybe you're opposed to this measure. However, if you reject it, I think you have a pretty big problem you need to deal with: utility monsters.

To quote Wikipedia:

A hypothetical being, which Nozick calls the utility monster, receives much more utility from each unit of a resource they consume than anyone else does. For instance, eating a cookie might bring only one unit of pleasure to an ordinary person but could bring 100 units of pleasure to a utility monster. If the utility monster can get so much pleasure from each unit of resources, it follows from utilitarianism that the distribution of resources should acknowledge this. If the utility monster existed, it would justify the mistreatment and perhaps annihilation of everyone else, according to the mandates of utilitarianism, because, for the utility monster, the pleasure they receive outweighs the suffering they may cause.

If you have some agents with unbounded measures of satisfaction, then I think that would imply you would need to be willing to cause arbitrarily large amounts of suffering to agents with bounded satisfaction in order to increase the satisfaction of a utility monster as much as possible.

This seems pretty horrible to me, so I'm satisfied with keeping the measure of life satisfaction to be bounded.

In principle, you could have utility monster-like creatures in my ethical system, too. Perhaps all the agents other than the monster really have very little in the way of preferences, and so their life satisfaction doesn't change much at all by you helping them. Then you could potentially give resources to the monster. However, the effect of "utility monsters" is much more limited in my ethical system, and it's an effect that doesn't seem intuitively undesirable to me. Unlike if you had an unbounded satisfaction measure, my ethical system doesn't allow a single agent to cause arbitrarily large amounts of suffering to arbitrarily large numbers of other agents.

Further, suppose you do decide to have an unbounded measure of life satisfaction and aggregate it to allow even a finite universe to have arbitrarily high or low moral value. Then the expected moral value of the world would be undefined, just like how the expected values of unbounded utility functions are undefined. Specifically, just consider having a Cauchy distribution over the moral value of the universe. Such a distribution has no expected value. So, if you're trying to maximize the expected moral value of the universe, you won't be able to. And, as a moral agent, what else are you supposed to do?

Also, I want to mention that there's a trivial case in which you could avoid having my ethical system torture the agent for 50 years. Specifically, maybe there's some certain 50 years that decreases the agent's life satisfaction a lot, even though the other 50 years don't. For example, maybe the agent dreads the idea of having more than a million years of torture, so specifically adding those last 50 years would be a problem. But I'm guessing you aren't worrying about this specific case.

Replies from: conchis

## ↑ comment by conchis · 2021-11-15T01:06:02.332Z · LW(p) · GW(p)

I'll clarify the measure of life satisfaction I had in mind. Imagine if you showed an agent finitely-many descriptions of situations they could end up being in, and asked the agent to pick out the worst and the best of all of them. Assign the worst scenario satisfaction 0 and the best scenario satisfaction 1.

Thanks. I've toyed with similar ideas previously myself. The advantage, if this sort of thing works, is that it conveniently avoids a major issue with preference-based measures: that they're not unique and therefore incomparable across individuals. However, this method seems fragile in relying on a finite number of scenarios: doesn't it break if it's possible to imagine something worse than whatever the currently worst scenario is? (E.g. just keep adding 50 more years of torture.) While this might be a reasonable approximation in some circumstances, it doesn't seem like a fully coherent solution to me.

This seems pretty horrible to me, so I'm satisfied with keeping the measure of life satisfaction to be bounded.

IMO, the problem highlighted by the utility monster objection is fundamentally a prioritarian one. A transformation that guarantees boundedness *above* seems capable of resolving this, without requiring boundedness *below* (and thus avoiding the problematic consequences that boundedness below introduces).

Further, suppose you do decide to have an unbounded measure of life satisfaction

Given issues with the methodology proposed above for constructing bounded satisfaction functions, it's still not entirely clear to me that this is really a *decision*, as opposed to an *empirical* question (which we then need to decide how to cope with from a normative perspective). This seems like it may be a key difference in our perspectives here.

So, if you're trying to maximize the expected moral value of the universe, you won't be able to. And, as a moral agent, what else are you supposed to do?

Well, in general terms the answer to this question has to be either (a) bite a bullet, or (b) find another solution that avoids the uncomfortable trade-offs. It seems to me that you'll be willing to bite most bullets here. (Though I confess it's actually a little hard for me to tell whether you're also denying that there's any meaningful tradeoff here; that case still strikes me as less plausible.) If so, that's fine, but I hope you'll understand why to some of us that might feel less like a solution to the issue of infinities, than a decision to just not worry about them on a particular dimension. Perhaps that's ultimately necessary, but it's definitely non-ideal from my perspective.

A final random thought/question: I get that we can't expected utility maximise unless we can take finite expectations, but does this actually prevent us having a consistent preference ordering over universes, or is it potentially just a representation issue? I would have guessed that the vNM axiom we're violating here is continuity, which I tend to think of as a convenience assumption rather than an actual rationality requirement. (E.g. there's not really anything substantively crazy about lexicographic preferences as far as I can tell, they're just mathematically inconvenient to represent with real numbers.) Conflating a lack of real-valued representations with lack of consistent preference orderings is a fairly common mistake in this space. That said, if it really were just a representation issue, I would have expected someone smarter than me to have noticed by now, so (in lieu of actually checking) I'm assigning that low probability for now.

Replies from: Chantiel, Chantiel

## ↑ comment by Chantiel · 2021-11-16T00:41:59.774Z · LW(p) · GW(p)

Also, in addition to my previous response, I want to note that the issues with unbounded satisfaction measures are not unique to my infinite ethical system. Instead, they are common potential problems with a wide variety of aggregate consequentialist theories.

For example, suppose you're a classical utilitarian with an unbounded utility measure per person. And suppose you know that the universe is finite and will consist of a single inhabitant whose utility follows a Cauchy distribution. Then your expected utilities are undefined, despite the universe being knowably finite.

Similarly, imagine if you again used classical utilitarianism but instead you have a finite universe with one utility monster and 3^^^3 regular people. Then, if your expected utilities are defined, you would need to give the utility monster what it wants, at the expense of everyone else.

So, I don't think your concern about keeping utility functions bounded is unwarranted; I'm just noting that these issues are part of a broader problem with aggregate consequentialism, not just with my ethical system.

Replies from: conchis

## ↑ comment by Chantiel · 2021-11-15T23:12:15.715Z · LW(p) · GW(p)

Thanks. I've toyed with similar ideas previously myself. The advantage, if this sort of thing works, is that it conveniently avoids a major issue with preference-based measures: that they're not unique and therefore incomparable across individuals. However, this method seems fragile in relying on a finite number of scenarios: doesn't it break if it's possible to imagine something worse than whatever the currently worst scenario is? (E.g. just keep adding 50 more years of torture.) While this might be a reasonable approximation in some circumstances, it doesn't seem like a fully coherent solution to me.

As I said, you can allow for infinitely-many scenarios if you want; you just need to make it so the supremum of their values is 1 and the infimum is 0. That is, imagine there's an infinite sequence of scenarios you can come up with, each of which is worse than the last. Then just require that the infimum of the satisfactions in that sequence is 0. That way, as you consider worse and worse scenarios, the satisfaction continues to decrease, but never gets below 0.

IMO, the problem highlighted by the utility monster objection is fundamentally a prioritarian one. A transformation that guarantees boundedness above seems capable of resolving this, without requiring boundedness below (and thus avoiding the problematic consequences that boundedness below introduces).

One issue with only having boundedness above is that the life satisfaction of an arbitrary agent would probably often be undefined or −∞ in expectation. For example, consider an agent with a probability distribution like a Cauchy distribution, except that it assigns probability 0 to anything above the maximum level of satisfaction, and is then renormalized to have probabilities sum to 1. If I'm doing my calculus right, the resulting probability distribution's expected value doesn't converge. You could either interpret this as the expected utility being undefined or being −∞, since the Riemann sum approaches −∞ as the width of the columns approaches zero.
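The calculus can be checked directly. The notation below is an assumption for illustration: f is the standard Cauchy density, M is the maximum satisfaction level, F(M) is the Cauchy cumulative distribution at M (the renormalizing constant), and a is a lower cutoff that is sent to infinity.

```latex
\begin{aligned}
f(x) &= \frac{1}{\pi\,(1+x^{2})}, \qquad
F(M) = \frac{1}{2} + \frac{\arctan M}{\pi}, \\[4pt]
\int_{-a}^{M} x\,\frac{f(x)}{F(M)}\,dx
  &= \frac{\ln\!\left(1+M^{2}\right) - \ln\!\left(1+a^{2}\right)}{2\pi\,F(M)}
  \;\xrightarrow[\;a\to\infty\;]{}\; -\infty .
\end{aligned}
```

The positive part of the integral is finite because of the cap at M, while the negative part grows like ln(a) without bound, so the expectation diverges to −∞ rather than being of the ∞ − ∞ form.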

That said, even if the expectations are defined, it doesn't seem to me that keeping the satisfaction measure bounded above but not below would solve the problem of utility monsters. To see why, imagine a new utility monster as follows. The utility monster feels an incredibly strong need to have everyone on Earth be tortured. For the next hundred years, its satisfaction will decrease by 3^^^3 for every second there's someone on Earth not being tortured. Thus, assuming the expectations converge, the moral thing to do, according to maximizing average, total, or expected-value-conditioning-on-being-in-this-universe life satisfaction, is to torture everyone. This is a problem both in finite and infinite cases.

A final random thought/question: I get that we can't expected utility maximise unless we can take finite expectations, but does this actually prevent us having a consistent preference ordering over universes, or is it potentially just a representation issue?

If I understand what you're asking correctly, you can indeed have consistent preferences over universes, even if you don't have a bounded utility function. The issue is, in order to act, you need more than just a consistent preference order over possible universes. In reality, you only get to choose between probability distributions over possible worlds, not specific possible worlds. And this, with an unbounded utility function, will tend to result in undefined expected utilities over possible actions and thus is not informative of what action you should take. Which is the whole point of utility theory and ethics.

Now, according to some probability distributions, you can have well-defined expected values even with an unbounded utility function. But, as I said, this is not robust, and I think that in practice expected values of an unbounded utility function would be undefined.

Replies from: conchis

## ↑ comment by conchis · 2021-11-16T05:52:39.953Z · LW(p) · GW(p)

you just need to make it so the supremum of their values is 1 and the infimum is 0.

Fair. Intuitively though, this feels more like a rescaling of an underlying satisfaction measure than a plausible definition of satisfaction to me. That said, if you're a preferentist, I accept this is internally consistent, and likely an improvement on alternative versions of preferentism.

One issue with only having boundedness above is that the life satisfaction of an arbitrary agent would probably often be undefined or −∞ in expectation

Yes, and I am obviously not proposing a solution to this problem! More just suggesting that, *if there are infinities in the problem that appear to correspond to actual things we care about*, then defining them out of existence seems more like deprioritising the problem than solving it.

The utility monster feels an incredibly strong need to have everyone on Earth be tortured

I think this framing muddies the intuition pump by introducing sadistic preferences, rather than focusing just on unboundedness below. I don't think it's necessary to do this: unboundedness below means there's a sense in which everyone is a potential "negative utility monster" if you torture them long enough. I think the core issue here is whether there's some point at which we just stop caring, or whether that's morally repugnant.

in order to act, you need more than just a consistent preference order over possible universes. In reality, you only get to choose between probability distributions over possible worlds, not specific possible worlds

Sorry, sloppy wording on my part. The question should have been "does this actually prevent us having a consistent preference ordering over gambles over universes" (even if we are not able to represent those preferences as maximising the expectation of a real-valued social welfare function)? We know (from lexicographic preferences) that "no-real-valued-utility-function-we-are-maximising-expectations-of" does not immediately imply "no-consistent-preference-ordering" (if we're willing to accept orderings that violate continuity). So pointing to undefined expectations doesn't seem to immediately rule out consistent choice.

Replies from: Chantiel, Chantiel

## ↑ comment by Chantiel · 2021-11-19T19:50:59.616Z · LW(p) · GW(p)

I think this framing muddies the intuition pump by introducing sadistic preferences, rather than focusing just on unboundedness below. I don't think it's necessary to do this: unboundedness below means there's a sense in which everyone is a potential "negative utility monster" if you torture them long enough. I think the core issue here is whether there's some point at which we just stop caring, or whether that's morally repugnant.

Fair enough. So I'll provide a non-sadistic scenario. Consider again the scenario I previously described in which you have a 0.5 chance of being tortured for 3^^^^3 years, but also have the repeated opportunity to cause yourself minor discomfort in the case of not being tortured and as a result get your possible torture sentence reduced by 50 years.

If you have an unbounded-below utility function in which each 50 years causes a linear decrease in satisfaction or utility, then to maximize expected utility or life satisfaction, it seems you would need to opt for living in extreme discomfort in the non-torture scenario to decrease your possible torture time by an astronomically small proportion, provided the expectations are defined.
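A small arithmetic sketch of this, under loudly hypothetical numbers: `TORTURE_YEARS` is a stand-in for 3^^^^3 (which cannot be represented), utility is measured so that one year of torture counts as -1, and `DISCOMFORT_COST` is an assumed utility cost of one minor discomfort on that same scale.

```python
TORTURE_YEARS = 1e12      # stand-in for 3^^^^3, which is unrepresentably larger
YEARS_PER_DEAL = 50.0     # torture years removed by each accepted deal
DISCOMFORT_COST = 0.001   # assumed cost of one minor discomfort (1 torture-year = 1)

def expected_gain_per_deal() -> float:
    """Expected-utility change from one deal under a linear utility:
    0.5 chance of saving 50 torture-years, 0.5 chance of paying the discomfort."""
    return 0.5 * YEARS_PER_DEAL - 0.5 * DISCOMFORT_COST

deals = 1_000_000
total_expected_gain = deals * expected_gain_per_deal()
fraction_of_sentence_removed = deals * YEARS_PER_DEAL / TORTURE_YEARS

print(total_expected_gain, fraction_of_sentence_removed)
```

Because the utility is linear, the gain per deal is the same positive number no matter how many deals have already been accepted, so an expected-utility maximizer keeps accepting, even though the fraction of the sentence removed stays tiny (and would be far tinier with the real 3^^^^3).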

To me, at least, it seems clear that you should not take the opportunities to reduce your torture sentence. After all, if you repeatedly decide to take them, you will end up with a 0.5 chance of being highly uncomfortable and a 0.5 chance of being tortured for 3^^^^3 years. This seems like a really bad lottery, and worse than the one that lets me have a 0.5 chance of having an okay life.

Sorry, sloppy wording on my part. The question should have been "does this actually prevent us having a consistent preference ordering over gambles over universes" (even if we are not able to represent those preferences as maximising the expectation of a real-valued social welfare function)? We know (from lexicographic preferences) that "no-real-valued-utility-function-we-are-maximising-expectations-of" does not immediately imply "no-consistent-preference-ordering" (if we're willing to accept orderings that violate continuity). So pointing to undefined expectations doesn't seem to immediately rule out consistent choice.

Oh, I see. And yes, you can have consistent preference orderings that aren't represented as a utility function. And such techniques have been proposed before in infinite ethics. For example, one of Bostrom's proposals to deal with infinite ethics is the extended decision rule. Essentially, it says to first look at the set of actions you could take that would maximize P(infinite good) - P(infinite bad). If there is only one such action, take it. Otherwise, take whatever action among these that has highest expected moral value given a finite universe.

As far as I know, you can't represent the above as a utility function, despite it being consistent.

However, the big problem with the above decision rule is that it suffers from the fanaticism problem: people would be willing to bear any finite cost, even 3^^^3 years of torture, to have even an unfathomably small chance of increasing the probability of infinite good or decreasing the probability of infinite bad. And this can get to pretty ridiculous levels. For example, suppose you are sure you can easily design a world that makes every creature happy and greatly increases the moral value of the world in a finite universe if implemented. However, you know that coming up with such a design would take one second of computation on your supercomputer, which means one less second to keep thinking about astronomically-improbable situations in which you could cause infinite good. Spending that second would thus carry some minuscule chance of missing out on infinite good or causing infinite bad. So you decide to not help anyone, because you won't spare the 1 second of computer time.
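The extended decision rule, as paraphrased above, and its fanaticism can be sketched as a lexicographic comparison. The `Action` type, its field names, and the numbers are hypothetical and chosen only for illustration; this is a reading of the rule as described in this comment, not Bostrom's own formalization.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Action:
    name: str
    p_infinite_good: float
    p_infinite_bad: float
    finite_expected_value: float  # expected moral value given a finite universe

def extended_decision_rule(actions: Sequence[Action]) -> Action:
    """Maximize P(infinite good) - P(infinite bad) first; use the finite
    expected value only to break exact ties (tuples compare lexicographically)."""
    return max(
        actions,
        key=lambda a: (a.p_infinite_good - a.p_infinite_bad, a.finite_expected_value),
    )

options = [
    Action("help_everyone", 0.0, 0.0, 1e9),
    Action("keep_thinking", 1e-30, 0.0, 0.0),
]
chosen = extended_decision_rule(options)
print(chosen.name)
```

With these made-up numbers the rule picks `keep_thinking`: any positive edge in the first component, however small, outranks any finite benefit in the second.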

More generally, I think the basic property of non-real-valued consistent preference orderings is that they value some things "infinitely more" than others. The issue is, if you really value some property infinitely more than some other property of lesser importance, it won't be worth your time to even consider pursuing the property of lesser importance, because it's always possible you could have used the extra computation to slightly increase your chances of getting the property of greater importance.

Replies from: conchis

## ↑ comment by conchis · 2021-11-22T11:26:00.121Z · LW(p) · GW(p)

To me, at least, it seems clear that you should not take the opportunities to reduce your torture sentence. After all, if you repeatedly decide to take them, you will end up with a 0.5 chance of being highly uncomfortable and a 0.5 chance of being tortured for 3^^^^3 years. This seems like a really bad lottery, and worse than the one that lets me have a 0.5 chance of having an okay life.

FWIW, this conclusion is not clear to me. To return to one of my original points: I don't think you can dodge this objection by arguing from potentially idiosyncratic preferences, even *perfectly reasonable* ones; rather, you need it to be the case that *no rational agent* could have different preferences. Either that, or you need to be willing to override otherwise rational individual preferences when making interpersonal tradeoffs.

To be honest, I'm actually not entirely averse to the latter option: having interpersonal trade-offs determined by contingent individual risk-preferences has never seemed especially well-justified to me (particularly if probability is in the mind [LW · GW]). But I confess it's not clear whether that route is open to you, given the motivation for your system as a whole.

More generally, I think the basic property of non-real-valued consistent preference orderings is that they value some things "infinitely more" than others.

That makes sense, thanks.

## ↑ comment by Chantiel · 2021-11-22T23:21:20.743Z · LW(p) · GW(p)

FWIW, this conclusion is not clear to me. To return to one of my original points: I don't think you can dodge this objection by arguing from potentially idiosyncratic preferences, even perfectly reasonable ones; rather, you need it to be the case that no rational agent could have different preferences. Either that, or you need to be willing to override otherwise rational individual preferences when making interpersonal tradeoffs.

Yes, that's correct. It's possible that there are some agents with consistent preferences that really would wish to get extraordinarily uncomfortable to avoid the torture. My point was just that this doesn't seem like it would be a common thing for agents to want.

Still, it is conceivable that there are at least a few agents out there that would consistently want to opt for the 0.5 chance of being extremely uncomfortable option, and I do suppose it would be best to respect their wishes. This is a problem that I hadn't previously fully appreciated, so I would like to thank you for bringing it up.

Luckily, I think I've finally figured out a way to adapt my ethical system to deal with this. That is, the adaptation will allow agents to choose the extreme-discomfort-from-dust-specks option if that is what they wish, so that my ethical system respects their preferences. To do this, allow the measure of satisfaction to include infinitesimals. Then, to respect the preferences of such agents, you just need to pick the right satisfaction measure.

Consider an agent for which each 50 years of torture causes a linear decrease in their utility function. For simplicity, imagine torture and discomfort are the only things the agent cares about; they have no other preferences. Also assume that the agent dislikes torture more than it dislikes discomfort, but only by a finite amount. Since the agent's utility function/satisfaction measure is linear, I suppose being tortured for an eternity would be infinitely worse for the agent than being tortured for a finite amount of time. So, assign satisfaction 0 to the scenario in which the agent is tortured for eternity. And if the agent is instead tortured for $n$ years, let the agent's satisfaction be $1 - n\epsilon$, where $\epsilon$ is whatever infinitesimal number you want. If my understanding of infinitesimals is correct, I think this will do what we want it to do in terms of having agents using my ethical system respect the agent's preferences.

Specifically, since being tortured forever would be infinitely worse than being tortured for a finite amount of time, any finite amount of torture would be accepted to decrease the chance of infinite torture. And this is what maximizing this satisfaction measure does: for any lottery, changing the chance of infinite torture has a finite effect on expected satisfaction, whereas changing the chance of finite torture only has an infinitesimal effect, so avoiding infinite torture would be prioritized.

Further, among lotteries involving finite amounts of torture, it seems the ethical system using this satisfaction measure continues to do what it's supposed to do. For example, consider the choice between the previous two options:

- A 0.5 chance of being tortured for 3^^^^3 years and a 0.5 chance of being fine.
- A 0.5 chance of 3^^^^3 - 9999999 years of torture and 0.5 chance of being extraordinarily uncomfortable.

If I'm using my infinitesimal math right, the expected satisfaction of taking option 1 would be $0.5 \cdot (1 - (3\uparrow\uparrow\uparrow\uparrow 3)\epsilon) + 0.5 \cdot 1$, and the expected satisfaction of taking option 2 would be $0.5 \cdot (1 - (3\uparrow\uparrow\uparrow\uparrow 3 - 9999999)\epsilon) + 0.5 \cdot (1 - m\epsilon)$, for some finite $m$, where $\epsilon$ is the infinitesimal used in the satisfaction measure. Thus, to maximize this agent's satisfaction measure, my moral system would indeed let the agent give infinite priority to avoiding infinite torture, the ethical system would itself consider the agent getting infinite torture infinitely worse than getting finite torture, and it would treat finite amounts of torture as decreasing satisfaction in a linear manner. And, since the utility measure is still technically bounded, it would still avoid the problem with utility monsters.

(In case it was unclear, $\uparrow$ is Knuth's up-arrow notation, just like "^".)
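To make the infinitesimal bookkeeping concrete, here is a minimal Python sketch. It assumes eternal torture gets satisfaction 0 and n years of finite torture gets 1 - n*eps, and it stores a value a + b*eps as the pair (a, b). The stand-in `N` for 3^^^^3 and the discomfort cost `M` are hypothetical numbers chosen only for illustration, as are all function names.

```python
from fractions import Fraction

# A satisfaction value a + b*eps is stored as the pair (a, b), where eps is a
# fixed positive infinitesimal. Python compares tuples lexicographically, which
# matches the ordering of such numbers: the real part dominates, and the
# infinitesimal part only breaks ties.

ETERNAL_TORTURE = (Fraction(0), Fraction(0))


def finite_torture(years):
    """Satisfaction 1 - years*eps for a finite number of torture-year equivalents."""
    return (Fraction(1), Fraction(-years))


def expected(lottery):
    """Expected satisfaction of a list of (probability, satisfaction) pairs."""
    real = sum(p * s[0] for p, s in lottery)
    infinitesimal = sum(p * s[1] for p, s in lottery)
    return (real, infinitesimal)


HALF = Fraction(1, 2)
N = 10**30           # hypothetical stand-in for 3^^^^3, which is far too large to compute
REDUCTION = 9999999  # years of torture removed in option 2
M = 1000             # hypothetical: extreme discomfort counted as 1000 torture-years

option_1 = expected([(HALF, finite_torture(N)), (HALF, finite_torture(0))])
option_2 = expected([(HALF, finite_torture(N - REDUCTION)), (HALF, finite_torture(M))])

# A one-in-a-billion chance of eternal torture, otherwise no torture at all.
tiny_risk = expected([
    (Fraction(1, 10**9), ETERNAL_TORTURE),
    (1 - Fraction(1, 10**9), finite_torture(0)),
])

print(option_2 > option_1)   # the discomfort trade is accepted when M < REDUCTION
print(option_1 > tiny_risk)  # any finite torture beats a small chance of eternal torture
```

Both real parts of the two options equal 1, so the comparison between them is decided entirely by the infinitesimal coefficients, while any change in the chance of eternal torture shows up in the real part and therefore dominates.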

## ↑ comment by Chantiel · 2021-11-19T19:51:43.059Z · LW(p) · GW(p)

I think this framing muddies the intuition pump by introducing sadistic preferences, rather than focusing just on unboundedness below. I don't think it's necessary to do this: unboundedness below means there's a sense in which everyone is a potential "negative utility monster" if you torture them long enough. I think the core issue here is whether there's some point at which we just stop caring, or whether that's morally repugnant.

Fair enough. So I'll provide a non-sadistic scenario. Consider again the scenario I previously described in which you have a 0.5 chance of being tortured for 3^^^^3 years, but also have the repeated opportunity to cause yourself minor discomfort in the case of not being tortured and as a result get your possible torture sentence reduced by 50 years.

If you have an unbounded below utility function in which each 50 years causes a linear decrease in satisfaction or utility, then to maximize expected utility or life satisfaction, it seems you would need to opt for living in extreme discomfort in the non-torture scenario to decrease your possible torture time by an astronomically small proportion, provided the expectations are defined.

To me, at least, it seems clear that you should not take the opportunities to reduce your torture sentence. After all, if you repeatedly decide to take them, you will end up with a 0.5 chance of being highly uncomfortable and a 0.5 chance of being tortured for 3^^^^3 years. This seems like a really bad lottery, and worse than the one that lets me have a 0.5 chance of having an okay life.

Sorry, sloppy wording on my part. The question should have been "does this actually prevent us having a consistent preference ordering over gambles over universes" (even if we are not able to represent those preferences as maximising the expectation of a real-valued social welfare function)? We know (from lexicographic preferences) that "no-real-valued-utility-function-we-are-maximising-expectations-of" does not immediately imply "no-consistent-preference-ordering" (if we're willing to accept orderings that violate continuity). So pointing to undefined expectations doesn't seem to immediately rule out consistent choice.

Oh, I see. And yes, you can have consistent preference orderings that aren't represented as a utility function. And such techniques have been proposed before in infinite ethics. For example, one of Bostrom's proposals to deal with infinite ethics is the extended decision rule. Essentially, it says to first look at the set of actions you could take that would maximize P(infinite good) - P(infinite bad). If there is only one such action, take it. Otherwise, take whatever action among these that has highest expected moral value given a finite universe.

As far as I know, you can't represent the above as a utility function, despite it being consistent.

However, the big problem with the above decision rule is that it suffers from the fanaticism problem: people would be willing to bear any finite cost, even 3^^^3 years of torture, to have even an unfathomably small chance of increasing the probability of infinite good or decreasing the probability of infinite bad. And this can get to pretty ridiculous levels. For example, suppose you are sure you can easily design a world that makes every creature happy and greatly increases the moral value of the world in a finite universe if implemented. However, you know that coming up with such a design would take one second of computation on your supercomputer, which means one less second to keep thinking about astronomically-improbable situations in which you could cause infinite good. Thus, spending that second would carry some minuscule chance of forgoing infinite good or causing infinite bad. So you decide to not help anyone, because you won't spare the one second of computer time.
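The extended decision rule and its fanaticism can be sketched as a lexicographic maximization. This is only an illustrative sketch: the `Action` class, its field names, and the numbers are hypothetical and do not come from Bostrom's paper.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Action:
    name: str
    p_infinite_good: Fraction  # probability the action leads to infinite good
    p_infinite_bad: Fraction   # probability the action leads to infinite bad
    finite_value: Fraction     # expected moral value given a finite universe


def extended_decision_rule(actions):
    """Maximize P(infinite good) - P(infinite bad) first; use the finite
    expected value only to break exact ties."""
    def key(action):
        return (action.p_infinite_good - action.p_infinite_bad, action.finite_value)
    return max(actions, key=key)


# Fanaticism: a huge finite benefit loses to a 1-in-10^30 shot at infinite good.
help_everyone = Action("help everyone", Fraction(0), Fraction(0), Fraction(10**9))
keep_pondering = Action("keep pondering", Fraction(1, 10**30), Fraction(0), Fraction(0))
print(extended_decision_rule([help_everyone, keep_pondering]).name)

# The finite value only matters when the infinite terms tie exactly.
do_nothing = Action("do nothing", Fraction(0), Fraction(0), Fraction(0))
print(extended_decision_rule([help_everyone, do_nothing]).name)
```

Because the key is a tuple, no finite value in the second slot can ever outweigh a difference in the first slot, however small. That is exactly the "infinitely more" valuation discussed here.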

More generally, I think the basic property of non-real-valued consistent preference orderings is that they value some things "infinitely more" than others. The issue is, if you really value some property infinitely more than some other property of lesser importance, it won't be worth your time to even consider pursuing the property of lesser importance, because it's always possible you could have used the extra computation to slightly increase your chances of getting the property of greater importance.

## ↑ comment by conchis · 2021-11-11T06:41:43.891Z · LW(p) · GW(p)

Re the repugnant conclusion: apologies for the lazy/incorrect example. Let me try again with better illustrations of the same underlying point. To be clear, I am not suggesting these are knock-down arguments; just that, given widespread (non-infinitarian) rejection of average utilitarianisms, you probably want to think through whether your view suffers from the same issues and whether you are ok with that.

Though there's a huge literature on all of this, a decent starting point is here:

However, the average view has very little support among moral philosophers since it suffers from severe problems.

First, consider a world inhabited by a single person enduring excruciating suffering. The average view entails that we could improve this world by creating a million new people whose lives were also filled with excruciating suffering if the suffering of the new people was ever-so-slightly less bad than the suffering of the original person.

Second, the average view entails the sadistic conclusion: It can sometimes be better to create lives with negative wellbeing than to create lives with positive wellbeing from the same starting point, all else equal. Adding a small number of tortured, miserable people to a population diminishes the average wellbeing less than adding a sufficiently large number of people whose lives are pretty good, yet below the existing average...

Third, the average view prefers arbitrarily small populations over very large populations, as long as the average wellbeing was higher. For example, a world with a single, extremely happy individual would be favored to a world with ten billion people, all of whom are extremely happy but just ever-so-slightly less happy than that single person.
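The three verdicts quoted above can be checked with toy numbers. The following Python sketch is only an illustration: the welfare values and population sizes are made up, and a population is represented as a list of hypothetical (welfare, count) groups.

```python
from fractions import Fraction


def average_welfare(groups):
    """Average welfare of a population given as (welfare, count) groups."""
    total = sum(welfare * count for welfare, count in groups)
    population = sum(count for _, count in groups)
    return Fraction(total, population)


# 1. Adding a million slightly-less-miserable people "improves" a miserable world.
lone_sufferer = [(-100, 1)]
more_sufferers = [(-100, 1), (-99, 10**6)]
print(average_welfare(more_sufferers) > average_welfare(lone_sufferer))

# 2. Sadistic conclusion: one tortured life lowers the average less than
#    many decent lives that sit below the existing average.
start = [(10, 10)]
add_tortured = start + [(-10, 1)]
add_decent = start + [(5, 100)]
print(average_welfare(add_tortured) > average_welfare(add_decent))

# 3. One extremely happy person beats ten billion almost-as-happy people.
single_person = [(100, 1)]
ten_billion = [(99, 10**10)]
print(average_welfare(single_person) > average_welfare(ten_billion))
```

With these numbers all three comparisons come out in the direction the quoted passage describes, which is what makes the finite-universe average view look unattractive.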

## ↑ comment by Chantiel · 2021-11-12T20:09:53.336Z · LW(p) · GW(p)

Thanks for the response.

Third, the average view prefers arbitrarily small populations over very large populations, as long as the average wellbeing was higher. For example, a world with a single, extremely happy individual would be favored to a world with ten billion people, all of whom are extremely happy but just ever-so-slightly less happy than that single person.

In an infinite universe, there's already infinitely-many people, so I don't think this applies to my infinite ethical system.

First, consider a world inhabited by a single person enduring excruciating suffering. The average view entails that we could improve this world by creating a million new people whose lives were also filled with excruciating suffering if the suffering of the new people was ever-so-slightly less bad than the suffering of the original person.

Second, the average view entails the sadistic conclusion: It can sometimes be better to create lives with negative wellbeing than to create lives with positive wellbeing from the same starting point, all else equal.

In a finite universe, I can see why those verdicts would be undesirable. But in an infinite universe, there's already infinitely-many people at all levels of suffering. So, according to my own moral intuition at least, it doesn't seem that these are bad verdicts.

You might have differing moral intuitions, and that's fine. If you do have an issue with this, you could potentially modify my ethical system to make it an analogue of total utilitarianism. Specifically, consider the probability distribution something would have if it conditions on it ending up somewhere in this universe, but doesn't even know if it will be an actual agent with preferences or not. That is, it uses some prior that allows for the possibility of ending up as a preference-free rock or something. Also, make sure the measure of life satisfaction treats existences with neutral welfare and the existences of things without preferences as zero. Now, simply modify my system to maximize the expected value of life satisfaction, given this prior. That's my total-utilitarianism-infinite-analog ethical system.

So, to give an example of how this works, consider the situation in which you can torture one person to avoid creating a large number of people with pretty decent lives. Well, the large number of people with pretty decent lives would increase the moral value of the world, because creating those people makes it more likely under this prior that something would end up as an agent with positive life satisfaction rather than some inanimate object, conditioning only on being something in this universe. But adding a tortured creature would only decrease the moral value of the universe. Thus, this total-utilitarian-infinite-analogue ethical system would prefer to create the large number of people with decent lives rather than torture one creature.

Of course, if you accept this system, then you have to find a way to deal with the repugnant conclusion, just like you need to find a way to deal with it using regular total utilitarianism in a finite universe. I've yet to see any satisfactory solution to the repugnant conclusion. But if there is one, I bet you could extend it to this total-utilitarian-infinite-analogue ethical system. This is because this ethical system is a lot like regular total utilitarianism, except it replaces "total number of creatures with satisfaction x" with "total probability mass of ending up as a creature with satisfaction x".
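As a rough illustration of the total-utilitarian analogue, here is a toy finite version in Python. It assumes a uniform prior over a fixed number of "things", each of which is either an agent or an inanimate object with satisfaction 0. The function name and all the numbers are hypothetical.

```python
from fractions import Fraction


def moral_value(agent_satisfactions, total_things):
    """Expected satisfaction of a uniformly random 'thing' in a toy universe.

    Things that are not agents (rocks and so on) contribute zero, so the
    result is the total satisfaction divided by a fixed number of things.
    """
    if len(agent_satisfactions) > total_things:
        raise ValueError("more agents than things in the universe")
    return Fraction(sum(agent_satisfactions), total_things)


TOTAL_THINGS = 100      # hypothetical size of the toy universe
baseline = [5, 5]       # two agents; the other 98 things are inanimate

# Turning 50 inanimate things into people with decent lives raises the value.
with_decent_lives = baseline + [1] * 50
# Adding one tortured creature lowers it.
with_tortured = baseline + [-10]

print(moral_value(baseline, TOTAL_THINGS))
print(moral_value(with_decent_lives, TOTAL_THINGS))
print(moral_value(with_tortured, TOTAL_THINGS))
```

Because the number of things is held fixed, this value is just total satisfaction rescaled by a constant, so in a finite toy world it ranks outcomes the same way ordinary total utilitarianism does.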

Given the lack of a satisfactory solution to the repugnant conclusion, I prefer the idea of just sticking with my average-utilitarianism-like infinite ethical system. But I can see why you might have different preferences.

## ↑ comment by conchis · 2021-11-15T01:07:28.044Z · LW(p) · GW(p)

In an infinite universe, there's already infinitely-many people, so I don't think this applies to my infinite ethical system.

YMMV, but FWIW allowing a system of infinite ethics to get finite questions (which should just be a special case) wrong seems a very non-ideal property to me, and suggests something has gone wrong somewhere. Is it really never possible to reach a state where all remaining choices have only finite implications?

## ↑ comment by Chantiel · 2021-11-15T22:27:37.828Z · LW(p) · GW(p)

For the record, according to my intuitions, average consequentialism seems perfectly fine to me in a finite universe.

That said, if you don't like using average consequentialism in a finite case, I don't personally see what's wrong with just having a somewhat different ethical system for finite cases. I know it seems ad-hoc, but I think there really is an important distinction between finite and infinite scenarios. Specifically, people have the moral intuition that larger numbers of satisfied lives are more valuable than smaller numbers of them, which average utilitarianism conflicts with. But in an infinite universe, you can't change the total amount of satisfaction or dissatisfaction.

But, if you want, you could combine both the finite ethical system and infinite ethical system so that a single principle is used for moral deliberation. This might make it feel less ad hoc. For example, you could have a moral value function of the form f(total amount of satisfaction and dissatisfaction in the universe) * expected value of life satisfaction for an arbitrary agent in this universe. And let f be some bounded function that's maximized by and approaches this value very slowly.

For those who don't want this, they are free to use my total-utilitarian-infinite-ethical system. I think that it just ends up as regular total utilitarianism in a finite world, or close to it.

## ↑ comment by tivelen · 2021-11-05T00:28:48.998Z · LW(p) · GW(p)

Your system may not worry about average life satisfaction, but it does seem to worry about expected life satisfaction, as far as I can tell. How can you define expected life satisfaction in a universe with infinitely-many agents of varying life-satisfaction? Specifically, given a description of such a universe (in whatever form you'd like, as long as it is general enough to capture any universe we may wish to consider), how would you go about actually doing the computation?

Alternatively, how do you think that computing "expected life satisfaction" can avoid the acknowledged problems of computing "average life satisfaction", in general terms?