Appraising aggregativism and utilitarianism
post by Cleo Nardo (strawberry calm) · 2024-06-21T23:10:37.014Z · LW · GW · 10 comments
“My problem is: What are those objects we are adding up? I have no objection to adding them up if there's something to add.” — Kenneth Arrow
1. Introduction
Aggregative principles state that a social planner should make decisions as if they will face the aggregated personal outcomes of every individual in the population. Different modes of aggregation generate different aggregative principles. In general, a mode of aggregation provides some method for reducing a collection of personal outcomes to a single personal outcome.
There are three notable aggregative principles:
- Live Every Life Once, stating that a social planner should make decisions as if they face the concatenation of every individual's life in sequence.[1]
- Harsanyi's Lottery, stating that a social planner should make decisions as if they face a uniform lottery over the individuals in the population.[2]
- Rawls' Original Position, stating that a social planner should make decisions as if they face ignorance about which individual's life they will live, with no basis for assigning probabilities.[3]
This article follows on from my previous articles on this topic:
- Aggregative principles of social justice [LW · GW]
- Aggregative principles approximate utilitarian principles [LW · GW]
In this article I compare aggregativism with an alternative strategy for specifying principles of social justice: namely, utilitarianism. Aggregativism avoids many theoretical pitfalls that plague utilitarianism, but faces its own objections. Ultimately, I conclude that aggregativism's advantages outweigh the objections; it is superior to utilitarianism as a strategy for specifying principles of social justice, though the objections are serious and worth addressing in future work.
The rest of the article is organized as follows. Section 2 discusses seven advantages of aggregativism over utilitarianism, in descending order of importance:
- Avoids excessive permissions. If an option strongly defies human nature, then aggregativism will never permit choosing it, unlike utilitarianism.
- Avoids excessive obligations. If an option strongly conforms to human nature, then aggregativism will never forbid choosing it, unlike utilitarianism.
- Computationally tractable. Aggregativism can be implemented with realistic computational resources, unlike utilitarianism.
- Retains utilitarian spirit. Under reasonable conditions, aggregativism approximates utilitarianism.
- Lower description complexity. The aggregative approach sidesteps the complexity of defining a social utility function.
- Avoids counterintuitive implications. Aggregativism resolves Parfit's Repugnant Conclusion, the dismissal of extreme suffering, and similar paradoxes that beset utilitarianism.
- More concrete and relatable. Aggregativism grounds ethics in personal experiences, which are easier to reason about than abstract numbers.
Section 3 discusses two objections to aggregativism, in descending order of severity:
- Inherits human irrationality. Aggregativism inherits the irrationalities of human decision-making.
- Requires model of human behaviour. Aggregativism depends on an accurate model of human choice across all contexts, both mundane and exotic.
2. Advantages of aggregativism
2.1. Avoids excessive permissions
Aggregativism is more robust than utilitarianism to 'going off the rails'. The crucial difference is that, under aggregativism, the ultimate arbiter of social choice is human behaviour, providing a safeguard against bizarre or inhuman choices. Utilitarianism, by contrast, outsources social choices to an abstract optimization process (argmax), which is susceptible to extreme or bizarre results.
To see why aggregativism is more robust than utilitarianism, let's quickly review the formal framework introduced in the previous article [? · GW]. Feel free to skim this if you're already familiar:
- Let P be the space of personal outcomes, a full description of the state-of-affairs for a single individual; and let S be the space of social outcomes, a full description of the state-of-affairs for society as a whole. If I denotes the set of individuals in society, then for each individual i ∈ I there is a function π_i : S → P such that, if the social outcome s obtains, then individual i faces the personal outcome π_i(s).
- Suppose a social planner may choose from a set of options X, and the social consequences of each option are described by a function f : X → S assigning a social outcome f(x) to each option x ∈ X. For instance, X might be a set of possible tax rates, and S the set of possible societal wealth distributions. We call the pair (X, f) the social context.
- A social choice principle is any function Φ which takes a social context (X, f) as input and returns a subset Φ(X, f) ⊆ X of 'permissible' options as output. For example, the principle Φ(X, f) = {x ∈ X : f(x′) = f(x) for most x′ ∈ X} will permit choosing an option if and only if most other options would've led to the same social outcome. The task of social ethics is specifying a normatively compelling social choice principle.
- A utilitarian principle has the form Φ_U(X, f) = argmax_{x ∈ X} U(f(x)), where U : S → ℝ is a social utility function mapping social outcomes to real values, and argmax returns the maximisers of any given real-valued function over X. For instance, if U is the gross world product then Φ_U is the utilitarian principle endorsing maximising gross world product. This strategy for specifying social choice principles is called utilitarianism.
- An aggregative principle has the form Φ_ζ(X, f) = Human(X, ζ ∘ f), where ζ : S → P is a function mapping social outcomes to personal outcomes, and Human models the behaviour of a self-interested human, identifying the options they might choose in a personal context (X, g) with g : X → P. For instance, if ζ(s) is the personal outcome of living every individual's life in sequence, then the aggregative principle Φ_ζ is MacAskill's LELO. We call a function ζ : S → P a social zeta function. This strategy for specifying social choice principles is called aggregativism. (A toy code sketch of these definitions follows below.)
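To make these definitions concrete, here is a minimal Python sketch of the framework as reconstructed above. The particular options, outcomes, utility values, and the stub 'Human' model are invented placeholders, not part of the original formalism.

```python
# A minimal sketch of the framework above. Concrete outcomes, utilities, and the
# stub 'human' behaviour model are invented placeholders for illustration.

from typing import Callable

Option = str      # x in X
Social = str      # s in S
Personal = str    # p in P

def utilitarian_principle(U: Callable[[Social], float]):
    """Utilitarian principle: permit the options that maximise U(f(x))."""
    def principle(X: set[Option], f: Callable[[Option], Social]) -> set[Option]:
        best = max(U(f(x)) for x in X)
        return {x for x in X if U(f(x)) == best}
    return principle

def aggregative_principle(zeta: Callable[[Social], Personal],
                          human: Callable[[set[Option], Callable[[Option], Personal]], set[Option]]):
    """Aggregative principle: hand the planner's menu to a self-interested human
    who faces the aggregated personal outcome zeta(f(x)) of each option."""
    def principle(X: set[Option], f: Callable[[Option], Social]) -> set[Option]:
        return human(X, lambda x: zeta(f(x)))
    return principle

# Toy usage: two tax policies, a crude utility function, and a stub human model.
X = {"low_tax", "high_tax"}
f = {"low_tax": "unequal growth", "high_tax": "equal stagnation"}.get
U = {"unequal growth": 2.0, "equal stagnation": 1.0}.get
zeta = lambda s: f"living every life in the world where '{s}' obtains"
human = lambda X, g: {x for x in X if "stagnation" not in g(x)}  # placeholder behaviour

print(utilitarian_principle(U)(X, f))            # {'low_tax'}
print(aggregative_principle(zeta, human)(X, f))  # {'low_tax'}
```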
Using this formal framework, we can prove that aggregative principles satisfy a built-in consistency property. This property, which we call 'always-forbidden consistency', can be stated as follows: if a human will never choose a particular option, regardless of the personal context, then the social planner is forbidden to choose that option, regardless of the social context. Formally: if x ∉ Human(X, g) for all personal contexts (X, g), then x ∉ Φ_ζ(X, f) for all social contexts (X, f).
Clearly, any aggregative principle satisfies this property, since Φ_ζ(X, f) = Human(X, ζ ∘ f) and (X, ζ ∘ f) is itself a personal context. For example, suppose a human would never choose to torture an innocent person, no matter the personal consequences. Then a social planner following any aggregative principle is always forbidden to torture an innocent person, no matter the social consequences. Aggregativism is therefore constrained by the limits of human decision-making.
Why might an option be entirely ruled out by Human? There are three main reasons, all covered by this consistency property.
- Violates moral constraints. The option may transgress inviolable boundaries, like prohibitions on murder, torture, or slavery. If human behaviour, as modelled by Human, precludes such options, then the aggregative principle will forbid those options.
- Logical, metaphysical, or nomological impossibility. The option may be logically incoherent or incompatible with the laws of nature. For example, the option might involve travelling faster than light, or being in two places at once, or making 2 + 2 = 5. Our model of the option space might have been too inclusive, including some pseudo-options that weren't actually possible.
- Unthinkability. The option may be so alien to a person's character that they would never ever entertain it. It's not something they would recognise as an eligible option, even though it doesn't violate the laws of nature or transgress ethical boundaries.[4]
The aggregative principles forbid all such options. Crucially, this consistency property holds regardless of how the social zeta function ζ is defined. Even if ζ is poorly specified, or fails to represent a normatively compelling notion of impartial aggregation, the social planner would nonetheless never choose an option that a human would never choose, if they followed the aggregative principle generated by ζ.
Utilitarianism violates 'always-forbidden consistency'. Utilitarianism can permit literally any option, no matter how bizarre, in some social context. This holds for any non-constant social utility function U: there will exist some pair of social outcomes s₀, s₁ with U(s₀) < U(s₁). We can therefore construct, for any subset of options we choose, a social context where the options permitted by the utilitarian principle are precisely the options in that subset.
To prove this, let Y be any nonempty subset of X. Define the social context (X, f_Y) as follows: f_Y(x) = s₁ if x ∈ Y, and f_Y(x) = s₀ otherwise.
Then for the utilitarian principle Φ_U, we have x ∈ Φ_U(X, f_Y) if and only if x ∈ Y. Therefore, the utilitarian planner maximising U would be permitted to choose precisely the options in Y, even if Y contains options which violate moral constraints, are logically, metaphysically, or nomologically impossible, or are otherwise unthinkable.[5]
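Here is a hedged toy rendering of this construction: pick any target subset of options, however repugnant, and the context built above makes the argmax planner permit exactly that subset. The outcomes and utility values are placeholders.

```python
# Toy version of the proof above: for ANY target subset Y of the options, build a
# context in which the argmax planner permits exactly Y. Outcomes are placeholders.

def build_context(Y: set[str]):
    """f_Y sends options in Y to the higher-utility outcome s1, the rest to s0."""
    return lambda x: "s1" if x in Y else "s0"

def utilitarian(U, X, f):
    best = max(U(f(x)) for x in X)
    return {x for x in X if U(f(x)) == best}

U = {"s0": 0.0, "s1": 1.0}.get            # any non-constant utility function will do

X = {"do_nothing", "cure_disease", "torture_innocent"}
Y = {"torture_innocent"}                  # an option a human would never choose
f_Y = build_context(Y)

# The utilitarian principle permits exactly Y here, violating 'always-forbidden
# consistency'; an aggregative principle built on a human model never would.
print(utilitarian(U, X, f_Y))             # {'torture_innocent'}
```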
Under utilitarianism, everything is potentially permitted. The root problem is that utilitarianism delegates social choice to argmax, an abstract optimization process deviating wildly from realistic human behaviour. Without the anchor of human decision-making, utilitarianism can go arbitrarily awry.
2.2. Avoids excessive obligations
In addition, aggregativism satisfies 'always-permissible consistency', which can be stated as follows: if a human might always choose a particular option, regardless of the personal context, then the social planner is permitted to choose that option, regardless of the social context. Formally: if x ∈ Human(X, g) for all personal contexts (X, g), then x ∈ Φ_ζ(X, f) for all social contexts (X, f). This is the dual of the 'always-forbidden consistency' property discussed above.
Clearly, an aggregative principle satisfies this property. For example, suppose there's some 'do nothing' option such that, regardless of the personal consequences, it's always possible that the human would choose this option. Then the social planner following any aggregative principle is always permitted to 'do nothing', no matter the social consequences.
In contrast, utilitarian principles violate always-permissible consistency. Utilitarianism can forbid any option, no matter how inoffensive, in some social context. To prove this, we again assume that U is a non-constant social utility function, with U(s₀) < U(s₁). If Y is any nonempty proper subset of X, then we define the social context (X, f_Y) as we did before: f_Y(x) = s₁ if x ∈ Y, and f_Y(x) = s₀ otherwise.
Then for the utilitarian principle Φ_U, we have x ∈ Φ_U(X, f_Y) if and only if x ∈ Y. Now, for any option x₀, there will exist some nonempty proper subset Y excluding x₀, provided that X contains at least two options. The utilitarian planner maximising U would be forbidden to choose x₀ in the social context (X, f_Y), even if x₀ is inoffensive.
Under utilitarianism, everything is potentially forbidden. This situation is the basis of the demandingness objection to utilitarianism: any option, such as saving a child from a burning building, donating one's income to AMF, or dedicating one's life to esoteric research, is potentially obligated by the utilitarian principle.
2.3. Computationally tractable
Consider the following scenario:
A social planner must choose a 1000-bit proof, so their option space is X = {0,1}^1000, the set of 1000-bit strings. Let Σ denote the set of all 1000-bit mathematical statements, in a formal language such as Peano Arithmetic. If φ ∈ Σ then the social context c_φ = (X, f_φ) is defined as follows:
- If x = 0 (the all-zeros string), then f_φ(x) = s₁₀₀, meaning everyone gains £100.
- If x ≠ 0 and x encodes a valid proof of the statement φ, then f_φ(x) = s₂₀₀, meaning everyone gains £200.
- If x ≠ 0 and x doesn't encode a valid proof of the statement φ, then f_φ(x) = s₀, meaning everyone gains £0.
For the sake of concreteness, take φ to be Goldbach's conjecture. The key property of this class of social contexts is that finding a proof of the statement φ is computationally intractable, but verifying a claimed proof is easy.[6]
Now, any reasonable social utility function U would rank the social outcomes as U(s₂₀₀) > U(s₁₀₀) > U(s₀). Let's examine what is demanded of a social planner by the utilitarian principle Φ_U. When the social planner faces the social context c_φ, then, if φ is a mathematical statement with a 1000-bit proof, the social planner is obligated to provide one. Choosing 0 would be impermissible, as would choosing any nonzero string that doesn't encode a valid proof of φ. The catch is that the social planner has bounded computational resources, so they cannot actually implement this principle.[7] The utilitarian principle is not action-guiding for agents with realistic computational limits.
By contrast, the aggregative principle Φ_ζ would regularly endorse choosing zero when facing the context c_φ. This is because a self-interested human has little chance of proving a mathematical statement such as Goldbach's conjecture. Hence, if a self-interested human anticipated facing the aggregated personal outcomes of each individual, then they would play it safe by choosing zero. Unlike utilitarianism, aggregativism provides clear guidance which doesn't exceed the social planner's computational resources.
The root problem is that the argmax operator, which underlies utilitarianism, is computationally unbounded. Formally, there exist functions u(x, θ) that are easy to evaluate for which no similarly easy function m(θ) satisfies m(θ) ∈ argmax_{x ∈ X} u(x, θ) for all θ. Hence, if X denotes the social planner's options, and θ parameterizes a class of social contexts, then the argmax operator can search over the entire option space for a maximiser of u(·, θ), although no physical implementation could find a maximiser efficiently.
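As a sanity check on this asymmetry, here is a miniature sketch of the proof-search context: a verifier that is cheap to run on any single candidate, an argmax planner that must sweep the whole option space, and a crude bounded 'human' that gives up and takes the safe £100. The 10-bit scale and the hidden 'proof' string are stand-ins, not part of the original example.

```python
# Miniature rendering of the proof-search context (10-bit strings instead of
# 1000-bit). The hidden SECRET string stands in for the unknown proof of phi:
# checking any one candidate is cheap, but finding the right one is not.

from itertools import product

N_BITS = 10                    # the real example has 2**1000 options
ZERO = "0" * N_BITS            # the safe choice: everyone gains £100
SECRET = "1100100111"          # stand-in for "a valid proof of phi"

def verify(candidate: str) -> bool:
    """Cheap to run on a single candidate, like checking a proof step by step."""
    return candidate == SECRET

def social_utility(option: str) -> int:
    if option == ZERO:
        return 100
    return 200 if verify(option) else 0

def utilitarian_choice() -> set[str]:
    # argmax over the whole option space: feasible here only because N_BITS is tiny.
    options = ["".join(b) for b in product("01", repeat=N_BITS)]
    best = max(map(social_utility, options))
    return {x for x in options if social_utility(x) == best}

def aggregative_choice() -> set[str]:
    # Crude bounded human model: try a handful of guesses; if no proof turns up,
    # play it safe and take the guaranteed £100.
    for i, b in enumerate(product("01", repeat=N_BITS)):
        if i >= 50:
            break
        if verify("".join(b)):
            return {"".join(b)}
    return {ZERO}

print(utilitarian_choice())    # finds the 'proof' only by exhaustive search
print(aggregative_choice())    # settles for the safe zero string
```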
One may object that the utilitarian principle is not meant to be a computationally tractable decision procedure, but merely to define what counts as right or to ground normative judgments more abstractly.[8] And that's a fair point. However, the fact that utilitarian principles fail to guide action in cases like this, while aggregative principles succeed, does count in favour of aggregativism.[9]
2.4. Retains utilitarian spirit
With a well-chosen social zeta function ζ, aggregativism retains the most appealing features of utilitarianism. Like utilitarianism, aggregativism is sensitive to outcomes: whether an option is permitted typically depends on its social consequences, as captured by the function f. Of course, aggregativism may consider nonconsequentialist factors also (see Section 2.1). Nonetheless, it preserves the consequentialist spirit of utilitarianism, directing the social planner to choose options that lead to desirable personal outcomes for individuals, considered impartially.
Consider the famous trolley problem: a runaway trolley will kill five people unless diverted to a side track, where it will kill only one person. The naïve utilitarian verdict is clear: you must divert the trolley. Doing so leads to one death rather than five, and the former social outcome has higher social utility. An aggregative principle would likely agree, assuming the social zeta function ζ is well-chosen. For example, suppose ζ maps each social outcome to a lottery over the fates of the affected individuals. In that case, most people would prefer the lottery corresponding to diverting the trolley over the lottery corresponding to not diverting. The odds of death are lower in the former lottery.
So in 'ordinary' cases, aggregativism and utilitarianism tend to agree. The two approaches diverge only in more extreme scenarios. Suppose the only way to save the five people is to push a large man in front of the trolley, killing him but stopping the trolley due to his mass. Utilitarianism still endorses sacrificing the one to save the five, as this leads to fewer total deaths. But aggregativism likely rejects pushing the man. Most humans would refrain from murder, even if doing so improves their odds of survival. Importantly, the aggregative framework need not explicitly specify the social planner's moral constraints; it simply imports them from the human behavior model.
Indeed, the previous article established a pertinent theorem [? · GW]: under reasonable conditions, aggregativism and utilitarianism are mathematically equivalent, in the sense that the aggregative principle Φ_ζ and the corresponding utilitarian principle Φ_U will permit exactly the same subset of options in every social context (X, f).
While these conditions don't hold exactly, they are close approximations, so aggregativism and utilitarianism will yield similar prescriptions in practice. In particular, Live Every Life Once (LELO) will approximate longtermist total utilitarianism. Similarly, Harsanyi's Lottery (HL) will approximate average utilitarianism. Finally, Rawls' Original Position (ROI) will approximate his difference principle.
2.5. Lower description complexity
The most glaring problem with utilitarianism is the immense description complexity of any suitable social utility function U. That is, fully defining U would require an impractically vast amount of information, if we limit ourselves to a basic, physicalist language without high-level concepts. By 'description complexity', we mean the quantity of information needed to uniquely identify an object within a given language.
We can establish a lower bound on the complexity of U as follows. First, an adequate social choice principle must encompass all the ethical considerations and tradeoffs in real-world social decision-making. This includes the full complexity of axiology, i.e. determining which outcomes are valuable and to what degree. It also includes the complexity of decision theory, such as accounting for long-term consequences, dealing with uncertainty, and considering acausal effects. For utilitarianism to yield a suitable principle, all this complexity must be encoded in the utility function U. This is because a utilitarian principle Φ_U is entirely determined by U and the argmax operator. But argmax is a simple mathematical object, contributing no complexity, so almost all the complexity of Φ_U must come from U itself.
The unavoidable complexity of any suitable social utility function U has been a major criticism of utilitarianism.[10] Moreover, this complexity makes it infeasible to design a superintelligent AI system by specifying an appropriate utility function.[11]
In contrast, aggregativism sidesteps this prohibitive complexity. Much of the intricacy of social choice may already be captured by the intricacy of self-interested human behaviour, which is 'imported' into the social choice principle via the social zeta function. Recall that an aggregative principle Φ_ζ is fully determined by two components: the social zeta function ζ and the model of human behaviour Human. While an adequate Φ_ζ must be highly complex, this complexity is split between ζ and Human. Crucially, Human models human decision-making across all personal contexts, so it necessarily encodes a huge amount of information about human preferences, values, and reasoning. Given the complexity of Human, even a fairly simple ζ can yield a suitable Φ_ζ.
2.6. Avoids counterintuitive implications
2.6.1. Repugnant conclusion
The repugnant conclusion, first formulated by Derek Parfit, poses a serious challenge to total utilitarianism and population ethics. Namely, total utilitarianism suggests that a world filled with an enormous number of people leading lives barely worth living is better than one with a much smaller population of very happy individuals.
Let's consider how the three utilitarian principles resolve this:
- Longtermist total utilitarianism, as defended by William MacAskill, seeks to maximize the total sum of personal utility across all individuals — past, present, and future. However, this leads to the repugnant conclusion.
- Average utilitarianism, as proposed by John Harsanyi, seeks to maximize the average personal utility across all individuals. This avoids the repugnant conclusion, but results in its own counterintuitive implication: a social planner should refrain from adding additional lives which are worth living, if they are below the average personal utility.
- The difference principle, defended by John Rawls, seeks to maximize the minimal personal utility across all individuals. This avoids the repugnant conclusion, but results in the most bizarre implication: a social planner has no incentive to add any lives, as doing so can only reduce the minimum personal utility.
The aggregative principle LELO handles the repugnant conclusion more successfully than any utilitarian principle discussed. Under LELO, a society with n individuals facing personal outcomes p₁, …, pₙ is aggregated into a single personal outcome, namely the concatenation p₁ ⌢ p₂ ⌢ ⋯ ⌢ pₙ. This reframes the population ethics dilemma in terms of a personal choice between quality and duration of life.
Formally, when comparing a population of n individuals leading lives p₁, …, pₙ to an alternative population of m individuals leading lives q₁, …, qₘ, LELO ranks the first population as better if and only if a self-interested human would prefer to live the combined lifespan p₁ ⌢ ⋯ ⌢ pₙ over q₁ ⌢ ⋯ ⌢ qₘ. Do people generally prefer a longer life with moderate quality, or a shorter but sublimely happy existence? Most people's preferences likely lie somewhere in between the extremes. This is because the personal utility of a concatenation of personal outcomes is not precisely the sum of the personal utilities of the outcomes being concatenated.
For example, exponential time-discounting is a common assumption in economics, which states that the personal utility function obeys the equation u(p₁ ⌢ p₂) = u(p₁) + exp(−λ·T(p₁))·u(p₂). Here T(p) gives the duration of the outcome p and λ > 0 is the discount rate. This discounting formula weights the first outcome p₁ more than the second outcome p₂, with the difference growing exponentially with the duration of p₁. If humans maximise a personal utility function with this property, then the value gained by adding an additional life will decay exponentially in the total sum of existing lifespans.
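Here is a small numerical illustration of that claim, with made-up numbers: each added life has the same modest quality, and the discount rate and lifespans are arbitrary assumptions. The point is only that the marginal value of the n-th life shrinks exponentially, so the total converges rather than growing without bound.

```python
# Toy illustration of the discounting claim above; all parameters are assumptions.
import math

def lelo_value(qualities, T=80.0, lam=0.005):
    """Discounted value of living the concatenated lives, in the given order."""
    total, elapsed = 0.0, 0.0
    for q in qualities:
        total += math.exp(-lam * elapsed) * q   # earlier lives are weighted more
        elapsed += T                            # each life adds T years of elapsed time
    return total

# Adding ever more lives of modest quality 1.0: the total converges (here to ~3.03),
# so a vast population of marginal lives cannot swamp a smaller, happier one.
for n in [1, 10, 100, 1000]:
    print(n, round(lelo_value([1.0] * n), 3))
```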
Hence, LELO endorses a compromise between total and average utilitarianism, better reflecting our normative intuitions. While not decisive, it is a mark in favour of aggregative principles as a basis for population ethics.
2.6.2. Extreme suffering
Intuitively, some personal outcomes are so horrific that no minuscule benefit to others, no matter how many beneficiaries there are, could ever justify them. The following example illustrates this: Imagine a social outcome A in which n people live comfortable lives. Now compare that to an alternative outcome B in which those n people each receive a minuscule additional benefit, but one unfortunate person faces a life of unrelenting agony, subjected to the most horrific physical and psychological torture imaginable.
For many social utility functions U, there exists a sufficiently large n such that U(B) > U(A). In other words, the utilitarian principle Φ_U would not only permit the social planner to switch from A to B, but in fact obligate it. This holds for both total and average utilitarianism; the social planner would be morally obligated to torture an innocent person for the sake of a trivial benefit to others. This conclusion seems to defy moral common sense; surely there are some personal outcomes which are so awful that no number of minuscule benefits to others could justify them.[12] Only Rawls' difference principle avoids this implication, since it equates social utility with the minimum personal utility.
How do aggregative principles handle extreme suffering to a small minority?
According to Harsanyi's Lottery (HL), the social planner should choose B over A only if a human would prefer a uniform lottery over the individuals in B to a uniform lottery over the individuals in A. That is, only if a self-interested human would accept a 1/(n+1) likelihood of torture in exchange for the chance of the minuscule benefit. But would they choose this trade? Perhaps not. Humans do exhibit an extreme aversion to even small risks of catastrophic outcomes.
Formally, let p_good, p_better, and p_torture denote a good life, a slightly improved life, and a horrifically torturous life respectively. Let L_ε denote the lottery that yields p_torture with probability ε and p_better with probability 1 − ε. Plausibly, a self-interested human would choose p_good over L_ε for every ε > 0, because they place substantial value on the 'zero-likelihood-ness' of torture.
Note that this behaviour is inconsistent with the conjunction of the following assumptions about human behaviour:
- Humans maximise a personal utility function u.
- The outcomes are ranked as u(p_torture) < u(p_good) < u(p_better).
- Personal utility is continuous in the underlying likelihoods, in the sense that u(L_ε) → u(L₀) as ε → 0, and L₀ = p_better.
Together, these assumptions imply that u(L_ε) > u(p_good) for all sufficiently small ε > 0, so the human would eventually accept the lottery.
To be clear, whether humans actually exhibit this discontinuous preference is an empirical question. But it seems plausible given the limits of human reasoning. Humans don't represent probabilities with infinite precision, so there may be some small probability ε₀ > 0 such that they treat any nonzero probability less than ε₀ in the same way they treat ε₀ itself. If there are personal outcomes with this property, then Harsanyi's Lottery would endorse choosing A over B, conforming to my moral intuitions.
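The following sketch makes the threshold story concrete, with invented utilities and an invented perception floor ε₀: once probabilities below ε₀ are perceived as ε₀ itself, the lottery is rejected for every positive ε, and the preference jumps discontinuously at ε = 0.

```python
# Minimal sketch of the probability-threshold story above; the utilities and the
# perception floor eps0 are invented for illustration.

def perceived(prob, eps0=1e-3):
    """Humans don't track tiny probabilities precisely: anything in (0, eps0)
    is treated as if it were eps0 itself; only exactly-zero is treated as zero."""
    return 0.0 if prob == 0.0 else max(prob, eps0)

U_TORTURE, U_GOOD, U_BETTER = -1e6, 100.0, 101.0

def lottery_value(eps):
    """Perceived value of L_eps: torture with probability eps, better life otherwise."""
    p = perceived(eps)
    return p * U_TORTURE + (1 - p) * U_BETTER

# However small the true risk, the perceived value never climbs above the sure
# good life, so the lottery is rejected at every eps > 0 and accepted only at 0.
for eps in [1e-2, 1e-4, 1e-8, 0.0]:
    print(eps, lottery_value(eps) > U_GOOD)
```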
Compared to HL, I think that LELO is even less tolerant of the extreme suffering of a small minority. Facing the uniform lottery proposed by HL, a human may find it easy to dismiss the small likelihood of extreme suffering, and hence a social planner following the principle would dismiss the extreme suffering of the small minority. But under LELO, the social planner must imagine facing the concatenation of the individual lives, rather than a uniform lottery. Hence, A is associated with the concatenation of n comfortable lives, and B with the concatenation of a life of horrific torture followed by n slightly improved lives. But it's impossible for a human to ignore the period of torture, no matter how many comfortable lives succeed it, since it is experienced with certainty.[13]
There are other puzzling cases where the paradigm aggregative principles conform more to our moral intuitions than the paradigm utilitarian principles, such as distant future generations, the terminal value of tradition, and infinite ethics.
2.7. More concrete and relatable
Aggregativism is more concrete than utilitarianism. Aggregative principles like LELO, HL, and ROI promote outcomes that are graspable by the social planner. The outcomes promoted by utilitarian principles, in contrast, are far more opaque.
Recall that each aggregative principle is specified by a social zeta function ζ, which maps social outcomes to personal outcomes. Assuming the social planner is human, they will be familiar with these personal outcomes. They will have a rich set of prior attitudes towards them, including intuitions, preferences, experiences, historical analogues, legal precedents, and moral convictions.
The social zeta function ζ extends these prior attitudes to social outcomes. For instance, consider a social outcome s with a billion healthy people and a billion sick people. The planner may lack prior attitudes towards this social outcome, because the population contains strangers the planner has no interest in. But if ζ maps s to the personal outcome of an even lottery between health and sickness, the planner will have strong prior attitudes about it, helping them reason about the social outcome via analogy to something concrete and familiar.
By contrast, each utilitarian principle is specified by a social utility function U mapping social outcomes to real numbers. But the planner has no prior attitudes towards these abstract numbers, so U does nothing to help the social planner understand the situation. They may know that U(s) = 540 and U(s′) = 450, but how does this mathematical fact help them reason: is an outcome with utility 540 slightly better than one with 450, or vastly better? Does the difference in social utility warrant violating property rights? Does it warrant violating bodily autonomy? It's not clear.
Indeed, a common objection to utilitarianism is that it has an austere and impersonal quality, seeking to maximize an abstract metric ('utility') rather than anything recognizably valuable to humans. Aggregativism replaces these abstract numbers with personal outcomes, avoiding this objection.
There are limits to the concreteness of personal outcomes. Some social outcomes may map to 'exotic' personal outcomes that are unfamiliar to humans; for instance, under LELO, ζ(s) is the concatenation of billions of individual lives. Nonetheless, even these exotic personal outcomes remain more tangible than abstract utility numbers such as "3.27e10". Humans can imagine the prospect, reason about what it would be like, feel excited or horrified by it, judge whether it's more desirable than some other concatenation of lives, and so on.
The concreteness of aggregativism explains its appeal to moral philosophers, such as MacAskill, Harsanyi, and Rawls, who have used aggregative thought experiments to motivate their principles of social choice.
3. Objections to aggregativism
3.1. Inherits human irrationality
Aggregativism inherits the inconsistencies, irrationalities, and biases of human decision-making, since it is based on Human, the model of human behaviour.
For a concrete example, consider the case of intransitive preferences. If humans exhibit intransitive preferences, then a social planner following the aggregative principle Φ_ζ may also exhibit intransitive preferences. This renders them vulnerable to 'ethical money pumps', a cycle of trades that exploits the social planner's intransitivity. By contrast, utilitarianism is based on the argmax operator, which never exhibits intransitive preferences. Hence, a social planner following the utilitarian principle Φ_U will not be exploitable in this way.[14]
Similar arguments apply to other forms of irrationality that Human might exhibit, such as: incomplete preferences, dynamic inconsistency, framing effects, menu effects, and so on. Aggregativism inherits these defects; utilitarianism doesn't.
That said, I suspect aggregativism would still perform adequately in practice. Human irrationality, while real, is not so severe or easily exploited as to undermine people's overall competence. Most of the time, people muddle through okay. By extension, a social planner following the aggregative principle should be roughly as resistant to exploitation as a typical human, avoiding the most egregious errors.
Furthermore, it may be possible to mitigate the biases and inconsistencies in the model of human behaviour used by aggregativism. The aggregative framework does not demand that Human perfectly model a realistic human, only that it is some function of the right type, mapping personal contexts to sets of options. That is, Human need not model realistic human behaviour; it may instead model idealised human behaviour. This idealisation may correct for certain biases, expand available information, remove inconsistencies, or otherwise improve on ordinary human decision-making, while still retaining human values and cognition.
However, we must be cautious not to idealize too far: the appeal of aggregativism lies in its proximity to actual human decision-making. If Human diverges too radically from human behaviour then aggregativism is more likely to 'go off the rails' like utilitarianism, because the underlying choice criterion approaches the alienness of argmax. The further we stray from human psychology, the less obvious the moral authority of the resulting entity. Moreover, as Joe Carlsmith argues in "On the limits of idealized values [LW · GW]", there are myriad ways to idealise human behaviour. The choice of a particular method feels arbitrary and is subject to scrutiny itself — we can always wonder whether we'd actually endorse the choices of the idealised agent. Therefore, I endorse employing a Human which models realistic human behaviour.
3.2. Requires model of human behaviour
In order for the social planner to implement an aggregative principle, they need an accurate model of how a self-interested human would behave in different personal contexts. This model is captured by the Human operator which, as discussed in Section 2.5, is an immensely complicated object. The social planner's uncertainty about self-interested human behaviour will hinder applying the principle in practice.
By contrast, utilitarianism relies on the simple, well-defined argmax operator. (As discussed before, the complexity of utilitarianism lies in the social utility function U.) The social planner has no analogous uncertainty about the nature of argmax.
This is a serious objection to aggregativism, but I do think that grounding a normative theory on Human, the model of human behaviour, has several advantages over grounding it on U, the social utility function.
Firstly, Human is a possibilistic model of human behaviour, meaning it specifies which options a human might choose in different contexts, without assigning probabilities. As such, the object Human supervenes on natural facts: it is simply a question of which options the human might choose in given personal contexts. In contrast, defining a utility function U requires making value judgments comparing different social outcomes.
Secondly, we can draw on a rich body of existing empirical data and methods to inform the specification of Human. We can leverage behavioural experiments, economic models, neuroimaging studies, and so on. We could train a deep neural network on a large dataset of human choices to predict which options humans would choose in novel contexts. Specifying Human through a combination of empirical study and machine learning seems more scientifically grounded and tractable than defining U from first principles.
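A minimal sketch of that idea, with an entirely synthetic dataset: contexts are random feature vectors, the 'human' labels encode noisy risk-averse payoff-chasing, and a simple classifier stands in for the learned model of Human. Nothing here reflects real behavioural data.

```python
# Toy sketch of learning a stand-in for Human from (context, choice) data.
# The dataset, features, and model are placeholders, not real behavioural data.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Each row describes a binary personal context: option 0 and option 1, each with
# two features (say, expected payoff and riskiness), flattened into one vector.
X = rng.normal(size=(500, 4))

# Synthetic 'human' choices: pick option 1 when it pays more, with noisy risk aversion.
payoff_gap = X[:, 2] - X[:, 0]
risk_gap = X[:, 3] - X[:, 1]
y = (payoff_gap - 0.5 * risk_gap + rng.normal(scale=0.3, size=500) > 0).astype(int)

# Fit a predictive stand-in for Human and check how well it reproduces the choices.
human_model = LogisticRegression().fit(X, y)
print("training accuracy:", round(human_model.score(X, y), 3))
```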
Thirdly, the social planner could consult their own intuitions about what they would do in the hypothetical personal context. For example, if the social planner feels they would choose option x over option x′ in the personal context (X, ζ ∘ f), this provides a reason for them to choose option x over option x′ in the social context (X, f). Of course, people's self-predictions are often inaccurate,[15] but the social planner's intuitions provide a useful starting point that can be refined with empirical data.
4. Conclusion
The central feature of aggregativism, for better and worse, is that it conforms to the contours of human decision-making in ways that utilitarianism does not.
On the plus side, it never permits an option that the human would never choose, thereby avoiding utilitarianism's excessive permissions. It also never forbids an option that the human might always choose, thereby avoiding utilitarianism's excessive obligations. Moreover, it doesn't require utilitarianism's superhuman computational resources. And it sidesteps the prohibitive complexity of defining a social utility function, by importing the complexity of the human behavior model. Aggregativism also enables more concrete moral reasoning, by dealing in personal experiences rather than abstract utility.
On the other hand, by hewing so closely to human behavior, aggregativism inherits the messy irrationalities and inconsistencies of human decision-making. And it requires a model of how humans act in a vast array of hypothetical scenarios.
- ^
The term LELO originates in Loren Fryxell (2024), "XU", which is where I first encountered the concept. I think Fryxell offers the first formal treatment of the LELO principle. MacAskill (2022), "What We Owe the Future", says this thought experiment comes from Georgia Ray (2018), “The Funnel of Human Experience”, and that the short story Andy Weir (2009), "The Egg", shares a similar premise. Roger Crisp, however, attributes LELO to C.I. Lewis, which would predate both Ray and Weir, though I haven't traced that reference.
- ^
John C. Harsanyi "Cardinal Utility in Welfare Economics and in the Theory of Risk-Taking" (1953) and "Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparisons of Utility" (1955)
- ^
See John Rawls (1971), "A Theory of Justice" and Samuel Freeman (2023) "Original Position".
- ^
Bernard Williams discusses the notion of "unthinkable" options in his critique of utilitarianism.
It could be a feature of a man’s moral outlook that he regarded certain courses of action as unthinkable, in the sense that he would not entertain the idea of doing them: and the witness to that might, in many cases, be that they simply would not come into his head. Entertaining certain alternatives, regarding them indeed as alternatives, is itself something that he regards as dishonourable or morally absurd.
(Bernard Williams, 1973, "Utilitarianism: For and Against").
This is distinct from options that are ruled out by moral side constraints or physical impossibility. As Williams puts it, "it is perfectly consistent, and it might be thought a mark of sense, to believe, while not being a consequentialist, that there was no type of action which satisfied [the condition of being morally prohibited whatever the consequences]" (Williams, 1973).
- ^
On the flip-side, even if the social context is fixed, we can nonetheless concoct, for any option x, a utility function U_x such that x is permitted by the utilitarian principle generated by U_x. That is, any option will be permitted in any social context, provided the social utility function is sufficiently misspecified, no matter how ludicrous that choice would be.
To prove this, define U_x as the indicator function for f(x): that is, U_x(s) = 1 if s = f(x), and U_x(s) = 0 otherwise.
This assigns utility 1 to the social outcome of choosing x, and 0 to all other outcomes, so a utilitarian planner maximising U_x would be permitted to choose x, or any other option that leads to the same outcome as x. That is, x ∈ argmax_{x′ ∈ X} U_x(f(x′)).
- ^
Determining whether a statement has a proof that is less than k bits long is an NP-complete problem. Even the instance k = 1000 would exceed the computational resources of the observable universe.
Firstly, this problem belongs to the complexity class NP because, given a candidate proof that is less than k bits in length, it is possible to verify each step of the proof to ensure that it adheres to the rules of Peano Arithmetic (PA). The verification process can be completed in polynomial time with respect to the size of the proof.
Moreover, this problem is NP-hard, as it is possible to reduce the Boolean Satisfiability Problem (SAT), which is known to be NP-hard, to our problem. To demonstrate this reduction, consider an instance of SAT with variables x₁, …, xₙ and a Boolean formula ψ(x₁, …, xₙ). We can construct a statement in the following manner: φ_ψ := "there exist values b₁, …, bₙ ∈ {0, 1} such that ψ(b₁, …, bₙ) holds", expressed in the language of PA.
If the original Boolean formula ψ is satisfiable, then this newly constructed statement is provable with a proof that requires only a polynomial number of bits: the proof simply exhibits a satisfying assignment and verifies it. Furthermore, this reduction can be performed in polynomial time.
Our problem is both in NP and NP-hard, and hence is NP-complete.
- ^
Strictly speaking, the claim that a social planner cannot implement the utilitarian principle in this scenario relies on two key assumptions:
(1) The social planner's decision-making process is instantiated by a physical system, such as a machine or computer, that exists in our universe and is bound by the laws of physics.
(2) No physically realizable machine can efficiently solve NP-complete problems. In other words, the time required to find a solution grows exponentially with the size of the problem, quickly becoming infeasible for even moderately large instances.
For a discussion of (1), see Scott Garrabrant and Abram Demski's 2018 article "Embedded Agency [LW · GW]". For a compelling defence of (2), see Scott Aaronson's 2005 paper "NP-complete Problems and Physical Reality".
- ^
See Bales, R. E. (1971), "Act-utilitarianism: account of right-making characteristics or decision-making procedure?", which "stress[es] the importance of maintaining a sharp distinction between (a) decision-making procedures, and (b) accounts of what makes right acts right."
- ^
We've seen how utilitarianism demands superhuman computational resources from the social planner, in contrast to aggregativism. As I demonstrate below, a similar point can be made about noncomputational resources.
Most humans cannot, I presume, jump exactly 45 cm. It's practically impossible for a typical human to reliably distinguish between jumping 45 cm and jumping 46 cm, as the difference is too small to accurately control or perceive. Hence, in some circumstances, a human might either jump 45 cm or jump 46 cm; in other circumstances, they will surely do neither; but there are no circumstances where a human might jump 45 cm but surely won't jump 46 cm.
Formally, let X denote the set of all heights that a human might attempt to jump. To say that the human cannot distinguish between 45 cm and 46 cm, we mean that 45 cm ∈ Human(X, g) if and only if 46 cm ∈ Human(X, g), for all personal contexts (X, g).
Now, the aggregative principles satisfy a property called 'indistinguishable-options consistency'. Namely, if the aggregative principle permits (resp. forbids) jumping 45 cm in some social context, then it must also permit (resp. forbid) jumping 46 cm in that same context. The social planner is never permitted to jump 45 cm while forbidden to jump 46 cm, nor vice-versa.
More generally, if x ∈ Human(X, g) ⟺ x′ ∈ Human(X, g) for all personal contexts (X, g), then x ∈ Φ_ζ(X, f) ⟺ x′ ∈ Φ_ζ(X, f) for all social contexts (X, f).
In contrast, utilitarian principles violate indistinguishable-options consistency. If U is any non-constant utility function, with U(s₀) < U(s₁), then we can define a social context in which f maps the option 'jump 45 cm' to s₁ and every other option, including 'jump 46 cm', to s₀.
The utilitarian planner maximizing U would be obligated to jump exactly 45 cm, and forbidden to jump 46 cm, even though distinguishing between these two options is practically impossible.
- ^
Philosophers like Bernard Williams (1981) rejected the codification of ethics into simple theories such as Kantianism or utilitarianism. “There cannot be any very interesting, tidy or self-contained theory of what morality is… nor… can there be an ethical theory, in the sense of a philosophical structure which, together with some degree of empirical fact, will yield a decision procedure for moral reasoning.”
- ^
See Eliezer Yudkowsky on The Hidden Complexity of Wishes [LW · GW], Not for the Sake of Happiness (Alone) [LW · GW], and Fake Utility Functions [LW · GW].
- ^
I've been persuaded by Brian Tomasik's writings, in particular "The Horror of Suffering" (2017) and "Preventing Extreme Suffering Has Moral Priority" (2016, video presentation, warning: disturbing content).
- ^
In "Three Types of Negative Utilitarianism", Brian Tomasik uses a LELO-esque argument to support lexical-threshold negative utilitarianism. This position states that a small minority facing extreme suffering cannot be compensated by a miniscule benefit to a sufficiently large majority. He justifies this on the grounds that a self-interested human wouldn't desire the concatenation of those lives:
A day in hell could not be outweighed by happiness:
I would not accept a day in hell in exchange for any number of days in heaven. Here I'm thinking of hell as, for example, drowning in lava but with my pain mechanisms remaining intact for the whole day. Heaven just wouldn't be worth it, no matter how long. It seems like there's no comparison.
- ^
To formalize this, let Φ be any choice principle whose contexts assign payoffs in a set Y (i.e. Φ takes a context (X, f) with f : X → Y and returns a subset of X), and let ≺ be any binary relation over the payoffs Y. We say that Φ respects ≺ if, for all contexts (X, f) and all options x ∈ Φ(X, f), there exists no x′ ∈ X such that f(x) ≺ f(x′). In plain terms: if ≺ represents strict preference, then Φ never permits choosing a strictly dispreferred option. Moreover, we say that Φ has transitive preferences if it respects some transitive relation ≺.
It's straightforward to show that argmax has transitive preferences: it respects the usual 'less than' ordering on real numbers, which is transitive. Furthermore, if g is any function from social outcomes to payoffs and the choice principle Φ has transitive preferences, then so does the composite principle Φ_g defined by Φ_g(X, f) = Φ(X, g ∘ f). Indeed, if Φ respects a relation ≺, then Φ_g respects the relation ≺_g defined by s ≺_g s′ iff g(s) ≺ g(s′), and ≺_g is transitive if ≺ is. Combining these observations: since argmax has transitive preferences, so does a social planner following any utilitarian principle Φ_U(X, f) = argmax_{x ∈ X} U(f(x)).
However, the human behaviour model Human may lack transitive preferences. If so, then a social planner following the aggregative principle Φ_ζ, for some social zeta function ζ, may also lack transitive preferences. This exposes the planner to 'ethical money pumps': a sequence of choices that leads to a strictly worse outcome than where they started, by exploiting their intransitive preferences. For example, the planner might trade policy A for B, B for C, and C back to A, each time accepting a small 'ethical cost' that compounds to a large overall loss.
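A toy sketch of such a money pump, with an invented preference cycle A ≺ B ≺ C ≺ A and an invented per-trade cost: the planner ends up holding the policy it started with, minus the accumulated 'ethical cost'.

```python
# Toy money pump on the cyclic preference A < B < C < A described above.
# The cycle and the per-trade cost are invented for illustration.

prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (held, offered): offered is strictly preferred

def accepts_trade(current, offered):
    """An intransitive planner accepts any strictly preferred offer, even at a small cost."""
    return (current, offered) in prefers

policy, value, cost_per_trade = "A", 0.0, 1.0
for offered in ["B", "C", "A"] * 3:              # cycle the same three offers
    if accepts_trade(policy, offered):
        policy, value = offered, value - cost_per_trade

# After nine accepted trades the planner holds policy A again, nine units poorer.
print(policy, value)
```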
- ^
See e.g. "Affective Forecasting" (Gilbert and Wilson, 2003)
10 comments
Comments sorted by top scores.
comment by Gustav Alexandrie (gustav-alexandrie) · 2024-06-23T20:30:37.126Z · LW(p) · GW(p)
Thanks for writing this!
I only skimmed the post, so I may have missed something, but it seems to me that this post underemphasizes the fact that both Harsanyi's Lottery and LELO imply utilitarianism under plausible assumptions about rationality. For example, if the social planner satisfies the vNM axioms of expected utility theory, then Harsanyi's Lottery implies that the social planner is utilitarian with respect to expected utilities (Harsanyi 1953). Likewise, if the social planner's intertemporal preferences satisfy a set of normatively plausible axioms, then LELO implies that the social planner is utilitarian with respect to experienced utilities (Fryxell 2024). In my view, it is therefore not clear that it makes sense to compare LELO and Harsanyi's Lottery with utilitarianism.
Also, at least some of the advantages of aggregativism that you mention are easily incorporated into utilitarianism. For example, what is achieved by adopting LELO with exponential time-discounting in Section 2.6.1 can also be achieved by adopting discounted utilitarianism (rather than unweighted total utilitarianism).
A final tiny comment: LELO has a long history, going back to at least C.I. Lewis's " An Analysis of Knowledge and Valuation", though the term "LELO" was coined by my colleague Loren Fryxell (Fryxell 2024). It's probably worth adding citations to these.
Replies from: strawberry calm, MichaelStJules↑ comment by Cleo Nardo (strawberry calm) · 2024-06-23T22:40:06.815Z · LW(p) · GW(p)
thanks for comments, gustav
I only skimmed the post, so I may have missed something, but it seems to me that this post underemphasizes the fact that both Harsanyi's Lottery and LELO imply utilitarianism under plausible assumptions about rationality.
the rationality conditions are a pretty decent model of human behaviour, but they're only approximations. you're right that if the approximation is perfect then aggregativism is mathematically equivalent to utilitarianism [? · GW], which does render some of these advantages/objections moot. but I don't know how close the approximations are (that's an empirical question).
i kinda see aggregativism vs utilitarianism as a bundle of claims of the following form:
- humans aren't perfectly consequentialist, and aggregativism answers the question "how consequentialist should our moral theory be?" with "exactly as consequentialist as self-interested humans are."
- humans have an inaction bias, and aggregativism answers the question "how inaction-biased should our moral theory be?" with "exactly as inaction-biased as self-interested humans are."
- humans are time-discounting, and aggregativism answers the question "how time-discounting should our moral theory be?" with "exactly as time-discounting as self-interested humans are."
- humans are risk-averse, and aggregativism answers the question "how risk-averse should our moral theory be?" with "exactly as risk-averse as self-interested humans are."
- and so on
the purpose of the social zeta function is simply to map social outcomes (the object of our moral attitudes) to personal outcomes (the object the self-interested human's attitudes) so this bundle of claims type-checks.
Also, at least some of the advantages of aggregativism that you mention are easily incorporated into utilitarianism. For example, what is achieved by adopting LELO with exponential time-discounting in Section 2.5.1 can also be achieved by adopting discounted utilitarianism (rather than unweighted total utilitarianism).
yeah that's true, two quick thoughts:
- i suspect exponential time-discounting was added to total utilitarianism because it's a good model of self-interested human behaviour. aggregativism says "let's do this with everything", i.e. we modify utilitarianism in all the ways that we think self-interested humans behave.
- suppose self-interested humans do time-discounting, then LELO would approximate total utilitarianism with discounting in population time, not calendar time. that is, a future generation is discounted by the sum of lifetimes of each preceding generation. (if the calendar time for an event is t, then the population time for the event is the integral of n(τ) from 0 to t, where n(τ) is the population size at time τ. I first heard this concept in this Greaves talk.) if you're gonna adopt discounted utilitarianism, then population-time-discounted utilitarianism makes much more sense to me than calendar-time-discounted utilitarianism, and the fact that LELO gives the right answer here is a case in favour of it.
A final tiny comment: LELO has a long history, going back to at least C.I. Lewis's " An Analysis of Knowledge and Valuation", though the term "LELO" was coined by my colleague Loren Fryxell (Fryxell 2024). It's probably worth adding citations to these.
I mention Loren's paper in the footnote of Part 1 [? · GW]. i'll cite him in part 2 and 3 also, thanks for the reminder.
Replies from: gustav-alexandrie↑ comment by Gustav Alexandrie (gustav-alexandrie) · 2024-06-25T08:07:43.637Z · LW(p) · GW(p)
I appreciate the reply!
"the rationality conditions are pretty decent model of human behaviour, but they're only approximations. you're right that if the approximation is perfect then aggregativism is mathematically equivalent to utilitarianism [? · GW], which does render some of these advantages/objections moot. but I don't know how close the approximations are (that's an empirical question)."
I'm not sure why we should combine Harsanyi's Lottery (or LELO or whatever) with a model of actual human behaviour. Here's a rough sketch of how I am thinking about it: Morality is about what preference ordering we should have. If we should have preference ordering R, then R is rational (morality presumably does not require irrationality). If R is rational, then R satisfies the vNM axioms. Hence, I think it is sufficient that the vNM axioms work as principles of rationality; they don't need to describe actual human behaviour in this context.
Regarding your points about two quick thoughts on time-discounting: yes, I basically agree. However, I also want to note that it is a bit unclear how to ground discounting in LELO, because doing so requires that one specifies the order in which lives are concatenated and I am not sure there is a non-arbitrary way of doing so.
Thanks for engaging!
Replies from: strawberry calm↑ comment by Cleo Nardo (strawberry calm) · 2024-06-25T12:46:33.721Z · LW(p) · GW(p)
If we should have preference ordering R, then R is rational (morality presumably does not require irrationality).
I think human behaviour is straight-up irrational, but I want to specify principles of social choice nonetheless. i.e. the motivation is to resolve carlsmith’s On the limits of idealized values.
now, if human behaviour is irrational (e.g. intransitive, incomplete, nonconsequentialist, imprudent, biased, etc), then my social planner (following LELO, or other aggregative principles) will be similarly irrational. this is pretty rough for aggregativism; I list it as the most severe objection, in section 3.1.
but to the extent that human behaviour is irrational, then the utilitarian principles (total, average, Rawls’ minmax) have a pretty rough time also, because they appeal to a personal utility function to add/average/minimise. idk where they get that if humans are irrational.
maybe you the utilitarian can say: “well, first we apply some idealisation procedure to human behaviour, to remove the irrationalities, and then extract a personal utility function, and then maximise the sum/average/minimum of the personal utility function”
but, if provided with a reasonable idealisation procedure, the aggregativist can play the same move: “well, first we apply the idealisation procedure to human behaviour, to remove the irrationalities, and then run LELO/HL/ROI using that idealised model of human behaviour.” i discuss this move in 3.2, but i’m wary about it. like, how alien is this idealised human? why does it have any moral authority? what if it’s just ‘gone off the rails’ so to speak?
it is a bit unclear how to ground discounting in LELO, because doing so requires that one specifies the order in which lives are concatenated and I am not sure there is a non-arbitrary way of doing so.
macaskill orders the population by birth date. this seems non-arbitrary-ish(?);[1] it gives the right result wrt to our permutation-dependent values; and anything else is subject to egyptologist objections, where to determine whether we should choose future A over B, we need to first check the population density of ancient egypt.
Loren sidesteps the order-dependence of LELO with (imo) an unrealistically strong rationality condition.
- ^
if you’re worried about relativistic effects then use the reference frame of the social planner
↑ comment by Gustav Alexandrie (gustav-alexandrie) · 2024-06-25T22:00:43.087Z · LW(p) · GW(p)
Thanks!
i’m wary about it. like, how alien is this idealised human? why does it have any moral authority?
I don't have great answers to these metaethical questions. Conditional on normative realism, it seems plausible to me that first-order normative views must satisfy the vNM axioms. Conditional on normative antirealism, I agree it is less clear that first-order normative views must satisfy the vNM axioms, but this is just a special case of it being hard to justify any normative views under normative antirealism.
In any case, I suspect that we are close to reaching bedrock in this discussion, so perhaps this is a good place to end the discussion.
↑ comment by MichaelStJules · 2024-06-24T02:50:23.702Z · LW(p) · GW(p)
Harsanyi's theorem has also been generalized in various ways without the rationality axioms; see McCarthy et al., 2020 https://doi.org/10.1016/j.jmateco.2020.01.001. But it still assumes something similar to but weaker than the independence axiom, which in my view is hard to motivate separately.
comment by EJT (ElliottThornley) · 2024-06-25T13:03:23.527Z · LW(p) · GW(p)
Another nice article. Gustav says most of the things that I wanted to say. A couple other things:
- I think LELO with discounting is going to violate Pareto. Suppose that by default Amy is going to be born first with welfare 98 and then Bobby is going to be born with welfare 100. Suppose that you can do something which harms Amy (so her welfare is 97) and harms Bobby (so his welfare is 99). But also suppose that this harming switches the birth order: now Bobby is born first and Amy is born later. Given the right discount-rate, LELO will advocate doing the harming, because it means making good lives happen earlier. Is that right?
- I think a minor reframing of Harsanyi's veil-of-ignorance makes it more compelling as an argument for utilitarianism. Not only is it the case that doing the utilitarian thing maximises the decision-maker's expected welfare behind the veil-of-ignorance, doing the utilitarian thing maximises everyone's expected welfare behind the veil-of-ignorance. So insofar as aggregativism departs from utilitarianism, it means doing what would be worse in expectation for everyone behind a veil-of-ignorance.
↑ comment by Cleo Nardo (strawberry calm) · 2024-06-26T20:08:15.377Z · LW(p) · GW(p)
Is that right?
Yep, Pareto is violated, though how severely it's violated is limited by human psychology.
For example, in your Alice/Bob scenario, would I desire a lifetime of 98 utils then 100 utils over a lifetime with 99 utils then 97 utils? Maybe idk, I don't really understand these abstract numbers very much, which is part of the motivation for replacing them entirely with personal outcomes. But I can certainly imagine I'd take some offer like this, violating pareto. On the plus side, humans are not so imprudent to accept extreme suffering just to reshuffle different experiences in their life.
Secondly, recall that the model of human behaviour is a free variable in the theory. So to ensure higher conformity to pareto, we could…
- Use the behaviour of someone with high delayed gratification.
- Train the model (if it's implemented as a neural network) to increase delayed gratification.
- Remove the permutation-dependence using some idealisation procedure.
But these techniques (1 < 2 < 3) will result in increasingly "alien" optimisers. So there's a trade-off between (1) avoiding human irrationalities and (2) robustness to 'going off the rails'. (See Section 3.1.) I see realistic typical human behaviour on one extreme of the tradeoff, and argmax on the other.
comment by cubefox · 2024-06-24T10:49:20.609Z · LW(p) · GW(p)
Is it right to say that aggregativism is, similar to total and average utilitarianism, incompatible with the procreation asymmetry [? · GW], unlike some forms of person affecting [? · GW] utilitarianism?
Replies from: strawberry calm↑ comment by Cleo Nardo (strawberry calm) · 2024-06-24T15:25:18.915Z · LW(p) · GW(p)
which principles of social justice agree with (i) adding bad lives is bad, but disagree with (ii) adding good lives is good?
- total utilitarianism agrees with both (i) and (ii).
- average utilitarianism can agree with any of these combinations: both (i) and (ii); neither (i) nor (ii); only (i) and not (ii). the combination depends on the existing average utility, because average utilitarianism obliges creating lives above the existing average and forbids creating lives below the existing average.
- Rawls' difference principle (maximise minimum utility) can agree with either of these combinations: neither (i) nor (ii); only (i) and not (ii). this is because adding lives is never good (bc it could never increase minimum utility), and adding bad lives is bad iff those lives are below-minimum.
so you're right that utilitarianism doesn't match those intuitions. none of the three principles discussed reliably endorse (i) and reject (ii).
now consider aggregativism. you'll get asymmetry between (i) and (ii) depending on the social zeta function mapping social outcomes to personal outcomes, and on the model of self-interested human behaviour.
let's examine LELO (i.e. the social zeta function maps a social outcome to the concatenation of all individuals' lives), where our model of self-interested human behaviour is Alice (described below).
suppose Alice expects 80 years of comfortable fulfilling life.
- would she pay to live 85 years instead, with 5 of those years in ecstatic joy? probably.
- would she pay to avoid living 85 years instead, with 5 of those years in horrendous torture? probably.
there’s probably some asymmetry in Alice’s willingness to pay. i think humans are somewhat more misery-averse than joy-seeking. it’s not a 50-50 symmetry, nor a 0-100 asymmetry, maybe a 30-70 asymmetry? idk, this is an empirical psychological fact.
anyway, the aggregative principle (generated by LELO+Alice) says that the social planner should have the same attitudes towards social outcomes that Alice has towards the concatenation of lives in those social outcomes. so the social planner would pay to add joyful lives, and pay to avoid adding miserable lives, and there should be exactly as much willingness-to-pay asymmetry as Alice (our self-interested human) exhibits.