An Introduction to Evidential Decision Theory
post by Babić · 2025-02-02
Contents: Introduction · Decision Theory · Decision Problems · Possible Worlds · Subjective Probability · An Expected Utility · References
Introduction
This blog post serves as an introduction to Evidential Decision Theory (EDT), designed for readers with no prior background. While much has been written about EDT’s application to well-known problems like the Smoking Lesion problem and Newcomb-like problems, we were unable to find a clear and formal introduction. As such, this post aims to fill that gap.
Our motivation for writing this post stems from the work of Nate Soares and Benja Fallenstein in their papers, Toward Idealized Decision Theory (2015) [8] and Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda (2017) [9]. These works suggest the need for the study of decision theory in aligning smarter-than-human agents with human interests, and address the limitations of existing formulations, such as EDT and Causal Decision Theory (CDT), in defining idealized decision procedures.
This post does not take a stance on the need for an idealized decision procedure or the tractability of one. Rather, we hope it provides a clear introduction to EDT and perhaps encourages readers to explore these topics further.
Decision Theory
Decision theory is a rich interdisciplinary field in which philosophers, psychologists, economists, computer scientists, statisticians, and mathematicians study decision-making. While decision theorists disagree on a lot, they do share some foundational distinctions. An important distinction is between normative and descriptive decision theory.
Descriptive decision theory concerns how people actually make decisions.
Normative decision theory, on the other hand, concerns how agents ought to make decisions. We will focus on the latter.
EDT is a branch of normative decision theory first introduced by Richard C. Jeffrey in The Logic of Decision (1965) [4]. In this work, Jeffrey argues that an agent should evaluate actions based on their evidential implications—specifically, one should choose the action with the highest news value. In the following sections, we will unpack exactly what this means, drawing on both Jeffrey’s work and Ahmed’s modern analysis in Evidential Decision Theory (2021) [1].
Decision Problems
To take a step back, let's formalize what we mean by an agent making a decision.
For the remainder of this post, we will primarily follow Jeffrey's formalization of decision problems.[1] In this framework, an agent is a decision maker who is faced with a decision problem. Roughly speaking, the agent must choose an action whose outcome depends upon the state of the world.[2] To specify this decision problem, we must first define a set of actions, a set of states, and a set of outcomes. These are all sets of propositions – claims about how things are. Propositions represent the world as being some way; they are true if the world is that way, and are false otherwise. For simplicity, we will assume that these sets of propositions are countable.
First up, we will consider a set of actions $A$.
We take an action $a \in A$ to be a proposition that describes aspects of the world over which the agent exercises direct control. That is, propositions the agent can make true or false as they please. The action is true if the agent performs the action, and false otherwise. For example, an action may be "You eat the ice cream," and this is true if you indeed go eat the ice cream. To ensure that the decision problem is well-posed, we require that exactly one action is actually performed. That is, the actions in $A$ are mutually exclusive and jointly exhaustive. For example, we may take the set of actions to be
$$A = \{\text{"You eat the ice cream"}, \text{"You do not eat the ice cream"}\}.$$
Next, we will consider a set of states $S$.
A state $s \in S$ is a proposition that describes aspects of the world over which the agent exercises no direct control. By 'no direct control', we mean that the probability of the state is independent of the action performed. And, by 'probability of the state', we mean a subjective probability representing the agent's belief that the state is true. So, we can think of a state as a description of the world that the agent can have a belief in the truth of, but does not believe they can influence. For example, a state may be the proposition "The ice cream is poisonous." To again ensure that the decision problem is well-posed, we require that exactly one state is actually true. That is, the states in $S$ are mutually exclusive and jointly exhaustive. For example, we may take the set of states to be
$$S = \{\text{"The ice cream is poisonous"}, \text{"The ice cream is not poisonous"}\}.$$
Finally, we will consider the set of outcomes $O$. We define an outcome to be the proposition $a \wedge s$, fully describing the world in which the action $a$ is performed and the state $s$ is true. Accordingly,
$$O = \{a \wedge s : a \in A, s \in S\}.$$
Notice that since exactly one action in $A$ and exactly one state in $S$ are true, it follows that exactly one outcome in $O$ is true.
Now, given the sets $A$, $S$, and $O$, we can define the base algebra $\mathcal{B}$ to be the smallest set of propositions such that $A \cup S \cup O \subseteq \mathcal{B}$ and $\mathcal{B}$ is closed under negation and countable disjunction. Finally, we have some notion of what we meant by 'an agent making a decision': we mean that we have a decision problem $(A, S, O)$ that a decision maker must face by choosing an action $a \in A$, despite their uncertainty about which state $s \in S$ is true, to bring about some outcome $o \in O$.
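To make the bookkeeping concrete, here is a minimal Python sketch of the ice-cream decision problem above. The variable names and string representations are our own illustrative choices, not notation from Jeffrey.

```python
from itertools import product

# A sketch of the ice-cream decision problem; the string
# representations of the propositions are illustrative.
actions = ["you eat the ice cream", "you do not eat the ice cream"]
states = ["the ice cream is poisonous", "the ice cream is not poisonous"]

# O = {a AND s : a in A, s in S}: each outcome conjoins one action with one state.
outcomes = [(a, s) for a, s in product(actions, states)]

for a, s in outcomes:
    print(f"({a}) AND ({s})")
```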
Possible Worlds
We may now discuss the above decision problem in terms of possible worlds, through which Jeffrey gives new meaning to the idea of a proposition.
A possible world is a logically complete and consistent specification of all the ways things might be—a specification of all relevant facts about reality. Perhaps the most intuitive way to think about possible worlds is as alternate realities, where each world represents a different way the universe could be.
Note that the metaphysical status of possible worlds has been a subject of significant debate. Modal realists, such as David Lewis, argue that possible worlds are literally existing alternate realities [7], while others, like Robert Stalnaker, contend that they are merely abstract representations [10]. In any case, within the context of EDT, possible worlds are a practical tool; as such, we will not concern ourselves with these debates.
Within our decision problem, noting that the outcomes in $O$ are atomic elements of $\mathcal{B}$ in the sense that all propositions in $\mathcal{B}$ are uniquely expressible as disjunctions of them, we can interpret the set $\Omega = O$ as the set of all possible worlds, with each element $\omega \in \Omega$ a particular possible world. Further, we note that there exists exactly one actual world, denoted $\omega^*$, which we are yet to realize.
Since we can uniquely express each element of $\mathcal{B}$ as a disjunction of elements of $\Omega$, we can represent each proposition $p \in \mathcal{B}$, which is equivalent to the proposition $\bigvee_{\omega \in W} \omega$ for some $W \subseteq \Omega$, as the subset $W$. Let us denote this set as $W_p$. Then, we have that our proposition $p$ is true if and only if the actual world $\omega^*$ is indeed in the set $W_p$.
In this way, the set of possible worlds $\Omega$ is often treated as a sample space in probability theory, where propositions (corresponding to propositions about actions, states, and outcomes) are represented as regions within $\Omega$, and logical operations on propositions correspond to set-theoretic operations on these sets. For instance, given propositions $p, q \in \mathcal{B}$, we have that $p \wedge q$ can be represented as $W_p \cap W_q$, and $\neg p$ can be represented as $\Omega \setminus W_p$.
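The correspondence between logical and set-theoretic operations is easy to see in code. Below is a sketch assuming the four-world ice-cream example; the world labels are illustrative shorthand of ours.

```python
# Propositions as sets of possible worlds, assuming the
# four-world ice-cream example; world labels are illustrative.
omega = {"eat-poisonous", "eat-safe", "skip-poisonous", "skip-safe"}

eat = {"eat-poisonous", "eat-safe"}              # "You eat the ice cream"
poisonous = {"eat-poisonous", "skip-poisonous"}  # "The ice cream is poisonous"

conjunction = eat & poisonous   # p AND q  corresponds to  W_p intersect W_q
disjunction = eat | poisonous   # p OR q   corresponds to  W_p union W_q
negation = omega - eat          # NOT p    corresponds to  omega minus W_p

print(conjunction)  # {'eat-poisonous'}
```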
Subjective Probability
In the context of uncertainty, subjective probability, or credence, provides a way to quantify confidence in the truth of propositions. Unlike objective probabilities derived from frequencies, subjective probabilities reflect an agent's degree of belief based on their knowledge and evidence; this is central to the Bayesian interpretation of probability.
Figure 1: A visual representation of the set of possible worlds $\Omega$ and the set $W_p$ representing proposition $p$.
Intuitively, we can visualize confidence in a proposition as the area of the corresponding region, as in Figure 1. If you know that exactly one outcome in $\Omega$ is the actual world $\omega^*$, and you are equally confident that any given outcome is the actual world, then your subjective probability in any proposition is the proportion of the area of $\Omega$ that the set representing it occupies. Through this lens, we define the subjective probability function on a $\sigma$-algebra $\mathcal{F}$, which is a set of sets, each representing a proposition, satisfying the following properties:
- $\Omega \in \mathcal{F}$.
- If $W \in \mathcal{F}$, then $\Omega \setminus W \in \mathcal{F}$ (closed under complementation).
- If $W_1, W_2, \ldots \in \mathcal{F}$, then $\bigcup_{i=1}^{\infty} W_i \in \mathcal{F}$ (closed under countable unions).
Our subjective probability function $P : \mathcal{F} \to [0, 1]$ may then assign a probability to every proposition, adhering to the Kolmogorov axioms:
1. $P(W) \geq 0$ for all $W \in \mathcal{F}$.
2. $P(\Omega) = 1$.
3. $P\left(\bigcup_{i=1}^{\infty} W_i\right) = \sum_{i=1}^{\infty} P(W_i)$ for any countable collection of pairwise disjoint sets $W_1, W_2, \ldots \in \mathcal{F}$.
Such probability theory makes sense in the context of propositions and possible worlds. Axiom 1 states that your confidence in any proposition being true in the actual world is non-negative, while Axiom 2 states that you are certain of the existence of an actual world. Finally, Axiom 3 states that your confidence in any countable number of mutually exclusive propositions being true is the sum of your confidence in each.
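A subjective probability function over finitely many worlds is simple to write down. The sketch below continues the ice-cream example; the credence values are illustrative assumptions, chosen only to satisfy the axioms above.

```python
# A subjective probability function over the four ice-cream worlds;
# the credence values are illustrative assumptions.
credence = {
    "eat-poisonous": 0.1,
    "eat-safe": 0.4,
    "skip-poisonous": 0.1,
    "skip-safe": 0.4,
}  # non-negative and summing to 1, as the Kolmogorov axioms require

def P(W):
    """Probability of the proposition represented by the set of worlds W."""
    return sum(credence[w] for w in W)

eat = {"eat-poisonous", "eat-safe"}
poisonous = {"eat-poisonous", "skip-poisonous"}
print(P(eat), P(poisonous), P(eat | poisonous))  # 0.5 0.2 0.6
```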
Figure 2: A visual representation of possible worlds $\Omega$ and the sets representing propositions $p$ and $q$.
Now, what happens when you learn that a proposition is true? That is, suppose that some previously unknown proposition $q$ is revealed to be true. How would such evidence update your subjective probabilities? EDT takes the Bayesian approach, where we update our probabilities using Bayes' rule:
$$P(p \mid q) = \frac{P(p \wedge q)}{P(q)}, \quad \text{provided } P(q) > 0.$$
That is, your updated confidence in $p$ given $q$ is the proportion of the area of proposition $q$ that lies in the area of proposition $p$ in Figure 2. One may also note that the restriction that the learned proposition $q$ have non-zero probability makes sense; if $q$ is impossible, then our updated confidence in $p$ should not be defined.
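In code, conditionalization is a one-line ratio of the function $P$ above. This sketch reuses the same illustrative credences; note that it also demonstrates the independence of states from actions required by our definition of a state.

```python
# Bayesian updating over the same four worlds; credences are the
# same illustrative values as before.
credence = {"eat-poisonous": 0.1, "eat-safe": 0.4,
            "skip-poisonous": 0.1, "skip-safe": 0.4}

def P(W):
    return sum(credence[w] for w in W)

def P_given(W_p, W_q):
    """P(p | q) = P(p AND q) / P(q); undefined when P(q) = 0."""
    if P(W_q) == 0:
        raise ValueError("cannot condition on a zero-probability proposition")
    return P(W_p & W_q) / P(W_q)

eat = {"eat-poisonous", "eat-safe"}
poisonous = {"eat-poisonous", "skip-poisonous"}
# P(poisonous | eat) equals the prior P(poisonous) = 0.2, reflecting
# that states are independent of actions in a well-posed decision problem.
print(P_given(poisonous, eat))  # 0.2
```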
It may seem silly to step through EDT in this way, but formalizing it in this manner allows one to understand the philosophy under which EDT operates—namely, taking the action that ‘you most want to learn that you will do.’ From now on, for the sake of brevity, we will switch from thinking about propositions in terms of sets and set-theoretic operations to just thinking about propositions and logical operations, i.e., with the understanding that by $P(p)$ we mean $P(W_p)$.
An Expected Utility
Recall that normative decision theory concerns how agents ought to make decisions. And, we have just ironed out what we mean by an agent making a decision. The elephant left in the room is what we mean by ought – a lot has been swept under this wee word!
EDT's interpretation of this term is grounded in Expected Utility Theory (EUT), the orthodox normative decision theory. Generally, it states that an agent, equipped with a utility function $u : O \to \mathbb{R}$ defined on the set of outcomes, which assigns subjective values reflecting their preferences, ought to choose the action that maximizes expected utility.
Expected utility theory may seem arbitrary, and like all interpretations of ‘ought’, it is.
But, there is some serious motivation behind it. There are two main arguments.
The first argument is based on the weak and strong laws of large numbers, as presented by Feller [3]. These laws imply that, roughly speaking, in the long run, given a sequence of independent and identical decision problems, the average amount of utility gained per decision is overwhelmingly likely to be close to the expected utility of an individual decision. Accordingly, then, we should maximize the expected utility of each individual decision. Of course, one could argue in return that agents are rarely faced with sequences of independent and identical decision problems.
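A toy simulation makes the long-run argument vivid. The gamble below, paying 10 with probability 0.3 and −2 otherwise, is an assumed example of ours, not one from Feller.

```python
import random

# A toy simulation of the long-run argument, under assumed numbers:
# a gamble that pays 10 with probability 0.3 and -2 otherwise.
p_win, u_win, u_lose = 0.3, 10.0, -2.0
expected = p_win * u_win + (1 - p_win) * u_lose  # 1.6

random.seed(0)
n = 100_000
total = sum(u_win if random.random() < p_win else u_lose for _ in range(n))
print(f"expected utility: {expected}, long-run average: {total / n:.3f}")
```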
The second argument, which is much more compelling, is through representation theorems. Generally, representation theorems aim to show that given a framework for decision problems and a set of axioms about an agent's preferences, the agent can be represented as having a probability function and a utility function such that they prefer actions with higher expected utility (defined in terms of this probability and utility function). These axioms about an agent's preferences are often accepted as rational.
So, one can argue that if an agent does not prefer actions with higher expected utility, then their preferences violate one of the axioms, and hence are irrational.
One of the simpler representation theorems is the Von Neumann–Morgenstern (VNM) utility theorem, which is central to EUT. One may also apply the VNM utility theorem to EDT, where given a distribution of subjective probabilities over outcomes and an agent's preferences over such distributions, we can guarantee the existence of a utility function that respects those preferences.[3]
Let $Z = \{z_1, z_2, \ldots\}$ be a partition of $\Omega$, and let a distribution over $Z$, denoted by $L = (p_1, p_2, \ldots)$, be given by $P(z_i) = p_i$ such that $\sum_i p_i = 1$. Let $\mathcal{L}$ be the set of all probability distributions over the set of possible outcomes $Z$. Suppose that there exists a preference relation $\prec$ on $\mathcal{L}$, which holds between two distributions as $L \prec M$ if the agent prefers learning $M$ to $L$, and which satisfies the following axioms:
- (Complete) For all $L, M \in \mathcal{L}$, we have that $L \prec M$, $M \prec L$, or $L \sim M$, where $\sim$ denotes indifference.
- (Irreflexive) For all $L \in \mathcal{L}$, not $L \prec L$.
- (Transitive) For all $L, M, N \in \mathcal{L}$, if $L \prec M$ and $M \prec N$, then $L \prec N$.
- (Independent) For all $L, M \in \mathcal{L}$, if $L \prec M$, then for any $N \in \mathcal{L}$ and any $\alpha \in (0, 1]$, $\alpha L + (1 - \alpha) N \prec \alpha M + (1 - \alpha) N$.
- (Continuous) For all $L, M, N \in \mathcal{L}$, if $L \prec M \prec N$, then there exist $\alpha, \beta \in (0, 1)$ such that $\alpha L + (1 - \alpha) N \prec M \prec \beta L + (1 - \beta) N$.
Then there exists a utility function $u : Z \to \mathbb{R}$ such that
$$L \prec M \iff \sum_i p_i \, u(z_i) < \sum_i q_i \, u(z_i),$$
where $M = (q_1, q_2, \ldots)$.
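Once such a $u$ exists, comparing distributions reduces to comparing expected utilities. Here is a small sketch; the partition elements and utility values are illustrative assumptions.

```python
# The VNM conclusion in miniature: with u in hand, preferences over
# distributions reduce to expected-utility comparisons. The partition
# elements and utilities below are illustrative assumptions.
u = {"z1": 0.0, "z2": 1.0, "z3": 5.0}

def expected_utility(L):
    """L maps each element of the partition Z to its probability."""
    return sum(p * u[z] for z, p in L.items())

L = {"z1": 0.5, "z2": 0.5, "z3": 0.0}
M = {"z1": 0.0, "z2": 0.2, "z3": 0.8}
print(expected_utility(L) < expected_utility(M))  # True: the agent prefers M to L
```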
We may extend the VNM utility theorem to arbitrary propositions. Suppose that we have a decision problem $(A, S, O)$, noting that $A \subseteq \mathcal{B}$. Consider the set of distributions
$$\mathcal{L}_A = \{L_a : a \in A\},$$
where $L_a$ contains your subjective probabilities over $Z$ conditional on $a$, and suppose that you impose a preference relation on $\mathcal{L}_A$ that satisfies the VNM axioms. One can think of $Z$ as a value system—theoretically it encapsulates everything you care about, in the sense that once you have learnt which element of $Z$ is true you have learnt everything you care about. For instance, suppose you care about two things, $z_1$ and $z_2$, so that $Z = \{z_1, z_2\}$, with actions $A = \{a_1, a_2\}$.
You know that exactly one of $z_1, z_2$ is true. What matters to you is how learning that you took action $a_i$ redistributes your subjective probabilities over $Z$ via Bayesian updating, for each $z \in Z$, from $P(z)$ to $P(z \mid a_i)$.
That is, you should be just as pleased to learn $a$ as to learn the updated distribution $L_a$ over your value system $Z$. Thus, the news value (or expected utility) of learning that you took an action $a \in A$ such that $P(a) > 0$ is given by
$$V(a) = \sum_{z \in Z} P(z \mid a) \, u(z),$$
or equivalently,
$$V(a) = \sum_{z \in Z} \frac{P(z \wedge a)}{P(a)} \, u(z).$$
Thus, EDT prescribes the action $a^* \in A$ that maximizes your news value $V$. One may notice that a core innovation of EDT is that everything—acts, states, outcomes—is the same type of object, namely, propositions about the world, which are largely treated the same. A side effect of such a structure in decision-making is the tendency for EDT to "manage the news"—but we will leave that as a discussion for later.
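Putting the pieces together, here is a sketch of EDT's prescription on the ice-cream problem, taking the value partition $Z$ to be the outcomes themselves. The credences and utilities are illustrative assumptions of ours, not values from the post.

```python
# EDT's prescription on the ice-cream problem; credences and
# utilities are illustrative assumptions.
credence = {("eat", "poisonous"): 0.1, ("eat", "safe"): 0.4,
            ("skip", "poisonous"): 0.1, ("skip", "safe"): 0.4}
utility = {("eat", "poisonous"): -100.0, ("eat", "safe"): 10.0,
           ("skip", "poisonous"): 0.0, ("skip", "safe"): 0.0}

def news_value(a):
    """V(a) = sum over z of P(z | a) * u(z)."""
    P_a = sum(p for (act, _), p in credence.items() if act == a)
    return sum((p / P_a) * utility[(act, s)]
               for (act, s), p in credence.items() if act == a)

values = {a: news_value(a) for a in ("eat", "skip")}
print(values, "-> EDT prescribes:", max(values, key=values.get))
```

With these numbers, $V(\text{eat}) = 0.2 \cdot (-100) + 0.8 \cdot 10 = -12$ and $V(\text{skip}) = 0$, so EDT prescribes not eating the ice cream.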
References
[1] Arif Ahmed. Evidential Decision Theory. Cambridge University Press, 2021.
[2] Ethan D. Bolker. Functions resembling quotients of measures. Annals of Mathematics, Second Series, 85(3):451–460, 1967.
[3] William Feller. An Introduction to Probability Theory and its Applications, Volume 1. John Wiley & Sons, New York, 1968.
[4] Richard C. Jeffrey. The Logic of Decision. University of Chicago Press, 1965.
[5] Richard C. Jeffrey. The Logic of Decision. University of Chicago Press, Chicago, IL, 2nd edition, 1983.
[6] James M. Joyce. The Foundations of Causal Decision Theory. Cambridge University Press, 1999.
[7] David Lewis. On the Plurality of Worlds. Basil Blackwell, Oxford, 1986.
[8] Nate Soares and Benja Fallenstein. Toward Idealized Decision Theory. arXiv preprint arXiv:1507.01986, 2015.
[9] Nate Soares and Benja Fallenstein. Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda. In Vincent C. Müller, editor, The Technological Singularity: Managing the Journey, pages 103–125. Springer, 2017.
[10] Robert Stalnaker. Possible worlds. Noûs, 10(1):65–75, 1976.
- ^ Although we use our preferred notation from Joyce's The Foundations of Causal Decision Theory (1999) [6].
- ^ Jeffrey calls these acts, conditions, and consequences, respectively. We stick with actions, states, and outcomes, as these terms seem more orthodox.
- ^ Note that we omit discussion of the Bolker–Jeffrey representation theorem, as it is a lengthy generalisation of the VNM utility theorem. For further information, refer to Functions Resembling Quotients of Measures by Bolker (1967) [2] and Jeffrey's seminal The Logic of Decision (1983) [5].
1 comment
comment by jonomyster · 2025-02-03
Thanks for writing this up! I was wondering how this formalization works for Newcomb's problem. (I'll take box A to be the transparent box containing a thousand dollars, and box B to be the opaque box containing a million dollars or nothing.)
I would like to say that the actions are $A = \{\text{one-box}, \text{two-box}\}$, the states are $S = \{\text{box B contains the million}, \text{box B is empty}\}$, and the outcomes are the four different ways of combining the actions and states.
But it seems like I've violated the definition of a state given in the post:
By 'no direct control', we mean that the probability of the state is independent of the action performed.
After all, the probability of the state certainly depends on the action of the agent, in the sense that $P(\text{box B contains the million} \mid \text{one-box}) \neq P(\text{box B contains the million} \mid \text{two-box})$.