An Introduction to Evidential Decision Theory
post by Babić · 2025-02-02
Contents: Introduction · Decision Theory · Decision Problems · Possible Worlds · Subjective Probability · An Expected Utility · References
Introduction
This blog post serves as an introduction to Evidential Decision Theory (EDT), designed for readers with no prior background. While much has been written about EDT’s application to well-known problems like the Smoking Lesion problem and Newcomb-like problems, we were unable to find a clear and formal introduction. As such, this post aims to fill that gap.
Our motivation for writing this post stems from the work of Nate Soares and Benja Fallenstein in their papers, Toward Idealized Decision Theory (2015) [8] and Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda (2017) [9]. These works suggest the need for the study of decision theory in aligning smarter-than-human agents with human interests, and address the limitations of existing formulations, such as EDT and Causal Decision Theory (CDT), in defining idealized decision procedures.
This post does not take a stance on the need for an idealized decision procedure or the tractability of one. Rather, we hope it provides a clear introduction to EDT and perhaps encourages readers to explore these topics further.
Decision Theory
Decision theory is a rich interdisciplinary field in which philosophers, psychologists, economists, computer scientists, statisticians, and mathematicians study decision-making. While decision theorists disagree on a lot, they do share some foundational distinctions. An important distinction is between normative and descriptive decision theory.
Descriptive decision theory concerns how people actually make decisions.
Normative decision theory, on the other hand, concerns how agents ought to make decisions. We will focus on the latter.
EDT is a branch of normative decision theory first introduced by Richard C. Jeffrey in The Logic of Decision (1965) [4]. In this work, Jeffrey argues that an agent should evaluate actions based on their evidential implications—specifically, one should choose the action with the highest news value. In the following sections, we will unpack exactly what this means, drawing on both Jeffrey’s work and Ahmed’s modern analysis in Evidential Decision Theory (2021) [1].
Decision Problems
To take a step back, let's formalize what we mean by an agent making a decision.
For the remainder of this post, we will primarily follow Jeffrey's formalization of decision problems.[1] In this framework, an agent is a decision maker who is faced with a decision problem. Roughly speaking, the agent must choose an action whose outcome depends upon the state of the world.[2] To specify this decision problem, we must first define a set of actions, a set of states, and a set of outcomes. These are all sets of propositions – claims about how things are. Propositions represent the world as being some way; they are true if the world is that way, and are false otherwise. For simplicity, we will assume that these sets of propositions are countable.
First up, we will consider a set of actions $A$.
We take an action $a \in A$ to be a proposition that describes aspects of the world over which the agent exercises direct control. That is, propositions the agent can make true or false as they please. The action is true if the agent performs the action, and false otherwise. For example, an action may be "You eat the ice cream," and this is true if you indeed go eat the ice cream. To ensure that the decision problem is well-posed, we require that exactly one action is actually performed. That is, the actions in $A$ are mutually exclusive and jointly exhaustive. For example, we may take the set of actions to be
$$A = \{\text{"You eat the ice cream"}, \text{"You do not eat the ice cream"}\}.$$
Next, we will consider a set of states $S$.
A state $s \in S$ is a proposition that describes aspects of the world over which the agent exercises no direct control. By 'no direct control', we mean that the probability of the state is independent of the action performed. And, by 'probability of the state', we mean a subjective probability representing the agent's belief that the state is true. So, we can think of a state as a description of the world that the agent can have a belief in the truth of, but does not believe they can influence. For example, a state may be the proposition "The ice cream is poisonous." To again ensure that the decision problem is well-posed, we require that exactly one state is actually true. That is, the states in $S$ are mutually exclusive and jointly exhaustive. For example, we may take the set of states to be
$$S = \{\text{"The ice cream is poisonous"}, \text{"The ice cream is not poisonous"}\}.$$
Finally, we will consider the set of outcomes $O$. We define an outcome to be the proposition $a \wedge s$, fully describing the world in which the action $a$ is performed and the state $s$ is true. Accordingly,
$$O = \{a \wedge s : a \in A, s \in S\}.$$
Notice that since exactly one action in $A$ and exactly one state in $S$ are true, it follows that exactly one outcome in $O$ is true.
Now, given the sets $A$, $S$, and $O$, we can define the base algebra $\mathcal{B}$ to be the smallest set of propositions such that $A \cup S \cup O \subseteq \mathcal{B}$ and $\mathcal{B}$ is closed under negation and countable disjunction. Finally, we have some notion of what we meant by 'an agent making a decision': we mean that we have a decision problem $(A, S, O)$ that a decision maker must face by choosing an action $a \in A$, despite their uncertainty about which state $s \in S$ is true, to bring about some outcome $o \in O$.
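To make the bookkeeping concrete, here is a minimal Python sketch of the ice-cream decision problem above. The variable names and string representations are our own illustrative choices, not notation from Jeffrey.

```python
from itertools import product

# A sketch of the ice-cream decision problem; the string
# representations of the propositions are illustrative.
actions = ["you eat the ice cream", "you do not eat the ice cream"]
states = ["the ice cream is poisonous", "the ice cream is not poisonous"]

# O = {a AND s : a in A, s in S}: each outcome conjoins one action with one state.
outcomes = [(a, s) for a, s in product(actions, states)]

for a, s in outcomes:
    print(f"({a}) AND ({s})")
```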
Possible Worlds
We may now discuss the above decision problem in terms of possible worlds, through which Jeffrey gives new meaning to the idea of a proposition.
A possible world is a logically complete and consistent specification of all the ways things might be—a specification of all relevant facts about reality. Perhaps the most intuitive way to think about possible worlds is as alternate realities, where each world represents a different way the universe could be.
Note that the metaphysical status of possible worlds has been a subject of significant debate. Modal realists, such as David Lewis, argue that possible worlds are literally existing alternate realities [7], while others, like Robert Stalnaker, contend that they are merely abstract representations [10]. In any case, within the context of EDT, possible worlds are a practical tool; as such, we will not concern ourselves with these debates.
Within our decision problem, noting that the outcomes in $O$ are atomic elements of $\mathcal{B}$ in the sense that all propositions in $\mathcal{B}$ are uniquely expressible as disjunctions of them, we can interpret the set $\Omega = O$ as the set of all possible worlds, with each element $\omega \in \Omega$ a particular possible world. Further, we note that there exists exactly one actual world, denoted $\omega^*$, which we are yet to realize.
Since we can uniquely express each element of $\mathcal{B}$ as a disjunction of elements of $\Omega$, we can represent each proposition $p \in \mathcal{B}$, which is equivalent to the proposition $\bigvee_{\omega \in W} \omega$ for some $W \subseteq \Omega$, as the subset $W$. Let us denote this set as $W_p$. Then, we have that our proposition $p$ is true if and only if the actual world $\omega^*$ is indeed in the set $W_p$.
In this way, the set of possible worlds $\Omega$ is often treated as a sample space in probability theory, where propositions (corresponding to propositions about actions, states, and outcomes) are represented as regions within $\Omega$, and logical operations on propositions correspond to set-theoretic operations on these sets. For instance, given propositions $p, q \in \mathcal{B}$, we have that $p \wedge q$ can be represented as $W_p \cap W_q$, and $\neg p$ can be represented as $\Omega \setminus W_p$.
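The correspondence between logical and set-theoretic operations is easy to see in code. Below is a sketch assuming the four-world ice-cream example; the world labels are illustrative shorthand of ours.

```python
# Propositions as sets of possible worlds, assuming the
# four-world ice-cream example; world labels are illustrative.
omega = {"eat-poisonous", "eat-safe", "skip-poisonous", "skip-safe"}

eat = {"eat-poisonous", "eat-safe"}              # "You eat the ice cream"
poisonous = {"eat-poisonous", "skip-poisonous"}  # "The ice cream is poisonous"

conjunction = eat & poisonous   # p AND q  corresponds to  W_p intersect W_q
disjunction = eat | poisonous   # p OR q   corresponds to  W_p union W_q
negation = omega - eat          # NOT p    corresponds to  omega minus W_p

print(conjunction)  # {'eat-poisonous'}
```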
Subjective Probability
In the context of uncertainty, subjective probability, or credence, provides a way to quantify confidence in the truth of propositions. Unlike objective probabilities derived from frequencies, subjective probabilities reflect an agent's degree of belief based on their knowledge and evidence; this is central to the Bayesian interpretation of probability.
Figure 1: A visual representation of the set of possible worlds $\Omega$ and the set $W_p$ representing proposition $p$.
Intuitively, we can visualize confidence in a proposition as the area of the corresponding region, as in Figure 1. If you know that exactly one outcome in $\Omega$ is the actual world $\omega^*$, and you are equally confident that any given outcome is the actual world, then your subjective probability in any proposition is the proportion of the area of $\Omega$ that the set representing it occupies. Through this lens, we define the subjective probability function on a $\sigma$-algebra $\mathcal{F}$, which is a set of sets, each representing a proposition, satisfying the following properties:
- $\Omega \in \mathcal{F}$.
- If $W \in \mathcal{F}$, then $\Omega \setminus W \in \mathcal{F}$ (closed under complementation).
- If $W_1, W_2, \ldots \in \mathcal{F}$, then $\bigcup_{i=1}^{\infty} W_i \in \mathcal{F}$ (closed under countable unions).
Our subjective probability function $P : \mathcal{F} \to [0, 1]$ may then assign a probability to every proposition, adhering to the Kolmogorov axioms:
1. $P(W) \geq 0$ for all $W \in \mathcal{F}$.
2. $P(\Omega) = 1$.
3. $P\left(\bigcup_{i=1}^{\infty} W_i\right) = \sum_{i=1}^{\infty} P(W_i)$ for any countable collection of pairwise disjoint sets $W_1, W_2, \ldots \in \mathcal{F}$.
Such probability theory makes sense in the context of propositions and possible worlds. Axiom 1 states that your confidence in any proposition being true in the actual world is non-negative, while Axiom 2 states that you are certain of the existence of an actual world. Finally, Axiom 3 states that your confidence in any countable number of mutually exclusive propositions being true is the sum of your confidence in each.
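A subjective probability function over finitely many worlds is simple to write down. The sketch below continues the ice-cream example; the credence values are illustrative assumptions, chosen only to satisfy the axioms above.

```python
# A subjective probability function over the four ice-cream worlds;
# the credence values are illustrative assumptions.
credence = {
    "eat-poisonous": 0.1,
    "eat-safe": 0.4,
    "skip-poisonous": 0.1,
    "skip-safe": 0.4,
}  # non-negative and summing to 1, as the Kolmogorov axioms require

def P(W):
    """Probability of the proposition represented by the set of worlds W."""
    return sum(credence[w] for w in W)

eat = {"eat-poisonous", "eat-safe"}
poisonous = {"eat-poisonous", "skip-poisonous"}
print(P(eat), P(poisonous), P(eat | poisonous))  # 0.5 0.2 0.6
```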
Figure 2: A visual representation of possible worlds $\Omega$ and the sets representing propositions $p$ and $q$.
Now, what happens when you learn that a proposition is true? That is, suppose that some previously unknown proposition $q$ is revealed to be true. How would such evidence update your subjective probabilities? EDT takes the Bayesian approach, where we update our probabilities using Bayes' rule:
$$P(p \mid q) = \frac{P(p \wedge q)}{P(q)}, \quad \text{provided } P(q) > 0.$$
That is, your updated confidence in $p$ given $q$ is the proportion of the area of proposition $q$ that lies in the area of proposition $p$ in Figure 2. One may also note that the restriction that the learned proposition $q$ have non-zero probability makes sense; if $q$ is impossible, then our updated confidence in $p$ should not be defined.
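In code, conditionalization is a one-line ratio of the function $P$ above. This sketch reuses the same illustrative credences; note that it also demonstrates the independence of states from actions required by our definition of a state.

```python
# Bayesian updating over the same four worlds; credences are the
# same illustrative values as before.
credence = {"eat-poisonous": 0.1, "eat-safe": 0.4,
            "skip-poisonous": 0.1, "skip-safe": 0.4}

def P(W):
    return sum(credence[w] for w in W)

def P_given(W_p, W_q):
    """P(p | q) = P(p AND q) / P(q); undefined when P(q) = 0."""
    if P(W_q) == 0:
        raise ValueError("cannot condition on a zero-probability proposition")
    return P(W_p & W_q) / P(W_q)

eat = {"eat-poisonous", "eat-safe"}
poisonous = {"eat-poisonous", "skip-poisonous"}
# P(poisonous | eat) equals the prior P(poisonous) = 0.2, reflecting
# that states are independent of actions in a well-posed decision problem.
print(P_given(poisonous, eat))  # 0.2
```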
It may seem silly to step through EDT in this way, but formalizing it in this manner allows one to understand the philosophy under which EDT operates—namely, taking the action that ‘you most want to learn that you will do.’ From now on, for the sake of brevity, we will switch from thinking about propositions in terms of sets and set-theoretic operations to just thinking about propositions and logical operations, i.e., with the understanding that by $P(p)$ we mean $P(W_p)$.
An Expected Utility
Recall that normative decision theory concerns how agents ought to make decisions. And, we have just ironed out what we mean by an agent making a decision. The elephant left in the room is what we mean by ought – a lot has been swept under this wee word!
EDT's interpretation of this term is grounded in Expected Utility Theory (EUT), the orthodox normative decision theory. Generally, it states that an agent, equipped with a utility function $u : O \to \mathbb{R}$ defined on the set of outcomes, which assigns subjective values reflecting their preferences, ought to choose the action that maximizes expected utility.
Expected utility theory may seem arbitrary, and like all interpretations of ‘ought’, it is.
But, there is some serious motivation behind it. There are two main arguments.
The first argument is based on the weak and strong laws of large numbers, as presented by Feller [3]. These laws imply that, roughly speaking, in the long run, given a sequence of independent and identical decision problems, the average amount of utility gained per decision is overwhelmingly likely to be close to the expected utility of an individual decision. Accordingly, then, we should maximize the expected utility of each individual decision. Of course, one could argue in return that agents are rarely faced with sequences of independent and identical decision problems.
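A toy simulation makes the long-run argument vivid. The gamble below, paying 10 with probability 0.3 and −2 otherwise, is an assumed example of ours, not one from Feller.

```python
import random

# A toy simulation of the long-run argument, under assumed numbers:
# a gamble that pays 10 with probability 0.3 and -2 otherwise.
p_win, u_win, u_lose = 0.3, 10.0, -2.0
expected = p_win * u_win + (1 - p_win) * u_lose  # 1.6

random.seed(0)
n = 100_000
total = sum(u_win if random.random() < p_win else u_lose for _ in range(n))
print(f"expected utility: {expected}, long-run average: {total / n:.3f}")
```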
The second argument, which is much more compelling, is through representation theorems. Generally, representation theorems aim to show that given a framework for decision problems and a set of axioms about an agent's preferences, the agent can be represented as having a probability function and a utility function such that they prefer actions with higher expected utility (defined in terms of this probability and utility function). These axioms about an agent's preferences are often accepted as rational.
So, one can argue that if an agent does not prefer actions with higher expected utility, then their preferences violate one of the axioms, and hence are irrational.
One of the simpler representation theorems is the Von Neumann–Morgenstern (VNM) utility theorem, which is central to EUT. One may also apply the VNM utility theorem to EDT, where given a distribution of subjective probabilities over outcomes and an agent's preferences over such distributions, we can guarantee the existence of a utility function that respects those preferences.[3]
Let $Z = \{z_1, z_2, \ldots\}$ be a partition of $\Omega$, and let a distribution over $Z$, denoted by $L = (p_1, p_2, \ldots)$, be given by $P(z_i) = p_i$ such that $\sum_i p_i = 1$. Let $\mathcal{L}$ be the set of all probability distributions over the set of possible outcomes $Z$. Suppose that there exists a preference relation $\prec$ on $\mathcal{L}$, which holds between two distributions as $L \prec M$ if the agent prefers learning $M$ to $L$, and which satisfies the following axioms:
- (Complete) For all $L, M \in \mathcal{L}$, we have that $L \prec M$, $M \prec L$, or $L \sim M$, where $\sim$ denotes indifference.
- (Irreflexive) For all $L \in \mathcal{L}$, not $L \prec L$.
- (Transitive) For all $L, M, N \in \mathcal{L}$, if $L \prec M$ and $M \prec N$, then $L \prec N$.
- (Independent) For all $L, M \in \mathcal{L}$, if $L \prec M$, then for any $N \in \mathcal{L}$ and any $\alpha \in (0, 1]$, $\alpha L + (1 - \alpha) N \prec \alpha M + (1 - \alpha) N$.
- (Continuous) For all $L, M, N \in \mathcal{L}$, if $L \prec M \prec N$, then there exist $\alpha, \beta \in (0, 1)$ such that $\alpha L + (1 - \alpha) N \prec M \prec \beta L + (1 - \beta) N$.
Then there exists a utility function $u : Z \to \mathbb{R}$ such that
$$L \prec M \iff \sum_i p_i \, u(z_i) < \sum_i q_i \, u(z_i),$$
where $M = (q_1, q_2, \ldots)$.
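Once such a $u$ exists, comparing distributions reduces to comparing expected utilities. Here is a small sketch; the partition elements and utility values are illustrative assumptions.

```python
# The VNM conclusion in miniature: with u in hand, preferences over
# distributions reduce to expected-utility comparisons. The partition
# elements and utilities below are illustrative assumptions.
u = {"z1": 0.0, "z2": 1.0, "z3": 5.0}

def expected_utility(L):
    """L maps each element of the partition Z to its probability."""
    return sum(p * u[z] for z, p in L.items())

L = {"z1": 0.5, "z2": 0.5, "z3": 0.0}
M = {"z1": 0.0, "z2": 0.2, "z3": 0.8}
print(expected_utility(L) < expected_utility(M))  # True: the agent prefers M to L
```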
We may extend the VNM utility theorem to arbitrary propositions. Suppose that we have a decision problem $(A, S, O)$, noting that $A \subseteq \mathcal{B}$. Consider the set of distributions
$$\mathcal{L}_A = \{L_a : a \in A\},$$
where $L_a$ contains your subjective probabilities over $Z$ conditional on $a$, and suppose that you impose a preference relation on $\mathcal{L}_A$ that satisfies the VNM axioms. One can think of $Z$ as a value system—theoretically it encapsulates everything you care about, in the sense that once you have learnt which element of $Z$ is true you have learnt everything you care about. For instance, suppose you care about two things, $z_1$ and $z_2$, so that $Z = \{z_1, z_2\}$, with actions $A = \{a_1, a_2\}$.
You know that exactly one of $z_1, z_2$ is true. What matters to you is how learning that you took action $a_i$ redistributes your subjective probabilities over $Z$ via Bayesian updating, for each $z \in Z$, from $P(z)$ to $P(z \mid a_i)$.
That is, you should be just as pleased to learn $a$ as to learn the updated distribution $L_a$ over your value system $Z$. Thus, the news value (or expected utility) of learning that you took an action $a \in A$ such that $P(a) > 0$ is given by
$$V(a) = \sum_{z \in Z} P(z \mid a) \, u(z),$$
or equivalently,
$$V(a) = \sum_{z \in Z} \frac{P(z \wedge a)}{P(a)} \, u(z).$$
Thus, EDT prescribes the action $a^* \in A$ that maximizes your news value $V$. One may notice that a core innovation of EDT is that everything—acts, states, outcomes—is the same type of object, namely, propositions about the world, which are largely treated the same. A side effect of such a structure in decision-making is the tendency for EDT to "manage the news"—but we will leave that as a discussion for later.
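Putting the pieces together, here is a sketch of EDT's prescription on the ice-cream problem, taking the value partition $Z$ to be the outcomes themselves. The credences and utilities are illustrative assumptions of ours, not values from the post.

```python
# EDT's prescription on the ice-cream problem; credences and
# utilities are illustrative assumptions.
credence = {("eat", "poisonous"): 0.1, ("eat", "safe"): 0.4,
            ("skip", "poisonous"): 0.1, ("skip", "safe"): 0.4}
utility = {("eat", "poisonous"): -100.0, ("eat", "safe"): 10.0,
           ("skip", "poisonous"): 0.0, ("skip", "safe"): 0.0}

def news_value(a):
    """V(a) = sum over z of P(z | a) * u(z)."""
    P_a = sum(p for (act, _), p in credence.items() if act == a)
    return sum((p / P_a) * utility[(act, s)]
               for (act, s), p in credence.items() if act == a)

values = {a: news_value(a) for a in ("eat", "skip")}
print(values, "-> EDT prescribes:", max(values, key=values.get))
```

With these numbers, $V(\text{eat}) = 0.2 \cdot (-100) + 0.8 \cdot 10 = -12$ and $V(\text{skip}) = 0$, so EDT prescribes not eating the ice cream.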
References
[1] Arif Ahmed. Evidential Decision Theory. Cambridge University Press, 2021.
[2] Ethan D. Bolker. Functions resembling quotients of measures. Annals of Mathematics, Second Series, 85(3):451–460, 1967.
[3] William Feller. An Introduction to Probability Theory and its Applications, Volume 1. John Wiley & Sons, New York, 1968.
[4] Richard C. Jeffrey. The Logic of Decision. University of Chicago Press, 1965.
[5] Richard C. Jeffrey. The Logic of Decision. University of Chicago Press, Chicago, IL, 2nd edition, 1983.
[6] James M. Joyce. The Foundations of Causal Decision Theory. Cambridge University Press, 1999.
[7] David Lewis. On the Plurality of Worlds. Basil Blackwell, Oxford, 1986.
[8] Nate Soares and Benja Fallenstein. Toward Idealized Decision Theory. arXiv preprint arXiv:1507.01986, 2015.
[9] Nate Soares and Benja Fallenstein. Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda. In Vincent C. Müller, editor, The Technological Singularity: Managing the Journey, pages 103–125. Springer, 2017.
[10] Robert Stalnaker. Possible worlds. Noûs, 10(1):65–75, 1976.
- ^ Although we use our preferred notation from Joyce's The Foundations of Causal Decision Theory (1999) [6].
- ^ Jeffrey calls these acts, conditions, and consequences, respectively. We stick with actions, states, and outcomes, as these terms seem more orthodox.
- ^ Note that we omit discussion of the Bolker–Jeffrey representation theorem, as it is a lengthy generalisation of the VNM utility theorem. For further information, refer to Functions Resembling Quotients of Measures by Bolker (1967) [2] and Jeffrey's seminal The Logic of Decision (1983) [5].
1 comment
comment by jonomyster · 2025-02-03
Thanks for writing this up! I was wondering how this formalization works for Newcomb's problem. (I'll take box A to be the transparent box containing a thousand dollars, and box B to be the opaque box containing a million dollars or nothing.)
I would like to say that the actions are $A = \{\text{one-box}, \text{two-box}\}$, the states are $S = \{\text{box B contains the million}, \text{box B is empty}\}$, and the outcomes are the four different ways of combining the actions and states.
But it seems like I've violated the definition of a state given in the post:
By 'no direct control', we mean that the probability of the state is independent of the action performed.
After all, the probability of the state certainly depends on the action of the agent, in the sense that $P(\text{box B contains the million} \mid \text{one-box}) \neq P(\text{box B contains the million} \mid \text{two-box})$.