# Reflexive decision theory is an unsolved problem

post by Richard_Kennaway · 2023-09-17T14:15:09.222Z · LW · GW · 27 comments

## Contents

  Reflexive issues in logic
Those who know not, and know not that they know not
Those who know not, and know that they know not
Those who reject the very idea of an RDT
Reflexiveness in the real world
A parable


By "reflexive decision theory", hereafter RDT, I mean any decision theory that can incorporate information about one's own future decisions into the process of making those decisions. RDT is not itself a decision theory, but a class of decision theories, or a property of decision theories. Some say it is an empty class. FDT (to the extent that it has been worked out — this is not something I have kept up with) is an RDT.

The use of information of this sort is what distinguishes Newcomb's problem, Parfit's Hitchhiker, the Smoking Lesion, and many other problems from "ordinary" decision problems which treat the decision-maker as existing outside the world that he is making decisions about.

There is currently no generally accepted RDT. Unfortunately, that does not stop people insisting that every decision theory but their favoured one is wrong or crazy. There is even a significant literature (which I have never seen cited on LW, but I will do so here) saying that reflexive decision theory itself is an impossibility.

## Reflexive issues in logic

We have known that reflexiveness is a problem for logic ever since Russell said to Frege, "What about the set of sets that aren't members of themselves?" (There is also the liar paradox going back to the ancient Greeks, but it remained a mere curiosity until people started formalising logic in the 19th century. Calculemus, nam veritas in calculo est.)

In set theory, this is a solved problem, solved by the limited comprehension axiom. Since Gödel, we also have ways of making theories talk about themselves, and there are all manner of theorems about the limits of how well they can introspect: Gödel's incompleteness theorem, Löb's theorem, etc.
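For concreteness, the contrast can be written out as follows. This is the usual textbook formulation rather than anything taken from this post; the symbols $R$, $A$, $B$ and $\varphi$ are introduced here purely for illustration.

```latex
\begin{align*}
% Unrestricted comprehension applied to the property "x is not a member of itself":
&\exists R\;\forall x\;\bigl(x \in R \leftrightarrow x \notin x\bigr) \\
% Instantiating x := R yields Russell's contradiction:
&R \in R \leftrightarrow R \notin R \\
% Limited comprehension (separation): only subsets of an existing set A may be carved out:
&\forall A\;\exists B\;\forall x\;\bigl(x \in B \leftrightarrow (x \in A \wedge \varphi(x))\bigr) \\
% With \varphi(x) := x \notin x, instantiating x := B no longer gives a contradiction:
&B \in B \leftrightarrow (B \in A \wedge B \notin B) \;\Longrightarrow\; B \notin A
\end{align*}
```

Under the restricted axiom the same argument only shows that $B$ lies outside $A$, so no set can contain everything; the paradox becomes a theorem about the non-existence of a universal set.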

Compared with that, reflexive decision theory has hardly even started.

## Those who know not, and know not that they know not

Many think they have solutions, but they disagree with each other, and keep on disagreeing. So we have the situation where CDT-ers say "but the boxes already contain what they contain!", and everyone with an RDT replies "then you'll predictably lose!", and both point with scorn at EDT and say "you think you can change reality by managing the news!" The words "flagrantly, confidently, egregiously wrong" get bandied about, at least by one person. Everyone thinks everyone else is crazy. There is also a curious process by which an XDT'er, for any value of X, responds to counterexamples to X by modifying XDT and claiming it's still XDT, to the point of people ending up saying that CDT and EDT are the same. Now that's crazy.
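The disagreement over Newcomb's problem can be made concrete with a small calculation. The sketch below is only illustrative: the payoffs, the predictor accuracy, and the function names `edt_value` and `cdt_value` are assumptions chosen here, not figures from this post, and real formulations of CDT and EDT are more careful than this.

```python
# Illustrative numbers only: these are assumptions, not taken from the post.
BIG = 1_000_000   # contents of the opaque box if one-boxing was predicted
SMALL = 1_000     # contents of the transparent box
ACCURACY = 0.99   # probability that the prediction matches the actual choice


def edt_value(action: str) -> float:
    """Expected payoff when the action is treated as evidence about the prediction."""
    p_full = ACCURACY if action == "one-box" else 1 - ACCURACY
    bonus = SMALL if action == "two-box" else 0
    return p_full * BIG + bonus


def cdt_value(action: str, p_full: float) -> float:
    """Expected payoff when the box contents are held fixed at credence p_full."""
    bonus = SMALL if action == "two-box" else 0
    return p_full * BIG + bonus


for action in ("one-box", "two-box"):
    print(action, edt_value(action), cdt_value(action, p_full=0.5))
```

With these numbers the evidential calculation favours one-boxing by a wide margin, while the causal calculation favours two-boxing by exactly the small amount, whatever credence is plugged in. That is the whole quarrel in miniature.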

## Those who know not, and know that they know not

Some people know that they do not have a solution. Andy Egan, in "Some Counterexamples to Causal Decision Theory" (2007, Philosophical Review), shoots down both CDT and EDT, but only calls for a better theory, without any suggestions for finding it.

## Those who reject the very idea of an RDT

Some deny the possibility of any such theory, such as Marion Ledwig ("The No Probabilities for Acts-Principle"), who formulates the principle thus: "Any adequate quantitative decision model must not explicitly or implicitly contain any subjective probabilities for acts." This rejects the very idea of reflexive decision theory. It also implies that one-boxing is wrong for Newcomb's problem, and Ledwig explicitly says that it is.

For the original statement of the principle, Ledwig cites Spohn (1999, "Strategic Rationality", and 1983, "Eine Theorie der Kausalität"). My German is not good enough to analyse what he says in the latter reference, but in the former he says that the rational strategy in one-shot Prisoners' Dilemma is to defect, and in one-shot Newcomb is to two-box. Ledwig and Spohn trace the idea back to Savage's 1954 "The Foundations of Statistics". Savage's whole framework, however, in common with the other "classical" theories such as Jeffrey-Bolker and VNM, does not have room in it for any sort of reflexiveness, ruling it out implicitly rather than considering the idea and explicitly rejecting it. There is more in Spohn 1977, "Where Luce and Krantz Do Really Generalize Savage's Decision Model", where Spohn says:

"[P]robabilities for acts play no role in decision making. For, what only matters in a decision situation is how much the decision maker likes the various acts available to him, and relevant to this, in turn, is what he believes to result from the various acts and how much he likes these results. At no place does there enter any subjective probability for an act."

There is also Itzhak Gilboa "Can free choice be known?". He says, "[W]e are generally happier with a model in which one cannot be said to have beliefs (let alone knowledge) of one's own choice while making this choice", and looks for a way to resolve reflexive paradoxes by ruling out reflexiveness.

These people all defect in PD and two-box in Newcomb. The project of RDT is to do better.

(ETA: Thanks to Sylvester Kollin (see comments) for drawing my attention to a later paper by Spohn in which he has converted to one-boxing within a causal decision theory.)

## Reflexiveness in the real world

People often make decisions that take into account the states of mind of the people they are interacting with, including other people's assessments of one's own state of mind. This is an essential part of many games (such as Diplomacy) and of many real-world interactions (such as diplomacy). A theory satisfying "no foreknowledge of oneself" (my formulation of "no probabilities for acts") cannot handle these. (Of course one can have foreknowledge of oneself; the principle only excludes this information from input into one's decisions.)

The principle "know thyself" is as old as the Liar paradox.

Just as there have been contests of bots playing iterated prisoners' dilemma, so there have been contests where these bots are granted access to each others' source code. Surely we need a decision theory that can deal with the reasoning processes used by such bots.
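As a toy illustration of what such a contest involves, here is a minimal sketch in which each bot's "source code" is a Python expression that can read both its own source and its opponent's. Everything in it is an assumption made for illustration: the bot names, the payoff table, and the use of `eval` stand in for whatever the actual tournaments used.

```python
from typing import Dict, Tuple

# Each bot is a Python expression over `me` (own source) and `opp` (opponent's source)
# that evaluates to "C" (cooperate) or "D" (defect). All three bots are hypothetical.
CLIQUE_BOT = '"C" if opp == me else "D"'   # cooperates only with exact copies of itself
DEFECT_BOT = '"D"'
COOPERATE_BOT = '"C"'

# A conventional prisoners' dilemma payoff table (an assumption, not from the post).
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def move(me: str, opp: str) -> str:
    """Run a bot's source with read access to both sources and return its move."""
    return eval(me, {"__builtins__": {}}, {"me": me, "opp": opp})


def play(a: str, b: str) -> Tuple[int, int]:
    """Play one round between two bots and return their payoffs."""
    return PAYOFF[(move(a, b), move(b, a))]


print(play(CLIQUE_BOT, CLIQUE_BOT))
print(play(CLIQUE_BOT, DEFECT_BOT))
print(play(CLIQUE_BOT, COOPERATE_BOT))
```

The clique bot only compares source text; it never simulates its opponent, which is how it sidesteps the regress of two programs each trying to run the other. A decision theory adequate to these contests would have to say something about bots that reason less crudely than that.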

The elephant looming behind the efforts of Eliezer and his colleagues to formulate FDT is AI. It might be interesting at some point to have a tournament of bots whose aim is to get other bots to "let them out of the box".

These practical problems need solutions. The no foreknowledge principle rejects any attempt to think about them; thinking about them therefore requires rejecting the principle. That is the aim of RDT. I do not have an RDT, but I do think that duelling intuition pumps is a technique whose usefulness for this problem has been exhausted. It is no longer enough to construct counterexamples to everyone else, for they can as easily do the same to your own theory. Some general principle is needed that will be as decisive for this problem as limited comprehension is for building a consistent set theory.

## A parable

Far away and long ago, each morning the monks at a certain monastery would go out on their alms rounds. To fund the upkeep of the monastery buildings, once every month each monk would buy a single gold coin out of the small coins that he had received, and deposit it in a chest through a slot in the lid.

This system worked well for many years.

One month, a monk thought to himself, "What if I drop in a copper coin instead? No-one will know I did it."

That month, when the chest was opened, it contained nothing but copper coins.

comment by Sylvester Kollin · 2023-09-17T18:33:17.319Z · LW(p) · GW(p)

Some people know that they do not have a solution. Andy Egan, in "Some Counterexamples to Causal Decision Theory" (1999, Philosophical Review)

This should say 2007.

These people all defect in PD and two-box in Newcomb.

Spohn argues for one-boxing in Reversing 30 years of discussion: why causal decision theorists should one-box.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2023-09-17T19:36:46.580Z · LW(p) · GW(p)

This should say 2007.

Fixed.

Spohn argues for one-boxing in Reversing 30 years of discussion: why causal decision theorists should one-box.

Thanks for the reference. I hope this doesn't turn out to be a case of changing CDT and calling the result CDT.

ETA: 8 pages in, I'm impressed by the clarity, and it looks like it leads to something reasonably classified as a CDT.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2023-09-20T21:20:35.949Z · LW(p) · GW(p)

8 pages in, I’m impressed by the clarity

Seconded. This is an extremely impressive paper. It seems like Spohn had most of the insights that motivated and led to the development of logical/functional decision theories, years before Less Wrong existed. I’m astounded that I’ve never heard of him before now.

Replies from: MondSemmel, MondSemmel
comment by MondSemmel · 2023-09-21T16:39:45.490Z · LW(p) · GW(p)

Having searched for "Spohn" on LW, it appears that Spohn was already mentioned a few times on LW. In particular:

11 years ago, in lukeprog's post Eliezer's Sequences and Mainstream Academia (also see bits of this Wei Dai comment here, and of this long critical comment thread here):

I don't think Eliezer had encountered this mainstream work when he wrote his articles

Eliezer's TDT decision algorithm (2009, 2010) had been previously discovered as a variant of CDT by Wolfgang Spohn (2003, 2005, 2012). Both TDT and Spohn-CDT (a) use Pearl's causal graphs to describe Newcomblike problems, then add nodes to those graphs to represent the deterministic decision process the agent goes through (Spohn calls them "intention nodes," Yudkowsky calls them "logical nodes"), (b) represent interventions at these nodes by severing (edit: or screening off) the causal connections upstream, and (c) propose to maximize expected utility by summing over possible values of the decision node (or "intention node" / "logical node"). (Beyond this, of course, there are major differences in the motivations behind and further development of Spohn-CDT and TDT.)

And the comments here on the MIRI paper "Cheating Death in Damascus" briefly mention Spohn.

Finally, a post from a few months ago also mentions Spohn:

I introduce dependency equilibria (Spohn 2007), an equilibrium concept suitable for ECL, and generalize a folk theorem showing that the Nash bargaining solution is a dependency equilibrium.

comment by MondSemmel · 2023-09-21T13:24:37.978Z · LW(p) · GW(p)

I’m astounded that I’ve never heard of him before now.

It probably doesn't help that he's a German philosopher; language barriers are a thing.

On the other hand, his research interests seem to have lots of overlap with LW content:

Spohn is best known for his contributions to formal epistemology, in particular for comprehensively developing ranking theory since 1982, which is his theory of the dynamics of belief. It is an alternative to probability theory and of similar philosophical impact as a formal account of the dynamics of belief. Spohn's research extends to philosophy of science, the theory of causation, metaphysics and ontology, philosophy of language and mind, two-dimensional semantics, philosophical logic, and decision and game theory (see the collection of papers). His dissertation and his paper "Stochastic Independence, Causal Independence, and Shieldability" are precursors of the theory of Bayesian networks and their causal interpretation. His paper "How to Make Sense of Game Theory" is a forerunner of epistemic game theory.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2023-09-21T15:53:03.436Z · LW(p) · GW(p)

In looking through Spohn's oeuvre, here are a couple of papers that I think will be of interest on LessWrong, although I have not read past the very beginnings.

"Dependency Equilibria" (2007). "Its philosophical motive is to rationalize cooperation in the one shot prisoners’ dilemma."

"The Epistemology and Auto-Epistemology of Temporal Self-Location and Forgetfulness" (2017). It relates to the Sleeping Beauty problem.

Replies from: MondSemmel
comment by MondSemmel · 2023-09-21T16:55:45.118Z · LW(p) · GW(p)

More stuff:

He co-initiated the framework program "New Frameworks of Rationality" (German Wikipedia, German-only website) which seems to have been active in 2012~2018. Their lists of publications and conferences are in English, as are most of the books in this list.

(Incidentally, there's apparently a book called Von Rang und Namen: Philosophical Essays in Honour of Wolfgang Spohn.)

And since 2020, he leads this project on Reflexive Decision & Game Theory (longer outline). The site doesn't list any results of this project, but presumably some of Spohn's papers since 2020 are related to this topic.

The Koselleck project explores reflexive decision and game theory. Standard decision and game theory distinguish only chance nodes (moves by nature) and action nodes (possible actions of the agent or player, usually and confusingly called decision nodes). The reflexive extensions additionally distinguish genuine decision nodes referring to possible decision situations (decision or game trees plus probability and utility functions) the agent may be in or come to be in. Thus, these decision nodes generate a rich recursive structure. This structure is required, though. Surely, we agents not only rationally act within a given decision situation, but are also able to reflect on the possible situations which might be ours and thus about what determines or causes our actions. And clearly, such reflection is important for rational decision making.

The reflexive extension allows for a general account of anticipatory rationality, i.e., of how to rationally behave in view of arbitrary envisaged changes in one's decision situation (so far incompletely treated under the labels “strategic rationality” and “endogenous preference change”). The extension also allows us to account for what is called sensitive rationality, which considers a so far largely neglected point, namely the fact that being in a certain situation not only causes the pertinent rational action, but may have side effects as well (as exemplified in the Toxin puzzle). This fact is ubiquitous in social settings, and it is highly decision relevant and can obviously be accounted for only in the reflexive perspective. Moreover, the extension also allows us to respect the point that it is a matter of our decision when to decide about a certain issue, e.g., whether to commit early or to decide as late as possible. This leads to an account of so-called commissive rationality possibly rationalizing our inclination to commit ourselves. Finally, in the game theoretic extension suggested by the phenomenon of sensitive rationality, the reflexive perspective leads to a new equilibrium concept called dependency equilibria, which, e.g., allows a rationalization of cooperation in the one-shot prisoners' dilemma and promises a kind of unification of noncooperative and cooperative game theory.

comment by Michael Carey (michael-carey) · 2023-09-20T21:52:44.458Z · LW(p) · GW(p)

Saying that Set Theory "solved the problem" by introducing restricted Comprehension is maybe a stretch.

Restricted Comprehension prevents the question from even being asked. So, it "solves it" by removing the object from the domain of discourse.

The Incompleteness Theorems are Meta-Theorems talking about Proper Theorems.

I'm not sure Set Theory has really solved the self-reference problem in any real sense besides avoiding it (which may be the best solution possible).

The closest might be the Recursion Theorems, which allow functions to "build themselves" by referencing earlier versions of themselves. But that isn't proper self-reference.

My issue with the practicality of any kind of real-world computation of self-reference is that I believe it would require infinite time/energy/space: each time you "update" your current self, you change, and so would need to update again, and so on. You could approximate an RDT decision tree, but not apply it. The exception is "fixed-point" decisions, which stay constant when we apply the RDT decision algorithm.
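The fixed-point idea can be sketched in a few lines. This is only one possible reading of the paragraph above: the function `find_fixed_point`, the step budget, and the two toy decision rules are hypothetical, introduced here for illustration.

```python
from typing import Callable, Optional, TypeVar

D = TypeVar("D")


def find_fixed_point(decide: Callable[[D], D], initial: D, max_steps: int = 100) -> Optional[D]:
    """Feed the agent's predicted decision back into its own decision rule.

    Returns a decision that the rule maps to itself, or None if none is
    reached within max_steps (the budget stands in for finite resources).
    """
    prediction = initial
    for _ in range(max_steps):
        decision = decide(prediction)
        if decision == prediction:
            return decision      # self-prediction and decision agree
        prediction = decision    # "update" the self-model and try again
    return None


def stable(predicted: str) -> str:
    """Hypothetical rule that ignores the self-prediction, so it has a fixed point."""
    return "cooperate"


def contrarian(predicted: str) -> str:
    """Hypothetical rule that always does the opposite of what it predicts for itself."""
    return "defect" if predicted == "cooperate" else "cooperate"


print(find_fixed_point(stable, "defect"))
print(find_fixed_point(contrarian, "defect"))
```

The first rule settles after one update; the second oscillates forever and the search gives up, which is the finite-budget version of the regress described above.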

comment by Richard_Kennaway · 2023-09-21T12:36:52.665Z · LW(p) · GW(p)

I was painting with a broad brush and omitting talk of alternatives to Limited Comprehension, because my impression is that ZF (with or without C) is nowadays the standard background for doing set theory. Is there any other in use, not known to be straightforwardly equivalent to ZF? (ETA: I also elided all the history of how the subject developed leading up to ZF. For example, in Principia Mathematica Russell & Whitehead used some sort of hierarchy of types to avoid the inconsistency. That approach dropped by the wayside, although type systems are back in use in the various current projects to formalise all of mathematics.)

Restricted Comprehension prevents the question from even being asked. So, it "solves it" by removing the object from the domain of discourse.

Any answer to Russell's question to Frege must exclude something. There may be other ways of avoiding the inconsistencies, such as the thesis that Adele Lopez linked, but one has to do something about them.

Replies from: michael-carey
comment by Michael Carey (michael-carey) · 2023-09-21T16:53:22.527Z · LW(p) · GW(p)

As far as I know for now, all of standard Mathematics is done within ZF + Some Degree of Choice. So it makes sense to restrict discussion to ZF (with C or without).

My comment was a minor nitpick, on the phrasing "in set theory, this is a solved problem". For me, solved implies that an apparent paradox has been shown under additional scrutiny to not be a paradox. For example, the study of convergent series (in particular the geometric series) solves Zeno's Paradox of Motion.

In Set Theory, Restricted Comprehension just restricts us from asking the question, "Consider this set, with property Y". That is quite a bit different from solving a paradox in my book, although it does remove the paradoxical object from our discourse. It's really more that Axiomatic Set Theory avoids the paradox rather than solves it.

I want to emphasize that this is a minor nitpick. It actually (I believe) serves to strengthen your overall point that RDT is an unsolved problem. I'm just adding that, as far as I can tell, this component of RDT (self-reference) isn't really adequately addressed in standard logic. If we allow self-reference, we don't always produce paradoxes: x = x is hardly paradoxical. But sometimes we do, such as in Russell's famous case.

The fact that we don't have a good rule system (in standard logic, to my knowledge) for predicting when self-reference produces a paradox indicates it's still something of an open problem. This may be radical, but I'm basically claiming that restricted Comprehension isn't a particularly amazing solution for the self-reference problem; it's something of a throwing-the-baby-out-with-the-bathwater kind of solution. To its merit, though, the fact that ZF hasn't produced any contradictions in all these years of study is an incredible feat.

Your point about having to sacrifice something to solve Russell's question is well taken. I think it may be correct; the removal of something may be the best kind of solution possible. In that sense, restricted comprehension may have "solved" the problem, as it may be the only kind of solution we can hope for.

Adele Lopez's answer was excellent, and I haven't had a chance to digest the referenced thesis, but it does seem to follow your proposed principle: to answer Russell's question we need to omit things.

There's some interesting research using "exotic" logical systems where unrestricted comprehension can be done consistently (this thesis includes a survey as well as some interesting remarks about how this relates to computability). This can only happen at the expense of things typically taken for granted in logic, of course. Still, it might be a better solution for reasoning about self-reference than the classical set theory system.

comment by Richard_Kennaway · 2023-09-20T19:27:02.897Z · LW(p) · GW(p)

My anecdote of a recent experience of decision makes an interesting contrast with all verbal discussion of decision theory.

It appeared to Winston that a long time passed before he answered. For a moment he seemed even to have been deprived of the power of speech. His tongue worked soundlessly, forming the opening syllables first of one word, then of the other, over and over again. Until he had said it, he did not know which word he was going to say. "No," he said finally.

comment by dr_s · 2023-09-22T08:56:11.249Z · LW(p) · GW(p)

If we go by analogy with Gödel, Turing, and basically anything else involving self-reflection, one has to be pessimistic and place a higher probability on a proper, fully consistent RDT simply being impossible, which could actually have some rather dramatic consequences downstream for e.g. AI alignment.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2023-09-22T09:27:38.276Z · LW(p) · GW(p)

Doing the research necessary to have something more than a vaguely arrived-at probability — that is, a wild guess — would have more dramatic consequences.

Anyway, the precedent of formalising arithmetic and set theory is grounds for optimism: a lot of self-reflection is consistent in those theories.

Replies from: dr_s
comment by dr_s · 2023-09-22T09:52:20.898Z · LW(p) · GW(p)

Doing the research necessary to have something more than a vaguely arrived-at probability — that is, a wild guess — would have more dramatic consequences.

Well, obviously so. But that sounds more like a PhD program than a LW comment. My point was, there seems to be a trend, and the trend is "self reflection allows for self contradiction and inconsistency". I imagine the general thrust of a more formal argument could be imagining that there is a "correct" decision theory, imagining a Turing machine implementing such a theory, proving that this machine is in itself Turing-complete (as in, it is always possible to input a decision problem that maps to any arbitrary algorithm) and then it would follow that it being reflexive would require it to be able to solve the halting problem.

comment by romeostevensit · 2023-09-17T23:12:59.062Z · LW(p) · GW(p)

I hadn't put this name to things and I'm happy to have some way of mentally tagging it. It has seemed to me that many sources of 'irrationality' in people's decision theory relate to resistance against negative reflexive effects. Eg why resist information streams from certain sources? Bc being seen to update on certain forms of information creates an incentive to manipulate or goodhart that data stream.

comment by cubefox · 2023-09-20T16:44:17.437Z · LW(p) · GW(p)

It's worth repeating Spohn's arguments from Where Luce and Krantz Do Really Generalize Savage's Decision Model:

Now, probably anyone will find it absurd to assume that someone has subjective probabilities for things which are under his control and which he can actualize as he pleases. I think this feeling of absurdity can be converted into more serious arguments for our principle:

First, probabilities for acts play no role in decision making. For, what only matters in a decision situation is how much the decision maker likes the various acts available to him, and relevant to this, in turn, is what he believes to result from the various acts and how much he likes these results. At no place does there enter any subjective probability for an act. The decision maker chooses the act he likes most - be its probability as it may. But if this is so, there is no sense in imputing probabilities for acts to the decision maker. For one could tell neither from his actual choices nor from his preferences what they are. Now, decision models are designed to capture just the decision maker's cognitive and motivational dispositions expressed by subjective probabilities and utilities which manifest themselves in and can be guessed from his choices and preferences. Probabilities for acts, if they exist at all, are not of this sort, as just seen, and should therefore not be contained in decision models.

The strangeness of probabilities for acts can also be brought out by a more concrete argument: It is generally acknowledged that subjective probabilities manifest themselves in the readiness to accept bets with appropriate betting odds and small stakes. Hence, a probability for an act should manifest itself in the readiness to accept a bet on that act, if the betting odds are high enough. Of course, this is not the case. The agent's readiness to accept a bet on an act does not depend on the betting odds, but only on his gain. If the gain is high enough to put this act on the top of his preference order of acts, he will accept it, and if not, not. The stake of the agent is of no relevance whatsoever.

One might object that we often do speak of probabilities for acts. For instance, I might say: "It's very unlikely that I shall wear my shorts outdoors next winter." But I do not think that such an utterance expresses a genuine probability for an act; rather I would construe this utterance as expressing that I find it very unlikely to get into a decision situation next winter in which it would be best to wear my shorts outdoors, i.e. that I find it very unlikely that it will be warmer than 20°C next winter, that someone will offer me DM 1000.- for wearing shorts outdoors, or that fashion suddenly will prescribe wearing shorts, etc. Besides, it is characteristic of such utterances that they refer only to acts which one has not yet to decide upon. As soon as I have to make up my mind whether to wear my shorts outdoors or not, my utterance is out of place.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2023-09-20T19:09:02.727Z · LW(p) · GW(p)

And yet it seems that Spohn no longer believes this.

Replies from: cubefox
comment by cubefox · 2023-09-21T08:30:08.733Z · LW(p) · GW(p)

His solution seems to rely on the ability to precommit to a future action, such that the future action can be treated like an ordinary outcome:

It is obvious that in the situation thus presented one-boxing is rational. If my decision determines or strongly influences the prediction, then I rationally decide to one-box, and when standing before the boxes I just do this. (p. 101f)

If people can just "make decisions early", then one-boxing is, of course, the rational thing to do from the point of view of CDT. It effectively means you are no longer deciding anything when you are standing in front of the two boxes; you are just slavishly one-boxing as if under hypnotic suggestion, or as if being somehow forced to one-box by your earlier self. Then the "decision" or "act" here can be assigned a probability because it is assumed there is nothing left to decide: it's effectively just a consequence of the real decision that was made much earlier, consistent with the view that an action in a decision situation may not be assigned a probability.

The real problem with the precommitment route is that it assumes the possibility of "precommitment". Yet in reality, if you "commit" early to some action, and you are later faced with the situation where the action has to be executed, you are still left with the question of whether or not you should "follow through" with your commitment. Which just means your precommitment wasn't real. You can't make decisions in advance, you can't simply force your later self to do things. The actual decision always has to be made in the present, and the supposed "precommitment" of your past self is nothing more than a suggestion.

(The impossibility of precommitment was illustrated in Kavka's toxin puzzle.)

Replies from: MondSemmel, Richard_Kennaway
comment by MondSemmel · 2023-09-21T13:33:16.916Z · LW(p) · GW(p)

(The impossibility of precommitment was illustrated in Kavka's toxin puzzle.)

The toxin puzzle is also referenced extensively in that aforementioned Spohn paper on one-boxing, and his paper is a response to the toxin puzzle as much as it is to two-boxing.

Replies from: cubefox
comment by cubefox · 2023-09-21T16:08:23.885Z · LW(p) · GW(p)

Spohn shows that you can draw causal graphs such that CDT can get rewards in both cases, though only under the assumption that true precommitment is possible. But Spohn doesn't give arguments for the possibility of precommitment, as far as I can tell.

Replies from: Ape in the coat
comment by Ape in the coat · 2023-09-22T07:04:31.003Z · LW(p) · GW(p)

Isn't the possibility and, moreover, computability of precommitment just trivially true?

If you have a program DT(data), determining a decision according to a particular decision theory in the circumstances specified by data, then you can easily construct a program PDT(data, memory), determining the decision for the same decision theory but with precommitment:

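    def PDT(data, memory):
        # memory maps a situation to the decision precommitted for it, if any
        precommitted_decision = memory.get(data)
        if precommitted_decision is not None:
            return precommitted_decision
        # no precommitment on record: fall back to the underlying decision theory
        return DT(data)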

The only thing that is required is an if-statement and a memory object, which can be implemented via a dictionary.
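To try the idea out, here is a self-contained variant. The `DT` below is a hypothetical stand-in (a rule that two-boxes on a situation labelled "newcomb"), and the situation labels and type annotations are likewise assumptions added for illustration.

```python
from typing import Dict


def DT(data: str) -> str:
    """Hypothetical stand-in for the underlying decision theory."""
    return "two-box" if data == "newcomb" else "do-nothing"


def PDT(data: str, memory: Dict[str, str]) -> str:
    """Return the precommitted decision for this situation if one is stored, else defer to DT."""
    precommitted = memory.get(data)
    if precommitted is not None:
        return precommitted
    return DT(data)


memory: Dict[str, str] = {}
before = PDT("newcomb", memory)   # nothing stored yet, so DT decides
memory["newcomb"] = "one-box"     # record a precommitment
after = PDT("newcomb", memory)    # the stored decision now overrides DT
print(before, after)
```

Note that this only shows that a lookup can override `DT`; it says nothing about whether the agent is able, or rationally entitled, to leave the stored entry untouched when the moment comes.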

Replies from: cubefox
comment by cubefox · 2023-09-22T11:24:24.812Z · LW(p) · GW(p)

Yes, but I was talking about humans. An AI might have a precommitment ability.

Replies from: Ape in the coat
comment by Ape in the coat · 2023-09-22T14:33:36.697Z · LW(p) · GW(p)

Yes, but I was talking about humans.

This also seems trivially true to me. I've successfully precommited multiple times in my life and I bet you have as well.

What you are probably talking about is the fact that occasionally humans fail at precommitments. But isn't it an isolated demand for rigor? Humans occasionally fail at following any decision theory, or fail at being rational in general. It doesn't make all the decision theories and rationality itself incoherent concepts that we thus can't talk about.

Actually, when I think about it, isn't deciding what decision theory to follow, itself a precommitment?

comment by Richard_Kennaway · 2023-09-21T12:03:46.684Z · LW(p) · GW(p)

I often do things because I earlier decided to, overruling whatever feelings I may have in the moment. So from a psychological point of view, precommitment is possible. Why did I pause at Alderford? To let my fatigue clear sufficiently to let the determination to do 100 miles overcome it.

Kavka's toxin puzzle only works if the intention-detecting machine works, and the argument against rationally drinking the toxin when the time comes could equally well be read as an argument against the possibility of such an intention-detecting machine. Its existence, after all, presupposes that the future decision can be determined at midnight, while the argument against drinking presupposes that it cannot be. An inconsistent thought experiment proves nothing. This example is playing much the same role in decision theory as Russell's question to Frege did for set theory. It's pointing to an inconsistency in intuitions around the subject.

Excluding reflectiveness is too strong a restriction, akin to excluding all forms of comprehension axiom from set theory. A precisely formulated limitation is needed that will rule out the intention-detecting machine while allowing the sorts of self-knowledge that people observably use.

Replies from: cubefox
comment by cubefox · 2023-09-21T19:12:08.960Z · LW(p) · GW(p)

But clearly you still made your final decision between 10 and 40 miles only when you were at Alderford, not hours before that. Our past selves can't simply force us to do certain things; the memory of a past "commitment" is only one factor that may influence our present decision making, and it doesn't replace a decision. Otherwise, whenever we "decide" to definitely do an unpleasant task tomorrow rather than today ("I do the dishes tomorrow, I swear!"), we would in fact always follow through with it tomorrow, which isn't at all the case. (The Kavka/Newcomb cases are even worse than this, because there it isn't just irrational akrasia preventing us from executing past "commitments", but instrumental rationality itself, at least if we believe that CDT captures instrumental rationality.)

A more general remark, somewhat related to reflexivity (reflectivity?): In the Where Luce and Krantz paper, Spohn also criticizes Jeffrey for allowing the assignment of probabilities to acts, because for Jeffrey, everything (acts, outcomes, states) is a proposition. And any boolean combination of propositions is a proposition. In his framework, any proposition can be assigned a probability and a utility. But I'm pretty sure Jeffrey's theory doesn't strictly require that act probabilities are defined. Moreover, even if they are defined, it doesn't require them for decisions. That is, for outcomes $O$ and an action $A$, to calculate the utility he only requires probabilities of the form $P(O \mid A)$, which we can treat as a basic probability instead of, frequentist style, a mere abbreviation for the ratio formula $P(O \wedge A)/P(A)$. So $P(O \wedge A)$ and $P(A)$ can be undefined. In his theory $U(A) = \sum_{O} P(O \mid A)\, U(O \wedge A)$ is a theorem. I'm planning a post on explaining Jeffrey's theory because I think it is way underappreciated. It's a general theory of utility, rather than just a decision theory which is restricted to "acts" and "outcomes". To be fair, I don't know whether that would really help much with elucidating reflectivity. The lesson would probably be something like "according to Jeffrey's theory you can have prior probabilities for present acts but you should ignore them when making decisions". The interesting part is that his theory can't be simply dismissed because others aren't as general and thus are not a full replacement.
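For readers who want the formula behind this remark, here is a short derivation. It assumes the standard Jeffrey-Bolker desirability axiom and a finite partition of outcomes $O_1, \dots, O_n$; both are supplied here as assumptions and are not stated in the comment. The derivation passes through $P(A)$; on the reading suggested above, where $P(O_i \mid A)$ is taken as primitive, the last line would be the starting point instead.

```latex
\begin{align*}
% Desirability axiom, for mutually exclusive X and Y with P(X \vee Y) > 0:
U(X \vee Y) &= \frac{P(X)\,U(X) + P(Y)\,U(Y)}{P(X) + P(Y)} \\
% Apply it repeatedly to A = (A \wedge O_1) \vee \dots \vee (A \wedge O_n):
U(A) &= \frac{\sum_{i} P(A \wedge O_i)\,U(A \wedge O_i)}{\sum_{i} P(A \wedge O_i)}
      = \sum_{i} \frac{P(A \wedge O_i)}{P(A)}\,U(A \wedge O_i) \\
% Rewrite the ratio as a conditional probability:
     &= \sum_{i} P(O_i \mid A)\,U(A \wedge O_i)
\end{align*}
```

Only the conditional probabilities of outcomes given the act appear in the final line, which is why the act's own probability plays no part in comparing acts.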

A precisely formulated limitation is needed that will rule out the intention-detecting machine while allowing the sorts of self-knowledge that people observably use.

Maybe the first question is then what form of "self-knowledge" people do, in fact, observably use. I think we treat memories of past "commitments"/intentions more like non-binding recommendations from a close friend (our "past self"), which we may very well just ignore. Maybe there is an ideal rule of rationality that we should always adhere to our past commitments, at least if we learn no new information. But I'd say "should" implies "can", so by contraposition, "not can" implies "not should". Which would mean if precommitment is not possible for an agent it's not required by rationality.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2023-09-21T21:02:16.808Z · LW(p) · GW(p)

at least if we believe that CDT captures instrumental rationality.

That is the question at issue.