Posts

Concave Utility Question 2023-04-15T00:14:58.895Z
Local Memes Against Geometric Rationality 2022-12-21T03:53:28.196Z
Geometric Rationality is Not VNM Rational 2022-11-27T19:36:00.939Z
Respecting your Local Preferences 2022-11-26T19:04:14.252Z
The Least Controversial Application of Geometric Rationality 2022-11-25T16:50:56.497Z
Geometric Exploration, Arithmetic Exploitation 2022-11-24T15:36:30.334Z
The Geometric Expectation 2022-11-23T18:05:12.206Z
Tyranny of the Epistemic Majority 2022-11-22T17:19:34.144Z
Utilitarianism Meets Egalitarianism 2022-11-21T19:00:12.168Z
Counterfactability 2022-11-07T05:39:05.668Z
Boundaries vs Frames 2022-10-31T15:14:37.312Z
Open Problem in Voting Theory 2022-10-17T20:42:05.130Z
Maximal Lottery-Lotteries 2022-10-17T20:39:20.143Z
Maximal Lotteries 2022-10-17T08:54:09.001Z
Voting Theory Introduction 2022-10-17T08:48:42.781Z
Mystery Hunt 2022 2021-12-13T21:57:06.527Z
Cartesian Frames and Factored Sets on ArXiv 2021-09-24T04:58:54.845Z
Finite Factored Sets: Applications 2021-08-31T21:19:03.570Z
Finite Factored Sets: Inferring Time 2021-08-31T21:18:36.189Z
Finite Factored Sets: Polynomials and Probability 2021-08-17T21:53:03.269Z
Finite Factored Sets: Conditional Orthogonality 2021-07-09T06:01:46.747Z
Finite Factored Sets: LW transcript with running commentary 2021-06-27T16:02:06.063Z
Finite Factored Sets: Orthogonality and Time 2021-06-10T01:22:34.040Z
Finite Factored Sets: Introduction and Factorizations 2021-06-04T17:41:34.827Z
Finite Factored Sets 2021-05-23T20:52:48.575Z
Saving Time 2021-05-18T20:11:14.651Z
2021 New Year Optimization Puzzles 2020-12-31T08:22:29.951Z
Teacher's Password: The LessWrong Mystery Hunt Team 2020-12-04T00:04:42.900Z
Time in Cartesian Frames 2020-11-11T20:25:19.400Z
Eight Definitions of Observability 2020-11-10T23:37:07.827Z
Committing, Assuming, Externalizing, and Internalizing 2020-11-09T16:59:01.525Z
Additive and Multiplicative Subagents 2020-11-06T14:26:36.298Z
Sub-Sums and Sub-Tensors 2020-11-05T18:06:44.421Z
Multiplicative Operations on Cartesian Frames 2020-11-03T19:27:15.489Z
Subagents of Cartesian Frames 2020-11-02T22:02:52.605Z
Functors and Coarse Worlds 2020-10-30T15:19:22.976Z
Controllables and Observables, Revisited 2020-10-29T16:38:14.878Z
What are good election betting opportunities? 2020-10-29T07:19:10.409Z
Biextensional Equivalence 2020-10-28T14:07:38.289Z
Additive Operations on Cartesian Frames 2020-10-26T15:12:14.556Z
Introduction to Cartesian Frames 2020-10-22T13:00:00.000Z
Puzzle Games 2020-09-27T21:14:13.979Z
What Would I Do? Self-prediction in Simple Algorithms 2020-07-20T04:27:25.490Z
Does Agent-like Behavior Imply Agent-like Architecture? 2019-08-23T02:01:09.651Z
Intentional Bucket Errors 2019-08-22T20:02:11.357Z
Risks from Learned Optimization: Conclusion and Related Work 2019-06-07T19:53:51.660Z
Deceptive Alignment 2019-06-05T20:16:28.651Z
The Inner Alignment Problem 2019-06-04T01:20:35.538Z
Conditions for Mesa-Optimization 2019-06-01T20:52:19.461Z
Risks from Learned Optimization: Introduction 2019-05-31T23:44:53.703Z

Comments

Comment by Scott Garrabrant on Is this voting system strategy proof? · 2024-09-06T21:05:05.732Z · LW · GW

I proposed this same voting system here: https://www.lesswrong.com/s/gnAaZtdwjDBBRpDmw


It is not strategy proof. If it were, that would violate https://en.wikipedia.org/wiki/Gibbard–Satterthwaite_theorem [Edit: I think, for some version of the theorem. It might not literally violate it, but I also believe you can make a small example that demonstrates it is not strategy proof. This is because the equilibrium sometimes extracts all the value from a voter until they are indifferent, and if they lie about their preferences less value can be extracted.]

Further, it is not obviously well defined. Because of the discontinuities around ties, you cannot take advantage of the compactness of the space of distributions, so it is not clear that Nash equilibria exist. (It is also not clear that they don't exist. My best guess is that they do.)

Comment by Scott Garrabrant on (Geometrically) Maximal Lottery-Lotteries Exist · 2024-05-11T23:47:20.690Z · LW · GW

This post does not prove that Maximal Lottery-Lotteries exist. Instead, it redefines MLL to be equivalent to the Nash bargaining solution (in a way that is obscured by using the same language as the MLL proposal), and then claims that under the new definition MLL exist (because the Nash bargaining solution exists).

I like Nash bargaining, and I don't like majoritarianism, but the MLL proposal is supposed to be a steelman of majoritarianism, and Nash bargaining is not only not MLL, but it is not even majoritarian. (If a majority of voters have the same favorite candidate, this is not sufficient to make this candidate win in the Nash bargaining solution.)

Comment by Scott Garrabrant on The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review · 2024-04-03T02:17:58.740Z · LW · GW

I agree with this.

Comment by Scott Garrabrant on The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review · 2024-03-29T01:36:00.297Z · LW · GW

More like illuminating ontologies than great predictions, but yeah.

Comment by Scott Garrabrant on The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review · 2024-03-27T22:39:20.644Z · LW · GW

I think Chris Langan and the CTMU are very interesting, and I think there is an interesting and important challenge for LW readers in figuring out how (and whether) to learn from Chris. Here are some things I think are true about Chris (and about me) and relevant to this challenge. (I do not feel ready to talk about the object-level CTMU here; I am mostly just talking about Chris Langan.)

  1. Chris has a legitimate claim of being approximately the smartest man alive according to IQ tests.
  2. Chris wrote papers/books that make up a bunch of words that are defined circularly, and are difficult to follow. It is easy to mistake him for a complete crackpot.
  3. Chris claims to have proven the existence of God.
  4. Chris has been something-sort-of-like-canceled for a long time. (In the way that seems predictable when "World's Smartest Man Proves Existence of God.")
  5. Chris has some followers that I think don't really understand him. (In the way that seems predictable when "World's Smartest Man Proves Existence of God.")
  6. Chris acts socially in a very nonstandard way that seems like a natural consequence of having much higher IQ than anyone else he has ever met. In particular, I think this manifests in part as an extreme lack of humility.
  7. Chris is actually very pleasant to talk to if (like me) it does not bother you that he acts like he is much smarter than you.
  8. I personally think the proof of the existence of God is kind of boring. It reads to me as kind of like "I am going to define God to be everything. Notice how this meets a bunch of the criteria people normally attribute to God. In the CTMU, the universe is mind-like. Notice how this meets a bunch more criteria people normally attribute to God."
  9. While the proof of the existence of God feels kind of mundane to me, Chris is the kind of person who chooses to interpret it as a proof of the existence of God. Further, he also has other more concrete supernatural-like and conspiracy-theory-like beliefs, that I expect most people here would want to bet against.
  10. I find the CTMU in general interesting, (but I do not claim to understand it).
  11. I have noticed many thoughts that come naturally to me that do not seem to come naturally to other people (e.g. about time or identity), where it appears to me that Chris Langan just gets it (as in he is independently generating it all).
  12. For years, I have partially depended on a proxy when judging other people (e.g. for recommending funding) that is something like "Do I, Scott, like where my own thoughts go in contact with the other person?" Chris Langan is approximately at the top according to this proxy.
  13. I believe I and others here probably have a lot to learn from Chris, and arguments of the form "Chris confidently believes false thing X," are not really a crux for me about this.
  14. IQ is not the real think-oomph (and I think Chris agrees), but Chris is very smart, and one should be wary of clever arguers, especially when trying to learn from someone with much higher IQ than you.
  15. I feel like I am spending (a small amount of) social credit in this comment, in that when I imagine a typical LWer thinking "oh, Scott semi-endorses Chris, maybe I should look into Chris," I imagine the most likely result is that they will reach the conclusion that Chris is a crackpot, and that Scott's semi-endorsements should be trusted less.

Comment by Scott Garrabrant on Geometric Rationality is Not VNM Rational · 2023-09-01T00:26:32.498Z · LW · GW

So, I am trying to talk about the preferences of the couple, not the preferences of either individual. You might reject that the couple is capable of having preferences; if so, I am curious whether you think Bob is capable of having preferences but not the couple, and if so, why?

I agree that if you can do arbitrary utility transfers between Alice and Bob at a given exchange rate, then they should maximize the sum of their utilities (at that exchange rate) and do a side transfer. However, I am assuming here that efficient compensation is not possible. I specifically made it a relatively big decision, so that compensation would not obviously be possible.
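
A toy illustration with my own numbers: at exchange rate 1, an option giving Alice $+10$ and Bob $-4$ satisfies

$$10 + (-4) = 6 > 0,$$

so the sum-maximizing couple takes it, and Alice transfers any $t \in (4, 10)$ to Bob, leaving both strictly better off relative to refusing. The assumption above is exactly that no such transfer is available.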

Comment by Scott Garrabrant on Infrafunctions and Robust Optimization · 2023-05-02T01:13:54.658Z · LW · GW

Here are the most interesting things about these objects to me that I think this post does not capture. 

Given a distribution over non-negative non-identically-zero infrafunctions, up to a positive scalar multiple, the pointwise geometric expectation exists, and is an infrafunction (up to a positive scalar multiple).

(I am not going to give all the math and be careful here, but hopefully this comment will provide enough of a pointer if someone wants to investigate this.)

This is a bit of a miracle. Compare this with the arithmetic expectation of utility functions, which is not always well defined. For example, if you have a sequence of utility functions U_n, each with weight 2^{-n}, but which alternate in which of two outcomes they prefer, and each utility function gets an internal weighting to cancel out their small weight and then some, the expected utility will not exist. There will be a series of larger and larger utility monsters canceling each other out, and the limit will not exist. You could fix this by requiring your utility functions to be bounded, as is standard for dealing with utility monsters, but it is really interesting that in the case of infrafunctions and geometric expectation, you don't have to.

If you try to do a similar trick with infrafunctions, the geometric expectation will go to infinity, but since you are only working up to a positive scalar multiple, you can renormalize everything to make things well defined.

We needed the geometric expectation to only be working up to a scalar multiple, and you can't expect to get a utility function if you take a geometric expectation of utility functions (but you do get an infrafunction!).

If you start with utility functions, and then merge them geometrically, the resulting infrafunction will be maximized at the Nash bargaining solution, but the entire infrafunction can be thought of as an extended preference over lotteries of the pair of utility functions, whereas Nash bargaining only told you the maximum. In this way, geometric merging of infrafunctions starts with an input more general than the utility functions of Nash bargaining, and gives an output more structured than the output of Nash bargaining, and so can be thought of as a way of making Nash bargaining more compositional (since the input and output are now the same type, you can stack merges on top of each other).

For these two reasons (utility monster resistance and extending Nash bargaining), I am very interested in the mathematical object that is non-negative non-identically-zero infrafunctions defined only up to a positive scalar multiple, and more specifically, I am interested in the set of such functions as a convex set where mixing is interpreted as pointwise geometric expectation.
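
A minimal numerical sketch of the two claims above. All of the encoding choices here are mine: "infrafunctions" are represented naively as positive vectors over a finite outcome space, identified up to positive scaling, which drops almost all of their real structure.

```python
import numpy as np

def geometric_expectation(functions, weights):
    """Pointwise weighted geometric mean: exp of the weighted average of logs,
    renormalized to a canonical representative of the up-to-scaling class."""
    logs = np.log(np.asarray(functions))  # shape: (n_functions, n_outcomes)
    mixed = np.exp(np.average(logs, axis=0, weights=weights))
    return mixed / mixed.max()

# Alternating "utility monsters": U_k prefers outcome (k mod 2) and is
# internally scaled by 4^k, while its mixture weight is only 2^{-k}.
n = 12
functions = [np.array([4.0**k, 1.0]) if k % 2 == 0 else np.array([1.0, 4.0**k])
             for k in range(n)]
weights = [2.0**-k for k in range(n)]

# The arithmetic expectation blows up as n grows (terms of size ~2^k):
print("arithmetic:", np.average(functions, axis=0, weights=weights))

# The geometric expectation stays finite after renormalization:
print("geometric:", geometric_expectation(functions, weights))
```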

Comment by Scott Garrabrant on Infrafunctions and Robust Optimization · 2023-05-02T00:54:42.241Z · LW · GW

I have been thinking about this same mathematical object (although with a different orientation/motivation) as where I want to go with a weaker replacement for utility functions.

I get the impression that for Diffractor/Vanessa, the heart of a concave-value-function-on-lotteries is that it represents the worst case utility over some set of possible utility functions. For me, on the other hand, a concave value function represents the capacity for compromise -- if I get at least half the good when I get what I want with 50% probability, then I have the capacity to merge/compromise with others using tools like Nash bargaining.

This brings us to the same mathematical object, but it feels like I am using the definition of a convex set as one where the line segment connecting any two points in the set is also in the set, whereas Diffractor/Vanessa are using the definition of a convex set as an intersection of half-planes.
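
The two definitions, for reference (for closed convex sets they are equivalent, which is the duality being gestured at here):

$$x, y \in C,\ t \in [0,1] \implies tx + (1-t)y \in C \qquad \text{versus} \qquad C = \bigcap_{i \in I} \{x : \langle a_i, x \rangle \le b_i\}.$$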

I think this pattern, where I am more interested in merging and Diffractor and Vanessa are more interested in guarantees but we end up looking at the same math, keeps recurring, and I think the dual definitions of a convex set in part explain (or at least rhyme with) it.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-17T07:59:18.572Z · LW · GW

Then it is equivalent to the thing I call B2 in edit 2 in the post (Assuming A1-A3).

In this case, your modified B2 is my B2, and your B3 is my A4, which follows from A5 assuming A1-A3 and B2, so your suspicion that these imply C4 is stronger than my Q6, which is false, as I argue here.

However, without A5, it is actually much easier to see that this doesn't work. The counterexample here satisfies my A1-A3, your weaker version of B2, your B3, and violates C4.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-16T18:49:42.547Z · LW · GW

Your B3 is equivalent to A4 (assuming A1-3).

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-16T18:45:06.696Z · LW · GW

Your B2 is going to rule out a bunch of concave functions. I was hoping to only use axioms consistent with all (continuous) concave functions.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T08:12:37.022Z · LW · GW

I am skeptical that it will be possible to salvage any nice VNM-like theorem here that makes it all the way to concavity. It seems like the jump necessary to fix this counterexample will be hard to express in terms of only a preference relation.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T08:09:20.803Z · LW · GW

The answers to Q3, Q4 and Q6 are all no. I will give a sketchy argument here.

Consider the one dimensional case, where the lotteries are represented by real numbers in the interval $[0,1]$, and consider a continuous function $f$ on this interval. Let $\preceq$ be the preference order given by $x \preceq y$ if and only if $f(x) \leq f(y)$.

$f$ is continuous and quasi-concave, which means $\preceq$ is going to satisfy A1, A2, A3, A4, and B2. Further, since $f$ is monotonically increasing up to the unique argmax, and then monotonically decreasing, $\preceq$ is going to satisfy A5.

$f$ is not concave, but we need to show there is not another concave function giving the same preference relation as $f$. The only way to keep the same preference relation is to compose $f$ with a strictly monotonic function $g$, giving $g \circ f$.

If $g$ is smooth, we have a problem, since $(g \circ f)'' = g''(f) \cdot (f')^2 + g'(f) \cdot f''$. Since $f$ is not concave, $f''$ must be positive on some interval, but concavity would require $(g \circ f)'$ to be decreasing there.

In order to remove the inflection point at $x_0$, we need to flatten it out with some $g$ that has infinite slope at $f(x_0)$. For example, we could take $g(y) = \sqrt[3]{y - f(x_0)}$. However, any $g$ that removes the inflection point at $x_0$ will end up adding an inflection point at the matching point $x_1$ on the other side of the argmax (where $f(x_1) = f(x_0)$), which will have an infinite negative slope. This newly created inflection point will cause a problem for similar reasons.
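
Purely as a hypothetical stand-in for $f$ (my choice, with the properties the argument needs: continuous, quasi-concave, unique argmax, convex near the endpoints), one can sanity-check $f(x) = \sin^2(\pi x)$:

```python
import numpy as np

# Hypothetical stand-in for f, not necessarily the original formula.
f = lambda x: np.sin(np.pi * x) ** 2

xs = np.linspace(0.0, 1.0, 1001)
ys = f(xs)

# Unimodal (increasing to the argmax at 1/2, then decreasing) => quasi-concave.
peak = np.argmax(ys)
assert np.all(np.diff(ys[: peak + 1]) >= 0) and np.all(np.diff(ys[peak:]) <= 0)

# Not concave: midpoint concavity fails on the convex stretch [0, 1/4]
# to the left of the inflection point at x = 1/4.
a, b = 0.0, 0.25
assert f((a + b) / 2) < (f(a) + f(b)) / 2  # 0.146... < 0.25
```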

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T06:58:45.138Z · LW · GW

You can also think of A5 in terms of its contrapositive: For all $A$ and $B$, if $B \prec A$, then $B \prec pA + (1-p)B$ for all $p \in (0,1)$.

This is basically just the strict version of A4. I probably should have written it that way instead. I wanted to use $\preceq$ instead of $\prec$, because it is closer to the base definition, but that is not how I was natively thinking about it, and I probably should have written it the way I think about it.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T06:41:10.461Z · LW · GW

Alex's counterexample as stated is not a counterexample to Q4, since it is in fact concave.
 

I believe your counterexample violates A5, for a suitable choice of the lotteries in the axiom.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T05:07:06.512Z · LW · GW

That does not rule out your counterexample. The condition is never met in your counterexample.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T04:32:32.817Z · LW · GW

The answer to Q1 is no, using the same counterexample here. However, the spirit of my original question lives on in Q4 (and Q6).

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T04:14:42.388Z · LW · GW

Claim: A1, A2, A3, A5, and B2 imply A4.

Proof: Assume we have a preference ordering that satisfies A1, A2, A3, A5, and B2, and consider lotteries $A$ and $B$, with $B \preceq A$. Let $C = pA + (1-p)B$ for some $p \in (0,1)$. It suffices to show $B \preceq C$. Assume not, for the purpose of contradiction. Then (by axiom A1), $C \prec B$. Thus by axiom B2 there exists a $D$ such that $C \prec D \prec B$. By axiom A3, we may assume $D = qC + (1-q)B$ for some $q \in (0,1)$. Observe that $D = rA + (1-r)B$ where $r = qp$ is positive, since otherwise $D = B$. Thus, we can apply A5 to get that since $D \prec B$, we have $A \prec B$. Thus $A \prec B \preceq A$, a contradiction.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T03:19:02.295Z · LW · GW

Oh, nvm, that is fine, maybe it works.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T03:14:20.748Z · LW · GW

Oh, no, I made a mistake, this counterexample violates A3. However, the proposed fix still doesn't work, because you just need a function that is decreasing in the probability of one outcome, does not hit 0, and then jumps to 0 when the probability of that outcome is 1.
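
For a concrete instantiation (my choice of formula, not from the original), write $p$ for the probability of that outcome:

$$f(p) = \begin{cases} 1 - p/2 & \text{if } p < 1 \\ 0 & \text{if } p = 1 \end{cases}$$

This is decreasing in $p$, stays above $1/2$ for $p < 1$, and then jumps to $0$ at $p = 1$.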

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T03:10:38.626Z · LW · GW

I haven't actually thought about whether A5 implies A4 though. It is plausible that it does (together with A1-A3, or some other simple axioms).

When $B \prec A$, we get A4 from A5, so it suffices to replace A4 with the special case that $A \sim B$. If $A \sim B$, and $C = pA + (1-p)B \prec B$ for a mixture $C$ of $A$ and $B$, then all we need to do is have any $Y$ such that $C \prec Y \prec B$; then we can get $Y$ between $C$ and $B$ by A3, and then $Y$ will also be a mixture of $A$ and $B$, contradicting A5, since $Y \prec B \sim A$.

A1, A2, A3, A5 do not imply A4 directly, because you can have the function that assigns utility 0 to a fair coin flip between two options, and utility 1 to everything else. However, I suspect that when we add the right axiom to imply continuity, that will be sufficient to also allow us to remove A4, and only have A5.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T03:08:52.579Z · LW · GW

(and everywhere you say "good" and "bad", they are the non-strict versions of the words)

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T02:52:57.722Z · LW · GW

Your understanding of A4 is right. In A5, "good" should be replaced with "bad."

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T02:50:45.862Z · LW · GW

You have the inequality backwards. You can't apply A5 when the mixture is better than the endpoint, only when the mixture is worse than the endpoint.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T02:00:21.424Z · LW · GW

That proposed axiom to add does not work. Consider the function on lotteries over three outcomes that gives utility 1 if the third outcome is supported, and otherwise gives utility equal to the probability of the first outcome. This function is concave but not continuous, satisfies A1-A5 and the extra axiom I just proposed, and cannot be made continuous.
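
A quick numerical check of the function just described (the representation of lotteries as probability triples and the outcome labels are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# On lotteries (p1, p2, p3): utility 1 whenever p3 > 0, and p1 otherwise.
def u(lottery):
    p1, _, p3 = lottery
    return 1.0 if p3 > 0 else p1

def random_lottery():
    x = rng.random(3)
    if rng.random() < 0.5:
        x[2] = 0.0  # exercise the p3 == 0 branch too
    return x / x.sum()

# Concavity: u(t*L1 + (1-t)*L2) >= t*u(L1) + (1-t)*u(L2) on random mixtures.
for _ in range(10_000):
    L1, L2, t = random_lottery(), random_lottery(), rng.random()
    mix = t * L1 + (1 - t) * L2
    assert u(mix) >= t * u(L1) + (1 - t) * u(L2) - 1e-12

# Discontinuity: (0, 1-eps, eps) has utility 1 for every eps > 0,
# but its limit (0, 1, 0) has utility 0.
print(u((0.0, 1 - 1e-9, 1e-9)), u((0.0, 1.0, 0.0)))  # 1.0 0.0
```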

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T01:53:44.994Z · LW · GW

I edited the post to remove the continuity assumption from the main conclusion. However, my guess is that if we get a VNM-like result, we will want to add back in another axiom that gives us continuity.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T01:31:12.362Z · LW · GW

I meant each conclusion to be added on top of the previous one, so this actually also answers the main question I stated, by violating continuity, but not the main question I care about. I will edit the post to say that I actually care about concavity, even without continuity.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T01:23:51.716Z · LW · GW

Nice! This, of course, seems like something we should salvage, by e.g. adding an axiom that if A is strictly preferred to B, there should be a lottery strictly between them.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T01:00:36.106Z · LW · GW

To see why A1-A4 is not enough to prove C4 on its own, consider the preference relation on the space of lotteries between two outcomes X and Y, writing $p$ for the probability of X, such that all lotteries are equivalent if $p \leq 1/2$, and if $p > 1/2$, higher values of $p$ are preferred. This satisfies A1-A4, but cannot be expressed with a concave function, since we would have to have $u(0) = u(1/2) < u(1)$, contradicting concavity. We can, however, express it with a quasi-concave function: $u(p) = \max(p, 1/2)$.

Comment by Scott Garrabrant on Concave Utility Question · 2023-04-15T00:47:42.432Z · LW · GW

I believe using A4 (and maybe also A5) in multiple places will be important to proving a positive result. This is because A1, A2, and A3 are extremely weak on their own.

A1-A3 is not even enough to prove C1. To see a counterexample, take any well ordering on $[0,1]$, and consider the preference ordering over the space of lotteries on a two element set of deterministic outcomes. If two lotteries have probabilities of the first outcome that differ by a rational number, they are equivalent; otherwise, you compare them according to your well ordering. This clearly satisfies A1 and A2, and it satisfies A3, since every nonempty open set contains lotteries incomparable with any given lottery. However, it has a continuum-length ascending chain of strict preferences, and so can't be captured in a function to the interval.
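
One way to write the construction (my notation, comparing the $\mathbb{Q}$-translation equivalence classes, which keeps the relation transitive): for lotteries with probabilities $p, q$ of the first outcome,

$$p \sim q \iff p - q \in \mathbb{Q}, \qquad p \prec q \iff p \not\sim q \text{ and } [p] \lhd [q],$$

where $\lhd$ is a well ordering of the equivalence classes $[0,1]/\mathbb{Q}$. The chain comes from enumerating the continuum-many classes in $\lhd$-order; an uncountable well-ordered chain cannot embed in $[0,1]$.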

Further, one might hope that C1 together with A3 would be enough to conclude C2, but this is also not possible, since there are discontinuous functions on the simplex that are continuous when restricted to any line segment in the domain. 

In both of these cases, it seems to me like there is hope that A4 provides enough structure to eliminate the pathological counterexamples, since there is much less you can do with convex upsets.

Comment by Scott Garrabrant on Review of AI Alignment Progress · 2023-02-10T19:28:37.123Z · LW · GW

Even if EUM doesn't get "utility", I think it at least gets "utility function", since "function" implies cardinal utility rather than ordinal utility and I think people almost always mean EUM when talking about cardinal utility.

I personally care about cardinal utility, where the magnitude of the utility is information about how to aggregate rather than information about how to take lotteries, but I think this is a very small minority usage of cardinal utility, so I don't think it should change the naming convention very much. 

Comment by Scott Garrabrant on Review of AI Alignment Progress · 2023-02-10T19:17:11.559Z · LW · GW

I think UDT as you specified it has utility functions. What do you mean by doesn't have independence? I am advocating for an updateless agent model that might strictly prefer a mixture between outcomes A and B to either A or B deterministically. I think an agent model with this property should not be described as having a "utility." Maybe I am conflating "utility" with expected utility maximization/VNM and you are meaning something more general? 

If you mean by utility something more general than utility as used in EUM, then I think it is mostly a terminological issue. 

I think I endorse the word "utility" without any qualifiers as referring to EUM. In part because I think that is how it is used, and in part because EUM is nice enough to deserve the word utility.

Comment by Scott Garrabrant on Review of AI Alignment Progress · 2023-02-08T16:54:47.274Z · LW · GW

Although I note that my flavor of rejecting utility functions is trying to replace them with something more general, not something incompatible.

Comment by Scott Garrabrant on Review of AI Alignment Progress · 2023-02-08T16:48:03.945Z · LW · GW

I feel like reflective stability is what caused me to reject utility. Specifically, it seems like it is impossible to be reflectively stable if I am the kind of mind that would follow the style of argument given for the independence axiom. It seems like there is a conflict between reflective stability and Bayesian updating. 

I am choosing reflective stability, in spite of the fact that losing updating is making things very messy and confusing (especially in the logical setting), because reflective stability is that important.

When I lose updating, I lose the independence axiom, and thus utility goes along with it.

Comment by Scott Garrabrant on Basics of Rationalist Discourse · 2023-01-27T07:52:05.710Z · LW · GW

I think the short statement would be a lot weaker (and better IMO) if "inability" were replaced with "inability or unwillingness". "Inability" is implying a hierarchy where falsifiable statements are better than the poetry, since the only reason why you would resort to poetry is if you are unable to turn it into falsifiable statements.

Comment by Scott Garrabrant on Basics of Rationalist Discourse · 2023-01-27T06:01:15.881Z · LW · GW

I would also love a more personalized/detailed description of how I made this list, and what I do poorly. 

I think I have imposter syndrome here. My top guess is that I do actually have some skill in communication/discourse, but my identity/inside view really wants to reject this possibility. I think this is because I (correctly) think of myself as very bad at some of the subskills related to passing people's ITTs.

Comment by Scott Garrabrant on Geometric Rationality is Not VNM Rational · 2023-01-08T14:07:19.040Z · LW · GW

From listening to that podcast, it seems like even she would not advocate for preferring a lottery between two outcomes to either of the pure components.

Comment by Scott Garrabrant on Finite Factored Sets · 2023-01-07T21:52:56.694Z · LW · GW

This underrated post is pretty good at explaining how to translate between FFSs and DAGs.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-29T00:56:20.823Z · LW · GW

Hmm, examples are hard. Maybe the intuitions contribute to the concept of edge instantiation?

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T21:37:18.465Z · LW · GW

I note that EU maximization has this baggage of never strictly preferring a lottery over outcomes to the component outcomes, and your steelmen appear to me not to carry that baggage. I think that baggage is actually doing work in some people's reasoning and intuitions.
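
For reference, that baggage in expected-utility terms is the standard fact that a mixture's value is a weighted average of its components' values:

$$\mathbb{E}[u](pA + (1-p)B) = p \, u(A) + (1-p) \, u(B) \le \max(u(A), u(B)),$$

so an EU maximizer can never strictly prefer the mixture to both of its components.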

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T19:06:36.140Z · LW · GW

I am not sure if there is any disagreement in this comment. What you say sounds right to me. I agree that UDT does not really set us up to want to talk about "coherence" in the first place, which makes it weird to have it be formalized in term of expected utility maximization.

This does not make me think intelligent/rational agents will/should converge to having utility.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T18:57:40.952Z · LW · GW

Yeah, I don't have a specific UDT proposal in mind. Maybe instead of "updateless" I should say "the kind of mind that might get counterfactually mugged" as in this example.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T18:54:19.172Z · LW · GW

FDT and UDT are formulated in terms of expected utility. I am saying that they advocate for a way of thinking about the world that makes it so that you don't just Bayesian update on your observations and forget about the other possible worlds.

Once you take on this worldview, the Dutch books that made you believe in expected utility in the first place are less convincing, so maybe we want to rethink utility.

I don't know what the FDT authors were thinking, but it seems like they did not propagate the consequences of the worldview into reevaluating what preferences over outcomes look like.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T18:44:06.512Z · LW · GW

No, at least probably not at the time that we lose all control. 

However, I expect that systems that are self-transparent and can easily self-modify might quickly converge to reflective stability (and thus updatelessness). They might not, but I think the same arguments that might make you think they would develop a utility function can also be used to argue that they would develop updatelessness (and thus possibly also not develop a utility function).

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T18:35:32.151Z · LW · GW

Here is a situation where you make an "observation" and can still interact with the other possible worlds. Maybe you do not want to call this an observation, but if you don't call it an observation, then true observations probably never really happen in practice.

I was not trying to say that is relevant to the coin flip directly. I was trying to say that the move used to justify the coin flip is the same move that is rejected in other contexts, and so we should be open to the idea of agents that refuse to make that move, and thus might not have utility.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T00:29:06.596Z · LW · GW

I think UDT is as you say. I think it is also important to clarify that you are not updating on your observations when you decide on a policy. (If you did, you wouldn't really be choosing a function from observations to actions; this is important to emphasize in UDT.)

Note that I am using "updateless" differently than "UDT". By updateless, I mostly mean anything that is not performing Bayesian updates and forgetting the other possible worlds when it makes observations. UDT is more of a specific proposal. "Updateless" is more of a negative property, defined by a lack of updating.

I have been trying to write a big post on utility, and haven't yet, and decided it would be good to give a quick argument here because of the question. The only posts I remember making against utility are in the geometric rationality sequence, especially this post.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-28T00:08:05.279Z · LW · GW

You could take as an input parameter to UDT a preference ordering over lotteries that does not satisfy the independence axiom, but is a total order (or total preorder if you want ties). Each policy you can take results in a lottery over outcomes, and you take the policy that gives your favorite lottery. There is no need for the assumption that your preferences over lotteries is vNM.
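
A toy sketch of that recipe (the world model, the names, and the particular independence-violating preference are all my own construction, just to illustrate the type signature):

```python
from itertools import product

observations = ["heads", "tails"]
actions = ["hedge", "commit"]
p_obs = {"heads": 0.5, "tails": 0.5}  # toy world model

values = {("heads", "hedge"): 1, ("heads", "commit"): 2,
          ("tails", "hedge"): 1, ("tails", "commit"): 0}

def score(lottery):
    """A total preorder over lotteries that violates independence:
    expected value plus a bonus for spreading probability, so a 50/50
    mix of two equally-valued outcomes beats either one for sure."""
    expected = sum(p * values[o] for o, p in lottery.items())
    spread = 1 - sum(p * p for p in lottery.values())
    return expected + spread

def induced_lottery(policy):
    """Each policy (a map observation -> action) induces a lottery over
    outcomes, with no updating anywhere."""
    lottery = {}
    for obs, act in policy.items():
        outcome = (obs, act)
        lottery[outcome] = lottery.get(outcome, 0.0) + p_obs[obs]
    return lottery

policies = [dict(zip(observations, acts))
            for acts in product(actions, repeat=len(observations))]
best = max(policies, key=lambda pol: score(induced_lottery(pol)))
print(best, induced_lottery(best))
```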

Note that I don't think that we really understand decision theory, and have a coherent proposal. The only thing I feel like I can say confidently is that if you are convinced by the style of argument that is used to argue for the independence axiom, then you should probably also be convinced by arguments that cause you to be updateful and thus not reflectively stable.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-27T23:55:28.890Z · LW · GW

Also, if by "have a utility function" you mean something other than "try to maximize expected utility," I don't know what you mean. To me, the cardinal (as opposed to ordinal) structure of preferences that makes me want to call something a "utility function" is about how to choose between lotteries.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-27T23:52:29.976Z · LW · GW

Note that I am not saying here that rational agents can't have a utility function. I am only saying that they don't have to.

Comment by Scott Garrabrant on Why The Focus on Expected Utility Maximisers? · 2022-12-27T23:40:42.930Z · LW · GW

That depends on what you mean by "suitably coherent." If you mean they need to satisfy the vNM independence axiom, then yes. But the point is that I don't see any good argument why updateless agents should satisfy that axiom. The argument for that axiom passes through wanting to have a certain relationship with Bayesian updating.