Posts

Invulnerable Incomplete Preferences: A Formal Statement 2023-08-30T21:59:36.186Z
Rational Unilateralists Aren't So Cursed 2023-07-04T12:19:12.048Z
What’s this probability you’re reporting? 2023-04-14T15:07:42.844Z

Comments

Comment by Sami Petersen (sami-petersen) on Wei Dai's Shortform · 2024-03-04T12:37:52.288Z · LW · GW

It may be worth thinking about why proponents of a very popular idea in this community don't know of its academic analogues, despite them having existed since the early 90s[1] and appearing on the introductory SEP page for dynamic choice.

Academics may in turn ask: clearly LessWrong has some blind spots, but how big?

  1. ^

    And it's not like these have been forgotten; e.g., McClennen's (1990) work still gets cited regularly.

Comment by Sami Petersen (sami-petersen) on Meaning & Agency · 2024-01-11T19:04:35.580Z · LW · GW

I argued that the signal-theoretic[1] analysis of meaning (which is the most common Bayesian analysis of communication) fails to adequately define lying, and fails to offer any distinction between denotation and connotation or literal content vs conversational implicature.

In case you haven't come across this, here are two papers on lying by the founders of the modern economics literature on communication. I've only skimmed your discussion, but if this is relevant, here's a great non-technical discussion of lying in that framework. A common thread in these discussions is that the apparent "no-lying" implication of the analysis of language in the Lewis-Skyrms/Crawford-Sobel signalling tradition relies importantly on common knowledge of rationality and, implicitly, on common knowledge of the game being played, i.e. of the available actions and all the players' preferences.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-31T11:28:12.188Z · LW · GW

In your example, DSM permits the agent to end up with either A+ or B. Neither is strictly dominated, and neither has become mandatory for the agent to choose over the other. The agent won't have reason to push probability mass from one towards the other.

You can think of me as trying to run an obvious-to-me assertion test on code which I haven't carefully inspected, to see if the result of the test looks sane.

This is reasonable but I think my response to your comment will mainly involve re-stating what I wrote in the post, so maybe it'll be easier to point to the relevant sections: 3.1. for what DSM mandates when the agent has beliefs about its decision tree, 3.2.2 for what DSM mandates when the agent hadn't considered an actualised continuation of its decision tree, and 3.3. for discussion of these results. In particular, the following paragraphs are meant to illustrate what DSM mandates in the least favourable epistemic state that the agent could be in (unawareness with new options appearing):

It seems we can’t guarantee non-trammelling in general and between all prospects. But we don’t need to guarantee this for all prospects to guarantee it for some, even under awareness growth. Indeed, as we’ve now shown, there are always prospects with respect to which the agent never gets trammelled, no matter how many choices it faces. In fact, whenever the tree expansion does not bring about new prospects, trammelling will never occur (Proposition 7). And even when it does, trammelling is bounded above by the number of comparability classes (Proposition 10).

And it’s intuitive why this would be: we’re simply picking out the best prospects in each class. For instance, suppose prospects were representable as pairs (x, y) that are comparable iff the x-values are the same, and then preferred to the extent that y is large. Then here’s the process: for each value of x, identify the options that maximise y. Put all of these in a set. Then choice between any options in that set will always remain arbitrary; never trammelled.
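
To make that process concrete, here's a minimal sketch (my own illustration, not code from the post), where a prospect is a pair (x, y), x indexes the comparability class, and y is the within-class ranking:

    # Pick out the maximal prospects in each comparability class.
    # Prospects are (x, y) pairs: comparable iff x matches, better iff y is larger.
    from collections import defaultdict

    def untrammelled_set(prospects):
        by_class = defaultdict(list)
        for x, y in prospects:
            by_class[x].append(y)
        return {(x, max(ys)) for x, ys in by_class.items()}

    # Two comparability classes; choice within the returned set always remains arbitrary.
    print(untrammelled_set([(0, 1), (0, 2), (1, 5), (1, 3)]))  # {(0, 2), (1, 5)}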

Comment by Sami Petersen (sami-petersen) on What's Hard About The Shutdown Problem · 2023-10-23T23:51:00.212Z · LW · GW

The key question is whether the revealed preferences are immune to trammelling. This was a major point of confusion for me in discussion with Sami - his proposal involves a set of preferences passed into a decision rule, but those “preferences” are (potentially) different from the revealed preferences. (I'm still unsure whether Sami's proposal solves the problem.)

I claim that, yes, the revealed preferences in this sense are immune to trammelling. I'm happy to continue the existing discussion thread but here's a short motivation: what my results about trammelling show is that there will always be multiple (relevant) options between which the agent lacks a preference and between which the DSM choice rule does not mandate picking one over another. The agent will not try to push probability mass toward one of those options over another.

Comment by Sami Petersen (sami-petersen) on What's Hard About The Shutdown Problem · 2023-10-23T23:39:02.210Z · LW · GW

(I learned from Sami’s post that this is called “trammelling” of incomplete preferences.)

Just for reference: this isn't a standard term of art; I made it up. Though I do think it's fitting.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-21T14:04:03.212Z · LW · GW

Great, I think bits of this comment help me understand what you're pointing to.

the desired behavior implies a revealed preference gap

I think this is roughly right, together with all the caveats about the exact statements of Thornley's impossibility theorems. Speaking precisely here will be cumbersome, so for the sake of clarity I'll try to restate what you wrote like this:

  1. Useful agents satisfying completeness and other properties X won't be shutdownable.
  2. Properties X are necessary for an agent to be useful.
  3. So, useful agents satisfying completeness won't be shutdownable.
  4. So, if a useful agent is shutdownable, its preferences are incomplete.

This argument would let us say that observing usefulness and shutdownability reveals a preferential gap.

I think the question I'm interested in is: "do trammelling-style issues imply that DSM agents will not have a revealed preference gap (under reasonable assumptions about their environment and capabilities)?"

A quick distinction: an agent can (i) reveal p, (ii) reveal ¬p, or (iii) neither reveal p nor ¬p. The problem of underdetermination of preference is of the third form.

We can think of some of the properties we've discussed as 'tests' of incomparability, which might or might not reveal preferential gaps. The test in the argument just above is whether the agent is useful and shutdownable. The test I use for my results above (roughly) is 'arbitrary choice'. The reason I use that test is that my results are self-contained; I don't make use of Thornley's various requirements for shutdownability. Of course, arbitrary choice isn't what we want for shutdownability. It's just a test for incomparability that I used for an agent that isn't yet endowed with Thornley's other requirements.

The trammelling results, though, don't give me any reason to think that DSM is problematic for shutdownability. I haven't formally characterised an agent satisfying DSM as well as TND, Stochastic Near-Dominance, and so on, so I can't yet give a definitive or exact answer to how DSM affects the behaviour of a Thornley-style agent. (This is something I'll be working on.) But regarding trammelling, I think my results are reasons for optimism if anything. Even in the least convenient case that I looked at—awareness growth—I wrote this in section 3.3. as an intuition pump:

we’re simply picking out the best prospects in each class. For instance, suppose prospects were representable as pairs (x, y) that are comparable iff the x-values are the same, and then preferred to the extent that y is large. Then here’s the process: for each value of x, identify the options that maximise y. Put all of these in a set. Then choice between any options in that set will always remain arbitrary; never trammelled.

That is, we retain the preferential gap between the options we want a preferential gap between.


[As an aside, the description in your first paragraph of what we want from a shutdownable agent doesn't quite match Thornley's setup; the relevant part to see this is section 10.1. here.]

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-20T09:09:13.707Z · LW · GW

On my understanding, the argument isn’t that your DSM agent can be made better off, but that the reason it can’t be made better off is because it is engaging in trammeling/“collusion”, and that the form of “trammeling” you’ve ruled out isn’t the useful kind.

I don't see how this could be right. Consider the bounding results on trammelling under unawareness (e.g. Proposition 10). They show that there will always be a set of options between which DSM does not require choosing one over the other. Suppose these are X and Y. The agent will always be able to choose either one. They might end up always choosing X, always Y, switching back and forth, whatever. This doesn't look like the outcome of two subagents, one preferring X and the other Y, negotiating to get some portion of the picks.

As far as an example goes, consider a sequence of actions which, starting from an unpressed world state, routes through a pressed world state (or series of pressed world states), before eventually returning to an unpressed world state with higher utility than the initial state.

Forgive me; I'm still not seeing it. For coming up with examples, I think for now it's unhelpful to use the shutdown problem, because the actual proposal from Thornley includes several more requirements. I think it's perfectly fine to construct examples about trammelling and subagents using something like this: A is a set of options with typical member a_i. These are all comparable and ranked according to their subscripts. That is, a_1 is preferred to a_2, and so on. Likewise with set B. And all options in A are incomparable to all options in B.

If your proposed DSM agent passes up this action sequence on the grounds that some of the intermediate steps need to bridge between “incomparable” pressed/unpressed trajectories, then it does in fact pass up the certain gain. Conversely, if it doesn’t pass up such a sequence, then its behavior is the same as that of a set of negotiating subagents cooperating in order to form a larger macroagent.

This looks to me like a misunderstanding that I tried to explain in section 3.1. Let me know if not, though, ideally with a worked-out example of the form: "here's the decision tree(s), here's what DSM mandates, here's why it's untrammelled according to the OP definition, and here's why it's problematic."

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-19T19:13:16.789Z · LW · GW

That makes sense, yeah.

Let me first make some comments about revealed preferences that might clarify how I'm seeing this. Preferences are famously underdetermined by limited choice behaviour. If A and B are available and I pick A, you can't infer that I like A more than B — I might be indifferent or unable to compare them. Worse, under uncertainty, you can't tell why I chose some lottery over another even if you assume I have strict preferences between all options — the lottery I choose depends on my beliefs too. In expected utility theory, beliefs and preferences together induce choice, so if we only observe a choice, we have one equation in two unknowns.[1] Given my choice, you'd need to read my mind's probabilities to be able to infer my preferences (and vice versa).[2]
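
Here's a toy numerical illustration of that underdetermination (my own sketch; the states, outcomes, and numbers are made up): the same choice of one act over another is rationalised by two quite different pairs of beliefs and utilities.

    # Two different (belief, utility) hypotheses that both rationalise choosing L1 over L2.
    # Observing only the choice cannot tell them apart.
    def expected_utility(beliefs, utilities, act):
        # `act` maps each state to an outcome; `beliefs` are probabilities over states.
        return sum(p * utilities[act[state]] for state, p in beliefs.items())

    L1 = {"rain": "dry", "sun": "lugging_umbrella"}   # take the umbrella
    L2 = {"rain": "wet", "sun": "hands_free"}         # leave it at home

    # Hypothesis 1: rain is likely, and the utilities are mild.
    hyp1 = ({"rain": 0.8, "sun": 0.2},
            {"dry": 1.0, "lugging_umbrella": 0.0, "wet": 0.0, "hands_free": 1.0})
    # Hypothesis 2: sun is likely, but staying dry is valued very highly.
    hyp2 = ({"rain": 0.3, "sun": 0.7},
            {"dry": 10.0, "lugging_umbrella": 0.0, "wet": 0.0, "hands_free": 1.0})

    for beliefs, utilities in (hyp1, hyp2):
        assert expected_utility(beliefs, utilities, L1) > expected_utility(beliefs, utilities, L2)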

In that sense, preferences (mostly) aren't actually revealed. Economists often assume various things to apply revealed preference theory, e.g. setting beliefs equal to 'objective chances', or assuming a certain functional form for the utility function.

But why do we care about preferences per se, rather than what's revealed? Because we want to predict future behaviour. If you can't infer my preferences from my choices, you can't predict my future choices. In the example above, if my 'revealed preference' between A and B is that I prefer A, then you might make false predictions about my future behaviour (because I might well choose B next time).

Let me know if I'm on the right track for clarifying things. If I am, could you say how you see trammelling/shutdown connecting to revealed preferences as described here, and I'll respond to that?

  1. ^

  2. ^

    The situation is even worse when you can't tell what I'm choosing between, or what my preference relation is defined over.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-19T08:25:54.566Z · LW · GW

I disagree; see my reply to John above.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-19T08:16:02.348Z · LW · GW

if the subagents representing a set of incomplete preferences would trade with each other to emulate more complete preferences, then an agent with the plain set of incomplete preferences would precommit to act in the same way

My results above on invulnerability preclude the possibility that the agent can predictably be made better off by its own lights through an alternative sequence of actions. So I don't think that's possible, though I may be misreading you. Could you give an example of a precommitment that the agent would take? In my mind, an example of this would have to show that the agent (not the negotiating subagents) strictly prefers the commitment to what it otherwise would've done according to DSM etc.

Yeah, I wasn't using Bradley. The full set of coherent completions is overkill, we just need to nail down the partial order.

I agree the full set won't always be needed, at least when we're just after ordinal preferences, though I personally don't have a clear picture of when exactly that holds.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-18T23:34:23.214Z · LW · GW

On John's-simplified-model-of-Thornley's-proposal, we have complete preference orderings over trajectories-in-which-the-button-isn't-pressed and trajectories-in-which-the-button-is-pressed, separately, but no preference between any button-pressed and button-not-pressed trajectory pair.

For the purposes of this discussion, this is right. I don't think the differences between this description and the actual proposal matter in this case.

Represented as subagents, those incomplete preferences require two subagents:

  • One subagent always prefers button pressed to unpressed, is indifferent between unpressed trajectories, and has the original complete order on pressed trajectories.
  • The other subagent always prefers button unpressed to pressed, is indifferent between pressed trajectories, and has the original complete order on unpressed trajectories.

I don't think this representation is quite right, although not for a reason I expect to matter for this discussion. It's a technicality but I'll mention it for completeness. If we're using Bradley's representation theorem from section 2.1., the set of subagents must include every coherent completion of the agent's preferences. E.g., suppose there are three possible trajectories. Let p denote a pressed trajectory and u1, u2 two unpressed trajectories, where u1 gets you strictly more coins than u2. Then there'll be five (ordinal) subagents, described in order of preference: p ≻ u1 ≻ u2; p ∼ u1 ≻ u2; u1 ≻ p ≻ u2; u1 ≻ p ∼ u2; and u1 ≻ u2 ≻ p.
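
For concreteness, here's a rough brute-force sketch (my own code, not from the post or Bradley's paper) that enumerates the ordinal completions of that partial order; it recovers the five subagents listed above.

    # Brute-force the coherent ordinal completions of: u1 strictly preferred to u2,
    # p incomparable to both. A completion is an ordered partition of the options
    # (best indifference class first).
    def ordered_partitions(items):
        if not items:
            yield []
            return
        first, rest = items[0], items[1:]
        for partition in ordered_partitions(rest):
            # place `first` into an existing indifference class...
            for i in range(len(partition)):
                yield partition[:i] + [partition[i] | {first}] + partition[i + 1:]
            # ...or into a new class at any position
            for i in range(len(partition) + 1):
                yield partition[:i] + [{first}] + partition[i + 1:]

    def rank(partition, item):
        return next(i for i, block in enumerate(partition) if item in block)

    completions = [part for part in ordered_partitions(["p", "u1", "u2"])
                   if rank(part, "u1") < rank(part, "u2")]  # must respect u1 over u2
    for c in completions:
        print(c)  # five completions, matching the five subagents above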

But you and Thornley want to interpret "no preference" as "just choose arbitrarily/randomly" rather than "don't trade either for the other", so... this agent just chooses arbitrarily/randomly between the best-available pressed-option and the best-available unpressed-option? But that would imply that the agent is choosing (albeit arbitrarily/randomly) between button-pressed and button-unpressed, which is not what we want, so presumably you're imagining something else?

Indeed, this wouldn't be good, and isn't what Thornley's proposal does. The agent doesn't choose arbitrarily between the best pressed vs unpressed options. Thornley's proposal adds more requirements on the agent to ensure this. My use of 'arbitrary' in the post is a bit misleading in that context. I'm only using it to identify when the agent has multiple permissible options available, which is what we're after to get TND. If no other requirements are added to the agent, and it's acting under certainty, this could well lead it to actually choose arbitrarily. But it doesn't have to in general, and under uncertainty and together with the rest of Thornley's requirements, it doesn't. (The requirements are described in his proposal.)

With that flagged, on to the trammelling issue. The potential problem is that the two subagents might want to trade, so that the system sometimes tries to make the button be pressed (in cases which get relatively high pressed-utility), and sometimes tries to make the button be unpressed (in cases which get relatively high unpressed-utility). (E.g. the pink arrows in the diagram.) And... locking in a trajectory at the start doesn't seem to help that problem at all? Like, the subagents do that trading in logical time (i.e. time zero), that adds preferences, and then sometimes they lock in a plan which involves manipulating the button.

I'll first flag that the results don't rely on subagents. Creating a group agent out of multiple subagents is possibly an interesting way to create an agent representable as having incomplete preferences, but this isn't the same as creating a single agent whose single preference relation happens not to satisfy completeness.

That said, I will spend some more time thinking about the subagent idea, and I do think collusion between them seems like the major initial hurdle for this approach to creating an agent with preferential gaps.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-18T22:25:05.077Z · LW · GW

Why does wanting to maintain indifference to shifting probability mass between (some) trajectories, imply that we care about ex-ante permissibility?

The ex-ante permissible trajectories are the trajectories that the agent lacks any strict preference between. Suppose the permissible trajectories are {A,B,C}. Then, from the agent's perspective, A isn't better than B, B isn't better than A, and so on. The agent considers them all equally choiceworthy. So, the agent doesn't mind picking any one of them over any other, nor therefore switching from one lottery over them with some distribution to another lottery with some other distribution. The agent doesn't care whether it gets A versus B, versus an even chance of A or B, versus a one-third chance of A, B, or C.[1]

Suppose we didn't have multiple permissible options ex-ante. For example, if only A was permissible, then the agent would dislike shifting probability mass away from A and towards B or C—because B and C aren't among the best options.[2] So that's why we want multiple ex-ante permissible trajectories: it's the only way to maintain indifference to shifting probability mass between (those) trajectories.

[I'll respond to the stuff in your second paragraph under your longer comment.]

  1. ^

    The analogous case with complete preferences is clearer: if there are multiple permissible options, the agent must be indifferent between them all (or else the agent would be fine picking a strictly dominated option). So if options A, B, and C are permissible, then u(A) = u(B) = u(C). Assuming expected utility theory, we'll then of course have E_p[u] = E_q[u] for any probability functions p and q over those options. This means the agent is indifferent to shifting probability mass between the permissible options.

  2. ^

    This is a bit simplified but it should get the point across.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-10-16T21:35:34.965Z · LW · GW

This is a tricky topic to think about because it's not obvious how trammelling could be a worry for Thornley's Incomplete Preference Proposal. I think the most important thing to clarify is why care about ex-ante permissibility. I'll try to describe that first (this should help with my responses to downstream concerns).
 

Big picture

Getting terminology out of the way: words like "permissibility" and "mandatory" are shorthand for rankings of prospects. A prospect is permissible iff it's in a choice set, e.g. by satisfying DSM. It's mandatory iff it's the sole element of a choice set.

To see why ex-ante permissibility matters, note that it's essentially a test to see which prospects the agent is either indifferent between or has a preferential gap between (and which aren't ranked below anything else). When you can improve a permissible prospect along some dimension and yet retain the same set of permissible prospects, for example, you necessarily have a preferential gap between those remaining prospects. In short, ex-ante permissibility tells you which prospects the agent doesn't mind picking between.
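
As a toy illustration of that test (my own sketch, with a made-up strict-preference relation): improving one permissible option without knocking out the others reveals a preferential gap among what remains.

    # Sketch: "permissible" = not strictly dominated by any other available option.
    def permissible(options, strictly_better):
        return {o for o in options
                if not any((other, o) in strictly_better for other in options)}

    # A+ is a sweetening of A; B is incomparable to both.
    strictly_better = {("A+", "A")}

    print(permissible({"A", "B"}, strictly_better))        # {'A', 'B'}
    print(permissible({"A", "A+", "B"}, strictly_better))  # {'A+', 'B'}
    # B stays permissible even though A has been improved to A+. If the agent were
    # indifferent between A and B, transitivity would force A+ above B; so there must
    # be a preferential gap between A+ and B (and between A and B).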

The part of the Incomplete Preference Proposal that carries much of the weight is the Timestep Near-Dominance (TND) principle for choice under uncertainty. One thing it does, roughly, is require that the agent does not mind shifting probability mass between trajectories in which the shutdown time differs. And this is where incompleteness comes in. You need preferential gaps between trajectories that differ in shutdown time for this to hold in general. If the agent had complete preferences over trajectories, it would have strict preferences between at least some trajectories that differ in shutdown time, giving it reason to shift probability mass by manipulating the button.

Why TND helps get you shutdownability is described in Thornley's proposal, so I'll refer to his description and take that as a given here. So, roughly, we're using TND to get shutdownability, and we're using incompleteness to get TND. The reason incompleteness helps is that we want to maintain indifference to shifting probability mass between certain trajectories. And that is why we care about ex-ante permissibility. We need the agent, when contemplating manipulating the button, not to want to shift probability mass in that direction. That'll help give us TND. The rest of Thornley's proposal includes further conditions on the agent such that it will in fact, ex-post, not manipulate the button. But the reason for the focus on ex-ante permissibility here is TND.
 

Miscellany

For purposes of e.g. the shutdown problem, or corrigibility more generally, I don't think I care about the difference between "mandatory" vs "actually chosen"?

The description above should help clear up why we care about multiple options being permissible and none mandatory: to help satisfy TND. What's "actually chosen" in my framework doesn't neatly connect to the Thornley proposal since he adds extra scaffolding to the agent to determine how it should act. But that's a separate issue.

The rough mental model I have of DSM is: at time zero, the agent somehow picks between a bunch of different candidate plans (all of which are "permissible", whatever that means), and from then on it will behave-as-though it has complete preferences consistent with that plan.
...
it sounds like the proposal in the post just frontloads all the trammelling - i.e. it happens immediately at timestep zero.

The notion of trammelling I'm using refers to the set of permissible options shrinking as a result of repeated choice. And I argued that there's no trammelling under certainty or uncertainty, and that trammelling under unawareness is bounded. Here's why I don't think you can see it as the agent behaving as if its preferences were complete.

Consider the case of static choice. It's meaningful to say that an agent has incomplete preferences. (I don't think you disagree with that but just for the sake of completeness, I'll give an example.) Suppose the agent has preferential gaps between all different-letter prospects. From {A,A+,B} the agent will pick either A+ or B. Suppose it picks B. That doesn't imply, say, that the agent can be thought of as having a strict preference for B over A+. After all, if you offered it {A,A+,B} once again, it might just pick A+, a contradiction. And you can set up something similar with transitivity to get a contradiction from inferring indifference between A+ and B.
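
A short sketch of that argument (my own code; it also assumes, consistent with the gaps described above, that the agent would pick either option from {A, B}): no complete preference relation reproduces this choice behaviour.

    from itertools import product

    options = ["A", "A+", "B"]

    def maximal(menu, utility):
        best = max(utility[o] for o in menu)
        return {o for o in menu if utility[o] == best}

    # Observed behaviour: from {A, A+, B} the agent sometimes picks A+ and sometimes B;
    # from {A, B} it might pick either; and A+ is strictly preferred to A.
    rationalisable = False
    for ranks in product(range(3), repeat=3):  # every complete (weak) ordering of the options
        u = dict(zip(options, ranks))
        if u["A+"] <= u["A"]:                  # must respect A+ strictly over A
            continue
        if (maximal({"A", "A+", "B"}, u) == {"A+", "B"}
                and maximal({"A", "B"}, u) == {"A", "B"}):
            rationalisable = True
    print(rationalisable)  # False: no complete ordering mimics these preferential gaps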

Onto dynamic choice. As you write, it's reasonable to think of various dynamic choice principles as immediately, statically, choosing a trajectory at timestep zero. Suppose we do that. Then by the argument just above, it's still not appropriate to model the agent as having complete preferences at the time of choosing. We're not frontloading any trammelling; the set of ex-ante permissible prospects hasn't changed. And that's what we care about for TND.

I can kinda vaguely pattern-match the setup in this post to the problem: I want to have one "permissible" choice which involves the shutdown button not being pressed, and another "permissible" choice which involves the button being pressed, and I want these two choices to be incomparable to the agent. Now (my mental model of) the DSM rule says: when the agent is turned on, it somehow chooses between (two plans leading to) those two options, and from then on out acts as though it has complete preferences consistent with the choice - i.e. it either (follows a plan which) makes sure the button is pressed, or (follows a plan which) makes sure the button is not pressed, and actively prevents operators from changing it. Which sounds like not-at-all what I wanted for the shutdown problem!

Agreed! The ex-ante permissibility of various options is not sufficient for shutdownability. The rest of Thornley's proposal outlines how the agent has to pick (lotteries over) trajectories, which involves more than TND.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-30T13:09:08.875Z · LW · GW

Thanks Sylvester; fixed!

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-09T21:45:11.509Z · LW · GW

Thanks for saying!

This is an interesting topic. Regarding the discussion you mention, I think my results might help illustrate Elliott Thornley's point. John Wentworth wrote: 

That makes me think that the small decision trees implicitly contain a lot of assumptions that various trades have zero probability of happening, which is load-bearing for your counterexamples. In a larger world, with a lot more opportunities to trade between various things, I'd expect that sort of issue to be much less relevant.

My results made no assumptions about the size or complexity of the decision trees, so I don't think this itself is a reason to doubt my conclusion. More generally, if there exists some Bayesian decision tree that faithfully represents an agent's decision problem, and the agent uses the appropriate decision principles with respect to that tree, then my results apply. The existence of such a representation is not hindered by the number of choices, the number of options, or the subjective probability distributions involved.

I think my results under unawareness (section 3) are particularly likely to be applicable to complex real-world decision problems. The agent can be entirely wrong about its actual decision tree—e.g., falsely assigning probability zero to events that will occur—and yet appropriate opportunism remains and trammelling is bounded. This is because any suboptimal decision by an agent in these kinds of cases is a product of its epistemic state, not its preferences. Whether the agent's preferences are complete or not, it will make wrong turns in the same class of situations. The globally-DSM choice function will guarantee that the agent couldn't have done better given its knowledge and values, even if the agent's model of the world is wrong.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-06T22:54:34.829Z · LW · GW

Good question. They implicitly assume a dynamic choice principle and a choice function that leaves the agent non-opportunistic.

  • Their dynamic choice principle is something like myopia: the agent only looks at their node's immediate successors and, if a successor is yet another choice node, the agent represents it as some 'default' prospect.
  • Their choice rule is something like this: the agent assigns some natural 'default' prospect and deviates from it iff it prefers some other prospect. (So if some prospect is incomparable to the default, it's never chosen.)

These aren't the only approaches an agent can employ, and that's where the argument fails. It's wrong to conclude that "non-dominated strategy implies utility maximization" since we know from section 2 that we can achieve non-domination without completeness, by using a different dynamic choice principle and choice function.
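
To illustrate the difference in choice rules (my own sketch, not their model): a default-based rule never moves to an option that's merely incomparable to the default, whereas a maximality-based rule keeps every undominated option open.

    # Default-based rule: deviate from the default only if something is strictly preferred to it.
    def default_rule(default, options, strictly_better):
        better = [o for o in options if (o, default) in strictly_better]
        return better[0] if better else default  # incomparable options are never chosen

    # Maximality-based rule: everything not strictly dominated is permissible.
    def maximal_rule(options, strictly_better):
        return {o for o in options if not any((x, o) in strictly_better for x in options)}

    strictly_better = {("B+", "B")}  # B+ sweetens B; A is incomparable to both
    print(default_rule("A", {"A", "B+", "B"}, strictly_better))  # 'A': never trades A for B+
    print(maximal_rule({"A", "B+", "B"}, strictly_better))       # {'A', 'B+'}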

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-06T22:28:42.279Z · LW · GW

I take certainty to be a special case of uncertainty. Regarding proof, the relevant bit is here:

This argument does not apply when the agent is unaware of the structure of its decision tree, so I provide some formal results for these cases which bound the extent to which preferences can de facto be completed. ... These results apply naturally to cases in which agents are unaware of the state space, but readers sceptical of the earlier conceptual argument can re-purpose them to make analogous claims in standard cases of certainty and uncertainty.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-06T22:08:39.304Z · LW · GW

No, the codomain of gamma is the set of (distributions over) consequences.

Hammond's notation is inspired by the Savage framework in which states and consequences are distinct. Savage thinks of a consequence as the result of behaviour or action in some state, though this isn't so intuitively applicable in the case of decision trees. I included it for completeness but I don't use the gamma function explicitly anywhere.

Comment by Sami Petersen (sami-petersen) on Invulnerable Incomplete Preferences: A Formal Statement · 2023-09-06T21:45:19.976Z · LW · GW

It's the set of elementary states.

So E is an event (a subset of elementary states, E ⊆ S).

E.g., we could have S be all the possible worlds; E be the possible worlds in which featherless bipeds evolved; and s be our actual world.

Comment by Sami Petersen (sami-petersen) on Rational Unilateralists Aren't So Cursed · 2023-07-05T10:43:28.846Z · LW · GW

doesn’t Bostrom’s model of “naive unilateralists” by definition preclude updating on the behavior of other group members?

Yeah, this is right; it's what I tried to clarify in the second paragraph.

isn’t updating on the beliefs of others (as signaled by their behavior) an example of adopting a version of the “principle of conformity” that he endorses as a solution to the curse? If so, it seems like you are framing a proof of Bostrom’s point as a rebuttal to it.

The introduction of the post tries to explain how this post relates to Bostrom et al's paper (e.g., I'm not rebutting Bostrom et al). But I'll say some more here.

You're broadly right on the principle of conformity. The paper suggests a few ways to implement it, one of which is being rational. But they don't go so far as to endorse this because they consider it mostly unrealistic. I tried to point to some reasons it might not be. Bostrom et al are sceptical because (i) identical priors are assumed and (ii) it would be surprising for humans to be this thoughtful anyway. The derivation above should help motivate why identical priors are sufficient but not necessary for the main upshot, and what I included in the conclusion suggests that many humans—or at least some firms—actually do the rational thing by default.

But the main point of the post is to do what I explained in the introduction: correct misconceptions and clarify. My experience of informal discussions of the curse suggests people think of it as a flaw of collective action that applies to agents simpliciter, and I wanted to flesh out this mistake. I think the formal framework I used is better at capturing the relevant intuition than the one used in Bostrom et al.

Comment by Sami Petersen (sami-petersen) on What’s this probability you’re reporting? · 2023-04-15T12:27:17.327Z · LW · GW

probabilities should correspond to expected observations and expected observations only

FWIW I think this is wrong. There's a perfectly coherent framework—subjective expected utility theory (Jeffrey, Joyce, etc)—in which probabilities can correspond to many other things. Probabilities as credences can correspond to confidence in propositions unrelated to future observations, e.g., philosophical beliefs or practically-unobservable facts. You can unambiguously assign probabilities to 'cosmopsychism' and 'Everett's many-worlds interpretation' without expecting to ever observe their truth or falsity.

However, there is another source of uncertainty: observational uncertainty. The other person might be uncertain whether they have all the facts that feed into their model, or whether their observations are correct.

This is reasonable. If a deterministic model has three free parameters, two of which you have specified, you could just use your prior over the third parameter to create a distribution of model outcomes. This kind of situation should be pretty easy to clarify though, by saying something like "my model predicts event E iff parameter A is above A*" and "my prior P(A>A*) is 50%, which implies P(E)=50%."
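
A minimal numerical sketch of that kind of clarification (my own code; the threshold and prior are illustrative only):

    # A deterministic model predicts event E iff parameter A exceeds a threshold A*.
    # With the other parameters pinned down, the reported P(E) is just the prior mass above A*.
    import random

    A_star = 1.0
    prior_samples = [random.gauss(1.0, 0.5) for _ in range(100_000)]  # prior over A

    p_event = sum(a > A_star for a in prior_samples) / len(prior_samples)
    print(f"P(E) = P(A > A*) ≈ {p_event:.2f}")  # ≈ 0.50, so the model-based P(E) is ≈ 50%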

But generically, the distribution is not coming from a model. It just looks like your all-things-considered credence that A>A*. I'd be hesitant to call a probability based on it your "inside view/model" probability.

Comment by Sami Petersen (sami-petersen) on Some Variants of Sleeping Beauty · 2023-03-01T17:54:24.928Z · LW · GW

These are great. Though Sleeping Mary can tell that she's colourblind on any account of consciousness. Whether or not she learns a phenomenal fact when going from 'colourblind scientist' to 'scientist who sees colour', she does learn the propositional fact that she isn't colourblind.

So, if she sees no colour, she ought to believe that the outcome of the coin toss is Tails. If she does see colour, both SSA and SIA say P(Heads)=1/2.