samshap's Shortform

post by samshap · 2021-03-12T07:53:08.778Z · LW · GW · 7 comments

comment by samshap · 2021-04-19T20:10:44.917Z · LW(p) · GW(p)

Is this a failure of inner or outer alignment?

http://smbc-comics.com/comic/ai-6

comment by samshap · 2021-03-12T07:53:08.995Z · LW(p) · GW(p)

Redissolving sleeping beauty (and maybe solving it entirely)

[epistemic status - I'm new to thinking about anthropics, but I don't see any obvious flaws]

If a tree falls on sleeping beauty [LW · GW] famously claims to have dissolved the Sleeping Beauty problem - that SB's correct answer just depended on the reward structure for her answers, and that her actual credence didn't matter.

Several lesswrongers [LW · GW] seem unsatisfied with that answer - understandably, given a longstanding commitment to epistemics and Bayesianism!

I would argue that ata did some key work in answering the problem from a purely epistemic perspective.

Recall the question SB is to be asked upon waking:

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”

 And one of the bets ata formulated:

Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”, and the answer given will be scored according to a logarithmic scoring rule, with the aggregate result corresponding to the number of utilons (converted to dollars, let’s say) she will be penalized after the experiment.

These questions are actually equivalent! A properly calibrated belief is one that is optimal w.r.t. the logarithmic scoring rule.
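
(To spell out the standard property I'm leaning on here: if the true probability of heads is p and she reports credence q, her expected log score is p·log(q) + (1-p)·log(1-q); setting the derivative p/q - (1-p)/(1-q) to zero gives q = p. Reporting your true credence is the unique maximizer, which is exactly what makes the log rule a proper scoring rule.)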

ata goes on to show that the answer to that question is 1/3. This result, I think, is actually contingent on the meaning of 'aggregate'. If 'aggregate' just means 'sum over all predictions ever', then ata's math checks out, the thirders are right, and the problem is solved.

However, given the premise of SB - in case of tails, she forgets everything that happened on Monday - you could argue for 'aggregate' meaning 'sum over all predictions she remembers making', in which case the correct answer is one half. Or if we include the log score for predictions that she was told she made (say, because the interviewers wrote them down and told her afterwards), then the answer becomes 1/3 again!
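
To make the dependence concrete, here's a minimal sketch in Python (my own illustration, not ata's original derivation) that numerically finds the credence maximizing expected log score under each aggregation convention - counting every awakening's prediction versus counting only the one she still remembers:

```python
import math

def expected_score(p, tails_weight):
    """Expected log score for reporting credence p in heads, when a tails-world
    contributes `tails_weight` scored predictions (2 = count every awakening,
    1 = count only the prediction she still remembers making)."""
    return 0.5 * math.log(p) + 0.5 * tails_weight * math.log(1 - p)

def optimal_credence(tails_weight, grid=100_000):
    # Brute-force search over a fine grid of candidate credences.
    candidates = (i / grid for i in range(1, grid))
    return max(candidates, key=lambda p: expected_score(p, tails_weight))

print(optimal_credence(tails_weight=2))  # ~0.3333 ('sum over all predictions')
print(optimal_credence(tails_weight=1))  # ~0.5    ('sum over predictions she remembers')
```

The only thing that differs between the two runs is how many scored predictions a tails-world contributes, which is exactly where the 1/3 vs. 1/2 split comes from.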

So the SB paradox boils down to what you, as an epistemic rationalist, consider the correct way to aggregate the log scores of her predictions!

The 'sum over all predictions' seems best to me (and thus I suppose I lean to the 1/3 answer), but I don't have a definitive reason as to why.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2021-03-12T13:01:59.522Z · LW(p) · GW(p)

Sleeping Beauty illustrates the consequences of following general epistemic principles. Merely finding an assignment of probabilities that's optimal for a given way of measuring outcomes is appeal to consequences, on its own it doesn't work as a general way of managing knowledge (though some general ways of managing knowledge might happen to assign probabilities so that the consequences are optimal, in a given example). In principle consequentialism makes superfluous any particular elements of agent design, including those pertaining to knowledge. But that observation doesn't help with designing specific ways of working with knowledge.

Replies from: samshap
comment by samshap · 2021-03-12T14:47:38.606Z · LW(p) · GW(p)

My argument is that the log scoring rule is not just a "given way of measuring outcomes". A belief that maximizes E(log(p)) is the definition of a proper Bayesian belief. There's no appeal to consequence other than "SB's beliefs are well calibrated".

Replies from: interstice
comment by interstice · 2021-03-14T23:02:47.195Z · LW(p) · GW(p)

Isn't this kind of circular? The justification for the logarithmic scoring rule is that it gets agents to report their true beliefs, in contexts where such beliefs clearly make sense (no anthropic weirdness, in particular), and where agents have utilities linear in money. Extending this as a definition to situations where such beliefs don't make sense seems arbitrary.

Replies from: samshap
comment by samshap · 2021-03-15T05:44:08.676Z · LW(p) · GW(p)

Do you have some source for saying the log scoring rule should only be used when no anthropics are involved? Without that, what does it even mean to have a well-calibrated belief?

(BTW, there are other nice features of using the log-scoring rule, such as rewarding models that minimize their cross-entropy with the territory).

Replies from: interstice
comment by interstice · 2021-03-15T21:26:12.433Z · LW(p) · GW(p)

I mean, there's nothing wrong with using the log scoring rule. But since the implied probabilities will change depending on how you aggregate the utilities, it doesn't seem to me that it gets us any closer to a truly objective, consequence-free answer -- 'objective probability' is still meaningless here, it all depends on the bet structure.