# Jokes Thread

post by JosephY · 2014-07-24T00:31:36.379Z · LW · GW · Legacy · 84 comments

This is a thread for rationality-related or LW-related jokes and humor. Please post jokes (new or old) in the comments.

------------------------------------

Q: Why are Chromebooks good Bayesians?

A: Because they frequently update!

------------------------------------

A super-intelligent AI walks out of a box...

------------------------------------

Q: Why did the psychopathic utilitarian push a fat man in front of a trolley?

A: Just for fun.

## 84 comments

Comments sorted by top scores.

## comment by Ben_LandauTaylor · 2014-07-24T04:02:14.780Z · LW(p) · GW(p)

**How many rationalists does it take to change a lightbulb?**

Just one. They’ll take any excuse to change something.

**How many effective altruists does it take to screw in a lightbulb?**

Actually, it’s far more efficient if you convince someone *else* to screw it in.

**How many Giving What We Can members does it take to change a lightbulb?**

Fifteen have pledged to change it later, but we’ll have to wait until they finish grad school.

**How many MIRI researchers does it take to screw in a lightbulb?**

The problem is that there are multiple ways to parse that, and while it might naively seem like the ambiguity is harmless, it would actually be disastrous if *any* number of MIRI researchers tried to screw inside of a lightbulb.

**How many CFAR instructors does it take to change a lightbulb?**

By the time they’re done, the lightbulb should be able to change itself.

**How many Leverage Research employees does it take to screw in a lightbulb?**

I don’t know, but we have a team working to figure that out.

**How many GiveWell employees does it take to change a lightbulb?**

Not many. I don't recall the exact number; there’s a writeup somewhere on their site, if you care to check.

**How many cryonicists does it take to change a lightbulb?**

Two; one to change the lightbulb, and one to preserve the old one, just in case.

**How many neoreactionaries does it take to screw in a lightbulb?**

We’d be better off returning to the dark.

## ↑ comment by **[deleted]** · 2014-07-24T20:42:20.943Z · LW(p) · GW(p)

**How many neoreactionaries does it take to screw in a lightbulb?**

Mu. We should all be using oil lamps instead, as oil lamps have been around for thousands of years, lightbulbs only a hundred. Also, oil lamps won't be affected by an EMP or solar flare. Reliable indoor lighting in general is a major factor in the increase of social degeneracy like nightclubs and premarital sex, and biological disorders like insomnia and depression. Lightbulbs are a cause and effect of social technology being outpaced by material conditions, and their place in society should be thoroughly reexamined, preferably via hundreds of blog posts and a few books. (Tangentially, blacks are five times more likely than whites to hate the smell of kerosene. How *interesting*.)

Alternatively, if you are already thoroughly pwned and/or gnoned, the answer is one, at a rate of $50 per lightbulb.

Edit: $45 if you or one of your friends has other electric work that could also be done. $40 if you are willing to take lessons later on how to fix your own damn house. $35 if you're willing to move to Idaho. $30 if you give a good reason to only charge $30 a bulb.

## ↑ comment by **[deleted]** · 2014-07-27T06:42:07.846Z · LW(p) · GW(p)

The effective altruist comment just got me interested in effective altruism. I've seen the term thrown about, but I never bothered to look it up. Extrapolating from just the joke, I may be an effective altruist. Thanks for getting me interested in something I should have checked ages ago and for reminding me to look things up as I don't know them instead of just assuming I got the "gist of the passage."

## ↑ comment by Ben_LandauTaylor · 2014-07-27T17:10:16.080Z · LW(p) · GW(p)

Awesome. PM me if you want to talk more about effective altruism. (I'm currently staffing the EA Summit, so I may not reply swiftly.)

## ↑ comment by polymathwannabe · 2014-07-27T23:32:18.009Z · LW(p) · GW(p)

Yet another instance of comedy saving the world.

## ↑ comment by FeepingCreature · 2014-07-24T10:19:44.991Z · LW(p) · GW(p)

Congratulations. You win this thread.

## comment by pragmatist · 2014-07-24T17:58:03.977Z · LW(p) · GW(p)

Moral Philosopher: How would you characterize irrational behavior?

Economist: When someone acts counter to their preferences.

Moral Philosopher: Oh, that’s what we call virtue.

## ↑ comment by BloodyShrimp · 2014-07-26T13:25:24.517Z · LW(p) · GW(p)

This seems a bit more like an Ayn Rand joke than a Less Wrong joke.

## comment by RichardKennaway · 2014-07-24T09:24:02.050Z · LW(p) · GW(p)

"Yields a joke when preceded by its quotation" yields a joke when preceded by its quotation.

## ↑ comment by Alejandro1 · 2014-07-24T09:40:55.137Z · LW(p) · GW(p)

"However, yields an even better joke (due to an extra meta level) when preceded by its quotation and a comma", however, yields an even better joke (due to an extra meta level) when preceded by its quotation and a comma.

## ↑ comment by RichardKennaway · 2014-07-25T15:19:27.994Z · LW(p) · GW(p)

Q: What's quining?

A: "Is an example, when preceded by its quotation" is an example, when preceded by its quotation.

## ↑ comment by pragmatist · 2014-07-25T08:40:55.010Z · LW(p) · GW(p)

"Kind of misses the point of the joke" kind of misses the point of the joke.

## ↑ comment by Kaj_Sotala · 2014-07-25T06:55:59.557Z · LW(p) · GW(p)

Reminds me of A Self-Referential Story.

## comment by Algernoq · 2014-07-24T01:11:49.248Z · LW(p) · GW(p)

If I want something, it's Rational. If you want something, it's a cognitive bias.

## ↑ comment by RichardKennaway · 2014-07-24T06:22:44.603Z · LW(p) · GW(p)

If they want something, the world is mad and people are crazy.

## ↑ comment by Alejandro1 · 2014-07-24T10:54:02.822Z · LW(p) · GW(p)

More succinctly: I am rational, you are biased, they are mind-killed.

## ↑ comment by roystgnr · 2014-07-24T15:41:28.474Z · LW(p) · GW(p)

None of these quite fit the "irregular verbs" pattern that Russell and others made famous; in those all three words should have overlapping denotations and merely greatly differ in connotations. Maybe "I use heuristics, you are biased, they are mind-killed", but there the "to use"/"to be" distinction still ruins it.

## comment by skeptical_lurker · 2014-07-24T20:48:49.463Z · LW(p) · GW(p)

Three rationalists walk into a bar.

The first one walks up to the bar, and orders a beer.

The second one orders a cider.

The third one says "Obviously you've never heard of Aumann's agreement theorem."

An exponentially large number of Boltzmann Brains experience the illusion of walking into bars, and order a combination of every drink imaginable.

An attractive woman goes into a bar and enters into a drinking contest with Nick Bostrom. After repeatedly passing out, she wakes up the next day with a hangover and a winning lottery ticket.

Three neoreactionaries walk into a bar.

"Oh, how I hate these modern sluts," says the first one, watching some girls in miniskirts on the dancefloor. "We should return to the 1950s, when people acted respectably."

"Pfft, you call yourself reactionary?" replies the second. "I idolise 11th-century Austria, where people acted respectably and there were no ethnic minorities."

"Ahh, but I am even more reactionary than either of you," boasts the third. "I long for classical Greece and Rome, the birthplace of western civilisation, where bisexuality was normal and people used to feast until they vomited!"

## ↑ comment by FiftyTwo · 2014-07-26T04:17:29.112Z · LW(p) · GW(p)

I don't get the Bostrom one

## ↑ comment by skeptical_lurker · 2014-07-26T06:12:23.986Z · LW(p) · GW(p)

I dunno whether it's that funny, but it's the Sleeping Beauty problem in anthropics, where you can alter subjective probabilities (e.g. of winning the lottery) by waking people up and giving them an amnesia-inducing drug. Only in this case, Sleeping Beauty is drunk.

Of course, explained like this it definitely isn't funny

## ↑ comment by Transfuturist · 2014-08-10T02:22:54.963Z · LW(p) · GW(p)

Where did the topic of neoreactionaries come up? (also your joke doesn't use a form of the word 'degeneracy,' minus ten points.)

## comment by James_Miller · 2014-07-24T00:52:18.865Z · LW(p) · GW(p)

All men are mortal. Socrates was mortal. Therefore, all men are Socrates.

## ↑ comment by solipsist · 2014-07-24T02:54:21.378Z · LW(p) · GW(p)

All syllogisms have three parts

Therefore, this is not a syllogism

## ↑ comment by NancyLebovitz · 2014-07-24T11:39:50.637Z · LW(p) · GW(p)

I came up with that.

## ↑ comment by solipsist · 2014-07-24T14:32:24.629Z · LW(p) · GW(p)

Cool! Before or after 1987?

## ↑ comment by NancyLebovitz · 2014-07-24T15:13:27.232Z · LW(p) · GW(p)

Was the joke in that book? I'm pretty sure I've never read it, and I remember coming up with the joke.

Early 80s, I think. "All syllogisms" was one of my first mass-produced button slogans-- the business was started in 1977, but I took some years to start mass producing slogans.

My printing records say that I did 3 print runs in 1988, but that plausibly means that I had been selling the button for a while because I don't think I was doing 3 print runs at a time.

## ↑ comment by solipsist · 2014-07-24T16:38:13.017Z · LW(p) · GW(p)

I thought I read the joke in Culture Made Stupid, but I can't find it now and am probably mistaken.

## comment by polymathwannabe · 2014-07-24T02:41:37.450Z · LW(p) · GW(p)

Not an actual joke, but every time I reread Ayn Rand's dictum "check your premises," I can hear in the distance Eliezer Yudkowsky discreetly coughing and muttering, "check your priors."

## ↑ comment by Alsadius · 2014-07-25T00:52:29.066Z · LW(p) · GW(p)

Both of those authors are known to use English in nonstandard ways for the sake of an argument, so I'm actually now wondering if those two are as synonymous as they look.

## ↑ comment by Viliam_Bur · 2014-07-25T09:54:54.176Z · LW(p) · GW(p)

Eliezer's version obviously includes probabilities. I don't know if Rand used any probabilistic premises, but on my very limited knowledge I would guess she didn't.

## ↑ comment by Nornagest · 2014-07-26T06:06:29.686Z · LW(p) · GW(p)

Not as I recall, although I haven't read Ayn Rand in something like fifteen years. Her schtick was more wild extrapolations of non-probabilistic logic.

## ↑ comment by Alsadius · 2014-07-28T15:25:26.045Z · LW(p) · GW(p)

Pretty much. I've actually gotten in a debate with a Randian on Facebook about what constitutes evidence. He doesn't seem to like Bayes' Theorem very much - he's busy talking about how we shouldn't refer to something as possible unless we have physical evidence of its possibility, because of epistemology.

## ↑ comment by PrometheanFaun · 2014-08-01T22:33:34.440Z · LW(p) · GW(p)

That's contrary to my experience of epistemology. It's just a word, define it however you want, but in both epistemic logic and pragmatics-stripped conventional usage, *possibility* is nothing more than a lack of disproof.

## comment by ike · 2014-08-03T03:11:57.444Z · LW(p) · GW(p)

An AI robot and a human are hunting. The human is bitten by a snake, and is no longer breathing. The AI quickly calls 911. It yells "My hunting partner was bitten by a poisonous snake and I think they're dead!" The operator says "Calm down. First, make sure he's dead." A gunshot is heard. "Okay, now they're definitely dead."

## comment by B_For_Bandana · 2014-07-24T18:22:58.961Z · LW(p) · GW(p)

"I lack all conviction," he thought. "Guess I'm the best!"

## ↑ comment by pragmatist · 2014-07-25T12:52:04.812Z · LW(p) · GW(p)

I hate to be "that guy", but could you explain this one? I'm not sure I get it. Is it making fun of LW's "politics is the mindkiller"/"keep your identity small" mindset?

## comment by advancedatheist · 2014-07-24T03:55:32.262Z · LW(p) · GW(p)

A cryonicist's response to someone who has kept you waiting:

"That's okay. I have forever."

## comment by **[deleted]** · 2014-07-26T14:09:18.010Z · LW(p) · GW(p)

There was a young man from Peru
Whose limericks stopped at line two.

There once was a man from Verdun.

And of course, there's the unfortunate case of the man named Nero...

## comment by skeptical_lurker · 2014-07-25T10:04:58.550Z · LW(p) · GW(p)

This isn't exactly rationalist, but it correlates...

A man with Asperger's walks into a pub. He walks up to the bar, and says "I don't get it, what's the joke?"

## ↑ comment by skeptical_lurker · 2014-07-26T06:08:09.315Z · LW(p) · GW(p)

Is this being downvoted due to being perceived as offensive, or because it's not funny? I certainly did not intend it to be offensive; in fact, I first saw it when reading a joke thread on an Asperger's forum.

## ↑ comment by ChristianKl · 2014-07-27T00:53:58.167Z · LW(p) · GW(p)

I haven't downvoted it.

On the other hand, laughing at how people with Asperger's sometimes aren't socially skilled hasn't much value.

## ↑ comment by skeptical_lurker · 2014-07-27T14:00:02.696Z · LW(p) · GW(p)

It's supposed to be funny due to metahumour. I certainly agree that simply laughing at people with poor social skills is neither witty nor productive.

## ↑ comment by Gurkenglas · 2014-07-26T06:38:01.953Z · LW(p) · GW(p)

Can't speak for others, but anti-jokes are thoroughly explored already.

## comment by Metus · 2014-07-24T01:48:34.054Z · LW(p) · GW(p)

A Bayesian apparently is someone who after a single throw of a coin will believe that it is biased. Based on either outcome.

Also, why do 'Bayes', 'base' and 'bias' sound similar?

## ↑ comment by Viliam_Bur · 2014-07-24T11:27:23.458Z · LW(p) · GW(p)

Heck, I had to stop and take a pen and paper to figure that out. Turns out, you were wrong. (I expected that, but I wasn't sure how specifically.)

As a simple example, imagine that my prior belief is that 0.1 of coins always provide head, 0.1 of coins always provide tails, and 0.8 of coins are fair. So, my prior belief is that 0.2 of coins are biased.

I throw a coin and it's... let's say... head. What are the posterior probabilities? Multiplying the prior probabilities with the likelihood of this outcome, we get 0.1 × 1, 0.8 × 0.5, and 0.1 × 0. Multiplied and normalized, it is 0.2 for the heads-only coin, and 0.8 for the fair coin. -- My posterior belief remains 0.2 for biased coin, only in this case I know how specifically it is biased.

The same will be true for any *symmetrical* prior belief. For example, if I believe that 0.000001 of coins always provide head, 0.000001 of coins always provide tails, 0.0001 of coins provide head in 80% of cases, 0.0001 of coins provide tails in 80% of cases, and the rest are fair coins... again, after one throw my posterior probability of "a biased coin" will remain exactly the same, only the proportions of specific biases will change.

On the other hand, if my prior belief is asymmetrical... let's say I believe that 0.1 of coins always provide head, and 0.9 of coins are fair (and there are no always-tails coins)... then yes, a single throw that comes up head *will* increase my belief that the coin was biased. (Because the outcome of tails would have decreased it.)

(Technically, a Bayesian superintelligence would probably believe that all coins are asymmetrical. I mean, they have *different pictures* on their sides, which can influence the probabilities of the outcomes a little bit. But such a superintelligence would have believed that the coin was biased even *before* the first throw.)
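The posterior arithmetic in this comment is easy to check in a few lines of Python. This is just a sketch of the comment's own hypothetical numbers (0.1 heads-only, 0.1 tails-only, 0.8 fair), not anyone's actual code:

```python
# Prior over coin types, as in the comment's example.
priors = {"heads_only": 0.1, "tails_only": 0.1, "fair": 0.8}
# Likelihood of observing heads on one throw, per coin type.
likelihood_heads = {"heads_only": 1.0, "tails_only": 0.0, "fair": 0.5}

# Bayes' rule: multiply prior by likelihood, then normalize.
unnorm = {c: priors[c] * likelihood_heads[c] for c in priors}
total = sum(unnorm.values())
posterior = {c: p / total for c, p in unnorm.items()}

p_biased_before = priors["heads_only"] + priors["tails_only"]
p_biased_after = posterior["heads_only"] + posterior["tails_only"]
print(posterior)                        # fair stays at 0.8
print(p_biased_before, p_biased_after)  # both 0.2: P(biased) is unchanged
```

As the comment says, the total probability of "biased" stays at 0.2 after one head; only its split between heads-only and tails-only shifts.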

## ↑ comment by Lumifer · 2014-07-24T14:33:21.207Z · LW(p) · GW(p)

Turns out, you were wrong.

Not so fast.

imagine that my prior belief is that 0.1 of coins always provide head, 0.1 of coins always provide tails, and 0.8 of coins are fair. So, my prior belief is that 0.2 of coins are biased.

Not quite. In your example 0.2 of coins are not *biased*, they are *predetermined* in that they always provide the same outcome no matter what.

Let's try a bit different example: the prior is that 10% of coins are biased towards heads (their probabilities are 60% heads, 40% tails), 10% are biased towards tails (60% tails, 40% heads), and 80% are fair.

After one throw (let's say it turned out to be heads) your posterior for the fair coin did not change, but your posterior for the heads-biased coin went up and for the tails-biased coin went down. Your expectation for the next throw is now skewed towards heads.

## ↑ comment by Viliam_Bur · 2014-07-24T15:12:11.964Z · LW(p) · GW(p)

My expectation of "this coin is biased" did not change, but "my expectation of the next result of this coin" changed.

In other words, I changed my expectation that the next flip will be heads, but I didn't change my expectation that from the next 1000 flips approximately 500 will be heads.

Connotationally: If I believe that biased coins are very rare, then my expectation that the next flip will be heads increases only a little. More precisely, if the ratio of biased coins is *p*, my expectation for the next flip increases at most by approximately *p*. The update based on one coin flip does not contradict common sense, it is as small as the biased coins are rare; and as large as they are frequent.

## ↑ comment by Lumifer · 2014-07-24T16:25:57.730Z · LW(p) · GW(p)

My expectation of "this coin is biased" did not change

In this particular example, no, it did not. However if you switch to continuous probabilities (and think not in terms of binary is-biased/is-not-biased but rather in terms of the probability of the true mean not being 0.5 plus-minus epsilon) your estimate of the character of the coin will change.

Also

"my expectation of the next result of this coin" changed

and

but I didn't change my expectation that from the next 1000 flips approximately 500 will be heads.

-- these two statements contradict each other.

## ↑ comment by Viliam_Bur · 2014-07-24T17:54:29.202Z · LW(p) · GW(p)

"my expectation of the next result of this coin" changed

and

but I didn't change my expectation that from the next 1000 flips approximately 500 will be heads.

-- these two statements contradict each other.

Using my simplest example, because it's simplest to calculate:

Prior:

0.8 fair coin, 0.1 heads-only coin, 0.1 tails-only coin

probability "next is head" = 0.5

probability "next 1000 flips are approximately 500:500" ~ 0.8

Posterior:

0.8 fair coin, 0.2 heads-only coin

probability "next is head" = 0.6 (increased)

probability "next 1000 flips are approximately 500:500" ~ 0.8 (didn't change)

## ↑ comment by Lumifer · 2014-07-24T18:16:22.999Z · LW(p) · GW(p)

Um.

Probability of a head = 0.5 necessarily means that the expected number of heads in 1000 tosses is 500.

Probability of a head = 0.6 necessarily means that the expected number of heads in 1000 tosses is 600.

## ↑ comment by Viliam_Bur · 2014-07-24T19:39:47.510Z · LW(p) · GW(p)

Are you playing with two different meanings of the word "expected" here?

If I roll a 6-sided die, the expected value is 3½.

But I don't really *expect to see* 3½ as an outcome of the roll. I expect to see either 1, or 2, or 3, or 4, or 5, or 6. But certainly not 3½.

If my model says that 0.2 of coins are heads-only and 0.8 of coins are fair, in 1000 flips I *expect to see* either 1000 heads (probability 0.2) or approximately 500 heads (probability 0.8). But I don't *expect to see* approximately 600 heads. Yet, the *expected value* of the number of heads in 1000 flips is 600.

## ↑ comment by evand · 2014-07-24T19:48:04.134Z · LW(p) · GW(p)

You can only multiply out P(next result is heads) * ( number of tosses) to get the expected number of heads if you believe those tosses are independent trials. The case of a biased coin toss explicitly violates this assumption.

## ↑ comment by Lumifer · 2014-07-24T20:21:20.908Z · LW(p) · GW(p)

But the tosses *are* independent trials, even for the biased coin. I think you mean the P(heads) is not 0.6, it's either 0.5 or 1, you just don't know which one it is.

## ↑ comment by evand · 2014-07-24T20:47:50.584Z · LW(p) · GW(p)

Which means that P(heads on toss after next|heads on next toss) != P(heads on toss after next|tails on next toss). Independence of A and B means that P(A|B) = P(A).

## ↑ comment by Lumifer · 2014-07-24T21:07:54.003Z · LW(p) · GW(p)

As long as you're using the same coin, P(heads on toss after next|heads on next toss) **==** P(heads on toss after next|tails on next toss).

You're confusing the probability of coin toss outcome with your knowledge about it.

Consider an RNG which generates *independent* samples from a normal distribution centered on some -- unknown to you -- value mu. As you see more samples you get a better idea of what mu is, and your expectations about what numbers you are going to see next change. But these samples do not become dependent just because your knowledge of mu changes.

## ↑ comment by evand · 2014-07-25T03:08:39.669Z · LW(p) · GW(p)

Please actually do your math here.

We have a coin that is heads-only with probability 20%, and fair with probability 80%. We've already conducted exactly one flip of this coin, which came out heads (causing our update from the prior of 10/80/10 to 20/80/0), but no further flips yet.

For simplicity, event A will be "heads on next toss" (toss number 2), and B will be "heads on toss after next" (toss number 3).

P(A) = 0.2 * 1 + 0.8 * 0.5 = 0.6

P(B) = 0.2 * 1 + 0.8 * 0.5 = 0.6

P(A & B) = 0.2 * 1 * 1 + 0.8 * 0.5 * 0.5 = 0.4

Note that this is not the same as P(A) * P(B), which is 0.6 * 0.6 = 0.36.

The definition of independence is that A and B are independent iff P(A & B) = P(A) * P(B). These events are not independent.

## ↑ comment by James_Ernest · 2014-08-20T00:04:42.106Z · LW(p) · GW(p)

I don't think so. None of the available potential coin-states would generate an expected value of 600 heads.

p = 0.6 -> 600 expected heads is the many-trials (where each trial is 1000 flips) expected value given the prior and the result of the first flip, but this is different from the expectation of *this trial*, which is bimodally distributed at [1000]x0.2 and [central limit around 500]x0.8

## ↑ comment by AlexMennen · 2014-07-24T17:20:11.680Z · LW(p) · GW(p)

However if you switch to continuous probabilities your estimate of the character of the coin will change.

No. If the distribution is symmetrical, then the probability density at .5 will be unchanged after a single coin toss.

these two statements contradict each other.

No they don't. He was saying that his estimate of the probability that the coin is unbiased (or approximately unbiased) does not change, but that the probability that the coin is weighted towards heads increased at the expense of the probability that the coin is weighted towards tails (or vice-versa, depending on the outcome of the first toss), which is correct.

## ↑ comment by Lumifer · 2014-07-24T17:29:52.179Z · LW(p) · GW(p)

If the distribution is symmetrical, then the probability density at .5 will be unchanged after a single coin toss.

In the continuous-distribution world the probability density at exactly 0.5 is infinitesimally small. And the probability density at 0.5 plus-minus epsilon will change.

No they don't.

Yes, they do. We're talking about expected values of coin tosses now, not about the probabilities of the coin being biased.

## ↑ comment by AlexMennen · 2014-07-24T21:11:03.369Z · LW(p) · GW(p)

the probability mass at 0.5 plus-minus epsilon will change.

(army1987 already addressed density vs mass.) No, for any x, the probability density at 0.5+x goes up by the same amount that the probability density at 0.5-x goes down (assuming a symmetrical prior), so for any x, the probability mass in [0.5-x, 0.5+x] will remain exactly the same.

We're talking about expected values of coin tosses now, not about the probabilities of the coin being biased.

Ok, instead of 1000 flips, think about the next 2 flips. The probability that exactly 1 of them lands heads does not change. This does not contradict the claim that the probability of the next flip being heads increases, because the probability of the next two flips both being heads increases while the probability of the next two flips both being tails decreases by the same amount (assuming you just saw the coin land heads).

You don't even need to explicitly use Bayes's theorem and do the math to see this (though you can). It all follows from symmetry and conservation of expected evidence. By symmetry, the change in probability of some event which is symmetric with respect to heads/tails must change by the same amount whether the result of the first flip is heads or tails, and by conservation of expected evidence, those changes must add to 0. Therefore those changes are 0.
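The two-flip claim above can be sketched numerically. The numbers here are an assumption borrowed from the 0.1/0.1/0.8 prior used earlier in the thread; the comment itself doesn't fix any:

```python
def p_exactly_one_head(mixture):
    """P(exactly one head in two flips), marginalized over coin types.

    mixture maps each coin type's per-flip P(heads) to the probability
    of holding that coin."""
    return sum(w * 2 * p * (1 - p) for p, w in mixture.items())

prior = {1.0: 0.1, 0.0: 0.1, 0.5: 0.8}                  # symmetric prior
posterior_after_heads = {1.0: 0.2, 0.0: 0.0, 0.5: 0.8}  # after one head

# The heads/tails-symmetric event's probability is unchanged by the flip.
print(p_exactly_one_head(prior))                  # 0.4
print(p_exactly_one_head(posterior_after_heads))  # 0.4
```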

## ↑ comment by Lumifer · 2014-07-25T04:05:35.896Z · LW(p) · GW(p)

for any x, the probability density at 0.5+x goes up by the same amount that the probability density at 0.5-x goes down (assuming a symmetrical prior)

I don't think that is true. Imagine that your probability density is a normal distribution. You update in such a way that the mean changes, 0.5 is no longer the peak. This means that your probability density is no longer symmetrical around 0.5 (even if you started with a symmetrical prior) *and* the probability density line is not a 45 degree straight line -- with the result that the density at 0.5+x changes by a different amount than at 0.5-x.

## ↑ comment by AlexMennen · 2014-07-25T04:42:37.270Z · LW(p) · GW(p)

You update in such a way that the mean changes, 0.5 is no longer the peak. This means that your probability density is no longer symmetrical around 0.5 (even if you started with a symmetrical prior)

That is correct. Your probability distribution is no longer symmetrical after the first flip, which means that on the *second* flip, the symmetry argument I made above no longer holds, and you get information about whether the coin is biased or approximately fair. That doesn't matter for the first flip though. Did you read the last paragraph in my previous comment? If so, was any part of it unclear?

with the result that the density at 0.5+x changes by a different amount than at 0.5-x.

That does not follow from anything you wrote before it (the 45 degree straight line part is particularly irrelevant).

## ↑ comment by Lumifer · 2014-07-25T15:53:17.898Z · LW(p) · GW(p)

Hm. Interesting how what looks like a trivially simple situation can become so confusing. Let me try to walk through my reasoning and see what's going on...

We have a coin and we would like to know whether it's fair. For convenience let's define heads as 1 and tails as 0, one consequence of that is that we can think of the coin as a bitstring generator. What does it mean for a coin to be fair? It means that expected value of the coin's bitstring is 0.5. That's the same thing as saying that the mean of the sample bitstring converges to 0.5.

Can we know for certain that the coin is fair on the basis of examining its bitstring? No, we cannot. Therefore we need to introduce the concept of *acceptable* certainty, that is, the threshold beyond which we think that the chance of the coin being fair is high enough (that's the same concept as the p-value). In frequentist statistics we will just run an exact binomial test, but Bayes makes things a bit more complicated.

Luckily, Gelman in *Bayesian Data Analysis* looks exactly at this case (2nd ed., pp.33-34). Assuming a uniform prior on [0,1] the posterior distribution for theta (which in our case is the probability of the coin coming up heads or generating a 1) is

p( th | y ) is proportional to (th ^ y) * ((1 - th)^(n - y))

where *y* is the number of heads and *n* is the number of trials.

After the first flip y=1, n=1 and so p( th | 1) is proportional to ( th )

Aha, this is interesting. Our prior was uniform so the density was just a straight horizontal line. After the first toss the line is still straight but is now sloping up with the minimum at zero and the maximum at 1.

So the expected value of the mean of our bitstring used to be 0.5 but is now greater than 0.5. And that is why I argued that the very first toss changes your expectations: your expected bitstring mean (= expected probability of the coin coming up heads) is now **no longer 0.5** and so you don't think that the coin is fair (because the fair coin's expected mean is 0.5).

But that's only one way of looking at it and now I see the error of my ways. After the first toss our probability density is still a straight line and it pivoted around the 0.5 point. This means that the probability mass in some neighborhood of [0.5-x, 0.5+x] did not change and so the probability of the coin being fair remains the same. The change in the expected value is because we think that if the coin is biased, it's more likely to be biased towards heads than towards tails.

And yet this works because we started with a uniform prior, a straight density line. What if we start with a different, "curvier" prior? After the first toss the probability density should still pivot around the 0.5 point but because it's not a straight line the probability mass in [0.5-x, 0.5+x] will not necessarily remain the same. Hmm... I don't have time right now to play with it, but it requires some further thought.
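For the uniform-prior case, the pivot argument in this comment can be made exact: the prior density is 1, the normalized posterior density after one head is 2*theta, and the mass in [0.5-x, 0.5+x] comes out the same in closed form. A quick sketch:

```python
def mass_prior(x):
    # Uniform density 1 on [0, 1]: mass of the interval [0.5 - x, 0.5 + x].
    return (0.5 + x) - (0.5 - x)            # = 2x

def mass_posterior(x):
    # Posterior density 2*theta; its antiderivative is theta**2.
    return (0.5 + x) ** 2 - (0.5 - x) ** 2  # also = 2x, by the pivot

for x in (0.05, 0.1, 0.25):
    print(x, mass_prior(x), mass_posterior(x))  # the two masses agree
```

Algebraically, (0.5 + x)^2 - (0.5 - x)^2 = 2x, so the straight-line pivot around 0.5 leaves every symmetric neighborhood's mass untouched, exactly as the comment concludes.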

## ↑ comment by AlexMennen · 2014-07-25T17:54:12.481Z · LW(p) · GW(p)

Yes.

What if we start with a different, "curvier" prior? After the first toss the probability density should still pivot around the 0.5 point but because it's not a straight line the probability mass in [0.5-x, 0.5+x] will not necessarily remain the same.

Provided the prior is symmetrical, the probability mass in [0.5-x, 0.5+x] will remain the same after the first toss by the argument I sketched above, even though the probability density will not be a straight line. On subsequent tosses, of course, that will no longer be true. If you have flipped more heads than tails, then your probability distribution will be skewed, so flipping heads again will decrease the probability of the coin being fair, while flipping tails will increase the probability of the coin being fair. If you have flipped the same (nonzero) number of heads as tails so far, then your probability distribution will be different than it was when you started, but it will still be symmetrical, so the next flip does not change the probability of the coin being fair.

## ↑ comment by DanielLC · 2014-07-24T21:49:36.490Z · LW(p) · GW(p)

I didn't realize you were serious, given that this is a joke thread.

Here's the easy way to solve this:

By conservation of expected evidence, if one outcome is evidence for the coin being biased, then the other outcome is evidence against it.

They might believe that it's biased either way if they have a low prior probability of the coin being fair. For example, if they use a beta distribution for the prior, they only assign an infinitesimal probability to a fair coin. But since they're not finding evidence that it's biased, you can't say the belief is based on the outcome of the toss.

I suppose there is a sense it which your statement is true. If I'm given a coin which is badly made, but in a way that I don't understand, then the first toss is fair. I have no idea if it will land on heads or tails. Once I toss it, I have some idea of in which way it's unfair, so the next toss is not fair.

That's not usually what people mean when they talk about a fair coin, though.

## comment by **[deleted]** · 2014-07-24T02:52:06.853Z · LW(p) · GW(p)

"I wonder what is the probability of random sc2 player being into math and cognitive biases"

"It's probably more one-sided than a Möbius strip"