# Priors Are Useless

post by DragonGod · 2017-06-21T11:42:31.214Z · LW · GW · Legacy · 22 comments

## NOTE.

This post contains LaTeX. Please install TeX the World for Chromium or another similar TeX typesetting extension to view this post properly.

# Priors are Useless.

Priors are irrelevant. Take two different prior probabilities $Pr_{i_1}$ and $Pr_{i_2}$ for some hypothesis $H_i$.
Let their respective posterior probabilities be $Pr_{i_{z1}}$ and $Pr_{i_{z2}}$.
After a sufficient number of experiments, the posterior probability $Pr_{i_{z1}} \approx Pr_{i_{z2}}$.
Or, more formally:
$\lim_{n \to \infty} \frac{Pr_{i_{z1}}}{Pr_{i_{z2}}} = 1$,
where $n$ is the number of experiments.
Therefore, priors are useless.
The above is true because, as we carry out subsequent experiments, the posterior probability $Pr_{i_{z1_j}}$ gets closer and closer to the true probability of the hypothesis, $Pr_i$. The same holds true for $Pr_{i_{z2_j}}$. As such, if you have access to a sufficient number of experiments, the initial prior probability you assigned the hypothesis is irrelevant.
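
A minimal simulation sketch of the limit claimed above, under assumptions that are not in the original post: $H_i$ is taken to be "the coin lands heads with probability 0.7", its only rival is "the coin is fair", and the data are generated in a world where $H_i$ is true. The names `posterior`, `P_HEADS_H`, and `P_HEADS_ALT` are hypothetical. In this setup both posteriors head toward 1, which is what drives their ratio toward 1.

```python
import math
import random

# Illustrative assumptions (not from the post): two rival hypotheses about a coin.
P_HEADS_H = 0.7    # H_i: the coin lands heads with probability 0.7
P_HEADS_ALT = 0.5  # the only alternative considered: the coin is fair


def posterior(prior, flips):
    """Posterior probability of H_i after observing flips (True = heads)."""
    # Work in log-odds so that long runs of evidence do not underflow.
    log_odds = math.log(prior) - math.log(1.0 - prior)
    for heads in flips:
        if heads:
            log_odds += math.log(P_HEADS_H / P_HEADS_ALT)
        else:
            log_odds += math.log((1.0 - P_HEADS_H) / (1.0 - P_HEADS_ALT))
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)


rng = random.Random(0)
# Simulate a world in which H_i is true.
flips = [rng.random() < P_HEADS_H for _ in range(1000)]

for n in (0, 10, 100, 1000):
    z1 = posterior(0.01, flips[:n])  # skeptical prior
    z2 = posterior(0.90, flips[:n])  # confident prior
    print(n, z1, z2, z1 / z2)
```

The ratio only approaches 1 here because neither prior is exactly 0 or 1 and both agents share the same likelihoods; how many flips it takes depends on how far apart the priors start.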

To demonstrate.
http://i.prntscr.com/hj56iDxlQSW2x9Jpt4Sxhg.png
This is the graph of the above table:
http://i.prntscr.com/pcXHKqDAS\_C2aInqzqblnA.png

In the example above, the true probability of hypothesis $H_i$ ($Pr_i$) is $0.5$ and, as we see, after a sufficient number of trials, the different $Pr_{i_{z1_j}}$s get closer to $0.5$.
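
The table and graph are linked images, so the following is not a reconstruction of them. It is a sketch of one standard reading of the example, assuming the quantity being estimated is a coin's bias, that each starting belief is a Beta prior, and that exactly half of the trials come up heads. The function `posterior_mean` and the three priors are hypothetical.

```python
def posterior_mean(a, b, heads, tails):
    """Mean of the Beta(a + heads, b + tails) posterior over the coin's bias."""
    return (a + heads) / (a + b + heads + tails)


# Hypothetical priors: Beta(a, b) has mean a / (a + b).
priors = {
    "expects tails": (1.0, 9.0),
    "uniform": (1.0, 1.0),
    "expects heads": (9.0, 1.0),
}

for n in (0, 10, 100, 10_000):
    heads = n // 2  # idealized data: exactly half the trials come up heads
    tails = n - heads
    means = {
        name: round(posterior_mean(a, b, heads, tails), 4)
        for name, (a, b) in priors.items()
    }
    print(n, means)
```

With these priors the prior counts `a` and `b` are swamped by the observed counts as `n` grows, which is the sense in which the starting point stops mattering.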

To generalize from my above argument:

If you have enough information, your initial beliefs are irrelevant—you will arrive at the same final beliefs.

Because I can’t resist, a corollary to Aumann’s agreement theorem.
Given sufficient information, two rationalists will always arrive at the same final beliefs irrespective of their initial beliefs.

The above can be generalized to what I call the “Universal Agreement Theorem”:

Given sufficient evidence, all rationalists will arrive at the same set of beliefs regarding a phenomenon irrespective of their initial set of beliefs regarding said phenomenon.

Prove $\lim_{n \to \infty} \frac{ Pr_{i_{z1}}}{Pr_{i_{z2}}} = 1$.

comment by Luke_A_Somers · 2017-06-21T14:44:43.060Z · LW(p) · GW(p)

This is totally backwards. I would phrase it, "Priors get out of the way once you have enough data." That's a good thing, that makes them useful, not useless. Its purpose is right there in the name - it's your starting point. The evidence takes you on a journey, and you asymptotically approach your goal.

If priors were capable of skewing the conclusion after an unlimited amount of evidence, that would make them permanent, not simply a starting-point. That would be writing the bottom line first. That would be broken reasoning.

Replies from: TheAncientGeek, ImmortalRationalist
comment by TheAncientGeek · 2017-06-22T14:36:16.932Z · LW(p) · GW(p)

"A ladder you throw away once you have climbed up it".

Replies from: Luke_A_Somers
comment by ImmortalRationalist · 2017-07-03T23:18:51.307Z · LW(p) · GW(p)

But what exactly constitutes "enough data"? With any finite amount of data, couldn't it be cancelled out if your prior probability is small enough?

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2017-07-08T16:15:50.352Z · LW(p) · GW(p)

Yes, but that's not the way the problem goes. You don't fix your prior in response to the evidence in order to force the conclusion (if you're doing it anything like right). So different people with different priors will have different amounts of evidence required: 1 bit of evidence for every bit of prior odds against, to bring it up to even odds, and then a few more to reach it as a (tentative, as always) conclusion.
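
The bit-counting above can be sketched as follows, assuming the usual convention that one bit of evidence is an observation with a 2:1 likelihood ratio in favor of the hypothesis. The function names and the example priors are hypothetical.

```python
import math


def bits_against(prior):
    """Prior odds against a hypothesis, expressed in bits (log base 2)."""
    return math.log2((1.0 - prior) / prior)


def posterior_after_bits(prior, bits):
    """Posterior after evidence whose combined likelihood ratio is 2**bits."""
    odds = prior / (1.0 - prior) * 2.0 ** bits
    return odds / (1.0 + odds)


for prior in (1 / 3, 1 / 1025, 1e-9):
    needed = bits_against(prior)
    # Supplying exactly `needed` bits brings the hypothesis to even odds.
    print(prior, needed, posterior_after_bits(prior, needed))
```

A prior of 1/1025 corresponds to odds of 1:1024 against, so it takes 10 bits to reach even odds; a smaller prior needs more bits, but always a finite number as long as the prior is not zero.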

comment by WalterL · 2017-06-21T13:38:11.585Z · LW(p) · GW(p)

I definitely agree that after we become omniscient it won't matter where we started...but going from there to priors 'are useless' seems like a stretch. Like, shoes will be useless once my feet are replaced with hover engines, but I still own them now.

Replies from: DragonGod
comment by DragonGod · 2017-06-21T14:10:23.152Z · LW(p) · GW(p)

But this isn't all there is to it.
@Alex. also, take a set of rationalists with different priors. Let this set of priors be S.
Let the standard deviation of S after i trials be d_i.

d_{i+1} <= d_i for all i: i is in N. The more experiments are conducted the greater the precision of the probabilities of the rationalists.

comment by 9eB1 · 2017-06-21T14:17:15.296Z · LW(p) · GW(p)

Now analyze this in a decision theoretic context where you want to use these probabilities to maximize utility and where gathering information has a utility cost.

comment by Lumifer · 2017-06-21T15:47:01.948Z · LW(p) · GW(p)

You keep using that word, "useless". I do not think it means what you think it means.

comment by CronoDAS · 2017-06-21T12:01:34.912Z · LW(p) · GW(p)

It can take an awfully long time for N to get big enough.

Replies from: DragonGod
comment by DragonGod · 2017-06-21T14:03:10.281Z · LW(p) · GW(p)

True. I don't disagree with that.

Replies from: Jayson_Virissimo
comment by Jayson_Virissimo · 2017-06-23T05:22:17.968Z · LW(p) · GW(p)

So, in the meantime, priors are useful?

comment by korin43 · 2017-06-22T16:59:46.426Z · LW(p) · GW(p)

I think you lost me at the point where you assume it's trivial to gather an infinite amount of evidence for every hypothesis.

comment by entirelyuseless · 2017-06-21T14:20:42.598Z · LW(p) · GW(p)

This is sometimes false, when there are competing hypotheses. For example, Jaynes talks about the situation where you assign an extremely low probability to some paranormal phenomenon, and a higher probability to the hypothesis that there are people who would fake it. More experiments apparently verifying the existence of the phenomenon just make you more convinced of deception, even in the situation where the phenomenon is real.

Additionally, you should have spoken of converging on the truth, rather than the "true probability," because there is no such thing.
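
The competing-hypotheses scenario in this comment can be sketched as below. All numbers are made up for illustration and are not taken from Jaynes; the assumption doing the work is that a successful demonstration is equally likely whether the phenomenon is real or faked.

```python
def update(beliefs, likelihoods):
    """One Bayesian update over mutually exclusive, exhaustive hypotheses."""
    unnormalized = {h: beliefs[h] * likelihoods[h] for h in beliefs}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


# Made-up priors: the phenomenon is judged far less likely than fakery.
beliefs = {"real": 1e-12, "deception": 1e-4}
beliefs["nothing unusual"] = 1.0 - sum(beliefs.values())

# Made-up probability of an apparently successful demonstration under each hypothesis.
p_success = {"real": 0.9, "deception": 0.9, "nothing unusual": 0.01}

for trial in range(1, 21):
    beliefs = update(beliefs, p_success)
    if trial in (1, 5, 20):
        print(trial, beliefs)
```

Because "real" and "deception" predict the observations equally well, their ratio never moves from the prior ratio, so the evidence flows almost entirely to "deception".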

comment by arisen · 2017-07-03T05:38:39.784Z · LW(p) · GW(p)

The only way for a stochastic process to satisfy the Markov property is if it's memoryless. Most phenomena are not memoryless, which means that observers will obtain information about them over time.

comment by MrMind · 2017-06-21T14:12:26.628Z · LW(p) · GW(p)

the posterior probability $Pr_{i_{z1_j}}$ gets closer and closer to the true probability of the hypothesis $Pr_i$

There's no true probability. Either a model is true or not.

comment by Hafurelus · 2017-08-02T04:32:31.234Z · LW(p) · GW(p)

Replies from: arisen
comment by arisen · 2017-08-28T04:07:34.709Z · LW(p) · GW(p)

I'm using Opera Mini (beta sometimes) on Android and I pasted a whole Google search link. Maybe their beta servers are in Russia? Or something about Opera Mini data optimization? I have nothing to hide; it's the ethics of science.

comment by MrMind · 2017-06-21T14:09:42.207Z · LW(p) · GW(p)

This is trivially false, if the prior probability is 1 or 0.

It might be true but irrelevant, if the number of needed experiments is impractical or no repeated independent experiment can be performed.

It is also false if applied to two agents: if they do not have the same prior and the same model, their posteriors might converge, diverge or stay the same. Aumann's agreement theorem works only in the case of common priors, so it cannot be extended.
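
The first point above (a prior of exactly 0 or 1 never moves) can be checked with a short sketch. The likelihoods 0.9 and 0.1 and the helper names are hypothetical, chosen only so that every observation favors the hypothesis.

```python
def update(prior, p_data_if_true, p_data_if_false):
    """Bayes' rule for a single hypothesis and a single observation."""
    numerator = prior * p_data_if_true
    return numerator / (numerator + (1.0 - prior) * p_data_if_false)


def run(prior, n_observations):
    """Apply the same favorable observation n_observations times."""
    belief = prior
    for _ in range(n_observations):
        # Each observation is nine times likelier if the hypothesis is true.
        belief = update(belief, 0.9, 0.1)
    return belief


for prior in (0.0, 0.01, 1.0):
    print(prior, run(prior, 50))
```

A prior of 0.01 is pulled close to 1 by the same evidence that leaves a prior of 0 untouched, since multiplying by zero discards the likelihood entirely.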

comment by Akhenator · 2017-06-21T12:21:10.240Z · LW(p) · GW(p)

I might be a bit blind, but what are Priz1 and Priz2? Because here it looks like Priz1=Priz2. And what do the priors do? What are your hypotheses?

I am sorry if I didn't get it (and I'm maybe looking like a fool right now).

Replies from: DragonGod
comment by DragonGod · 2017-06-21T14:07:29.827Z · LW(p) · GW(p)

The priors are the probabilities you assign to hypotheses before you receive any evidence for or against them.
$Pr_{i_{z1}}$ and $Pr_{i_{z2}}$ are the posterior probabilities on $Pr_{i_1}$ and $Pr_{i_2}$ respectively.