[SEQ RERUN] Inductive Bias
post by badger · 2011-05-26T13:20:46.506Z · LW · GW · Legacy · 8 comments
Today's post, Inductive Bias, was originally published on April 8, 2007. A summary (from the LW wiki):
Inductive bias is a systematic direction in belief revisions. The same observations could be evidence for or against a belief, depending on your prior. Inductive biases are more or less correct depending on how well they correspond with reality, so "bias" might not be the best description.
Discuss the post here (rather than in the comments of the original post).
This post is part of a series rerunning Eliezer Yudkowsky's old posts so those interested can (re-)read and discuss them. The previous post was Debiasing as Non-Self-Destruction, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it, posting the next day's sequence reruns post, summarizing forthcoming articles on the wiki, or creating exercises. Go here for more details, or to discuss the Sequence Reruns.
8 comments
comment by Oscar_Cunningham · 2011-05-26T15:05:44.889Z · LW(p) · GW(p)
Can someone explain to me what Eliezer means when he says that
If you start out with a maximum-entropy prior, then you never learn anything, ever, no matter how much evidence you observe.
?
The prior in this situation that I would call "maximum entropy" is the uniform prior (which certainly does change when you update). The monkey binomial distribution isn't maximum entropy at all! What's going on?
Replies from: jsalvatier, wallowinmaya
↑ comment by jsalvatier · 2011-05-26T15:14:13.871Z · LW(p) · GW(p)
I saw this too. I think he means 'maximum entropy over the outcomes (ball draws)' rather than 'maximum entropy over the parameters in your model'. The intuition is that if you posit no structure to your observations, making an observation doesn't tell you anything about your future observations. That interpretation doesn't quite fit, though, since he specified that the balls were drawn with a known probability.
One of the things EY made me realize is that any modeling is part of the prior. Specifically, a model is a prior about how observations are related. For example, part of your model might be 'balls are drawn independently with a constant but unknown probability'. If you had a maximum entropy prior over the draws, you would say something more like 'ball draws are completely unrelated and determined by completely separate processes'.
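To make the two readings concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a reconstruction of the original post's example: the function names, the red/white encoding, and the `horizon` parameter are all hypothetical. The first function treats every sequence of draws as equally likely (maximum entropy over the outcomes) and conditions by brute-force enumeration; the second uses a uniform prior over the unknown red fraction, which gives Laplace's rule of succession.

```python
from fractions import Fraction
from itertools import product


def next_red_uniform_over_sequences(observed: str, horizon: int) -> Fraction:
    """P(next draw is red) when all red/white sequences of length `horizon`
    are equally likely a priori (assumed reading: max entropy over outcomes).

    `observed` is a string of 'R'/'W' draws seen so far; `horizon` must be
    larger than len(observed).
    """
    n = len(observed)
    # Keep only the sequences that agree with what has been observed.
    consistent = [s for s in product("RW", repeat=horizon)
                  if s[:n] == tuple(observed)]
    red_next = sum(1 for s in consistent if s[n] == "R")
    return Fraction(red_next, len(consistent))


def next_red_uniform_over_parameter(reds: int, draws: int) -> Fraction:
    """P(next draw is red) under a uniform prior on the unknown red fraction,
    with independent draws (Laplace's rule of succession)."""
    return Fraction(reds + 1, draws + 2)


if __name__ == "__main__":
    # Five red balls in a row under each prior.
    print(next_red_uniform_over_sequences("RRRRR", horizon=8))  # 1/2
    print(next_red_uniform_over_parameter(reds=5, draws=5))     # 6/7
```

Under the first prior the prediction stays at 1/2 no matter what has been seen, which is the sense in which nothing is learned. Under the second, five reds move the prediction to 6/7.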
Replies from: Schlega
↑ comment by Schlega · 2011-05-28T22:36:27.436Z · LW(p) · GW(p)
This still confuses me. 'Ball draws are completely unrelated and determined by completely separate processes' still contains information about how the balls were generated. It seems like if you observed a string of 10 red balls, then your hypothesis would lose probability mass to the hypothesis 'ball draws are red with p > 0.99.'
It seems like the problem only happens if you include an unjustified assumption in your 'prior', then refuse to consider the possibility that you were wrong.
My prior information is that every time I have found something Eliezer said confusing, it has eventually turned out that I was mistaken. I expect this to remain true, but there's a slight possibility that I am wrong.
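The point about losing probability mass can be sketched numerically. This is a rough illustration under assumptions of my own choosing: it compares 'draws are independent and fair' against a single alternative, 'draws are independent with an unknown red fraction, uniform on [0, 1]', rather than the 'p > 0.99' hypothesis mentioned above. The function name and the prior weights are hypothetical.

```python
from fractions import Fraction


def posterior_fair_after_reds(prior_fair: Fraction, n_reds: int) -> Fraction:
    """Posterior probability of the 'independent fair draws' hypothesis after
    seeing `n_reds` red balls in a row.

    Assumed alternative: independent draws with an unknown red fraction that
    is uniform on [0, 1].
    """
    # Likelihood of n reds under the fair hypothesis: (1/2)^n.
    like_fair = Fraction(1, 2) ** n_reds
    # Likelihood under the alternative: integral of p^n over [0, 1] = 1/(n+1).
    like_alt = Fraction(1, n_reds + 1)
    numerator = prior_fair * like_fair
    return numerator / (numerator + (1 - prior_fair) * like_alt)


if __name__ == "__main__":
    # 99% prior on 'fair', then ten reds in a row.
    print(posterior_fair_after_reds(Fraction(99, 100), 10))  # 1089/2113
    # Prior of exactly 1 on 'fair': no evidence can move it.
    print(posterior_fair_after_reds(Fraction(1), 10))         # 1
```

With any weight at all on the alternative, ten reds cut the 'fair' hypothesis from 99% to roughly a half. Only a prior that puts probability exactly 1 on the unstructured hypothesis never moves, which matches the suspicion that the trouble comes from refusing to consider that the assumption is wrong.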
Replies from: jsalvatier
↑ comment by jsalvatier · 2011-05-30T05:36:39.659Z · LW(p) · GW(p)
Yes, I thought about this a bit too, but didn't pay as much attention to being confused as you did. I'm not sure how to resolve it.
↑ comment by David Althaus (wallowinmaya) · 2011-05-27T17:07:31.745Z · LW(p) · GW(p)
This may be off-topic, but why don't you make your comment on the original post? IMO it would be more efficient if the discussion of a given post were centered in one place, so new readers could see all relevant critiques at once. And since there are already many comments on the original post, why not make your comments there? It would also add to the cohesion of LessWrong in general, I guess, but maybe I'm missing something ;)
Replies from: wallowinmaya
↑ comment by David Althaus (wallowinmaya) · 2011-05-27T17:12:09.146Z · LW(p) · GW(p)
I will reply to myself: Try Google.
comment by Matt_Simpson · 2011-05-27T06:38:58.267Z · LW(p) · GW(p)
I'd like to see the inductive bias / max-ent stuff fleshed out with some math. Any pointers?
Replies from: aletheilia
↑ comment by aletheilia · 2011-05-27T12:51:11.570Z · LW(p) · GW(p)
Probability Theory: The Logic of Science?