Changing the Definition of Science
post by Eliezer Yudkowsky (Eliezer_Yudkowsky)
New Scientist on changing the definition of science, ungated here:
Others believe such criticism is based on a misunderstanding. "Some people say that the multiverse concept isn't falsifiable because it's unobservable—but that's a fallacy," says cosmologist Max Tegmark of the Massachusetts Institute of Technology. He argues that the multiverse is a natural consequence of such eminently falsifiable theories as quantum theory and general relativity. As such, the multiverse theory stands or fails according to how well these other theories stand up to observational tests.
So if the simplicity of falsification is misleading, what should scientists be doing instead? Howson believes it is time to ditch Popper's notion of capturing the scientific process using deductive logic. Instead, the focus should be on reflecting what scientists actually do: gathering the weight of evidence for rival theories and assessing their relative plausibility.
Howson is a leading advocate for an alternative view of science based not on simplistic true/false logic, but on the far more subtle concept of degrees of belief. At its heart is a fundamental connection between the subjective concept of belief and the cold, hard mathematics of probability.
I'm a good deal less of a lonely iconoclast than I seem. Maybe it's just the way I talk.
The points of departure between myself and mainstream let's-reformulate-Science-as-Bayesianism are that:
(1) I'm not in academia and can censor myself a lot less when it comes to saying "extreme" things that others might well already be thinking.
(2) I think that just teaching probability theory won't be nearly enough. We'll have to synthesize lessons from multiple sciences like cognitive biases and social psychology, forming a new coherent Art of Bayescraft, before we are actually going to do any better in the real world than modern science. Science tolerates errors, Bayescraft does not. Nobel laureate Robert Aumann, who first proved that Bayesians with the same priors cannot agree to disagree, is a believing Orthodox Jew. Probability theory alone won't do the trick, when it comes to really teaching scientists. This is my primary point of departure, and it is not something I've seen suggested elsewhere.
(3) I think it is possible to do better in the real world. In the extreme case, a Bayesian superintelligence could use enormously less sensory information than a human scientist to come to correct conclusions. First time you ever see an apple fall down, you observe the position goes as the square of time, invent calculus, generalize Newton's Laws... and see that Newton's Laws involve action at a distance, look for alternative explanations with increased locality, invent relativistic covariance around a hypothetical speed limit, and consider that General Relativity might be worth testing. Humans do not process evidence efficiently—our minds are so noisy that it requires orders of magnitude more extra evidence to set us back on track after we derail. Our collective, academia, is even slower.
Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).
comment by Pedro ·
2008-05-18T18:55:38.000Z
"the multiverse is a natural consequence of such eminently falsifiable theories as quantum theory and general relativity. As such, the multiverse theory stands or fails according to how well these other theories stand up to observational tests."
That seems to me to be the fallacy (denying the antecedent). Not that it matters much to his overall message.
I agree with your point about cognitive biases and psychology. With straight up yes/no true/false questions using the hypothetico-deductive method, these things are less important, I think - but when you switch to degrees of belief and plausibility, you really must have a good meta-understanding of your own reasoning.
I think you muddle through some things in point (3). You already know the questions you would ask, because you already know the answer which was reached.
comment by Tom_McCabe2 ·
2008-05-18T18:59:51.000Z
"Science tolerates errors, Bayescraft does not. Nobel laureate Robert Aumann, who first proved that Bayesians with the same priors cannot agree to disagree, is a believing Orthodox Jew."
I think there's a larger problem here. You can obviously make a great deal of progress by working with existing bodies of knowledge, but when some fundamental assumption breaks down, you start making nonsensical predictions if you can't get rid of that assumption gracefully. Aumann learned Science, and Science worked extremely well when applied to probability theory, but because Aumann didn't ask "what is the general principle underlying Science, if you move into an environment without a long history of scientific thought?", he didn't derive principles which could also be applied to religion, and so he remained Jewish. The same thing, I dare say, will probably happen to Bayescraft if it's ever popularized. Bayescraft will work better than Science, across a larger variety of situations. But no textbook could possibly cover every situation- at some point, the rules of Bayescraft will break down, at least from the reader's perspective (you list an example at http://lesswrong.com/lw/nc/newcombs_problem_and_regret_of_rationality/). If you don't have a deeper motivation or system underlying Bayescraft, you won't be able to regenerate the algorithms and work around the error. It's Feynman's cargo cult science, applied to Science itself.
comment by Peter_Turney ·
2008-05-18T19:29:27.000Z
Bayesianism has its uses, but it is not the final answer. It is itself the product of a more fundamental process: evolution. Science, technology, language, and culture are all governed by evolution. I believe that this gives much deeper insight into science and knowledge than Bayesianism. See:
(1) Multiple Discovery: The Pattern of Scientific Progress, Lamb and Easton
(2) Without Miracles: Universal Selection Theory and the Second Darwinian Revolution, Cziko
(3) Darwin's Dangerous Idea: Evolution and the Meanings of Life, Dennett
(4) The Evolution of Technology, Basalla
Scientific method itself evolves. Bayesianism is part of that evolution, but only a small part.
comment by Caledonian2 ·
2008-05-18T19:32:21.000Z
Well thank the benevolence of the Friendly AI that this intelligence didn't see a helium balloon first. Just imagine the kinds of theories it might produce then!
If you see one object falling in a particular way, you might infer that all objects fall that way - but it's an extremely weak inference, as the strength of a single observation is spread over the entirety of "all things". We were so confident in Newton's formulation for such a long time because we had a vast store of observations, and were aware of confounding influences that masked the underlying pattern: things like air resistance and buoyancy. The understanding that all things fall at a given rate was a strong and reliable inference because we observed it to hold across many, many things. Once we knew that, we could show that such behavior was consistent with Newton's hypothesized force. More importantly, we had already determined through observation that the objects in the Solar system moved in elliptical orbits, but we didn't know why. We were able to show that Newton's hypothesized forces would result in objects moving in such a way, and so concluded that his description was correct.
Eliezer is almost certainly wrong about what a hyper-rational AI could determine from a limited set of observations. It would probably notice the implications of Maxwell's laws that require Relativity to fully explain - something real physicists missed for a generation - because the implications follow directly from the mathematics. Actually producing the laws in the first place requires a lot of data regarding electricity and magnetism.
His projected super-intelligence would very quickly outleap its data and rush to all sorts of unsupportable inferences. If it confused those inferences with conclusions, it would fall into error faster than we could possibly correct it, and if it lacked the long, slow, tedious process of checking and re-checking data that science uses, it would be unlikely to ever correct those errors.
comment by Psy-Kosh ·
2008-05-18T19:38:40.000Z
I agree with Pedro about point 3.
There would have to be a bunch of things your Bayesian Superintelligence would have to already know, as near as I can tell, to get all that. First, it'd have to somehow or other have worked out the earth is round, that stuff falls toward the center of the earth rather than in some universal "down" direction. That would help it get the hint that gravitation can vary depending on location. If it also knew of various astronomical objects and their apparent motions in the sky, then it'd have a good starting point to go in the direction you suggest.
But it'd also have to have made enough other observations to at least get the hint about locality. I concede that for a BS... er... maybe a different acronym would be better... anyways, for one of those, it probably wouldn't take too much observation to at least have a good suspicion about the importance of locality. Basically, noticing that things like distance/time/etc actually do seem relevant to how things interact. ie, enough observation to notice that there are properties like distance and time would, I suspect, be enough.
But one apple falling, on its own, without all the surrounding context of apples, falling, ground, earth, sky, probably wouldn't do it.
comment by Daniel_Yokomizo ·
2008-05-18T20:17:10.000Z
I've been reading these last posts on Science vs. Bayes and I really don't get it. I mean, obviously bayesian reasoning supersedes falsifiability and how to analyze evidence, but there's no conflict. Like relativity vs. newtonian mechanics, there are thresholds that we need to cross to see the failures in Science, but there are many situations when just Science works effectively.
The New Scientist is even worse, the idea that we need to ditch falsifiability and use Bayes is idiotic, it's like saying that binary logic should be discarded because we can use probabilities instead of zero and one. Falsifiability is a special case of Bayes, we can't have Bayes without falsifiability (as we can't have natural's addition ruling out 2+2=4), the people that argue this don't understand the extents of Bayes.
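One way to make the "falsifiability is a special case of Bayes" claim concrete is a two-hypothesis update. The sketch below is illustrative only: the function name `posterior` and every number in it are assumptions chosen for the example, not anything taken from the article.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) by Bayes' rule for a hypothesis H and observed evidence E."""
    joint_h = prior_h * p_e_given_h
    joint_not_h = (1.0 - prior_h) * p_e_given_not_h
    total = joint_h + joint_not_h
    if total == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return joint_h / total


# Graded evidence: H made E somewhat unlikely, so belief in H drops but survives.
weakened = posterior(prior_h=0.9, p_e_given_h=0.2, p_e_given_not_h=0.6)

# Popper-style falsification: H forbade E outright, so observing E
# drives P(H | E) to exactly zero, whatever the prior was.
falsified = posterior(prior_h=0.9, p_e_given_h=0.0, p_e_given_not_h=0.6)

print(weakened, falsified)
```

On this reading, a falsifying observation is simply the limiting case where the likelihood under the hypothesis is zero; every other case moves the probability without eliminating the hypothesis.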
WRT multiverse IMHO we have to separate the interpretation of some theory from the theory itself. If the theory (which is testable, falsifiable, etc.) holds against the evidence and one of its results is the existence of a multiverse, then we have to accept the existence of the multiverse. If it isn't one of the results, but it is one possible interpretation of how the theory "really works", then we are in the realm of philosophy and we can spend thousands of years arguing any way without going forward. In most cases of QM theories there's no clear separation of both, so people attach themselves to the interpretations instead of using the results. If we have two hypotheses that explain the same phenomena we have three possible choices:
- they're equal up to isomorphism (which means it doesn't matter which one we choose, other than convenience).
- one is simpler than the other (using whatever criteria of complexity we want to use).
- both explain more than the phenomena.
Number 1 is a no-brainer. Number 3 is the most usual situation, where the evidence points either way and new evidence is necessary to confirm in both directions. We can use Bayes to assess the probability of each one being "the right one", but if both theories don't contradict each other then there's a smaller theory inside each that falls in the case number 1. Number 2 is the most problematic because plain use of complexity assessment doesn't guarantee that we are picking the right one. The problem lies in the evidence available: there's no way to know if we have sufficient evidence to rule out any one. Just because an equation is simpler it doesn't mean it's correct, perhaps our data set is well known. Again it should be the case that the simpler theory is isomorphic to a subset of the larger theory.
The only argument that needs to be spoken is if the multiverse is a result or an interpretation, but in the strictest sense of the word: we can't say it's an interpretation assuming that X and Y hold, because then it's an interpretation of QM + X + Y. AFAIK every "interpretation" of QM extends the assumptions in a particular direction. Personally I find the multiverse interpretation cleaner, mathematically simpler and I would bet my money on it.
On your points of departure:
(1) Shows how problematic academia is. I think the academic model is a dead end, we should value rationality more than quantity of papers published, the whole politics of the thing is far too inefficient.
(2) It won't be enough because our culture values rationality much less than anything else. Even without bayesian reasoning plain old Science rules out the bible, you can either believe in logic or the bible. One of the best calculus professors I had was a fervent adventist. IMO our best strategy is just to outsmart the irrationalists, our method is proven and yields much better results, we just need to keep compounding it to the singularity ;)
(3) You're dead wrong (in the example). There are many other necessary experiments other than seeing an apple fall to realize special relativity. Actually a bayesian super-intelligence could get trapped in a local maximum for a long time until the "right" set of experiments happened. We have a history of successes in science but there's a long list of known failures, let alone the unknown failures.
comment by poke ·
2008-05-18T21:10:53.000Z
Anybody who thinks Popper provided useful insights into how science proceeds should read David Stove's Scientific Irrationalism. Stove tears Popper to shreds. (He also defends inductive probabilism so he'd be agreeable to seekers of the Way.) Popper's theory never gained much traction in philosophy (inductive probabilism, even Bayesianism, has garnered more serious interest) but certain popularizers who happen to be Popperites (notably Bryan Magee) have given him a false sense of prominence in their works. The particular philosophies of science that scientists espouse at any given time are subject to fad; logical positivism, instrumentalism, Kuhn's revolutions, they've all been popular at some point. Personally I think this gives credence to the idea that none of them have anything useful to say.
comment by JulianMorrison ·
2008-05-18T23:08:29.000Z
Do you propose that humans could, if not achieve, then get much closer to the efficient evidence use of a hypothetical super-AI? Let's say, savage to Einstein in a lifetime, assuming said savage starts out pre-trained in Bayescraft.
comment by JulianMorrison ·
2008-05-19T00:59:34.000Z
OK, so what speedup can we expect?
Start with: what speedup do YOU get? As the originator of this synthesis, you could reasonably be expected to be furthest along.
comment by bambi ·
2008-05-19T03:04:19.000Z
Sufficiently-advanced Bayesian rationality is indistinguishable from magic.
It's possible to be "smart" and a nutter at the same time you know.
comment by Thanatos_Savehn ·
2008-05-19T04:48:56.000Z
Let's not and say we did.
Recall I'm a lawyer fighting a fairly lonely battle for sound science in the courtroom. So let me tell you; abandoning Popper and falsification (i.e. rejecting Daubert v. Merrell Dow) and going to a subjective "more likely than not belief" standard is nothing but a recipe for handing billions more over to the already super rich trial lawyers (several of whom are starting to report to prison, along with their experts, for the perjury, bribing of judges, etc that went on back in the bad old days).
Alas. The claim that "science can't prove anything for sure so let's allow "experts" to testify about what they believe and have the jurors sort it out" is starting to surface in cases I work on. In those cases, self-proclaimed experts charging $500 or more per hour to testify to their beliefs, which allegedly arise out of the penumbras of their expertise, and which correlate precisely with the position of the side that hired them, are already cashing in. The law will not appreciate the niceties of the argument here and it won't be long before we have PhDs testifying that MRI machines damage ESP powers ... again. (My personal favorite was an expert in New York who was allowed to testify that C6H6 made synthetically was toxic at the one molecule dose level but that C6H6 generated by the body was not because "natural" benzene had a "life resonance electron level" which kept its "electron cloud in a harmless state". He believed it and believed it strongly so he got to testify to it because he is an MD/PhD.)
So just say "Hell NO!" to "subjective degrees of belief", or anything similar, as the standard definition of science in tort cases .... and that's what it will be if "science" says that's what real science is. Unless you're a trial lawyer who files his cases in a poor rural area it will cost you beaucoup otherwise. I get the distinction but most won't and the scoundrels will have a field day.
comment by Tiedemies2 ·
2008-05-19T05:52:40.000Z
Apart from articles submitted for peer-review (while expecting them to be published), exactly how and why do academics need to censor themselves? I am not talking about politically sensitive ideas like feminism or racial relations, rather, on philosophy of science. Do you have reason to believe there are "closet-bayesians" in academia?
I think you make an excellent point here, though, but I also think you are being too harsh on (us) academics.
comment by Daniel2 ·
2008-05-19T05:52:51.000Z
It seems to me that in science, there is always an implicit agreement that the current theory could be revised in light of new contradictory evidence. As far as I can tell, the Bayesian approach seems to lack this feature, since we have to assume a fixed model of the world to do the probability updates.
For example, what's the probability that the sun will rise tomorrow? How do you even calculate it? (To keep things simple, suppose you have seen the sun rise N times).
More abstractly, suppose every day you get a bit of information from some source, and the first N bits are all one. What's the probability that the next bit is one? How would the perfect Bayesian mind answer that?
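One classical answer to the "N ones in a row" question is Laplace's rule of succession. The sketch below assumes the bits are independent draws with a single unknown bias and a uniform prior on that bias; those are modeling assumptions, not something the question supplies, and a different prior gives a different number. The function name is hypothetical.

```python
from fractions import Fraction


def rule_of_succession(successes, trials):
    """P(next bit is 1) after `successes` ones in `trials` bits,
    assuming i.i.d. bits and a uniform prior on the unknown bias."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# All N bits seen so far were ones.
for n in (0, 1, 10, 1000):
    print(n, rule_of_succession(n, n))
```

With no data the answer is 1/2, and after N ones it is (N+1)/(N+2), which approaches but never reaches 1. The result is conditional on the fixed model, so it illustrates the worry rather than dissolving it.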
An interesting way to avoid all this is to simply look at behavior (rather than beliefs) and apply an evolutionary argument which goes like this:
Finding and exploiting patterns is useful for survival, so evolution favored organisms that could do so. No "laws" required. The universe just needs to be orderly enough for life to survive. It need not make sense all the way down. I don't believe it, but it's interesting nevertheless.
comment by Gray_Area ·
2008-05-19T22:26:55.000Z
"Eliezer is almost certainly wrong about what a hyper-rational AI could determine from a limited set of observations."
Eliezer is being silly. People invented computational learning theory, which among other things, shows the minimum number of samples needed to recover a given error rate.
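For readers unfamiliar with the kind of result being invoked, here is a minimal sketch of one textbook bound from computational learning theory: the number of samples sufficient for a consistent learner over a finite hypothesis class in the realizable case. It is a sufficient count, not a proven minimum, and it is stated relative to a fixed hypothesis class. The function name and the example numbers are assumptions made for illustration.

```python
import math


def pac_sample_bound(num_hypotheses, epsilon, delta):
    """Samples sufficient so that, with probability >= 1 - delta, every
    hypothesis consistent with the data has true error <= epsilon.
    Finite class, realizable case: m >= (ln|H| + ln(1/delta)) / epsilon."""
    if num_hypotheses < 1:
        raise ValueError("hypothesis class must be non-empty")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon)


# Illustrative numbers only: about a million hypotheses, 5% error, 99% confidence.
m = pac_sample_bound(num_hypotheses=2**20, epsilon=0.05, delta=0.01)
print(m)
```

The count grows only logarithmically in the size of the class, but it needs the class to be specified in advance.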
comment by Caledonian2 ·
2008-05-19T22:44:56.000Z
He's also leaving out concepts like object permanence. Not only does the AI have to have lots of examples before it can form any general conclusions, it has to produce the most basic conclusions first.
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) ·
2008-05-19T23:02:31.000Z
Gray Area is being silly. I am quite aware of Probably Approximately Correct learning. Would you care to try to apply that theory to Einstein's invention of General Relativity? PAC-learning theorems only work relative to a fixed model class about which we have no other information.
If you can see an apple fall, you already know enough to interpret the input to your webcam as an apple falling. This might require up to a dozen frames of environmental monitoring in order to notice all the objects there - the higher-resolution the webcam, the less time.
Think of Einstein, in a tiny box, thinking a million times as fast and much more cleanly, pondering each frame coming in off the webcam for a thousand years. At what point does he think the pixels might describe a permanent object? Perhaps before he has even seen two frames in succession - he can see many permanent objects just in the landscape of his own mind. At what point does he suspect a 3D world behind the 2D world? As soon as he sees two frames in succession. At what point does he suspect Galileo's formula for gravity? Three frames in succession. At what point does he suspect this formula is universal? As soon as he sees blades of grass leaning over; plus it's a very obvious hypothesis to the right kind of Bayesian. At what point does he suspect General Relativity? As soon as he notices locality of interaction as a principle applying to many things in the environment, and wonders, backed by a Judea-Pearl-like understanding of causality, if the locality principle is universally applied to the spatial organization induced from the webcam.
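The "three frames" figure has a simple arithmetic reading: three equally spaced position samples are exactly enough to pin down a constant acceleration through the second difference. The sketch below assumes the object's height has already been extracted from each frame in metres and that the frame interval is known; the function names, the 30 fps rate, and the simulated drop are all hypothetical.

```python
def constant_acceleration_from_frames(y0, y1, y2, dt):
    """Acceleration implied by three equally spaced position samples,
    assuming the motion is exactly quadratic in time."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Second finite difference divided by dt^2.
    return (y2 - 2.0 * y1 + y0) / (dt * dt)


G = 9.81          # m/s^2, used only to simulate the falling object
DT = 1.0 / 30.0   # seconds between frames (assumed 30 fps)


def height(t, y_start=2.0):
    """Simulated height of an object dropped from rest at y_start."""
    return y_start - 0.5 * G * t * t


frames = [height(k * DT) for k in range(3)]
estimate = constant_acceleration_from_frames(*frames, DT)
print(estimate)
```

Three points always fit some quadratic, so three frames can suggest the square-of-time law and fix its parameter but cannot test it; a fourth frame is the first that could disagree with the fit.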
comment by Caledonian2 ·
2008-05-19T23:26:13.000Z
I'm fairly sure that trapped-in-a-tiny-box Einstein would go completely mad having so little data to analyze in so much subjective time.
Past experiments in sensory deprivation suggest that human neurology requires high-information input to function properly - and I see no reason why artificial brains would be any different. If the speed of thought increases by a factor of a thousand, the access to data must increase by at least as much.
comment by Adirian ·
2008-05-20T00:12:18.000Z
Your high-capacity Einstein would come to the conclusion, left to those parameters, that the picture never changes. The pattern for that is infinitely stronger, thinking so quickly, than any of the smaller patterns within. Indeed, processing the same information so many times, it will encounter information miscopies nigh-infinitely more often than it encounters a change in the data itself - because, after all, a quantum computer will be operating on information storage mechanisms sensitive enough to be altered by a microwave oven a mile away.
You have a severe bootstrapping problem which you're ignoring - thought requires subject. Consciousness requires something to be conscious of. You can't design a consciousness and throw things for it to be conscious of after the fact. You have to start with the webcam and build up to the mind - otherwise the bits flowing in are meaningless. No amount of pattern recognition will give meaning to patterns.
comment by Gray_Area ·
2008-05-20T01:31:45.000Z
"Would you care to try to apply that theory to Einstein's invention of General Relativity? PAC-learning theorems only work relative to a fixed model class about which we have no other information."
PAC-learning stuff is, if anything, far easier than general scientific induction. So should the latter require more samples or less?
comment by Nick6 ·
2008-05-20T17:51:07.000Z
I really don't understand the debate. Bayesian reasoning IS the reasoning that scientists use. It is the method underlying the evolution of scientific theory. Popperian falsification is just some theory, more a prescriptive than descriptive rule. It's a pie in the sky which doesn't explain how the body of scientific knowledge evolves in time.
In practice, evidence is gathered to support or falsify a given scientific premise. Newtonian mechanics was TRUE until proven otherwise. And today's theories are more or less true based on their ability to explain reality (i.e., the same thing as positive evidence in a probabilistic sense) and not be disproved (i.e., have negative evidence against them). In reality, there are limits to our understanding and the scientist with any real sense of humility should agree with Box when he said that all models are false but some are useful.
Daniel, I think what you say about an implicit agreement that the current theory could be revised in light of new contradictory evidence is exactly Bayesian: a form of Bayesian model selection, where it may be that no theory or model is ever thrown out completely, just assigned a very low probability. Many evolutionary arguments are just a form of Bayesian update, conditioning on new evidence.
The idea that Bayesian decision theory being descriptive of the scientific process is very beautifully detailed in classics like Pearl's book, Causality, in a way that a blog or magazine article cannot so easily convey. In a different vein, for a very readable explanation of how "truth" changes, even in mathematics, the most pure of sciences, have a look at Imre Lakatos' book, Proofs and Refutations. In this book, Lakatos makes it clear that even mathematicians can use a Bayesian update of mathematical "evidence" for or against a given hypothesis, and that old "proofs" even by the greatest of mathematicians often have holes poked in them in time.
Now pure application of Bayes' rule may just merely give the probability that a theory/model is true. In reality, we probably do have some utility/loss function that gives us a decision rule as to whether we wish to use or discard a given theory. This loss function approach will actually allow us to use "false" theories such as Newtonian mechanics, when there is some utility to it, even though the evidence against them is immense.
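The loss-function point can be sketched as a small decision rule. Everything below is hypothetical: the `Theory` fields, the additive loss, and all the numbers are assumptions picked to show the shape of the argument, not measured quantities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theory:
    name: str
    posterior: float         # probability the theory is exactly true (illustrative)
    prediction_error: float  # expected error on the task, arbitrary units
    compute_cost: float      # effort needed to apply the theory, arbitrary units


def loss(theory, error_weight=1.0, cost_weight=1.0):
    """Loss of using a theory for a task: weighted error plus weighted effort.
    The posterior is deliberately not part of the loss."""
    return error_weight * theory.prediction_error + cost_weight * theory.compute_cost


def choose(theories, **weights):
    """Pick the theory with the lowest loss under the given weights."""
    return min(theories, key=lambda t: loss(t, **weights))


newton = Theory("Newtonian mechanics", posterior=1e-9,
                prediction_error=0.001, compute_cost=1.0)
relativity = Theory("General Relativity", posterior=0.5,
                    prediction_error=0.0001, compute_cost=50.0)

everyday = choose([newton, relativity])                     # effort dominates
precision = choose([newton, relativity], error_weight=1e6)  # accuracy dominates
print(everyday.name, "|", precision.name)
```

With these made-up weights the nearly refuted theory wins the everyday task and loses the precision task, so whether to use a theory and how probable it is are separate questions.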
What Eliezer is saying in the blog and what is said in the NS article is basically descriptive, imho, let's call a spade a spade...science is already Bayesian. Those of you who cannot really accept it and think this opens up science to the possibility of witchcraft are filled with a great idealism in how science is currently conducted behind closed doors. Either that, or like the Church fathers who silenced Galileo, you're awfully scared that the opposite of your dogma, witchcraft, might have an element of truth in it. Being honest about Bayesianism means we have to consider all the alternatives.
But, to be reassuring, I don't think we've seen a terrible amount of positive evidence for witchcraft lately....
comment by Gray_Area ·
2008-05-20T18:54:19.000Z
"The idea that Bayesian decision theory being descriptive of the scientific process is very beautifully detailed in classics like Pearl's book, Causality, in a way that a blog or magazine article cannot so easily convey."
I wish people would stop bringing up this book to support arbitrary points, like people used to bring up the Bible. There's barely any mention of decision theory in Causality, let alone an argument for Bayesian decision theory being descriptive of all scientific process (although Pearl clearly does talk about decisions being modeled as interventions).
comment by Brian_Macker ·
2008-05-22T06:54:53.000Z
"Howson believes it is time to ditch Popper's notion of capturing the scientific process using deductive logic."
Another person who doesn't understand Popper. It's as if the guy believed cars were nothing but wheels. Deduction is only part of Popper's theory. The theory can in fact subsume just about any method (till it's shown not to work). It's really just disciplined evolution. It's certainly not merely about using deduction.
comment by Brian_Macker ·
2008-05-22T07:00:16.000Z
"First time you ever see an apple fall down, you observe the position goes as the square of time, .."
Well no actually you don't. Not unless you prebuild the system to know about time and squaring, etc. Have you no respect for evolution? Evolution is how you get to the point where you have semantics.
comment by lowasser ·
2011-06-08T18:25:02.929Z
I'm curious as to whether Wegener's theory of continental drift works as a case where a Bayesian model would have done better than Science. The coincidences, paleontological, biological, and geological, between South America and Africa -- how they fit together in so many ways -- should have been seen as convincing evidence for continental drift, even before plate tectonics was invented to provide a mechanism....or am I overidealizing the past?
comment by [deleted] ·
2020-01-08T23:06:49.156Z
I think it is possible to do better in the real world. In the extreme case, a Bayesian superintelligence could use enormously less sensory information than a human scientist to come to correct conclusions. First time you ever see an apple fall down, you observe the position goes as the square of time, invent calculus, generalize Newton’s Laws… and see that Newton’s Laws involve action at a distance, look for alternative explanations with increased locality, invent relativistic covariance around a hypothetical speed limit, and consider that General Relativity might be worth testing.
How is this not hindsight bias?