Comments
If science is falsifiable and therefore uncertain, is any of it true? If not, then I assume JTB must judge "scientific knowledge" to be an oxymoron.
If some scientific knowledge is true, does that mean that the theory will not be revised, extended or corrected in the next 1,000 years?
Does truth apply to science? If not, should "true" be included in our definition of knowledge?
Yes on re-reading I see what you are saying.
Yes, thanks, and the standard mathematical description of the change in frequency of alleles over generations is given in the form of a Bayesian update where the likelihood is the ratio of reproductive fitness of the particular allele to the average reproductive fitness of all competing alleles at that locus.
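The update described above can be sketched in a few lines. This is only a minimal illustration, assuming a single locus, no mutation or drift, and made-up fitness values; the function name `update_allele_frequencies` is hypothetical.

```python
def update_allele_frequencies(frequencies, fitnesses):
    """One generation of selection written in the form of a Bayesian update.

    prior      = current frequency of each allele
    likelihood = fitness of the allele / mean fitness of all competing alleles
    posterior  = frequency of each allele in the next generation
    """
    mean_fitness = sum(p * w for p, w in zip(frequencies, fitnesses))
    return [p * w / mean_fitness for p, w in zip(frequencies, fitnesses)]


# Hypothetical example: two alleles, the first 10% fitter than the second.
freqs = [0.5, 0.5]
fitness = [1.1, 1.0]
for generation in range(1, 4):
    freqs = update_allele_frequencies(freqs, fitness)
    print(generation, [round(p, 4) for p in freqs])
```

Dividing by the mean fitness plays the same role as dividing by the evidence in Bayes' rule: it keeps the frequencies summing to 1.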
What a wonderful post!
I find it intellectually exhilarating as I have not been introduced to Solomonoff before and his work may be very informative for my studies. I have come at inference from quite a different direction and I am hopeful that an appreciation of Solomonoff will broaden my scope.
One thing that puzzles me is the assertion that:
Therefore an algorithm that is one bit longer is half as likely to be the true algorithm. Notice that this intuitively fits Occam's razor; a hypothesis that is 8 bits long is much more likely than a hypothesis that is 34 bits long. Why bother with extra bits? We’d need evidence to show that they were necessary.
First, my understanding of the principle of maximum entropy suggests that prior probabilities are constrained only by evidence and not by the length of the hypothesis test algorithm. In fact Jaynes argues that 'Ockham's' razor is already built into Bayesian inference.
Second, given that the probability is reduced by half with every bit of added algorithm length, wouldn't that imply that algorithms having 1 bit are the most likely and have a probability of 1/2? In fact I doubt if any algorithm at all is describable with 1 bit. Some comments as well as the body of the article suggest that the real accomplishment of Solomonoff's approach is to provide the set of all possible algorithms/hypotheses and that the probabilities assigned to each are not part of a probability distribution but rather are for the purposes of ranking. Why do they need to be ranked? Why not assign them all probability 1/N where N = 2^(n+1) - 2, the number of algorithms having length up to and including length n?
Clearly I am missing something important.
Could it be that ranking them by length is for the purpose of determining the sequence in which the possible hypotheses should be evaluated? When ranking hypotheses by length and then evaluating them against the evidence in sequence from shorter to longer, our search will stop at the shortest possible algorithm, which by Occam's razor is the preferred algorithm.
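The counting in the question above can be checked directly. The sketch below assumes, purely for illustration, that every bit string of length 1 to n counts as a candidate algorithm and that each one is given the weight 2^-length. That is my naive reading of the article, not a claim about how Solomonoff's construction actually works, and the function names are hypothetical.

```python
from fractions import Fraction


def count_algorithms(max_length):
    """Number of bit strings with length from 1 to max_length inclusive."""
    return sum(2 ** k for k in range(1, max_length + 1))


def total_naive_weight(max_length):
    """Sum of the weights 2**-length over all of those bit strings."""
    # There are 2**k strings of length k, each weighted 2**-k,
    # so every length contributes exactly 1 to the total.
    return sum(2 ** k * Fraction(1, 2 ** k) for k in range(1, max_length + 1))


n = 10
print(count_algorithms(n), 2 ** (n + 1) - 2)  # the two counts agree
print(total_naive_weight(n))                  # the total equals n, not 1
```

Under that naive assumption the weights sum to n rather than to 1, which is consistent with the suspicion that, taken this way, they behave like a ranking rather than a probability distribution.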
Excellent post.
I have pondered the same sort of questions. Here is an excerpt from my 2009 book.
My father is 88 years old and a devout Christian. Before he became afflicted with Alzheimer’s he expected to have an afterlife where he would be reunited with his deceased daughter and other departed loved ones. He doesn’t talk of this now and would not be able to comprehend the question if asked. He is now almost totally unaware of who he is or what his life was. I sometimes tell him the story of his life, details of what he did in his working life, stories of his friends, the adventures he undertook. Sometimes these accounts stir distant memories. I have recently come to understand that there is more of ‘him’ alive in me than there is in him. When he dies and were he to enter the afterlife in his present state and be reunited with my sister he would not recognize or remember her. Would he be restored to some state earlier in his life? Would he be the same person at all?
I originally wrote this to illustrate problems with the religious idea of resurrection. I now believe that this problem of identity is common to all complex evolving systems including 'ourselves'. For example, species evolve over their lifetime, and although we intuitively know that we are identifying something distinct when we name a species such as Homo sapiens, the exact nature of the distinction is slippery. The debate in biology over the definition of species has been long, heated and unresolved. Some definitions of species are attempts along the lines of interbreeding populations that do not overlap with other populations. However this is a leaky definition. For example, it has recently been found that modern human populations contain some Neanderthal DNA. Our 'species' interbred in the past; should we still be considered separate species?
The 'irreducible complexity' argument advocated by the intelligent design community often cites the specific example of the eye. It is argued that an eye is a complex organ with many different individual parts that all must work together perfectly and that this implies it could not have been gradually built out of small gradual random changes.
This argument has been around a long time but it has been well answered within the scientific literature and the vast majority of biologists consider the issue settled.
Dawkins' book 'Climbing Mount Improbable' provides a summary of the science for the lay reader and uses the eye as a detailed example.
Darwin was the first to explain how the eye could have evolved via natural selection. I quote the Wikipedia article:
Charles Darwin himself wrote in his Origin of Species, that the evolution of the eye by natural selection at first glance seemed "absurd in the highest possible degree". However, he went on to explain that despite the difficulty in imagining it, this was perfectly feasible:
...if numerous gradations from a simple and imperfect eye to one complex and perfect can be shown to exist, each grade being useful to its possessor, as is certainly the case; if further, the eye ever varies and the variations be inherited, as is likewise certainly the case and if such variations should be useful to any animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperable by our imagination, should not be considered as subversive of the theory.
The argument of 'irreducible complexity' has been around since Darwin first proposed natural selection and it has been conclusively answered within the scientific literature (for a good summary see the Wikipedia article). Those who believe that all life was created by God cannot believe the scientific explanation. In my view the real problem is that they tend to argue that they have superior scientific evidence which proves that the scientific consensus is wrong. In other words the intelligent design community argues they are scientifically superior to the science community. This reduces their position to an undignified one of deception or perhaps even fraud.
I was also inspired by one of Dawkins' books suggesting something similar. It was some years ago but I believe Dawkins suggested writing a type of computer script which would mimic natural selection. I wrote a script and was quite surprised at the power it demonstrated.
As I remember the general idea is that you can type in any string of characters you like and then click the 'evolve' button. The computer program then:
1) generates and displays a string of random characters of the same length as the entered string.
2) compares the new string with the entered string and retains all characters that are the same and in the same position.
3) generates random characters in the string where they did not match in 2 and displays the full string.
4) If the string in 3 matches the string entered by the user the program stops, otherwise it goes to step 2.
The rapidity with which this program converges on the string entered is quite surprising.
This simulation is somewhat different from natural selection especially in that the selection rules are hard coded but I think it does demonstrate the power of random changes to converge when there is strong selection pressure.
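For anyone who wants to try it, here is a minimal sketch of the four steps as I remember them. It is a reconstruction, not my original script; the alphabet (upper-case letters plus space), the function name `evolve` and the `seed` parameter are all assumptions made for the example.

```python
import random
import string

# Assumed alphabet: upper-case letters and the space character.
ALPHABET = string.ascii_uppercase + " "


def evolve(target, seed=None):
    """Return (final_string, generations) for the entered target string."""
    if any(ch not in ALPHABET for ch in target):
        raise ValueError("target may only contain upper-case letters and spaces")
    rng = random.Random(seed)
    # Step 1: a random string of the same length as the entered string.
    current = [rng.choice(ALPHABET) for _ in target]
    generations = 0
    # Step 4: stop once the string matches the entered string.
    while "".join(current) != target:
        # Steps 2 and 3: keep characters that match in position,
        # regenerate the rest at random.
        current = [
            c if c == t else rng.choice(ALPHABET)
            for c, t in zip(current, target)
        ]
        generations += 1
    return "".join(current), generations


result, generations = evolve("METHINKS IT IS LIKE A WEASEL", seed=1)
print(result, generations)
```

Because matching characters are locked in place, each position only has to be guessed correctly once, which is why convergence is so much faster than waiting for the whole string to appear by chance.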
A fascinating aid in demonstrating natural selection was built by Darwin's cousin Francis Galton in 1877. An illustration and description can be found here. The amazing thing about this device is that, as described in the article, it has been rediscovered and repurposed to illustrate the process of Bayesian inference.
I have come to consider this isomorphism between Bayesian inference and natural selection or Darwinian processes in general as a deep insight into the workings of nature. I view natural selection as a method of physically performing Bayesian inference, specifically as a method for inferring means for reproductive success. My paper on this subject may be found here.
I agree with your statement:
if we require 100% justified confidence to consider something knowledge, no one knows or can know a single thing.
However I think you are misunderstanding me.
I don't think we require 100% justified confidence for there to be knowledge. I believe knowledge is always a probability and that scientific knowledge is always something less than 100%.
I suggest that knowledge is justified belief but it is always a probability less than 100%. As I wrote: I mean justified in the Bayesian sense which assigns a probability to a state of knowledge. The correct probability to assign may be calculated with the Bayesian update.
This is a common Bayesian interpretation. As Jaynes wrote:
In our terminology, a probability is something that we assign, in order to represent a state of knowledge.
You misunderstand me. I did not say it was 'known' the theory was true.
I reject the notion that any scientific theory can be known to be 100% true. I stated:
Perhaps those scientists from the past should have said it had a high probability of being true.
As we all know now, Newton's theory of gravitation is not 100% true and therefore in a logical sense it is not true at all. We have counterexamples, such as the shift of Mercury's perihelion, which it does not predict. However the theory is still a source of knowledge; it was used by NASA to get men to the moon.
Perhaps considering knowledge as an all or none characteristic is unhelpful.
If we accept that a theory must be true or certain in order to contain knowledge it seems to me that no scientific theory can contain knowledge. All scientific theories are falsifiable and therefore uncertain.
I also consider it hubris to think we might ever develop a 'true' scientific theory as I believe the complexities of reality are far beyond what we can now imagine. I expect however that we will continue to accumulate knowledge along the way.
I would be interested if you would care to elaborate a little. Syllogisms have been a mainstay of philosophy for over two millennia and undoubtedly I have a lot to learn about them.
In my admittedly limited understanding of syllogisms the conclusion is true given the premises being true. Truth is more in the structure of the argument than in its conclusion. If Socrates is not mortal then either he is not a man or not all men are mortal.
Yes I agree, there is only a rough isomorphism between the mathematics of binary logic and the real world; binary logic seems to describe a limit that reality approaches but never reaches.
We should consider that the mathematics of binary logic are the limiting case of probability theory; it is probability theory where the probabilities may only take the values of 0 or 1. Probability theory can do everything that logic can but it can also handle those real world cases where the probability of knowing something is something other than 0 or 1, as is the usual case with scientific knowledge.
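A small sketch may make the limiting case concrete. It assumes, for simplicity, that the two propositions are independent, so that the product rule reduces to plain multiplication; the function names are hypothetical.

```python
from itertools import product


def p_not(a):
    """Probability that a proposition is false."""
    return 1 - a


def p_and(a, b):
    """Product rule, assuming the two propositions are independent."""
    return a * b


def p_or(a, b):
    """Sum rule for two independent propositions."""
    return a + b - a * b


# With probabilities restricted to 0 and 1 the rules reproduce the truth tables.
for a, b in product([0, 1], repeat=2):
    assert p_not(a) == int(not a)
    assert p_and(a, b) == int(bool(a) and bool(b))
    assert p_or(a, b) == int(bool(a) or bool(b))

# The same rules also handle intermediate degrees of plausibility.
print(round(p_and(0.9, 0.8), 2), round(p_or(0.9, 0.8), 2))
```

Nothing changes in the rules themselves between the two cases; only the allowed values of the probabilities differ.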
Yes, good point. Classical physics, dealing with macroscopic objects, predicts definite (non-probabilistic) measurement outcomes for both the first and second measurements.
The point I was (poorly) aiming at is that while quantum theory is inherently probabilistic even it sometimes predicts specific results as certainties.
I guess the important point for me is that while theories may predict certainties they are always falsifiable; the theory itself may be wrong.
Thanks for the link to Gettier's paper.
It seems he considers that the statement 'S knows that P' can have only two possible values, true or false. This may have been a historical tradition within philosophy since Plato but it seems to rule out many usual usages of 'knowledge' such as 'I know a little about that'.
As noted by Edwin Jaynes Bayesians usually consider knowledge in terms of probability:
In our terminology, a probability is something that we assign, in order to represent a state of knowledge.
In his great text on Bayesian inference, Probability Theory: The Logic of Science, he demonstrates that Aristotelian logic is a limiting case of probability theory; the results of logic are the results of probability theory where the values of probabilities are restricted to only 0 and 1. I believe this probabilistic approach provides a richer context for knowledge in that there are degrees of certainty. My reworking of Plato's definition attempted to transition it to this context.
Pick your favorite case of a scientific theory that was once well supported by the evidence, but turned out to be false. Back when available evidence supported it, did scientists know it was true?
Perhaps those scientists from the past should have said it had a high probability of being true. I may be misunderstanding you but I do not believe science can produce certainty and this seems to be a common view. I quote Wikipedia.
A scientific theory is empirical, and is always open to falsification if new evidence is presented. That is, no theory is ever considered strictly certain as science accepts the concept of fallibilism.
It may be interesting that although all measurable results in quantum theory are in the form of probabilities there is at least one instance where this theory predicts a certain result. If the same measurement is immediately made a second time on a quantum system the second result will be the same as the first with probability 1. In other words the state of the quantum system revealed by the first measurement is confirmed by the second measurement. It may seem odd that the theory predicts the result of the first measurement as a probability distribution of possible results but predicts only a single possible result for the second measurement.
Wojciech Zurek considers this as a postulate of quantum theory (see his paper Quantum Darwinism). (Sorry for the typo in the quote.)
- Postulate (iii) Immediate repetition of a measurement yields the same outcome starts this task. This is the only uncontroversial measurement postulate (even if it is difficult to approximate in the laboratory): Such repeatability or predictability is behind the very idea of a state.
If we consider that information exchange took place between the quantum system and the measuring device in the first measurement then we might view the probability distribution implied by the wave function as having undergone a Bayesian update on the receipt of new information. We might understand that this new information moved the quantum model to predictive certainty regarding the result of the second measurement.
Of course this certainty is only certain within the terms of quantum theory which is itself falsifiable.
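The Bayesian reading of a repeated measurement can be illustrated with a toy model. The sketch below works only with outcome probabilities, ignoring amplitudes and phases entirely, so it is a classical caricature and not a simulation of quantum mechanics. The function name `measure` and the example probabilities are assumptions.

```python
import random


def measure(probabilities, rng):
    """Sample an outcome, then update the distribution on that outcome."""
    outcomes = list(range(len(probabilities)))
    outcome = rng.choices(outcomes, weights=probabilities)[0]
    # Likelihood of the observed result: 1 for the outcome seen, 0 otherwise.
    likelihood = [1.0 if i == outcome else 0.0 for i in outcomes]
    evidence = sum(p * l for p, l in zip(probabilities, likelihood))
    posterior = [p * l / evidence for p, l in zip(probabilities, likelihood)]
    return outcome, posterior


rng = random.Random(0)
prior = [0.3, 0.7]  # hypothetical probabilities implied by the wave function
first, updated = measure(prior, rng)
second, _ = measure(updated, rng)
print(first, updated, second)  # the second result always repeats the first
```

After the first update all of the probability sits on the observed outcome, so the model predicts the second result with probability 1.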
I have some skepticism about absolute certainty. Logic deals in certainties but it seems unclear if it absolutely describes anything in the real world. I am not sure if observed evidence plays a role in logic. The syllogism 'if all men are mortal and if Socrates is a man then Socrates is mortal' appears to be true. If we were to observe Socrates being immortal the syllogism would still be true, but one of the conditional premises, that all men are mortal or that Socrates is a man, would not be true.
In science at least where evidence plays a decisive role there is no certainty; scientific theories must be falsifiable, there is always some possibility that an experimental result will not agree with theory.
The examples I gave are true by virtue of logical relationships such as if all A are B and all B are C then all A are C. In this vein it might seem certain that if something is here it cannot be there, however this is not true for quantum systems; due to superposition a quantum entity can be said to be both here and there.
Another interesting approach to this problem was taken by David Deutsch. He considers that any mathematical proof is a form of calculation and all calculation is physical just as all information has a physical form. Thus mathematical proofs are no more certain than the physical laws invoked to calculate them. All mathematical proofs require our mathematical intuition, the intuition that one step of the proof follows logically from the other. Undoubtedly such intuition is the result of our long evolutionary history that has built knowledge of how the world works into our brains. Although these intuitions are formed from principles encoded in our genetics they are no more reliable than any other hypothesis supported by the data; they are not certain.
One example in classical logic is the syllogism where if the premises are true then the conclusion is by necessity true:
Socrates is a man
All men are mortal
therefore it is true that Socrates is mortal
Another example is mathematical proofs. Here is the Wikipedia presentation of Euclid's proof from 300 BC that there is an infinite number of prime numbers. Perhaps in your terms this proof provides 0% confidence that we will observe the largest prime number.
Take any finite list of prime numbers p1, p2, ..., pn. It will be shown that at least one additional prime number not in this list exists. Let P be the product of all the prime numbers in the list: P = p1p2...pn. Let q = P + 1. Then, q is either prime or not:
1) If q is prime then there is at least one more prime than is listed.
2) If q is not prime then some prime factor p divides q. If this factor p were on our list, then it would divide P (since P is the product of every number on the list); but as we know, p divides P + 1 = q. If p divides P and q then p would have to divide the difference of the two numbers, which is (P + 1) − P or just 1. But no prime number divides 1 so there would be a contradiction, and therefore p cannot be on the list. This means at least one more prime number exists beyond those in the list.
This proves that for every finite list of prime numbers, there is a prime number not on the list. Therefore there must be infinitely many prime numbers.
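The construction in the proof can be run on any particular list of primes. This is a minimal sketch using trial division; the function names are hypothetical and the input is assumed to be a list of genuine primes.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime factor of an integer n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found up to sqrt(n), so n itself is prime


def prime_not_in_list(primes):
    """Follow the proof: form q = P + 1 and take one of its prime factors."""
    q = prod(primes) + 1
    p = smallest_prime_factor(q)
    # p divides P + 1, so it cannot divide P and cannot be on the list.
    assert p not in primes
    return p


print(prime_not_in_list([2, 3, 5]))             # q = 31 is itself prime
print(prime_not_in_list([2, 3, 5, 7, 11, 13]))  # q = 30031 = 59 * 509
```

The second call shows case 2 of the proof: q is not prime, yet its prime factors are still missing from the list.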
Interesting discussion but I suspect an important distinction may be required between logic and probability theory. Logic is a special case of probability theory where values are restricted to only 0 and 1, that is to 0% and 100% probability. Within logic you may arrive at certain conclusions but generally within probability theory conclusions are not certain but rather assigned a degree of plausibility.
If logic provides, in some contexts, a valid method of reasoning then conclusions arrived at will be either 0% or 100% true. Denying that 100% confidence is ever rational seems to be equivalent to denying that logic ever applies to anything.
It is certainly true that many phenomena are better described by probability than by logic, but can we deny logic any validity? I understand mathematical proofs as being within the realm of logic where things may often be determined as being either true or false. For instance Euclid is credited with first proving that there is no largest prime. I believe most mathematicians accept this as a true statement and that most would agree that 53 is easily proven to be prime.
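The claim about 53 is the kind that can be settled by exhaustive checking. A minimal sketch by trial division follows; the function name `is_prime` is just an illustrative choice.

```python
def is_prime(n):
    """Decide primality by checking every candidate divisor up to sqrt(n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# For 53 only the divisors 2 through 7 need to be checked, since 8 * 8 > 53.
print(is_prime(53))
```

The result is either true or false with nothing in between, which is the sense in which such a statement sits within logic rather than probability.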
Philosophy seems to have made little progress defining knowledge since Plato's 'justified true belief'. I concur with this definition given three, hopefully minor caveats:
1) Beliefs and therefore knowledge are not understood as restricted to humans. This perhaps requires that 'beliefs' be replaced with 'expectations'. 'Expectation' or expected value is a property of any model in the form of a probability distribution. The expected value of the 'ignorance' of such a model is its information entropy. It is the amount of information required to move the model to certainty through Bayesian updating. Entropy is expected information, and information is defined as the negative log of a probability. (See Wikipedia page http://en.wikipedia.org/wiki/Self-information) The inverse of entropy is a probability: two raised to the negative power of the entropy in bits. Thus if the information entropy of a model is 3 bits, the inverse probability would be 2^-3, or one eighth. (It would be easier writing this if some mathematical symbols were available). As this probability is the inverse of a model's ignorance I suggest it be considered as a definition of knowledge. Thus knowledge would be defined as a property of models and would encompass a wider range of natural phenomena including the knowledge within an organism's genetic model.
2) 'Justified' be understood in the Bayesian sense as justified by the evidence. Justified in a Bayesian context is not absolute but refers to degrees of plausibility in the form of Bayes factors or 'odds'. An early use of Bayes factors was by Turing in his cracking of the Enigma code; he needed a measure of 'justification' for deciding if a given key combination cracked a given code variation.
3) 'True' be dropped from the definition. Knowledge, especially scientific knowledge, deals with degrees of plausibility given uncertain information. Logic involving true and false values is a special case of Bayesian probability (where values are restricted to only 0 and 1; see Jaynes, Probability Theory: The Logic of Science). The necessary constraint on the definition is therefore accomplished with 'justified' as described above.
After these alterations knowledge is defined as justified expectations.
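The two quantitative pieces of this definition, the probability derived from entropy in caveat 1 and the Bayes factor in caveat 2, can be sketched as follows. The function names and all of the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import log2


def entropy_bits(distribution):
    """Information entropy of a probability distribution, in bits."""
    return -sum(p * log2(p) for p in distribution if p > 0)


def knowledge(distribution):
    """Caveat 1: two raised to the negative power of the entropy in bits."""
    return 2 ** -entropy_bits(distribution)


def update_odds(prior_odds, bayes_factors):
    """Caveat 2: multiply the prior odds by each Bayes factor in turn."""
    odds = prior_odds
    for factor in bayes_factors:
        odds *= factor
    return odds


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)


# A model spread evenly over 8 outcomes has 3 bits of entropy.
uniform = [1 / 8] * 8
print(entropy_bits(uniform), knowledge(uniform))

# Made-up example: even prior odds and two pieces of evidence,
# each favouring the hypothesis by a factor of 3.
odds = update_odds(1.0, [3.0, 3.0])
print(odds, odds_to_probability(odds))
```

A model that is certain has zero entropy and therefore a knowledge value of 1 on this measure, while any justification expressed as odds stays short of certainty unless a Bayes factor is infinite.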