The Use of Many Independent Lines of Evidence: The Basel Problem

post by JonahS (JonahSinick) · 2013-06-03T04:42:02.380Z · LW · GW · Legacy · 44 comments

Contents

  Justifiably high confidence
  Limits on the confidence conferred by a single argument
  The use of many independent lines of evidence to develop high confidence
  Euler’s solution of the Basel Problem 
    The analogy with polynomial functions 
    Numerical evidence from the Basel sum and its analogs
    An independent derivation of the Wallis product formula and of the Leibniz series
  Conclusion

This post describes how one can use many independent arguments to justifiably develop very high confidence in the truth of a statement. It ends with a case study: Euler’s use of several independent lines of evidence to develop confidence in the validity of his unrigorous solution to the Basel Problem.

Justifiably high confidence

In Einstein’s Arrogance, Eliezer described how Einstein could have been justifiably confident in the correctness of the theory of general relativity, even if it were to fail an empirical test:

In 1919, Sir Arthur Eddington led expeditions to Brazil and to the island of Principe, aiming to observe solar eclipses and thereby test an experimental prediction of Einstein's novel theory of General Relativity. A journalist asked Einstein what he would do if Eddington's observations failed to match his theory. Einstein famously replied: "Then I would feel sorry for the good Lord. The theory is correct."

[…]

If Einstein had enough observational evidence to single out the correct equations of General Relativity in the first place, then he probably had enough evidence to be damn sure that General Relativity was true.

[…]

"Then I would feel sorry for the good Lord; the theory is correct," doesn't sound nearly as appalling when you look at it from that perspective. And remember that General Relativity was correct, from all the vast space of possibilities.

Limits on the confidence conferred by a single argument

In Confidence levels inside and outside an argument, Yvain described how one can't develop very high confidence in a statement based on a single good argument, even one that makes its prediction with very high confidence, because there's a non-negligible (even if small) chance that the apparently good argument is actually wrong.

How can this be reconciled with Einstein's justifiably high confidence in the truth of general relativity?

The use of many independent lines of evidence to develop high confidence

Something that I’ve always admired about Carl Shulman is that he often presents many different arguments in favor of a position, rather than a single argument. In Maximizing Cost-effectiveness via Critical Inquiry, Holden Karnofsky wrote about how one can increase one’s confidence in a true statement by examining the statement from many different angles. Even though one can’t gain very high confidence in a statement via a single argument, one can gain very high confidence in the truth of a statement if many independent lines of evidence support it.

Epistemology in the human world is murky. To illustrate the principle mentioned above, it's instructive to consider an example from mathematics, which is a much simpler domain.

Euler’s solution of the Basel Problem 

In 1735, Euler solved the famous problem of finding the value of the “Basel sum,” which is defined to be the sum of the reciprocals of the squares of the positive integers: 1 + 1/4 + 1/9 + 1/16 + …. He found that the value is equal to π²/6.

Euler's initial method of solution is striking in that it relies on an assumption which was unproven at the time, and whose underlying principle (that a function can be factored over its roots) is not true in full generality. Euler assumed the product formula for the sine function, which expresses the sine function as an infinite product of linear polynomials, each corresponding to a real root of the sine function. In hindsight, there was reason for Euler to be concerned that the product formula for sine might not be valid:

The general theorem of which the product formula for sine is a special case is the Weierstrass factorization theorem, which wasn’t proved until the mid-to-late 1800s.

And indeed, Euler was initially concerned that the product formula for sine might not be valid. So in view of the (at the time) dubious nature of the product formula for sine, how could Euler have known that he had correctly determined the value of the Basel sum?
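
For reference, here is the identity Euler assumed, and how it yields the Basel value (a standard reconstruction: expand the product and compare the x³ coefficient with Newton's series for sine):

    \sin x \;=\; x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots
    \;=\; x \prod_{n=1}^{\infty} \left( 1 - \frac{x^2}{n^2 \pi^2} \right)
    \;=\; x - \left( \sum_{n=1}^{\infty} \frac{1}{n^2 \pi^2} \right) x^3 + \cdots

    \text{Equating } x^3 \text{ coefficients:} \quad \sum_{n=1}^{\infty} \frac{1}{n^2} \;=\; \frac{\pi^2}{6}.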

William Dunham's book Euler: The Master of Us All describes some of Euler's reasons:

The analogy with polynomial functions

A polynomial function with all roots real can be written as a product of linear polynomials corresponding to the roots of the function. Isaac Newton had shown that the sine function can be written as an infinite polynomial (a power series). So one might guess that the sine function can be written as a product of linear polynomials, in analogy with polynomial functions.

This evidence is weak, because not all polynomials have all roots real, and because statements that are true of finite objects often break down when one passes to infinite objects.

Numerical evidence from the Basel sum and its analogs

If a statement implies a nontrivial true statement, that’s evidence that the original statement is true. (This is just Bayes’ Theorem, and the flip side of the fact that Absence of Evidence Is Evidence of Absence.)
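
In symbols: if H implies E, then P(E | H) = 1, so observing a nontrivial E (one with P(E) < 1) raises the probability of H:

    P(H \mid E) \;=\; \frac{P(E \mid H)\, P(H)}{P(E)} \;=\; \frac{P(H)}{P(E)} \;>\; P(H).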

In 1731, Euler had computed the value of the Basel sum numerically: 1.644934, accurate to six places past the decimal point. This decimal approximation agrees with that of π²/6. So the assumption that the product formula for sine is true implies something that’s both true and nontrivial.

The above numerical confirmation leaves open the possibility that the Basel sum and π²/6 differ by a very small amount. It can happen that two apparently unrelated and simple mathematical quantities differ by less than a trillionth, so this is a legitimate concern.

To assuage this concern, one can consider the fact that Euler’s method yields the values of the sums of reciprocals of kth powers of positive integers for every positive even integer k, and check these formulas for numerical accuracy, finding that they hold with high precision. It’s less likely that all of them are just barely wrong than it is that a single one is just barely wrong. So this is a nontrivial amount of further evidence that the product formula for sine is true.
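
As a quick modern illustration (a sketch in Python; Euler of course worked by hand, using series acceleration rather than brute-force summation), one can compare partial sums against the values π²/6, π⁴/90, and π⁶/945 that Euler's method predicts for k = 2, 4, 6:

    import math

    def partial_zeta(k, N=10**6):
        """Naive partial sum 1/1^k + 1/2^k + ... + 1/N^k."""
        return sum(1.0 / n**k for n in range(1, N + 1))

    # Values predicted by Euler's method for k = 2, 4, 6
    predictions = {2: math.pi**2 / 6, 4: math.pi**4 / 90, 6: math.pi**6 / 945}

    for k, predicted in predictions.items():
        print(k, partial_zeta(k), predicted)

    # The k = 2 partial sum falls short of pi^2/6 by about 1/N (the tail of
    # the series); k = 4 and k = 6 agree to within floating-point error.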

An independent derivation of the Wallis product formula and of the Leibniz series

As stated above, if a statement implies a nontrivial true statement, this is evidence that the original statement is true. Euler used the product formula for sine to deduce the Wallis product formula for pi, which had been known since 1655. By assuming the existence of a formula analogous to the product formula for sine, this time for '1 minus sine,' Euler deduced the Leibniz formula for pi.
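
For the Wallis product, the deduction is a one-line specialization (again a standard reconstruction): setting x = π/2 in the product formula gives

    1 \;=\; \sin\frac{\pi}{2} \;=\; \frac{\pi}{2} \prod_{n=1}^{\infty} \left( 1 - \frac{1}{4n^2} \right)
    \quad\Longrightarrow\quad
    \frac{\pi}{2} \;=\; \prod_{n=1}^{\infty} \frac{2n \cdot 2n}{(2n-1)(2n+1)} \;=\; \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdots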

Upon doing so, Euler wrote "For our method, which may appear to some as not reliable enough, a great confirmation comes here to light. Therefore, we should not doubt at all of the other things which are derived by the same method."

Remark: After writing this post, I learned that George Polya gave an overlapping discussion of Euler's work on the Basel Problem on pages 17-21 of Mathematics and Plausible Reasoning, Volume 1: Induction and Analogy in Mathematics. I found Euler's deduction of the Leibniz formula for pi in Polya's book. Polya's book contains more case studies of the same type.

Conclusion

Euler had good reason to believe that his derivation of the value of the Basel sum was valid, even though a rigorous proof of its validity was years or decades away.

How much confidence would have been rational given the evidence available at the time depends on the degree to which the different lines of evidence were independent. The lines of evidence appear to be independent, but could have subtle interdependencies.

Nevertheless, I've never heard of an example of a mathematical statement so robustly supported that turned out to be false. Given that there's been a huge amount of mathematical research since then, it may be reasonable to conclude that the appropriate confidence level would have been 99.9999+%. [Edit: I lightly edited this paragraph — see this comment thread.]

It may be appropriate to hedge, with Confidence levels inside and outside an argument in mind, because the arguments that I make in this post may themselves be wrong :-).

The example of Euler's work on the Basel Problem highlights the use of many independent lines of evidence to develop very high confidence in a statement: something which occurs, and which can occur, in many domains.

Note: I formerly worked as a research analyst at GiveWell. All views here are my own.

44 comments

Comments sorted by top scores.

comment by gjm · 2013-06-03T09:10:28.423Z · LW(p) · GW(p)

Could you elaborate on the role of the 25k mathematics papers on the arXiv in leading you to that 99.9999% figure? I'm having trouble following the logic in that paragraph.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-03T15:54:14.708Z · LW(p) · GW(p)

I was hinting at the sheer volume of mathematical research — given that there's been so much mathematical research, and the fact that I haven't heard of any examples of statements as robustly supported as the product formula for sine that turned out to be false, there should be a very strong prior against such a statement being false.

One could attribute my not having heard of such an example to my ignorance :-) but such a story would have a strong tendency to percolate on account of being so weird, so one wouldn't need to know a great deal to have heard it.

Replies from: gjm
comment by gjm · 2013-06-03T18:36:13.550Z · LW(p) · GW(p)

I understand how that works qualitatively, but I don't understand where the number 99.9999% comes from, nor how the number 25000 feeds into it. And surely only a tiny fraction of mathematics papers on the arXiv deal with conjectures of this sort, so why cite the number of papers there in particular?

I'll be (rather bogusly) quantitative for a moment. Pretend that every single one of those 25k papers on the arXiv makes an argument similar to Euler's, and that if any one of them were wrong then you'd certainly have heard about it. How improbable would an error have to be, to make it unsurprising (say, p=1/2) that you haven't heard of one? Answer: the probability of one paper being correct would have to be at least about 99.997%.
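
Spelling out that arithmetic: if each paper is independently correct with probability q, then

    q^{25000} = \tfrac{1}{2} \quad\Longrightarrow\quad q = 2^{-1/25000} \approx 1 - \frac{\ln 2}{25000} \approx 0.99997.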

Now, to be sure, there's more out there than the arXiv. But, equally, hardly any papers deal with arguments like Euler's, and many papers go unscrutinized and could be wrong without anyone noticing, and surely many are obscure enough that even if they were noticed the news might not spread.

Maybe I'm being too pernickety. But it seems to me that one oughtn't to say "In the context of the fact that ~25,000 math papers were posted on ArXiv in 2012 it may be reasonable to conclude that the appropriate confidence level would have been 99.9999+%" when one means "There are lots of mathematics papers published and I haven't heard of another case of something like this being wrong, so probably most such cases are right".

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-03T18:45:20.825Z · LW(p) · GW(p)

I agree that my argument isn't tight. I'm partially going on tacit knowledge that I acquired during graduate school. The figure that I gave is a best guess.

Replies from: orthonormal
comment by orthonormal · 2013-06-03T19:17:16.259Z · LW(p) · GW(p)

I think it's inappropriate to cite a figure as support for your estimate unless you indicate in some way how that figure affects your estimate.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-03T20:22:00.027Z · LW(p) · GW(p)

I changed my post accordingly. I'm somewhat puzzled as to why it rubbed people the wrong way (to such a degree that my above comment was downvoted three times).

comment by Joanna Morningstar (Jonathan_Lee) · 2013-06-03T19:32:19.300Z · LW(p) · GW(p)

Observationally, the vast majority of mathematical papers do not make claims that are non-rigorous but as well supported as the Basel problem. They split into rigorous proofs (potentially conditional on known additional hypotheses eg. Riemann), or they offer purely heuristic arguments with substantially less support.

It should also be noted that Euler was working at a time when it was widely known that the behaviour of infinite sums, products and infinitesimal analysis (following Newton or Leibniz) was without any firm foundation. So analysis of these objects at that time was generally flanked with "sanity check" demonstrations that the precise objects being analysed did not trivially cause bad behaviour. Essentially everyone treated these kinds of demonstrations as highly suspect until the 1830s and a firm foundation for analysis (cf. Weierstrass and Riemann). Today we grandfather these demonstrations in as proofs because we can show proper behaviour of these objects.

On the other hand, there were a great many statements made at that time which later turned out to be false, or to require additional technical assumptions once we understood analysis, as distinct from a calculus of infinitesimals. The most salient to me would be Cauchy's 1821 "proof" that the pointwise limit of continuous functions is continuous; counterexamples were not constructed until 1826 (by which time functions were better understood) and it took until 1853 for the actual condition (uniform convergence) to be developed properly. This statement was at least as well supported in 1821 as Euler's was in 1735.

As to confidence in modern results: Looking at the Web of Science data collated here for retractions in mathematical fields suggests that around 0.15% of current papers are retracted.

Replies from: JonahSinick, Douglas_Knight
comment by JonahS (JonahSinick) · 2013-06-03T20:13:50.154Z · LW(p) · GW(p)

The most salient to me would be Cauchy's 1821 "proof" that the pointwise limit of continuous functions is continuous; counterexamples were not constructed until 1826 (by which time functions were better understood) and it took until 1853 for the actual condition (uniform convergence) to be developed properly. This statement was at least as well supported in 1821 as Euler's was in 1735.

What was the evidence?

Replies from: Jonathan_Lee
comment by Joanna Morningstar (Jonathan_Lee) · 2013-06-03T21:25:07.371Z · LW(p) · GW(p)

That it worked in every instance of continuous functions that had been considered up to that point, seemed natural, and extended many existing demonstrations that a specific sequence of continuous functions had a continuous limit.

A need for lemmas of the latter form is endemic; for a concrete class of examples, any argument via a Taylor series on an interval implicitly requires such a lemma, to transfer continuity, integrals and derivatives over. In just this class, numerical evidence came from the success of perturbative solutions to Newtonian mechanics, and theoretical evidence from the existence of well-behaved Taylor series for most functions.

Replies from: JonahSinick, JonahSinick
comment by JonahS (JonahSinick) · 2013-06-03T21:40:07.512Z · LW(p) · GW(p)

I guess we'll have to agree to disagree here :-). I find Euler's evidence for the product formula for sine to be far more convincing than what was available to Cauchy at the time.

Edit: I say more here, where I highlight how different the two situations are.

Replies from: Eugine_Nier
comment by Eugine_Nier · 2013-06-04T05:38:41.643Z · LW(p) · GW(p)

Are you sure you aren't suffering from hindsight bias?

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-04T05:52:19.206Z · LW(p) · GW(p)

Not 100% sure, but pretty sure. The situation isn't so much that I think that the evidence for the limit of a continuous function being continuous is weak, as much as that the evidence for the product formula for sine is very strong.

The result (and its analogs) implies two formulas for pi that had been proved by other means, and predicts infinitely many previously unknown numerical identities, which can be checked to be true to many decimal places. What more could you ask for? :-)

Replies from: Eugine_Nier
comment by Eugine_Nier · 2013-06-06T05:23:33.239Z · LW(p) · GW(p)

and predicts infinitely many previously unknown numerical identities, which can be checked to be true to many decimal places.

And did Euler check them?

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-06T06:03:12.507Z · LW(p) · GW(p)

Polya reports on Euler performing such checks. I don't know how many he did – one would have to look at the original papers (which are in Latin), and even they probably omit some of the plausibility checks that Euler did.

comment by JonahS (JonahSinick) · 2013-06-08T06:11:02.060Z · LW(p) · GW(p)

That it worked in every instance of continuous functions that had been considered up to that point,

In ~1659, Fermat considered the sequence of functions f(n,x) = x^n for n = 0, 1, 2, 3, .... Each of these is a continuous function of x. If you restrict these functions to the interval between 0 and 1, and take the limit as n goes to infinity, you get a discontinuous function.

So there's a very simple counterexample to Cauchy's ostensible theorem from 1821, coming from a sequence of functions that had been studied over 150 years before. If Cauchy had actually looked at those examples of sequences of function that had been considered, he would have recognized his ostensible theorem to be false. By way of contrast, Euler did extensive empirical investigation to check the plausibility of his result. The two situations are very, very different.
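
Explicitly, the pointwise limit on the interval [0, 1] is

    \lim_{n \to \infty} x^n \;=\;
    \begin{cases}
    0, & 0 \le x < 1 \\
    1, & x = 1,
    \end{cases}

which is discontinuous at x = 1.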

Replies from: Jonathan_Lee
comment by Joanna Morningstar (Jonathan_Lee) · 2013-06-08T19:26:16.577Z · LW(p) · GW(p)

Fermat considered the sequence of functions f(n,x) = x^n for n = 0, 1, 2, 3, ....

Only very kind of. Fermat didn't have a notion of function in the sense meant later, and showed geometrically that the area under certain curves could be computed by something akin to Archimedes' method of exhaustion, if you dropped the geometric rigour and worked algebraically. He wasn't looking at a limit of functions in any sense; he showed that the integral could be computed in general.

The counterexample is only "very simple" in the context of knowing that the correct condition is uniform convergence, and knowing that the classical counterexamples look like x^n, n->\infty or bump functions. Counterexamples are not generally obvious upfront; put another way, it's really easy to engage in Whig history in mathematics.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-08T19:58:23.559Z · LW(p) · GW(p)

Independently of whether Fermat thought of it as an example, Cauchy could have considered lots of sequences of functions in order to test his beliefs, and I find it likely that had he spent time doing so, he would have struck on this one.

On a meta-level, my impression is that you haven't updated your beliefs based on anything that I've said on any topic, in the course of our exchanges, whether online or in person. It seems very unlikely that no updates are warranted. I may be misreading you, but to the extent that you're not updating, I suggest that you consider whether you're being argumentative when you could be inquisitive and learn more as a result.

Replies from: Jonathan_Lee
comment by Joanna Morningstar (Jonathan_Lee) · 2013-06-08T20:59:08.296Z · LW(p) · GW(p)

Thank you for calling out a potential failure mode. I observe that my style of inquisition can come across as argumentative, in that I do not consistently note when I have shifted my view (instead querying other points of confusion). This is unfortunate.

To make my object level opinion changes more explicit:

  • I have had a weak shift in opinion towards the value of attempting to quantify and utilise weak arguments in internal epistemology, after our in person conversation and the clarification of what you meant.

  • I have had a much lesser shift in opinion of the value of weak arguments in rhetoric, or other discourse where I cannot assume that my interlocutor is entirely rational and truth-seeking.

  • I have not had a substantial shift in opinion about the history of mathematics (see below).

As regards the history of mathematics, I do not know our relative expertise, but my background prior for most mathematicians (including JDL_{2008}) has a measure >0.99 cluster that finds true results obvious in hindsight and counterexamples to false results obviously natural. My background prior also suggests that those who have spent time thinking about mathematics as it was done at the time fairly reliably do not have this view. It further suggests that on this metric, I have done more thinking than the median mathematician (against a background of Cantab. mathmos, I would estimate I'm somewhere above the 5th centile of the distribution). The upshot of this is that your recent comments have not substantively changed my views about the relative merit of Cauchy and Euler's arguments at the time they were presented; my models of historians of mathematics who have studied this do not reliably make statements that look like your claims wrt. the Basel problem.

I do not know what your priors look like on this point, but it seems highly likely that our difference in views on the mathematics factor through to our priors, and convergence will likely be hindered by being merely human and having low baud channels.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-08T21:36:08.151Z · LW(p) · GW(p)

I have had a weak shift in opinion towards the value of attempting to quantify and utilise weak arguments in internal epistemology, after our in person conversation and the clarification of what you meant.

Ok. See also my discussion post giving clarifications.

I have had a much lesser shift in opinion of the value of weak arguments in rhetoric, or other discourse where I cannot assume that my interlocutor is entirely rational and truth-seeking.

I think that the most productive careful analysis of the validity of a claim occurs in writing, with people who one believes to be arguing in good faith.

In person, you highlighted the problem of the first person to give arguments having an argumentative advantage due to priming effects. I think this is much less of a problem in writing, where one has time to think and formulate responses.

I have not had a substantial shift in opinion about the history of mathematics (see below).

My view on this point is very much contingent on what Euler actually did as opposed to a general argument of the type "heuristics can be used to reach true conclusions, and so we can have high confidence in something that's supported by heuristics."

Beyond using a rough heuristic to generate the identity, Euler numerically checked whether the coefficients agreed (testing highly nontrivial identities that had previously been unknown) and found them to agree with high precision, and verified that specializing the identity recovered known results.

If you don't find his evidence convincing, then as you say, we have to agree to disagree because we can't fully externalize our intuitions.

comment by Douglas_Knight · 2013-06-04T17:21:30.003Z · LW(p) · GW(p)

It should also be noted that Euler was working at a time when it was widely known that the behaviour of infinite sums, products and infinitesimal analysis (following Newton or Leibnitz) was without any firm foundation.

Could you give a source for this claim? "Foundation" sounds to me anachronistic for 1735.

Replies from: Jonathan_Lee
comment by Joanna Morningstar (Jonathan_Lee) · 2013-06-04T19:14:58.214Z · LW(p) · GW(p)

It's possible that "were known in general to lead to paradoxes" would be a more historically accurate phrasing than "without firm foundation".

For easy-to-cite examples, there's "The Analyst" (1734, Berkeley). The basic issue was that infinitesimals needed to be 0 at some points in a calculation and non-0 at others. For a general overview, this seems reasonable. Grandi noticed in 1703 that infinite series did not need to give determinate answers; this was widely known by the 1730s. Reading the texts, it's fairly clear that the mathematicians working in the field were aware of the issues; they would dress up the initial propositions of their calculi in lots of metaphysics, and then hurry to examples to prove their methods.

comment by BlueSun · 2013-06-03T17:50:09.824Z · LW(p) · GW(p)

Great article! I have a particular fondness for this line of reasoning, as it helped me leave my religious roots behind. I ended up reasoning that despite assurances that revelation was 100% accurate and should be relied on over any and all scientific evidence (which was dismissed as just "theories"), there was an x% chance that the revelation model was wrong. And for any x% larger than something like 0.001%, the multiple independent pieces of scientific, historic, and archaeological evidence would crush it. I then found examples of where revelation was wrong, and it became clear that x% was close to what you'd expect from an "educated guess." And yes, I did actually work out all the probabilities with Bayes' theorem.

comment by Pablo (Pablo_Stafforini) · 2013-06-04T04:13:16.226Z · LW(p) · GW(p)

It may be appropriate to hedge, with Confidence levels inside and outside an argument in mind, because the arguments that I make in this post may themselves be wrong :-).

What would our epistemic state be if we found multiple arguments for the view that multiple arguments can warrant very high confidence?

Replies from: JonahSinick
comment by JoshuaZ · 2013-06-04T03:57:45.691Z · LW(p) · GW(p)

The Euler example raises an issue: when should one be more confident about some heuristically believed claim than about claims proven in the mathematical literature? For example, the proof of the classification of finite simple groups consists of hundreds of distinct papers by about as many authors. How confident should one be that that proof is actually correct and doesn't contain serious holes? How confident should one be that we haven't missed any finite simple groups? I'm substantially more confident that no group has been missed (>99%?) but much less so in the validity of the proof. Is this the correct approach?

Then there are statements which simply look extremely likely. Let's take for example "White has a winning strategy in chess if black has to play down a queen". How confident should one be for this sort of statement? If someone said they had a proof that this was false, what would it take to convince one that the proof was valid? It would seem to take a lot more than most mathematical facts, but how much so, and can we articulate why?

Note incidentally that there are a variety of conjectures that are currently believed for reasons close to Euler's reasoning. For example, P = BPP is believed because we have a number of different statements that all imply it. Similarly, the Riemann hypothesis is widely believed due to a combination of partial results (a positive fraction of zeros must be on the line, almost all zeros must be near the line, the first few billion zeros are on the line, a random model of the Mobius function implies RH, etc.), but how confident should we be in such conjectures?

Replies from: Douglas_Knight, elharo, JonahSinick
comment by Douglas_Knight · 2013-06-04T18:00:38.478Z · LW(p) · GW(p)

The question of whether there is a missing finite simple group is a precise question. But what does it mean for a natural language proof to be valid? Typically a proof contains many precise lemmas, and one could ask that these statements be correct (though this leaves the question of whether they prove the theorem); but lots of math papers contain lemmas that are false as stated, yet where the paper would be considered salvageable if anyone noticed.

Similarly, the Riemann hypothesis is widely believed due to a combination of partial results (a positive fraction of zeros must be on the line, almost all zeros must be near the line, the first few billion zeros are on the line, a random model of the Mobius function implies RH, etc.)

This is a very standard list of evidence, but I am skeptical that it reflects how mathematicians judge the evidence. I think that of the items you mention, the random model is by far the most important. The study of small zeros is also relevant. But I don't think that the theorems about infinitely many zeros have much effect on the judgement.

Replies from: JonahSinick
comment by JonahS (JonahSinick) · 2013-06-04T18:07:04.472Z · LW(p) · GW(p)

This is a very standard list of evidence, but I am skeptical that it reflects how mathematicians judge the evidence. I think that of the items you mention, the random model is by far the most important. The study of small zeros is also relevant. But I don't think that the theorems about infinitely many zeros have much effect on the judgement.

I agree with this, but I would also cite the empirical truth of the RH for other global zeta functions, as well as the proof of the Weil conjectures, as evidence that mathematicians actually think about.

I'd be interested in corresponding a bit — shoot me an email if you'd like.

Replies from: Eugine_Nier
comment by Eugine_Nier · 2013-06-06T05:35:31.235Z · LW(p) · GW(p)

I agree with this, but would I cite the empirical truth of the RH for other global zeta functions

An interesting thing about the GRH is that an oft-neglected piece of evidence against it is how the Siegel zero seems to behave like a real object, i.e., having consistent properties.

Replies from: JoshuaZ
comment by JoshuaZ · 2013-06-08T18:05:11.635Z · LW(p) · GW(p)

An interesting thing about the GRH is that an oft-neglected piece of evidence against it is how the Siegel zero seems to behave like a real object, i.e., having consistent properties.

I'm not sure what you mean here. If it didn't have consistent properties we could show it doesn't exist. Everything looks consistent up until the point you show it isn't real. Do you mean that it has properties that don't look that implausible? That seems like a different argument.

Replies from: Eugine_Nier
comment by Eugine_Nier · 2013-06-09T05:51:09.347Z · LW(p) · GW(p)

If we can easily prove a conjecture except for some seemingly arbitrary case, that's evidence for the conjecture being false in that case.

comment by elharo · 2013-06-04T11:13:31.079Z · LW(p) · GW(p)

The classification of finite simple groups is a very telling example because it was incorrectly believed to be finished back in the 1980s, but wasn't actually finished till 2004. There are many other examples of widely accepted "results" that didn't measure up, some of which were not merely unproven but actively incorrect. For instance, John von Neumann's "proof" that there was no hidden variables theory of quantum mechanics was widely cited at least into the 1980s, 30 years after David Bohm had in fact constructed exactly such a theory.

I suspect we should not be at all confident that the Riemann hypothesis is true given current evidence. There are some reasons to believe it might be false, and the reasons you cite aren't strong evidence that it is true. Given that this is math, not reality, there is an infinite space of values to search, infinitely larger than any range we have attempted. There are also many examples of hypotheses that were widely believed to be true in math until a counterexample was found.

P != NP is another example of a majority accepted hypothesis with somewhat stronger evidence in the physical world than the Riemann hypothesis has. Yet there are respected professional mathematicians (as well as unrespected amateurs like myself) who bet the other way. I would ask for odds though. :-)

Replies from: Will_Sawin
comment by Will_Sawin · 2013-06-13T06:51:30.084Z · LW(p) · GW(p)

BTW what was John von Neumann's "proof"?

Replies from: elharo
comment by elharo · 2013-06-13T10:12:55.609Z · LW(p) · GW(p)

I won't pretend to be able to reproduce it here. You can find the original in English translation in von Neumann's Mathematical Foundations of Quantum Mechanics. According to Wikipedia,

Von Neumann's abstract treatment permitted him also to confront the foundational issue of determinism vs. non-determinism and in the book he presented a proof according to which quantum mechanics could not possibly be derived by statistical approximation from a deterministic theory of the type used in classical mechanics. In 1966, a paper by John Bell was published, claiming that this proof contained a conceptual error and was therefore invalid (see the article on John Stewart Bell for more information). However, in 2010, Jeffrey Bub published an argument that Bell misconstrued von Neumann's proof, and that it is actually not flawed, after all.[25]

So apparently we're still trying to figure out if this proof is acceptable or not. Note, however, that Bub's claim is that the proof didn't actually say what everyone thought it said, not that Bohm was wrong. Thus we have another possible failure mode: a correct proof that doesn't say what people think it says.

This is not actually as uncommon as it should be, and goes way beyond math. There are many examples of well-known "facts" for which numerous authoritative citations can be produced, but that are in reality false. For example, the lighthouse and aircraft carrier story is in fact false, despite "appearing in a 1987 issue of Proceedings, a publication of the U.S. Naval Institute."

Of course, as I type this I notice that I haven't personally verified that the 1987 issue of Proceedings says what Stephen Covey's The Seven Habits of Highly Effective People, the secondary source that cited it, says it says. This is how bad sources work their way into the literature. Too often authors copy citations from each other without going back to the original. How many of us know about experiments like Robbers Cave or Stanford Prison only from HpMOR? What's the chance we've explained it to others, but gotten crucial details wrong?

Replies from: Will_Sawin
comment by Will_Sawin · 2013-06-13T18:47:56.368Z · LW(p) · GW(p)

I've just seen the claim that von Neumann had a fake proof in a couple of places, and it always bothers me, since it seems to me like one can construct a hidden variable theory that explains any set of statistical predictions. Just have the hidden variables be the response to every possible measurement! Or various equivalent schemes. One needs a special condition on the type of hidden variable theory, like the locality condition in Bell's theorem.

comment by JonahS (JonahSinick) · 2013-06-04T04:43:35.684Z · LW(p) · GW(p)

Very nice comment.

comment by [deleted] · 2013-06-03T15:10:14.401Z · LW(p) · GW(p)

Nevertheless, I've never heard of an example of mathematical statement so robustly supported that turned out to be false.

Polya's conjecture in 1919 was disproved in 1958, with an explicit counterexample found in 1960 and the smallest counterexample found in 1980: N=906,150,257. (Nowadays, a desktop computer can find this in a matter of seconds.)

Here's another one: the Mertens conjecture in 1885 was disproved in 1985 - the first counterexample is known to be between 10^14 and e^(1.59*10^40), but we don't know what it is.
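
For illustration, a short sieve (a sketch in Python with numpy) shows why Polya's conjecture looked so solid: the conjectured inequality L(n) ≤ 0, where L(n) is the partial sum of the Liouville function λ, holds everywhere below 10^6, far beyond any range one would check by hand; the first failure does not occur until n = 906,150,257.

    import numpy as np

    N = 10**6

    # Omega(n) = number of prime factors of n counted with multiplicity,
    # computed by sieving over primes and prime powers.
    omega = np.zeros(N + 1, dtype=np.int8)
    for p in range(2, N + 1):
        if omega[p] == 0:           # no smaller prime divides p, so p is prime
            pk = p
            while pk <= N:
                omega[pk::pk] += 1  # each multiple of p^k gains one factor of p
                pk *= p

    lam = np.where(omega % 2 == 0, 1, -1)  # Liouville lambda(n) = (-1)^Omega(n)
    lam[0] = 0
    L = np.cumsum(lam, dtype=np.int64)     # L(n) = lambda(1) + ... + lambda(n)

    print(bool((L[2:] <= 0).all()))        # True: no counterexample below 10^6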

Replies from: gjm
comment by gjm · 2013-06-03T15:39:29.657Z · LW(p) · GW(p)

Did those have the same sort of theoretical backing as Euler's argument for the Basel problem?

The point of Jonah's article is that Euler didn't just compute a bunch of digits, notice that they matched pi^2/6, and say "done"; he had a theory for why it worked, involving the idea of writing sin(z) as a product of simple factors, and he had substantial (though not conclusive) evidence for that: numerical evidence for other Basel-like sums and a new derivation of Wallis's product. And he had a bit of confirmation for the general method: he used it to rederive another famous formula for pi. (Though in fact the general method doesn't work as generally as Euler might possibly have thought it did, as Jonah mentions.)

So far as I know, the only evidence for the Polya and Mertens conjectures was that they seemed to work numerically for small n, and that they seemed like the kind of thing that might be true. Which is plenty good enough reason for conjecturing them, but not at all the same sort of support Jonah's saying Euler had for his pi^2/6 formula.

Replies from: JonahSinick, None
comment by JonahS (JonahSinick) · 2013-06-03T16:04:46.757Z · LW(p) · GW(p)

Yes, this is right.

The case of Mertens' conjecture is interesting in that surface level heuristic considerations suggest that it's not true. In particular, it violates the heuristic that I described here.

Roughly speaking, the function "the difference between the number of natural numbers up to k with an odd number of prime factors and the number of natural numbers up to k with an even number of prime factors" is supposed to be normally distributed with standard deviation 'square root of k,' and Mertens' conjecture predicts a truncated normal distribution.
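
In the standard notation (which counts only squarefree numbers, via the Möbius function μ), the conjecture is:

    M(k) \;=\; \sum_{n \le k} \mu(n), \qquad \text{Mertens' conjecture:}\quad |M(k)| < \sqrt{k} \ \text{ for all } k > 1.

If the nonzero values of μ behaved like independent ±1 coin flips, the law of the iterated logarithm would make M(k)/√k exceed any fixed bound infinitely often, which is why a hard cutoff at √k looks suspicious.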

comment by [deleted] · 2013-06-03T17:03:11.264Z · LW(p) · GW(p)

I agree that Euler's case was stronger - I was just trying to think of such examples.

Replies from: Kindly
comment by Kindly · 2013-06-03T17:36:43.704Z · LW(p) · GW(p)

I think a reasonable example is Lamé's proof of Fermat's last theorem. Experimental evidence confirmed that the theorem holds for small numerical examples, and Lamé's proof shows that if certain rings are unique factorization domains (which is a similar assumption to Euler's assumption that the product formula for sine holds) then the theorem always holds. Unfortunately, the unique factorization assumption sometimes fails.

Granted, the example isn't perfect because Fermat's last theorem did turn out to be true, just not for the same reasons.
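
The rings in question are rings of cyclotomic integers: for an odd prime p, Lamé's argument factors the Fermat equation as

    x^p + y^p \;=\; \prod_{k=0}^{p-1} \left( x + \zeta_p^k \, y \right), \qquad \zeta_p = e^{2\pi i / p},

and then reasons as if factorization into irreducibles in Z[ζ_p] were unique. It is not in general; the first failure occurs at p = 23.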

comment by Shmi (shminux) · 2013-06-03T19:26:50.883Z · LW(p) · GW(p)

I like the Euler example. The guy was one of the more pragmatic among mathematicians. Would be nice to have more examples and counter-examples from other fields.

Also, I don't understand your arxiv example. Most preprints are not related to each other at all, why lump them together?

comment by Wei Dai (Wei_Dai) · 2013-06-04T02:59:43.417Z · LW(p) · GW(p)

I think this is especially relevant in philosophy and philosophical FAI problems, where, since there aren't any equivalents of mathematical proofs, multiple independent lines of argument seem to be the only way to develop high confidence in some proposed solution. Unfortunately, in practice there probably won't be enough time to do this, and we'll be lucky if for each problem we can even come up with one solution that doesn't look obviously wrong.

comment by lukeprog · 2013-06-03T19:14:07.507Z · LW(p) · GW(p)

Nice article! Front-page worthy, imo. We'll see if the upvotes agree.

comment by Adele_L · 2013-06-03T16:18:53.992Z · LW(p) · GW(p)

I believe this sort of reasoning may be justified through a theory for probabilistic reasoning under restricted computational resources. Unfortunately I don't think this theory has been developed yet, and it will probably be very difficult to do so. But I think there is some progress in this direction, for example there are lots of conjectures in number theory which are justified by probabilistic reasoning, see this excellent post by Terry Tao for some good examples.

I think that a suitable adjustment of Benford's law is the right prior for a generic subset of the natural numbers, which leads one to conjecture the prime number theorem before even knowing the definition of a prime number (and is also why Benford's law is so successful)! Once you do know the definition, and are willing to do a small amount of computation, you can do a sort of update with this new information to get better heuristics, just like in Terry Tao's post linked above. The restriction on the amount of computation is necessary to make this idea useful; if you had unlimited computational resources, you would know everything about the primes already.

It's exciting to see lots of work in number theory on finding good priors for these sorts of things; another good example is the Cohen-Lenstra heuristics, which are some conjectures ultimately based on the prior probability of a generic group of some kind having a certain property (sorry, I don't remember very well). It seems like work dealing with limited computational resources mostly has yet to be done, but there might be lots of good ideas in computational complexity theory (not my domain of expertise). I think that having this kind of theory will also be important to FAI research, since it will need to reason correctly about math with limited resources.

Anyway, I think you are right that Euler had enough evidence to be confident of his solution to the Basel problem. I hope that a solid theory for this sort of reasoning is found soon, and I'm glad to see that other people are thinking about it.