The value of learning mathematical proof
post by JonahS (JonahSinick) · 2015-06-02T03:15:03.027Z · LW · GW · Legacy · 42 comments
The social justice movement espouses the notion that people who are privileged are often unfairly judgmental of those who were less privileged. Until recently, what they said didn't resonate with me. I knew that I had major advantages by virtue of having been born a white, middle class male. But I recently realized that I had benefited enormously from other privileges that I hadn't acknowledged. In particular, I had the unusual experience of growing up with a very intellectually curious father, which gave me a huge head start in intellectual development.
I used to get annoyed when LWers misread my posts in ways that they wouldn't have if they had been reading more carefully. I conceptualized such commenters as being undisciplined, and being unwilling to do the work necessary to maintain high epistemic standards. I now see that my reading was in many cases uncharitable, analogous to many of my teachers having misread my learning disability as reflecting laziness. Many of my readers have probably never had the opportunity to learn how to read really carefully.
How did I myself learn? I don't remember in detail, but the one factor that seems most significant is my study of the mathematical subject of real analysis. A number of the strongest thinkers I know have characterized the experience as a turning point in their development as well. It's the subject where one goes through rigorous proofs of the theorems of calculus.
Consider the extreme value theorem:
If a real-valued function f is continuous in the closed and bounded interval [a,b], then f must attain a maximum and a minimum, each at least once.
The theorem may seem obvious, but almost no undergraduate math majors would be able to come up with a logically impeccable proof from scratch. This ties in with why I almost never try to present rigorous arguments. If it's not clear to you that it might be very difficult to construct a rigorous proof of the extreme value theorem, you'd probably benefit intellectually from more exposure to mathematical proof. The experience of seeing how difficult it can be to offer rigorous proofs of even relatively simple statements trains one to read very carefully, and not make any unwarranted assumptions.
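To make the point concrete, here is a sketch of the statement written out formally, followed by three small examples showing that each hypothesis is doing real work. The examples are illustrative additions, chosen because they are easy to check by hand.

```latex
\textbf{Theorem (extreme value).} Let $a \le b$ and let
$f \colon [a,b] \to \mathbb{R}$ be continuous. Then there exist
$c, d \in [a,b]$ such that
\[
  f(c) \le f(x) \le f(d) \qquad \text{for all } x \in [a,b].
\]

\textbf{Dropping a hypothesis breaks the conclusion.}
\begin{itemize}
  \item \emph{Interval not closed:} $f(x) = x$ on $(0,1)$ has
        $\sup f = 1$ and $\inf f = 0$, and neither value is attained.
  \item \emph{Interval not bounded:} $f(x) = \arctan x$ on $\mathbb{R}$ has
        $\sup f = \pi/2$, which is never attained.
  \item \emph{Function not continuous:} on $[0,1]$ define $f(x) = x$ for
        $0 \le x < 1$ and $f(1) = 0$. Then $\sup f = 1$, but $f(x) < 1$
        for every $x \in [0,1]$, so there is no maximum.
\end{itemize}
```

A rigorous proof has to use closedness, boundedness, and continuity in an essential way, and it also has to use the completeness of the real numbers, since the statement is false over the rationals.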
If you've studied calculus, haven't yet had the experience of proving theorems from first principles beyond high school geometry, and are interested, I would recommend:
- Abbott's Understanding Analysis
- Rosenlicht's Introduction to Analysis (as a less expensive second choice)
- Gelbaum and Olmsted's Counterexamples in Analysis
The last of these books is great for developing a sense for how superficially plausible statements are often false.
42 comments
Comments sorted by top scores.
comment by Shmi (shminux) · 2015-06-02T06:55:42.594Z · LW(p) · GW(p)
Sorry to harp on it again, but to enjoy real analysis one does require a fair amount of math aptitude, not just being white middle class male with a very intellectually curious parent. I had all those, and a good math instructor in the 2nd year, and a good TA, and a group of friends I would explain the stuff I learned to, and I got a high mark in the course, but I never really enjoyed it the way I enjoyed physics and some computer courses. I could do rigorous proofs when required, I just never had a thing for it. I would get a kick out of figuring out a fancy integral, but not out of figuring out a fancy proof.
The experience of seeing how difficult it can be to offer rigorous proofs of even relatively simple statements trains one to read very carefully, and not make any unwarranted assumptions.
I agree that it is a humbling experience to learn "how difficult it can be to offer rigorous proofs of even relatively simple statements," and I felt suitably humbled, and it has value for the armchair AI researchers frequenting this site, but if your goal is to teach humility, I suspect there are better ways.
What the books you are suggesting are good for is to find people who think they are bad at math, but aren't. I have seen an occasional case of a person being taken in by the beauty of mathematical proofs.
The last of these books is great for developing a sense for how superficially plausible statements are often false.
That seems too heavy. You ought to learn that in your first programming course, where a program that looks perfectly correct inevitably contains multiple bugs.
Replies from: JonahSinick, gjm↑ comment by JonahS (JonahSinick) · 2015-06-02T10:07:25.933Z · LW(p) · GW(p)
Sorry to harp on it again, but to enjoy real analysis one does require a fair amount of math aptitude,
Is there a reason why you keep bringing up this subject? I'm not complaining – I just want to know whether there's a point that you've been trying to make that I've been missing.
In my present post I was advocating learning real analysis for the sake of getting into the habit of reasoning carefully, not for enjoyment. I think that a large fraction of LWers have the mathematical aptitude required to find it tolerable, even if not exciting. I usually don't advocate people learning things that they don't find especially interesting, but in this particular case, the skill is so important that I think it might be worth it – I see it as analogous to literacy.
Replies from: Lumifer↑ comment by Lumifer · 2015-06-02T16:42:17.504Z · LW(p) · GW(p)
In my present post I was advocating learning real analysis for the sake of getting into the habit of reasoning carefully, not for enjoyment.
I am not sure the cost-benefit analysis is favorable, in particular for people who do not intend to become professional mathematicians.
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T17:01:13.299Z · LW(p) · GW(p)
Naturally it depends on the other alternatives on the table, and how far one wants to go. But I do know several people who report that learning the subject changed how they think in general, not only in the context of math.
Replies from: k_ebel↑ comment by k_ebel · 2015-06-05T16:11:10.270Z · LW(p) · GW(p)
I would have to add another point on the anecdotal side for this. I made it through Real Analysis (barely!) when I was a math major - and it made a significant difference on the thought process I go through when I consider things. If nothing else, it was very instrumental in breaking the "good rhetoric = good argument" connection I'd been operating under up until that point. And this was long before I'd any notion that places like CFAR or LW even existed.
(I will add the disclaimer that it also made certain kinds of communication more difficult - because most folks don't like it when you try to make their opinions rigorous - but that could as easily be from how I implemented those ideas as from the change in thinking itself.)
↑ comment by gjm · 2015-06-02T07:42:54.687Z · LW(p) · GW(p)
but if your goal is to teach humility, I suspect there are better ways
I don't think humility is what Jonah is trying to teach -- rather, it's something more like the habit of working really hard at understanding things. (Though I worry that there's a roughly opposite error: thinking that skill in other domains is like skill in pure mathematics and requires the same kind of intellectual work. The same amount, maybe -- though actually I suspect it varies -- but not necessarily the same kind.)
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T10:00:39.479Z · LW(p) · GW(p)
(Though I worry that there's a roughly opposite error: thinking that skill in other domains is like skill in pure mathematics and requires the same kind of intellectual work. The same amount, maybe -- though actually I suspect it varies -- but not necessarily the same kind.)
I was addressing the specific skill of reading carefully and not making assumptions that the author hasn't stated, which is highly relevant to learning in general. I agree that the work that goes into understanding things outside of pure math isn't necessarily of the same type as within pure math.
comment by TsviBT · 2015-06-02T10:43:36.322Z · LW(p) · GW(p)
Could you say more about why you think real analysis specifically is good for this kind of general skill? I have pretty serious doubts that analysis is the right way to go, and I'd (wildly) guess that there would be significant benefits from teaching/learning discrete mathematics in place of calculus. Combinatorics, probability, algorithms; even logic, topology, and algebra.
To my mind all of these things are better suited for learning the power of proof and the mathematical way of analyzing problems. I'm not totally sure why, but I think a big part of it is that analysis has a pretty complicated technical foundation that already implicitly uses topology and/or logic (to define limits and stuff), even though you can sort of squint and usually kind of get away with using your intuitive notion of the continuum. With, say, combinatorics or algorithms, everything is very close to intuitive concepts like finite collections of physical objects; I think this makes it all the more educational when a surprising result is proven, because there is less room for a beginner to wonder whether the result is an artifact of the funny formalish stuff.
Replies from: JeremyHahn, Gram_Stone, JonahSinick, Epictetus↑ comment by JeremyHahn · 2015-06-02T15:39:19.585Z · LW(p) · GW(p)
Personally I think real analysis is an awkward way to learn mathematical proofs, and I agree discrete mathematics or elementary number theory is much better. I recommend picking up an Olympiad book for younger kids, like "Mathematical Circles, A Russian Experience."
↑ comment by Gram_Stone · 2015-06-02T19:18:46.898Z · LW(p) · GW(p)
This is also somewhat in reply to your elaboration in this comment. Just some data points:
In regards to this topic of proof, and more generally to the topic of formal science, I have found logic a very useful subject. For one, you can leverage your verbal reasoning ability, and begin by conceiving of it as a symbolization of natural language, which I find for myself and many others is far more convenient than, say, a formal science that requires more spatial reasoning or abstract pattern recognition. Later, the point that formal languages are languages in their own right is driven home, and you can do away with this conceptual bridge.
Logic also has helped me to conceive of formal problems as a continuum of difficulty of proof, rather than proofs and non-proofs. That is, when you read a math textbook, sometimes you are instructed to Solve, sometimes to Evaluate, sometimes to Graph; and then there is the dreaded Show That X or Prove That X! In a logic textbook, almost all exercises require a proof of validity, and you move up over time, deriving new inference rules from old, and moving onto metalogical theorems. Later returning to books about mathematical proof, I found things much less intimidating. I found that proof is not a realm forbidden to those lacking an innate ability to prove; you must work your way upwards as in all things.
Furthermore, in regards to this:
Even the math with simple foundations has surprising results with complicated proofs that require precise understanding.
In my opinion, very significant and complex results in logic are arrived at quite early in comparison to the significance of, and effort invested in, results in other fields of formal science.
And in regards to this:
I think this makes it all the more educational when a surprising result is proven, because there is less room for a beginner to wonder whether the result is an artifact of the funny formalish stuff.
I have found that in continuous mathematics I have walked away from proofs with a feeling best expressed as, "If you say so," as opposed to discrete mathematics and logic, where it's more like, "Why, of course!"
↑ comment by JonahS (JonahSinick) · 2015-06-02T16:58:01.158Z · LW(p) · GW(p)
I agree with Epictetus' comment.
Replies from: TsviBT↑ comment by Epictetus · 2015-06-02T13:44:14.360Z · LW(p) · GW(p)
To my mind all of these things are better suited for learning the power of proof and the mathematical way of analyzing problems.
I think the main thrust of the article was less about the power of mathematics and more about the habits of close reading and careful attention to detail required to do rigorous mathematics.
I'm not totally sure why, but I think a big part of it is that analysis has a pretty complicated technical foundation that already implicitly uses topology and/or logic (to define limits and stuff), even though you can sort of squint and usually kind of get away with using your intuitive notion of the continuum.
Seems like it's precisely because of the complicated technical foundation that real analysis was recommended. Theorems have to be read carefully, as even simple ones often have lots of hypotheses. Proofs have to be worked through carefully to make sure that no implicit assumptions are being introduced. Even great mathematicians ran into trouble playing fast and loose with the real numbers. It took them about two hundred years to finally lay rigorous foundations for calculus.
Replies from: TsviBT↑ comment by TsviBT · 2015-06-02T18:17:22.135Z · LW(p) · GW(p)
Seems like it's precisely because of the complicated technical foundation that real analysis was recommended.
What I'm saying is, that's not a good reason. Even the math with simple foundations has surprising results with complicated proofs that require precise understanding. It's hard enough as it is, and I am claiming that analysis is too much of a filter. It would be better to start with the most conceptually minimal mathematics.
Even great mathematicians ran into trouble playing fast and loose with the real numbers. It took them about two hundred years to finally lay rigorous foundations for calculus.
...implying that it is actually pretty confusing. There are good reasons for wanting to learn analysis because it is applied so widely. But from the specific perspective of trying to learn lessons about math and rigorous argument in general, it seems like you want a subject that is legitimate math but otherwise as simple as possible. To some extent, trying to do real analysis as a first real math class is like trying to teach physics class in a foreign language. On the one hand, you just want to learn the physics, but at the same time you always have to translate into your native tongue, worrying that you made a subtle mistake in translation. If you want to learn how to prove stuff in general, you don't also want the objects that you're proving stuff about to be overcomplicated to the point that it's a whole chore just to understand what you're talking about. That is an important but distinct skill from understanding and inventing proofs.
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T21:04:09.418Z · LW(p) · GW(p)
Oh, sure, in expressing agreement with Epictetus I was just saying that I don't think that you get the full benefits that I was describing from basic discrete math. I agree that some students will find discrete math a better introduction to mathematical proof.
Replies from: TsviBT↑ comment by TsviBT · 2015-06-02T21:33:34.417Z · LW(p) · GW(p)
Ok that makes sense. I'm still curious about any specific benefits that you think studying analysis has, relative to other similarly deep areas of math, or whether you meant hard math in general.
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T22:17:26.527Z · LW(p) · GW(p)
I think that analysis is actually the easiest entry point to the kind of mathematical reasoning that I have in mind for people who have learned calculus. Most of the theorems are at least somewhat familiar, so one can focus on the logical rigor without simultaneously having to worry about understanding what the high level facts are.
Replies from: TsviBT
comment by RyanCarey · 2015-06-03T11:11:50.823Z · LW(p) · GW(p)
I used to get annoyed when LWers misread my posts in ways that they wouldn't have if they had been reading more carefully. I conceptualized such commenters as being undisciplined, and being unwilling to do the work necessary to maintain high epistemic standards. I now see that my reading was in many cases uncharitable, analogous to many of my teachers having misread my learning disability as reflecting laziness. Many of my readers have probably never had the opportunity to learn how to read really carefully.
This seems like you're unnecessarily antagonising your audience.
But the general issue is that even as someone who has met you, and likes you and generally has an unusually high amount of trust in your opinions, this is not enough to move the weight of probability of why your posts were misunderstood - it is still likelier due to arrogance or lack of communication skill than due to whether one's readers studied analysis subjects!
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-03T17:00:54.620Z · LW(p) · GW(p)
There's no single reason why my posts were misunderstood. You can think in terms of there being an underlying statistical model of whether or not a post will be understood. I acknowledge that my lack of communication skills has played a major role. But it's also true that people with very strong quantitative backgrounds (such as Paul Christiano) have understood my posts much more readily than most LWers have.
Replies from: RyanCarey↑ comment by RyanCarey · 2015-06-03T21:12:27.782Z · LW(p) · GW(p)
Depending on the post, there are other reasons that people who share your intellectual background may be better at understanding a post, other than the study making them smarter. It could be that they were smarter already, or that similar effects are apparent for people with shared non quantitative backgrounds as well.
Replies from: JonahSinick, JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-03T21:36:13.852Z · LW(p) · GW(p)
A bit more intuition:
In pure math, it's not uncommon for the statement of a theorem, when unpackaged (with all of the definitions spelled out, relative to what the reader already knows), to span several pages. It's often the case that if a mere word or two in the statement of a theorem were changed, the statement would be false. So you need to read every word carefully – it's a sink or swim situation.
Programming has the same character too, but it contrasts with pure math in that it's easy to place it in a different category from verbal communication in general. Some mathematical proof consists of symbolic manipulation, but the more theoretical areas involve a huge amount of verbal-type reasoning. So you would expect people with background in pure math to have this skill, even if they didn't before they started learning.
Replies from: RyanCarey↑ comment by RyanCarey · 2015-06-04T11:40:37.237Z · LW(p) · GW(p)
I agree that mathematical thinking and communication has a special and interesting character. Communicating in a mathlike way is a skill, to be sure. But if meticulously careful communication were superior in a wide range of scenarios, I would have at least a weak reason to expect more people to use it, at least when talking to other math people. But it's not clear to me that math people talk in mathlike ways to math people any more than sociologists talk in sociologist-like ways to sociologists.
For success in writing, I imagine it's more important to think in a writer-like way. For success in business, I imagine it's again not mathlike thinking that is necessarily most crucial...
↑ comment by JonahS (JonahSinick) · 2015-06-03T21:14:52.868Z · LW(p) · GW(p)
It could be that they were smarter already, or that similar effects are apparent for people with shared non quantitative backgrounds as well.
Yes, there are multiple possible causes. I'm just expressing my intuition (based on a huge amount of evidence that I can't easily share) and I'm saying that there's reason to make a Bayesian update in the direction of what I'm saying. Naturally, the size of the update depends on how compelling my perspective seems.
comment by Nanashi · 2015-06-02T13:20:39.340Z · LW(p) · GW(p)
I think that this emphasis on explicit, built-from-scratch mathematical proofs runs counter to your previously expressed suggestion that learning via pattern matching is more efficient than learning via explicit reasoning.
I've found that the emphasis on first principles is often symptomatic of someone who is speaking for their own benefit rather than that of their audience. After all, you're making the unwarranted assumption that A.) your audience wants first principles rather than a practical application, and B.) your audience is, for lack of a better word, too dumb to derive these principles for themselves. It's very easy to convince yourself that you are giving the audience the tools they need to understand what you're saying, when in fact, you're using the audience as a sounding board to help yourself better understand what you're actually saying.
(By the way, I'm using the "royal You" rather than specifically singling out you, Jonah. You caution against this very thing in another post of yours.)
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T16:39:40.253Z · LW(p) · GW(p)
I think that this emphasis on explicit, built-from-scratch mathematical proofs runs counter to your previously expressed suggestion that learning via pattern matching is more efficient than learning via explicit reasoning.
My focus here was two-fold:
- Learning mathematical proof as a means of learning how to read and listen very carefully.
- Learning the limits of rigorous reasoning by seeing how hard it is to give arguments that are actually rigorous, as opposed to just having the superficial appearance of rigor.
The second point is a part of my case for the first post that you linked.
comment by Ben Pace (Benito) · 2015-06-02T10:39:11.493Z · LW(p) · GW(p)
I have some time this summer to spend learning maths, and I was going to begin studying real analysis with Rudin's "Principles of Mathematical Analysis". I have heard it is the best book if you have the time to study thoroughly, which I do (three almost uninterrupted months, although I plan to learn lots of other maths too). As someone who is mathematically able, but has not done Real Analysis (and will not study it at university) what is your recommendation that I read?
Added: My info comes from the incredibly positive Amazon reviews, and the less positive Best Textbooks LW thread.
Replies from: None, JeremyHahn, JonahSinick↑ comment by [deleted] · 2015-06-02T18:31:00.094Z · LW(p) · GW(p)
I wouldn't recommend it for someone's first exposure to analysis. When you first meet a subject, you want to get a sense of how the bits fit together, and what the important concepts and theorems are supposed to "mean" (as opposed to their formal definitions). You learn this by slowly working through examples and thinking about special cases.
Unfortunately, Rudin has very few examples, his proofs are more elegant than enlightening (for the beginner anyway; his proofs are very enlightening if you already know the big picture and want to know the answers to questions like "Do I really need this strong an assumption for this theorem?"), and he develops his theories in a lot more generality than a typical introductory analysis course (which again isn't necessarily bad, but you do want to get a feel for how things work in R^n before diving into arbitrary metric spaces).
If you have three months, you might want to spend the first half or so on a more verbose book, and then go over the material again using Rudin. You'd get a deeper understanding, and it might even be faster than just going through Rudin once!
Replies from: Benito, Benito↑ comment by Ben Pace (Benito) · 2015-06-03T05:15:07.610Z · LW(p) · GW(p)
That makes a lot of sense. Thank you.
Added: Much of the praise for baby Rudin suggests that trying to prove each theorem before having seen the proof, is one of the best ways to become a good mathematician. Can you comment on my thought that, after having read a more verbose book, I won't have that same experience? Or is that approach going to work with most real analysis books, so that I could still try to prove everything before it is explained?
Replies from: None↑ comment by [deleted] · 2015-06-03T09:36:56.380Z · LW(p) · GW(p)
Yes, you can try to prove everything before it's explained with pretty much any real analysis book. Just be reasonable about it: if you've gone a few hours without even making partial progress on a theorem, read the proof. A first exposure to analysis doesn't just teach you analysis, it teaches you how to build theories from the bottom up. If you can do that on your first try, great. If you can't (as is a lot more likely), learn how and save the "prove everything on your own" experience for a different subject.
↑ comment by Ben Pace (Benito) · 2015-06-02T19:45:50.496Z · LW(p) · GW(p)
That makes a lot of sense. Thank you.
↑ comment by JeremyHahn · 2015-06-02T15:51:18.832Z · LW(p) · GW(p)
As a current Harvard math grad student I think you should read many different easy books to learn a subject whenever possible, especially if you can find them for free. When you say you are mathematically able it is unclear what level you are at. All of my favorite books for learning involve a huge number of exercises, and I recommend you do all of them instead of reading ahead.
For basic real analysis, my favorite book is Rosenlicht's Introduction to Analysis but baby Rudin is pretty good too, and I recommend you flip back and forth between them both.
For learning math in general, I think real analysis is a poor place to start, but that may be personal preference because I have a more algebraic slant. I highly recommend books like Herstein's Abstract Algebra, Mathematical Circles: A Russian Experience, I.M. Gelfand's Trigonometry, and Robert Ash's Abstract Algebra: The Basic Graduate Year, mostly for the wealth of exercises. Some of these are books for small children and I think those are the best sort of books to first learn from.
Replies from: Benito↑ comment by Ben Pace (Benito) · 2015-06-02T18:21:56.585Z · LW(p) · GW(p)
Thanks; I have pm-ed you for a follow-up.
↑ comment by JonahS (JonahSinick) · 2015-06-02T20:05:05.801Z · LW(p) · GW(p)
I agree with SolveIt.
comment by btrettel · 2015-06-03T02:35:13.491Z · LW(p) · GW(p)
Nice that you recommended a book of counterexamples. Counterexample books are particularly interesting for challenging your mental models. I picked one up when I took a measure theoretic probability class, and as I skimmed through the book I realized that much of what I thought was true (usually, implicitly) was not. (Can't think of any examples off hand, but this was the impression I had.)
Books of counterexamples and paradoxes are unfortunately not popular outside of math. In my own field, fluid dynamics, there's Hydrodynamics by Garrett Birkhoff, but nothing else I am aware of. In the first edition there was a discussion of D'Alembert's paradox that made me rethink a lot of what I had been taught about drag. This came down to understanding under what conditions the "paradox" holds, and recognizing these conditions are stricter than I had thought.
comment by Epictetus · 2015-06-02T04:51:23.356Z · LW(p) · GW(p)
I used to get annoyed when LWers misread my posts in ways that they wouldn't have if they had been reading more carefully. I conceptualized such commenters as being undisciplined, and being unwilling to do the work necessary to maintain high epistemic standards. I now see that my reading was in many cases uncharitable, analogous to many of my teachers having misread my learning disability as reflecting laziness. Many of my readers have probably never had the opportunity to learn how to read really carefully.
Careful reading doesn't come naturally. Everyday English prose is redundant enough that one doesn't have to read word-for-word to get an overall picture of what's going on. The sort of reading required to understand dense technical writing takes a lot of practice to pick up.
comment by KnaveOfAllTrades · 2015-06-02T03:36:14.195Z · LW(p) · GW(p)
Yes!! I've also independently come to the conclusion that basic real analysis seems important for these sorts of general lessons. In fact I suspect that seeing the reals constructed synthetically, or the Peano --> Integers --> Rationals --> Dedekind cuts construction, or some similar rigorous construction of an intuitively 'obvious' concept, is probably a big boost in accessing the upper echelons of philosophical ability. Until you've really seen how axioms work and broken some intuitive thing down to the level that you can see how a computer could verify your proofs (at least in principle), I kind of feel like you haven't done the work to really understand the concepts of proof or definition or seen a really full reduction of a thing to basic terms.
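For readers who haven't seen it, the construction chain mentioned above can be sketched in a few lines. This is one standard presentation, given as an outline under the usual textbook definitions, and it omits the verification that the operations are well defined.

```latex
\textbf{Integers from naturals.}
$\mathbb{Z} = (\mathbb{N} \times \mathbb{N}) / {\sim}$, where
$(a,b) \sim (c,d) \iff a + d = b + c$
(the pair $(a,b)$ stands for ``$a - b$'').

\textbf{Rationals from integers.}
$\mathbb{Q} = \bigl(\mathbb{Z} \times (\mathbb{Z} \setminus \{0\})\bigr) / {\sim}$,
where $(p,q) \sim (r,s) \iff ps = qr$
(the pair $(p,q)$ stands for ``$p/q$'').

\textbf{Reals from rationals.}
A \emph{Dedekind cut} is a set $A \subseteq \mathbb{Q}$ such that
\begin{enumerate}
  \item $A \neq \emptyset$ and $A \neq \mathbb{Q}$;
  \item if $x \in A$ and $y < x$, then $y \in A$;
  \item $A$ has no greatest element.
\end{enumerate}
$\mathbb{R}$ is the set of all cuts, ordered by inclusion. The supremum of a
nonempty family of cuts that is bounded above is simply their union.
```

The last line is what makes the least upper bound property, which is an axiom in the synthetic approach, into a theorem.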
comment by [deleted] · 2015-06-03T07:45:04.684Z · LW(p) · GW(p)
Sorry, this was a useless post so now it's gone
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-03T08:26:38.510Z · LW(p) · GW(p)
This should be its own post, please do one! You're good enough and I'll love you for it!
Thanks. Which aspect of what I wrote jumped out at you as especially worthy of highlighting?
Replies from: None
comment by IlyaShpitser · 2015-06-02T10:08:38.857Z · LW(p) · GW(p)
It's a good exercise to think about why the extreme value theorem fails if the function is not continuous (e.g. a general function is more like an infinitely large, arbitrarily malicious lookup table, not something you draw with a pencil on graph paper).
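As a sketch of what the exercise turns up, here are two functions defined at every point of $[0,1]$ that have no maximum. They are illustrative choices, with the second meant to convey the "malicious lookup table" flavor.

```latex
\textbf{Example 1 (one bad point).} Define $f \colon [0,1] \to \mathbb{R}$ by
\[
  f(x) =
  \begin{cases}
    1/x & \text{if } 0 < x \le 1, \\
    0   & \text{if } x = 0.
  \end{cases}
\]
Then $f$ is continuous except at $0$, yet $f(1/n) = n$ for every positive
integer $n$, so $f$ is unbounded and has no maximum.

\textbf{Example 2 (bad everywhere).} Define $g \colon [0,1] \to \mathbb{R}$ by
\[
  g(x) =
  \begin{cases}
    q & \text{if } x = p/q \text{ in lowest terms with } q \ge 1, \\
    0 & \text{if } x \text{ is irrational.}
  \end{cases}
\]
Every subinterval of $[0,1]$ of positive length contains rationals with
arbitrarily large denominators, so $g$ is unbounded on every such subinterval.
```

Neither function can be drawn with a pencil, which is the point: the theorem's proof can only use the definition of continuity, not the picture.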
I liked complex analysis better, but I agree that real analysis is generally the first serious math class people who are not very algebraic cut their teeth on.
Replies from: JonahSinick↑ comment by JonahS (JonahSinick) · 2015-06-02T10:35:09.089Z · LW(p) · GW(p)
I think that complex analysis is much more intrinsically interesting (leading to Riemann surfaces, quasiconformal mappings, etc.) but because the class of functions is so restricted, one is saved a lot of the nitty-gritty type work that one has to do in real analysis.