On exact mathematical formulae

daozaich

On exact mathematical formulae

post by daozaich · 2018-04-22T19:41:46.117Z · LW · GW · 14 comments

14 comments

This is inspired by the review on "Linear Algebra done right". I decided to do a top-level post, because it hits a misconception that is pretty common.

The starting point of this post is this quote from "Linear Algebra done right":

Remarkably, mathematicians have proved that no formula exists for the zeros of polynomials of degree 5 or higher. But computers and calculators can use clever numerical methods to find good approximations to the zeros of any polynomial, even when exact zeros cannot be found.

For example, no one will ever be able to give an exact formula for a zero of the polynomial p defined by $p (x) = x^{5} - 5 x^{4} - 6 x^{3} + 17 x^{2} + 4^{x} - 7$ .

The authors misrepresent an important point that is understood by most mathematicians, but not properly understood by many laypeople.

What does it mean to solve a problem? What does it mean to have an exact formula for the solution of a problem?

The answers to both are a social convention that has historically changed and is expected to continue to evolve in the future.

Back in the days, people only considered rational numbers, ie fractions. Oh, but what about the positive solution to $x^{2} = 2$ ? Ok, we can't express this as a rational number (important theorem). Because these kinds of problems occured quite often, the mathematical community arrived at the consensus that $\sqrt{2}$ , or more generally $\sqrt{r}$ for nonnegative $r$ should be considered an explicit solution. Amazingly, this allows us to express the solution to any quadratic equation $0 = a x^{2} + b x + c$ explicitly, with our expanded notion of "explicit". From an algebraic viewpoint it was natural to bless the positive solution to $x^{n} = a$ as an "explicit formula" next; historically it was a more contentious thing, because greek geometry wanted numbers to be constructible using a ruler and compass only. "Doubling the cube", ie expressing the positive solution to $x^{3} = 2$ as a geometric construction was a famous old problem (proven impossible in 1837, after having been a very prominent mathematical research problems for more than 2000 years).

Now, this obviously says not a lot about the cube root of 2, but says a lot about "constructible with ruler and compass".

In other words: "Explicit solutions" are a messy historical map to mathematical territory, nothing more.

The same holds if you ask for explicit formulas for zeros of polynomials after having grudgingly admitted nth roots as "explicit". The same holds if you ask about explicit integrals of explicit functions (also after having grudgingly admitted eg elliptic integrals as "explicit"). The same holds for solutions of differential equations.

In mathematics, asking about an "explicit formula" for solutions to problems means just: Assuming a general background in mathematics, is the solution something I already have spent years of my life developing an intuition for?

And if the answer happens to be "yes, unconditionally", then it is worthwhile.

If the "explicit" formula uses things that are not commonly taught anymore (crazy "special functions" that 100 years ago constituted a perfectly fine explicit solution), or is too lenghty/complicated to inform intuitions, then it is functionally equivalent to "we don't know", which is functionally equivalent to "we can prove that no formula using terms of type xyz exists".

So there is nothing surprising or scary about problems not having an "explicit" solution.

The true value of Galois theory is that it properly elucidates the hidden structure of polynomial equations, not that it tells us that no "explicit solution formula" exists for degree 5 polynomials for this very historical notion of "explicit". The "explicit" degree 4 formula is nothing more than a curiosity with interesting history, but absolutely worthless from both an intuitive and numerical standpoint.

I most often encountered the unjustified bias towards "explicit solutions" for implicit functions (the function $y = g (x)$ is defined by $f (x, g (x)) = 0$ for some fixed $f = f (x, y)$ , implicit function theorem + newton solver) and solutions to differential equations. Integrals are mostly considered "explicit" today.

14 comments

Comments sorted by top scores.

comment by gjm · 2018-04-23T20:00:24.770Z · LW(p) · GW(p)

One thing that isn't made, aha, explicit in this excellent article, and that maybe should be, is that considering ruler-and-compasses constructions to be explicit turns out to be exactly the same thing as considering taking square roots to be explicit because (once you have the idea of coordinate geometry, which for some reason no one thought of until Descartes) it pretty easily transpires that the numerical operations you can do on coordinates using ruler-and-compasses constructions are exactly arithmetic plus square roots. (Handwavily, the square roots come from the fact that the equation of a circle involves squaring the coordinates.)

And then (this uses, in some sense, "half" of Galois theory, so it's easier than the insolubility of the quintic) you can use this to prove that indeed you can't "duplicate the cube" using ruler and compasses (it means taking cube roots, and it turns out that no quantity of arithmetic-plus-square-roots will let you do that), and that another famous ancient problem -- trisecting angles -- can't be done with ruler and compasses for essentially the exact same reason, and (this is a bit more difficult; it was one of Gauss's first important theorems) that the regular polygons you can construct with ruler and compasses are those whose number of sides is a power of 2 times some number of distinct primes of form 2^2^n+1. So you can do 3 and 5 and 17, but not 9 (two 3s: not allowed) and not 11 (wrong sort of prime).

Of course, for all the reasons daozaich gives here, one shouldn't care too much about what can be done with ruler and compasses, as such. But these are still really cool theorems.

Replies from: daozaich

↑ comment by daozaich · 2018-04-23T22:18:25.349Z · LW(p) · GW(p)

You're right, I should have made that clearer, thanks!

comment by Shmi (shminux) · 2018-06-29T01:16:14.799Z · LW(p) · GW(p)

In mathematics, asking about an "explicit formula" for solutions to problems means just: Assuming a general background in mathematics, is the solution something I already have spent years of my life developing an intuition for?

Having spent some time looking at numerical solutions, I would generalize this to "do we have an algorithm that allows us to efficiently explore the relevant domain space?"

The algorithm could be a way to calculate sin(x), to square a number, to solve the Laplace equation with given boundary conditions, or even to calculate the distribution of gravitational radiation from black hole collisions. As long as it as efficient (in terms of computational and human efforts) as pressing sin(x) on the calculator, it is as good as exact.

comment by Vaniver · 2018-05-09T04:51:02.073Z · LW(p) · GW(p)

I'm curating this because I think it makes a valid and subtle mathematical point, of the sort that has direct relevance to thinking about many other topics.

comment by Ben Pace (Benito) · 2018-04-23T07:15:01.356Z · LW(p) · GW(p)

I recently discovered there's no closed-form formula for the circumference of an ellipse, and then was also told "But that just means there's no nice formula in what we consider basic functions." Do you have a sense / opinion on how explicit / arbitrary such elliptic formulae in fact are?

Replies from: daozaich, Thomas

↑ comment by daozaich · 2018-04-23T09:08:20.212Z · LW(p) · GW(p)

It depends on context. Is the exponential explicit? For the last 200 years, the answer is "hell yeah". Exponential, logarithm and trigonometry (complex exponential) appear very often in life, and people can be expected to have a working knowledge of how to manipulate them. Expressing a solution in terms of exponentials is like meeting an old friend.

120 years ago, knowing elliptic integrals, their theory and how to manipulate them was considered basic knowledge that every working mathematician or engineer was expected to have. Back then, these were explicit / basic / closed form.

If you are writing a computer algebra system of similar ambition to maple / mathematica / wolfram alpha, then you better consider them explicit in your internal simplification routines, and write code for manipulating them. Otherwise, users will complain and send you feature requests. If you work as editor at the "Bronstein mathematical handbook", then the answer is yes for the longer versions of the book, and a very hard judgement call for shorter editions.

Today, elliptic integrals are not routinely taught anymore. It is a tiny minority of mathematicians that has working knowledge on these guys. Expressing a solution in terms of elliptic integrals is not like meeting an old friend, it is like meeting a stranger who was famous a century ago, a grainy photo of whom you might have once seen in an old book.

I personally would not consider the circumference of an ellipse "closed form". Just call it the "circumference of the ellipsis", or write it as an integral, depending on how to better make apparent which properties you want.

Of course this is a trade-off, how much time to spend developing an intuition and working knowledge of "general integrals" (likely from a functional analysis perspective, as an operator) and how much time to spend understanding specific special integrals. The specific will always be more effective and impart deeper knowledge when dealing with the specifics, but the general theory is more applicable and "geometric"; you might say that it extrapolates very well from the training set. Some specific special functions are worth it, eg exp/log, and some used to be considered worthy but are today not considered worthy, evidenced by revealed preference (what do people put into course syllabi).

So, in some sense you have a large edifice of "forgotten knowledge" in mathematics. This knowledge is archived, of course, but the unbroken master-apprentice chains of transmission have mostly died out. I think this is sad; we, as a society, should be rich enough to sponsor a handful of people to keep this alive, even if I'd say "good riddance" for removing it from the "standard canon".

Anecdote: Elliptic integrals sometimes appear in averaging: You have a differential equation (dynamical system) and want to average over fast oscillations in order to get an effective (ie leading order / approximate) system with reduced dimension and uniform time-scales. Now, what is your effective equation? You can express it as "the effective equation coming out of Theorem XYZ", or write it down as an integral, which makes apparent both the procedure encoded in Theorem XYZ and an integral expression that is helpful for intuition and calculations. And sometimes, if you type it into Wolfram alpha, it transforms into some extremely lenghty expression containing elliptic integrals. Do you gain understanding from this? I certainly don't, and decided not to use the explicit expressions when I met them in my research (99% of the time, mathematica is not helpful; the 1% pays for the trivial inconvenience of always trying whether there maybe is some amazing transformation that simplifies your problem).

↑ comment by Thomas · 2020-10-01T18:16:23.032Z · LW(p) · GW(p)

I recently discovered there's no closed-form formula for the circumference of an ellipse

Yes. I've asked the computer, to give me some simple approximation formula then. This came out:

It's quite good when b >> a.

comment by interstice · 2018-04-23T00:36:44.785Z · LW(p) · GW(p)

While the concept of explicit solution can be interpreted messily, as in the quote above, there is a version of this idea that more closely cuts reality at the joints, computability. A real number is computable iff there is a Turing machine that outputs the number to any desired accuracy. This covers fractions, roots, implicit solutions, integrals, and, if you believe the Church-Turing thesis, anything else we will be able to come up with. https://en.wikipedia.org/wiki/Computable_number

Replies from: daozaich, Dacyn

↑ comment by daozaich · 2018-04-23T09:21:52.213Z · LW(p) · GW(p)

Computability does not express the same thing we mean with "explicit". The vague term "explicit" crystallizes an important concept, which is dependent on social and historical context that I tried to elucidate. It is useful to give a name to this concept, but you cannot really prove theorems about it (there should be no technical definition of "explicit").

That being said, computability is of course important, but slightly too counter-intuitive in practice. Say, you have two polynomial vectorfields. Are solutions (to the differential equation) computable? Sure. Can you say whether the two solutions, at time t=1 and starting in the origin, coincide? I think not. Equality of computable reals is not decidable after all (literally the halting problem).

Replies from: interstice

↑ comment by interstice · 2018-04-24T21:26:11.147Z · LW(p) · GW(p)

re: differential equation solutions, you can compute if they are within epsilon of each other for any epsilon, which I feel is "morally the same" as knowing if they are equal.

It's true that the concepts are not identical. I feel computability is like the "limit" of the "explicit" concept, as a community of mathematicians comes to accept more and more ways of formally specifying a number. The correspondence is still not perfect, as different families of explicit formulae will have structure(e.g. algebraic structure) that general Turing machines will not.

↑ comment by Dacyn · 2018-04-24T14:43:40.307Z · LW(p) · GW(p)

Polynomial-time computability is probably closer to the notion of explicitness (though still not quite the same, as daozaich points out). I don't know of any number that is considered explicit but is not polynomial-time computable.

comment by ryan_b · 2018-04-22T20:50:41.262Z · LW(p) · GW(p)

Having run up against broadly similar kinds of problems recently, I have concluded that intuitions are the right level for me to focus on.

I also greatly prize historical context for math-related questions now. I feel like this is the not-well-explained motivation for returning to originators of and key historical contributors to a field, even if efficient textbooks have since been written.

comment by Maxwell Peterson (maxwell-peterson) · 2020-12-07T20:34:46.906Z · LW(p) · GW(p)

This is great - I’ve been working on a lot of math lately, and the difference in this post describes is definitely muddied in my mind, but until reading I didn’t realize I was confused about the difference.

comment by sebreens · 2018-05-10T14:03:45.025Z · LW(p) · GW(p)

changed my mind edit

On exact mathematical formulae

Contents

14 comments