Intuition for 1 + 2 + 3 + … = -1/12


post by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T16:46:42.687Z · LW · GW · 28 comments

  The "Counting" Function C
  Fractions, Negative numbers, Zero
  Infinitely Large Tuples and Pattern Consistency
  Finish
  Concluding Remarks
28 comments

You might have been led to believe the only way to make sense of the equation is with "zeta function regularization," involving analytic continuation or similar high-tech tools, and that there is no way to develop intuition for this.

You were probably told this by mathematicians, to whom "intuition" means something very different than it does to me.

So here follows what I consider an intuitive explanation for this result. Prerequisites include an intuition for the arithmetic of positive integers (most readers will have this), and a willingness to stfu about rigor.

The "Counting" Function $C$

For historical reasons, involving "counting" the number of individual objects (sheep, stones, sticks, whatever) in a collection, there exists a function $C$ that acts on ordered tuples of positive integers and outputs a single positive integer. For example,

$(1, 1) \xrightarrow{C} 2$ ^[1]

$(1, 1, 1) \xrightarrow{C} 3$

Now, this function has several nice^[2] properties (which have names like "commutativity" and "associativity"), which means that things like the following are also true:

$(2, 1) \xrightarrow{C} 3$ .

I shall be dropping the superscript over the arrow for the rest of this.

Fractions, Negative numbers, Zero

Now, playing with this function, we might ask questions like,

Is there a number $x$ such that $(2, x) \to 1$ ?

Or, is there a number $y$ such that $(y, y) \to 1$ ?

Now, obviously, the answer is "No, of course not. The output of the 'counting' function is greater than every number in the input, and there is no number smaller than 1."

However, we're going to make them up in such a way that all the nice properties of the function $C$ hold; this is a silly exercise with no real-world meaning, but mathematicians like doing things like this, and they turn out to have surprising real-world applications, in fields like physics (something, something "unreasonable effectiveness"). These new "numbers" are called negative numbers and fractions.

$(2, - 1) \to 1,$ and $(\frac{1}{2}, \frac{1}{2}) \to 1.$

Now, playing with these new "numbers," we find that to preserve the nice properties of $C,$ we need to introduce something called a "zero" or "0" such that $(1, - 1) \to 0$ .

With me so far?

Infinitely Large Tuples and Pattern Consistency

What happens if we apply our function $C$ to a tuple with infinitely many elements? The obvious answer is "That doesn't make any sense! You can't count infinitely many elements; you'll never get to the end. Consider Achilles and the tortoise …." Yes, of course, but like before, just go with it, okay?

The nice property we're going to make up for $C$ is that it follows patterns: in cases where the output isn't clear (such as with an infinite number of terms when we've only ever seen finite tuples), but there's a neat pattern that would be formed if it had some specific value, that is the value of the function. This is a little vague, but an example might make it clear:

$(1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \dots) \to ?$

The pattern we're going to match is $(1)$ , $(1, \frac{1}{2})$ , $(1, \frac{1}{2}, \frac{1}{4})$ , $(1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8})$ , and so on.

The output, following the pattern $1$ , $1.5$ , $1.75$ , $1.875$ , … is $2$ .

The pattern is actually more general than that, and for any $r$ , $| r | < 1$ ,

$(1, r, r^{2}, r^{3}, \dots) \to \frac{1}{1 - r} .$
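To watch the pattern form, here is a minimal sketch that applies $C$ to longer and longer prefixes of $(1, r, r^{2}, \dots)$ and compares the result with $\frac{1}{1 - r}$. The helper name `partial_sums` is made up for this illustration, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def partial_sums(r, count):
    """Outputs of C on the first `count` prefixes of (1, r, r^2, ...)."""
    sums = []
    total = Fraction(0)
    term = Fraction(1)
    for _ in range(count):
        total += term
        sums.append(total)
        term *= r
    return sums

r = Fraction(1, 2)
prefixes = partial_sums(r, 20)

# The first few outputs are the 1, 1.5, 1.75, 1.875 pattern from the text.
print([float(s) for s in prefixes[:4]])

# Far along the pattern, the output sits next to 1 / (1 - r).
print(float(prefixes[-1]), float(1 / (1 - r)))
```

Trying other values of `r` with $| r | < 1$ (say `Fraction(-1, 3)`) shows the same agreement with $\frac{1}{1 - r}$.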

Here's the key idea:

The outputs we've assigned using this pattern-matching technique can themselves be used as part of new patterns!

For the tuple

$(1, 2, 4, 8, 16, \dots) \to ?,$

the analogous sequence $(1)$ , $(1, 2)$ , $(1, 2, 4)$ , $(1, 2, 4, 8)$ , etc. doesn't suggest any particular value. However, we can match it to the metapattern $\frac{1}{1 - r}$ , where $r = 2.$

$(1, 2, 4, 8, 16, \dots) \to - 1.$
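One sanity check on this value, under an assumption the text has not stated: suppose $C$ keeps the property that peeling the leading $1$ off $(1, r, r^{2}, \dots)$ leaves $r$ times the same tuple, so the output $v$ should satisfy $v = 1 + r v$. The sketch below (with the made-up helper `geometric_value`) checks that $\frac{1}{1 - r}$ passes this test both inside and outside $| r | < 1$.

```python
from fractions import Fraction

def geometric_value(r):
    """Value 1 / (1 - r) that the metapattern assigns to (1, r, r^2, ...)."""
    return 1 / (1 - Fraction(r))

for r in (Fraction(1, 2), Fraction(-1, 3), 2, 3):
    v = geometric_value(r)
    # Assumed property: (1, r, r^2, ...) is 1 followed by r times itself.
    assert v == 1 + r * v
    print(f"r = {r}: value = {v}")
```

For $r = 2$ this is the statement that $-1 = 1 + 2 \cdot (-1)$, so $-1$ is at least consistent with that one property.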

Important caveat: If there are multiple patterns that suggest different values for the output of a particular unclear tuple, this pattern matching approach won't work. That's fine, let's see how far it takes us! Remember, if some patterns don't suggest any value, that's not a problem as long as all the ones that do suggest the same value.

For some tuples, like $(1, - \frac{1}{2}, \frac{1}{3}, - \frac{1}{4}, \dots)$ the order of elements matters! Keep that in mind when matching patterns.

Finish

There isn't much more intuition involved to get from $(1, 2, 4, 8, 16, \dots) \to - 1$ to $(1, 2, 3, 4, 5, \dots)$ . The hard part is finding the right patterns: many of the obvious things you might try don't suggest any value. After a lot of (other people's) work, here's an example of something that works.

$(e^{- 1} \cos (- 1)) \to 0.1988,$

$(e^{- 1} \cos (- 1), 2 e^{- 2} \cos (- 2)) \to 0.0861,$

$(e^{- 1} \cos (- 1), 2 e^{- 2} \cos (- 2), 3 e^{- 3} \cos (- 3)) \to - 0.0617,$ …

and if you go far enough (use a computer!), the pattern suggests approximately $- 0.08267.$
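For anyone who wants to take the "use a computer" suggestion literally, here is a small sketch. It accumulates the terms $n e^{- n} \cos (- n)$ one at a time; the function name `damped_partial_sums` and the cutoff of 60 terms are arbitrary choices for this illustration.

```python
import math

def damped_partial_sums(x, count):
    """Outputs of C on the first `count` prefixes of
    (e^{-x} cos(-x), 2 e^{-2x} cos(-2x), 3 e^{-3x} cos(-3x), ...)."""
    sums = []
    total = 0.0
    for n in range(1, count + 1):
        total += n * math.exp(-n * x) * math.cos(-n * x)
        sums.append(total)
    return sums

sums = damped_partial_sums(1.0, 60)

# First three prefixes, to compare with the values quoted above.
print([round(s, 4) for s in sums[:3]])

# After 60 terms the later terms are far too small to move the total.
print(round(sums[-1], 5))
```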

Now, we can use the results of these patterns to match a metapattern:

$(e^{- 1} \cos (- 1), 2 e^{- 2} \cos (- 2), 3 e^{- 3} \cos (- 3), \dots) \to - 0.08267,$

$(e^{- 1 / 2} \cos (- 1 / 2), 2 e^{- 2 / 2} \cos (- 2 / 2), 3 e^{- 3 / 2} \cos (- 3 / 2), \dots) \to - 0.08329,$

$(e^{- 1 / 4} \cos (- 1 / 4), 2 e^{- 2 / 4} \cos (- 2 / 4), 3 e^{- 3 / 4} \cos (- 3 / 4), \dots) \to - 0.08333,$ …

Notice that this pattern of infinite tuples does match what we want, $(1, 2, 3, \dots)$ . The value suggested by this pattern is $- \frac{1}{12}$ . □
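The metapattern can be checked numerically as well. The sketch below sums $n e^{- n x} \cos (- n x)$ for $x = 1, \frac{1}{2}, \frac{1}{4}, \dots$ until the terms become negligible; the name `damped_sum` and the stopping tolerance are assumptions made for this illustration, not anything from the derivation itself.

```python
import math

def damped_sum(x, tolerance=1e-15):
    """Sum n * exp(-n x) * cos(-n x) over n = 1, 2, 3, ...
    stopping once the damping factor makes further terms negligible."""
    total = 0.0
    n = 1
    while n * math.exp(-n * x) > tolerance:
        total += n * math.exp(-n * x) * math.cos(-n * x)
        n += 1
    return total

# Each row is one infinite tuple in the metapattern; as x shrinks,
# the tuple looks more like (1, 2, 3, ...).
for k in range(5):
    x = 1 / 2**k
    print(f"x = {x:<7} sum = {damped_sum(x):.5f}")

print(f"-1/12       = {-1 / 12:.5f}")
```

This only illustrates the trend in the numbers; it does not show that every other pattern would agree.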

It turns out no pattern that suggests a value suggests any other, though this is hard to prove.

Concluding Remarks

In precisely the same sense that we can write

$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \dots = 2,$

despite that no real-world process of "addition" involving infinitely many terms may be performed in a finite number of steps, we can write

$1 + 2 + 3 + 4 + \dots = - \frac{1}{12} .$

It is only familiarity that keeps us from seeing this.

Further, "limit of partial sums" is only one of a whole class of "patterns" that may be used to assign values to infinite series. Its strength is that it's easy to describe in general, but it is not the only one, and where it fails to suggest a particular value in the limit, there are others that are no less reasonable.

^[1] This is usually notated $1 + 1 = 2$ . Some people annoyingly include quotation marks around some of these symbols, but most people omit them for clarity.

^[2] "Nice" is deliberately vague. We're gonna make up a bunch of elegant properties as we need them.

28 comments

Comments sorted by top scores.

comment by Yair Halberstadt (yair-halberstadt) · 2024-02-18T18:02:01.942Z · LW(p) · GW(p)

In precisely the same sense that we can write 1 + 1/2 + 1/4 + ... = 2, despite that no real-world process of "addition" involving infinitely many terms may be performed in a finite number of steps, we can write 1 + 2 + 3 + 4 + 5 + ... = -1/12

I think this is overstating things (which is fair enough to make the point you're making).

The first is simply a shorthand for "the limit of this sum is 2", which is an extremely simple, general definition, which applies in almost all contexts, and matches up with what addition means in almost all contexts. It preserves far more of the properties of addition as well - it's commutative, associative, etc. In most cases where you want to work with the sum of an infinite series, the correct value to use for this series is 2.

The second is a shorthand for something far more complex, which applies in a far more limited range of cases, and doesn't preserve almost any of the properties we expect of addition. It's not linear or stable. In most cases where you want to work with sums of infinite series, the correct sum for this series is infinity. Only very rarely would you want -1/12.

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T19:04:37.920Z · LW(p) · GW(p)

The first is simply a shorthand for "the limit of this sum is 2",

It doesn't need to be! It can be more generally something like "the unique value that matches patterns," where what counts is extended first beyond integers, and then to infinite series, and then to divergent infinite series.

Replies from: nikolas-kuhn

↑ comment by Amalthea (nikolas-kuhn) · 2024-02-18T19:37:21.346Z · LW(p) · GW(p)

You run into the trouble of having to defend why your way to fit the divergent series into a pattern is the right one - other approaches may give different results.

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T20:49:40.504Z · LW(p) · GW(p)

other approaches may give different results

The claim is that they don't: any pattern that points to a finite result points to the same one. If you want proof, then you need a more rigorous formalism.

Replies from: nikolas-kuhn, yair-halberstadt

↑ comment by Amalthea (nikolas-kuhn) · 2024-02-18T21:45:05.460Z · LW(p) · GW(p)

Sure, but you're just claiming that, and I don't think it's actually true.

↑ comment by Yair Halberstadt (yair-halberstadt) · 2024-02-19T10:27:08.667Z · LW(p) · GW(p)

That's clearly not true in a general sense. Here's a pattern that points to a different sum:

1 + 2 + 3 + ... = 1 + (1 + 1) + (1 + 1 + 1) + ... = 1 + 1 + 1 + ... = - 1/2

Now the problem is this pattern leads to a contradiction because it can equally prove any number you want. So we don't choose to use it as a definition for an infinite sum.

So you need to do a bit more work here to define what you mean here.

comment by Max H (Maxc) · 2024-02-18T18:09:23.340Z · LW(p) · GW(p)

In precisely the same sense that we can write
$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \dots = 2,$
despite that no real-world process of "addition" involving infinitely many terms may be performed in a finite number of steps, we can write
$1 + 2 + 3 + 4 + \dots = - \frac{1}{12}$ .

Well, not precisely. Because the first series converges, there's a whole bunch more we can practically do with the equivalence-assignment in the first series, like using it as an approximation for the sum of any finite number of terms. -1/12 is a terrible approximation for any of the partial sums of the second series.

IMO the use of "=" is actually an abuse of notation by mathematicians in both cases above, but at least an intuitive / forgivable one in the first case because of the usefulness of approximating partial sums. Writing things as $(1, 2, 3, \dots) \sim - \frac{1}{12}$ or $R ((1, 2, 3, \dots)) = - \frac{1}{12}$ (R() denoting Ramanujan summation, which for convergent series is equivalent to taking the limit of partial sums) would make this all less mysterious.

In other words, (1, 2, 3, ...) is in an equivalence class with -1/12, an equivalence class which also contains any finite series which sum to -1/12, convergent infinite series whose limit of partial sums is -1/12, and divergent series whose Ramanujan sum is -1/12.

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T18:20:58.107Z · LW(p) · GW(p)

an abuse of notation by mathematicians in both cases above

Writing things as $(1, 2, 3, \dots) \sim - \frac{1}{12}$ or $R ((1, 2, 3, \dots)) = - \frac{1}{12}$

The point I was trying to make is that we already have perfectly good notation for sums, namely the + and = signs, that we've already extended well beyond the (apocryphal) original use of adding finite sets of positive integers. As long as there's no conflict in meaning (where saying "there's no answer" or "it's divergent" doesn't count) extending it further is fine.

Replies from: Maxc

↑ comment by Max H (Maxc) · 2024-02-18T18:35:24.577Z · LW(p) · GW(p)

My point is that there is a conflict for divergent series though, which is why 1 + 2 + 3 + … = -1/12 is confusing in the first place. People (wrongly) expect the extension of + and = to infinite series to imply stuff about approximations of partial sums and limits even when the series diverges.

My own suggestion for clearing up this confusion is that we should actually use less overloaded / extended notation even for convergent sums, e.g. seems just as readable as the usual $\lim \to$ and $+ \dots +$ notation.

comment by gjm · 2024-02-19T02:50:23.939Z · LW(p) · GW(p)

It is not true that "no pattern that suggests a value suggests any other", at least not unless you say more precisely what you are willing to count as a pattern.

Here's a template describing the pattern you've used to argue that 1+2+...=-1/12:

We define numbers $a_{i j}$ with the following two properties. First, $\lim_{j \to \infty} a_{i j} = i$ , so that for each $j$ we can think of $(a_{i j})$ as a sequence that's looking more and more like (1,2,3,...) as $j$ increases. Second, $\sum_{i = 1}^{\infty} a_{i j} = s_{j}$ where $s_{j} \to - \frac{1}{12}$ , so the sums of these sequences that look more and more like (1,2,3,...) approach -1/12.

(Maybe you mean something more specific by "pattern". You haven't actually said what you mean.)

Well, here are some $a_{i j}$ to consider. When $i > j + 1$ we'll let $a_{i j} = 0$ . When $i \leq j$ we'll let $a_{i j} = i$ . And when $i = j + 1$ we'll let $a_{i j} = A - (1 + \dots + j)$ . Here, $A$ is some fixed number; we can choose it to be anything we like.

This array of numbers satisfies our first property: $\lim_{j \to \infty} a_{i j} = i$ . Indeed, once $j \geq i$ we have $a_{i j} = i$ , and the limit of an eventually-constant sequence is the thing it's eventually constant at.

What about the second property? Well, as you'll readily see I've arranged that for each $j$ we have $\sum_{i = 1}^{\infty} a_{i j} = A$ . So the sequence of sums converges to $A$ .

In other words, this is a "pattern" that makes the sum equal to $A$ . For any value of $A$ we choose.

I believe there are more stringent notions of "pattern" -- stronger requirements on how the $a_{i j}$ approach $i$ for large $j$ -- for which it is true that every "pattern" that yields a finite sum yields $- \frac{1}{12}$ . But does this actually end up lower-tech than analytic continuation and the like? I'm not sure it does.

(One version of the relevant theory is described at https://terrytao.wordpress.com/2010/04/10/the-euler-maclaurin-formula-bernoulli-numbers-the-zeta-function-and-real-variable-analytic-continuation.)

comment by Amalthea (nikolas-kuhn) · 2024-02-18T21:57:11.574Z · LW(p) · GW(p)

I'm one of these professional mathematicians, and I'll say that this article completely fails to demonstrate its central thesis that there is a valid intuitive argument for concluding that 1 + 2 + 3 + ... = -1/12 makes sense. What's worse, it only pretends to do so by what's essentially a swindle. In my understanding, it's relatively easy to reason that a given divergent series "should" take an arbitrary finite value by the kind of arguments employed here, so what is being done is taking a foregone conclusion and providing some false intuition for why it should be true.

On a less serious note, speaking to the real reason why 1 + 2 + 3 + ... = -1/12, that's actually what physicists will tell you, and we all know one should be careful around those.

Replies from: lahwran, shankar-sivarajan

↑ comment by the gears to ascension (lahwran) · 2024-02-18T23:34:42.815Z · LW(p) · GW(p)

is the = -12 a typo or a joke?

Replies from: nikolas-kuhn

↑ comment by Amalthea (nikolas-kuhn) · 2024-02-18T23:54:21.036Z · LW(p) · GW(p)

Typo, thanks for pointing it out. Also, see here for the physics reference: https://en.m.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_⋯

Replies from: lahwran

↑ comment by the gears to ascension (lahwran) · 2024-02-19T02:57:52.738Z · LW(p) · GW(p)

In bosonic string theory, the attempt is to compute the possible energy levels of a string, in particular, the lowest energy level. Speaking informally, each harmonic of the string can be viewed as a collection of 'D' − 2 independent quantum harmonic oscillators, one for each transverse direction, where D is the dimension of spacetime. If the fundamental oscillation frequency is ω, then the energy in an oscillator contributing to the n-th harmonic is nħω/2. So using the divergent series, the sum over all harmonics is −ħω(D − 2)/24. Ultimately it is this fact, combined with the Goddard–Thorn theorem, which leads to bosonic string theory failing to be consistent in dimensions other than 26.

Haha what the fuck

Replies from: michael-roe, shankar-sivarajan

↑ comment by Michael Roe (michael-roe) · 2024-02-19T14:25:17.167Z · LW(p) · GW(p)

It's astonishing, but yes, that is the reason why that form of string theory takes place in a 26 dimensional space-time.

wait till you see E8*E8 heterotic string theory,

https://en.wikipedia.org/wiki/Heterotic_string_theory

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-19T04:46:43.564Z · LW(p) · GW(p)

This might be a better description: link

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-19T00:42:26.937Z · LW(p) · GW(p)

I did warn about the prereqs.

comment by John Steidley (JohnSteidley) · 2024-02-18T18:15:58.994Z · LW(p) · GW(p)

The finish was quite a jump for me. I guess I could go and try to stare at your parenthesis and figure it out myself, but mostly I feel somewhat abandoned at that step. I was excited when I found 1, 2, 4, 8... = -1 to be making sense, but that excitement doesn't quite feel sufficient for me to want to decode the relationships between the terms in those two(?) patterns and all the relevant values

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T18:25:38.862Z · LW(p) · GW(p)

That's fair. What I was trying to convey (in my notation, deliberately annoying to reduce a sense of familiarity) is . Any ideas for how I could write that better?

I added some actual values for concreteness. Hopefully that helps.

Replies from: hwold

↑ comment by hwold · 2024-02-18T19:33:13.142Z · LW(p) · GW(p)

Do we have the value of the sum as a function of x, before going to the limit as x goes to 0 ? If yes, it would help (bonus points if it can be proven in a few lines).

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T21:22:25.158Z · LW(p) · GW(p)

Mathematica yields . That probably simplifies though.

comment by Viliam · 2024-03-15T20:25:24.838Z · LW(p) · GW(p)

Here is another explanation, kind of:

Taylor expansion of 1/(1+x)^2 is 1 - 2x + 3x^2 - 4x^3 + 5x^4...

When x = 1, it means that 1 - 2 + 3 - 4 + 5... = 1/4

But 1 - 2 + 3 - 4 + 5... can be written as 1 + 2 + 3 + 4 + 5... - 2×2 - 2×4 - 2×6...

= 1 + 2 + 3 + 4 + 5... - 2×(2 + 4 + 6...)

= 1 + 2 + 3 + 4 + 5... - 2×2×(1 + 2 + 3...)

= (1 - 2×2) × (1 + 2 + 3...)

= -3 × (1 + 2 + 3...)

So if 1 - 2 + 3 - 4 + 5... = 1/4, we get:

1/4 = -3 × (1 + 2 + 3...)

-1/12 = 1 + 2 + 3...

(Found here.)

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-03-16T05:01:30.970Z · LW(p) · GW(p)

Sure, but if you see , you already have all the intuition you need. The rest is detail.

The example I used, $1 + 2 + 4 + 8 + \dots = - 1$ , is the same kind of thing, a power series applied outside its domain of convergence, which I used instead because, while it doesn't lend itself to the derivation directly, it looks more like the sum we seek (in particular, all positive integers on the left and a negative number on the right), and I expected the formula for an infinite geometric series to be more familiar to most readers.

comment by Michael Roe (michael-roe) · 2024-02-19T13:51:15.691Z · LW(p) · GW(p)

What you want here is Matthew D. Schwartz, Quantum Field Theory and the Standard Model, chapter 15.

(or the original paper by Casimir cited in the above book chapter).

Zeta regularisation is a much saner way to explain it.

====

In a physics context, suppose that the laws of physics you have are only valid up to some energy scale E, where E is presumed large.

The physical quantity you're interested in is f(E) - g(E).

The limits of f(E) and g(E) as E -> infinity are each infinite, so you can't safely exchange the order of the limits and the subtraction. But lim E -> infinity (f(E) - g(E)) exists and is finite, so you're good to go, and the result is insensitive to what the energy scale E actually is.

comment by Ben Pace (Benito) · 2024-02-18T20:42:25.255Z · LW(p) · GW(p)

This was a fun read and felt (for me) more simple and follow-able than most things I've read explaining math! Thank you.

I got up to the sums of powers of $2$ being $- 1$ . That bit took a few close reads but I followed that there was a pattern where infinite sums of $(1, r, r^{2}, r^{3}, \dots)$ equal $\frac{1}{1 - r}$ , and there's reason to believe this holds for $r$ between $- 1$ and $1$ . Then you write that if we apply it to $r = 2$ then it's equal to $- 1$ , which is a daring question to even ask and also a curious answer! But what justification do you have for thinking that equation holds for $r$ that aren't between $- 1$ and $1$ ? I think that this was skipped over (though I may be missing something simple).

(Also, if it does hold for numbers that aren't between $- 1$ and $1$ , then I believe this also implies that all infinite sums of $r^{n}$ are equal to negative numbers, and suggests maybe all infinite sums of positive integers will be too.)

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-18T20:59:59.307Z · LW(p) · GW(p)

But what justification do you have for thinking that equation holds

I don't have one: this is simply filling in the pattern. You can say "No, for any there is no answer," (just like you could reasonably say things like " $- 1$ has no square root") but if you decide to extend the domain of the function $C,$ this is not just a possible value, it is the value. (The proof of this claim requires actual rigor, but if you're willing to go along with pattern matching, you get the right answer.)

comment by Zane · 2024-03-13T01:59:21.202Z · LW(p) · GW(p)

If I'm understanding correctly, the argument here is:

B) $\lim_{x \to \infty} (n e^{- \frac{n}{x}}) = n$

C) $\lim_{x \to \infty} (\cos (- \frac{n}{x})) = 1$

Therefore, $\sum_{n = 1}^{\infty} (n * 1) = - \frac{1}{12}$ .

First off, this seems to have an implicit assumption that $\lim_{x \to \infty} (\sum_{n = 1}^{\infty} (f (x, n) * g (x, n))) = \sum_{n = 1}^{\infty} ((\lim_{x \to \infty} f (x, n)) * (\lim_{x \to \infty} g (x, n)))$ .

I think this assumption is true for any functions f and g, but I've learned not to always trust my intuitions when it comes to limits and infinity; can anyone else confirm this is true?

Second, A seems to depend on the relative sizes of the infinities, so to speak. If j and k are large but finite numbers, then $\sum_{n = 1}^{j} (n e^{- \frac{n}{k}} \cos (- \frac{n}{k})) \approx - \frac{1}{12}$ if and only if j is substantially greater than k; if k is close to or larger than j, it becomes much less than or greater than -1/12.

I'm not sure exactly how this works when it comes to infinities - does the infinity on the sum have to be larger than the infinity on the limit for this to hold? I'm pretty sure what I just said was nonsense; is there a non-nonsensical version?

In conclusion, I don't know how infinities work and hope someone else does.

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-03-13T02:41:41.208Z · LW(p) · GW(p)

I didn't make any claim about limits. If you're looking for rigor, you're in the wrong place, as I tried to make clear in the introduction.

But (A) is true without any unconventional weirdness: (from Mathematica), and $\lim_{x \to 0^{-}}$ of that is $- \frac{1}{12}$ .
