Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
post by Joe Rogero · 2024-11-12T23:55:46.770Z · LW · GW
TL;DR: People often use the thought experiment of flipping a coin, giving 50% chance of huge gain and 50% chance of losing everything, to say that maximizing utility is bad. But the real problem is that our intuitions on this topic are terrible, and there's no real paradox if you adopt the premise in full.
Epistemic status: confident, but too lazy to write out the math
There's a thought experiment that I've sometimes heard as a counterargument to strict utilitarianism. A god/alien/whatever offers to flip a coin. Heads, it slightly-more-than-doubles the expected utility in the world. Tails, it obliterates the universe. An expected-utility maximizer, the argument goes, keeps taking this bet until the universe goes poof. Bad deal.
People seem to love citing this thought experiment when talking about Sam Bankman-Fried. We should have known he was wrong in the head, critics sigh, when he said he'd bet the universe on a coinflip. They have a point; SBF apparently talked about this a lot, and it came up in his trial. I'm not fully convinced he understood the implications, and he certainly had a reckless and toxic attitude towards risk.
But today I'm here to argue that, despite his many, many flaws, SBF got this one right.
There is a lot of value in the universe
Suppose I'm a utilitarian. I value things like the easing of suffering and the flourishing of sapient creatures. Some mischievous all-powerful entity offers me the coinflip deal. On one side is the end of the world. On the other is "slightly more than double everything I value." What does that actually mean?
It turns out the world is pretty big. There is a lot of flourishing in it. There's also a lot of suffering, but I happen to arrange my preference-ordering such that the net utility of the world continuing to exist is extremely large. To make this coinflip an appealing trade, the Cosmic Flipper has to offer me something whose value is commensurate to that of the whole entire world and everyone in it, plus all the potential future value in humanity's light cone.
That's a big freaking deal.
The number of offers that weigh heavily enough on the other side of the scale is pretty darn small. "Double the number of people in the world" doesn't begin to come close; neither does "make everyone twice as happy." A more appropriate offer IMO might look more like "everyone becomes unaging, doesn't need to eat or drink except for fun, grows two standard deviations smarter and wiser, and is basically immune to suffering."
That's a bet I'd at least consider taking. Odds are, you might feel that way too.
(If you don't, that's okay, but it means the Cosmic Flipper still isn't offering you enough. What would need to be on the table for you, personally, to actually consider wagering the fate of the universe on a coinflip? What would the Cosmic Flipper have to offer? How much better does the world have to be, in the "heads" case, that you would be tempted?)
Suppose I do take the bet, and get lucky. How do you double that? Now we're talking something on the order of "all animals everywhere also stop suffering" and I don't even know what else.
By the time we get to flipping the coin five, ten, or a hundred times, I literally can't even conceive of what sort of offer it would take to make a 50% chance of imploding utopia sound like a good price to pay. It's incredibly difficult to wrap our brains around what "doubling the value in the world" actually means. And that's just the tip of the iceberg.
We already court apocalypse
The thought experiment gets even more complicated when you factor in existing risks.
If you buy the arguments about threats from artificial superintelligence - which I do, for the record - then our world most likely has only a few years or decades left before we're eaten by an unaligned machine. If you don't buy those arguments, there's still the 1 in 10,000 chance per year that we all nuke ourselves to death (or into the Stone Age), which is similar to the odds that you die this year in a car crash (if you're in the US). Even if humanity never invents another superweapon, there's still the chance that Earth gets hit by a meteor or Mother Nature slaughters our civilization with the next Black Death before we get our collective shit together.
What does it mean to "double the expected value of the universe" given the threat of possible extinction? I genuinely don't know. And we can't just say "well, holding x-risk constant..." because any change to the world that's big enough to double its expected utility is going to massively affect the odds of human extinction.
When it comes to thought experiments like this, we can't just rely on what first pops into our head when we hear the phrase "double expected value." For the bargain to make sense to a true expected-utility maximizer, it has to still sound like a good deal even after all these considerations are factored in.
Everything breaks down at infinity
OK, so maybe it's a good idea to flip the coin once or twice, or even many times. But if you take this bet an infinite number of times, then you're guaranteed to destroy the universe. Right?
Firstly, lots of math breaks down at infinity. Infinity is weird like that. I don't think there exists a value system that can't be tied in knots by some contrived thought experiment involving infinite regression, and even if one existed, I doubt it would be a system I'd want to endorse.
Secondly, and more importantly, I question whether it is possible even in theory to produce infinite expected value. At some point you've created every possible flourishing mind in every conceivable permutation of eudaimonia, satisfaction, and bliss, and the added value of another instance of any of them is basically nil. In reality I would expect to reach a point where the universe is so damn good that there is literally nothing the Cosmic Flipper could offer me that would be worth risking it all.
And given the nature of exponential growth, it probably wouldn't even take that many flips to get to "the universe is approximately perfect". Sounds like a pretty good deal.
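For concreteness, here's a minimal Python sketch of the repeated bet (the 2.02 multiplier is an arbitrary stand-in for "slightly more than double"): the naive expected value keeps climbing with each flip even as the survival probability collapses.

```python
# Repeated cosmic coinflip: each flip has a 50% chance of multiplying
# the world's value by slightly more than 2 (here 2.02, an arbitrary
# stand-in) and a 50% chance of ending everything.
multiplier = 2.02

for n_flips in (1, 5, 10, 100):
    p_survive = 0.5 ** n_flips
    # Naive expected value relative to today's world: (multiplier / 2)^n
    expected_value = (multiplier / 2) ** n_flips
    print(f"{n_flips:>3} flips: P(survive) = {p_survive:.2e}, "
          f"E[value] = {expected_value:.3f}x")
```

After 100 flips the expected value is still growing (about 2.7x here) while the chance that anyone is left to enjoy it is below one in 10^30 — which makes vivid how much work the mean is doing in this thought experiment.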
Conclusion
The point I'm hoping to make is that this coinflip thought experiment suffers from a gap between the mathematical ideal of "maximizing the expected value in the universe" and our intuitions about it.
On a more specific level, I wish people would stop saying "Of course SBF had a terrible understanding of risk, he took EV seriously!" as though SBF's primary failing was being a utilitarian, and not being reckless and hopelessly blinkered about the real-world consequences of his actions.
9 comments
Comments sorted by top scores.
comment by zoop · 2024-11-13T20:24:56.220Z · LW(p) · GW(p)
I think you've made a motte-and-bailey argument:
- Motte: The payoff structure of the cosmic flip/St. Petersburg Paradox applied to the real world is actually much better than double-or-nothing, and therefore you should play the game.
- Bailey: SBF was correct in saying you should play the double-or-nothing St. Petersburg Paradox game.
Your motte is definitely defensible. Obviously, you can alter the payoff structure of the game to a point where you should play it.
That does not mean "there's no real paradox"; it just means you are no longer talking about the paradox. SBF literally said he would take the game in the specific case where the game was double-or-nothing. Totally different!
This ends my issue with your argument, but I'll also share my favorite anti-St. Petersburg Paradox argument since you didn't really touch on any of the issues it connects to. In short: the definition of expected value as the mean outcome is inappropriate in this scenario and we should instead use the median outcome.
This paper makes the argument better than I can if you're curious, but here's my concise summary:
- Mean values are perhaps appropriate if we play the game many (or infinitely many) times. In these situations, through the law of large numbers, the mean outcome of the games played will approach the mean interpretation of expected value.
- For a single play-through (as in the thought experiment) the mean is not appropriate, as the law of large numbers does not apply. Instead, we should value the game by its median outcome: the outcome one should reasonably expect.
- Indeed, if you have people actually play this game, their betting behavior is more consistent with an intuition of median expected value (this is tested in the paper).
- There's an argument that median EV is the better interpretation even when playing multiple times: in these situations you can think of the game as "playing the game multiple times, once." This resolves the paradox in all but the infinite cases.
- If you use the median interpretation of EV for finite trials of the game, there is no paradox.
A personal gripe: I find it more than a little stupid that the "expected value" is a value you don't actually "expect" to observe very frequently when sampling highly skewed distributions.
Mathematicians and economists have taken issue with the mean definition of EV basically as long as it has existed. Regardless of whether you agree with them, it seems pretty obvious to me that it is inappropriate to use the mean to value single-trial outcomes.
So maybe in the real world we should play the game, but I firmly believe we should value the game using medians and not means. Do we get to play the world outcome optimization game multiple/infinite times? Obviously not.
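The mean-versus-median gap described above is easy to see in simulation. Here's an illustrative sketch (my own code, not from the linked paper) of the classic St. Petersburg game, where the pot starts at 2 and doubles for every head before the first tail:

```python
import random
import statistics

def st_petersburg_payout(rng):
    """One play of the St. Petersburg game: the pot starts at 2
    and doubles for every head before the first tail."""
    payout = 2
    while rng.random() < 0.5:
        payout *= 2
    return payout

rng = random.Random(0)  # fixed seed for reproducibility
samples = [st_petersburg_payout(rng) for _ in range(100_000)]

# The theoretical mean is infinite; the sample mean drifts upward
# with sample size and is dominated by rare huge payouts.
print("sample mean:  ", statistics.mean(samples))
# The median — the outcome you should actually "expect" — stays tiny.
print("sample median:", statistics.median(samples))
```

The sample median sits at the bottom of the payout ladder no matter how many games you simulate, while the sample mean is dragged ever upward by a handful of enormous outliers.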
comment by Noosphere89 (sharmake-farah) · 2024-11-13T00:33:50.188Z · LW(p) · GW(p)
Secondly, and more importantly, I question whether it is possible even in theory to produce infinite expected value. At some point you've created every possible flourishing mind in every conceivable permutation of eudaimonia, satisfaction, and bliss, and the added value of another instance of any of them is basically nil. In reality I would expect to reach a point where the universe is so damn good that there is literally nothing the Cosmic Flipper could offer me that would be worth risking it all.
This very much depends on the rate of growth.
For most human beings, this is probably right, because their values have a function that grows slower than logarithmic, which leads to bounds on the utility even assuming infinite consumption.
But it's definitely possible in theory to generate utility functions that have infinite expected utility from infinite consumption.
You are however pointing to something very real here, and that's the fact that utility theory loses a lot of its niceness in the infinite realm, and while there might be something like a utility theory that can handle infinity, it will have to lose a lot of very nice properties that it had in the finite case.
See these 2 posts by Paul Christiano for why:
https://www.lesswrong.com/posts/hbmsW2k9DxED5Z4eJ/impossibility-results-for-unbounded-utilities [LW · GW]
https://www.lesswrong.com/posts/gJxHRxnuFudzBFPuu/better-impossibility-result-for-unbounded-utilities [LW · GW]
↑ comment by Richard_Kennaway · 2024-11-13T18:57:19.907Z · LW(p) · GW(p)
For most human beings, this is probably right, because their values have a function that grows slower than logarithmic, which leads to bounds on the utility even assuming infinite consumption.
Growing slower than logarithmic does not help. Only being bounded in the limit gives you, well, a bound in the limit.
You are however pointing to something very real here, and that's the fact that utility theory loses a lot of its niceness in the infinite realm, and while there might be something like a utility theory that can handle infinity, it will have to lose a lot of very nice properties that it had in the finite case.
"Bounded utility solves none of the problems of unbounded utility." Thus the title of something I'm working on, on and off.
It's not ready yet. For a foretaste, some of the points it will make can be found in an earlier unpublished paper "Unbounded Utility and Axiomatic Foundations", section 3.
The reason that bounded utility does not help is that any problem that arises at infinity will already practically arise at a sufficiently large finite stage. Repeated plays of the finite games discussed in that paper will eventually give you a payoff that has a high probability of being close (in relative terms) to the expected value. But the time it takes for this to happen grows exponentially with the lengths of the individual games. You are unlikely to ever see your theoretically expected value, however long you play. The infinite game is non-ergodic; the game truncated to finitely many steps and finite payoffs is ergodic only on impractical timescales.
Infinitude in problems like these is better understood as an approximation to the finite, rather than the other way round. (There's a blog post by Terry Tao on this theme, but I've lost the reference to it.) The problems at infinity point to problems with the finite.
↑ comment by Noosphere89 (sharmake-farah) · 2024-11-13T19:49:15.225Z · LW(p) · GW(p)
Growing slower than logarithmic does not help. Only being bounded in the limit gives you, well, a bound in the limit.
Thanks for catching that error, I did not realize this.
I think I got it from here:
https://www.lesswrong.com/posts/EhHdZ5yBgEvLLx6Pw/chad-jones-paper-modeling-ai-and-x-risk-vs-growth [LW · GW]
"Bounded utility solves none of the problems of unbounded utility." Thus the title of something I'm working on, on and off.
It's not ready yet. For a foretaste, some of the points it will make can be found in an earlier unpublished paper "Unbounded Utility and Axiomatic Foundations", section 3.
The reason that bounded utility does not help is that any problem that arises at infinity will already practically arise at a sufficiently large finite stage. Repeated plays of the finite games discussed in that paper will eventually give you a payoff that has a high probability of being close (in relative terms) to the expected value. But the time it takes for this to happen grows exponentially with the lengths of the individual games. You are unlikely to ever see your theoretically expected value, however long you play. The infinite game is non-ergodic; the game truncated to finitely many steps and finite payoffs is ergodic only on impractical timescales.
Infinitude in problems like these is better understood as an approximation to the finite, rather than the other way round. (There's a blog post by Terry Tao on this theme, but I've lost the reference to it.) The problems at infinity point to problems with the finite.
I definitely agree that the problems of infinite utilities are approximately preserved by the finitary version of the problem, and while there are situations where you can get niceness assuming utilities are bounded (conditional on giving players exponentially large lifespans), it's not the common or typical case.
Infinity makes things worse in that you no longer get any cases where nice properties like ergodicity or dominance are consistent with other properties, but yeah the finitary version is only a little better.
comment by Charlie Steiner · 2024-11-14T02:28:43.139Z · LW(p) · GW(p)
I think you can do some steelmanning of the anti-flippers with something like Lara Buchak's arguments on risk and rationality. Then you'd be replacing the vague "the utility maximizing policy seems bad" argument with a more concrete "I want to do population ethics over the multiverse" argument.
comment by Thomas Kwa (thomas-kwa) · 2024-11-14T00:09:34.688Z · LW(p) · GW(p)
The thought experiment is not about the idea that your VNM utility could theoretically be doubled, but about rejecting diminishing returns to actual matter and energy in the universe. SBF said he would flip with a 51% chance of doubling the universe's size (or creating a duplicate universe) and a 49% chance of destroying the current universe. Taking this bet requires a stronger commitment to utilitarianism than most people are comfortable with; your utility needs to be linear in matter and energy. You must be the kind of person who would take a 0.001% chance of colonizing the universe over a 100% chance of colonizing merely a thousand galaxies. SBF also said he would flip repeatedly, indicating that he didn't believe in any sort of bound to utility.
This is not necessarily crazy-- I think Nate Soares has a similar belief-- but it's philosophically fraught. You need to contend with the unbounded utility paradoxes, and also philosophical issues: what if consciousness is information patterns that become redundant when duplicated, so that only the first universe "counts" morally?
comment by notfnofn · 2024-11-13T15:48:34.830Z · LW(p) · GW(p)
Are you familiar with Kelly betting? The point of maximizing log expectation instead of pure expectation isn't that happiness grows on a logarithmic scale or whatever; it's for the sake of maximizing long-term expected value. This kills off bets where "0" is on the table (as log(0) is minus infinity). Whether that's appropriate is still an interesting topic for discussion because, as you mentioned, x-risks exist anyway.
Replies from: harfe↑ comment by harfe · 2024-11-13T19:16:00.819Z · LW(p) · GW(p)
it's for the sake of maximizing long-term expected value.
Kelly betting does not maximize long-term expected value in all situations. For example, if some bets are offered only once (or even a finite amount), then you can get better long-term expected utility by sometimes accepting bets with a potential "0"-Utility outcome.
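For anyone who wants the concrete math behind this subthread: under log utility, the optimal fixed fraction of bankroll to wager is the Kelly fraction, and staking everything has an expected log-growth of minus infinity. A minimal illustrative sketch (my own code, not from the thread):

```python
import math

def kelly_fraction(p, b):
    """Optimal fraction of bankroll to wager on a bet that pays
    b-to-1 with win probability p (the Kelly criterion)."""
    return p - (1 - p) / b

def expected_log_growth(f, p, b):
    """Expected log-wealth growth per bet when wagering fraction f."""
    if f >= 1:
        return float("-inf")  # ruin is possible: log(0)
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)

# A 51/49 double-or-nothing bet (b = 1): Kelly stakes only ~2%
# of the bankroll, never all of it.
print(kelly_fraction(0.51, 1))
print(expected_log_growth(0.02, 0.51, 1))  # small but positive
print(expected_log_growth(1.0, 0.51, 1))   # betting everything: -inf
```

Betting the Kelly fraction grows wealth fastest in the long run of repeated bets; betting the full bankroll, as in the cosmic coinflip, is infinitely bad by this criterion no matter how favorable the odds.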
comment by Matthew Roy (matthew-roy) · 2024-11-13T20:04:52.829Z · LW(p) · GW(p)
I'm dead sure you'd need more than 'just more than a doubling' for the payoff to make sense. Let's assume two things.
- Net utility naturally doubles for humans roughly every 300,000 years. (This is deliberately conservative; recent history would suggest something much faster, but the numbers are so stupidly asymmetric that it hardly matters. Homo sapiens have been around about that long, and net utility has doubled at least once in that time.)
- The universe will experience heat death in roughly 10^100 years.
Before you even try to factor in the volatility costs, time value of enjoying that utility, etc. your payoff has to be something like 2^10^95.
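A quick sanity check on that figure, working in logarithms since the number itself is uncomputable (the two inputs are the assumptions stated above):

```python
import math

# Assumptions from the comment above (both rough):
doubling_time_years = 3e5   # net utility doubles every ~300,000 years
heat_death_years = 1e100    # heat death of the universe

doublings = heat_death_years / doubling_time_years
log10_payoff = doublings * math.log10(2)

print(f"doublings before heat death: {doublings:.2e}")
print(f"natural payoff ~ 10^({log10_payoff:.2e})")
```

That works out to roughly 2^(3×10^94) — the same stupidly asymmetric ballpark as the 2^10^95 quoted above.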
Sidenote: that SBF didn't understand time value, volatility premiums, or basic compounding should have been a bigger red flag.