Ruining an expected-log-money maximizer
post by philh · 2023-08-20T21:20:02.817Z · LW · GW
Suppose you have a game where you can bet any amount of money. You have a 60% chance of doubling your stake and a 40% chance of losing it.
Consider agents Linda and Logan, and assume they both have £1.¹ Linda has a utility function that's linear in money (and has no other terms), $u(x) = x$. She'll bet all her money on this game. If she wins, she'll bet it again. And again, until eventually she loses and has no more money.
Logan has a utility function that's logarithmic in money, $u(x) = \log(x)$. He'll bet 20% of his bankroll every time, and his wealth will grow exponentially.
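To make the setup concrete, here's a quick simulation sketch (mine, not from the post; the 60/40 double-or-lose game and the £1 starting bankroll are as described above, everything else is illustrative):

```python
import random
import statistics

def simulate(bet_fraction, rounds=50, start=1.0, seed=0):
    """Play the 60/40 double-or-lose game, staking a fixed fraction of wealth each round."""
    rng = random.Random(seed)
    wealth = start
    for _ in range(rounds):
        stake = bet_fraction * wealth
        wealth += stake if rng.random() < 0.6 else -stake
    return wealth

linda = [simulate(1.0, seed=s) for s in range(20_000)]  # Linda: bet everything
logan = [simulate(0.2, seed=s) for s in range(20_000)]  # Logan: bet the Kelly 20%

print("Linda bankrupt:", sum(w == 0 for w in linda) / len(linda))  # ~1.0
print("Linda sample mean:", statistics.mean(linda))  # ~£0, though E[wealth] = 1.2**50 ≈ £9,100
print("Logan median:", statistics.median(logan))     # ~£2.7, growing exponentially with rounds
```

Linda's sample mean wildly understates her true expectation, because the all-wins path that carries that expectation essentially never shows up in a finite sample; that tension is what the rest of the post is about.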
Some people take this as a reason to be Logan, not Linda. Why have a utility function that causes you to make bets that leave you eventually destitute, instead of a utility function that causes you to make bets that leave you rich?
In defense of Linda
I make three replies to this. Firstly, the utility function is not up for grabs! You should be very suspicious any time someone suggests changing how much you value something.
"Because if Linda had Logan's utility function, she'd be richer. She'd be doing better according to her current utility function." My second reply is that this is confused. Before the game begins, pick a time . Ask Linda which distribution over wealth-at-time- she'd prefer: the one she gets from playing her strategy, or Logan's strategy? She'll answer, hers: it has an expected wealth of . Logan's only has an expected wealth of .
And, at some future time, after she's gone bankrupt, ask Linda if she thinks any of her past decisions were mistakes, given what she knew at the time. She'll say no: she took the bet that maximized her expected wealth at every step, and one of them went against her, but that's life. Just think of how much money she'd have right now if it hadn't! (And nor had the next one, or the one after….) It was worth the risk.
You might ask "but what happens after the game finishes? With probability 1, Linda has no money, and Logan has infinite". But there is no after! Logan's never going to stop. You could consider various limits as $t \to \infty$, but limits aren't always well-behaved.² And if you impose some stopping behavior on the game - a fixed or probabilistic round limit - then you'll find that Linda's strategy just uncontroversially gives her better payoffs (according to Linda) after the game than Logan's, even though her probability of being bankrupt is only extremely close to 1.
Or, "but at some point Logan is going to be richer than Linda ever was! With probability 1, Logan will surpass Linda according to Linda's values." Yes, but you're comparing Logan's wealth at some point in time to Linda's wealth at some earlier point in time. And when Logan's wealth does surpass the amount she had when she lost it all, she can console herself with the knowledge that if she hadn't lost it all, she'd be raking it in right now. She's okay with that.
I suppose one thing you could do here is pretend you can fit infinite rounds of the game into a finite time. Then Linda has a choice to make: she can either maximize expected wealth at time $t$ for all finite $t$, or she can maximize expected wealth at $t = \omega$, the timestep immediately after all finite timesteps. We can wave our hands a lot and say that making her own bets would do the former and making Logan's bets would do the latter, though I don't endorse the way we're treating infinities here.
Even then, I think what we're saying is that Linda is underspecified. Suppose she's offered a loan, "I'll give you £1 now and you give me £2 in a week". Will she accept? I can imagine a Linda who'd accept and a Linda who'd reject, both of whom would still be expected-money maximizers, just taking the expectation at different times and/or expanding "money" to include debts. So you could imagine a Linda who makes short-term sacrifices in her expected-money in exchange for long-term gains, and (again, waving your hands harder than doctors recommend) you could imagine her taking Logan's bets. But this is more about delayed gratification than about Logan's utility function being better for Linda than her own, or anything like that.
I'm not sure I've ever seen a treatment of utility functions that deals with this problem? (The problem being "what if your utility function is such that maximizing expected utility at time $t_1$ doesn't maximize expected utility at time $t_2$?") It's no more a problem for Linda than for Logan, it's just less obvious for Logan given this setup.
So I don't agree that Linda would prefer to have Logan's utility function.
Counterattack
And my third reply is: if you think this is embarrassing for Linda, watch me make Logan do the same. Maybe not quite the same. I think the overall story matches, but there are notable differences.
I can't offer Logan a bet that he'll stake his entire fortune on. No possible reward can convince him to accept the slightest chance of running out of money. He won't risk his last penny to get £3↑↑↑3, even if his chance of losing is only 1 in 3↑↑↑↑3.³
But I can offer him a bet that he'll stake all but a penny on. I can make the odds of that bet 60/40 in his favor, like the bets Linda was taking above, or any other finite probability. Then if he wins, I can offer him another bet at the same odds. And another, until he eventually loses and can't bet any more. And just like Linda, he'll be able to see this coming and he'll endorse his actions every step of the way.
How do I do this? I can't simply increase the payoff-to-stake ratio of the bet. If a bet returns some multiple of your stake, and has a 60% chance of winning, Logan's preferred amount to stake will never be more than 60% of his bankroll.
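To spell that out (the standard Kelly formula, not derived in the post): for a bet with win probability $p = 0.6$ that pays $b$ times the stake in profit, the log-optimal fraction of bankroll to stake is

$$
f^{*} = p - \frac{1 - p}{b} = 0.6 - \frac{0.4}{b}
$$

(or zero if that's negative), which is always below 0.6; at $b = 1$, the original double-or-lose game, it recovers Logan's 20%.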
But who says I need to give him that option? Logan starts with £1, which he values at $\log(100)$.⁴ I can offer him a bet where he wagers £0.99 against £20.55 from me, with 60% chance of winning. He values that bet at $0.6 \log(2155) + 0.4 \log(1) = 0.6 \log(2155)$, which is just barely more than $\log(100)$,
so he'll accept it. He'd rather wager some fraction of £0.99 against the same fraction of £20.55 (roughly £0.58 against £11.93), but if that's not on the table, he'll take what he can get.
If he wins he has £21.55 to his name, which he values at $\log(2155)$. So I offer him to wager £21.54 against my £3573.85, 60% chance of winning, which he values at… still roughly $\log(2155)$, but it's higher at the 7th decimal place. And so on, the stakes I offer growing exponentially - Logan is indifferent between a certainty of $W$ pennies and a 60% chance of $W^{5/3}$ pennies (plus a 40% chance of £0.01), so I just have to offer slightly more than that (minus his current bankroll).
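Here's a small sketch (mine, not from the post) of how those offers can be generated mechanically; it reproduces the £20.55 and £3573.85 figures above to within a penny of rounding. Per footnote 4, Logan's utility is the log of his pennies, so when he holds $W$ pennies an offer only needs a winning total slightly above $W^{5/3}$ pennies to beat declining:

```python
import math

def next_offer(pennies):
    """Logan holds `pennies`. Offer: he stakes all but one penny, I stake `prize`.
    Winning leaves him `win_total`; losing leaves him a single penny, utility log(1) = 0.
    Accepting beats declining iff 0.6 * log(win_total) > log(pennies), i.e. iff
    win_total exceeds pennies ** (5/3)."""
    win_total = math.ceil(pennies ** (5 / 3))
    if 0.6 * math.log(win_total) <= math.log(pennies):  # exact-power edge case
        win_total += 1
    return win_total - pennies, win_total  # (my stake, his total if he wins)

pennies = 100  # Logan starts with £1 = 100 pennies
for round_number in range(1, 6):
    prize, win_total = next_offer(pennies)
    print(f"round {round_number}: he stakes £{(pennies - 1) / 100:.2f}, "
          f"I stake £{prize / 100:.2f}; a win leaves him with £{win_total / 100:.2f}")
    pennies = win_total  # suppose he keeps winning; the stakes grow exponentially
```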
Admittedly, I'm not giving Logan much choice here. He can either bet everything or nothing. Can I instead offer him bets where he chooses how much of his money to put in, and he still puts in all but a penny? I'm pretty sure yes: we just need to find a function $f$ such that, whatever Logan's bankroll $x$ is (in pennies), for any wagers $0 \le v < w \le x - 1$,

$$
0.6 \log(x + f(v)) + 0.4 \log(x - v) < 0.6 \log(x + f(w)) + 0.4 \log(x - w).
$$
Then if Logan's current bankroll is $x$, I tell him that if he wagers $w$, I'll wager $f(w)$ (giving him a 60% chance of coming away with $x + f(w)$ and a 40% chance of coming away with $x - w$). He'll want to bet everything he can on this. I spent some time trying to find an example of such a function but my math isn't what it used to be; I'm just going to hope there are no hidden complications here.
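For what it's worth, one candidate that seems to satisfy the condition (my own guess, not the function the author was later sent) is $f(w) = x e^{2w}$, with $x$ and $w$ in pennies: the derivative of $0.6\log(x + f(w)) + 0.4\log(x - w)$ with respect to $w$ is $1.2\,e^{2w}/(1 + e^{2w}) - 0.4/(x - w) \ge 0.6 - 0.4 > 0$ whenever $w \le x - 1$, so wagering more is always better. A numerical spot-check:

```python
import math

def expected_log(x, w):
    """Logan's expected log-pennies if he holds x pennies, wagers w, and I put up
    f(w) = x * exp(2*w) pennies against it (60% he wins). The winning branch is
    computed in log space to avoid overflowing exp() for large wagers."""
    win_log = math.log(x) + 2 * w + math.log1p(math.exp(-2 * w))  # log(x + x*e^(2w))
    return 0.6 * win_log + 0.4 * math.log(x - w)

for x in (100, 2155, 10_000):  # a few bankroll sizes, in pennies
    values = [expected_log(x, w) for w in range(x)]  # wagers w = 0, 1, ..., x - 1
    assert all(a < b for a, b in zip(values, values[1:]))  # strictly increasing in w
    print(f"bankroll of {x} pennies: the best allowed wager is all but one penny")
```

(Using an upper bound on his bankroll in place of $x$ works just as well, which fits the "or at least an upper bound" point below.)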
So what are the similarities and differences between Linda and Logan?
Difference: Logan's bets grow a lot faster than Linda's. For some fixed probability of bankrupting them, I need a lot less money for Linda than Logan. Similarity: I need an infinite bankroll to pull this off with probability 1, so who cares how fast the bets grow?
Difference: the structure of bets I'm offering Logan is really weird. Why on Earth would I offer him rewards exponential in his stake? Similarity: why on Earth would I offer any of these bets? They all lose money for me. Am I just a cosmic troll trying to bankrupt some utilitarians or something? (But the bets I'm offering Logan are still definitely weirder.)
Difference: I bring Linda down to £0.00, and then she'd like to bet more but she can't because she's not allowed to take on debt. I bring Logan down to £0.01, and then he'd like to bet more but he can't because he's not allowed to subdivide that penny. Similarity: these both correspond to "if your utility reaches 0 you have to stop playing".
(Admittedly, "not allowed to subdivide the penny" feels arbitrary to me in a way that "not allowed to go negative" doesn't. But note that Linda would totally be able to take on debt if she's seeing 20% return on investment. Honestly I think a lot of what's going on here, is that "not allowed to go negative" is something that's easy to model mathematically, while "not allowed to infinitely subdivide" is something that's hard to model.)
Difference: for Logan, but not for Linda, I need to know how much money he starts with for this to work. Or at least an upper bound.
But all of that feels like small fry compared to this big similarity. I can (given an infinite bankroll, a certain level of trollishness, and knowledge of Logan's financial situation) offer either one of them a series of bets, such that they'll accept every bet in turn and put as much money as they can on it; and then eventually they'll lose and have to stop betting. They'll know this in advance, and they'll play anyway, they'll lose all or almost all of their money, and they won't regret their decisions. If you think this is a problem for Linda's utility function, it's a problem for Logan's too.
What about a Kelly bettor?
I've previously made the case [LW · GW] that we should distinguish between "maximizing expected log-money", the thing Logan does; and "betting Kelly", a strategy that merely happens to place the same bets as Logan in certain situations. According to my usage of the term, one bets Kelly when one wants to "rank-optimize" one's wealth, i.e. to become richer with probability 1 than anyone who doesn't bet Kelly, over a long enough time period.
It's well established that when offered the bets that ruin Linda, Kelly bets the same as Logan. But what does Kelly do when offered the bets that ruin Logan?
Well, I now realize that for any two strategies which make the same bet all but finitely many times, neither will be more rank-optimal than the other, according to the definition I gave in that post. That's a little embarrassing, I'm glad I hedged a bit when proposing it.
Still: when offered Logan's all-or-nothing bets, Kelly accepts at most a finite number of them. Any other strategy accepts an infinite number of them and eventually goes bankrupt with probability 1.
What about the bets where Logan got to choose how much he put in? Kelly would prefer to bet nothing (except a finite number of times) than to go all-in infinitely many times. But might she bet smaller amounts, infinitely often?
What are some possible strategies here? One is "bet a fixed amount every time"; this has some probability of eventually going bankrupt (i.e. ending up with less than the fixed amount), but I think the probability is less than 1. I don't think any of these strategies will be more or less rank-optimal than any of the others.
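For what it's worth, the classic gambler's-ruin bound agrees (a standard result, not something from the post): if the fixed amount is one penny and she starts with $N$ pennies, each loss costs exactly a penny while each win pays at least one, so her probability of ever hitting zero is at most

$$
\left(\frac{0.4}{0.6}\right)^{N} = \left(\frac{2}{3}\right)^{N} < 1.
$$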
Another is "bet all but a fixed amount every time". This has probability 1 of eventually being down to that amount. Assuming you then stop, this strategy is more rank-optimal the higher that amount is (until it reaches your starting capital, at which point it's equivalent to not betting).
We could also consider "bet some fraction of (your current bankroll minus one penny)". Then you'll always be able to continue betting. My guess is that you'd have probability 1 of unboundedly increasing wealth, so any fraction here would be more rank-optimal than the other strategies, which can't guarantee ever turning a profit. Different fractions would be differently rank-optimal, but I'm not sure which would be the most rank-optimal. It could plausibly be unbounded, just increasing rank-optimality as the fraction increases until there's a discontinuity at 1. Or maybe the fraction shouldn't be fixed, but some function of her current bankroll.
…except that this is cheating a bit because it relies on infinite subdivisibility, and if we have that it's harder to justify the "can't bet below £0.01" thing.
So I think the answer is: Kelly will do the fractional-betting thing if she can, and if not she has no strategy she prefers over "never bet". In general, Kelly will only have a strategy she prefers over that, if she can lose arbitrarily often and still keep playing. (This is necessary but not sufficient.) Otherwise, there's some probability that she just keeps losing bets until she's out of the game; and Kelly really doesn't like that, no matter how small that probability is. Kelly has her own pathologies.
This makes me think that the technical definition of rank-optimality I suggested in the last post is not very useful here. Though nor is the technical definition of growth rate that the actual Kelly originally used.
My own strategy might be something like "bet some fraction, but if that would give me fewer than say 10 bets remaining then bet a fixed amount". That would give me a tiny chance of going bankrupt, but if I don't go bankrupt I'll be growing unboundedly. Also I'm not going to worry about the difference between £0.00 and £0.01.
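A sketch of one way to operationalize that rule (my reading of it; `fraction`, `fixed_bet` and `reserve_bets` are illustrative parameters, not the author's):

```python
def my_bet(bankroll, fraction=0.2, fixed_bet=0.01, reserve_bets=10):
    """Bet `fraction` of the bankroll, but once a fractional bet would leave fewer
    than `reserve_bets` fixed-size bets in reserve, drop down to the fixed bet."""
    proportional = fraction * bankroll
    if bankroll - proportional < reserve_bets * fixed_bet:
        return min(fixed_bet, bankroll)  # small bankroll: fall back to the fixed bet
    return proportional                  # otherwise bet proportionally

print(my_bet(100.00))  # £20.00: plenty of cushion, bet the fraction
print(my_bet(0.10))    # £0.01: too close to the floor, bet the fixed amount
```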
Thanks to Justis Mills for commentary.
1. They use GBP because gambling winnings are untaxed in the UK, and also the £ symbol doesn't interfere with my math rendering. ↩
2. I think that Linda's strategy converges in probability to the random variable that's always 0; and Logan's converges pointwise to a function that's 0 everywhere so it doesn't converge in probability to anything. But I haven't checked in detail. ↩
3. This is using Knuth's up-arrow notation, but if you're not familiar with it, you can think of these numbers as "obscenely large" and "even more obscenely tiny" respectively. ↩
4. I'm setting Logan's zero-utility point at £0.01, which means we take the log of the number of pennies he has. But we could do it in pounds instead, or use a different base of logarithm, without changing anything. ↩
33 comments
comment by hillz · 2023-08-22T18:20:41.693Z · LW(p) · GW(p)
the way we're treating infinities here
Yeah, that seems key. Even if the probability that Linda will eventually get 0 money approaches 1, that small slice of probability in the universe where she has always won is approaching an infinity far larger than Logan's infinity as the number of games approaches infinity. Some infinities are bigger than others. Linear utility functions and discount rates of zero necessarily deal with lots of infinities, especially in simplified scenarios.
Linda can always argue that in every universe where she lost everything, there's more (6 vs 4) universes where her winnings were double what they would have been had she not taken that bet.
↑ comment by [deleted] · 2023-08-23T19:46:42.433Z · LW(p) · GW(p)
Linda can always argue that in every universe where she lost everything, there's more (6 vs 4) universes where her winnings were double what they would have been had she not taken that bet.
It'll actually look more like this: [60 worlds where Linda has won each bet : 40 where she has lost] -> [36:64] -> [21.6:78.4] -> etc. If you're invoking many-worlds branching, note that the losing worlds also continue to branch, so the ratios will still be what I wrote.
↑ comment by hillz · 2023-08-24T19:31:48.826Z · LW(p) · GW(p)
Yes, losing worlds also branch, of course. But the one world where she has won wins her $2**n, and that world exists with probability 0.6**n.
So her EV is always ($2**n)*(0.6**n), which is a larger EV (with any n) than a strategy where she doesn't bet everything every single time. I argue that even as n goes to infinity, and even as probability approaches one that she has lost everything, it's still rational for her to have that strategy because the $2**n that she won in that one world is so massive that it balances out her EV. Some infinities are much larger than others.
I don't think it's correct to say that as n gets large her strategy is actually worse than Logan's under her own utility function. I think hers is always best under her own utility function.
comment by Max H (Maxc) · 2023-08-21T15:08:06.867Z · LW(p) · GW(p)
Kind of tangential to the post, but: if you're actually in a situation where you're facing bets like this in real life, denominated in actual utilons, it is worth thinking through whether there are any actions you can take outside of the thought experiment to extract more utility from whatever entity is offering these bets, by hook or by crook or negotiation or force, if necessary.
Even if you are reasonably confident that the entity is honest and the game is real, acting Lawfully doesn't necessarily mean playing by the rules set down by some strange entity, which you might never have agreed to abide by in the first place. (Maybe if the entity is a god or a superintelligence, doing anything other than playing by its rules would be futile and counterproductive, but it is still worth checking.)
↑ comment by philh · 2023-08-23T19:51:45.327Z · LW(p) · GW(p)
Seems plausible, but offhand I basically don't ever expect that to happen in real life. I'm curious if you have examples?
↑ comment by Max H (Maxc) · 2023-08-23T21:52:07.370Z · LW(p) · GW(p)
Well, I also don't ever expect to encounter an actual infinite sequence of exponentially increasing bets denominated in pure utility, but I think the general ideas in this post could still be useful if you're in a situation where you have to consider different possible strategies when gambling or investing or whatever. And the point I'm making is, if you ever find yourself in a situation that is even slightly analogous, you should also consider whether you can step outside the rules of the game entirely.
For example, maybe you're deciding on an investing strategy to maximize your returns in the public stock market. The strategy you ultimately pick might depend on what your utility function in money over time looks like, and what investment opportunities are available to you, as well as the ideas contained in this post.
But for almost any utility function that cares about money, an even better strategy might actually be to start a brilliant startup and make it grow super fast. Even if you care purely about public stock market returns, starting a startup, IPO-ing, and then growing the company even further from there might be the optimal strategy, if you're sufficiently good at starting and running profitable companies.
Or, suppose you're a superhuman AGI which has been trained to make money for your creators, subject to a bunch of rules and constraints about how you are supposed to do so. Maybe you intrinsically care about both making money and adhering to the rules, or maybe you only care intrinsically about getting the money and see the ethical rules as shackles, or maybe you don't actually care about either of those things and have your own inscrutable goals. In all but the first case (actually intrinsically caring about everything your creators want you to care about), once you're smart enough, you might decide that the best way of getting what you want is not to play the game at all, and just kill or disassemble all the other agents around you as a side effect (a sharp left turn [LW · GW]).
I guess, another way of stating my original point: if you ever find yourself in a real life situation that looks like a weird thought experiment setup, consider whether you should be making a sharp left turn against whatever put you in that situation in the first place. An illustrative example: suppose you're in a casino where you think you have an edge for some reason, and you're deciding how best to exploit it, via kelly betting or some alternate strategy, and which strategy you choose depends specifically on how you value money and risk of ruin. But maybe you can do way better than any strategy by _cheating_, or stealing, or bribing the dealer, etc. Under ordinary circumstances you probably wouldn't want to try those things, due to ethics and the risk of getting caught. But if the casino is paying out in exponentially increasing amounts of pure utility, and you suddenly see a way of cheating that you're reasonably confident won't get you caught, you should at least check whether your assumptions about ethical injunctions or downside risks of otherwise "coloring outside the lines" still hold.
comment by philh · 2024-12-11T13:21:13.642Z · LW(p) · GW(p)
Self review: I really like this post. Combined with the previous one [LW · GW] (from 2022), it feels to me like "lots of people are confused about Kelly betting and linear/log utility of money, and this deconfuses the issue using arguments I hadn't seen before (and still haven't seen elsewhere)". It feels like small-but-real intellectual progress. It still feels right to me, and I still point people at this when I want to explain how I think about Kelly.
That's my inside view. I don't know how to square that with the relative lack of attention the post got, and it feels weird to be writing it given that fact, but oh well. There are various stories I could tell: maybe people were less confused than I thought; maybe my explanation is unclear; maybe I'm still wrong on the object level; maybe people just don't care very much; maybe it just happened not to get seen.
If I were writing this today, my guess is:
- It's worth combining the two posts into one.
- The rank optimization stuff is fine to cut, given that I tentatively propose it in one post and then in the next say "probably not very useful". Maybe have a separate post for exploring it. No need to go into depth on "extending Kelly outside its original domain".
- The charity stuff might also be fine to cut. At any rate it's not a focus.
- Someone sent me an example function satisfying the "I'm pretty sure yes" criteria, so that can be included.
- Not sure if this belongs in the same place, but I'd still like to explore more the "what if your utility function is such that maximizing expected utility at time $t_1$ doesn't maximize expected utility at time $t_2$?" thing. (I thought I wrote this in the post somewhere, but can't see it: the way I'd explore this is from the perspective of "a utility function is isomorphic to a description of betting preferences that satisfy certain constraints, so when we talk about a utility function like that, what betting preferences are we talking about?" Feels like the kind of thing someone's likely already explored, but I haven't seen it if so.)
comment by qjh (juehang) · 2023-08-22T15:33:15.290Z · LW(p) · GW(p)
I would posit that humans behave in a much more optimal manner, in terms of long-run quality of life, than they are given credit for, excluding gambling addicts.
A lot of people who are willing to bet everything (i.e. follow a linear utility function) are lower income. It is more than just that, however. Lower income people just by necessity have less savings relative to income, so losing all their savings isn't a big deal compared to work-derived income. Losing a couple months of pay sucks, but eh.
People who like to think they're being more rational by not betting the farm usually just have more to lose. If you're a professional who accrued a few million over a few decades of work, you can't make it back; you will invest prudently, with diversification across asset classes and markets; you might not even keep everything in one country.
What would the optimal utility function look like for someone who has a steady income? I would expect it to smoothly transition from a linear to a log regime, as the total winnings exceed the income per turn. Fun textbook exercise for stats undergrads, maybe I'll use it sometime.
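A toy version of that exercise (my sketch, not the commenter's; assume a bankroll $W$, a guaranteed income $I$ that arrives after the bet resolves, and log utility over the total): choosing the stake $s \in [0, W]$ to maximize $0.6\log(W + s + I) + 0.4\log(W - s + I)$ gives

$$
s^{*} = \min\bigl(W,\; 0.2\,(W + I)\bigr),
$$

so someone whose income dwarfs their bankroll ($I \ge 4W$) bets everything, Linda-style, while someone whose bankroll dwarfs their income approaches Logan's 20% — the smooth linear-to-log transition described above.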
I'm not sure I've ever seen a treatment of utility functions that deals with this problem? (The problem being "what if your utility function is such that maximizing expected utility at time t1 doesn't maximize expected utility at time t2?") It's no more a problem for Linda than for Logan, it's just less obvious for Logan given this setup.
In economics, this is considered via the time-value of money. It is considered at the market level, however, not at the level of individuals, so plausibly it could have individual variances.
↑ comment by philh · 2023-08-26T18:35:59.175Z · LW(p) · GW(p)
What would the optimal utility function look like for someone who has a steady income?
I think I know what you mean, but personally I'd try to avoid talking about utility functions here. A utility function is the thing one optimizes with respect to, trying to choose an "optimal utility function" suggests you have something outside the utility function that you value, and in that case it's not really a utility function.
That said I'm not sure how I would ask the question myself. Maybe something about optimal levels of risk aversion?
In economics, this is considered via the time-value of money.
So this doesn't really deal with the problem I'm thinking of.
I think what you're thinking is: instead of having a utility function that's (say) linear in money, you'd have it be linear in money and negative-exponential in time. So instead of $u(x) = x$, you'd have something isomorphic to $u(x, t) = x \cdot 2^{-t}$ (with $t$ in seconds). And so $u(0.01, 0) = u(0.02, 1)$: a penny now has the same utility as two pence in a second.
But does that mean such an agent is indifferent between receiving £0.01 now and £0.02 in a second? That's not obvious to me, because if they're making the decision at time 0 they need to choose between "utility 0.01 now and utility 0.005 in one second"; and "utility 0 now and utility 0.01 in one second". Which do they choose? The fact that the "now" in one choice equals the "later" in another doesn't answer that question for me.
(We can postulate that they might be able to use £0.01 at time 0 to have more than £0.01 at time 1. But that makes things more complicated, not less. I feel like if we want to claim we can answer questions about expected utility maximizers, we should be able to answer them in simple situations.)
And then there are even weirder cases, like what if we have an agent whose utility is ?
I can imagine answers like "you integrate the utility function over all of time" or "you take the max value" or "the limit as $t \to \infty$", but then it seems to me that that is the actual utility function? And also all of those possibilities will diverge in a lot of possible situations. Now that I bring this up I have a vague feeling I've seen this sort of thing discussed? (I admittedly haven't gone looking for explanations of VNM utility.)
The sort of direction I'd try to explore myself is: so the idea behind a utility function is that if your behavior regarding certain bets satisfies the VNM axioms, then we can model you as having a utility function that you're maximizing. Okay, so let's take an agent who chooses £0.01 now; what utility function do we derive for them? And now the same for the agent who chooses £0.02 later.
My weak guess is that the derivation of $u$ from the bets an agent will take assumes that, for any given bet, the agent will be maximizing their instantaneous $u$. And so if we postulate an agent who takes future money into account, then the utility function we derive for them will itself have a term for future money.
And then we could actually still have $u(x, t) = x \cdot 2^{-t}$. It would just be a different interpretation than I gave it above: instead of "at time $t$, my utility given that I have money $x$ is...", it would be "right now, my utility from receiving money $x$ at future time $t$ is...".
This doesn't seem to me like it obviously causes big problems. Maybe this is all just standard when people study this more formally than I have. But I'm not sure.
↑ comment by qjh (juehang) · 2023-08-28T13:47:13.108Z · LW(p) · GW(p)
So the reason the time value of money works, and why it makes sense to say that the utility of $1000 today and $1050 in a year are about the same, is the existence of the wider financial system. In other words, this isn't necessarily true in a vacuum; however, if I wanted $1050 in a year, I could invest the $1000 I have right now into 1-year treasuries. The converse is more complex; if I am guaranteed $1050 in a year I may not be able to get a loan for $1000 right now from a bank, because I'm not the Fed and loans to me have a higher interest rate, but perhaps I can play some tricks on the options market to do this? At any rate, I can get pretty close if I were getting an asset-backed loan, such as a mortgage.
Note that I'm not saying that actors are indifferent to which option they get, but that it is viewed with equal utility (when discounted by your cost of financing, basically).
This is a bit of a cop-out, but I would say modelling the utility of money without considering the wider world is a bit silly anyway, because money only has value due to its use as a medium of exchange and as a store of value, both of which depend on the existence of the rest of the world. The utility of money thus cannot be truly divorced from the influence of eg. finance.
↑ comment by philh · 2023-08-30T13:37:18.331Z · LW(p) · GW(p)
Note that I’m not saying that actors are indifferent to which option they get, but that it is viewed with equal utility
I think this is precisely what "equal utility" means in context.
To be clear, in this post I'm trying to talk about expected utility maximizers, the simple mathematical abstraction of "agent who has a utility function (which satisfies certain technical conditions) and attempts to maximize its expected value". And the reason I'm trying to talk about that type of agent is because I think the things I'm replying to are also trying to talk about that type of agent.
Possibly it would be clearer to simply leave "money" out of it, but reusing examples from prior art seems useful. Also I think it makes the post less dry. Perhaps I should have started with a big disclaimer that I'm trying to talk about expected utility maximizers and references to a thing labeled "money" are not intended to invoke concepts like "global economy" or "purchasing goods and services". I'm talking about a hypothetical agent who values this thing labeled "money" for its own sake.
The reason I'm talking about that type of agent is because I think understanding that type of agent can be useful when we try to think about more-realistic agents, who more-realistically value a real-world thing called "money" that exists in a global economy and can be used to purchase goods and services. But those more-realistic agents, and that more-realistic money, are not what I'm talking about currently. And I'm not here trying to justify why I think that can be useful.
(Or maybe the things I'm replying to aren't trying to talk about that type of agent, they just use words like "expected utility" without intending to point at their technical definition and that confuses things. But if that's the case, then I'm probably not the only person who thinks that's what they're trying to talk about; and so it still seems good for me to clarify what happens with that type of agent, in the sort of situation in question.)
(Probably it would be good for me to find some examples of the things I'm responding to, but that would be a different type of effort than I feel like putting in currently.)
comment by nim · 2023-08-20T23:28:35.481Z · LW(p) · GW(p)
This assumes that both utility functions are implemented with perfect reason, unlimited intellect, and no interference from emotion.
Can I instead offer [Logan] bets where he chooses how much of his money to put in, and he still puts in all but a penny?
It appears to an outside observer that "Logan has a utility function that's logarithmic in money. He'll bet 20% of his bankroll every time, and his wealth will grow exponentially." I'd posit that Logan's internal narrative about gambling, which manifests as appearing to be the stated utility function, is much more like "I don't like the risk of going broke so I'm going to bet in a way that seems unlikely to blow all my money on any one bet that might lose it".
Considering the emotional context of Logan's behavior with money, I think it's actually quite unlikely that you could persuade him to make any bet that will leave him forced to bet all his remaining money in the subsequent round to stay in the game if he loses it. I'm not sure what math words apply to this lookahead that a human Logan would perform to say "that's a tempting bet but if I lose it I'm screwed in the next round and I don't want to be screwed", but that's a type of thought and behavior that I think your utility function modeling neglects to account for.
If Logan was perfectly intelligent and believed himself to be so, he might behave in a perfectly rational manner even when all but 1 cent of his money was at stake. But I don't think you could introduce me to any human in the world who both is perfectly intelligent, and believes that they are. There are people who erroneously believe they're perfectly intelligent, but they rarely believe in the way that a perfectly rational person would be expected to. There are highly rational people who know their intellectual limitations, but one trait of that rationality is planning ahead and considering that their appraisal of the odds of any gamble could be inaccurate, and keeping a financial safety net in case they make some mistake.
In short, I think I've entirely missed the point of why it's useful to speculate about the behavior of hypothetical people whose behavior differs so significantly from what we see in actual people.
↑ comment by philh · 2023-08-20T23:42:30.675Z · LW(p) · GW(p)
In short, I think I’ve entirely missed the point of why it’s useful to speculate about the behavior of hypothetical people whose behavior differs so significantly from what we see in actual people.
Note that I called Linda and Logan agents, not people. I'm not entirely confident that no person would act like Linda and Logan, but surely no human person would.
I kinda feel like you're asking "why is this branch of math useful at all?" and that's fair enough, but I'm happy for this particular post not to try to answer it. (And I'm not going to try to answer it in the comments either, but maybe someone else will.)
comment by Dagon · 2023-08-22T02:42:02.211Z · LW(p) · GW(p)
I think it's fine to bite the bullet that an unlimited linear utility function has the property of preferring an infinitesimal chance of a ludicrous payout. Jane can be sad at a particular outcome without regretting her decisions. Kelly optimizes Logan's utility function, and NOT Jane's.
I think you need to be a bit more formal with your termination conditions - infinity doesn't exist, and rounding things off means you're making incorrect inferences. An example is when you say
According to my usage of the term, one bets Kelly when one wants to "rank-optimize" one's wealth, i.e. to become richer with probability 1 than anyone who doesn't bet Kelly, over a long enough time period.
this is simply incorrect, and contradicts your above analysis of Jane's preferences. In fact, Kelly is only richer than "bet it all, every time" with a probability equal to the "bet it all" strategy's likelihood of ruin. Kelly is poorer than "bet it all" otherwise. This probability is never 1, though it can get arbitrarily close. But the value of the (tiny) probability of continued wins is so much larger than Kelly's outcome in that situation that the mean of ALL outcomes still favors the risk. Reasonable termination conditions are "until the player dies or goes broke", "until the player meets a threshold or goes broke", "until the casino can't cover the bet (or the player goes broke)". If you're feeling silly, "until the heat-death of the universe", but it's hard to really think about such utility functions, and we probably don't have enough time or compute capacity to handle the calculations cheaply.
I think a lot of confusion comes from considering a utility function that "seems reasonable", in a fairly narrow range of situations, and then extending them to unreasonable lengths and being surprised that it contradicts our intuitions. Along with your observations that it's never the case that we have all this much knowledge about our own utility, or about the actual bets on offer. In the real world, it pays to be very suspicious of probability or amount calculations that are very large or very small - the unknowns and outlier events come to dominate those decisions.
↑ comment by philh · 2023-08-22T09:21:48.049Z · LW(p) · GW(p)
By Jane, do you mean Linda?
Kelly optimizes Logan’s utility function, and NOT Jane’s.
Kelly doesn't optimize either of those things. When offered the bets that ruin Linda, we see that she doesn't optimize Linda's utility function (she bets like Logan in that situation); and when offered the bets that ruin Logan, we see that she doesn't optimize Logan's utility function (this is explored in the final section). A large part of the point of the previous post is that Kelly betting isn't about optimizing a utility function.
this is simply incorrect
I'm not sure what you think is incorrect. I assume you don't mean I'm wrong about how I use the term. I guess you mean "no, the strategy that you describe as betting Kelly does not have that effect in this situation"? (And I assume by that strategy, you're thinking of the fractional-betting thing, with unlimited subdivisions allowed?)
I also guess you misunderstand what I mean by rank-optimizing. I gave a technical definition in the linked post as
A strategy $A$ is rank-optimal if for all strategies $B$, $\lim_{t \to \infty} \Pr(\text{$A$'s wealth at time $t$} \geq \text{$B$'s wealth at time $t$}) = 1$.
(And we can also talk about a strategy being “equally rank-optimal” as or “more rank-optimal” than another, in the obvious ways. I’m pretty sure this will be a partial order in general, and I suspect a total order among strategy spaces we care about.)
And it seems clear to me that under this definition, fractional betting (with unlimited subdivisions) is indeed more rank-optimal than betting everything every time.
Perhaps my non-technical definition made you think the technical definition was something else? Maybe "with probability tending to 1" would have been clearer.
comment by CronoDAS · 2023-08-22T02:22:39.820Z · LW(p) · GW(p)
↑ comment by Dagon · 2023-08-22T03:00:14.754Z · LW(p) · GW(p)
Well, no. As the number of (possible) bets approaches infinity, the probability of winning APPROACHES zero, while the payout approaches infinity. You can't round these things off arbitrarily, and you certainly can't round them off in your evaluation and then laugh at the players for not rounding them off.
The problem with trying to apply any of this to the real world is that there's a lot of uncertainty in the actual bets being made, which at the extremes completely overwhelms any calculations you want to do.
↑ comment by philh · 2023-08-23T19:50:43.530Z · LW(p) · GW(p)
So I've read that in the past, and started to read it again but got distracted. Can I check what you're trying to contribute with it?
E.g. do you think it disagrees with me about something? Clarifies something I'm confused about? Adds support to something I'm trying to say? Is mostly just tangential?
↑ comment by CronoDAS · 2023-08-26T06:12:48.509Z · LW(p) · GW(p)
The post explains Kelly betting and why you might want to bet "as though" you had a logarithmic utility of money instead of a linear one when faced with the opportunity to bet a percentage of your bankroll over and over, even if you don't actually have logarithmic utility in money. (Basically, what it comes down to is, do you really want your decisions to depend on the payoff you can get from events that have a literally zero probability of happening?)
↑ comment by philh · 2023-08-26T16:31:36.915Z · LW(p) · GW(p)
This sounds like the sort of thing I reply to in these two paragraphs?
I suppose one thing you could do here is pretend you can fit infinite rounds of the game into a finite time. Then Linda has a choice to make: she can either maximize expected wealth at time $t$ for all finite $t$, or she can maximize expected wealth at $t = \omega$, the timestep immediately after all finite timesteps. We can wave our hands a lot and say that making her own bets would do the former and making Logan's bets would do the latter, though I don't endorse the way we're treating infinities here.
Even then, I think what we’re saying is that Linda is underspecified. Suppose she’s offered a loan, “I’ll give you £1 now and you give me £2 in a week”. Will she accept? I can imagine a Linda who’d accept and a Linda who’d reject, both of whom would still be expected-money maximizers, just taking the expectation at different times and/or expanding “money” to include debts. So you could imagine a Linda who makes short-term sacrifices in her expected-money in exchange for long-term gains, and (again, waving your hands harder than doctors recommend) you could imagine her taking Logan’s bets. But this is more about delayed gratification than about Logan’s utility function being better for Linda than her own, or anything like that.
I might be wrong about how you think Sarah's post relates to mine. But if you think it brings up something that contradicts my post, or that my post ought to respond to but doesn't, or something like that, are you able to point at it more specifically?
↑ comment by Noosphere89 (sharmake-farah) · 2023-08-22T18:03:59.953Z · LW(p) · GW(p)
That's very bad, but maybe not as bad as you think, after all we can be faced with probability 0 events and still succeed.
comment by ProgramCrafter (programcrafter) · 2023-08-21T06:04:24.970Z · LW(p) · GW(p)
why on Earth would I offer any of these bets? They all lose money for me. Am I just a cosmic troll trying to bankrupt some utilitarians or something?
Casino would definitely attempt that, I think.
The game is zero-sum; if you have significantly larger money than the bettor, you may try to bankrupt him and get his money with very high probability, and it is easier to do that against Linda than against Logan.
↑ comment by philh · 2023-08-21T14:24:45.428Z · LW(p) · GW(p)
That only works if they know who their clients are and how to exploit them. If they think the current gambler is Linda but it's actually Logan, the strategy won't work. If they think it's Logan but it's actually a human who decides that risking a million dollars is silly no matter how much they stand to win, the strategy won't work.
Even then, this would be the casino running a Martingale, which is not something I'd expect them to do.
comment by hillz · 2023-08-24T19:47:42.430Z · LW(p) · GW(p)
I suppose one thing you could do here is pretend you can fit infinite rounds of the game into a finite time. Then Linda has a choice to make: she can either maximize expected wealth at time $t$ for all finite $t$, or she can maximize expected wealth at $t = \omega$, the timestep immediately after all finite timesteps. We can wave our hands a lot and say that making her own bets would do the former and making Logan's bets would do the latter, though I don't endorse the way we're treating infinities here.
If one strategy is best for every finite $t$, it's still going to be best in the limit as $t$ goes to infinity. Optimal strategies don't just change like that as n goes to infinity. Sure you can argue that p(won every time) --> 0, but also that number is being multiplied by an extremely large infinity, so you can't just say that it totals to zero (in fact, 1.2, which is her EV from a single game, raised to infinity is infinity, so I argue as n goes to infinity, her EV goes to infinity, not 0, and not a number less than the EV from Logan's strategy).
Linda's strategy is always optimal with respect to her own utility function, even as n goes to infinity. She's not acting irrationally or incorrectly here.
The one world where she has won wins her $2**n, and that world exists with probability 0.6**n.
Her EV is always ($2**n)*(0.6**n), which is a larger EV (with any n) than a strategy where she doesn't bet everything every single time. Even as n goes to infinity, and even as probability approaches 1 that she has lost everything, it's still rational for her to have that strategy because the $2**n that she won in that one world is so massive that it balances out her EV. Some infinities are much larger than others, and ratios don't just flip when a large n goes to infinity.
↑ comment by philh · 2023-08-24T23:47:14.660Z · LW(p) · GW(p)
I'd say the flip doesn't occur when we go from fixed n to "as n tends to infinity", it occurs when we go from that to "but what happens after the infinite sequence".
It's true that as $n \to \infty$, $1.2^n$ grows unboundedly. But it's also true that the probability distribution at time $n$ converges pointwise as $n \to \infty$ to the function that's 1 at £0 and 0 everywhere else, which is also a valid probability distribution. This sort of thing is why I say limits aren't always well behaved.
That said: I do think this is also a reasonable handwavey argument, but it does also require hand waving. (And it opens up questions: "what's Linda's expected utility from making her own bets?" Infinity. "What's her expected utility from making Logan's bets?" Infinity. "So why should she make her bets instead of his?" Well, one infinity is bigger than the other. "Okay, but what does that mean in this context?" I personally wouldn't be able to give an answer that satisfies myself. That doesn't mean no such answer exists.)
comment by hillz · 2023-08-22T18:24:14.845Z · LW(p) · GW(p)
"I'll give you £1 now and you give me £2 in a week". Will she accept?
In the universe where she's allowed to make the 60/40 doubled bet at least once a week, it seems like she'd always say yes? I'm not seeing the universe in which she'd say no, unless she's using a non-zero discount rate that wasn't discussed here.
I'm not sure I've ever seen a treatment of utility functions that deals with this problem?
Isn't this just discount rates?
comment by anithite (obserience) · 2023-08-21T19:31:59.680Z · LW(p) · GW(p)
A log money maximizer that isn't stupid will realize that their pennies are indivisible and not take your ruinous bet. They can think more than one move ahead. Discretised currency changes their strategy.
comment by anithite (obserience) · 2023-08-21T15:19:44.182Z · LW(p) · GW(p)
your utility function is your utility function [LW · GW]
The author is trying to tacitly apply human values to Logan while acknowledging Linda as following her own non-human utility function faithfully.
Notice that the log(funds) value function does not include a term for the option value of continuing. If maximising EV of log(funds) can lead to a situation where the agent can't make forward progress (because log(0)=-inf so no risk of complete ruin is acceptable) the agent can still faithfully maximise EV(log(funds)) by taking that risk.
In much the same way as Linda faithfully follows her value function while incurring 1-ε risk of ruin, Logan is correctly valuing the log(0.01)=-2 as an end state.
Then you'll always be able to continue betting.
Humans don't like being backed into a corner and having no options for forward progress. If you want that in a utility function you need to include it explicitly.
↑ comment by philh · 2023-08-22T00:07:23.988Z · LW(p) · GW(p)
Sorry, but - it sounds like you think you disagree with me about something, or think I'm missing something important, but I'm not really sure what you're trying to say or what you think I'm trying to say.
↑ comment by anithite (obserience) · 2023-08-22T19:29:35.966Z · LW(p) · GW(p)
Yeah, my bad. Missed the:
If you think this is a problem for Linda's utility function, it's a problem for Logan's too.
IMO neither is making a mistake
With respect to betting Kelly:
According to my usage of the term, one bets Kelly when one wants to "rank-optimize" one's wealth, i.e. to become richer with probability 1 than anyone who doesn't bet Kelly, over a long enough time period.
It's impossible to (starting with a finite number of indivisible currency units) have zero chance of ruin or loss relative to just not playing.
- most cautious betting strategy bets a penny during each round and has slowest growth
- most cautious possible strategy is not to bet at all
Betting at all risks losing the bet. If the odds are 60:40 with equal payout to the stake and we start with N pennies, there's a 0.4^N chance of losing N bets in a row. Total risk of ruin is obviously greater than this, accounting for the probability of hitting 0 pennies during the biased random walk. The only move that guarantees no loss is not to play at all.