Generalize Kelly to Account for # Iterations?

post by abramdemski · 2020-11-02T16:36:25.699Z · LW · GW · 2 comments

This is a question post.

Contents

  Answers
    21 gwern
    8 Dagon
    2 arielroth
2 comments

On my post weird things about money [LW · GW], Bunthut made the following comment [LW(p) · GW(p)]:

I think the interesting question is what to do when you expect many more, but only finitely many rounds. It seems like Kelly should somehow gradually transition, until it recommends normal utility maximization in the case of only a single round happening ever. Log utility doesn't do this. I'm not sure I have anything that does though, so maybe it's unfair to ask it from you, but still it seems like a core part of the idea, that the Kelly strategy comes from the compounding, is lost. 

Kelly betting seems somehow less justified when we're not doing it a bunch of times. If I were making bets left and right, I would feel more inclined to use Kelly; I could visualize the growth-maximizing behavior, and know that if I trusted my own probability assessments, I'd see that growth curve with high probability.

Several prediction markets have recently offered a bet at around 62¢ which superforecasters assign around 85% probability. This resulted in a rare temptation for me to Kelly bet. Calculating the Kelly formula, I found that I was supposed to put 60% of my bankroll on this.
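
The 60% figure can be reproduced with a minimal sketch. It assumes a binary contract that costs `price` and pays out 1 if the event happens, and it ignores fees, spreads, and any uncertainty about the 85% estimate; the function name is hypothetical, chosen for illustration.

```python
def kelly_fraction_binary_contract(price: float, p_true: float) -> float:
    """Kelly fraction of bankroll to spend on a contract costing `price`
    that pays 1 with probability `p_true` (assumed model, no fees)."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    # Net odds are b = (1 - price) / price, so the usual f* = p - (1 - p) / b
    # simplifies to (p - price) / (1 - price).
    fraction = (p_true - price) / (1.0 - price)
    # A negative value means there is no edge, so bet nothing.
    return max(0.0, fraction)


if __name__ == "__main__":
    f = kelly_fraction_binary_contract(price=0.62, p_true=0.85)
    print(round(f, 3))  # 0.605
```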

Now, 60% of my savings seems like a lot. But am I really more risk-averse than Kelly? It struck me that if I were to do this sort of thing all the time, I would feel more strongly justified using Kelly.

Bunthut is suggesting a generalization of Kelly which accounts for the number of times we expect to iterate investment. In Bunthut's suggestion, we would get less risk-averse as the number of iterations dropped, approaching expectation maximization. This would reflect the idea that the Kelly criterion arises because of long-term performance over many iterations, and normal expectation maximization is the right thing to do in single-shot scenarios.

But I sort of suspect this "origin of Kelly" story is wrong. So I'm also interested in number-iteration formulas which reach different conclusions.

The obvious route is to modulate Kelly by the probability that the result will be close to the median case. With arbitrarily many iterations, we are virtually certain that the fraction of bets which pay out approaches their probabilities of paying out, which is the classic argument in favor of Kelly. But with fewer iterations, we are less sure. So, how might one use that to modulate betting behavior?
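
One way to make "less sure" concrete is to compute, for a given number of rounds, the exact probability that a Kelly bettor ends up ahead of a more aggressive bettor. The sketch below assumes a repeated even-odds bet with a fixed win probability and two bettors who each stake a fixed fraction of their bankroll every round; the function names and the example fractions are illustrative assumptions, not something from the discussion.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def prob_first_ahead(f_a: float, f_b: float, p_win: float, n_rounds: int) -> float:
    """Probability that fixed-fraction bettor A finishes strictly ahead of
    bettor B after n_rounds even-odds bets (fractions must be below 1)."""
    total = 0.0
    for wins in range(n_rounds + 1):
        losses = n_rounds - wins
        # Log of final wealth relative to starting wealth for each bettor.
        log_a = wins * math.log(1.0 + f_a) + losses * math.log(1.0 - f_a)
        log_b = wins * math.log(1.0 + f_b) + losses * math.log(1.0 - f_b)
        if log_a > log_b:
            total += math.exp(log_binom_pmf(wins, n_rounds, p_win))
    return total


if __name__ == "__main__":
    # Kelly fraction for p = 0.6 at even odds is 0.2; compare with betting 0.5.
    for n in (1, 10, 100, 1000):
        print(n, prob_first_ahead(0.2, 0.5, 0.6, n))
```

With a single round the Kelly bettor is ahead only when the bet loses, so the probability is just the loss probability; it climbs toward 1 as the number of rounds grows, which is the "virtually certain" regime of the classic argument.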

I suggest explicitly stepping outside of an expected-utility framework here. The classic justification for Kelly is very divorced from expected utility, so I doubt you're going to find a really appealing generalization via an expected-utility route.

Answers

answer by gwern · 2020-11-02T18:28:36.368Z · LW(p) · GW(p)

I suggest explicitly stepping outside of an expected-utility framework here.

EV seems fine. You just need to treat it as the multi-stage decision problem it is, and solve the MDP/POMDP. One of the points of my Kelly coin-flip exercises is that the longer the horizon, and the closer you are to the median path, the more Kelly-like the optimal decisions look. But the optimal choices look very unKelly-like in two situations. One is as you approach boundaries like the winnings cap: you 'coast in', betting much less than the naive Kelly calculation would suggest, to 'lock in' hitting the cap. The other is when you are far behind and start to run out of turns: since you won't lose much if you go bankrupt, and the opportunity cost decreases the closer you get to the end of the game, greedy +EV maximization becomes more and more optimal, so you engage in wild 'overbetting' from the KC perspective, which is unaware the game is about to end.
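
A minimal sketch of this kind of finite-horizon problem, solved by backward induction over whole-dollar bets. The win probability, the cap, and the objective (expected final wealth, capped) are assumptions picked for illustration; this is not the code from the coin-flip exercises.

```python
from functools import lru_cache

P_WIN = 0.6  # assumed probability that the coin favors the bettor
CAP = 50     # assumed cap on winnings, in whole dollars


@lru_cache(maxsize=None)
def solve(wealth: int, rounds_left: int):
    """Return (expected capped final wealth, optimal bet) for this state."""
    # Terminal states: out of rounds, bankrupt, or already at the cap.
    if rounds_left == 0 or wealth == 0 or wealth >= CAP:
        return float(min(wealth, CAP)), 0
    best_value, best_bet = -1.0, 0
    for bet in range(wealth + 1):
        win_value, _ = solve(min(wealth + bet, CAP), rounds_left - 1)
        lose_value, _ = solve(wealth - bet, rounds_left - 1)
        ev = P_WIN * win_value + (1.0 - P_WIN) * lose_value
        # The small tolerance keeps the smallest bet among near-ties.
        if ev > best_value + 1e-12:
            best_value, best_bet = ev, bet
    return best_value, best_bet


if __name__ == "__main__":
    print(solve(10, 1))   # last round, far from the cap
    print(solve(45, 1))   # last round, close to the cap
    print(solve(10, 20))  # longer horizon; compare with the Kelly bet of 2
```

In the last round with wealth far below the cap, the solver bets everything (pure EV maximization); close to the cap it bets only the amount needed to reach the cap, which is the 'coast in' behavior described above.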

comment by abramdemski · 2020-11-03T14:40:33.892Z · LW(p) · GW(p)

Ah, yeah, this looks pretty close to what I was looking for.

OK, so if I'm understanding correctly, the basic idea is EV maximization with a cap on total possible winnings? (Which makes sense -- there's only ever so much money to win.)

So is the claim that this approaches Kelly in the limit of simultaneously increasing cap and horizon?

Replies from: gwern
comment by gwern · 2020-11-06T00:45:16.439Z · LW(p) · GW(p)

Yes, for some classes of games in some sense... MDP/POMDPs are a very general setting so I don't expect any helpful simple exact answers (although to my surprise there were for this specific game), so I just have qualitative observations that it seems like when you have quasi-investment-like games like the coin-flip game, the longer they run and the higher the cap is, the more the exact optimal policy looks like the Kelly policy because the less you worry about bankruptcy & the glide-in period gets relatively smaller.

I suspect that if the winnings were not end-loaded and you could earn utility in each period, it might look somewhat less Kelly, but I have not tried that in the coin-flip game.

answer by Dagon · 2020-11-02T18:45:31.361Z · LW(p) · GW(p)

The Kelly criterion, as a bet-sizing optimum, makes a few assumptions, which are not true in most humans.

  • Future bets will be available, but limited by the results of the currently-considered wager.  That is, there is a bankroll which can grow and shrink, but if it hits zero, the model ends.  Kelly phrases this requirement as "the possibility of reinvestment".
  • Utility of money is logarithmic in the ranges under consideration (that is, you're considering lifetime resources, not just the amount in your pocket right now).

It's a little unclear whether the log utility is an assumption or a result of the bankruptcy-is-death assumption.  The original paper, http://www.herrold.com/brokerage/kelly.pdf , says:

The gambler introduced here follows an essentially different criterion from the classical gambler. At every bet he maximizes the expected value of the logarithm of his capital. The reason has nothing to do with the value function which he attached to his money, but merely with the fact that it is the logarithm which is additive in repeated bets and to which the law of large numbers applies. Suppose the situation were different; for example, suppose the gambler’s wife allowed him to bet one dollar each week but not to reinvest his winnings. He should then maximize his expectation (expected value of capital) on each bet. He would bet all his available capital (one dollar) on the event yielding the highest expectation. With probability one he would get ahead of anyone dividing his money differently.
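
The additivity the passage refers to can be written out in a short sketch. The notation is introduced here for illustration and is not Kelly's: $W_n$ is the bankroll after $n$ bets, $f < 1$ is a fixed fraction staked each round, and $X_i$ is the per-dollar payoff of bet $i$ (equal to the net odds $b$ on a win and $-1$ on a loss), assumed independent and identically distributed.

$$
W_n = W_0 \prod_{i=1}^{n} (1 + f X_i),
\qquad
\frac{1}{n} \log \frac{W_n}{W_0}
= \frac{1}{n} \sum_{i=1}^{n} \log(1 + f X_i)
\;\longrightarrow\;
\mathbb{E}\big[\log(1 + f X)\big]
\quad \text{as } n \to \infty .
$$

Reinvestment turns wealth into a product, the logarithm turns that product into a sum, and the law of large numbers then applies to the sum. Without reinvestment, wealth is already a sum of per-bet winnings, so the law of large numbers applies to the winnings themselves and maximizing the plain expectation is what wins in the long run.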

This all implies that the special-case is the last wager you will ever make.  And from there the more complicated cases of the penultimate wager and the probabilistic-finite cases.  I don't know how big the chain needs to get to converge to Kelly being the optimum, but since it's compatible with logarithmic utility of money in the first place, for some agents it'll be the same regardless.

My strong suspicion is that Kelly always applies if your terminal utility function for money is logarithmic.  But I don't see how that could be - the marginal amount of money/resources you'll control at death is tiny compared to all resources in the universe, so your utility for any margin under consideration should be close to linear.

comment by abramdemski · 2020-11-02T19:52:37.331Z · LW(p) · GW(p)

My strong suspicion is that Kelly always applies if your terminal utility function for money is logarithmic.  But I don't see how that could be - the marginal amount of money/resources you'll control at death is tiny compared to all resources in the universe, so your utility for any margin under consideration should be close to linear.

If the amount is tiny, and your utility is log resources, then that puts us close to the origin, where the derivative of the logarithm is very high, and reducing very quickly.

But logarithm can still look nearly linear if the differences we can make are sufficiently small in relation to the total.

Replies from: Dagon
comment by Dagon · 2020-11-02T20:13:59.570Z · LW(p) · GW(p)

Sure, but the point of the Kelly calculation is to PICK the amount, relative to the potential gain and risk of ruin.  Which ends up equivalent to logarithmic utility.  

For the final bet (or the induction base for a finite sequence), one cannot pick an amount without knowing the zero-point on the utility curve.

Replies from: abramdemski, X4vier, X4vier
comment by abramdemski · 2020-11-02T20:35:04.261Z · LW(p) · GW(p)

Agreed.

What my comment about small differences amounts to is if you can't bet the whole bankroll; for example if your utility is (altruistically) logarithmic in the resources of all humanity, but you can only control how to gamble with a small fraction of that.

This might justify EAs behaving like their utility is linear in resources, even if it's ultimately logarithmic.
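
A rough sketch of why this comes out looking linear. It assumes an even-odds bet, utility equal to the log of the total resource pool, and a bettor who can stake at most their own bankroll; the function name and the numbers are hypothetical.

```python
def optimal_stake(own_bankroll: float, total_resources: float, p_win: float) -> float:
    """Stake maximizing p*log(T + x) + (1 - p)*log(T - x) for 0 <= x <= own_bankroll,
    where T is total_resources and the bet pays even odds (assumed setup)."""
    # Setting the derivative p/(T + x) - (1 - p)/(T - x) to zero gives x = (2p - 1) * T.
    unconstrained = (2.0 * p_win - 1.0) * total_resources
    # The objective is concave in x, so clipping to the feasible range is optimal.
    return max(0.0, min(own_bankroll, unconstrained))


if __name__ == "__main__":
    # Controlling the whole pool: the ordinary Kelly stake (20% of the pool).
    print(optimal_stake(own_bankroll=100.0, total_resources=100.0, p_win=0.6))
    # Controlling a sliver of the pool: stake everything you control.
    print(optimal_stake(own_bankroll=100.0, total_resources=1_000_000.0, p_win=0.6))
```

When the controllable bankroll is far below the unconstrained optimum, the constraint binds and the bettor stakes everything on any favorable bet, which is what a linear-utility agent would do.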

comment by X4vier · 2020-11-02T20:38:22.284Z · LW(p) · GW(p)

For the final bet (or the induction base for a finite sequence), one cannot pick an amount without knowing the zero-point on the utility curve.

I'm a little confused about what you mean sorry - 

What's wrong with this example?: 

It's time for the final bet, I have $100 and my utility is $U(w) = \log(w)$.

I have the opportunity to bet on a coin which lands heads with probability $3/4$, at $1{:}1$ odds.

If I bet $x$ dollars on heads, then my expected utility is $\tfrac{3}{4}\log(100 + x) + \tfrac{1}{4}\log(100 - x)$, which is maximized when $x = 50$.

So I decide to bet 50 dollars.

What am I missing here?
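
The 50-dollar answer can be checked numerically. The sketch below assumes log utility of final wealth, a heads probability of 3/4, and even odds; these specific values are assumptions chosen because they are consistent with the stated answer, and the function names are hypothetical.

```python
import math


def expected_log_utility(bet: int, wealth: int = 100, p_heads: float = 0.75) -> float:
    """Expected log of final wealth after staking `bet` on heads at even odds."""
    return p_heads * math.log(wealth + bet) + (1.0 - p_heads) * math.log(wealth - bet)


def best_whole_dollar_bet(wealth: int = 100, p_heads: float = 0.75) -> int:
    """Search whole-dollar bets below `wealth` (betting everything would
    risk log(0)) and return the one with the highest expected log utility."""
    return max(range(wealth), key=lambda bet: expected_log_utility(bet, wealth, p_heads))


if __name__ == "__main__":
    print(best_whole_dollar_bet())  # 50
```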

Replies from: Dagon
comment by Dagon · 2020-11-02T21:20:15.955Z · LW(p) · GW(p)

The problem is in the "I have" statement in the setup.  After your final bet, you will be dead (or at least not have any ability for further decisions about the money).  You have to specify what "have" means, in terms of your utility.  Perhaps you've willed it all to a home for cats; in that case the home has $500,100 +/- x.  Perhaps you care about humanity as a whole, in which case your wager has no impact - any amount that you add to or remove from "your" $100 comes out of someone else's.  Or if the wager is making something worth x, or destroying x value as your final act, then humanity as a whole has $90T +/- x.

comment by X4vier · 2020-11-02T20:35:50.640Z · LW(p) · GW(p)
comment by X4vier · 2020-11-02T19:22:33.632Z · LW(p) · GW(p)

As far as I can tell, the fact that you only ever control a very small proportion of the total wealth in the universe isn't something we need to consider here.

No matter what your wealth is, someone with log utility will treat a prospect of doubling their money to be exactly as good as it would be bad to have their wealth cut in half, right?

Replies from: Dagon
comment by Dagon · 2020-11-02T19:35:07.315Z · LW(p) · GW(p)

I don't know - I don't have a good sense of what "terminal values" mean for humans.  But I suspect it does matter - for a logarithmic utility curve, figuring out the change in utility for a given delta in resources depends entirely on the proportion of the total that the given delta represents. 

comment by abramdemski · 2020-11-02T20:01:57.513Z · LW(p) · GW(p)

makes a few assumptions, which are not true in most humans

Right, it would be interesting to take a Kelly-like approach while relaxing those assumptions.

  • Fixed income: most people have money coming in through work, not just investment. This should make a person less risk-averse, intuitively, since losing all one's money no longer means you're out of the game forever.
  • Fixed expenses: on the flip side, most people have expenses which are relatively fixed in the short term (and increase with bankroll in the longer term, reflecting a desire to spend money to get other things!)
Replies from: Dagon
comment by Dagon · 2020-11-03T17:50:47.173Z · LW(p) · GW(p)

Fixed income: most people have money coming in through work, not just investment. 

This is a very common situation, and the standard recommendation for Kelly criterion usage is "calculate based on your ENTIRE bankroll".  Yes, the point on the logarithm is based on your home equity and expected future earnings, not just the money in your pocket.   

This usually translates to "if the expectation is positive, bet the maximum".  Most people don't think that way, and therefore don't optimize their lifetime bankroll growth.  Also, cases where you actually know that it's +EV with enough certainty to use Kelly are extremely rare.
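
A small sketch of the "entire bankroll" recommendation. It assumes the Kelly fraction is applied to cash plus the present value of future earnings, and that the actual stake cannot exceed the cash on hand; the names and numbers are hypothetical.

```python
def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Standard Kelly fraction f* = p - (1 - p) / b for a bet with net odds b."""
    return p_win - (1.0 - p_win) / net_odds


def stake_with_future_income(cash: float, future_income_pv: float,
                             p_win: float, net_odds: float) -> float:
    """Kelly stake computed on total wealth, limited to the cash available now."""
    fraction = max(0.0, kelly_fraction(p_win, net_odds))
    total_wealth = cash + future_income_pv
    return min(cash, fraction * total_wealth)


if __name__ == "__main__":
    # Cash only: stake 20% of the 10,000 in hand.
    print(stake_with_future_income(10_000.0, 0.0, p_win=0.6, net_odds=1.0))
    # Large future earnings: the Kelly stake exceeds cash, so bet the maximum.
    print(stake_with_future_income(10_000.0, 490_000.0, p_win=0.6, net_odds=1.0))
```

Once expected future earnings dominate the cash on hand, the Kelly stake on total wealth exceeds what is available to bet, which is how the rule collapses to "bet the maximum" for any positive-expectation wager.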

answer by arielroth · 2020-11-04T16:00:47.744Z · LW(p) · GW(p)

No, the number of iterations is irrelevant. You can derive Kelly by trying to maximize your expected log wealth for a single bet. If you care about wealth instead of log wealth, then just bet the house every opportunity you get.
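
The single-bet derivation referred to here can be sketched as follows, using notation assumed for illustration: $p$ is the win probability, $b$ the net odds, and $f$ the fraction of wealth staked.

$$
G(f) = p \log(1 + b f) + (1 - p) \log(1 - f),
\qquad
G'(f) = \frac{p\,b}{1 + b f} - \frac{1 - p}{1 - f} = 0
\;\;\Longrightarrow\;\;
f^{*} = p - \frac{1 - p}{b} .
$$

By contrast, expected wealth $1 + f\,(p\,b - (1 - p))$ is linear in $f$, so whenever the bet has positive expectation it is maximized at $f = 1$, which is the "bet the house" case.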

A bigger issue with Kelly is that it doesn't account for future income and debt streams. There should be an easy fix for that, but I need to think a bit.

comment by abramdemski · 2020-11-04T22:50:21.093Z · LW(p) · GW(p)

It's important that we can derive Kelly that way, but if that were the only derivation, it would not be so interesting. It begs the question: why log wealth?

The derivation that does something interesting to pin down Kelly in particular is the one where we take the limit in iterations.

Replies from: Dagon
comment by Dagon · 2020-11-05T15:52:17.186Z · LW(p) · GW(p)

+1.  It's important to understand that log(money) is the RESULT that Kelly showed, not an assumption he made.  If you start with it as an assumption, you're not "deriving" the Kelly equation, you're just calculating it.

This result is what makes logarithmic utility so attractive as an assumption in OTHER kinds of utility modeling.  

2 comments

Comments sorted by top scores.

comment by ESRogs · 2020-11-02T23:08:37.799Z · LW(p) · GW(p)

Several prediction markets have recently offered a bet at around 62¢ which superforecasters assign around 85% probability. This resulted in a rare temptation for me to Kelly bet. Calculating the Kelly formula, I found that I was supposed to put 60% of my bankroll on this.

Is this assuming you take the 85% number directly as your credence?

Replies from: abramdemski
comment by abramdemski · 2020-11-03T14:32:57.124Z · LW(p) · GW(p)

Right.