This is a dialogue made during the online dialogues party. Phil, River, and I (Garrett) talked about the Kelly criterion. I was under the mistaken impression that it would come up in finite games, even without discount factors. This turned out to be wrong! In such betting games, you always want to bet the maximal amount (assuming linear utility in money).
We then talked about how maybe you could save the criterion without bringing in log utility in money or discount rates in infinite games. The two conclusions were that you could either have a bounded utility function, or care intrinsically about not being broke or about maximizing the probability of getting more money than other agents in your world.
The highlight in my opinion is me changing my mind about the finite games thing after using dynamic programming to try to show Phil to be wrong. The proof we came up with was both surprising and elegant to me.
Possible justifications for Kelly
Garrett Baker
So, I’ve seen some previous explanations of the kelly criterion on LessWrong, and they seem to fall into 3 clusters:
1. You do the kelly criterion because you have log-utility in money.
2. You do the kelly criterion because you have linear utility in money, but you can use current money to gain even more in the future.
3. You do the kelly criterion because you partially update on the market’s position.
philh
You do the kelly criterion because you have log-utility in money.
This one I think is complicated; I’d say “you do a thing that turns out equivalent to the kelly criterion, but for simpler reasons than why the kelly criterion is derived”
philh
You do the kelly criterion because you have linear utility in money, but you can use current money to gain even more in the future.
In this case I’d say no, if you actually have linear utility in money you should just bet everything every time in most/all game structures I’ve seen; this results in behavior that is obviously wrong, but that’s because linear utility in money is obviously wrong
philh
You do the kelly criterion because you partially update on the market’s position.
I don’t think I’ve seen this one, though I guess fractional kelly would be this? In my head pure Kelly is “the market thinks this, I think that, and the difference between the two is how much I can win over the market”
philh
I guess one thing I’d be explicit about here (partially rehashing the above) is that I think if you have a utility function at all, you don’t need to bring Kelly into things. If your utility function is log, then the thing you do turns out equivalent to kelly but derived more simply. The way kelly is derived, it seems to me that the thing it gives you is just a different thing than optimizing a utility function
Garrett Baker
In this case I’d say no, if you actually have linear utility in money you should just bet everything every time in most/all game structures I’ve seen; this results in behavior that is obviously wrong, but that’s because linear utility in money is obviously wrong
In most real world situations this is false? Can you give some concrete examples here? Like, in poker you have this dynamic, in investing you have this dynamic, in betting you have this dynamic, when making life decisions you have this dynamic, etc.
philh
Hm, I admittedly haven’t thought much about realistic scenarios, my thinking was I wanted to figure out simple examples first. So the simple example in my head is the classic “you can bet any amount of money on a 60/40 chance to double your stake”, and then I claim that with linear utility you should bet everything every time. Do you disagree with that?
Is Phil using words in a weird way?
Garrett Baker
I guess one thing I’d be explicit about here (partially rehashing the above) is that I think if you have a utility function at all, you don’t need to bring Kelly into things. If your utility function is log, then the thing you do turns out equivalent to kelly but derived more simply. The way kelly is derived, it seems to me that the thing it gives you is just a different thing than optimizing a utility function
This seems like a weird claim to me? Like, I could tell you about the simplex algorithm for optimizing linear programming problems, and using this same logic you could come back to me, and tell me the simplex algorithm isn’t so interesting because the true thing you’re doing is optimizing a linear function under linear constraints, and the linear function you’re optimizing need not be a utility function, it could be anything. And the constraints need not be physical, they could be social constraints as well, or even part of your so-called utility function.
This feels a weird thing to claim to me, despite being true. To me, it seems like there’s a bunch of circumstances where you want to use the simplex algorithm, and a bunch of circumstances where you want to use a kelly bet formulation
Garrett Baker
Hm, I admittedly haven’t thought much about realistic scenarios, my thinking was I wanted to figure out simple examples first. So the simple example in my head is the classic “you can bet any amount of money on a 60/40 chance to double your stake”, and then I claim that with linear utility you should bet everything every time. Do you disagree with that?
Yeah, I do disagree with this.
Garrett Baker
Well, actually, if you’re offered lots of bets of this form, then it becomes smart to use kelly, and not bet everything all the time.
philh
I’m not sure I see the connection you’re drawing, so I might try being a bit clearer about why I’m making the claim. So if I have a log utility function, I can do “what is the bet amount that maximizes my expected utility”, and that’s a relatively simple calculation, and it simplifies to a formula that’s equal to the Kelly formula. Whereas the way the Kelly criterion is derived is a much more complicated way of getting to the same formula, that doesn’t involve maximizing log utility. Or like, probably under the hood it’s doing that through deep mathematical equivalence or something. But it’s at any rate more complicated than “maximize my expected log utility”
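As a hedged sketch of the “relatively simple calculation” mentioned here, assuming an even-money bet with win probability $p$ and starting wealth normalized to 1 (the symbols $p$, $f$, and $W_1$ are introduced only for this illustration):

```latex
\begin{align*}
E[\log W_1] &= p \log(1 + f) + (1 - p) \log(1 - f) \\
\frac{d}{df} E[\log W_1] &= \frac{p}{1 + f} - \frac{1 - p}{1 - f} = 0 \\
\Rightarrow \quad f^* &= 2p - 1
\end{align*}
```

With $p = 0.6$ this gives $f^* = 0.2$, i.e. betting 20% of current wealth, which is the Kelly fraction for this bet.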
Garrett Baker
Yeah, the way it’s derived is via number 2, right? You have a situation, and you want to maximize your utility in that situation, and the way you do this is Kelly
philh
#2 being “You do the kelly criterion because you have linear utility in money, but you can use current money to gain even more in the future.”? I wouldn’t say so; the way it’s originally derived if I recall the paper correctly is by defining a thing it calls “growth rate” and then trying to maximize that
philh
Where growth rate is something like $\lim_{n \to \infty} \frac{1}{n} \log(\text{wealth at time } n / \text{wealth at time } 0)$. Which is a limit of random variables, but it’s fine because in the limit you get a random variable which takes on a particular value with probability 1
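One way to make the growth rate concrete, sketched under the assumption that you bet a fixed fraction `f` of your wealth each round on the 60/40 even-money bet from earlier: by the law of large numbers, the limit then equals the expected log return per round. The name `growth_rate` and the grid search are illustrative choices, not taken from the original paper.

```python
import math

P_WIN = 0.6  # assumed win probability, from the 60/40 example


def growth_rate(f, p_win=P_WIN):
    """Expected log return per round when betting fraction f at even odds."""
    if f >= 1:
        # A single loss wipes you out, so the log return is -infinity
        # (assuming p_win < 1).
        return -math.inf
    return p_win * math.log(1 + f) + (1 - p_win) * math.log(1 - f)


# Search a grid of betting fractions 0.00, 0.01, ..., 0.99.
grid = [i / 100 for i in range(100)]
best_fraction = max(grid, key=growth_rate)

print(best_fraction)
print(growth_rate(0.2), growth_rate(0.5), growth_rate(1.0))
```

On this grid the maximum lands at `f = 0.2`, the growth rate at `f = 0.5` is already negative, and betting everything has growth rate minus infinity.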
Garrett Baker
Looking at the wikipedia page, are you talking about Bernoulli’s proof?
Garrett Baker
In a 1738 article, Daniel Bernoulli suggested that, when one has a choice of bets or investments, one should choose that with the highest geometric mean of outcomes. This is mathematically equivalent to the Kelly criterion, although the motivation is different (Bernoulli wanted to resolve the St. Petersburg paradox).
philh
I don’t think so—I took this from what I think was the original paper (which was motivated in terms of information theory)
Should you bet everything every time in a finite game?
philh
Um, taking a step back—I think this thread is “am I saying something in a weird way”, right? Happy to continue it if you want, but we might prefer to switch to the “should you bet everything every time” thread
River
My understanding of the reason for using Kelly, which is perhaps related to #2 but maybe distinct, is that if you bet more aggressively than Kelly, you get more and more expected dollars concentrated in smaller and smaller slivers of the possible worlds. Taken to the extreme, you get infinite dollars in an infinitesimally small sliver of the possible worlds, which means you get 0 dollars, and therefore 0 utility. And that is true whether your utility function is logarithmic or linear or anything else. Unless 0 dollars is somehow a positive utility state for you, in which case I guess bet as aggressively as you want.
Garrett Baker
I mean, obviously you shouldn’t bet everything every time
philh
Taken to the extreme, you get infinite dollars in an infinitesimally small sliver of the possible worlds, which means you get 0 dollars, and therefore 0 utility.
So I claim this is doing infinity in a way that’s not allowed :p “probability 0 of infinite dollars” is more “undefined” than “0 utility”, in this case
River
I mean, obviously you shouldn’t bet everything every time
You don’t have to bet everything any time to get this conclusion, you just have to bet more than Kelly tells you to.
philh
I mean, obviously you shouldn’t bet everything every time
So if we have a finite game, I think you’d agree you should? Like, there’s 100 rounds, you end up with a 0.6^100 chance of 2^100 times your original stake, and the rest of the time you get zero, and that’s the best you can do according to a linear utility function. Agree?
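The numbers in this 100-round example can be checked exactly. A minimal sketch, assuming a starting stake of 1, even-money payouts, and the same fraction bet every round; `expected_final_wealth` is a name made up for this illustration:

```python
from fractions import Fraction

P_WIN = Fraction(3, 5)  # the 60/40 bet from the example
ROUNDS = 100


def expected_final_wealth(fraction, rounds=ROUNDS, p_win=P_WIN):
    """Expected wealth multiple when betting the same fraction every round."""
    # Rounds are independent, so the expectation factorizes into a product.
    per_round = p_win * (1 + fraction) + (1 - p_win) * (1 - fraction)
    return per_round ** rounds


survival_prob = P_WIN ** ROUNDS  # chance of winning all 100 bets
payoff = 2 ** ROUNDS             # stake multiple if you do
all_in = expected_final_wealth(Fraction(1))
kelly = expected_final_wealth(Fraction(1, 5))

print(float(survival_prob), float(all_in), float(kelly))
```

Betting everything has the larger expected value (1.2^100, versus 1.04^100 for the 20% Kelly fraction), even though it ends at zero in all but a 0.6^100 sliver of outcomes.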
Garrett Baker
...finite game...
I’d guess you do something more complicated, since earlier bets influence how much money you can earn in later bets, but later bets don’t have this property. You should bet lots in later bets, but not lots in earlier bets.
philh
You don’t have to bet everything any time to get this conclusion, you just have to bet more than Kelly tells you to.
(Is it really “any amount at all more”? That surprises me if so, but I don’t think it’s a crux for me on anything)
philh
I’d guess you do something more complicated, since earlier bets influence how much money you can earn in later bets, but later bets don’t have this property. You should bet lots in later bets, but not lots in earlier bets.
I don’t think so; I think the math (in this toy example) really does work out that to optimize expected money at the end, you just bet the full amount every time
Garrett Baker
So I claim this is doing infinity in a way that’s not allowed :p “probability 0 of infinite dollars” is more “undefined” than “0 utility”, in this case
In general, infinite utilities are super fucked in lots of different ways. So when I think of an “infinite utility” I think of just an astronomically large utility, and no matter how astronomically large you make your utility, we still get River’s conclusion.
philh
I agree that if utility is bounded at merely astronomically large, things are different
philh
Um, so I think one thing going on here is that you’re modelling it as “you play this game, and you can keep playing it for as long as you want, possibly forever, and then you finish the game with some probability distribution over money and it doesn’t matter how long you played for”. Or something? But like, if you can keep playing forever, kelly gives you infinite money but so does “bet min(kelly, $1) every time”
Garrett Baker
That’s a good point. I was modelling things that way. I don’t know if modelling things differently, with a discount rate for example, gives so different results though?
philh
So the thing I’d want to add to the model isn’t a discount rate, it’s a “how much utility do I have at this point in time, while playing the game?” Like, at timestep 100, what does my utility look like?
philh
Because if the game goes on forever, and then stops… I’m not really sure what to do with the result of that. There are a bunch of ways to get infinite money in that limit, and there’s a way in which the kelly function gives you a higher infinity but it’s weird
Garrett Baker
Yeah, in that case I will point to my answer here
I’d guess you do something more complicated, since earlier bets influence how much money you can earn in later bets, but later bets don’t have this property. You should bet lots in later bets, but not lots in earlier bets.
And as you increase when you stop, you will approach kelly.
Garrett Baker
Maybe I’m just wrong
philh
Yeah, I still disagree with that. Um, we can try to work through the math but I dunno how conducive dialog format is for that. I guess we can at least do latex
Garrett Baker
It’s not very conducive, I’ve tried similar things before, and those were hard, even though I already knew the proof I was trying to make before going in, and if I had a whiteboard, it would be done in 5 min.
philh
Nod
philh
I guess, rather than coming up with a proof, we could do it for small n and see what happens?
Garrett Baker
Yeah
philh
Okay, so for one coin flip you just have one value to choose. Assume you start with £1 and you bet £W0, you end up with 60% of (1+W0) and 40% of (1−W0) which is maximized by maximizing W0
philh
For two you get two values, and it’s possible that when I’ve worked through this in the past I’ve assumed they have to be the same fraction but they don’t. And actually they don’t even have to be the same fraction depending if you win or not, so that makes three values you can bet
Garrett Baker
Well, taking a dynamic programming approach, we can assume that the last time you bet, you bet all your money, and you end up (in expectation) with $W_T = W_{T-1} + 0.2 B_T$, where $W_t$ is your wealth at time $t$, and $B_t$ is how much you bet at time $t$. Your bet must be a fraction of your wealth, so we can rewrite this as $W_T = W_{T-1} + 0.2 W_{T-1} f_T = W_{T-1}(1 + 0.2 f_T)$, so that $W_T = W_0 (1 + 0.2 f_1) \cdots (1 + 0.2 f_T)$. Hm… this would in fact mean we want to choose maximal $f_i = 1$ for everything, because multiplication is order independent...
philh
Ah, yes! Yeah, that argument seems correct to me (and gives the conclusion I already expected so confirmation bias :p)
Garrett Baker
Wild, ok. I guess you’ve convinced me of this
philh
I guess a similar argument is, suppose we’re about to make the final bet. We’ve established that no matter how much money we currently have (let’s say it’s $V_{N-1}$), we want to bet everything to maximize expected $V_N$ at $E(V_N) = 1.2 V_{N-1}$, and then by induction we maximize $E(V_{N-1})$ by betting all our wealth on that bet too
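The backward-induction argument can also be checked by brute force for small games. A sketch, assuming the 60/40 even-money bet, bets restricted to a grid of tenths of current wealth, and a bet that may depend on the history so far; `best` and `FRACTIONS` are illustrative names:

```python
from fractions import Fraction
from functools import lru_cache

P_WIN = Fraction(3, 5)
# Allowed bets: 0%, 10%, ..., 100% of current wealth (an assumed grid).
FRACTIONS = [Fraction(i, 10) for i in range(11)]


@lru_cache(maxsize=None)
def best(wealth, rounds_left):
    """Return (max expected final wealth, fraction to bet now)."""
    if rounds_left == 0:
        return wealth, None
    options = []
    for f in FRACTIONS:
        # Value of each branch assumes optimal play in the remaining rounds.
        win_value, _ = best(wealth * (1 + f), rounds_left - 1)
        lose_value, _ = best(wealth * (1 - f), rounds_left - 1)
        expected = P_WIN * win_value + (1 - P_WIN) * lose_value
        options.append((expected, f))
    # Highest expected value wins; ties go to the larger fraction.
    return max(options)


value, first_bet = best(Fraction(1), 3)
print(value, first_bet)
```

For a three-round game starting from wealth 1, the search returns an expected final wealth of (6/5)^3 and an optimal first bet of the full stake, in line with the induction argument.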
philh
Okay, cool. Um, so I guess we take a step back now and figure out what threads were hanging off of this?
Garrett Baker
I think you were asserting that I was assuming there was a discount rate, or at least drawing lots of my intuitions off that fact.
philh
I don’t think I was thinking of it in those terms, but something like that sounds right, yeah
What about infinite games?
philh
Okay, so at this point we agree that in a finite game a linear utility player should bet everything every time. I guess, we can talk about infinite games if we want? But it sounds like we also both agree those are fucked up, so that might not be where we want to go
Garrett Baker
Infinite utilities are fucked, but infinite games I’m fine with
philh
Okay, so infinite game. So one way to model this is “if you play this game forever, the kelly bettor will have infinite money with probability 1, and the bet-everything bettor will have zero money with probability 1”, but that feels like a bad way to model it to me
philh
Like, if we take the limit of probability distributions over wealth at each timestep, I think the bet-everything limit is indeed a probability distribution that has 1 at 0 and 0 everywhere else. But I think the Kelly limit is a function that’s just zero everywhere, not a probability distribution
Garrett Baker
Yeah, this sounds correct. You have succeeded in getting me confused, which I think is where you want me
Garrett Baker
Ok, I think I found my confusion. The problem is that if there’s a bound to the utility function, then the bet everything guy just gets 0 utility, but the kelly girl gets maximal utility
River
I do not understand what you mean by that Phil. What does it mean for the limit of the kelly function to be something at a particular place? The limit is the place we are talking about.
philh
Ok, I think I found my confusion. The problem is that if there’s a bound to the utility function, then the bet everything guy just gets 0 utility, but the kelly girl gets maximal utility
Yeah, that sounds right. But then the person betting with a linear-up-to-some-bound utility function would act different somehow, it’s not obvious to me how. (Wouldn’t completely shock me if they just end up betting Kelly, actually...)
philh
What does it mean for the limit of the kelly function to be something at a particular place?
So at some fixed timestep we have some probability distribution over wealth, and a probability distribution is a function $\mathbb{R} \to \mathbb{R}$ in this case, that integrates to 1. So in the limit as time $\to \infty$, we can take the limit of these probability distributions. And the pointwise limit of those, i.e. the function that comes from taking each previous function at a fixed point and taking the limit of that sequence of numbers, is a function that’s constantly 0
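A rough numerical illustration of this claim for the Kelly bettor, assuming the 60/40 even-money bet, a fixed 20% fraction, starting wealth 1, and infinitely divisible money. Wealth after `rounds` bets is then determined by the number of wins, which is binomial; the function names and the cap of 100 are hypothetical choices for the sketch.

```python
import math


def win_count_probs(rounds):
    """Exact Binomial(rounds, 0.6) probabilities for the number of wins."""
    denom = 5 ** rounds
    return [math.comb(rounds, k) * 3 ** k * 2 ** (rounds - k) / denom
            for k in range(rounds + 1)]


def kelly_wealth(wins, rounds, f=0.2):
    """Wealth multiple after `wins` wins out of `rounds`, betting fraction f."""
    return (1 + f) ** wins * (1 - f) ** (rounds - wins)


def kelly_summary(rounds, cap=100.0):
    """Return (largest probability of any single wealth value, P(wealth <= cap))."""
    probs = win_count_probs(rounds)
    largest_point_mass = max(probs)
    mass_below_cap = sum(p for k, p in enumerate(probs)
                         if kelly_wealth(k, rounds) <= cap)
    return largest_point_mass, mass_below_cap


for rounds in (10, 100, 1000):
    print(rounds, kelly_summary(rounds))
```

Both numbers shrink as the number of rounds grows: no single wealth value keeps a fixed share of the probability, and the mass on any bounded range of wealth drains away.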
River
Why would we care about the limit of a probability distribution at a fixed point in time? And wouldn’t that limit always have to be zero?
River
we can take the limit of these probability distributions.
Sure sounds like you are taking the limit of a probability distribution at a fixed point in time?
River
Like, I don’t know how else to parse that statement.
philh
So at time $t$ we have a probability distribution $V_t$. And we take $\lim_{t \to \infty} V_t$, and since each $V_t$ is a function, the limit is also a function. Um, kinda. I think it might actually be “depending what we think of as the limit blah blah blah”. But in this case, there’s a limit that we call the pointwise limit, and it does exist and it’s a function. But even though each $V_t$ is a function that’s also a probability distribution, the limit is a function that’s not a probability distribution
philh
Um, but part of what’s going on here is I’m saying “this is a silly thing to be doing”, but it’s also a thing that people do when they compare the outcomes of kelly versus bet-everything in the infinite game
River
I agree with Garrett that trying to do math on here is annoying. And I agree that the bet-everything strategy has a limit that is 1 at 0 and 0 everywhere else. I do not think I agree that Kelly has a limit that is 0 everywhere.
Garrett Baker
I think by “0 everywhere” it’s meant that p(utility=Utility|kelly) = 0 regardless of what utility is.
River
Kelly will never tell you to bet everything, no matter how certain you are. So even if you take a big loss betting Kelly, you will eventually recover. So in the limit, you should still come out rich.
Garrett Baker
whereas p(0 = Utility | bet everything) = 1
Garrett Baker
And the point is to show that this is actually a really dumb way to analyze things
philh
I think by “0 everywhere” it’s meant that p(utility=Utility|kelly) = 0 regardless of what utility is.
Yes, this. Any specific outcome has zero probability (which means it’s not a probability distribution)
philh
(Um, that’s imprecise, but I claim it’s at least not a standard probability distribution. I think there are weird things people have come up with that let you handle things like this maybe)
Garrett Baker
I guess someone could retort that having a probability 0 everywhere is better than having probability 0 everywhere except at 0.
philh
Yeah. But my main reply is “at that point we’re not really doing statistics” I guess
River
ok, I think I agree then. Intuitively, I think there must be a relevant sense in which summing all the non-zero outcomes for kelly gives 1, and the non-zero outcomes for bet-everything gives 0, but I don’t know enough analysis to put anything more concrete on that.
Garrett Baker
I guess we could take the limit of the integral over non-zero outcomes
philh
Hm, I think I see two things that could mean, and one of them is “limit of expected value” and the other is “limit of a function that’s constantly 1”
Garrett Baker
I mean
$\lim_{T \to \infty} \int_{C_T} p(u = U_T \mid \text{strategy}) \, du$
where $C_T = (0, T)$
River
Maybe the way to think about it is that at each point in time, we can integrate the probability function for each strategy over the range (0, inf), and take the limit of that. For bet-everything, this is 0. For Kelly, this is 1. And this seems like something we should care about.
Garrett Baker
Translating to my thing, $\lim_{T \to \infty} \int_{C_T} p(u = U_T \mid \text{kelly}) \, du = 1$, and $\lim_{T \to \infty} \int_{C_T} p(u = U_T \mid \text{bet everything}) \, du = 0$.
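Read as “the probability of not being broke after T rounds”, these two limits can be sketched directly. This assumes the 60/40 even-money bet, a fixed betting fraction, and infinitely divisible money with no debt; `prob_not_broke` is a hypothetical helper name.

```python
from fractions import Fraction

P_WIN = Fraction(3, 5)


def prob_not_broke(fraction, rounds):
    """Probability of strictly positive wealth after `rounds` bets."""
    if fraction == 1:
        # Betting everything survives only if every single bet is won.
        return P_WIN ** rounds
    # Any fraction below 1 leaves a positive remainder after every loss.
    return Fraction(1)


for rounds in (1, 10, 100):
    print(rounds,
          float(prob_not_broke(Fraction(1), rounds)),
          float(prob_not_broke(Fraction(1, 5), rounds)))
```

For the bet-everything strategy the probability is 0.6^T, which tends to 0, while for the 20% Kelly fraction it stays at 1 for every T.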
Garrett Baker
I think maybe phil can make the point he tried to make in On kelly and altruism’s TLDR:
One-sentence summary: Kelly is not about optimizing a utility function; in general I recommend you either stop pretending you have one of those, or stop talking about Kelly.
philh
at each point in time, we can integrate the probability function for each strategy over the range (0, inf), and take the limit of that.
So if the thing we’re integrating is just the probability function itself, the integral is always 1 (that’s what a probability function is). If the thing we’re doing is the integral of $x \, p(x) \, dx$ then that’s taking expected value. And both of them grow to infinity in the limit
philh
$\lim_{T \to \infty} \int_{C_T} p(u = U_T \mid \text{strategy}) \, du$
(Hm, I don’t currently know what’s going on with this)
Garrett Baker
this is basically taking the limit of T→∞ of the probability you don’t have 0 money after T timesteps
(assuming you can’t be in debt)
River
So if the thing we’re integrating is just the probability function itself, the integral is always 1 (that’s what a probability function is)
No, a probability function integrated over all possible outcomes has to be one. My point was to exclude an outcome (being broke), which means that the integral can be less than 1.
philh
No, a probability function integrated over all possible outcomes has to be one. My point was to exclude an outcome (being broke), which means that the integral can be less than 1.
Ah, so if the probability distribution is continuous (which I think it has to be for this infinite game, but… I dunno, maybe not), then excluding a point value actually doesn’t change the integral. But we can consider “probability of being broke”. But then my reply would be “but the linear utility person doesn’t care about their probability of being broke, they care about their expected money”
River
We have discrete timesteps, and discrete numbers of dollars at each timestep, so I don’t think any of these probability distributions will be continuous.
philh
Okay. I’m not sure, but I don’t think it’s a crux
Garrett Baker
We can consider the case of a continuous number of dollars, and be fine
River
Yea, this was not a particularly productive direction, sorry about that.
Returning to Phil’s previous tl;dr
philh
I think maybe phil can make the point he tried to make in On kelly and altruism’s TLDR:
Sure. So some of this depends on the thread we dropped about whether I was saying something in a weird way. But like, if we accept that weird way of saying things, then I claim that if you have a log-utility function, the thing you do is equivalent to betting Kelly but I (admittedly mildly) disapprove of calling it Kelly betting. And if you have a different utility function, then the thing you do is something different.
But (this is another point we didn’t manage to get to) there is a thing that Kelly gives you that it doesn’t really make sense to think of as being “I’m maximizing my utility function by getting this”, but does seem to me like a good thing (because I don’t have a utility function and if I did it would not be denominated purely in dollars and so on).
That thing is that betting Kelly means that with probability 1, over time you’ll be richer than someone who isn’t betting Kelly. So if you want to achieve that, Kelly is great.
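A small check of this property in one simple setting: a Kelly bettor and a rival who bets some other fixed fraction, both on the same sequence of 60/40 even-money flips. This is only a sketch under those assumptions (the rival's fraction must be below 1, and `prob_kelly_ahead` is an illustrative name); it computes the exact probability that the Kelly bettor is strictly richer after a given number of rounds.

```python
import math


def prob_kelly_ahead(rounds, rival_fraction, kelly_fraction=0.2):
    """Probability the Kelly bettor is strictly richer after `rounds` shared flips."""
    # Per-flip change in log(kelly wealth / rival wealth).
    win_gap = math.log((1 + kelly_fraction) / (1 + rival_fraction))
    lose_gap = math.log((1 - kelly_fraction) / (1 - rival_fraction))
    denom = 5 ** rounds
    total = 0.0
    for wins in range(rounds + 1):
        if wins * win_gap + (rounds - wins) * lose_gap > 0:
            # Exact Binomial(rounds, 0.6) probability of this many wins.
            total += math.comb(rounds, wins) * 3 ** wins * 2 ** (rounds - wins) / denom
    return total


for rounds in (10, 100, 1000):
    print(rounds, prob_kelly_ahead(rounds, rival_fraction=0.5))
```

Against a rival who bets half their wealth each round, the probability of being ahead is about 0.62 after 10 rounds and very close to 1 after 1000.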
Garrett Baker
I will claim I am confused here. I do think if you make the argument River made, and find it convincing, it really does seem like you’re trying to just maximize the probability you’re the entity with the most money at the end of the day. However, I think you can also get kelly betting if you have some discount rate, which seems a more reasonable adjustment than saying I just want to maximize the probability I have more money than everyone else. I also claim that wanting more money than everyone else is in fact a reasonable utility function to have.
philh
So I think a discount rate with linear utility might just turn out to be equivalent to having log utility?
Garrett Baker
Yeah, seems likely
philh
I agree wanting more money than everyone else is reasonable, I’m not sure it makes sense to think of as a utility function. At any rate it’s awkward as one. Like, it’s certainly not a utility function that can be expressed in terms of your current wealth. I had an appendix in I think “on kelly and altruism” that looked into this a bit but I don’t remember how closely
Garrett Baker
I don’t think it’s so awkward, you just max the probability you have more money than everyone else. But I do agree you need to incorporate info on the distribution of agents in your world in order to implement it, so it’s not solely a function of your wealth.
Garrett Baker
Maybe what you meant to say was that people don’t have utility functions simply representable in terms of their wealth.
philh
Well, I think people, at least human people, don’t have utility functions at all. It’s sometimes reasonable to talk about them as a shorthand, but I think it breaks down. But like, I also think that’s fine in this case, because we don’t need to think of “I want to be the richest person in the room” as a utility function
philh
We can just say “I want to maximize the probability of that”, and then Kelly gives you it
Garrett Baker
There’s a sense in which you’re right, but I do think there’s a stronger sense in which you’re wrong. Like, we don’t literally have utility functions, but you do get useful results by analyzing people as if they have utility functions. See the field of economics for lots of examples of this.
Garrett Baker
And I also think it’s a useful way of looking at life choices, and looking for areas where you may be making mistakes (keeping in mind your utility function need not be simple, and if a change feels wrong, you probably should find yourself siding with your gut more often than your simple analysis of the situation)
philh
Seems reasonable—I think any disagreement we have here is probably about line-drawing inside grey areas, rather than being substantive
Garrett Baker
Sounds right. This was a good dialogue! I think we probably don’t have anything else to discuss. Anything come to your mind?
philh
I kinda expect we could come up with something, but this does feel like a natural conclusion
When and why should you use the Kelly criterion?