Yes, that is a good insight. I’ll rephrase it, to perhaps make it clear to a somewhat different set of people. If your strategy is to have a good median outcome for your life, you can still bet on longshots with high payoffs, as long as you expect to be offered many such bets. The fewer bets of a certain type you expect to be offered, the more likely winning must be before you take one, even if the “expected” payout on these is very high.
A quantification of this concept in somewhat simple cases was done by John Kelly and is called the Kelly Criterion. Kelly asked a question: given you have finite wealth, how do you decide how much to bet on a given offered bet in order to maximize the rate at which your expected wealth grows? Kelly’s criterion, if followed, also has the side effect of ensuring you never go completely broke, though in a world of minimum bet sizes you might go broke enough to not be allowed to play anymore.
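For a simple win/lose bet the criterion has a closed form: with win probability p and net odds b (you win b times your stake, or lose the stake), the optimal fraction is (bp − q)/b, where q = 1 − p. A minimal sketch (the function name is my own):

```python
def kelly_fraction(p: float, b: float) -> float:
    """Fraction of bankroll to wager on a bet won with probability p
    that pays b times the stake in net profit (stake lost otherwise)."""
    q = 1.0 - p
    return (b * p - q) / b

# A fair coin paying 2-to-1 net odds: bet a quarter of the bankroll.
print(kelly_fraction(0.5, 2.0))  # 0.25
```

Note that the formula never tells you to stake the entire bankroll unless the bet is a sure thing (q = 0), which is where the never-go-completely-broke property comes from; with zero or negative edge it goes to zero or below, i.e. don’t bet at all.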
Of course, all betting strategies where you are betting against other presumed rational actors require you to be smarter, or at least more correct, than the people you are betting against in order to win. In Kelly’s calculation, the size of your bet depends on both the offered odds and the “true” odds. So how do you determine the true odds? Well, that is left as an exercise for the reader!
And so it goes with Pascal’s muggings. As far as my study has taken me, I know of no way to reliably estimate whether the outcomes offered in Pascal’s muggings have probability one in a million, one in a google, one in a googleplex, or one in 3^^^3. And yet the “correct” amount to bet using the Kelly criterion will vary by as large a factor as those probability estimates vary from one another.
There is also the result that well-known cognitive biases will cause you to get infinitesimal probabilities wrong by many orders of magnitude, without properly estimating your probable error on them. For any given problem, there is some probability estimate below which all further attempts to refine the estimate are in the noise: the probability is “essentially zero.” But all the bang in constantly revisiting these scenarios comes from the human biases that allow us to think that because we can state a number like 1 in a million or 1 in a google or 1 in 3^^^3 that we must be able to use it meaningfully in some probabilistic calculation.
If you are of the bent that hypotheses such as these, involving the utility of small probabilities, should be empirically checked before you start believing the results of these calculations, it may take a few lifetimes of the universe (or perhaps a google lifetimes of the universe) before you have enough evidence to determine whether a calculation involving a number like 1 in a google means anything at all.
Googol. Likewise, googolplex.
Kelly asked a question: given you have finite wealth, how do you decide how much to bet on a given offered bet in order to maximize the rate at which your expected wealth grows?

The Kelly criterion doesn’t maximize expected wealth, it maximizes expected log wealth, as the article you linked mentions:

The conventional alternative is utility theory which says bets should be sized to maximize the expected utility of the outcome (to an individual with logarithmic utility, the Kelly bet maximizes utility, so there is no conflict)
Suppose that I can make n bets, each time wagering any proportion of my bankroll that I choose, and then getting three times the wagered amount if a fair coin comes out Heads, and losing the wager on Tails. Expected wealth is maximized if I always bet the entire bankroll, giving an expected wealth of (initial bankroll) × 3^n × 2^-n, since the probability of all Heads is 2^-n. The Kelly criterion trades off from that maximum expected wealth in favor of log wealth.
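The mean/median gap here is small enough to verify by enumerating every coin sequence. This sketch is my own construction; it assumes Heads returns three times the stake (so net odds of 2 with p = 0.5, making the Kelly fraction 0.25) and compares all-in betting with Kelly betting over ten flips:

```python
from itertools import product
from statistics import mean, median

def final_wealth(f, wins, bankroll=1.0):
    """Bet fraction f of bankroll each flip; Heads returns 3x the stake
    (multiplier 1 + 2f), Tails forfeits the stake (multiplier 1 - f)."""
    for win in wins:
        bankroll *= (1 + 2 * f) if win else (1 - f)
    return bankroll

n = 10
paths = list(product([True, False], repeat=n))  # 2^n equally likely sequences
for f in (1.0, 0.25):                           # all-in vs. Kelly for this bet
    wealths = [final_wealth(f, p) for p in paths]
    # all-in: mean ~57.7 but median 0; Kelly: mean ~3.2 and median ~1.8
    print(f, mean(wealths), median(wealths))
```

All-in wins on expected wealth by a wide margin, but its median bettor is broke after ten flips; the Kelly bettor’s median outcome nearly doubles the bankroll.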
A utility function that goes as log wealth values gains less, but it also values losses much more, with insane implications at the extremes. With log utility, multiplying wealth by 1,000,000 has the same marginal utility whatever your wealth, and dividing wealth by 1,000,000 has the negative of that utility. Consider these two gambles:
Gamble 1) Wealth of $1 with certainty.
Gamble 2) Wealth of $0.00000001 with 50% probability, wealth of $1,000,000 with 50% probability.
Log utility would favor $1, but for humans Gamble 2 is clearly better; there is very little difference for us between total wealth levels of $1 and a millionth of a cent.
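The arithmetic behind that comparison, using base-10 logs (my choice of base; any base gives the same ordering):

```python
from math import log10

# Gamble 1: $1 with certainty.
eu1 = log10(1.0)                                # 0.0

# Gamble 2: 50% chance of $0.00000001, 50% chance of $1,000,000.
eu2 = 0.5 * log10(1e-8) + 0.5 * log10(1e6)      # 0.5*(-8) + 0.5*6 = -1.0

ev2 = 0.5 * 1e-8 + 0.5 * 1e6                    # expected wealth: $500,000
print(eu1, eu2, ev2)
```

Log utility prefers Gamble 1 (0 > -1) even though Gamble 2’s expected wealth is half a million dollars: the factor-of-100,000,000 loss below $1 outweighs the factor-of-1,000,000 gain above it.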
Worse, consider these gambles:
Gamble 3) Wealth of $0.000000000000000000000000001 with certainty.
Gamble 4) Wealth of $1,000,000,000 with probability (1-1/3^^^3) and wealth of $0 with probability 1/3^^^3.
Log utility favors Gamble 3, since it assigns $0 wealth infinite negative utility and will sacrifice any finite gain to avoid it. But for humans Gamble 4 is vastly better: a 1/3^^^3 chance of bankruptcy is a negligible downside. Every day humans drive to engage in leisure activities, eat pleasant but not maximally healthful foods, go white-water rafting, and otherwise accept small (1 in 1,000,000, not 1 in 3^^^3) probabilities of death for local pleasure and consumption.
This is not my utility function. I have diminishing utility over a range of wealth levels, which log utility can represent, but log utility weights losses around zero too highly, and it still buys a 1 in 10^100 chance of $3^^^3 in exchange for half my current wealth if no higher-EV bets are available, as in Pascal’s Mugging.
Abuse of a log utility function (chosen originally for analytical convenience) is what led Martin Weitzman astray in his “Dismal Theorem” analysis of catastrophic risk, suggesting that we should pay any amount to avoid zero world consumption (and not on astronomical waste grounds or the possibility of infinite computation or the like, just considering the limited populations Earth can support using known physics).
The original justification for the Kelly criterion isn’t that it maximizes a utility function that’s logarithmic in wealth, but that it provides a strategy that, in the infinite limit, does better than any other strategy with probability 1. This doesn’t mean that it maximizes expected utility (as your examples for linear utility show), but it’s not obvious to me that the attractiveness of this property comes mainly from assigning infinite negative value to zero wealth, or that using the Kelly criterion is a similar error to the one Weitzman made.
Yes, if we have large populations of “all-in bettors” and Kelly bettors, then as the number of bets increases the all-in bettors’ lead in total wealth grows exponentially, while the probability of an all-in bettor being ahead of a Kelly bettor falls exponentially. And as you go to infinity the wealth multiplier of the all-in bettors goes to infinity, while the probability of an all-in bettor leading a Kelly bettor goes to zero. And that was the originally cited reasoning.
Now, one might be confused by the “beats any other constant bankroll allocation (but see the bottom paragraph) with probability 1” and think that it implies “bettors with this strategy will make more money on average than those using other strategies,” as it would in a finite case if every bettor using one strategy did better than any bettor using any other strategy.
But absent that confusion, why favor probability of being ahead over wealth unless one has an appropriate utility function? One route is log utility (for which Kelly is optimal), and I argued against it as psychologically unrealistic, but I agree there are others. Bounded utility functions would also prefer the Kelly outcome to the all-in outcome in the infinite limit, and are more plausible than log utility.
Also, consider strategies that don’t allocate a constant proportion in every bet, e.g. first do an all-in bet, then switch to Kelly. If the first bet has a 60% chance of tripling wealth and a 40% chance of losing everything, then the average, total, and median wealth of these mixed-strategy bettors will beat the Kelly bettors for any number of bets in a big population. These don’t necessarily come to mind when people hear loose descriptions of Kelly.
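At least the average-wealth part of that claim can be checked in closed form. A sketch under my reading of the bet (winning triples an all-in bettor's wealth, so net odds of 2 with p = 0.6, giving a Kelly fraction of 0.4), assuming the same bet repeats:

```python
P, B = 0.6, 2.0                      # win probability; net odds (all-in win triples wealth)
F = (B * P - (1 - P)) / B            # Kelly fraction, 0.4 for this bet

kelly_step = P * (1 + B * F) + (1 - P) * (1 - F)   # expected per-bet multiplier, 1.32
all_in_step = P * (1 + B)                          # 1.8 for the single all-in bet

for n in (1, 10, 100):               # total number of bets
    mixed = all_in_step * kelly_step ** (n - 1)    # all-in once, then Kelly
    pure = kelly_step ** n                         # Kelly from the start
    print(n, mixed / pure)           # constant ratio 1.8/1.32, about 1.36
```

The mixed strategy's expected-wealth edge is the fixed factor 1.8/1.32 from the first bet; the median comparison depends on the full outcome distribution and isn't settled by this calculation.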
Sure, I don’t see anything here to disagree with.
Please enlighten a poor Physicist. You write:

The Kelly criterion doesn’t maximize expected wealth, it maximizes expected log wealth, as the article you linked mentions
I thought the log function operating on real positive numbers was real and monotonically increasing with wealth.
I thought wealth for the purposes of the Wikipedia article and Kelly criterion calculations was real and positive.
So how can something which is said to maximize log(wealth) not also be said to maximize wealth with identical meaning?
Seriously, if there is some meaningful sense in which something that maximizes log(wealth) does not also maximize wealth, I am at a loss to even guess what it is and would appreciate being enlightened.
I gave several examples in my comment, but here’s another with explicitly calculated logs:
1) $100 with certainty. Log base 10 is 2. So expected log wealth 2, expected wealth $100.
2) 50% chance of $1, for a log of 0. 50% chance of $1,000 with a log of 3. Expected log wealth is therefore ((0+3)/2)=1.5, and expected wealth is ($1+$1000)/2=$500.5.
1) has higher expected log wealth, but 2) has higher expected wealth.
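Checking those numbers, and the resolution of the puzzle: log is monotone, so it preserves comparisons between individual outcomes, but expectation and log don’t commute (Jensen’s inequality), so ranking gambles by expected log wealth can differ from ranking by expected wealth:

```python
from math import log10

# 1) $100 with certainty
e_log_1, e_wealth_1 = log10(100), 100.0

# 2) 50/50 between $1 and $1,000
e_log_2 = 0.5 * log10(1) + 0.5 * log10(1000)
e_wealth_2 = 0.5 * 1 + 0.5 * 1000

print(e_log_1, e_wealth_1)   # 2.0 100.0
print(e_log_2, e_wealth_2)   # 1.5 500.5
```

Maximizing log of a single known wealth is indeed the same as maximizing wealth; it is the averaging over uncertain outcomes, done before or after the log, that breaks the equivalence.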