Kelly Capital Growth Investing

The use of log utility dates to the letters of Daniel Bernoulli in 1738. The idea that additional wealth is worth less and less as it increases, so that marginal utility tails off in inverse proportion to the level of wealth, is very reasonable. This utility function seems safe for investing. However, I argue that log is the riskiest utility function one should ever consider using, and it can be very dangerous. Used properly in situations where it is appropriate, though, it has wonderful properties. For long term investors who make many short term decisions, it usually yields the highest long run levels of wealth. This is called Kelly betting in honor of Kelly's 1956 paper that introduced this type of betting. In finance, it is called the Capital Growth Theory or Fortune's Formula.1 Kelly was working at Bell Labs and was greatly influenced by Claude Shannon, the father of information theory.

This chapter has examples where the bets are small and other chapters consider large bets. Applications are made to blackjack, lotteries, horse racing and in Chapter 15 commodity trading on the January turn-of-the-year effect is discussed.

Consider the example described in Table 13.1. There are five possible investments and if we bet on any of them, we always have a 14% advantage. The difference between them is that some have a higher chance of winning and, for some, this chance is smaller. For the latter, we receive higher odds if we win than for the former. But we always receive 1.14 for each 1 bet on average. Hence we have a favorable game. The optimal expected log utility bet with one asset (where we either win or lose the bet) equals the edge divided by the odds.2 So for the 1-1 odds bet, the wager is 14% of one's fortune and at 5-1 it's only 2.8%. We bet more when the chance that we will lose our bet is smaller. Also we bet more when the edge is higher. The bet is linear in the edge so doubling the edge doubles the optimal bet. However, the bet is non-linear in the chance of losing our money, which is reinvested so the size of the wager depends more on the chance of losing and less on the edge.
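The edge-over-odds rule can be sketched in a few lines of Python. This is only an illustration: the text quotes the 1-1 and 5-1 bets, and the intermediate odds of 2-1, 3-1 and 4-1 are an assumption here, as is the helper `win_probability`, which backs out the chance of winning from the requirement that every bet returns 1.14 per 1 wagered on average.

```python
def kelly_fraction(edge: float, odds: float) -> float:
    """Optimal log-utility fraction for a win-or-lose bet: edge divided by odds."""
    if odds <= 0:
        raise ValueError("odds must be positive")
    return edge / odds


def win_probability(expected_return: float, odds: float) -> float:
    """Chance of winning implied by p * (odds + 1) = expected_return."""
    return expected_return / (odds + 1.0)


EXPECTED_RETURN = 1.14  # each bet returns 1.14 per 1 wagered on average

# Odds of 2-1, 3-1 and 4-1 are assumed; only 1-1 and 5-1 are quoted in the text.
for odds in (1, 2, 3, 4, 5):
    p = win_probability(EXPECTED_RETURN, odds)
    f = kelly_fraction(EXPECTED_RETURN - 1.0, odds)
    print(f"{odds}-1 odds: win probability {p:.3f}, Kelly fraction {f:.3f}")
```

With the same 14% edge throughout, the wager shrinks from 14% of wealth at 1-1 to 2.8% at 5-1, because the longer odds come with a larger chance of losing the stake.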

Table 13.1: The Investments

images

Source: Ziemba and Hausch (1986)

The simulation results shown in Table 13.2 assume that the investor's initial wealth is 1000 and that there are 700 investment decision points. The simulation was repeated 1000 times. The numbers here are the number of times out of the possible 1000 that each particular goal was reached. The first line is with log or Kelly betting. The second line is half Kelly betting; that is, you compute the optimal Kelly wager but then blend it 50-50 with cash. We discuss various Kelly fractions and how to utilize them wisely but for now, we will just focus on half Kelly. Assuming log normally distributed investments, the α-fractional Kelly wager is equivalent to the optimal bet obtained from using the concave risk averse, negative power utility function, $-w^{\beta}$, where $\alpha = 1/(1-\beta)$. For half Kelly (α = 1/2), β = −1 and the utility function is $-w^{-1} = -1/w$. Here the marginal utility of wealth drops off as $1/w^{2}$, which is more conservative than log's $1/w$. Log utility is the case β → 0, α = 1 and cash is β → −∞, α = 0.

Table 13.2: Statistics of the Simulation

images

Source: Ziemba and Hausch (1986)

A major advantage of log utility betting is the 166 in the last column. In fully 16.6% of the 1000 cases in the simulation, the final wealth is more than 100 times as much as the initial wealth. Also in 302 cases, the final wealth is more than 50 times the initial wealth. This huge growth in final wealth for log is not shared by the half Kelly strategy, which has only 30 and 1, respectively, for its 50 and 100 time growth levels. Indeed, log usually provides an enormous growth rate but at a price, namely a very high volatility of wealth levels. That is, the final wealth is very likely to be higher than with other strategies, but the ride generally will be very, very bumpy. The maximum, mean, and median statistics in Table 13.2 illustrate the enormous gains that log utility strategies usually provide.

Let's now focus on bad outcomes. The first column provides the following remarkable fact: one can make 700 independent bets of which the chance of winning each one is at least 19% and usually is much more, having a 14% advantage on each bet, and still turn 1000 into 18, a loss of more than 98%. Even with half Kelly, the minimum return over the 1000 simulations was 145, a loss of 85.5%. Half Kelly has a 99% chance of not losing more than half the wealth versus only 91.6% for Kelly. The chance of not being ahead is almost three times as large for full versus half Kelly. Hence to protect ourselves from bad scenario outcomes, we need to lower our bets and diversify across many independent investments. This is explored more fully in the context of hedge funds in various chapters in this book.
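An experiment of this kind is easy to sketch in Python. The version below is not the original simulation: the five odds levels and the uniform choice among them at each decision point are assumptions made for illustration, so the output will not reproduce Table 13.2. It does show the same qualitative pattern of a much wider spread of final wealth under full Kelly than under half Kelly.

```python
import random
from statistics import median

EXPECTED_RETURN = 1.14   # every bet returns 1.14 per 1 wagered on average
ODDS = (1, 2, 3, 4, 5)   # assumed odds levels; the original table is not reproduced here


def simulate_final_wealth(kelly_multiple, n_bets=700, w0=1000.0, rng=None):
    """Final wealth after n_bets win-or-lose bets at a fixed multiple of the Kelly wager."""
    rng = rng or random.Random()
    wealth = w0
    for _ in range(n_bets):
        odds = rng.choice(ODDS)                # assumption: each bet type equally likely
        p_win = EXPECTED_RETURN / (odds + 1)   # implied chance of winning
        stake = kelly_multiple * (EXPECTED_RETURN - 1) / odds * wealth
        if rng.random() < p_win:
            wealth += stake * odds             # win: collect the odds on the stake
        else:
            wealth -= stake                    # lose: forfeit the stake
    return wealth


def run_experiment(kelly_multiple, n_paths=1000, seed=0):
    """Summary statistics of final wealth over n_paths independent histories."""
    rng = random.Random(seed)
    finals = [simulate_final_wealth(kelly_multiple, rng=rng) for _ in range(n_paths)]
    return {
        "min": min(finals),
        "median": median(finals),
        "max": max(finals),
        "over_100x": sum(w > 100_000 for w in finals),
    }


if __name__ == "__main__":
    for label, multiple in (("full Kelly", 1.0), ("half Kelly", 0.5)):
        print(label, run_experiment(multiple))
```

Fixing the seed makes a run repeatable; changing it gives a feel for how much the minimum and maximum move from one set of 1000 histories to the next.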

Figure 13.1 provides a visual representation of the type of information in Table 13.2 displaying typical behavior of full Kelly versus half Kelly wagering in a real situation. These are bets on the Kentucky Derby from 1934 to 1998 using an inefficient market system where probabilities from a simple market (win) are used in a more complex market (place and show) coupled with a breeding filter rule [dosage filter 4.00] to eliminate horses who do not have enough stamina. You bet on horses that have the stamina to finish first, second or third who are underbet to come in second or better or third or better relative to their true chances estimated from their odds to win.

The full Kelly log bettor has the most total wealth at the horizon but has the most bumpy ride: $2500 becomes $16,861. The half Kelly bettor ends up with much less, $6945 but has a much smoother ride. The system did provide out of sample profits. A comparison with random betting proxied by betting on the favorite in the race, shows how tough it is to win at horseracing with the 16% track take plus breakage (rounding payoffs down to the nearest 20 cents per $2 bet) at Churchill Downs. Betting on the favorite turns $2500 into $480. Random betting has even lower final wealth at the horizon since favorites are underbet.

Blackjack

The difference between full and fractional Kelly investing and the resulting size of the optimal investment bets is illustrated via a tradeoff of growth versus security. This is akin to the static mean versus variance so often used in portfolio management and yields two dimensional graphs that aid in the investment decision making process. This can be illustrated by the game of blackjack where fractional Kelly strategies have been used by professional players.

images

Fig. 13.1 Wealth level histories from place and show betting on the Kentucky Derby, 1934-1998 with the Dr Z system utilizing a 4.00 dosage index filter rule with full and half Kelly wagering and $200 flat bets on the favorite using an initial wealth of $2500. Source: Bain, Hausch and Ziemba (2006)

The game of blackjack or 21 evolved from several related card games in the 19th century. It became fashionable during World War I and now has enormous popularity, and is played by millions of people in casinos around the world. Billions of dollars are lost each year by people playing the game in Las Vegas alone. A small number of professionals and advanced amateurs, using various methods such as card counting, are able to beat the game. The object is to reach, or be close to, twenty-one with two or more cards. Scores above twenty-one are said to bust or lose. Cards two to ten are worth their face value; Jacks, Queens and Kings are worth ten points and Aces are worth one or eleven at the player's choice. The game is called blackjack because an ace and a ten-valued card was paid three for two and an additional bonus accrued if the two cards were the Ace of Spades and the Jack of Spades or Clubs. While this extra bonus has been dropped by current casinos, the name has stuck. Dealers normally play a fixed strategy of drawing cards until the total reaches seventeen or more at which point they stop. A variation is when a soft seventeen (an ace with cards totaling six) is hit. It is better for the player if the dealer stands on soft seventeen. The house has an edge of 1-10% against typical players. The strategy of mimicking the dealer loses about 8% because the player must hit first and busts about 28% of the time, so both player and dealer bust in about 8% of hands (0.28² ≈ 0.08). However, in Las Vegas the average player loses only about 1.5% per play.

The edge for a successful card counter varies from about -5% to +10% depending upon the favorability of the deck. By wagering more in favorable situations and less or nothing when the deck is unfavorable, the counter achieves an average weighted edge of about 1-2%. An approximation to provide insight into the long-run behavior of a player's fortune is to assume that the game is a Bernoulli trial with a probability of success p = 0.51 and probability of loss q = 1 − p = 0.49.

images

Fig. 13.2 Probability of doubling and quadrupling before halving and relative growth rates versus fraction of wealth wagered for Blackjack (2% advantage, p=0.51 and q=0.49). Source: MacLean and Ziemba (1999)

Table 13.3: Growth Rates Versus Probability of Doubling Before Halving for Blackjack. Source: MacLean and Ziemba (1999)

images

Figure 13.2 shows the relative growth rate versus the fraction of the investor's wealth wagered, π. The growth rate is maximized by the Kelly log bet π* = p − q = 0.02 and is lower for both smaller and larger bets. Superimposed on the graph are the security curves, which bound the true probability that the investor doubles or quadruples the initial wealth before losing half of it. Since the growth rate and the security are both decreasing for π > π*, it follows that it is never advisable to wager more than π*. The growth rate of a bet that is exactly twice the Kelly bet, namely 2π* = 0.04, is zero plus the risk-free rate of interest, as Figure 13.2 illustrates. Hence log betting is the most aggressive investing that one should ever consider. Hedge fund disasters are frequently rooted in bets above π* when the bets should have been π* or less, especially when parameter uncertainty is considered. However, one may wish to trade off lower growth for more security using a fractional Kelly strategy. This growth tradeoff is further illustrated in Table 13.3. For example, a drop from π* = 0.02 to 0.01 for a 0.5 fractional Kelly strategy decreases the growth rate by 25%, but increases the chance of doubling before halving from 67% to 89%.
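The growth and security numbers quoted above can be checked with a short Python sketch. Two assumptions are made here that are not taken from the source: the relative growth rate is taken to be the expected log growth per play divided by its value at the Kelly bet, and the security measure uses a continuous-time (diffusion) approximation of log wealth rather than the bounds plotted in Figure 13.2. Under that approximation the probability depends only on the Kelly multiple, and it gives 67% and 89% for full and half Kelly.

```python
import math


def growth_rate(p: float, fraction: float) -> float:
    """Expected log growth per play for an even-money bet won with probability p."""
    return p * math.log1p(fraction) + (1.0 - p) * math.log1p(-fraction)


def relative_growth(p: float, fraction: float) -> float:
    """Growth rate as a share of the growth rate at the Kelly bet p - q (assumed definition)."""
    kelly = 2.0 * p - 1.0
    return growth_rate(p, fraction) / growth_rate(p, kelly)


def prob_up_before_down(kelly_multiple: float, up: float = 2.0, down: float = 0.5) -> float:
    """Diffusion approximation: chance wealth is multiplied by `up` before falling to `down`.

    Betting a multiple c of Kelly gives log wealth a drift-to-variance ratio
    such that theta = 2/c - 1 in the standard two-barrier hitting formula.
    """
    if kelly_multiple <= 0:
        raise ValueError("kelly_multiple must be positive")
    theta = 2.0 / kelly_multiple - 1.0
    a = -math.log(down)   # distance to the lower barrier in log wealth
    b = math.log(up)      # distance to the upper barrier in log wealth
    if abs(theta) < 1e-12:                    # driftless case at twice Kelly
        return a / (a + b)
    return (1.0 - math.exp(-theta * a)) / (1.0 - math.exp(-theta * (a + b)))


P_WIN = 0.51
for multiple in (0.25, 0.5, 1.0, 1.5, 2.0):
    fraction = multiple * (2.0 * P_WIN - 1.0)
    print(
        f"{multiple:4.2f} x Kelly: bet {fraction:.3f}, "
        f"relative growth {relative_growth(P_WIN, fraction):6.3f}, "
        f"P(double before halve) {prob_up_before_down(multiple):.3f}"
    )
```

At twice the Kelly bet the relative growth rate is essentially zero and the chance of doubling before halving falls to one half, which is the sense in which overbetting loses on both growth and security.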

The rest of the chapter discusses three topics: investing using unpopular numbers in lotto games with very low probabilities of success but where the expected returns are very large (this illustrates how bets can be very tiny); good and bad properties of the Kelly log strategy and why this led me to work with Len MacLean on a thorough study of fractional Kelly strategies; and futures and commodity trading, and how large undiversified positions can lead to disasters as they have for numerous hedge funds and bank trading departments.

Betting on unpopular lotto numbers using the Kelly criterion

Using the Kelly criterion for betting on favorable (unpopular) numbers in lotto games - even with a substantial edge and very large payoffs if we win - the bets are extremely tiny because the chance of losing most or all of our money is high.

Lotteries predate the birth of Jesus. They have been used by various organizations, governments and individuals to make enormous profits because of the greed and hopes of the players who wish to turn dollars into millions. The Sistine Chapel in the Vatican, including Michelangelo's ceiling, was partially funded from lotteries. So was the British Museum. Major Ivy League universities in the US such as Harvard used lotteries to fund themselves in their early years. Former US president Thomas Jefferson used a lottery to pay off his debts when he was 83. Abuses occur from time to time and government control is typically the norm. Lotteries were banned in the US for over a hundred years from the early 1800s and resurfaced in 1964. In the UK, the dark period was 1826-1994. Since then there has been enormous growth in lottery games in the US, Canada, the UK and other countries. Current lottery sales in the UK are about five billion pounds per year. Sales of the main 6/49 lotto game average about 80 million pounds a week. The lottery operator takes about 5% of lotto sales for its remuneration, 5% goes to retailers, 12% goes to the government in taxes, and another 28% goes to various good causes, as do unclaimed prizes.

One might conclude that the expected payback to the Lotto player is 50% of his or her stake. However, the regulations allow a further 5% of regular sales to be diverted to a Super Draw fund. Furthermore we must allow for the probability that the jackpot is not won. Eighty of 567 jackpots to the end of May 2001 had not been won. This means that the expected payback in a regular draw is not much more than 40%. This is still enough to get people to play. With such low paybacks it is very difficult to win at these games and the chances of winning any prize at all, even the small ones, are low.

Table 13.4 describes the various types of lottery games in terms of the chance of winning and the payoff if you win. Lottery organizations have machines to pick the numbers that yield random number draws. Those who claim that they can predict the numbers that will occur cannot really do so. There are no such things as hot and cold numbers or numbers that are friends. Schemes to combine numbers to increase your chance of winning are mathematically fallacious. For statistical tests on these points, see Ziemba et al. (1986). One possible way to beat pari-mutuel lotto games is to wager on unpopular numbers or, more precisely, unpopular combinations.3 In lotto games players select a small set of numbers from a given list. The prizes are shared by those with the same numbers as those selected in the random drawing. The lottery organization bears no risk in a pure pari-mutuel system and takes its profits before the prizes are shared. I have studied the 6/49 game played in Canada and several other countries.4

Combinations like 1,2,3,4,5,6 tend to be extraordinarily popular: in most lotto games, there would be thousands of jackpot winners if this combination were drawn. Numbers ending in eight and especially nine and zero as well as high numbers (32+, the non-birthday choices) tend to be unpopular. Professor Herman Chernoff found that similar numbers were unpopular in a different lotto game in Massachusetts. The game Chernoff studied had four digit numbers from 0000 to 9999. He found advantages from many of those with 8, 9, 0 in them. Random numbers have an expected loss of about 55%. However, six-tuples of unpopular numbers have an edge with expected returns exceeding their cost by about 65%. For example, the combination 10, 29, 30, 32, 39, 40 is worth about $1.507 while the combination 3, 5, 13, 15, 28, 33 of popular numbers is worth only about $0.154. Hence there is a factor of about ten between the best and worst combinations. The expected value rises and approaches $2.25 per dollar wagered when there are carryovers (that is, when the jackpot is accumulating because it has not been won). Most sets of unpopular numbers are worth $2 per dollar or more when there is a large carryover. Random numbers, such as those from lucky dip and quick pick, and popular numbers are worth more with carryovers but never have an advantage. However, investors (such as Chernoff's students) may still lose because of mean reversion (the unpopular numbers tend to become less unpopular over time) and gamblers' ruin (the investor has used up his available resources before winning). These same two phenomena show up in the financial markets repeatedly.

Table 13.4: Types of Lottery Games

images

Table 13.5 provides an estimate of the most unpopular numbers in Canada in 1984, 1986 and 1996. The same numbers tend to be the most unpopular over time but their advantage becomes less and less over time. Similarly, stock market anomalies like the January effect or weekend effect have lessened over time. However, the advantages are still good enough to create a mathematical advantage in the Canadian and UK lottos.

Strategy Hint #1: When a new lotto game is offered, the best advantage is usually right at the start. This point applies to any type of bet or financial market.

Strategy Hint #2: Games with more separate events, on each of which you can have an advantage, are more easily beatable. The total advantage is the product of individual advantages. Lotto 6/49 has 6; a game with 9 is easier to beat and one with 3 harder to beat.

But can an investor really win with high confidence by playing these unpopular numbers? And if so, how long will it take? To investigate this, consider the following experiment shown in Table 13.6.

Table 13.5: Unpopular Numbers in the Canadian 6/49, 1984, 1986, and 1996

images

Case A assumes unpopular number six-tuples are chosen and there is a medium sized carryover. Case B assumes that there is a large carryover and that the numbers played are the most unpopular combinations. Carryovers (called rollovers in the UK) build up the jackpot. In Canada, they build until the jackpot is won. In the UK 6/49 game, rollovers are capped at three: if there is still no jackpot winner then, the jackpot funds not paid out are added to the existing fund for the second tier prize (bonus) and shared by the various winners. In all the draws so far, a fourth rollover has never been reached. Betting increases as the carryover builds since the potential jackpot rises.5 These cases are favorable to the unpopular numbers hypothesis; among other things they correspond to the Canadian and UK games in which the winnings are paid up front (not over twenty or more years as in the US) and tax free (unlike in the US). The combination of tax free winnings plus being paid in cash makes the Canadian and UK prizes worth about three times those in the US. The optimal Kelly wagers are extremely small. The reason for this is that the bulk of the expected value is from prizes that occur with less than one in a million probability. A wealth level of $1 million is needed in Case A to justify one $1 ticket. The corresponding wealth in Case B is over $150,000. Figures 13.3(a) and 13.3(b) provide the chance that the investor will double, quadruple or increase tenfold this fortune before it is halved using Kelly and fractional Kelly strategies for Cases A and B respectively. These chances are in the 40-60% and 55-80% ranges for Cases A and B, respectively. With fractional Kelly strategies in the range of 0.00000004 and 0.00000025 or less of the investor's initial wealth, the chance of increasing one's initial fortune tenfold before halving it is 95% or more with Cases A and B respectively. However, it takes an average of 294 billion and 55 billion years respectively to achieve this goal assuming there are 100 draws per year as there are in the Canadian 6/49 and UK 6/49.
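To see why the wagers are so tiny, consider a deliberately simplified ticket that pays only a jackpot. The Python sketch below applies the edge-over-odds rule to it. The jackpot value is hypothetical and the smaller prizes are ignored, so the numbers are illustrative only and will not match Cases A and B in Table 13.6; the one hard fact used is that a 6/49 game has C(49, 6) possible combinations.

```python
from math import comb


def single_prize_kelly(p_win: float, payoff_per_dollar: float) -> float:
    """Kelly fraction for a ticket paying payoff_per_dollar with probability p_win, else nothing."""
    odds = payoff_per_dollar - 1.0           # net winnings per dollar staked
    edge = p_win * payoff_per_dollar - 1.0   # expected net return per dollar staked
    return max(edge / odds, 0.0)             # no bet when the edge is not positive


P_JACKPOT = 1 / comb(49, 6)        # one chance in 13,983,816
HYPOTHETICAL_JACKPOT = 30_000_000  # assumed unshared prize per $1 ticket, for illustration only

fraction = single_prize_kelly(P_JACKPOT, HYPOTHETICAL_JACKPOT)
print(f"Kelly fraction of wealth: {fraction:.2e}")
print(f"Wealth needed to justify one $1 ticket: ${1.0 / fraction:,.0f}")
```

Even though this hypothetical ticket is worth more than twice its cost on average, the optimal stake is a few hundred-millionths of wealth, because the bet is almost certain to be lost.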

Table 13.6: Lotto Game Experimental Data

images

Figures 13.4(a) and 13.4(b) give the probability of reaching $10 million before falling to $1 million and $25,000 for various initial wealth for cases A and B, respectively, with full, half and quarter Kelly wagering strategies. The results indicate that the investor can have a 95% plus probability of achieving the $10 million goal from a reasonable initial wealth level with the quarter Kelly strategy for cases A and B. Unfortunately the mean time to reach this goal is 914 million years for case A and 482 million years for case B. For case A it takes 22 million years on average with full Kelly and 384 million years with half Kelly. For case B it takes 2.5 and 19.3 million years for full and half Kelly, respectively. It takes a lot less time, but still millions of years on average, to merely double one's fortune: namely 2.6, 4.6 and 82.3 million years for full, half and quarter Kelly, respectively, for case A and 0.792, 2.6 and 12.7 million years for case B. We may then conclude that millionaires can enhance their dynasties' long-run wealth provided their wagers are sufficiently small and made only when carryovers are sufficiently large (in lotto games around the world). There are quite a few that could be played.

images

Fig. 13.3 Probability of doubling, quadrupling and tenfolding before halving, Lotto 6/49. Source: MacLean and Ziemba (1999)

images

Fig. 13.4 Probability of reaching the goal of $10 million under various conditions. Source: MacLean and Ziemba (1999)

What about a non-millionaire wishing to become one? The aspiring investor must pool funds until $150,000 is available for case B and $1 million for case A to optimally justify buying only one $1 ticket per draw. Such a tactic is legal in Canada and in fact is highly encouraged by the lottery corporation which supplies legal forms for such an arrangement. Also in the UK, Camelot will supply model 'agreement' forms for syndicates to use, specifying who must pay what, how much, and when, and how any prizes will be split. This is potentially very important for the treatment of inheritance tax with large prizes. The situation is modeled in Figure 13.4(a). Our aspiring millionaire puts up $100,000 along with nine others for the $1 million bankroll and when they reach $10 million each share is worth $1 million. The syndicate must play full Kelly and has a chance of success of nearly 50%, assuming that the members agree to disband if they lose half their stake. Participants do not need to put up the whole $100,000 at the start. The cash outflow is easy to fund, namely 10 cents per draw per participant. To have a 50% chance of reaching the $1 million goal, each participant (and their heirs) must have $50,000 at risk. It will take 22 million years, on average, to achieve the goal.

The situation is improved for case B players. First, the bankroll needed is about $154,000 since 65 tickets are purchased per draw for a $10 million wealth level. Suppose our aspiring nouveau riche is satisfied with $500,000 and is willing to put all but $25,000/2 or $12,500 of the $154,000 at risk. With one partner he can play half Kelly strategy and buy one ticket per case B type draw. Figure 13.4(b) indicates that the probability of success is about 0.95. With initial wealth of $308,000 and full Kelly it would take million years on average to achieve this goal. With half Kelly it would take, on average, 2.7 million years and with quarter Kelly it would take 300 million years.

The conclusion is that except for millionaires and pooled syndicates, it is not possible to use the unpopular numbers in a scientific way to beat the lotto and have high confidence of becoming rich; these aspiring millionaires will also most likely be residing in a cemetery when their distant heirs finally reach the goal.

What did we learn from this exercise?

(1) Lotto games are in principle beatable but the Kelly and fractional Kelly wagers are so small that it takes virtually forever to have high confidence of winning. Of course, you could win earlier or even on the first draw and you do have a positive mean on all bets. Ziemba et al. (1986) have shown that the largest jackpots contain about 47% of the nineteen most unpopular numbers in 1986 shown in Table 13.5 versus 17% unpopular numbers in the smallest jackpots. Hence, if you play, emphasizing unpopular numbers is a valuable strategy to employ. But frequently numbers other than the unpopular ones are drawn. So the strategy of focussing on three or four unpopular numbers and then randomly selecting the next two numbers might work. Gadgets to choose such numbers are easy to devise. But you need deep pockets here and even then you might be ruined. The best six numbers (see Table 13.5) once won a $10 million unshared jackpot in Florida. Could you bet more? Sorry: log is the most one should ever bet.

(2) The Kelly and fractional Kelly wagering schemes are very useful in practice but the size of the wagers will vary from very tiny to enormous bets. My best advice: never over bet; it will eventually lead to trouble unless it is controlled somehow and that is hard to do!

Good and bad properties of the Kelly criterion

If your outlook is well extended, the Kelly criterion is the approach best suited to generating a fortune.

We now discuss the good and bad properties of the Kelly expected log capital growth criterion. If your horizon is long enough then the Kelly criterion is the road, however bumpy, to the most wealth at the end and the fastest path to a given rather large fortune.

The great investor Warren Buffett's Berkshire Hathaway actually has had a growth path quite similar to full Kelly betting. Figure 6.1 shows this performance from 1985 to 2000 in comparison with other great funds. Buffett also had a great record from 1977 to 1985, turning 100 into 1429.87, which grew to 65,852.40 by April 2000 and to about $132,700 by September 30, 2012.

Keynes was another Kelly type bettor. His record running King's College Cambridge's Chest Fund is shown in Figure 6.4 versus the British market index for 1927 to 1945, data from Chua and Woodward (1983). Notice how much Keynes lost the first few years; obviously his academic brilliance and the recognition that he was facing a rather tough market kept him in this job. In total his geometric mean return beat the index by 10.01%. Keynes was an aggressive investor with a capital asset pricing model beta of 1.78 versus the benchmark United Kingdom market return, a Sharpe ratio of 0.385, and geometric mean returns of 9.12% per year versus -0.89% for the benchmark. Keynes had a yearly standard deviation of 29.28% versus 12.55% for the benchmark. These returns do not include Keynes' (or the benchmark's) dividends and interest, which he used to pay the college expenses. These were about 3% per year. Kelly cowboys have their great returns and losses and embarrassments. Not covering a grain contract in time led to Keynes taking delivery and filling up the famous chapel. Fortunately it was big enough to fit in the grain and store it safely until it could be sold; see the cartoon. Keynes' investment behavior, according to Ziemba (2003), was equivalent to 80% Kelly and 20% cash so he would use the negative power utility function $-w^{-0.25}$.

Keynes emphasized three principles of successful investments in his 1933 report:

(1) a careful selection of a few investments (or a few types of investment) having regard to their cheapness in relation to their probable actual and potential intrinsic value over a period of years ahead and in relation to alternative investments at the time;

(2)a steadfast holding of these in fairly large units through thick and thin, perhaps for several years until either they have fulfilled their promise or it is evident that they were purchased on a mistake; and

(3)a balanced investment position, i.e., a variety of risks in spite of individual holdings being large, and if possible, opposed risks.

He really was a lot like Buffett with an emphasis on value, large holdings and patience.

In November 1919, Keynes was appointed second bursar. Up to this time King's College investments were only in fixed income trustee securities plus their own land and buildings. By June 1920 Keynes convinced the college to start a separate fund containing stocks, currency and commodity futures. Keynes became first bursar in 1924 and held this post which had final authority on investment decisions until his death in 1945.

And Keynes did not believe in market timing as he said:

We have not proved able to take much advantage of a general systematic movement out of and into ordinary shares as a whole at different phases of the trade cycle. As a result of these experiences I am clear that the idea of wholesale shifts is for various reasons impracticable and indeed undesirable. Most of those who attempt it sell too late and buy too late, and do both too often, incurring heavy expenses and developing too unsettled and speculative a state of mind, which, if it is widespread, has besides the grave social disadvantage of aggravating the scale of the fluctuations.

images

The main disadvantages result because the Kelly strategy is very, very aggressive with huge bets that become larger and larger as the situations become more attractive: recall that the optimal Kelly bet is the mean edge divided by the odds of winning. As I repeatedly argue, the mean counts by far the most. There is about a 20:2:1 ratio of expected utility loss from similar sized errors of means, variances and covariances, respectively, as discussed in Chapter 3. Returning to Buffett, who gets the mean right better than almost all, notice that the other funds he outperformed are not shabby ones at all. Indeed they are George Soros' Quantum, John Neff's Windsor, Julian Robertson's Tiger and the Ford Foundation, all of whom had great records as measured by the Sharpe ratio. Buffett made 32.07% per year net from July 1977 to March 2000 versus 16.71% for the S&P500. Wow! Those of us who like wealth prefer Warren's path but his higher standard deviation path (mostly winnings) leads to a lower Sharpe (normal distribution based) measure; see Siegel et al. (2001). Chapter 6 proposes a modification of the Sharpe ratio to not penalize gains. This improves Buffett's evaluation.

Since Buffett and Keynes are full or close to full Kelly bettors, their means must be even more accurate. With their very low risk aversion, the errors in the mean are 100+ times as important as the covariance errors.

Kelly has essentially zero risk aversion since its Arrow-Pratt risk aversion index is

$$-\frac{u''(w)}{u'(w)} = \frac{1}{w}, \qquad u(w) = \log w,$$

which is essentially zero. Hence it never pays to bet more than the Kelly strategy because then risk increases (lower security) and growth decreases, so such a bet is dominated. As you bet more and more above the Kelly bet, its properties become worse and worse. When you bet exactly twice the Kelly bet, then the growth rate is zero plus the risk free rate; see the proof at the end of this chapter.

If you bet more than double the Kelly criterion, then you will have a negative growth rate. With derivative positions one's bet changes continuously so a set of positions amounting to a small bet can turn into a large bet very quickly with market moves. Long Term Capital is a prime example of this overbetting leading to disaster but the phenomenon occurs all the time all over the world. Overbetting plus a bad scenario leads invariably to disaster.
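The double Kelly result can be sketched in the usual continuous-time approximation. This is an illustration rather than the proof referred to above, and the symbols are assumptions introduced here: a single risky asset with mean return $\mu$ and variance $\sigma^2$, a risk free rate $r$, and a fraction $f$ of wealth held in the risky asset.

```latex
\begin{align*}
g(f)    &= r + f(\mu - r) - \tfrac{1}{2} f^{2}\sigma^{2}
           && \text{(growth rate of wealth)} \\
f^{*}   &= \frac{\mu - r}{\sigma^{2}}
           && \text{(Kelly fraction, from } g'(f) = 0\text{)} \\
g(cf^{*}) &= r + \Bigl(c - \tfrac{1}{2}c^{2}\Bigr)\frac{(\mu - r)^{2}}{\sigma^{2}}
           && \text{(betting } c \text{ times Kelly)} \\
g(2f^{*}) &= r
           && \text{(since } c - \tfrac{1}{2}c^{2} = 0 \text{ at } c = 2\text{)}
\end{align*}
```

The factor $c - \tfrac{1}{2}c^{2}$ is largest at $c = 1$, zero at $c = 2$, and negative for $c > 2$, so any bet above double Kelly grows more slowly than cash.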

Thus you must bet Kelly or less. We call betting less than Kelly fractional Kelly, which is simply a blend of Kelly and cash. Consider the negative power utility function $\delta w^{\delta}$ for δ < 0. This utility function is concave and when δ → 0 it converges to log utility. As δ gets larger negatively, the investor is less aggressive since his Arrow-Pratt risk aversion is also higher. For a given δ < 0, the fraction α = 1/(1 − δ) lies between 0 and 1, and this utility function provides the same portfolio as investing α in the Kelly portfolio and 1 − α in cash.

This result is correct for lognormal investments and approximately correct for other distributed assets; see MacLean, Ziemba and Li (2005). For example, half Kelly is δ = –1 and quarter Kelly is δ = –3. So if you want a less aggressive path than Kelly pick an appropriate δ. Below I discuss a way to pick δ continuously in time so that wealth will stay above a desired wealth growth path with high given probability; see Figure 13.5.
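The correspondence between the Kelly fraction α and the power δ is simple enough to write down as a pair of Python helpers. The function names are hypothetical, and the mapping α = 1/(1 − δ) is exact only under the lognormal assumption stated above.

```python
def kelly_fraction_from_delta(delta: float) -> float:
    """Kelly fraction alpha = 1 / (1 - delta) for negative power utility with delta < 0."""
    if delta >= 0:
        raise ValueError("delta must be negative")
    return 1.0 / (1.0 - delta)


def delta_from_kelly_fraction(alpha: float) -> float:
    """Inverse mapping delta = 1 - 1 / alpha for a Kelly fraction strictly between 0 and 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - 1.0 / alpha


for alpha in (0.8, 0.5, 0.25):
    print(f"alpha = {alpha:.2f} corresponds to delta = {delta_from_kelly_fraction(alpha):.2f}")
```

The three fractions printed are the 80% Kelly blend attributed to Keynes, half Kelly and quarter Kelly.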

I now list these and other important Kelly criterion properties, updated from MacLean, Ziemba and Blazenko (1992) and MacLean and Ziemba (1999).

Good Maximizing ElogX asymptotically maximizes the rate of asset growth See Breiman (1961), Algoet and Cover (1988).

Good The expected time to reach a preassigned goal is, asymptotically as X increases, least with a strategy maximizing ElogXN. See Breiman (1961), Algoet and Cover (1988), Browne (1997a).

Good Maximizing median logX See Ethier (1987).

Bad False Property: If maximizing ElogXN almost certainly leads to a better outcome then the expected utility of its outcome exceeds that of any other rule provided N is sufficiently large. Counter Example: u(x) = x, 1/2 < p < 1, Bernoulli trials f = 1 maximizes EU(x) but f = 2p – 1 < 1 maximizes ElogXN. See Samuelson (1971), Thorp (1975, 2006).

Good The ElogX bettor never risks ruin. See Hakansson and Miller (1975).

Bad If the ElogXN bettor wins then loses or loses then wins with coin tosses, he is behind. The order of win and loss is immaterial for one, two, ..., sets of trials since (1 + γ)(1 – γ)X0 = (1 – γ^2)X0 < X0. This is not true for favorable games.

Good The absolute amount bet is monotone in wealth. (∂ElogX)/∂W0 > 0.

Bad The bets are extremely large when the wager is favorable and the risk is very low. For single investment worlds, the optimal wager is proportional to the edge divided by the odds. Hence for low risk situations and corresponding low odds, the wager can be extremely large. For one such example, see Ziemba and Hausch (1986; 159-160). There, in the inaugural 1984 Breeders' Cup Classic $3 million race, the optimal fractional wager on the 3-5 shot Slew of Gold was 64%. (See also the 74% future bet on the January effect in Chapter 15.) Thorp and I actually made this place and show bet and won with a low fractional Kelly wager. Slew finished third but the second place horse Gate Dancer was disqualified and placed third. Luck (a good scenario) is also nice to have in betting markets. Wild Again won this race, the first great victory by the masterful jockey Pat Day.

Bad One overinvests when the problem data is uncertain. Investing more than the optimal capital growth wager is dominated in a growth-security sense. Hence, if the problem data provides probabilities, edges and odds that may be in error, then the suggested wager will be too large.

Bad The total amount wagered swamps the winnings - that is, there is much churning. Ethier and Tavare (1983) and Griffin (1985) show that the Expected Gain/E Bet is arbitrarily small and converges to zero in a Bernoulli game where one wins the expected fraction p of games.

Bad The unweighted average rate of return converges to half the arithmetic rate of return. As with the above bad property, this indicates that you do not seem to win as much as you expect. See Ethier and Tavare (1983) and Griffin (1985).

Bad Betting double the optimal Kelly bet reduces the growth rate of wealth to zero plus the risk free rate. See Stutzer (1998) and Janecek (1999) and the appendix in this chapter for a proof.

Good The ElogX bettor is never behind any other bettor on average in 1, 2, ... trials. See Finkelstein & Whitley (1981).

Good The ElogX bettor has an optimal myopic policy. He does not have to consider prior nor subsequent investment opportunities. This is a crucially important result for practical use. Hakansson (1972) proved that the myopic policy obtains for dependent investments with the log utility function. For independent investments and power utility a myopic policy is optimal, see Mossin (1968).

Good The chance that an ElogX wagerer will be ahead of any other wagerer after the first play is at least 50%. See Bell and Cover (1980).

Good Simulation studies show that the ElogX bettor's fortune pulls way ahead of other strategies' wealth for reasonable-sized samples. The key again is risk. See Ziemba and Hausch (1986). General formulas are in Aucamp (1993).

Good If you wish to have higher security by trading it off for lower growth, then use a negative power utility function or fractional Kelly strategy. See MacLean, Sanegre, Zhao and Ziemba (2004) who show how to compute the coefficient to stay above a growth path with given probability. See Figure 13.5 for the idea and the example below.

Bad Despite its superior long-run growth properties, it is possible to have very poor return outcomes. For example, making 700 wagers all of which have a 14% advantage, the least of which had a 19% chance of winning, can turn $1000 into $18. But with full Kelly 16.6% of the time $1000 turns into at least $100,000, see Ziemba and Hausch (1996). Half Kelly does not help much as $1000 can become $145 and the growth is much lower, with final wealth of $100,000 or more only 0.1% of the time.

Bad It can take a long time for a Kelly bettor to dominate an essentially different strategy. In fact this time may be without limit. Suppose strategies A and B have μA = 20%, μB = 10%, σA = σB = 10%. Then in five years A is ahead of B with 95% confidence. But if σA = 20%, σB = 10% with the same means, it takes 157 years for A to beat B with 95% confidence. In coin tossing suppose game A has an edge of 1.0% and game B 1.1%. It takes two million trials to have an 84% chance that game A dominates game B, see Thorp (2006).
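The edge-over-odds wager and the wide spread of outcomes noted in the list above can be sketched with a small simulation. This is a minimal illustration, not a reproduction of the Ziemba and Hausch study. It assumes five win-or-lose wagers at odds of 1-1 through 5-1, each returning 1.14 per 1 bet on average, with one wager chosen at random with equal probability at each trial. The intermediate odds, the selection rule, the seeds and all function names are assumptions.

```python
import random
import statistics

ODDS = (1, 2, 3, 4, 5)      # assumed odds-to-1 for the five wagers
EXPECTED_RETURN = 1.14      # each 1 bet returns 1.14 on average


def win_probability(odds, expected_return=EXPECTED_RETURN):
    """Win probability implied by p * (odds + 1) = expected_return."""
    return expected_return / (odds + 1)


def kelly_bet(p, odds):
    """Optimal log-utility fraction for a win-or-lose bet: edge / odds."""
    edge = p * (odds + 1) - 1
    return edge / odds


def simulate(n_bets=700, kelly_multiple=1.0, wealth=1000.0, seed=0):
    """Final wealth after n_bets wagers at kelly_multiple times the Kelly bet."""
    rng = random.Random(seed)
    for _ in range(n_bets):
        odds = rng.choice(ODDS)          # assumption: wagers equally likely
        p = win_probability(odds)
        stake = kelly_multiple * kelly_bet(p, odds) * wealth
        if rng.random() < p:
            wealth += stake * odds       # win: paid odds-to-1 on the stake
        else:
            wealth -= stake              # loss: the stake is gone
    return wealth


for odds in ODDS:
    p = win_probability(odds)
    print(f"{odds}-1: p = {p:.3f}, Kelly bet = {kelly_bet(p, odds):.3f}")

for multiple in (1.0, 0.5):
    finals = [simulate(kelly_multiple=multiple, seed=s) for s in range(200)]
    print(f"{multiple} x Kelly: min {min(finals):,.0f}, "
          f"median {statistics.median(finals):,.0f}, max {max(finals):,.0f}")
```

The stake never exceeds 14% of current wealth, so wealth stays positive on every path, but the minimum and maximum final wealth can differ by several orders of magnitude.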

Calculating the optimal Kelly fraction

I now discuss how to calculate the optimal Kelly fraction to grow wealth as fast as possible in the long run but to stay above a wealth growth path at particular intervals with high probability in the short run. This approach provides one way to scientifically cut down the size of one's bet to raise security levels while still maintaining high growth levels.

Most applications of fractional Kelly strategies pick the fractional Kelly strategy in an ad hoc fashion. MacLean, Ziemba and Li (2005) show that growth and security tradeoffs are effective for general return distributions in the sense that growth is monotone decreasing in security. But with general return distributions, this tradeoff is not necessarily efficient in the sense of Markowitz (generalized growth playing the role of mean and security the role of variance). However, if the investment returns are lognormal, the tradeoff is efficient. MacLean, Ziemba and Li also develop an investment strategy where the investor sets upper and lower targets and rebalances when those targets are achieved. Empirical tests in MacLean, Sanegre, Zhao and Ziemba (MSZZ) (2004) show the advantage of this approach.

A solution of a version of the problem of how to pick an optimal Kelly fraction was provided in MSZZ (2004). To stay above a wealth path using a Kelly strategy is very difficult since the more attractive the investment opportunity, the larger the bet size and hence the larger the chance of falling below the path. Figure 13.5 illustrates this. An extension of MSZZ (2004) to include convex penalties for drawdowns is in MacLean, Zhao and Ziemba (2009, 2012). By penalizing drawdowns, the actual wealth path is likely to stay above the prespecified wealth path.

images

Fig. 13.5 Kelly fractions and path achievement

MSZZ use a continuous time lognormally distributed asset model to calculate that function at various points in time to stay above the path with a high exogenously specified value at risk probability. They provide an algorithm for this. The idea is illustrated using the following application to the fundamental problem of asset allocation over time, namely, the determination of optimal fractions over time in cash, bonds and stocks. The data in Table 13.7 are yearly asset returns for the S&P500, the Salomon Brothers Bond index and U.S. T-bills for 1980-1990 with data from Data Resources, Inc. Cash returns are set to one in each period and the mean returns for other assets are adjusted for this shift. The standard deviation for cash is small and is set to 0 for convenience.

Table 13.7: Yearly Wealth Relatives on Assets Relative to Cash (%)

images

A simple grid was constructed from the assumed lognormal distribution for stocks and bonds by partitioning ℝ² at the centroid along the principal axes. A sample point was selected from each quadrant to approximate the parameter values. The planning horizon is T = 3, with 64 scenarios (four sample points in each of the three periods, 4^3 = 64), each with probability 1/64, using the data in Table 13.8. The problems are solved with the VaR constraint (Table 13.9) and then, for comparison, with the stronger drawdown constraint (Table 13.10).

Table 13.8: Rates of Return Scenarios

images

VaR Control with w* = a

The model is

images

With initial wealth W(0) = 1, the value at risk is a^3. The optimal investment decisions and optimal growth rate for several values of a, the secured average annual growth rate, and 1 – α, the security level, are shown in Table 13.9. The heuristic described in MSZZ was used to determine A, the set of scenarios for the security constraint. Since only a single constraint was active at each stage the solution is optimal.

The mean return structure for stocks is favorable in this example, as is typical over long horizons.6 Hence the aggressive Kelly strategy is to invest all the capital in stock most of the time.

When security requirements are high some capital is in bonds.

As the security requirements increase the fraction invested in bonds increases.

The three-period investment decisions are more conservative as the horizon approaches.

Secured Annual Drawdown: b

The VaR condition only controls loss at the horizon. At intermediate times the investor could experience substantial loss, and face bankruptcy. A more stringent risk control constraint, drawdown, considers the loss in each period using the model

images

Table 13.9: Growth with Secured Rate

images

Table 13.10: Growth with Secured Maximum Drawdown

images

This constraint follows from the arithmetic random walk ln W(t),

images

The optimal investment decisions and growth rate for several values of b, the drawdown and 1 – α, the security level are shown in Table 13.10.

The heuristic in MSZZ is used in determining scenarios in the solution.

The security levels are different since constraints are active at different probability levels in this discretized problem.

As with the VaR constraint, investment in the bonds and cash increases as the drawdown rate and/or the security level increases.

The strategy is more conservative as the horizon approaches.

For similar requirements (compare a = 0.97, 1 – α = 0.85 and b = 0.97, 1 – α = 0.75), the drawdown condition is more stringent: the Kelly strategy (all stock) is optimal for the VaR constraint, but the drawdown constraint requires substantial investment in bonds in the second and third periods.

In general, consideration of drawdown requires a heavier investment in secure assets and at an earlier time point. It is not a feature of this aggregate example, but both the VaR and drawdown constraints are insensitive to large losses, which occur with small probability.

Control of that effect would require the lower partial mean violations condition or a model with a convex risk measure that penalizes more and more as larger constraint violations occur, see e.g. the InnoALM model in Chapter 14.

The models lead to hair trigger type behavior, very sensitive to small changes in mean values (as discussed in Chapters 3 and 14; see also Figure 3.4).
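The difference between the VaR and drawdown conditions discussed above can be sketched on a handful of wealth paths. This is a minimal illustration only. It assumes the VaR condition has the form P[W(T) ≥ a^T] ≥ 1 – α and the drawdown condition the form P[W(t+1) ≥ b W(t) for every period t] ≥ 1 – α, with equally likely scenarios and W(0) = 1. The four paths and the function names are hypothetical and are not taken from Tables 13.8 to 13.10.

```python
def var_satisfied(paths, a, alpha):
    """Check P[W(T) >= a**T] >= 1 - alpha over equally likely wealth paths."""
    horizon = len(paths[0]) - 1
    hits = sum(1 for w in paths if w[-1] >= a ** horizon)
    return hits / len(paths) >= 1 - alpha


def drawdown_satisfied(paths, b, alpha):
    """Check P[W(t+1) >= b * W(t) for all t] >= 1 - alpha over the paths."""
    hits = sum(
        1 for w in paths
        if all(w[t + 1] >= b * w[t] for t in range(len(w) - 1))
    )
    return hits / len(paths) >= 1 - alpha


# Hypothetical three-period wealth paths, each starting at W(0) = 1.
paths = [
    [1.0, 1.10, 1.21, 1.33],
    [1.0, 0.80, 1.00, 1.20],   # recovers by the horizon after a 20% loss
    [1.0, 1.05, 1.04, 1.10],
    [1.0, 0.98, 0.96, 0.95],
]

print(var_satisfied(paths, a=0.97, alpha=0.15))
print(drawdown_satisfied(paths, b=0.97, alpha=0.15))
```

The second path ends well above a^3 yet loses 20% in the first period, so under these assumptions it counts toward the VaR condition but against the drawdown condition.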

Appendix

Proof that betting exactly double the Kelly criterion amount leads to a growth rate equal to the risk free rate. This result is due to Thorp (1997), Stutzer (1998) and Janecek (1998) and possibly others. The following simple proof is due to Harry Markowitz.

In continuous time

images

Ep, Vp, gp are the portfolio expected return, variance and expected log, respectively. In the CAPM

images

where X is the portfolio weight and r0 is the risk free rate. Collecting terms and setting the derivative of gp to zero yields

images

which is the optimal Kelly bet with optimal growth rate

images

Substituting double Kelly, namely Y = 2X for X above into

images

and simplifying yields

images

Hence g0 = r0 when Y = 2X.

The CAPM assumption is not needed. For a more general proof and illustration, see Thorp (2006).
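The steps of the proof above can be written out as follows. This is a sketch consistent with the definitions in the text. The symbols E_M and σ_M² for the expected return and variance of the risky (market) portfolio, and g* for the optimal growth rate, are assumed notation.

```latex
\begin{align*}
g_p &= E_p - \tfrac{1}{2} V_p, \\
E_p &= r_0 + (E_M - r_0)\,X, \qquad V_p = \sigma_M^2 X^2, \\
g_p &= r_0 + (E_M - r_0)\,X - \tfrac{1}{2}\sigma_M^2 X^2, \\
\frac{d g_p}{d X} &= (E_M - r_0) - \sigma_M^2 X = 0
  \;\Longrightarrow\; X = \frac{E_M - r_0}{\sigma_M^2}, \\
g^{*} &= r_0 + \frac{(E_M - r_0)^2}{2\,\sigma_M^2}, \\
g_0 &= r_0 + (E_M - r_0)(2X) - \tfrac{1}{2}\sigma_M^2 (2X)^2
     = r_0 + \frac{2 (E_M - r_0)^2}{\sigma_M^2} - \frac{2 (E_M - r_0)^2}{\sigma_M^2}
     = r_0 .
\end{align*}
```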

1For those who would like a technical survey of capital growth theory, see MacLean, Thorp and Ziemba (2010).

2For one or two assets with fixed odds, take derivatives and solve for the optimal wagers. For multi-asset bets under constraints, and when portfolio choices affect returns (odds), one must solve a nonlinear program, which may be non-convex.

3Another is to look for lottery design errors. As a consultant on lottery design for the past thirty years, I have seen plenty of these. My work has been largely to get these bugs out before the games go to market and to minimize the damage when one escapes the lottery commissions' analysis. Design errors are often associated with departures from the pure parimutuel method, for example guaranteeing the value of smaller prizes at too high a level and not having the games checked by an expert.

4See Ziemba et al. (1986), Dr Z's Lotto 6/49 Guidebook. While parts of the guidebook are dated, the concepts, conclusions, and most of the text provide a good treatment of such games. For those who want more theory, see MacLean and Ziemba (1999, 2006).

5An estimate of the number of tickets sold versus the carryover in millions is proportional to the carryover to the power 0.811. Hence, the growth is close to 1:1 linear. See Ziemba et al. (1986).

6See e.g. Keim and Ziemba (2000), Dimson et al. (2006), Constantinides (2002) and Siegel (2002).