Managing Discretionary Wealth
I wrote this draft article several years after the first one on “better risk management”, with a similar message. In a somewhat more polished form, it was published in The Journal of Portfolio Management Spring 2003, Vol. 29, No. 3: pp. 58-65.
Harry Markowitz & the Discretionary Wealth Hypothesis
Draft Copyright 2000 Jarrod W. Wilcox
January 7, 2003
In his 1959 book, Harry Markowitz showed how return mean and variance combined to determine expected long-term growth rate of capital. But the maximization of that growth rate seemed to fit the risk preferences of only a narrow range of aggressive investors with no concern for shortfalls. This paper generalizes that goal to both conservative and aggressive investors by mapping the distribution of returns on total wealth to that of returns on discretionary wealth. It also broadens the definition of risk to include return skew and kurtosis where required, fully encompassing the concept of downside risk. The resulting change in frame of reference extends Markowitz’s criterion to many practical investment decisions involving maximizing long run wealth while controlling the probability of shortfalls along the way.
We often wish to better understand the long-run impact of our short-term investment policies. One possible tool is Monte Carlo simulation, the random generation of many alternatives to discover the probability distribution of multi-period outcomes. Not many investors find this easy to implement. There is an unfilled need for practical guidance.
For single periods, we have the mean-variance criterion developed in the 1950’s by Harry Markowitz. He proposed that in each investment period investors should strive for portfolio returns having the greatest difference between their mean and the product of 1) the return’s variance, or expected squared difference from the mean, and 2) a risk aversion coefficient specific to each investor. This is a very useful yardstick, but it is inadequate for constructing policies that will lead to maximum long-term results with acceptable safety against shortfalls along the way.
In his 1959 book, Markowitz showed how return mean and variance combine to affect expected long-term growth rate of capital. But the maximization of expected portfolio growth rate seemed to fit the risk preferences of only a narrow range of aggressive investors. The purpose of the present paper is to show how to better use Markowitz’s ideas for achieving longer-run objectives. To do so, his criterion will be extended with optimal growth and shortfall avoidance concepts. This task has been attempted before with limited success, most notably by Hakansson [1971]; here we take a different approach, the discretionary wealth hypothesis, to overcome the remaining obstacles. By the end of the paper, we will have clarified not only the long-run impact of short-run policies, but also the perceived need to distinguish between variance and “downside variance,” referred to by Markowitz in his later writings as the semivariance.
To keep the scope of the paper within manageable limits, the possible implications for market pricing if investors as a whole act so as to achieve better long-term results will not be considered. The paper will also refrain from adding to the investment literature on implementation problems arising from erroneous risk estimates.
I. GROWTH MODEL CONTEXT
An appealing source for conceptual cross-fertilization with Harry Markowitz’s mean-variance criterion is the optimal growth theory introduced by John Kelly [1956]. In his framework, the investor maximizes the expected rate of long-run return in an investment process with independent returns by maximizing the expected logarithm of each single-period return multiple. Kelly was writing from the vantage point of information theory, and made no attempt to fit his concept to the utility theory for risks that was then beginning to take hold in modern finance. Hakansson [1971] modified Kelly’s model in an attempt to bring it into the discourse of economists and to generalize it to the preferences of conservative investors. Hakansson also made the important observation that such a strategy would tend to maximize one’s median wealth along the way.
However, after generating considerable controversy, the growth-optimal model was not generally adopted within finance, for two reasons. On the practical side, it was found that resulting portfolio optimization of weights of stocks in diversified portfolios gave results hard to distinguish from those of Markowitz mean-variance optimization. On the theoretical side, no adequate response was made to the objection by Merton and Samuelson [1974]. They argued that maximizing expected log return, or even Hakansson’s proposed linear combination of the mean and variance of log return, could not account for the preferences of conservative investors in a way consistent with utility theory. In the aftermath of that academic exchange, nearly all investors with a quantitative bent focused on an efficient tradeoff between single-period mean and variance. They did so despite the handicap that there seemed no objective way to determine a best risk aversion coefficient to construct their tradeoff. They also were willing to put aside everyday experience that suggested risk aversion to downside risk in the form of negatively-skewed and fat-tailed return distributions.
Recent work in this arena most accessible to readers who are not mathematicians includes Wilcox [2000] and Kritzman and Rich [2002], who note the crucial role of interim shortfalls in determining risk aversion. Recent academic work on the implications of optimal growth for pricing includes Bekaert et al. [1997] and Harvey and Siddique [2000]. They have found strong evidence that the third and fourth moments of returns, risk features not captured by variance, can be priced. A representative study of multi-period investment policy under specialized assumptions is that by Barberis [2000].
II. GROWING DISCRETIONARY WEALTH
Let us first clarify why logarithmic returns are important in understanding long-run investment results, and why the median result is of great practical importance. We will then discuss bridging steps between Markowitz mean-variance optimization and optimal growth models. With these fundamentals in place, the discretionary wealth hypothesis completes the connection by showing how to manage the risk of interim shortfalls.
The Centrality of Logarithmic Returns
Though many investors are familiar with returns based on natural logarithms only as the continuous interest rate, any fractional return can be converted to such a log return. For example, a 50% return gives a return multiple of 1.5 and a log return of 0.405. (In this case, 0.405 is the power to which the natural constant e, or 2.718, must be raised to give 1.5 as a result.) Log returns are important because they convert the representation of a compounding process from multiplication to addition. The result of compounding fractional returns r is the product of their return multiples 1+r. Alternatively, the same answer can be gotten by converting each return multiple into its natural logarithm, adding these together, and taking the antilog of the result (raising e to that power). This perspective explains why the wealth outcomes of long-run compounding usually display a particular positively-skewed statistical distribution (the log-normal).
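As a small illustration of this conversion, the sketch below (Python, standard library only) checks that multiplying return multiples and summing log returns lead to the same ending wealth. The 50% return is the one used in the text; the other three returns are hypothetical values chosen only for illustration.

```python
import math

# Fractional returns for four periods. The first (50%) is from the text;
# the remaining three are hypothetical, for illustration only.
returns = [0.50, -0.10, 0.20, 0.05]

# Route 1: multiply the return multiples 1 + r.
wealth_by_product = math.prod(1.0 + r for r in returns)

# Route 2: add the log returns, then take the antilog (raise e to that power).
log_returns = [math.log(1.0 + r) for r in returns]
wealth_by_logs = math.exp(sum(log_returns))

print("log return of a 50% gain:", round(log_returns[0], 3))
print("ending wealth via product:", wealth_by_product)
print("ending wealth via summed logs:", wealth_by_logs)
```

Both routes agree up to floating-point rounding, which is the sense in which log returns turn compounding into addition.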
One does not need to reject active investing to note that though stock prices may be determined by predictable factors, the market appears relatively efficient in incorporating changes in these factors into changes in prices. Consequently, successive investment returns of traded securities appear to us, to a first approximation, as practically independent random events. We know from the Central Limit Theorem of statistics that the distribution of a sum of independent random numbers tends toward a bell-shaped normal distribution as more numbers are included. This is true for identically-distributed returns with finite variance and it is usually true in real-world practice where successive returns are drawn from somewhat different probability distributions.
Although it is possible to create a compounding process where the Central Limit Theorem does not hold, as from a dynamic hedging program, extreme departures are readily identified and isolated. For most practical purposes, the sum of the log returns created by long-run compounding of investment returns approaches the neighborhood of a normal, bell-shaped distribution as the number of periods is increased. Taking the antilog of such distributions, one discovers that the ratio of ending wealth to starting wealth will be positively skewed, approximating the log-normal distribution.
Because long-run wealth outcomes are nearly log-normally distributed, average wealth is strongly influenced by low-probability sequences in which unusually high single-period returns are compounded. Consequently, mean wealth will be greater, often much greater, than the median, or 50th percentile, wealth from a long-run compounding process. Median wealth is closer to what most outcomes will be. Happily, we can estimate median wealth in advance, because of the following relationship.
Since the normal distribution is symmetric, after a sufficient number of periods, the mean and median of the distribution of logarithms of wealth must converge. Consequently, the mean log return each period, by determining mean log return for the long-term, determines the median logarithm of long-term wealth and thus the median long-term wealth. When we maximize mean log return, we maximize median long run wealth, a desirable outcome in itself, and we tend, other things equal, to reduce the probability of an interim low-performance shortfall.
We can clarify these ideas with a concrete example. Begin with $1, which is to be compounded for 10 periods. Each period there is either a gain of 20%, giving a log return of 0.182, or a loss of 10%, with a log return of -0.105. The probability of a gain is 60%. After one period, the mean return is 8%. What will be the result after compounding?
The final mean wealth is 1.08 raised to the 10th power, or $2.16. The median requires a different approach. The mean single-period log return is 0.6*.182 + 0.4 * (-.105) = 0.0672. Multiplying that by 10, we arrive at the estimate for final median log wealth of 0.672. Taking the antilog of that result, we estimate median final wealth at $1.96.
Let us use computer simulation to check our estimate. We randomly generate 100,000 sequences of 10 periods of compounded returns, a Monte Carlo simulation. Exhibit 1, essentially a cumulative probability chart turned on its side, plots compounded wealth on a vertical logarithmic scale and its percentile rank (100th being the highest) on the horizontal scale. The S-shaped heavy line (actually the overlapping of 100,000 small circles) shows the distribution of outcomes. Its intersection with the centered vertical marker shows the median outcome. The three horizontal lines show, starting at bottom, the original wealth of $1, the median wealth of $1.96, and the mean wealth of $2.16. These amounts closely confirm those predicted earlier.
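A simulation of this kind can be sketched in a few lines. The version below is a minimal Python sketch using only the standard library, not the program used to produce Exhibit 1; the seed value and the variable names are arbitrary choices made here for repeatability and readability.

```python
import math
import random
import statistics

# Single-period outcomes from the example: +20% with probability 60%, else -10%.
GAIN, LOSS, P_GAIN = 0.20, -0.10, 0.60
PERIODS, TRIALS = 10, 100_000

def simulate_final_wealth(rng):
    """Compound $1 over PERIODS independent gain/loss draws."""
    wealth = 1.0
    for _ in range(PERIODS):
        wealth *= 1.0 + (GAIN if rng.random() < P_GAIN else LOSS)
    return wealth

rng = random.Random(2003)  # fixed seed: an arbitrary choice, for repeatability
outcomes = [simulate_final_wealth(rng) for _ in range(TRIALS)]

# Analytic predictions described in the text.
mean_return = P_GAIN * GAIN + (1 - P_GAIN) * LOSS
mean_log_return = (P_GAIN * math.log(1 + GAIN)
                   + (1 - P_GAIN) * math.log(1 + LOSS))
predicted_mean = (1 + mean_return) ** PERIODS
predicted_median = math.exp(PERIODS * mean_log_return)

simulated_mean = statistics.fmean(outcomes)
simulated_median = statistics.median(outcomes)
print("mean:   simulated", round(simulated_mean, 2), "predicted", round(predicted_mean, 2))
print("median: simulated", round(simulated_median, 2), "predicted", round(predicted_median, 2))
```

Because each path has only eleven possible ending values (zero through ten gains), the simulated median lands on the six-gain outcome, which coincides with the antilog of ten times the mean log return.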
Why Variance Does Not Fully Capture Risk
There is a deep relationship between the statistical moments that describe a distribution of returns and the successive terms of a Taylor series whose sum is mean log return. By making expected log return more transparent, this relationship shows how to improve long-run median outcomes by showing how they depend on the scale and shape of the single-period return distribution and on the leverage we apply to it. It will also help us understand why some investors may be especially unsatisfied with return variance as the entire measure of risk.
Statistical Moments: The mean, E, is sometimes called the first moment of a statistical distribution. The expected value of the difference between a random outcome and its mean is zero. The expected value of that difference squared is called the second central moment, or for short, the variance V. The expected value of that difference raised to the third power is called the third central moment; raised to the fourth power, it is termed the fourth central moment, and so on. These successive central moments describe the distribution’s dispersion, its lopsidedness and its tendency to have both a central spike and abnormally long tails (more extreme events).
Taylor Series: Many common mathematical functions of a number can be expressed as the sum of an infinite series of terms of increasing powers of that number, beginning with a constant. The natural logarithm of 1+r can be expressed as the natural log of the mean return multiple, ln(1+E), plus a series of terms involving increasing powers of the difference between the return r and its mean, E.
The expected value of the log return is the sum of the expected values of the terms in this Taylor series, making it a function of the central moments of return. We can go further toward linking the formula to commonly-used statistical parameters as follows. The third central moment can be decomposed to a shape parameter, skewness S, multiplied by variance V raised to the 3/2 power. The fourth central moment can be decomposed to a shape parameter, kurtosis K, multiplied by the variance V squared. We then obtain the rather fearsome-looking formula of Equation 1. It links expected log return to statistical parameters that can be easily calculated in a spreadsheet such as Excel or in any statistical software package. The incremental information sought by many investors in avoiding “downside risk” or semivariance is captured by the third and fourth terms.
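Written out from the Taylor expansion and the moment decompositions just described, Equation 1 would take the following form. This is a reconstruction from the definitions in the text rather than a copy of the typeset equation; "Mean" denotes the expected value, and the remaining symbols are those listed immediately below.

```latex
\operatorname{Mean}\bigl[\ln(1+r)\bigr] \;\approx\;
  \ln(1+E)
  \;-\; \frac{V}{2\,(1+E)^{2}}
  \;+\; \frac{S\,V^{3/2}}{3\,(1+E)^{3}}
  \;-\; \frac{K\,V^{2}}{4\,(1+E)^{4}}
  \tag{1}
```

The first-order term vanishes because the expected difference between r and its mean is zero, and the alternating signs come directly from the Taylor series of the logarithm.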
ln – the natural log function
r – fractional return, the conventional return measure
E – mean r
V – variance of r
S – skewness of r
K – kurtosis of r, for a normal distribution K=3.
For diversified portfolios not incorporating derivatives or leverage, only the leftmost two terms of the formula in Equation 1 are required to produce a good estimate of mean log return. This two-parameter version is the form derived by Markowitz. When the mean return is small, the Markowitz version can be approximated as mean return less half the variance, or E – V/2. In the example illustrated in Exhibit 1, it provides an estimate of mean log return of 0.0690 as compared to the true 0.0672.
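These approximations can be checked numerically for the two-outcome example. The sketch below is a Python illustration, assuming the four-term Taylor form described in the text (log of the mean return multiple, then variance, skewness and kurtosis terms with alternating signs); the function and variable names are introduced here and are not the author's. For this distribution the simple E – V/2 rule comes out at roughly 0.069, and the four-term estimate falls very close to the exact mean log return.

```python
import math

# Two-outcome single-period distribution from the example:
# (fractional return, probability)
distribution = [(0.20, 0.60), (-0.10, 0.40)]

E = sum(p * r for r, p in distribution)          # mean return

def central_moment(n):
    """Expected value of (r - E) raised to the n-th power."""
    return sum(p * (r - E) ** n for r, p in distribution)

V = central_moment(2)                             # variance
S = central_moment(3) / V ** 1.5                  # skewness
K = central_moment(4) / V ** 2                    # kurtosis

def taylor_mean_log(E, V, S, K, terms=4):
    """Truncated Taylor estimate of mean log return (assumed form, see lead-in)."""
    parts = [
        math.log(1 + E),
        -V / (2 * (1 + E) ** 2),
        S * V ** 1.5 / (3 * (1 + E) ** 3),
        -K * V ** 2 / (4 * (1 + E) ** 4),
    ]
    return sum(parts[:terms])

exact = sum(p * math.log(1 + r) for r, p in distribution)
simple = E - V / 2                                # small-return shortcut
two_term = taylor_mean_log(E, V, S, K, terms=2)
four_term = taylor_mean_log(E, V, S, K, terms=4)

for label, value in [("exact", exact), ("E - V/2", simple),
                     ("two-term", two_term), ("four-term", four_term)]:
    print(f"{label:10s} {value:.4f}")
```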
Equation 1 provides us with important insights even before we introduce the discretionary wealth hypothesis. Each succeeding term contributes additional information about events that are more extreme and have smaller probability. Note also that if time periods for measuring return are kept short, the variance V is reduced. That implies that the third, fourth and higher central moments, which include higher powers of V, being reduced in greater proportion, will contribute less and less to the determination of median long-term results. This taming of unruly return distributions offers a sound theoretical basis for the oft-criticized practice of close attention to short-term results.
An investor whose risk aversion coefficient in the Markowitz mean-variance framework happens to be near 1/2 will pursue a policy approximating maximum median wealth over the long run. However, such a policy is more aggressive than the preferences most of us seem to exhibit. This disparity raises the question of what the rest of us are doing, a question to be answered later by the discretionary wealth hypothesis.
Truncated Growth Models
To achieve its remarkable simplification, Kelly’s growth-optimal model assumes an infinite number of time periods. It says nothing about time preference or finite lifetimes. The same is true for Markowitz’s single-period mean-variance criterion. Sometimes the single-period Markowitz mean-variance criterion pays too little attention to possible disasters with small single-period probabilities. In contrast, Kelly’s growth model, because it is based on an infinite number of periods, can pay too much attention to extremely low probability disasters. That is, our investment interest is usually limited to the impact of events likely during one or two lifetimes.
Consider an analogy. Suppose the probability of an automobile driver fatality in the US is about one one-hundredth of a percent per year. If a driver were to drive for thousands of years, the median driving outcome would be grim. But since the cumulative probability of a driving fatality over a realistic lifetime is only about 0.5%, and since driving helps us with many other goals, most of us rationally decide to drive.
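As a rough check on these figures, assume independent years at an annual fatality probability of 0.0001 and, as a hypothetical horizon not stated in the text, about 50 years of driving:

```latex
1 - (1 - 0.0001)^{50} \;\approx\; 0.0050 \;=\; 0.5\%,
\qquad
n_{\text{median}} \;=\; \frac{\ln 0.5}{\ln(1 - 0.0001)} \;\approx\; 6{,}900 \text{ years}
```

Under the same assumptions, the cumulative probability passes 50% only after roughly 6,900 years, which is the sense in which the median outcome turns grim over thousands of years.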
We automatically reduce the influence of tiny-probability extreme events whenever we approximate mean log return with a truncated Taylor series. If we start with the linear mean-variance criterion innovated by Markowitz, we take most dispersion into account. When we go further by using the first four terms of the Taylor series for mean log return, we take unusual events seriously. When we go even further and work directly with expected log returns, we avoid any risk with disastrous consequences, no matter how small its probability during the lifetime of our contemplated investment policy.
The Discretionary Wealth Hypothesis
Now we proceed to adapt Kelly’s model, whether as originally published, or in truncated Taylor series form, to the needs of the great majority of investors too conservative to maximize growth in total wealth by assuming a Markowitz risk aversion coefficient of only 1/2. We will assume that risk aversion is caused by the need to avoid shortfalls, not only at some far-off ending period, but all along the way. The discretionary wealth hypothesis asserts that investors will be better off if they strive to maximize their median discretionary wealth over the long run. Discretionary wealth is the amount one could afford to lose without suffering whatever one defines as a shortfall disaster.
By specifying the shortfall boundary as the zero-point of discretionary wealth, we place it infinitely far away in logarithmic terms from median discretionary wealth, out of reach for a log-normal distribution. Consequently, if we truly maximize expected log return of discretionary wealth, we will have the best possible growth in median wealth without an interim shortfall (after a warm-up period for the Central Limit Theorem to take hold). If, on the other hand, we maximize our truncated Taylor series estimate, with four terms, we convert an exact formula to a more heuristic guide. That is, we allow a residual probability of eventual shortfall; but we may by this means achieve something closer to maximum median discretionary wealth during our lifetimes.
The addition of the discretionary wealth hypothesis answers the two main objections to growth-optimality as a basis for investing, the practical and the theoretical. First, it is very often approximated by Markowitz’s mean-variance criterion. This will generally be true for diversified portfolios without large-scale use of derivatives or high leverage. In practice, there is no additional mathematical complexity except in cases where it is needed. Second, it responds to Merton and Samuelson’s theoretical critique, though in a surprising way.
Classical financial utility theory represents conservative investors as having utility functions that are more strongly curved. What we do here is the alternative, varying the apparent distance between two outcomes by changing the scaling of return from that on risky assets to that on discretionary wealth. For the utility theorist, we have said that all investors will be better off if they act as though their utility were given by the log of their discretionary wealth. Every investor is advised to have the same-shaped utility function, of the form log(w-c), where w-c is discretionary wealth. Note that total wealth w is usually more variable over time than is the shortfall point c.
How Discretionary Wealth Affects Risk
The size of the risky portfolio asset will not in general be the same as the size of discretionary wealth. Their ratio, which may be greater or less than one, will be termed implicit leverage. An investor fully invested in stocks but with a discretionary wealth fraction D of only 20% has an implicit leverage of 5 times.
For a given implicit leverage, one could in principle re-scale each return on risky assets to an equivalent return on discretionary assets, and from there directly estimate mean log return on discretionary assets. However, it is much more instructive for design purposes to start with the Taylor series representation of expected log return on risky assets as a base, as in Equation 1. Then one can observe the separate effects on expected log return, and thus preferences, of implicit leverage applied to the mean and to each central moment of the risky asset return. Implicit leverage and the manipulation of these statistical moments of return are the policy design parameters that confront investors.
The discretionary wealth hypothesis translates the relevant expected log return estimate to that shown in Equation 2. Note that rescaling Equation 1 for implicit leverage L is just the multiplication of both return mean and standard deviation (the square root of variance) by that leverage.
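Applying that rescaling to the four-term Taylor estimate (E becomes LE and the standard deviation becomes L times its former value, so V becomes L²V while the shape parameters S and K are unchanged), Equation 2 would read as follows. This is a reconstruction from the rescaling rule stated in the text, not a copy of the typeset equation; "Mean" denotes the expected value, and the symbols are those listed immediately below.

```latex
\operatorname{Mean}\bigl[\ln(1+Lr)\bigr] \;\approx\;
  \ln(1+LE)
  \;-\; \frac{L^{2}\,V}{2\,(1+LE)^{2}}
  \;+\; \frac{S\,L^{3}\,V^{3/2}}{3\,(1+LE)^{3}}
  \;-\; \frac{K\,L^{4}\,V^{2}}{4\,(1+LE)^{4}}
  \tag{2}
```

Setting L = 1 recovers the unleveraged estimate for returns on the risky assets themselves.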
ln – the natural log function
r – fractional return, the conventional return measure
L – implicit leverage
E – mean r
V – variance of r
S – skewness of r
K – kurtosis of r, for a normal distribution K=3.
Equation 2 answers the question of when higher moments of return matter. The answer is not just when facing skewed or fat-tailed distributions of the return of underlying assets. It is the separate conjunction of each moment of underlying asset return with the increasing powers of the implicit leverage created by the shortfall constraints for a particular investor. Investors with higher implicit leverage should be more sensitive to variance, still more sensitive to negative skewness, and still more sensitive to the existence of fat tails in the return distribution. This is a fundamental result. It implies that investors with high implicit leverage in an investment environment that may have large third and fourth central moments of return should not be satisfied with return variance as the entire measure of risk.
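The way each moment's contribution grows with implicit leverage can be seen in a short numerical sketch. The Python below assumes the leveraged four-term Taylor form implied by the rescaling rule in the text (mean and standard deviation both multiplied by L); the parameter values E = 0.06, V = 0.04, S = -0.5 and K = 4 are hypothetical and chosen only to illustrate a mildly skewed, fat-tailed asset.

```python
import math

def leveraged_terms(L, E, V, S, K):
    """Four terms of the leveraged Taylor estimate of mean log return
    on discretionary wealth (assumed form; see the lead-in)."""
    base = 1.0 + L * E
    return (
        math.log(base),                                # mean term
        -(L ** 2) * V / (2 * base ** 2),               # variance term
        S * (L ** 3) * V ** 1.5 / (3 * base ** 3),     # skewness term
        -K * (L ** 4) * V ** 2 / (4 * base ** 4),      # kurtosis term
    )

# Hypothetical asset parameters, for illustration only.
E, V, S, K = 0.06, 0.04, -0.5, 4.0

results = {}
for L in (1, 2, 5):
    terms = leveraged_terms(L, E, V, S, K)
    results[L] = terms
    print(f"L={L}: mean={terms[0]:+.4f} var={terms[1]:+.4f} "
          f"skew={terms[2]:+.4f} kurt={terms[3]:+.4f} total={sum(terms):+.4f}")
```

With these inputs, moving from L = 1 to L = 5 enlarges the variance penalty far less than the skewness penalty, and the skewness penalty far less than the kurtosis penalty, so an estimate that is comfortably positive without leverage turns negative at five times leverage.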
III. BRIEF EXAMPLES
Appropriate Risk Aversion:
The Markowitz criterion can produce an efficient frontier of tradeoffs between mean and variance of portfolio return. It says nothing about which point on the frontier should be selected. Using the simplest version of Equation 2, one finds that the ideal implicit leverage is approximately the ratio of expected real return to variance, or E/V, and thus the proportion of wealth allocated to risky assets should be about D(E/V), where D is the fraction of assets considered discretionary. Setting that allocation equal to the one prescribed by Markowitz, we discover that the best point on the efficient frontier will be obtained if we set the Markowitz risk aversion coefficient equal, not to ½, but to 1/(2D). Specifying D is usually easier, and is never harder, than the alternative of specifying the Markowitz risk aversion coefficient directly.
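The steps behind these rules can be sketched as follows, using only the small-return form of the leveraged estimate. Here the Markowitz objective is written as mean minus λ times variance, with w the fraction of total wealth placed in risky assets; λ and w are notation introduced here for illustration.

```latex
\begin{aligned}
\max_{L}\;\Bigl(LE - \tfrac{1}{2}L^{2}V\Bigr)
  &\;\Longrightarrow\; E - L^{*}V = 0
  \;\Longrightarrow\; L^{*} = \frac{E}{V} \\[4pt]
w &= D\,L^{*} = \frac{D\,E}{V} \\[4pt]
\max_{w}\;\Bigl(wE - \lambda\,w^{2}V\Bigr)
  &\;\Longrightarrow\; w = \frac{E}{2\lambda V} \\[4pt]
\frac{D\,E}{V} = \frac{E}{2\lambda V}
  &\;\Longrightarrow\; \lambda = \frac{1}{2D}
\end{aligned}
```

When all wealth is discretionary (D = 1) this returns the coefficient of 1/2 associated with maximizing growth of total wealth; a discretionary fraction of 20% gives a coefficient of 2.5.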
Comparing Active Managers:
Information ratios are nearly ubiquitous in professional discussion of the performance of active investment managers. Very few investors know the restrictive assumptions necessary to make maximizing this ratio consistent with long-run growth in median wealth. It is not difficult to construct cases where the information ratio gives a ranking to investment policies or managers that is quite misleading for that goal.
That the information ratio takes no account of higher return moments clearly makes it inapplicable for evaluating a portfolio insurance program. But it can be misleading even in cases involving only return mean and variance. Here is a case in point.
Assume market returns in excess of cash have an annual mean of 6% and a standard deviation of 20%, arising from a log-normal distribution with appropriately translated mean and variance. Two active managers of a client’s entire all-equity portfolio are compared – one with an extra return of 6% and a tracking error of 8%, the other with an extra return of 5% and a tracking error of 1%. Log return and variance are appropriately translated and assumed independent of the market log return. The first manager’s commonly measured information ratio of 0.75 is far less than the second’s 5.0. Yet if we calculate mean log return for the total portfolio, it is clear that it is the first manager who offers superior long run results while avoiding shortfall for any investor whose discretionary wealth fraction is greater than about 0.35.
Avoiding Dynamic Hedging Pitfalls:
Anecdotally, investors employing dynamic hedging have often found that they become stuck near a protective floor. This phenomenon might be attributed to unfortunate jumps in pricing, but there is a more general explanation.
Black and Perold’s [1992] CPPI procedure allows the production of option-like positions through trading rules. This procedure dynamically allocates total assets to a risky asset in proportion to a multiple of the “cushion,” the difference between current wealth and a desired protective floor. This produces an effect similar to owning a put option, and if employed without borrowing constraints, also a call option. If the return distribution statistics are constant, and the contribution of higher moments of return to results is minimal, it is similar to the procedure proposed in this paper. The crucial distinction is that CPPI multiples are driven to high ratios by the need to create strong option effects rather than the moderate levels appropriate for best long-run median wealth.
Using Equation 2, it becomes obvious that CPPI multiples of 5 or more necessary to produce a saleable option effect cause too much implicit leverage for the cushion, resulting in a negative mean log return and eventual entrapment near the protected floor. This phenomenon is only partially offset by typical constraints against outside borrowing. Such constraints lower implicit leverage after a sequence of good returns, allowing most investors to escape to safer regions unless early results are negative. However, a substantial fraction of investors will be left behind.
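For a sense of scale, take as an assumption the market parameters used in the manager comparison above (a mean excess return of 6% and a standard deviation of 20%, so E = 0.06 and V = 0.04), treat the CPPI multiple as the implicit leverage on the cushion, and keep only the first two terms of the leveraged estimate with L = 5:

```latex
\ln(1 + 5 \times 0.06) \;-\; \frac{5^{2} \times 0.04}{2\,(1 + 5 \times 0.06)^{2}}
\;=\; \ln(1.3) - \frac{1}{3.38}
\;\approx\; 0.262 - 0.296
\;=\; -0.034 \;<\; 0
```

The kurtosis term is always negative and grows with the fourth power of L, so including it lowers the estimate further.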
The purpose of this paper has been to make available to every interested portfolio manager a method to better manage the long-term results of investment policies.
We began to build on the Markowitz single-period mean-variance criterion by examining Kelly’s exponential growth model. His criterion of maximizing single-period expected log returns achieves maximum long-run median wealth if no shortfall intervenes to interfere with the process. We used Markowitz’s logic to translate mean log return to a Taylor series representation in terms of the statistical moments of return. By extending the series to four terms, rather than the two at which Markowitz stopped, we showed how the concept of risk extends naturally beyond variance to include negative skewness and excess kurtosis, or fat-tailed return distributions, thus encompassing downside risk.
To account for the needs of conservative investors who must avoid shortfalls along the way to the long run, we re-mapped returns on total wealth to amplified returns on discretionary wealth, the wealth available before shortfall. We represent conservatism not by greater curvature of an abstract utility function, but by greater distance between investment outcomes based on measuring returns against fractions of wealth considered discretionary. For example, a 10% loss for an investor who can lose no more than 20% of total assets without shortfall is represented as a 50% loss.
The addition of the discretionary wealth hypothesis generalizes Markowitz’s framework to investment policy governing both long-run outcomes and risk features not captured by return variance. What is new is its focus on the interaction between shortfall boundaries and leverage in determining suggested separate investor preferences for return mean, variance, skew, and kurtosis. What makes it practically very useful is its relative simplicity.
The author is grateful to Gary Gastineau, Campbell Harvey, Blake LeBaron, Ben Shoval, Dan Rie, Mark Kritzman, Michael Wilcox and Richard Holmes for their comments and encouragement.
Barberis, Nicholas. “Investing for the Long Run when Returns Are Predictable.” Journal of Finance, 55 (2000), pp. 225-264.
Bekaert, Geert, Claude Erb, Campbell Harvey and Tadas Viskanta. “Distributional Characteristics of Emerging Market Returns and Asset Allocation.” Journal of Portfolio Management, 24 (Winter 1997), pp. 102-116.
Black, Fischer, and André Perold. “Theory of Constant Proportion Portfolio Insurance.” Journal of Economic Dynamics and Control, 16 (1992), pp. 403-426.
Hakansson, Nils H. “Multi-Period Mean-Variance Analysis: Toward A General Theory of Portfolio Choice.” Journal of Finance, 26 (1971), pp. 857-884.
Harvey, Campbell R., and Akhtar Siddique. “Conditional Skewness in Asset Pricing Tests.” Journal of Finance, 55 (2000), pp. 1263-1295.
Kelly, J. L., Jr. “A New Interpretation of Information Rate.” Bell System Technical Journal, 35 (1956), pp. 917-926.
Kritzman, Mark, and Don Rich. “The Mismeasurement of Risk.” Financial Analysts Journal, (May-June 2002), pp. 91-99.
Markowitz, Harry M. Portfolio Selection: Efficient Diversification of Investments. New Haven, Conn.: Yale University Press, 1959.
Merton, Robert C., and Paul A. Samuelson. “Fallacy of the Log-Normal Approximation To Optimal Portfolio Decision-Making Over Many Periods.” Journal of Financial Economics, 1 (1974), pp. 67-94.
Wilcox, Jarrod. “Better Risk Management.” Journal of Portfolio Management, 26 no. 4 (Summer 2000), pp. 53-64.