Semivariance Excel Example

The most requested topic on this blog is an Excel semivariance example. I have posted mathematical semivariance formulas before, but now I am providing a description of exactly how to compute semivariance in “vanilla” Excel… no VBA required.

The starting point is column D. Cell D$2 contains the average return over the past 36 months. The range D31:D66 contains those 36 monthly returns.  Thus the contents of D$2 are simply:

=AVERAGE(D31:D66)

This leads us to the full formula, which computes annualized semi-deviation:

{=SQRT(12)*(SQRT(SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1)))}

We will now examine each building block of this formula starting with

IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0)

We only want to measure “dips” below the mean return. For each observation that “dips” below the mean we take the square of the dip; otherwise we return zero. This is a vector operation: the IF function returns a vector of values.

Conceptually, we next divide the resulting vector by the number of observations (months) minus 1. We can count the observations with COUNT(D31:D66) and then subtract 1, giving COUNT(D31:D66)-1; the minus 1 must sit outside the COUNT call.  [NOTE 1: The minus 1 means we are taking the semivariance of a sample, not a population. NOTE 2: The formula actually applies the division “outside” the SUM rather than to each element of the vector; the result is the same either way.]

Next is the SUM. The following formula is the monthly semivariance of our returns in column D:

{=SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1)}

You’ll notice the added curly braces around this formula. They specify that the formula should be treated as a vector (matrix) operation, which allows it to stand alone.  You do not type the curly braces yourself; Excel adds them when you hit <CTRL><SHIFT><ENTER> rather than just <ENTER>. Hitting <CTRL><SHIFT><ENTER> is required after every edit.

We now have monthly semivariance. If we wanted annual semivariance we could simply multiply by 12.

Often, however, we ultimately want annual semi-deviation (also called semi-standard deviation) for computing things like Sortino ratios, etc. Going up one more layer in the call stack brings us to the SQRT operation, specifically:

{=SQRT(SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1))}

This is monthly (downside) semi-deviation. We are just one step away from computing annual semi-deviation. That step is multiplying by SQRT(12), which brings us back to the big full formula.
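For readers who want to cross-check the spreadsheet, here is a minimal Python sketch of the same calculation. The function names and the sample returns are made up for illustration; the logic mirrors the Excel formula (square only the dips below the mean, divide by the number of observations minus 1, then take the square root and scale by SQRT(12)).

```python
import math

def semivariance(returns, sample=True):
    """Downside semivariance about the mean (the SUM(IF(...)) / (COUNT - 1) step)."""
    n = len(returns)
    mean = sum(returns) / n  # same role as cell D$2
    # Square only the dips below the mean; all other observations contribute zero.
    downside = sum((r - mean) ** 2 for r in returns if r < mean)
    return downside / (n - 1 if sample else n)

def annual_semideviation(monthly_returns, periods_per_year=12):
    """Monthly semi-deviation scaled by sqrt(12), like the full Excel formula."""
    return math.sqrt(periods_per_year) * math.sqrt(semivariance(monthly_returns))

# Hypothetical monthly returns, for illustration only.
monthly = [0.02, -0.01, 0.03, -0.04, 0.01, 0.00]
monthly_semivar = semivariance(monthly)
annual_semidev = annual_semideviation(monthly)
print(monthly_semivar, annual_semidev)
```

Passing sample=False divides by the number of observations instead, which corresponds to the population version mentioned in NOTE 1.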

There it is in a nutshell. You now have the formulas to compute semivariance and semi-deviation in Excel.

 

 


How to Write a Mean-Variance Optimizer (Part III)… In R

Parts 1 and 2 left a trail of breadcrumbs to follow.  Now I provide a full-color map, a GPS, and a local guide.  In other words, the complete solution in the R statistical language.

Recall that the fast way to compute portfolio variance is:

$$\sigma_p^2 = w^T V w$$

The companion equation is r_p = w^T rtn, where rtn is a column vector of expected returns (or historic returns) for each asset.  The first goal is to find w0 and wn. w0 minimizes variance regardless of return, while wn maximizes return regardless of variance.  The goal is then to create the set of vectors {w0, w1, …, wn} that minimizes variance for a given level of expected return.
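As a rough illustration of these two equations, here is a small Python sketch (not the R solution linked below). The covariance matrix, the expected returns, and the 1% grid of long-only two-asset weights are all assumptions made up for this example; a real optimizer would use a proper solver rather than a grid.

```python
def portfolio_variance(w, V):
    """sigma_p^2 = w^T V w for weights w and covariance matrix V (a list of rows)."""
    n = len(w)
    return sum(w[i] * V[i][j] * w[j] for i in range(n) for j in range(n))

def portfolio_return(w, rtn):
    """r_p = w^T rtn for weights w and expected returns rtn."""
    return sum(wi * ri for wi, ri in zip(w, rtn))

# Hypothetical two-asset inputs, for illustration only.
V = [[0.04, 0.01],
     [0.01, 0.09]]
rtn = [0.06, 0.10]

# Assumption: long-only, fully invested weights (w1, 1 - w1) in 1% steps.
candidates = [[k / 100, 1 - k / 100] for k in range(101)]

w0 = min(candidates, key=lambda w: portfolio_variance(w, V))  # minimum variance
wn = max(candidates, key=lambda w: portfolio_return(w, rtn))  # maximum return

print(w0, portfolio_variance(w0, V))
print(wn, portfolio_return(wn, rtn))
```

The vectors between w0 and wn would then be found by minimizing variance at each target level of expected return.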

I just discovered that someone already wrote an excellent post that shows exactly how to write an MVO optimizer completely in R. Very convenient!  Enjoy…

http://economistatlarge.com/portfolio-theory/r-optimized-portfolio

How to Write a Mean-Variance Optimizer: Part 1

The Equation Everyone in Finance Should Know, but Many Probably Don’t!

Here it is:

… With thanks to codecogs.com which makes it really easy to write equations for the web.

This simple matrix equation is extremely powerful.  This is really two equations.  The first is all you really need.  The second is merely there for illustrative purposes.

This formula says how the variance of a portfolio can be computed from the position weights w^T = [w_1 w_2 … w_n] and the covariance matrix V, whose entries are:

  • σ_ii ≡ σ_i^2 = Var(R_i)
  • σ_ij ≡ Cov(R_i, R_j) for i ≠ j

The second equation is actually rather limiting.  It represents the smallest possible example to clarify the first equation — a two-asset portfolio.  Once you understand it for 2 assets, it is relatively easy to extrapolate to 3-asset portfolios, 4-asset portfolios, and before you know it, n-asset portfolios.
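In the notation above, and assuming the usual symmetric covariance matrix (σ_12 = σ_21), the two-asset case can be written out as follows. This is a sketch of the standard expansion, not a copy of the original figure.

```latex
\sigma_p^2 = w^T V w
= \begin{bmatrix} w_1 & w_2 \end{bmatrix}
  \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix}
  \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
= w_1^2 \sigma_{11} + 2 w_1 w_2 \sigma_{12} + w_2^2 \sigma_{22}
```

The cross term 2 w_1 w_2 σ_12 is where diversification shows up: the lower the covariance between the two assets, the lower the portfolio variance.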

Now I show the truly powerful “naked” general form equation:

$$\sigma_p^2 = w^T V w$$

This is really all you need to know!  It works for 50-asset portfolios. For 100 assets. For 1000.  You get the point. It works in general. And it is exact. It is the E = mc² of Modern Portfolio Theory (MPT).  It is at least 55 years old (2014 – 1959), while E = mc² is about 109 years old (2014 – 1905).  Harry Markowitz, the father of (M)PT, simply called it “Portfolio Theory” because:

There’s nothing modern about it.

 

Yes, I’m calling Markowitz the Einstein of Portfolio Theory AND of finance!  (Now there are several other “post”-Einstein geniuses… Bohr, Heisenberg, Feynman… just as there are Sharpe, Scholes, Black, Merton, Fama, French, Shiller, [Graham?, Buffett?]…)   I’m saying that a physicist who doesn’t know E = mc² is not much of a physicist. You can read between the lines for what I’m saying about those who dabble in portfolio theory… with other people’s money… without really knowing (or using) the financial analog.

Why Markowitz is Still “The Einstein” of Finance (Even if He was “Wrong”)

Markowitz said that “downside semi-variance” would be better.  Sharpe said “In light of the formidable computational problems…[he] bases his analysis on the variance and standard deviation.”

Today we have no such excuse.  We have more than sufficient computational power on our laptops to optimize for downside semi-variance, σd. There is no such tidy, efficient equation for downside semi-variance.  (At least not one that anyone can agree on… and none that is exact in any sense of any reasonable mathematical definition of the word ‘exact’.)

Fama and French improve upon Markowitz (M)PT [I say that if M is used in MPT, it should mean “Markowitz,” not “modern”, but I digress.] Shiller, however, decimates it.  As does Buffett, in his own applied way.  I use the word decimate in its strict sense… killing one in ten.  (M)PT is not dead; it is still useful.  Diversification still works; rational investors are still risk-averse; and certain low-beta investments (bonds, gold, commodities…) are still poor very-long-term (20+ year) investments in isolation and relative to stocks, though they still can serve a role as Markowitz Portfolio Theory suggests.

Wanna Build your Own Optimizer (for Mean-Return Variance)?

This blog post tells you most of the important bits.  I don’t really need to write part 2, do I?   Not if you can answer these relatively easy questions…

  • What is the matrix expression for computing E(Rp) based on w?
  • What simple constraint is w subject to?
  • How does the general σp2 equation relate to the efficient frontier?
  • How might you adapt the general equation to efficiently compute the effects of a Δw event where wi increases and wj decreases?  (Hint: “cache” the wx terms that don’t change.)
  • What other constraints may be imposed on w or subsets (asset categories within w)?  How will you efficiently deal with these constraints?
  • Is short-selling allowed?  What if it is?
  • OK… this one’s a bit tricky:  How can convex optimization methods be applied?

If you can answer these questions, a Part 2 really isn’t necessary, is it?
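To make the Δw hint concrete, here is one possible reading of it as a Python sketch. It assumes V is symmetric and that a weight of size delta moves from asset j to asset i, so the weights still sum to 1. The function names and the three-asset inputs are hypothetical. The cached quantity is the vector Vw, which lets the new variance be computed without redoing the full double sum.

```python
def mat_vec(V, w):
    """Return the vector V w."""
    return [sum(vij * wj for vij, wj in zip(row, w)) for row in V]

def variance(w, V):
    """Full computation of sigma_p^2 = w^T V w."""
    return sum(wi * vwi for wi, vwi in zip(w, mat_vec(V, w)))

def variance_after_shift(var_old, Vw, V, i, j, delta):
    """Variance after w_i += delta and w_j -= delta, using the cached Vw.

    Expanding (w + delta*(e_i - e_j))^T V (w + delta*(e_i - e_j)) for symmetric V gives
    var_old + 2*delta*(Vw_i - Vw_j) + delta^2*(V_ii + V_jj - 2*V_ij).
    """
    return (var_old
            + 2.0 * delta * (Vw[i] - Vw[j])
            + delta ** 2 * (V[i][i] + V[j][j] - 2.0 * V[i][j]))

# Hypothetical three-asset inputs, for illustration only.
V = [[0.040, 0.006, 0.002],
     [0.006, 0.090, 0.010],
     [0.002, 0.010, 0.010]]
w = [0.5, 0.3, 0.2]

var_old = variance(w, V)
Vw = mat_vec(V, w)            # cached once, reused for many trial shifts
var_new = variance_after_shift(var_old, Vw, V, i=2, j=0, delta=0.05)

w_shifted = [0.45, 0.3, 0.25]  # same shift applied directly, for comparison
print(var_new, variance(w_shifted, V))
```

Each trial shift costs a handful of multiplications instead of a pass over the whole matrix, which matters when an optimizer evaluates many small reallocations.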

Clover Patterns Show How Portfolios Manage Risk


Illustration of Classic Covariance.

The red and green “clover” pattern illustrates how traditional risk can be modeled.  The red “leaves” are triggered when both the portfolio and the “other asset” move together in concert.  The green leaves are triggered when the portfolio and asset move in opposite directions.

Each event represents a moment in time, say the closing price for each asset (the portfolio or the new asset).  A common time period is 3 years of total-return data [37 months of price and dividend data reduced to 36 monthly returns].

Plain English

When a portfolio manager considers adding a new asset to an existing portfolio, she may wish to see how that asset’s returns would have interacted with the rest of the portfolio.  Would this new asset have made the portfolio more or less volatile?  Risk can be measured by looking at the time-series return data.  Each time the asset and the portfolio are in the red, risk is added. Each time they are in the green, risk is subtracted.  When all the reds and greens are summed up there is a “mathy” term for this sum: covariance.  “Variance” as in change, and “co” as in together. Covariance means the degree to which two items move together.

If there are mostly red events, the two assets move together most of the time.  Another way of saying this is that the assets are highly correlated. Again, that is “co” as in together and “related” as in relationship between their movements. If, however, the portfolio and asset move in opposite directions most of the time, the green areas, then the covariance is lower, and can even be negative.

Covariance Details

It is not only whether the two assets move together or apart; it is also the degree to which they move.  Larger movements in the red region result in larger covariance than smaller movements.  Similarly, larger movements in the green region reduce covariance.  In fact, it is the product of movements that affects how much the sum of covariance is moved up and down.  Notice how the clover leaves shrink to the center, (0,0), if either the asset or the portfolio doesn’t move at all.  This is because the product of zero times anything must be zero.

Getting Technical: The clover-leaf pattern relates to the angle between each pair of asset movements.  It does not show the effect of the magnitude of their positions.

If the incremental covariance of the asset to the portfolio is less than the variance of the portfolio, a portfolio that adds the asset would have had lower overall variance (historically).  Since there is a tendency (but no guarantee!) for assets’ correlations to remain somewhat similar over time, the portfolio manager might use the covariance analysis to decide whether or not to add the new asset to the portfolio.
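The red-and-green bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name and the return series are made up, movements are measured as deviations from each series’ mean, and the sample (n - 1) divisor is an assumption.

```python
def covariance_by_color(portfolio, asset):
    """Split the covariance sum into 'red' (same direction) and 'green' (opposite) parts."""
    n = len(portfolio)
    mean_p = sum(portfolio) / n
    mean_a = sum(asset) / n
    red = 0.0    # products of movements in the same direction (positive)
    green = 0.0  # products of movements in opposite directions (negative)
    for p, a in zip(portfolio, asset):
        product = (p - mean_p) * (a - mean_a)
        if product > 0:
            red += product
        else:
            green += product
    covariance = (red + green) / (n - 1)  # assumption: sample covariance
    return red, green, covariance

# Hypothetical return series, for illustration only.
portfolio = [0.02, -0.01, 0.03, -0.02]
asset = [0.01, 0.01, -0.02, -0.02]

red, green, cov = covariance_by_color(portfolio, asset)
print(red, green, cov)
```

In this made-up example the red and green contributions cancel, so the covariance is essentially zero: the asset moved with the portfolio exactly as much as it moved against it.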

Semi-Variance: Another Way to Measure Risk

 


Semi-variance Visualization

After staring at the covariance visualization, something may strike you as odd: when the portfolio and the asset move UP together, this increases the variance. Since variance is used as a measure of risk, that is like speaking of the “risk” of positive returns.

Most ordinary investors would not consider the two assets going up together to be a bad thing.  In general they would consider this to be a good thing.

So why do many (most?) risk measures use a risk model that resembles the red and green cloverleaf?  Two reasons: 1) It makes the math easier, 2) history and inertia. Many (most?) textbooks today still define risk in terms of variance, or its related cousin standard deviation.

There is an alternative risk measure: semi-variance. The multi-colored cloverleaf, which I will call the yellow-grey cloverleaf, is a visualization of how semi-variance is computed. The grey leaf indicates that events that occur in that quadrant are ignored (multiplied by zero).  So far this is where most academics agree on how to measure semi-variance.

Variants on the Semi-Variance Theme

However, differences exist on how to weight the other three clover leaves.  It is well-known that for measuring covariance each leaf is weighted equally, with a weight of 1. When it comes to quantifying semi-covariance, methods and opinions differ. Some favor a (0, 0.5, 0.5, 1) weighting scheme, where the order is weights for quadrants 1, 2, 3, and 4 respectively. [As a decoder ring: Q1 = grey leaf, Q2 = green leaf, Q3 = red leaf, Q4 = yellow leaf.]

Personally, I favor weights (0, 3, 2, -1) for the asset versus portfolio semi-covariance calculation.  For asset vs asset semi-covariance matrices, I favor a (0, 1, 2, 1) weighting.  Notice that in both cases my weighting scheme results in an average weight per quadrant of 1.0, just like for regular covariance calculations.
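Here is one plausible way to express a quadrant-weighted semi-covariance in Python. Several things are assumptions on my part for this sketch: x is the portfolio (horizontal axis) and y is the asset (vertical axis), quadrants follow the usual math convention (Q1 both above their means, Q2 x below and y above, Q3 both below, Q4 x above and y below), each product of deviations is simply multiplied by its quadrant weight, and the sum is divided by n - 1. The function name and the data are hypothetical.

```python
def weighted_semicovariance(x, y, weights=(0.0, 0.5, 0.5, 1.0)):
    """Covariance-like sum where each quadrant's products get their own weight.

    weights = (Q1, Q2, Q3, Q4). With weights (1, 1, 1, 1) this reduces to the
    ordinary sample covariance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    total = 0.0
    for xi, yi in zip(x, y):
        dx, dy = xi - mean_x, yi - mean_y
        if dx >= 0 and dy >= 0:
            q = 0  # Q1: both up (grey leaf, typically ignored)
        elif dx < 0 and dy >= 0:
            q = 1  # Q2
        elif dx < 0 and dy < 0:
            q = 2  # Q3: both down
        else:
            q = 3  # Q4
        total += weights[q] * dx * dy
    return total / (n - 1)

# Hypothetical return series, for illustration only.
portfolio = [0.02, -0.01, 0.03, -0.02]
asset = [0.01, 0.01, -0.02, -0.02]

plain = weighted_semicovariance(portfolio, asset, weights=(1, 1, 1, 1))
downside_only = weighted_semicovariance(portfolio, asset, weights=(0, 0, 1, 0))
print(plain, downside_only)

# Average weight per quadrant for the two schemes favored above.
print(sum((0, 3, 2, -1)) / 4, sum((0, 1, 2, 1)) / 4)
```

Points that sit exactly on an axis have a zero product, so which quadrant they are assigned to does not affect the result.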

 

Financial Industry Moving toward Semi-Variance (Gradually)

Semi-variance more closely resembles how ordinary investors view risk. Moreover it also mirrors a concept economists call “utility.” In general, losing $10,000 is more painful than gaining $10,000 is pleasurable. Additionally, losing $10,000 is more likely to adversely affect a person’s lifestyle than gaining $10,000 is to help improve it.  This is the concept of utility in a nutshell: losses and gains have an asymmetrical impact on investors. Losses have a bigger impact than gains of the same size.

Semi-variance optimization software is generally much more expensive than variance-based (mean-variance optimization, MVO) software.  This creates an environment where larger investment companies are better equipped to afford and use semi-variance optimization for their investment portfolios.  This too is gradually changing as more competition enters the semi-variance optimization space.  My guesstimate is that currently about 20% of professionally-managed U.S. portfolios (as measured by total assets under management, AUM) are using some form of semi-variance in their risk management process.  I predict that this percentage will exceed 50% by 2018.

 

Data Science: Shrinking Big Data into Meaningful Data

In this post I explain how less is more when it comes to using “big data.”  The best data is concise, meaningful, and actionable. It is both an art and a science to turn large, complex data sets into meaningful, useful information. Just like the later paintings of Monet capture the impression of beauty more effectively than a mere photograph, “small data” can help make sense of “big data.”

Monet Painting of the sun through the fog in London

Claude Monet, London

There is beauty in simplicity, but capturing simplicity is not simple. A young child’s drawings are simple too, but they are very unlikely to capture light and mood like Monet did.

Worry not. There will be finance and math, but I will save the math for last, in an attempt to retain the interest of non “mathy” readers.

The point of discussing impressionist painting is to show that reduction (taking things away) can be a powerful tool.  In fact, filtering out “noise” is both useful and difficult. A great artist can filter out the noise without losing the fidelity of the signal.  In this case, the “signal” is emotion and color and light as perceived by a master painter’s mind.

 

Applying Impressionism to Finance

Massive amounts of data are available to the financial professional. Two questions I have been asking at Sigma1 since the beginning are 1) How to use “Big Compute” to crunch that data into better portfolios? 2) How to represent that data to humans — both investment pros and lay folk whose money is being invested?  After considerable thought, brainstorming, listening, and learning, I think we are beginning to construct a preliminary picture of how to do that — literally.

Portfolio Asset Relationships

Relationships between Portfolio Assets

While not as beautiful as a Monet painting, the picture above is worth a thousand words (and likely many thousands of dollars over time) to me.  The assets above constitute all of the current non-CASH building blocks of my personal retirement portfolio.  While simple, the above image took considerable software development effort and literally millions of computations to generate [millions is very do-able with computers].

This simple-looking image conveys complex information in an easy-to-understand form. The four colors — red, green, blue, and purple — convey four asset types: fixed income, US stocks, international stocks, and convertible securities. The angle between any two asset lines conveys the relative correlation between the pair.  In portfolio construction larger angles are better.  Finally the length of the line represents the “effectiveness” with which each asset represents its “angular position” within the portfolio (in addition to other information).

With Powerful Data, First Comes Humility, Next Comes Insight

I have applied the same visualizations to other portfolios, and I see that, according to my software, many of the assets in professionally-managed portfolios exhibit superior “robustness” to my own.  As someone who takes pride in having a kick-ass portfolio, I found this information humbling, and it took some time to absorb from an ego standpoint.  But, having gotten over it, I now see potential.

I have seen portfolios that have a significantly wider angle than my current portfolio.  What does this mean to me?  It means I will begin looking for assets to augment my personal portfolio.  Before I do that, let me share some other insights. The plot combines covariance matrix data for the 16 assets in the portfolio, as well as semi-variance data for each asset.  Without getting too “mathy” yet, the data visualization software reduces 136 pieces of data down to 32 (excluding color). The covariance matrix and semi-variance calculations are themselves reducers, in that they combine 5 years of monthly total-return data (976 data points) down to 120 unique covariance numbers and 16 semi-deviation numbers. Taking 976 down to 32 results in a compression ratio of 30.5:1.
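The counts in the preceding paragraph can be reproduced with a few lines of arithmetic. Two readings are assumptions here: that 976 comes from 61 monthly data points per asset (61 month-end values yielding 60 monthly returns), and that the 32 plotted values are one angle and one line length per asset.

```python
n_assets = 16
points_per_asset = 61  # assumption: 61 monthly data points give 60 monthly returns

raw_points = n_assets * points_per_asset            # input data points
unique_covariances = n_assets * (n_assets - 1) // 2  # off-diagonal pairs
semi_deviations = n_assets                          # one per asset
reduced_points = unique_covariances + semi_deviations

plot_values = 2 * n_assets  # assumption: an angle and a length for each asset
compression_ratio = raw_points / plot_values
per_asset_dimensionality = reduced_points / n_assets

print(raw_points, unique_covariances, reduced_points, plot_values)
print(compression_ratio, per_asset_dimensionality)
```

The last figure, 136 values spread over 16 assets, works out to 8.5 values per asset.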

Finally, as it currently stands, the visualization software and resulting plot say nothing about expected return.  The plot focuses solely on risk mitigation at the moment.  Naturally, I intend to change that.

Time for the Math and Finance — Consider Yourself Warned

I mentioned a 30.5:1 (61:2) compression ratio. Just like music and other data, financial information can be compressed.  However, only so much compression can be achieved in a lossless manner.  In audio compression, researchers have learned which portions of music and other audio can be “lost” without the listener telling the difference.  There is a field, psychoacoustics, built around doing just that: modeling what the human ear (and brain) can hear, and what gets “masked” by various physiological factors.

Even more important than preserving fidelity is extracting meaning. One way of achieving that is by removing “noise.” The visualization software performs significant computation to maintain as much angular fidelity as possible. As it optimizes angles, it keeps track of total error vis-a-vis the covariance matrix. It also keeps track of each individual asset’s error (the reciprocal of fitness: fit versus lack of fit).

The real alchemy comes from the line-length computation.  It combines semi-variance data with various fitness factors to determine each asset line length.

Just like Mercator projections for maps incur unavoidable error when converting from a 3-D globe to a 2-D map, the portfolio asset visualizations introduce error as well.  If one thinks of just the correlation matrix and semi-variance data, each asset has a dimensionality of 8.5 (in the case of 16 assets).  Reducing from 8.5-D to 2-D is a complex process, and there are an infinite number of ways to perform such an operation!  The art and [data] science is to enhance the “signal” while stripping away the “noise.”

The ultimate goals of portfolio data visualization technology are:

1) Transform raw data into actionable insight

2) Preserve sufficient fidelity of relevant data such that the “map” can be used to reliably get to the desired “destination”

I believe that the first goal has been achieved.  I know what actions to take… trying various other securities to find those that can build a “higher-angle”, and arguably more robust, more resilient investment portfolio.

However, the jury is still out on the degree [no pun intended] to which goal #2 has or has not been achieved.  Does this simple 2-D map help portfolio builders reliably and consistently navigate the 8+ dimensional portfolio space?

What about 3-D Modelling and Visualization?

I started working with 2-D for one key reason — I can easily share 2-D images with readers and clients alike.  I want feedback on what people like and dislike about the visuals. What is easy to understand, what is not?  What is useful to them, and what isn’t?  Ironing out those details in 2-D is step 1.

Of course I am excited by 3-D. Most of the building blocks are in my head, and I can heavily leverage the 2-D algorithms.  I am, however, holding off for now. I am waiting for feedback from readers and clients alike.  I spend a lot of time immersed in the language of math, statistics, and finance.  This can create a communication gap that is best mitigated through discussion with other people with other perspectives.  I wish to focus on 2-D for a while to learn more about market needs.

That being said, it is hard to resist creating a 3-D portfolio asset visualizer. The geek in me is extremely curious about how much the error terms will reduce when given a third degree of freedom to work with.

The bottom line is: Please give me any feedback: positive, negative, technical, aesthetic, etc. This is just the start. I am extremely enthusiastic about where this journey will take me and my company.

Disclosure and Disclaimer

Securities mentioned in this post are holdings in my personal retirement accounts (e.g. 401K, IRA, Roth IRA) as of the day of initial publication of this post. The purpose of this post is to illustrate features of Sigma1 Financial software. This is NOT investment advice, and NOT a recommendation to buy, sell, or hold any securities. Please refer to the “Disclaimer” Tab of the main page of this site for further information.

Surpassing the Frontier?

Suppose you have the tools to compute the mean-return efficient frontier to arbitrary (and sufficient) precision — given a set of total-return time-series data of asset/securities.  What would you do with such potential?

I propose that the optimal solution is to “breach the frontier.”  Current portfolios provide a historic reference. The reference (starting-point) portfolios provided so far have all left sufficient room for meaningful further optimization, as gauged by, say, improved Sortino ratios.

Often, when the client proposes portfolio additions, some of these additions allow the optimizer to push beyond the original efficient frontier (EF), and provide improved Sortino ratios. Successful companies contact ∑1 in order to see how each of their portfolios:

1) Lands on a risk-versus-reward (expected-return) plot
2) Compares to one or more benchmarks, e.g. the S&P 500 over the same time period
3) Compares to an EF composed of assets in the baseline portfolio

Our company is not satisfied to provide marginal or incremental improvement. Our current goal is to provide our clients with more resilient portfolio solutions. Clients provide the raw materials: a list of vetted assets and expected returns.  ∑1 software then provides a near-optimal mix of asset allocations that serves a variety of goals:

1) Improved projected risk-adjusted returns (based on semi-variance optimization)
2) Identification of under-performing assets (in the context of the “optimal” portfolio)
3) Identification of potential portfolio-enhancing assets and their asset weightings

We are obsessed with meaningful optimization. We wish to find the semi-variance (semi-deviation) efficient frontier and then breach it by including client-selected auxiliary assets. Our “mission” is as simple as that: better, more resilient portfolios.

Approaching the Frontier

Disclosure: The purpose of this post is to show how I, personally, use the HALO Portfolio Optimizer software to manage my personal portfolio. It is not investment advice! I feed my personal opinions about which assets to select, along with my expected one-year returns, into the optimizer configuration.  The optimizer then provides an efficient frontier (EF) based on historic total-return data and my personal expected-return estimates.

I use other software (User Tuner) to approach the EF, while limiting the number and size of trades (minimizing churn and trading costs).  Getting exactly to the EF would require trading (buying or selling) every asset in my portfolio — which would cost approximately $159 in trading costs for 18 trades. Factoring in bid/ask spreads the cost would be even higher.  However, by being frugal about trades, I was able to limit the number of trades to 6 while getting much closer to the EF.

Past performance is no guarantee of future performance, nor is past volatility necessarily indicative of future volatility.  Nonetheless, I am making the personal decision to use past volatility information to possibly increase the empirical diversification of my retirement portfolio with the goal of increasing risk-adjusted return.  Time will tell whether this approach was successful or not.

In my last post I blogged about reallocating my entire retirement portfolio closer to the MVO efficient frontier computed by the HALO Portfolio Optimizer.  The zoomed in plot tells the story to date:

[Figure: 5_7_2014_realloc]

The “objective space” plot is zoomed in and only shows a small portion of the efficient frontier. As you can see the black X is closer to the efficient frontier than the blue diamond, but naturally the dimensions are not the same. Using a risk-free rate of 0.5% the predicted Sharpe ratio has improved from 0.68 to 0.75 – a marked increase of about 10.3%.  [If you crunch the numbers yourself, don’t forget to annualize σ.]
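For anyone crunching the numbers, here is a minimal Python sketch of the Sharpe ratio calculation with the annualization step included. The function name and the sample inputs (an 8% expected annual return and a 3% monthly σ) are hypothetical; only the 0.5% risk-free rate and the 0.68 and 0.75 ratios come from the text above. Scaling monthly σ by the square root of 12 is the usual convention and is assumed here.

```python
import math

def sharpe_ratio(expected_annual_return, monthly_sigma, risk_free_rate=0.005):
    """Sharpe ratio from an annual expected return and a monthly standard deviation."""
    annual_sigma = monthly_sigma * math.sqrt(12)  # annualize sigma before dividing
    return (expected_annual_return - risk_free_rate) / annual_sigma

# Hypothetical inputs, for illustration only.
example = sharpe_ratio(expected_annual_return=0.08, monthly_sigma=0.03)

# Relative improvement between the two Sharpe ratios quoted in the text.
improvement = 0.75 / 0.68 - 1

print(example, improvement)
```

Forgetting the annualization step would overstate the ratio by a factor of about 3.46 (the square root of 12).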

While a 10.3% Sharpe ratio expected improvement is very significant, there is obviously room for compelling additional improvement. An expected Sharpe ratio of just north of 0.8 is attainable.

The primary reason the portfolio has not yet moved even closer to the efficient frontier is that 18.6% of the retirement portfolio is tied up in red tape as a result of my recent voluntary severance or “buy-out” from Intel Corporation. [Kudos to Intel for offering voluntary severance to all of my local coworkers and me.  It is a much more compassionate method of workforce reduction than layoffs!  I consider the package offered to me reasonably generous, and I gladly took the opportunity to depart and begin working full time building my startup.]

Time to Get Technical

I won’t finish without mentioning a few important technical details. The points in the objective space (of monthly σ on the horizontal and expected annual return on the vertical) can be viewed as dependent variables of the (largely) independent variables of asset weights. Such points include the blue diamond, the black X, and all the red triangles on the efficient frontier. I often call the (largely) independent domain of asset allocation weights the “search space”, and the weightings in the search space that result in points on the efficient frontier the “solution space.”

One way to measure the progress from the blue diamond to the X is via improvement in the Sharpe ratio, which implicitly factors in the CAL, or the CML for the tangent CAL.  As “X” approaches the red line visually it also approaches the efficient frontier quantitatively and empirically.  However, X can make significant progress towards the efficient frontier, say point EF#9 specifically, with little or no “progress” in the portfolio weights from the blue diamond to the black X.

“Progress” in the objective space is reasonably straightforward: just use Sharpe ratios, for instance. However, measuring “progress” in the asset allocation (weight) space is perhaps less clear. Generally, I prefer the use of the L1-norms of differences of the asset-weight vectors Wo (corresponding to the original portfolio weights, i.e. the blue diamond), Wx, and Wef_n. The distance from the blue diamond in search space to the red triangle #9 is denoted as |Wef_9 – Wo|1, while the distance from X in the search space is |Wef_9 – Wx|1.  Interestingly, the respective values are 0.572 and 0.664.  Wx is, by this measure, actually further from Wef_9 in search space, but closer in objective space!

I sometimes refer to these as the “Hamming distances” (even though “Hamming distance” is typically applied to differences in binary codes or character inequality counts of two strings of characters.) It is simply easier to say the “Hamming distance from Wx to Wef_9” than the “ell-one norm of the difference of Wx and Wef_9.”
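The L1-norm distance between two weight vectors is simple to compute. Here is a small Python sketch; the function name and the three-asset weight vectors are hypothetical and are not the actual portfolio weights discussed above.

```python
def l1_distance(w_a, w_b):
    """L1 norm of the difference of two asset-weight vectors."""
    return sum(abs(a - b) for a, b in zip(w_a, w_b))

# Hypothetical weight vectors, for illustration only.
w_original = [0.50, 0.30, 0.20]
w_frontier = [0.20, 0.30, 0.50]

distance = l1_distance(w_original, w_frontier)
print(distance)
```

For two fully invested, long-only portfolios this distance ranges from 0 (identical weights) to 2 (no assets in common), and half of it is the fraction of the portfolio that would have to be traded to get from one to the other.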

I have been working on a utility temporarily called “user tuner” that makes navigating in both the search space and the objective space quicker, easier and more productive. More details to follow in a future post.

Why Not Semi-Variance Optimization?

Frequent readers will know that I believe that mean semi-variance optimization (MSVO or SVO) is superior to vanilla MVO. So why am I starting with MVO? Three reasons:

  • To many, MVO is less scary because it is somewhat familiar. So I’m starting with the familiar “basics.”
  • I wanted to talk about Sharpe ratios first, because again they are more familiar than, say, Sortino ratios.
  • I wanted to use “User Tuner”, and I originally coded it for MVO (though that is easily remedied).

However, asymptotically refining allocation of my entire portfolio to get extremely close to the MVO efficient frontier is only phase 1.  It is highly likely I will compute the SVO efficient frontier next and use a slightly modified “User Tuner” to approach the mean semi-variance efficient frontier… Likely in the next month or two, once my 18.6% of assets are freed up.

Portfolio-Optimization Plots

I am happy to announce that the latest version of the HALO Portfolio-Optimization Suite is now available.  Key features include:

  • Native asset constraint support
  • Native asset-category constraint support
  • Dramatic run-time improvements of 2X to over 100X

Still supported are user-specified risk models, including semi-variance and max-drawdown.  What has been temporarily removed (based on minimal client interest) is 3-D 2-risk modelling and optimization.  This capability may be re-introduced as a premium feature, pending client demand.

Here is a quick screenshot of a 20-asset, fixed-income portfolio optimization.  The “risk-free” rate used for the tangent capital allocation line (CAL) is 1.2% (y-intercept not shown), reflecting a mix of T-Bills and stable value funds.  Previously this optimization took 18 minutes on an $800 laptop computer.  Now, with the new HALO software release, it runs in only 11 seconds on the same laptop.

 

Fixed income with capital allocation line

Optimized Fixed-Income (only) Portfolio.

Choices, Opportunities, and Solutions

To date I’ve invested approximately 800 hours developing and testing the heuristics and algorithms behind HALO. Exact solutions (with respect to expected-return assumptions) can be found for certain real-world portfolio-optimization problems. Finding approximate solutions to other real-world portfolio-optimization problems is relatively easy, but finding provably optimal solutions is currently “impossible”. The current advanced science and art of portfolio optimization involves developing methods to efficiently find nearly optimal solutions.

I believe that HALO represents a significant step forward in finding nearly-optimal solutions to generalized risk models for investment portfolios. The primary strengths of HALO are in flexibility and dimensionality of financial risk modeling. While HALO currently finds solutions that are almost identical to exact solutions for convex optimization problems, the true advantage of HALO is in the quality of solutions for non-convex portfolio-optimization problems.

Do you know if your particular optimization metric can be articulated in canonical convex notation? I argue that HALO does not care.  If it can be, HALO will find a near-optimal solution virtually identical to the ideal convex optimization solution.  If it cannot be, and is indeed non-convex, HALO will find solutions competitive with other non-convex optimization methods.

It could be argued that “over-fitting” is a potential danger of optimal and near-optimal solutions. However, I argue that given a sufficiently diverse and under-constrained optimization task, over-fitting is less worrisome.   In other words, the quality of the inputs greatly influences the quality of the outputs.  One secret is to supply high-quality (e.g. asset expected return) estimates to the optimization problem.

The Best Financial Models for Insight and Prediction?

The best models are not the models that fit past data the best; they are the models that predict new data the best. This seems obvious, but a surprising number of business and financial decisions are based on best-fit of past data, with no idea of how well the chosen model can be expected to predict future data.
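
Here is a minimal Python sketch of why best fit on past data can mislead. It assumes purely random “up/down” days and a hypothetical family of 500 random trading rules, none of which has any real predictive power. It then compares the best rule's hit rate on the data used to select it against its hit rate on new data.

```python
import random

random.seed(1)

N_RULES, N_PAST, N_FUTURE = 500, 200, 400

# Coin-flip market: True means an "up" day. There is nothing to predict.
past = [random.random() < 0.5 for _ in range(N_PAST)]
future = [random.random() < 0.5 for _ in range(N_FUTURE)]

# Each "rule" is just a fixed random sequence of up/down guesses.
rules = [[random.random() < 0.5 for _ in range(N_PAST + N_FUTURE)]
         for _ in range(N_RULES)]

def hit_rate(guesses, outcomes):
    """Fraction of days on which the guess matched the outcome."""
    return sum(g == o for g, o in zip(guesses, outcomes)) / len(outcomes)

# Select the rule that "fit" the past best, then score it on unseen days.
best = max(rules, key=lambda rule: hit_rate(rule[:N_PAST], past))
in_sample = hit_rate(best[:N_PAST], past)
out_of_sample = hit_rate(best[N_PAST:], future)

print(f"best rule, past data: {in_sample:.1%}")
print(f"same rule, new data:  {out_of_sample:.1%}")
```

The selected rule looks skilled on the data it was chosen from, but on new data its hit rate should hover near 50%, because the selection step rewarded luck rather than insight.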

Instant Profit, or Too Good to be True?

For instance, a stock analyst reports to you that they have a secret recipe to make 70% annualized returns by simply trading KO (The Coca-Cola Company).  The analyst’s model tells you at what fill-or-kill (FOK) limit price, y, to buy KO stock at each market open.  The stock is then always sold with a market order at the end of each trading day.

The analyst tells you that her model is based on three years of trading data for KO, PEP, the S&P 500 index, and aluminum and corn spot prices.  Specifically, the analyst’s model uses closing data for the two preceding days, thus the model has 10 inputs.  Back testing of the model shows that it would have produced 70% annualized returns over the past three years, or a whopping 391% total return over that time period.  Moreover, the analyst points out that over 756 trading days 217 trades would have been executed, resulting in a profit 73% of the time (that the stock is bought).
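
The total-return figure follows from compounding the annualized return over three years, and the quoted win rate implies a rough count of winning trades. Here is a quick check in Python (the variable names are mine):

```python
annual_return = 0.70
years = 3

# Compound the annual return: (1 + r)^years - 1
total_return = (1 + annual_return) ** years - 1
print(f"total return over {years} years: {total_return:.1%}")  # 391.3%

trades, trading_days, win_rate = 217, 756, 0.73
print(f"implied winning trades: about {round(trades * win_rate)} of {trades}")
print(f"share of days with a trade: {trades / trading_days:.1%}")
```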

The analyst, Debra, says that the trading algorithm is already coded, and U.S. markets open in 20 minutes. Instant profit is only moments away with a simple “yes.” What do you do with this information?

Choices, Chances, Risks and Rewards

You know this analyst and she has made your firm’s clients and proprietary trading desks a lot of money. However, you also know that while she is thorough and meticulous, she is also bold and aggressive. You decide that caution is called for, and allocate a modest $500,000 to the KO trading experiment.  If after three months, the KO experiment nets at least 7% profit, you’ll raise the risk pool to $2,000,000.  If, after another three months, the KO-experiment generates at least 7% again, you’ll raise the risk pool to $10,000,000 as well as letting your firm’s best clients in on the action.

Three months pass, and the KO-experiment produces good results: 17 trades, 13 winners, and a 10.3% net profit. You OK raising the risk pool to $2,000,000.  After only 2 months the KO-experiment has executed 13 trades, with 10 winners, and an 11.4% net profit.  There is a buzz around the office about the “knock-out cola trade”, and brokers are itching to get in on it with client funds. You are considering giving the green light to the “Full Monty,” when Stan the Statistician walks into your office.

Stan’s title is “Risk Manager”, but people around the office call him Stan the Statistician, or Stan the Stats Man, or worse (e.g. “Who is the SS going to s*** on today?”)  He’s actually a nice guy, but most folks consider him an interloper.  And Stan seems to have clout with corporate, and he has been known to use it to shut down trades. You actually like Stan, but you already know why he is stopping by.

Stan begins probing about the KO-trade.  He asks what you know.  You respond that Debra told you that the model has an R-squared of 0.92 based on 756 days of back-tested data.  “And now?” asks Stan.  You answer, “a 76% success rate, and profits of around 21% in 5 months.”  And then Stan asks, “What is the probability that that profit is essentially due to pure chance?”

You know that the S&P 500 historically has over 53% “up” days, call it 54% to be conservative. So stocks should follow suit.  The probability of getting exactly 23 wins on KO out of 30 tries is C(30, 23)*0.54^23*(0.46)^7 = 0.62%. Getting at least 23 (23 or more wins) brings the percentage up to about 0.91%.  So you say 1/0.0091, or about one in 110.
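
The binomial arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the helper names are mine, and it treats each trade as an independent win with probability 0.54, as the reasoning above does.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent trades, each won with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(k, n, p):
    """Probability of k or more wins in n trades."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n, wins, p = 30, 23, 0.54  # 17 + 13 trades, 13 + 10 winners, assumed 54% base rate

exactly = binom_pmf(wins, n, p)
at_least = binom_tail(wins, n, p)
print(f"exactly {wins}:  {exactly:.2%}")      # 0.62%
print(f"at least {wins}: {at_least:.2%}")     # 0.91%
print(f"about one in {1 / at_least:.0f}")     # about one in 110
```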

Stan says, “Your math is right, but your conclusion is wrong.  For one thing, KO is up 28% over the period, and has had 69% up days over that time.”  You interject, “Okay, wait one second… so my math now says about 23%, or about a 1 in 4.3 chance.”
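
Stan's point is that the answer hinges on the assumed base rate. Here is a minimal Python sketch that repeats the tail calculation for several assumed win probabilities, including the 54% and 69% figures from the dialogue. The helper name is mine, and the sketch still treats trades as independent, which is itself a simplifying assumption.

```python
from math import comb

def prob_at_least(k, n, p):
    """Chance of k or more wins in n independent trades with win probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

for p in (0.54, 0.60, 0.69):
    chance = prob_at_least(23, 30, p)
    print(f"base rate {p:.0%}: P(at least 23 of 30) = {chance:.1%}, about 1 in {1 / chance:.1f}")
```

Moving the base rate from 54% to 69% raises the chance of a lucky streak from under 1% to roughly a quarter, which is why the choice of null hypothesis matters so much here.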

Stan smiles, “You are getting much closer to the heart of the matter. I’ve gone over Debra’s original analysis, and have made some adjustments. My revised analysis shows that there is a reasonable chance that her model captures some predictive insight that provides positive alpha.”  Stan’s expression turns more neutral, “However, the confidence levels against the simple null hypothesis are not as high as I’d like to see for a big risk allocation.”

Getting all Mathy? Feedback Requested!

Do you want to hear more from “Stan”? He is ready to talk about adjusted R-squared, block-wise cross-validation, and data over-fitting. And why Debra’s analysis, while correct, was also incomplete. Please let me know if you are interested in hearing more on this topic.

Please let me know if I have made any math errors yet (other than the overtly deliberate ones).  I love to be corrected, because I want to make Sigma1 content as useful and accurate as possible.