A Better Robo Advisory

Building a Better Robo Advisor

The more we learned about the current crop of robo advisory firms, the more we realized we could do better. This brief blog post hits the high points of that thinking.

Not Just the Same Robo Advisory Technology

It appears that all major robo advisory companies use 50+ year-old MPT (modern portfolio theory). At Sigma1 we use so-called post-modern portfolio theory (PMPT), which is much more current. At the heart of PMPT is optimizing return versus semivariance. The details are not important to most people, but the takeaway is that PMPT, in theory, allows greater downside risk mitigation and does not penalize portfolios that have sharp upward jumps.

Robo advisors, we infer, must use some sort of Monte Carlo analysis to estimate “poor market condition” returns. We believe we have superior technology in this area too.

Finally, while most robo advisory firms offer tax loss harvesting, we believe we can 1) set up portfolios that do it better, and 2) go beyond just tax loss harvesting to achieve greater portfolio tax efficiency.


Semivariance Excel Example

The most in-demand topic on this blog is an Excel semivariance example. I have posted mathematical semivariance formulas before, but now I am providing a description of exactly how to compute semivariance in “vanilla” Excel… no VBA required.

The starting point is column D. Cell D$2 contains the average return over the past 36 months. The range D31:D66 contains those 36 monthly returns.  Thus the contents of D$2 are simply:

=AVERAGE(D31:D66)

This leads us to the semivariance formula:

{=SQRT(12)*(SQRT(SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1)))}

We will now examine each building block of this formula starting with

IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0)

We only want to measure “dips” below the mean return. For every observation that dips below the mean we take the square of the dip; otherwise we return zero. Obviously this is a vector operation: the IF function returns a vector of values.

Next we divide by the number of observations (months) minus 1, which we get with COUNT(D31:D66)-1.  [NOTE 1: The minus 1 means we are taking the semivariance of a sample, not a population. NOTE 2: The division can go either inside or outside the SUM — the result is the same either way.]

Next is the SUM. The following formula is the monthly semivariance of our returns in row D:

{=SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1)}

You’ll notice the added curly braces around this formula. They indicate that the formula is an array (vector/matrix) operation, and they allow it to stand alone.  The braces are not typed directly; they are applied by hitting <CTRL><SHIFT><ENTER> rather than just <ENTER>. Hitting <CTRL><SHIFT><ENTER> is required after every edit.

We now have monthly semivariance. If we wanted annual semivariance we could simply multiply by 12.

Often, however, we ultimately want annual semi-deviation (also called semi-standard deviation) for computing things like Sortino ratios, etc. Going up one more layer in the call stack brings us to the SQRT operation, specifically:

{=SQRT(SUM(IF((D31:D66-D$2)<0,(D31:D66-D$2)^2,0))/(COUNT(D31:D66)-1))}

This is monthly (downside) semi-deviation. We are just one step away from computing annual semi-deviation. That step is multiplying by SQRT(12), which brings us back to the big full formula.
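For readers who prefer code to spreadsheets, here is a minimal Python sketch of the same calculation. It assumes an array named returns holding the 36 monthly returns from D31:D66 (the names are mine, not part of the spreadsheet):

import numpy as np

def annual_semideviation(returns, periods_per_year=12):
    # Annualized downside semi-deviation, mirroring the Excel array formula above
    r = np.asarray(returns, dtype=float)
    mean = r.mean()                                   # the value in cell D$2
    downside = np.minimum(r - mean, 0.0)              # keep only dips below the mean
    semivar = np.sum(downside ** 2) / (len(r) - 1)    # sample semivariance (n-1)
    return np.sqrt(periods_per_year) * np.sqrt(semivar)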

There it is in a nutshell. You now have the formulas to compute semivariance and semi-deviation in Excel.

 

 

The Equation Everyone in Finance Should Know (MV Optimization: How To, Part 2)

As the previous post shows, it all starts with…

In order to get close to bare-metal access to your compute hardware, use C.  In order to utilize powerful, tested, convex optimization methods use CVXGEN.  You can start with this CVXGEN code, but you’ll have to retool it…

  • Discard the (m,m) matrix for an (n,n) matrix. I prefer to still call it V, but Sigma is fine too.  Just note that there is a major difference between Sigma (the variance-covariance matrix) and sigma (the individual asset-return variances, i.e. the diagonal of Sigma).
  • Go meta for the efficient frontier (EF).  We’re going to iteratively generate/call CVXGEN with multiple scripts. The differences will be w.r.t the E(Rp).
  • Computing Max: E(Rp)  is easy, given α.  [I’d strongly recommend renaming this to something like expect_ret comprised of (r1, r2, … rn). Alpha has too much overloaded meaning in finance].
  • [Rmax] The first computation is simple.  Maximize E(Rp) s.t. constraints.  This is trivial and can be done w/o CVXGEN.
  • [Rmin] The first CVXGEN call is the simplest.  Minimize σp² s.t. constraints, ignoring E(Rp).
  • Using Rmin and Rmax, iteratively call CVXGEN q times (i=1 to q) using the additional constraint s.t. Rp_i = Rmin + (i/(q+1))*(Rmax-Rmin). This will produce q+2 portfolios on the EF [including Rmin and Rmax]; see the sketch after this list.  [Think of each step (1/(q+1))*(Rmax-Rmin) as a quantization of intermediate returns.]
  • Present, as you see fit, the following data…
    • (w0, w1, …wq+1)
    • [ E(Rp_0), …E(Rp_(q+1)) ]
    • [ σ(Rp_0), …σ(Rp_(q+1)) ]
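
To make the recipe above concrete, here is a minimal Python sketch of the same sweep. It uses cvxpy as a stand-in for generated CVXGEN solvers, assumes long-only, fully-invested constraints, and the names (expect_ret, V, q) are mine; adapt it to your actual constraint set.

import numpy as np
import cvxpy as cp

def efficient_frontier(expect_ret, V, q=18):
    # expect_ret: expected asset returns (the r1..rn vector above)
    # V: (n, n) covariance matrix, assumed symmetric and positive semidefinite
    mu = np.asarray(expect_ret, dtype=float)
    w = cp.Variable(len(mu))
    base = [cp.sum(w) == 1, w >= 0]               # long-only, fully invested (assumed)

    # [Rmax] trivial under these constraints: everything in the best asset
    r_max = float(mu.max())
    # [Rmin] expected return of the global minimum-variance portfolio
    cp.Problem(cp.Minimize(cp.quad_form(w, V)), base).solve()
    r_min = float(mu @ w.value)

    weights, rets, vols = [], [], []
    for i in range(q + 2):                        # i = 0..q+1 includes both endpoints
        r_target = r_min + (i / (q + 1)) * (r_max - r_min)
        cp.Problem(cp.Minimize(cp.quad_form(w, V)),
                   base + [mu @ w >= r_target]).solve()
        weights.append(w.value.copy())
        rets.append(float(mu @ w.value))
        vols.append(float(np.sqrt(w.value @ V @ w.value)))
    return weights, rets, vols                    # (w_i), E(Rp_i), σ(Rp_i)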

My point is that — in two short blog posts — I’ve hopefully shown how easily accessible advanced MVO portfolio optimization has become.  In essence, you can do it for “free”… and stop paying for simple MVO software… so long as you “roll your own” in house.

I do this for the following reasons:

  • To spread MVO to the “masses”
  • To highlight that if “anyone with a master’s in finance and a computer can do MVO for free,” firms should consider what their quantitative portfolio-optimization differentiation (AKA portfolio risk-management differentiation) really is, if any
  • To emphasize that this and the previous blog will not greatly help with semi-variance portfolio optimization

I ask you to consider that you, as one of the few that read this blog, have a potential advantage.  You know who to contact for advanced, relatively-inexpensive SVO software. Will you use that advantage?

Binary Options and Test Taking

Most of the important financial industry tests (Series 6, 7, 24, 26, CFA I, CFA II, CFA III, etc.) have only two possible outcomes: PASS or FAIL. Failure is a waste of time and money. Over-studying, however, can also waste time. (Studying for a PASS/FAIL test is investing in a binary “real option.”)

All of the material is worth knowing for someone, but some information is simply not relevant to everyone. For example, investment adviser reps don’t necessarily need to know all of the rules for broker-dealer agents (and vice versa). Knowing the material that is relevant to you is valuable beyond simply helping you pass a test.

That said, the goal is to PASS. And you’ve got a million other things to do. So what’s a quant to do? Get quantitative of course!

Quantitative Test Prep

Step 1: Find representative sample tests. All else hinges on this. Obtaining sample tests from multiple independent sources may help.

Step 2: Determine your average score on practice tests.

Step 3: Determine the standard deviation of your scores.

Step 4: Calculate the probability of achieving a passing score given your mean score and standard deviation.

Step 5: Decide the risk/reward and whether more study provides sufficient ROI.

Assuming normal distributions, I use the 68/95/99.7 rule. Regardless of the standard deviation, if your practice average is the same as the minimum score your chance of success is only 50%. Naturally, if your mean practice score is 1-sigma above the threshold for passing, your chance on the real test is 84% [1-(1-0.68)/2]. If your mean score is plus 2 sigma, your chance of passing is almost 98% [1-(1-0.95)/2].
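As a quick sanity check of those numbers, here is a small Python sketch using the standard library’s statistics.NormalDist; the practice-score figures are made up for illustration:

from statistics import NormalDist

practice_mean = 76.0    # average of your practice-test scores (hypothetical)
practice_sigma = 4.0    # standard deviation of those scores (hypothetical)
passing_score = 72.0    # minimum passing score (hypothetical)

# Probability the real score S meets or beats the threshold, assuming S is normal
p_pass = 1.0 - NormalDist(practice_mean, practice_sigma).cdf(passing_score)
print(f"Estimated probability of passing: {p_pass:.1%}")   # about 84% (1 sigma above)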

This little exercise shows two possible ways to improve your expected pass rate. The obvious way is getting better with the material. The less obvious way is reducing your standard deviation. Can this second way be achieved? If so how?

Keeping in mind the four-answer multiple-choice format, the mean deviation is:

MD = 2*p*(1-p)

Where p is the probability of answering a particular question correctly. Per-question deviation (PQD) is highest for p=0.5 at 0.5. PQD is lowest when p=1 at 0. For random guessing, PQD is 0.375.

Increasing your p_q from the random-guess 0.25 to 0.5 for a given question category q will increase your expected score, but will also increase sigma. Taking the first derivative of MD with respect to p gives 2-4p. Because the range of p is [0,1] (arguably [0.25,1)), the incremental decrease in MD is greatest near p=1.

Now, the test candidate must decide what d/dt(p_qc(t)) is for each question category (where t is time spent studying that category).  Studying the categories (qc) with the highest d/dt(p_qc(t)) will most efficiently improve the expected score. Further, studying the categories with the maximum d/dt(p_qc(t))*(4p-2) will reduce PQD and hence reduce test standard deviation.

Deeper Analysis of the Meta Problem

Naturally, this analysis only scratches the tactical surface of the “binary-test optimization meta problem.” [The test itself is the problem, the tactics are part of the meta-problem of optimizing generalized multiple-choice test prep]. Improving from p=0.8 to p=0.9 is clearly better than improving from p=0.4 to p=0.5 in terms of PQD reduction, and equal in terms of increase of expected score.

Also of relevance is the (modified) downside counterpart of PQD, the per-question downside semi-deviation, which I will call PQDd. I’ll spare you the derivation; it turns out that:

PQDd = p*sqrt(2*(1-p))

This value peaks at p=2/3 with a value of 0.5443. PQDd slowly ascends as p goes from 0.25 up to 0.667, then falls pretty rapidly for values of p>0.8.
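A few lines of Python confirm those shapes, using the PQD and PQDd formulas exactly as given above:

import numpy as np

p = np.linspace(0.25, 1.0, 16)          # per-question probability of a correct answer
pqd  = 2 * p * (1 - p)                  # per-question (mean) deviation
pqdd = p * np.sqrt(2 * (1 - p))         # per-question downside semi-deviation

print(f"PQD  peaks at p={p[np.argmax(pqd)]:.2f}, value {pqd.max():.4f}")    # p=0.50, 0.5000
print(f"PQDd peaks at p={p[np.argmax(pqdd)]:.2f}, value {pqdd.max():.4f}")  # p=0.65, ~0.544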

We care about the random variable S which represents the actual test score. S is a function of the mean expected score μ and standard deviation σ… in a normal distribution. What we really care about is Pr(S>=Threshold), the probability that our score meets or exceeds the minimum passing score.

PQD = PQDd only when p = 0, 0.5, or 1.  For p in (0,0.5), PQDd < PQD, and for p in (0.5,1), PQDd > PQD. Even though it seems a bit strange for a discrete binary distribution, p in (0,0.5) has positive skewness and p in (0.5,1) negative skewness.

In the “final” analysis the chance of passing, Pr(S ≥ Threshold), depends on the score mean, μ, and downside deviation, σd.  In turn σd depends on PQD and PQDd.

Summary and Conclusions

Theoretically, one’s best course of action is to 1) increase the average expected score and 2) reduce σd. If practical, the best and most efficient way to achieve both objectives simultaneously is to improve areas that are in the 60-75% range (p=0.6 to 0.75) to the mid to high 90% range (p>=0.95).  This may seem counter-intuitive, but the math is solid.

Caveats: This analysis is mostly an exercise in showing the value of statistics, variance, and downside variance in an area outside of finance.  It shows that there is more than one way to approach a goal; in this case, passing a standardized test.

 

Clover Patterns Show How Portfolios Manage Risk

Illustration of Classic Covariance.

The red and green “clover” pattern illustrates how traditional risk can be modeled.  The red “leaves” are triggered when both the portfolio and the “other asset” move together in concert.  The green leaves are triggered when the portfolio and asset move in opposite directions.

Each event represents a moment in time, say the closing price for each asset (the portfolio or the new asset).  A common time period is 3-years of total-return data [37 months of price and dividend data reduced to 36 monthly returns.]

Plain English

When a portfolio manager considers adding a new asset to an existing portfolio, she may wish to see how that asset’s returns would have interacted with the rest of the portfolio.  Would this new asset have made the portfolio more or less volatile?  Risk can be measured by looking at the time-series return data.  Each time the asset and the portfolio are in the red, risk is added. Each time they are in the green, risk is subtracted.  When all the reds and greens are summed up there is a “mathy” term for this sum: covariance.  “Variance” as in change, and “co” as in together. Covariance means the degree to which two items move together.

If there are mostly red events, the two assets move together most of the time.  Another way of saying this is that the assets are highly correlated. Again, that is “co” as in together and “related” as in relationship between their movements. If, however, the portfolio and asset move in opposite directions most of the time, the green areas, then the covariance is lower, and can even be negative.

Covariance Details

It is not only whether the two assets move together or apart; it is also the degree to which they move.  Larger movements in the red region result in larger covariance than smaller movements.  Similarly, larger movements in the green region reduce covariance.  In fact it is the product of movements that affects how much the sum of covariance is moved up and down.  Notice how the clover leaves converge to the center, (0,0), if either the asset or the portfolio doesn’t move at all.  This is because the product of zero times anything must be zero.

Getting Technical: The clover-leaf pattern relates to the angle between each pair of asset movements.  It does not show the effect of the magnitude of their positions.

If the incremental covariance of the asset to the portfolio is less than the variance of the portfolio, a portfolio that adds the asset would have had lower overall variance (historically).  Since there is a tendency (but no guarantee!) for assets’ correlations to remain somewhat similar over time, the portfolio manager might use the covariance analysis to decide whether or not to add the new asset to the portfolio.

Semi-Variance: Another Way to Measure Risk

 

Semi-variance Visualization

After staring at the covariance visualization, something may strike you as odd — the fact that when the portfolio and the asset move UP together, this increases the variance. Since variance is used as a measure of risk, that’s like saying positive returns are a risk.

Most ordinary investors would not consider the two assets going up together to be a bad thing.  In general they would consider this to be a good thing.

So why do many (most?) risk measures use a risk model that resembles the red and green cloverleaf?  Two reasons: 1) It makes the math easier, 2) history and inertia. Many (most?) textbooks today still define risk in terms of variance, or its related cousin standard deviation.

There is an alternative risk measure: semi-variance. The multi-colored cloverleaf, which I will call the yellow-grey cloverleaf, is a visualization of how semi-variance is computed. The grey leaf indicates that events that occur in that quadrant are ignored (multiplied by zero).  So far this is where most academics agree on how to measure semi-variance.

Variants on the Semi-Variance Theme

However, differences exist in how to weight the other three clover leaves.  It is well known that for measuring covariance each leaf is weighted equally, with a weight of 1. When it comes to quantifying semi-covariance, methods and opinions differ. Some favor a (0, 0.5, 0.5, 1) weighting scheme, where the order is weights for quadrants 1, 2, 3, and 4 respectively. [As a decoder ring: Q1 = grey leaf, Q2 = green leaf, Q3 = red leaf, Q4 = yellow leaf].

Personally, I favor weights (0, 3, 2, -1) for the asset versus portfolio semi-covariance calculation.  For asset vs asset semi-covariance matrices, I favor a (0, 1, 2, 1) weighting.  Notice that in both cases my weighting scheme results in an average weight per quadrant of 1.0, just like for regular covariance calculations.
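Here is a short sketch of how such a quadrant-weighted semi-covariance could be computed. The Cartesian quadrant numbering and the function name are my own illustration, not Sigma1 production code:

import numpy as np

def weighted_semicovariance(asset, portfolio, weights=(0.0, 3.0, 2.0, -1.0)):
    # weights = (Q1, Q2, Q3, Q4) on the mean-adjusted (asset, portfolio) plane,
    # numbered in the usual Cartesian order:
    #   Q1 (+,+) grey leaf, Q2 (-,+) green leaf, Q3 (-,-) red leaf, Q4 (+,-) yellow leaf
    # (1, 1, 1, 1) recovers ordinary sample covariance; (0, 0.5, 0.5, 1) and
    # (0, 1, 2, 1) are the other schemes mentioned above.
    a = np.asarray(asset, dtype=float) - np.mean(asset)
    p = np.asarray(portfolio, dtype=float) - np.mean(portfolio)
    quadrant = np.where(a >= 0, np.where(p >= 0, 0, 3), np.where(p >= 0, 1, 2))
    w = np.asarray(weights)[quadrant]
    return np.sum(w * a * p) / (len(a) - 1)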

 

Financial Industry Moving toward Semi-Variance (Gradually)

Semi-variance more closely resembles how ordinary investors view risk. Moreover it also mirrors a concept economists call “utility.” In general, losing $10,000 is more painful than gaining $10,000 is pleasurable. Additionally, losing $10,000 is more likely to adversely affect a person’s lifestyle than gaining $10,000 is to help improve it.  This is the concept of utility in a nutshell: losses and gains have an asymmetrical impact on investors. Losses have a bigger impact than gains of the same size.

Semi-variance optimization software is generally much more expensive than variance-based (MVO, mean-variance optimization) software.  This creates an environment where larger investment companies are better equipped to afford and use semi-variance optimization for their investment portfolios.  This too is gradually changing as more competition enters the semi-variance optimization space.  My guesstimate is that currently about 20% of professionally-managed U.S. portfolios (as measured by total assets under management, AUM) are using some form of semi-variance in their risk management process.  I predict that that percentage will exceed 50% by 2018.

 

Data Science: Shrinking Big Data into Meaningful Data

In this post I explain how less is more when it comes to using “big data.”  The best data is concise, meaningful, and actionable. It is both an art and a science to turn large, complex data sets into meaningful, useful information. Just like the later paintings of Monet capture the impression of beauty more effectively than a mere photograph, “small data” can help make sense of “big data.”

Claude Monet, London: painting of the sun through the fog

There is beauty in simplicity, but capturing simplicity is not simple. A young child’s drawings are simple too, but they are very unlikely to capture light and mood like Monet did.

Worry not. There will be finance and math, but I will save the math for last, in an attempt to retain the interest of non “mathy” readers.

The point of discussing impressionist painting is to show that reduction — taking things away — can be a powerful tool.  In fact, filtering out “noise” is both useful and difficult. A great artist can filter out the noise without losing the fidelity of the signal.  In this case, the “signal” is emotion and color and light as perceived by a master painter’s mind.

 

Applying Impressionism to Finance

Massive amounts of data are available to the financial professional. Two questions I have been asking at Sigma1 since the beginning are 1) How to use “Big Compute” to crunch that data into better portfolios? 2) How to represent that data to humans — both investment pros and lay folk whose money is being invested?  After considerable thought, brainstorming, listening, and learning, I think we are beginning to construct a preliminary picture of how to do that — literally.

Relationships between Portfolio Assets

While not as beautiful as a Monet painting, the picture above is worth a thousand words (and likely many thousands of dollars over time) to me.  The assets above constitute all of the current non-CASH building blocks of my personal retirement portfolio.  While simple, the above image took considerable software development effort and literally millions of computations to generate [millions is very do-able with computers].

This simple-looking image conveys complex information in an easy-to-understand form. The four colors — red, green, blue, and purple — convey four asset types: fixed income, US stocks, international stocks, and convertible securities. The angle between any two asset lines conveys the relative correlation between the pair.  In portfolio construction larger angles are better.  Finally the length of the line represents the “effectiveness” with which each asset represents its “angular position” within the portfolio (in addition to other information).
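The exact angle-fitting and line-length algorithm is Sigma1 IP, but the general flavor of this kind of reduction can be sketched: map each pairwise correlation to a target angle (the arccos of the correlation) and then search for one 2-D direction per asset that approximates all of those pairwise angles at once. Everything below (function name, objective, optimizer choice) is an illustrative assumption, not the HALO algorithm:

import numpy as np
from scipy.optimize import minimize

def fit_asset_angles(corr):
    # corr: (n, n) correlation matrix; returns one angle per asset plus the
    # total squared angular error (the lack-of-fit term discussed later in this post)
    n = corr.shape[0]
    target = np.arccos(np.clip(corr, -1.0, 1.0))      # desired pairwise angles
    iu = np.triu_indices(n, k=1)

    def error(theta):
        gap = np.abs(theta[:, None] - theta[None, :])
        gap = np.minimum(gap, 2 * np.pi - gap)        # angular distance between assets
        return np.sum((gap[iu] - target[iu]) ** 2)

    theta0 = np.linspace(0.0, np.pi, n, endpoint=False)   # simple starting spread
    result = minimize(error, theta0, method="Nelder-Mead")
    return result.x, result.fun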

With Powerful Data, First Comes Humility, Next Comes Insight

I have applied the same visualizations to other portfolios, and I see that, according to my software, many of the assets in professionally-managed portfolios exhibit superior “robustness” to my own.  As someone who prides myself in having a kick-ass portfolio, this information is humbling, and took some time to absorb from an ego standpoint.  But, having gotten over it, I now see potential.

I have seen portfolios that have a significantly wider angle than my current portfolio.  What does this mean to me?  It means I will begin looking for assets to augment my personal portfolio.  Before I do that let me share some other insights. The plot combines covariance matrix data for the 16 assets in the portfolio, as well as semi-variance data for each asset.  Without getting too “mathy” yet, the data visualization software reduces 136 pieces of data down to 32 (excluding color). The covariance matrix and semi-variance calculations are themselves also reducers in that they combine 5 years of monthly total-return data — 976 data points down to 120 unique covariance numbers and 16 semi-deviation numbers. Taking 976 down to 32 results in a compression ratio of 30.5:1.

Finally, as it currently stands, the visualization software and resulting plot say nothing about expected return.  The plot focuses solely on risk mitigation at the moment.  Naturally, I intend to change that.

Time for the Math and Finance — Consider Yourself Warned

I mentioned a 30.5:1 compression ratio. Just as music and other audio can be compressed, so can other information, including financial information.  However, only so much compression can be achieved in a lossless manner.  In audio compression, researchers have learned which portions of music and other audio can be “lost” without the listener telling the difference.  There is a field of psychoacoustics around doing just that — modeling what the human ear (and brain) can hear, and what gets “masked” by various physiological factors.

Even more important than preserving fidelity is extracting meaning. One way of achieving that is by removing “noise.” The visualization software performs significant computation to maintain as much angular fidelity as possible. As it optimizes angles, it keeps track of total error vis-a-vis the covariance matrix. It also keeps track of individual asset errors (the reciprocal of fitness — fit versus lack of fit).

The real alchemy comes from the line-length computation.  It combines semi-variance data with various fitness factors to determine each asset line length.

Just like Mercator projections for maps incur unavoidable error when converting from a 3-D globe to a 2-D map, the portfolio asset visualizations introduce error as well.  If one thinks of just the correlation matrix and semi-variance data, each asset has a dimensionality of 8.5 (in the case of 16 assets).  Reducing from 8.5-D to 2-D is a complex process, and there are an infinite number of ways to perform such an operation!  The art and [data] science is to enhance the “signal” while stripping away the “noise.”

The ultimate goals of portfolio data visualization technology are:

1) Transform raw data into actionable insight

2) Preserve sufficient fidelity of relevant data such that the “map” can be used to reliably get to the desired “destination”

I believe that the first goal has been achieved.  I know what actions to take… trying various other securities to find those that can build a “higher-angle”, and arguably more robust, more resilient investment portfolio.

However, the jury is still out on the degree [no pun intended] to which goal #2 has or has not been achieved.  Does this simple 2-D map help portfolio builders reliably and consistently navigate the 8+ dimensional portfolio space?

What about 3-D Modelling and Visualization?

I started working with 2-D for one key reason — I can easily share 2-D images with readers and clients alike.  I want feedback on what people like and dislike about the visuals. What is easy to understand, what is not?  What is useful to them, and what isn’t?  Ironing out those details in 2-D is step 1.

Of course I am excited by 3-D. Most of the building blocks are in my head, and I can heavily leverage the 2-D algorithms.  I am, however, holding off for now. I am waiting for feedback from readers and clients alike.  I spend a lot of time immersed in the language of math, statistics, and finance.  This can create a communication gap that is best mitigated through discussion with other people with other perspectives.  I wish to focus on 2-D for a while to learn more about market needs.

That being said, it is hard to resist creating a 3-D portfolio asset visualizer. The geek in me is extremely curious about how much the error terms will reduce when given a third degree of freedom to work with.

The bottom line is: Please give me any feedback: positive, negative, technical, aesthetic, etc. This is just the start. I am extremely enthusiastic about where this journey will take me and my company.

Disclosure and Disclaimer

Securities mentioned in this post are holdings in my personal retirement accounts (e.g. 401K, IRA, Roth IRA) as of the day of initial publication of this post. The purpose of this post is to illustrate features of Sigma1 Financial software. This is NOT investment advice, and NOT a recommendation to buy, sell, or hold any securities. Please refer to the “Disclaimer” Tab of the main page of this site for further information.

Choosing your Crystal Ball for Risk

Choose Your “Perfect” Risk Model

I start with a hypothetical.  You are choosing between three portfolios: A, B, and C.  If you could know with certainty one of the following annual risk measures, which would you choose:

  1. Variance
  2. Semi-variance
  3. Max Drawdown

For me the choice is obvious: max drawdown. Variance and semi-variance are deliberately decoupled from return.  In fact, we often say variance as short-hand for mean-return variance. Similarly, semi-variance is short-hand for mean-return semi-variance. For each variance flavor, mean-returns — average returns — are subtracted from the risk formula.  The mathematical bifurcation of risk and return is deliberate.

Max drawdown blends return and risk. This is mathematically untidy — max drawdown and return are non-orthogonal. However, the crystal ball of max drawdown allows choosing the “best” portfolio because it puts a floor on loss.  Tautologically the annual loss cannot exceed the annual max drawdown.

Cheating Risk

My revised answer stretches the rules.  If all three portfolios have future max drawdowns of less than 5 percent, then I’d like to know the semi-variances.

Of course there are no infallible crystal balls.  Such choices are only hypothetical.

Past variance tends to be reasonably predictive of future variance; past semi-variance tends to predict future semi-variance to a similar degree.  However, I have not seen data about the relationship between past and future drawdowns.

Research Opportunities Regarding Max Drawdown

It turns out that there are complications unique to max drawdown minimization that are not present with MVO or semi-variance optimization. However, at Sigma1, we have found some intriguing ways around those early obstacles.

That said, there are other interesting observations about max drawdown optimization:

1) Max drawdown only considers the worst drawdown period; all other risk data is ignored.

2) Unlike V or SV optimization, longer historical periods increase the max drawdown percentage.

3) There is a scarcity of evidence of the degree (or lack) of relationship between past max drawdowns and future.

(#1) can possibly be addressed by using hybrid risk measures, such as combined semi-variance and max drawdown measures. (#2) can be addressed by standardizing max drawdowns… a simple standardization would be DDnorm = DD/num_years.  Another possibility is DDnorm = DD/sqrt(num_years). (#3) requires research: research across different time periods, different countries, different market caps, etc.

Also note that drawdown has many alternative flavors — cumulative drawdown, weighted cumulative drawdown (WCDD), weighted cumulative drawdown over threshold — just to name three.
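For reference, here is a minimal Python sketch of max drawdown plus the two simple normalizations suggested in (#2); the monthly-return input and the function names are illustrative:

import numpy as np

def max_drawdown(returns):
    # Largest peak-to-trough loss of a periodic return series, as a positive fraction
    wealth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    running_peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / running_peak))

def normalized_drawdown(returns, periods_per_year=12, sqrt_scaling=False):
    # DDnorm = DD / num_years, or DD / sqrt(num_years) when sqrt_scaling is True
    num_years = len(returns) / periods_per_year
    dd = max_drawdown(returns)
    return dd / np.sqrt(num_years) if sqrt_scaling else dd / num_years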

Semi-Variance Risk Measure Reaching Critical Mass?

The bottom line is that early adopters have embraced semi-variance based optimization and the trend appears to be snowballing.  For instance, Morningstar now calculates risk “with an emphasis on downward variation.”  I believe that drawdown measures, either stand-alone or hybridized with semi-variance, are the future of post-post-modern portfolio theory.

Bye PMPT. Time for a Better Name! Contemporary Portfolio Theory?

I recommend starting with the acronym first.  I propose CPT or CAPT.  Either could be pronounced as “Capped”. However, CAPT could also be pronounced “Cap T”, as distinct from CAPM (“Cap M”). “C” could stand for either Contemporary or Current.  And the “A” — Advanced, Alternative — with the first being a bit pretentious, and the latter being more diplomatic. I put my two cents behind CAPT, pronounced “Cap T”; you can figure out what you want the letters to represent.  What is your 2 cents?  Please leave a comment!

Back to (Contemporary) Risk Measures

I see semi-variance beginning to transition from the early-adopter phase to the early-majority phase. However, my observations may be skewed by the types of interactions Sigma1 Financial invites. I believe that semi-variance optimization will be mainstream in 5 years or less. That is plenty of time for semi-variance optimization companies to flourish. However, we’re also looking for the next next big thing in finance.

 

Semi-variance: Choosing the Best Formula

Unlike variance, there are several different formulas for semivariance (SV).  If you are a college student looking to get the “right” answer on a test or quiz, the formula you are looking for is most likely:

Classic Semi-Variance Formula

The question-mark-colon syntax simply means if the expression before the “?” is true then the term before the “:” is used, otherwise the term after the “:” is used.  So a?b:c simply means chose b if a is true, else chose c.  This syntax is widely used in computer science, but less often in the math department.  However, I find it more concise than other formulations.
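Rendered in that notation, and using the same sample (n-1) denominator as the Excel example earlier in this post (some texts divide by n instead), my reconstruction of the classic formula is:

SV = \frac{1}{n-1} \sum_{i=1}^{n} \Big[ (r_i - \bar{r}) < 0 \;?\; (r_i - \bar{r})^2 \;:\; 0 \Big]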

Another common semivariance formula involves comparing returns to a required minimum threshold rt.  This is simply:

Min Return Threshold SV

Classic mean-return semivariance should not be directly compared to mean-return variance.  However a slight modification makes direct comparison more meaningful.  In general approximately half of mean-adjusted returns are positive and half are negative (exactly zero is a relatively rare event and has no impact to either formula).  While mean-variance always has n terms, semi-variance only uses a subset which is typically of size n/2.  Thus including a factor of 2 in the formula makes intuitive sense:

Modified Semi-Variance

Finally, another useful formulation is one I call “Modified Drawdown Only” (MDO) semivariance.  The name is self-explanatory… only drawdown events are counted.  SVmdo does not require ravg (r bar) nor rt.  It produces nearly identical values to SVmod for rapid sampling (say for anything more frequent than daily data).  For high-speed trading it also has the advantage of not requiring all of the return data a priori, meaning it can be computed as each return data point becomes available, rather than retrospectively.

Modified Drawdown-Only Semi-variance
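For completeness, here are my reconstructions of the other three formulas pictured above, in the same notation (the exact normalizing constants in the original images may differ): the minimum-return-threshold version, SVmod with its factor of 2, and drawdown-only SVmdo, which needs neither the mean return nor a threshold:

SV_{r_t} = \frac{1}{n-1} \sum_{i=1}^{n} \Big[ (r_i - r_t) < 0 \;?\; (r_i - r_t)^2 \;:\; 0 \Big]

SV_{mod} = \frac{2}{n-1} \sum_{i=1}^{n} \Big[ (r_i - \bar{r}) < 0 \;?\; (r_i - \bar{r})^2 \;:\; 0 \Big]

SV_{mdo} = \frac{2}{n-1} \sum_{i=1}^{n} \Big[ r_i < 0 \;?\; r_i^2 \;:\; 0 \Big]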

Why might  SVmdo be useful in high-speed trading?  One use may be in put/call option pricing arbitrage strategies.  Black–Scholes, to my knowledge, makes no distinction between “up-side” and “down-side” variance, and simply uses plain variance. [Please shout a comment at me if I am mistaken!]    However if put and call options are “correctly” priced according to Black–Scholes, but the data shows a pattern of, say, greater downside variance than normal variance on the underlying security, put options may be undervalued.  This is just an off-the-cuff example, but it illustrates a potential situation for which SVmdo is best suited.

Pick Your Favorite Risk Measure

Personally, I slightly favor SVmdo over SVmod for computational reasons. They are often quite similar in practice, especially when used to rank risk profiles of a set of candidate portfolios. (The fact that both are anagrams of each other is deliberate.)

I realize that the inclusion of the factor 2 is really just a semantic choice.  Since V and (classic) SV, amortized over many data sets, are expected to differ by a factor of 2, standard deviation, σ,  and semideviation, σd, can be expected to differ by the square root of 2.  I consider this mathematically untidy.  Conversely, I consider SVmod to be the most elegant formulation.

Principles of Portfolio Optimization Software

Explaining technical investment concepts in a non-technical way is critical to having a meaningful dialog with individual investors.  Most individual investors (also called “retail investors”, or “small investors”) do not have the time nor the desire to learn the jargon and concepts behind building a solid investment portfolio.  This is generally true for most individual investors regardless of the size of their investment portfolios.  Individual investors expect investment professionals (also called “institutional investors”) to help manage their portfolios and explain the major investment decisions behind the management of their individual portfolios.

In the same way that a good doctor helps her patient make informed medical decisions, a good investment adviser helps her clients make informed investment decisions.

I get routinely asked how the HALO Portfolio Optimizer works.  Every time I answer that question, I face two risks: 1) that I don’t provide enough information to convince investment professionals or their clients that HALO optimization provides significant value and risk-mitigation capability, and 2) that I risk sharing key intellectual property (IP) unique to the Sigma1 Financial HALO optimizer.

This post is my best effort to provide both investment advisers and their clients with enough information to evaluate and understand HALO optimization, while avoiding sharing key Sigma1 trade secrets and intellectual property.  I would very much appreciate feedback, both positive and negative, as to whether I have achieved these goals.

First Principle of Portfolio Optimization Software

Once when J.P. Morgan was asked what the market would do, he answered “It will fluctuate.”  While some might find this answer rather flippant, I find it extremely insightful.  It turns out that so-called modern portfolio theory (MPT) is based on understanding (or quantifying) market fluctuations. MPT labels these fluctuations as “risk” and identifies “return” as the reward that a rational investor is willing to accept for a given amount of risk.  MPT assumes that a rational investor, or his/her investment adviser, will diversify away most or all “diversifiable risk” by creating a suitable investment portfolio tailored to the investor’s current “risk tolerance.”

In other words, the primary job of the investment adviser (in a “fiduciary” role), is to maximize investment portfolio return for a client’s acceptable risk.  Said yet another way, the job is to maximize the risk/reward ratio for the client, without incurring excess risk.

Now for the first principle: past asset “risk” tends to indicate future asset “risk”.  In general an asset that has been previously more volatile will tend to remain more volatile, and an asset that has been less volatile will tend to remain less volatile.  Commonly, both academia and professional investors have equated volatility with risk.

Second Principle of Portfolio Optimization Software

The Second Principle is closely related to the first.  The idea is that past portfolio volatility tends to indicate future portfolio volatility. This thesis is so prevalent that it is almost inherently assumed.  This is evidenced by research that reaches beyond volatility and looks at the hysteresis of return-versus-volatility ratios, in papers such as this.

Past Performance is Not Necessarily Indicative of Future Results.

Third Principle of Portfolio Optimization Software

The benefits of diversification are manifest in risk mitigation.  If two assets are imperfectly correlated, then their combined volatility (risk) will be less than the weighted average of their individual volatilities.  An in-depth mathematical description of two-asset portfolio volatilities can be found on William Sharpe’s web page.  Two-asset mean-variance optimization is relatively simple, and can be performed with relatively few floating-point operations on a computer.  This process creates the two-asset efficient frontier*.  As more assets are added to the mix, the computational demand to find the optimal efficient frontier grows geometrically; if you don’t immediately see why, look at page 8 of this paper.

A much simpler explanation of the third principle is as follows.  If asset A has an annual standard deviation of 10%, and asset B an annual standard deviation of 20%, and A and B are not perfectly correlated, then the portfolio of one half invested in A and the other half invested in B will have an annual standard deviation of less than 15%.  (Non-perfectly correlated means a correlation of less than 1.0.)  Some example correlations of assets can be found here.
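A quick worked example of that claim, assuming (purely for illustration) a correlation of 0.5 between A and B:

\sigma_p^2 = w_A^2\sigma_A^2 + w_B^2\sigma_B^2 + 2\,w_A w_B\,\rho_{AB}\,\sigma_A\sigma_B
           = (0.5)^2(0.10)^2 + (0.5)^2(0.20)^2 + 2(0.5)(0.5)(0.5)(0.10)(0.20)
           = 0.0025 + 0.0100 + 0.0050 = 0.0175,
\qquad \sigma_p = \sqrt{0.0175} \approx 13.2\% < 15\%

Any correlation below 1.0 gives a portfolio standard deviation below the 15% weighted average; a correlation of exactly 1.0 gives exactly 15%.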

In so-called plain English, the Third Principle of Portfolio Optimization can be stated: “For a given level of expected return, portfolio optimization software can reduce portfolio risk by utilizing the fact that different assets move somewhat independently from each other.”

Fourth Principle of Portfolio Optimization Software

The Fourth Principle of Portfolio Optimization establishes a relationship between risk and return.  The classic assumption of modern portfolio theory (MPT) is that so-called systematic risk is rewarded (over a long-enough time horizon) with increased returns.  Portfolio-optimization software seeks to reduce or eliminate unsystematic risk when creating an optimized set of portfolios.  The portfolio manager can thus select one of these optimized portfolios from the “best-in-breed” list created by the optimization software that is best suited to his/her client’s needs.

Fifth Principle of Portfolio Optimization Software

The 5th Principle is that the portfolio manager and his team add value to the portfolio composition process by 1) selecting a robust mix of assets, 2) applying constraints to the weights of said assets and asset-groups, and 3) assigning expected returns to each asset.  The 5th Principle focuses on the assignment of expected returns.  This process can be grouped under the category of investment analysis or investment research.  Investment firms pay good money for either in-house or contracted investment analysis of selected securities.

Applying the Portfolio Optimization Principles Together

Sigma1 Financial HALO Software applies these five principles together to help portfolio managers improve or fine-tune their proprietary-trading and/or client investment portfolios.  HALO Portfolio Optimization software uses the assets, constraints, and expected returns from the 5th Principle as a starting point.  It then applies the 4th Principle by optimizing away unsystematic risk from a set of portfolios, taking maximum advantage of varying degrees of non-correlation among the portfolio assets.  The 3rd Principle alludes to the computational difficulty of solving the multi-asset optimization problem.  Principles 1 and 2 form the bedrock of the concepts behind the use of historical correlation data to predict and estimate future correlations.

The Fine Print

Past asset volatility of most assets and most portfolios is historically well correlated with future volatility. However, not only are assets increasingly correlated, there is some evidence that asset correlations tend to increase during times of financial crisis. Even if assets are more correlated, there remains significant value in exploiting partial-discorrelation.
(*) The two-asset model can be represented as two parametric functions of a single variable, “t”, ER(t), and var(t).  t simply represents the investment proportion invested in asset 0 (aka asset A).  For three variables, expected return becomes ER(t0,t1) as does var(t0,t1).  And so on for increasing numbers of assets.  The computational effort required to compute ER(t0…tn) scales linearly with number of assets, but var(t0…tn) scales geometrically.
Optimizing efficiently within this complex space benefits from creative algorithms and heuristics.

Inverted Risk/Return Curves

Over 50 years of academic financial thinking is based on a kind of financial gravity:  the notion that for a relatively diverse investment portfolio, higher risk translates into higher return given a sufficiently long time horizon.  Stated simply: “Risk equals reward.”  Stated less tersely, “Return for an optimized portfolio is proportional to portfolio risk.”

As I assimilated the CAPM doctrine in grad school, part of my brain rejected some CAPM concepts even as it embraced others.  I remember seeing a graph of asset diversification showing that randomly selected portfolios exhibited better risk/reward profiles as they grew to about 30 assets, at which point further improvement was minuscule, only asymptotically approaching an “optimal” risk/reward level.  That resonated.

Conversely, strict CAPM thinking implied that a well-diversified portfolio of high-beta stocks will outperform a market-weighted portfolio of stocks over the long-term, albeit in a zero-alpha fashion.  That concept met with cognitive dissonance.

Now, dear reader, as a reward for staying with this post this far, I will reward you with some hard-won insights.  After much risk/reward curve fitting on compute-intensive analyses, I found that the best-fit expected-return metric for assets was proportional to the square root of beta.  In my analyses I defined an asset’s beta as 36-month, monthly returns relative to the benchmark index.  Mostly, for US assets, my benchmark “index” was VTI total-return data.

Little did I know, at the time, that a brilliant financial maverick had been doing the heavy academic lifting around similar financial ideas.  His name is Bob Haugen. I only learned of the work of this kindred spirit upon his passing.

My academic number crunching on data since 1980 suggested a positive, but decreasing incremental total return vs. increasing volatility (or for increasing beta).  Bob Haugen suggested a negative incremental total return for high-volatility assets above an inflection-point of volatility.

Mr. Haugen’s lifetime of  published research dwarfs my to-date analyses. There is some consolation in the fact that I followed the data to conclusions that had more in common with Mr. Haugen’s than with the Academic Consensus.

An objective analysis of the investment approach of three investing greats will show that they have more in common with Mr. Haugen than with Mr. E.M. Hypothesis (aka Mr. Efficient Markets Hypothesis, not to be confused with “Mr. Market”).  Those great investors are 1) Benjamin Graham, 2) Warren Buffett, 3) Peter Lynch.

CAPM suggests that, with either optimal “risk-free” or leveraged investments, a capital allocation line exists — tantamount to a linear risk-reward relationship. This line is set according to a unique tangent point on the efficient frontier curve of expected volatility versus expected return.

My research at Sigma1 suggests a modified curve with a tangent point portfolio comprised, generally, of a greater proportion of low volatility assets than CAPM would indicate.  In other words, my back-testing at Sigma1 Financial suggests that a different mix, favoring lower-volatility assets is optimal.  The Sigma1 CAL (capital allocation line) is different and based on a different asset mix.  Nonetheless, the slope (first derivative) of the Sigma1 efficient frontier is always upward sloping.

Mr. Haugen’s research indicates that, in theory, the efficient frontier curve past a critical point begins sloping downward as portfolio volatility increases. (Arguably the curve past the critical point ceases to be “efficient”, but from a parametric point of view it can be calculated for academic or theoretical purposes.)  An inverted risk/return curve can exist, just as an inverted Treasury yield curve can exist.

Academia routinely deletes the dominated bottom of the parabola-like portion of the complete “efficient frontier” curve (resembling a parabola of the form x = A + B*y^2) for allocations of two assets (commonly stocks (e.g. SPY) and bonds (e.g. AGG)).

Maybe a more thorough explanation is called for.  In the two-asset model the complete “parabola” is a parametric equation where x = Vol(t*A, (1-t)*B) and y = ER(t*A, (1-t)*B).  [Vol = volatility or standard deviation, ER = expected return.]  The bottom part of the “parabola” is excluded because it has no potential utility to any rational investor.  In the multi-weight model, x = minVol(W), y = maxER(W), and W is subject to the condition that the sum of weights in vector W equals 1.  In the multi-weight, multi-asset model the underside is automatically excluded.  However, there is no guarantee that there is no point where dy/dx is negative.  In fact, Bob Haugen’s research suggests that negative slopes (dy/dx) are possible, even likely, for many collections of assets.

Time prevents me from following this financial rabbit hole to its end.  However I will point out the increasing popularity and short-run success of low-volatility ETFs such as SPLV, USMV, and EEMV.  I am invested in them, and so far am pleased with their high returns AND lower volatilities.

==============================================

NOTE: The part about W is oversimplified for flow of reading.  The bulkier explanation is y is stepped from y = ER(W) for minVol(W) to max expected-return of all the assets (Wmax_ER_asset = 1, y = max_ER_asset_return), and each x = minVol(W) s.t. y = ER(W) and sum_of_weights(W) = 1.   Clear as mud, right?  That’s why I wrote it the other way first.