How to Write a Mean-Variance Optimizer (Part III)… In R

Parts 1 and 2 left a trail of breadcrumbs to follow.  Now I provide a full-color map, a GPS, and local guide.  In other words the complete solution in the R statistical language.

Recall that the fast way to compute portfolio variance is:



The companion equation is rp = wTrtn, where rtn is a column vector of expected returns (or historic returns) for each asset.  The first goal is to find find w0 and wn. w0 minimizes variance regardless of return, while wn maximizes return regardless of variance.  The goal is to then create the set of vectors {w0,w1,…wn} that minimizes variance for a given level of expected return.

I just discovered that someone already wrote an excellent post that shows exactly how to write an MVO optimizer completely in R. Very convenient!  Enjoy…

http://economistatlarge.com/portfolio-theory/r-optimized-portfolio

The Equation Everyone in Finance Should Know (MV Optimization: How To, Part 2)

As the previous post shows, it all starts with…



In order get close to bare-metal access to your compute hardware, use C.  In order to utilize powerful, tested, convex optimization methods use CVXGEN.  You can start with this CVXGEN code, but you’ll have to retool it…

• Discard the (m,m) matrix for an (n,n) matrix. I prefer to still call it V, but Sigma is fine too.  Just note that there is a major difference between Sigma (the covariance-variance matrix) and sigma (individual asset-return variances matrix; the diagonal of Sigma).
• Go meta for the efficient frontier (EF).  We’re going to iteratively generate/call CVXGEN with multiple scripts. The differences will be w.r.t the E(Rp).
• Computing Max: E(Rp)  is easy, given α.  [I’d strongly recommend renaming this to something like expect_ret comprised of (r1, r2, … rn). Alpha has too much overloaded meaning in finance].
• [Rmax] The first computation is simple.  Maximize E(Rp) s.t constraints.  This is trivial and can be done w/o CVXGEN.
• [Rmin] The first CVXGEN call is the simplest.  Minimize σp2 s.t. constraints, but ignoring E(Rp)
• Using Rmin and Rmax, iteratively call CVXGEN q times (i=1 to q) using the additional constraint s.t. Rp_i= Rmin + (i/(q+1)*(Rmax-Rmin). This will produce q+2 portfolios on the EF [including Rmin and Rmax].  [Think of each step (1/(q+1))*(Rmax-Rmin) as a quantization of intermediate returns.]
• Present, as you see fit, the following data…
• (w0, w1, …wq+1)
• [ E(Rp_0), …E(Rp_(q+1)) ]
• [ σ(Rp_0), …σ(Rp_(q+1)) ]

My point is that —  in two short blog posts — I’ve hopefully shown how easily-accessible advanced MVO portfolio optimization has become.  In essence, you can do it for “free”… and stop paying for simple MVO optimization… so long as you “roll your own” in house.

I do this for the following reasons:

• To spread MVO to the “masses”
• To highlight that if “anyone with a master’s in finance and computer can do MVO for free” to consider their quantitative portfolio-optimization differentiation (AKA portfolio risk management differentiation), if any
• To emphasize that this and the previous blog will not greatly help with semi-variance portfolio optimization

I ask you to consider that you, as one of the few that read this blog, have a potential advantage.  You know who to contact for advanced, relatively-inexpensive SVO software. Will you use that advantage?

How to Write a Mean-Variance Optimizer: Part 1

The Equation Everyone in Finance Show Know, but Many Probably Don’t!

Here it is:


… With thanks to codecogs.com which makes it really easy to write equations for the web.

This simple matrix equation is extremely powerful.  This is really two equations.  The first is all you really need.  The second is just merely there for illustrative purposes.

This formula says how the variance of a portfolio can be computed from the position weights wT = [w1 w2 … wn] and the covariance matrix V.

• σii ≡ σi2 = Var(Ri)
• σij ≡ Cov(Ri, Rj) for i ≠ j

The second equation is actually rather limiting.  It represents the smallest possible example to clarify the first equation — a two-asset portfolio.  Once you understand it for 2 assets, it is relatively easy to extrapolate to 3-asset portfolios, 4-asset portfolios, and before you know it, n-asset portfolios.

Now I show the truly powerful “naked” general form equation:

This is really all you need to know!  It works for 50-asset portfolios. For 100 assets. For 1000.  You get the point. It works in general. And it is exact. It is the E = mc2 of Modern Portfolio Theory (MPT).  It at least about 55 years old (2014 – 1959), while E = mc2 is about 99 years old (2014 – 1915).  Harry Markowitz, the Father of (M)PT simply called it “Portfolio Theory” because:

Yes, I’m calling Markowitz the Einstein of Portfolio Theory AND of finance!  (Now there are several other “post”-Einstein geniuses… Bohr, Heisenberg, Feynman… just as there are Sharpe, Scholes, Black, Merton, Fama, French, Shiller, [Graham?, Buffet?]…)   I’m saying that a physicist who doesn’t know E = mc2 is not much of a physicist. You can read between the lines for what I’m saying about those that dabble in portfolio theory… with other people’s money… without really knowing (or using) the financial analog.

Why Markowitz is Still “The Einstein” of Finance (Even if He was “Wrong”)

Markowitz said that “downside semi-variance” would be better.  Sharpe said “In light of the formidable
computational problems…[he] bases his analysis on the variance and standard deviation.”

Today we have no such excuse.  We have more than sufficient computational power on our laptops to optimize for downside semi-variance, σd. There is no such tidy, efficient equation for downside semi-variance.  (At least not that anyone can agree on… and none that that is exact in any sense of any reasonable mathematical definition of the word ‘exact’.)

Fama and French improve upon Markowitz (M)PT [I say that if M is used in MPT, it should mean “Markowitz,” not “modern”, but I digress.] Shiller, however, decimates it.  As does Buffet, in his own applied way.  I use the word decimate in its strict sense… killing one in ten.  (M)PT is not dead; it is still useful.  Diversification still works; rational investors are still risk-averse; and certain low-beta investments (bonds, gold, commodities…) are still poor very-long-term (20+ year) investments in isolation and relative to stocks, though they still can serve a role as Markowitz Portfolio Theory suggests.

Wanna Build your Own Optimizer (for Mean-Return Variance)?

This blog post tells you most of the important bits.  I don’t really need to write part 2, do I?   Not if you can answer these relatively easy questions…

• What is the matrix expression for computing E(Rp) based on w?
• What simple constraint is w subject to?
• How does the general σp2 equation relate to the efficient frontier?
• How might you adapt the general equation to efficiently compute the effects of a Δw event where wi increases and wj decreases?  (Hint “cache” the wx terms that don’t change,)
• What other constraints may be imposed on w or subsets (asset categories within w)?  How will you efficiently deal with these constraints?
• Is short-selling allowed?  What if it is?
• OK… this one’s a bit tricky:  How can convex optimization methods be applied?

If you can answer these questions, a Part 2 really isn’t necessary is it?

Clover Patterns Show How Portfolios Manage Risk

Illustration of Classic Covariance.

The red and green “clover” pattern illustrates how traditional risk can be modeled.  The red “leaves” are triggered when both the portfolio and the “other asset” move together in concert.  The green leaves are triggered when the portfolio and asset move in opposite directions.

Each event represents a moment in time, say the closing price for each asset (the portfolio or the new asset).  A common time period is 3-years of total-return data [37 months of price and dividend data reduced to 36 monthly returns.]

Plain English

When a portfolio manager considers adding a new asset to an existing portfolio, she may wish to see how that asset’s returns would have interacted with the rest of the portfolio.  Would this new asset have made the portfolio more or less volatile?  Risk can be measured by looking at the time-series return data.  Each time the asset and the portfolio are in the red, risk is added. Each time they are in the green, risk is subtracted.  When all the reds and greens are summed up there is a “mathy” term for this sum: covariance.  “Variance” as in change, and “co” as in together. Covariance means the degree to which two items move together.

If there are mostly red events, the two assets move together most of the time.  Another way of saying this is that the assets are highly correlated. Again, that is “co” as in together and “related” as in relationship between their movements. If, however, the portfolio and asset move in opposite directions most of the time, the green areas, then the covariance is lower, and can even be negative.

Covariance Details

It is not only the whether the two assets move together or apart; it is also the degree to which they move.  Larger movements in the red region result in larger covariance than smaller movements.  Similarly, larger movements in the green region reduce covariance.  In fact it is the product of movements that affects how much the sum of covariance is moved up and down.  Notice how the clover-leaf leaves move to the center, (0,0) if either the asset or the portfolio doesn’t move at all.  This is because the product of zero times anything must be zero.

Getting Technical: The clover-leaf pattern relates to the angle between each pair of asset movements.  It does not show the affect of the magnitude of their positions.

If the incremental covariance of the asset to the portfolio is less than the variance of the portfolio, a portfolio that adds the asset would have had lower overall variance (historically).  Since there is a tenancy (but no guarantee!) for asset’s correlations to remain somewhat similar over time, the portfolio manager might use the covariance analysis to decide whether or not to add the new asset to the portfolio.

Semi-Variance: Another Way to Measure Risk

Semi-variance Visualization

After staring at the covariance visualization, something may strike you as odd — The fact that when the portfolio and the asset move UP together this increases the variance. Since variance is used as a measure of risk, that’s like saying the risk of positive returns.

Most ordinary investors would not consider the two assets going up together to be a bad thing.  In general they would consider this to be a good thing.

So why do many (most?) risk measures use a risk model that resembles the red and green cloverleaf?  Two reasons: 1) It makes the math easier, 2) history and inertia. Many (most?) textbooks today still define risk in terms of variance, or its related cousin standard deviation.

There is an alternative risk measure: semi-variance. The multi-colored cloverleaf, which I will call the yellow-grey cloverleaf, is a visualization of how semi-variance is computed. The grey leaf indicates that events that occur in that quadrant are ignored (multiplied by zero).  So far this is where most academics agree on how to measure semi-variance.

Variants on the Semi-Variance Theme

However differences exist on how to weight the other three clover leaves.  It is well-known that for measuring covariance each leaf is weighted equally, with a weight of 1. When it comes to quantifying semi-covariance, methods and opinions differ. Some favor a (0, 0.5, 0.5, 1) weighting scheme where the order is weights for quadrants 1, 2, 3, and 4 respectively. [As a decoder ring Q1 = grey leaf, Q2 = green leaf, Q3 = red leaf, Q4 = yellow leaf].

Personally, I favor weights (0, 3, 2, -1) for the asset versus portfolio semi-covariance calculation.  For asset vs asset semi-covariance matrices, I favor a (0, 1, 2, 1) weighting.  Notice that in both cases my weighting scheme results in an average weight per quadrant of 1.0, just like for regular covariance calculations.

Financial Industry Moving toward Semi-Variance (Gradually)

Semi-variance more closely resembles how ordinary investors view risk. Moreover it also mirrors a concept economists call “utility.” In general, losing \$10,000 is more painful than gaining \$10,000 is pleasurable. Additionally, losing \$10,000 is more likely to adversely affect a person’s lifestyle than gaining \$10,000 is to help improve it.  This is the concept of utility in a nutshell: losses and gains have an asymmetrical impact on investors. Losses have a bigger impact than gains of the same size.

Semi-variance optimization software is generally much more expensive than variance-based (MVO mean-variance optimization) software.  This creates an environment where larger investment companies are better equipped to afford and use semi-variance optimization for their investment portfolios.  This too is gradually changing as more competition enters the semi-variance optimization space.  My guestimate is that currently about 20% of professionally-managed U.S. portfolios (as measured by total assets under management, AUM) are using some form of semi-variance in their risk management process.  I predict that that percentage will exceed 50% by 2018.

Surpassing the Frontier?

Suppose you have the tools to compute the mean-return efficient frontier to arbitrary (and sufficient) precision — given a set of total-return time-series data of asset/securities.  What would you do with such potential?

I propose that the optimal solution is to “breach the frontier.”  Current portfolios provide a historic reference. Provided reference/starting point portfolios have all (so far) provided sufficient room for meaningful and sufficient further optimization, as gauged by, say, improved Sortino ratios.

Often, when the client proposes portfolio additions, some of these additions allow the optimizer to push beyond the original efficient frontier (EF), and provide improved Sortino ratios. Successful companies contact  ∑1 in order to see how each of their portfolios:

1) Land on a risk-versus-reward (expected-return) plot
2) Compare to one or more benchmarks, e.g. the S&P500 over the same time period
3) Compare to an EF comprised of assets in the baseline portfolio

Our company is not satisfied to provide marginal or incremental improvement. Our current goal is provide our client  with more resilient portfolio solutions. Clients provide the raw materials: a list of vetted assets and expected returns.  ∑1 software then provides near-optimal mix of asset allocations that serve a variety of goals:

1) Improved projected risk-adjusted returns (based on semi-variance optimization)
2) Identification of under-performing assets (in the context of the “optimal” portfolio)
3) Identification of potential portfolio-enhancing assets and their asset weightings

We are obsessed with meaningful optimization. We wish to find the semi-variance (semi-deviation) efficient frontier and then breach it by including client-selected auxiliary assets. Our “mission” is  as simple as that — Better, more resilient portfolios

Portfolio-Optimization Plots

I am happy to announce that the latest version of the HALO Portfolio-Optimization Suite is now available.  Key features include:

• Native asset constraint support
• Native asset-category constraint support
• Dramatic run-time improvements of 2X to over 100X

Still supported are user-specified risk models, including semi-variance and max-drawdown.  What has been temporarily removed (based on minimal client interest) is 3-D 2-risk modelling and optimization.  This capability may be re-introduced as a premium feature, pending client demand.

Here is a quick screenshot of a 20-asset, fixed-income portfolio optimization.  The “risk-free” rate used for the tangent capital allocation line (CAL) is 1.2% (y-intercept not shown), reflecting a mix of T-Bills and stable value funds.  Previously this optimization took 18 minutes on an \$800 laptop computer.  Now, with the new HALO software release, it runs in only 11 seconds on the same laptop.

Optimized Fixed-Income (only) Portfolio.

Choices, Opportunities, and Solutions

To date I’ve invested approximately 800 hours developing and testing the heuristics and algorithms behind HALO. Finding exact solutions (with respect to expected-return assumptions) to certain real-world portfolio-optimization problems can be solved. Finding approximate solutions to other real-world portfolio-optimization problems is relatively easy, but finding provably optimal solutions is currently “impossible”. The current advanced science and art of portfolio optimization involves developing methods to efficiently find nearly optimal solutions.

I believe that HALO represents a significant step forward in finding nearly-optimal solutions to generalized risk models for investment portfolios. The primary strengths of HALO are in flexibility and dimensionality of financial risk modeling. While HALO currently finds solutions that are almost identical to exact solutions for convex optimization problems; the true advantage of HALO is in the quality of solutions for non-convex portfolio-optimization problems

Do you know if your particular optimization metric can be articulated in canonical convex notation? I argue that HALO does not care.  If it can be, HALO will find a near-optimal solution virtually identical to the ideal convex optimization solution.  If it cannot be, and is indeed non-convex, HALO will find solutions competitive with other non-convex optimization methods.

It could be argued that “over-fitting” is a potential danger of optimal and near-optimal solutions. However, I argue that given a sufficiently diverse and under-constrained optimization task, over-fitting is less worrisome.   In other words, the quality of the inputs greatly influences the quality of the outputs.  One secret is to supply high-quality (e.g. asset expected return) estimates to the optimization problem.

The Future of Investing is Automation

Developing an Automation Mindset for Investing

In 2010, I bought the domain name Sigma1.com with the idea of creating an hedge fund that I would manage.  In order to measure and manage my investment strategies objectively, I began thinking about benchmarks and financial analysis software.  And as I ran scenarios through Excel and some light-weight analysis software I created, I began to realize that analysis, by itself was very limited.  I could only back-test one portfolio at a time, and I had to construct each portfolio’s asset weights manually.  It soon became obvious that I needed portfolio optimization software.

I learned that portfolio optimization software with the capabilities I wanted was extremely expensive. Further, I realized that even if, say, I negotiated a deal with MSCI where they provided Sigma1 Financial with their Barra Portfolio Manager for free, it would not differentiate a Sigma1 hedge fund from other hedge funds using the same software.

I was beginning to interact with several technology entrepreneurs and angel investors.  I quickly learned that legal costs and barriers to entry for a new hedge were intractable.  If Sigma1 attracted \$10M in assets from accredited investors in 12 months, and charged 2 and 20, it would be a money loosing enterprise.  Cursory research revealed that critical mass for a profitable (for the hedge fund managers) hedge fund could be as high as \$500M.  Luckily, I had learned about the concept of the “entrepreneurial pivot“.

The specific pivots Sigma1 used were a market segment pivot followed by a technology pivot. I realized that while the high cost of good portfolio optimization software is bad for a hedge fund startup, it was great for a financial software startup.  Suddenly, the Sigma1 Financial target market switched from accredited investors to financial professionals (investment managers, fund managers, proprietary traders, etc).  This was a key market segment pivot.

Just creating a cheaper portfolio optimizer seemed unlikely to provide sufficient incentive to displace entrenched portfolio optimizers. Sigma1 needed a technology pivot — finding a solution using a completely different technology.  Most prior portfolio optimizers use some variant of linear programming (LP) [or QP or NLP] to help find optimal portfolios. Moreover they also create an asset covariance matrix as a starting point for the optimization.

One stormy day, I realized that some algorithms I created to solve statistical electrical engineering problems in grad school could be adapted to optimize investment portfolios. The method I devised not only avoided LP, QP, or NLP methods; it also dispensed with the need for a covariance matrix.  Over then next several days I realized that by eliminating dependence on a covariance matrix, the algorithm I later named HALO, could use both traditional and alternate risk measures ranging from variance-based (eg. standard-deviation of return) to covariance-based ones (e.g. beta) to semivariance to max draw down.  By developing a vastly different technology, HALO could optimize for risks such as semivariance and Sortino ratios, or max drawdown, or even custom risk measures devised by the client.

Algorithms Everywhere

Long before Sigma1 began developing HALO, the financial industry has been increasingly reliant on digital systems and various financial algorithms. As digital communication networks and electronic stock exchanges gained trading volume, various forms of program trading began to flourish.  This includes the often maligned high-frequency trading variant of automated trading.

Concurrently, more and more trading volume has gone online.  A significant portion of today’s individual investors have never placed a trade using a human stock broker.

There are now numerous automated investment analysis tools, many of which come free with a brokerage account, while others are free or low-cost stand-alone online tools.  Examples of the former include the Fidelity’s nascent GPS (Guided Portfolio Summary) to more seasoned offerings such as Financial Engines.  Online portfolio analysis offering range from Morningstar’s Instant X-Ray, to sites like ETFreplay.

However these software offerings are just the beginning. A company call FutureAdvisor has partnered with Fidelity and TD Ameritrade to allow its automate portfolio software to make trades on its users behalf. Companies like Future Advisor have the potential to help small investors benefit from custom-tailored investment advice utilizing proven academic research (e.g. Fama French) at a very low cost — costs so low that they would not be profitable for human investment advisers to provide.

If successful (and I believe some automated investment companies will be), why should they stop at small-time investors, with less than \$500,000 in investable assets?  Why not \$1,000,000 or more?  Nothing should stop them!

I could easily imagine Mark Zuckerberg, Sergey Brin, or Larry Page utilizing an automated investment company’s software to manage a large part of their portfolios.  If we, as a society, are considering allowing automated systems to drive our cars for us, surely they can also manage our investment portfolios.

The Future Roll of the Human Financial Adviser

There will always be some percentage of investors who want a personal relationship with a financial adviser. Human investment advisers can excel at explaining investment concepts and putting investors at ease during market corrections.  In some ways human investment advisers even function as personal financial counselors, listening to their clients emotional financial stories.  And, of course, there are some people who want to be able to pick up the phone and yell at a real person for letting them suffer market losses.  Finally, there are people with Luddite tenancies who want as little to do with technology as possible.  For all these reasons human investment advisers will have a place in the future world of finance.

Investment Automation will Accelerate

There are some clear trends in the investing world.  Index investing will continue to grow, as will total ETF assets under management (AUM). Alternative investments from rental property to master limited partnerships (MLPs) to private equity are also likely to become part of the portfolios of more sophisticated and affluent investors.

With the exception of high-frequency trading, which has probably saturated arbitrage and front-running opportunities, I expect algorithmic (algo) management to increase as an overall percentage of US and global AUM. Some algorithmic trading and investing will be of the “hardwired” variety where the algo directly connects to the exchanges and makes trades, while the rest of the algo umbrella will comprise trading and investing decisions made by financial software and entered manually by humans with minimal revision.  There will also be hybrid methods where investment decisions are a synthesis of “automated” and “manual” processes.  I expect the scope of these “flavors” of automated investing to not only increase, but to accelerate in the near term.

It is important to note, however, that for the foreseeable future, the ultimate arbiters of algorithmic investing and portfolio optimization will be human.  The software architects and developers will exercise significant influence on the methodology behind the fund and portfolio optimization software.  Furthermore, the users of the software will have supreme control over what parameters go into the optimization process such as including or excluding or bounding certain assets and asset classes (amongst many other factors under their direct control).

That being said, the future of investing will be increasingly the domain of financial engineers, software developers and testers, and people with skills in financial mathematics, statistics, algorithms, data structures, GUIs, web interfaces and usability. Additionally, the financial software automation revolution will have profound impacts on legal professionals and marketers in the financial domain, as well as more modest impacts on accountants and IT professionals.

Some financial professionals will take the initiative and find a place on the leading edge of the financial automation revolution. It is likely to be a wild but lucrative ride. Others will seek the short-term comfort of tradition. They may be able to retain many of their current clients through sheer charisma and inertia, but may find it increasingly difficult the appeal to younger affluent clients steeped in a culture of technology.

Pursuing Alpha with Antivariance

A simple and marginally-effective strategy to reduce portfolio variance is by constructing an asset correlation matrix, selecting assets with low (preferably negative) correlations, and building a portfolio of low-correlation assets.  This basic strategy involves creating a set of assets whose cross-correlations (covariances) are minimized.

One reason this basic  strategy is only somewhat effective is that a correlation matrix (or covariance matrix) only provides a partial picture of the chosen investment landscape.  Some fundamental limitations include non-normal distributions, skewness, and kurtosis to name a few.  To most readers these are fancy words with varying degrees of meaning.

Personally, I often find the mathematics of the work I do seductive like a Siren’s song.  I endeavor to strike a balance between exploring tangential mathematical constructs, and keeping most of my math applied. One mental antidote to the Siren’s song of pure mathematics is to think more conceptually than mathematically by asking questions like:

What are the goals of portfolio optimization?  What elements of the investing landscape allow these goals to be achieved?

I then attempt to answer these questions with explanations that a person with a college degree but without a mathematically background beyond algebra could understand.  This approach lets me define the concept first, and develop the math later.  In essence I can temporarily free my mind of the slow, system 2 thinking generally required for math.

Recently, I came up with the concept of antivariance.  I’m sure others have had similar ideas and a cursory web search reveals that as profession poker player’s nickname.  I will layout my concept of antivariance as it relates to porfolio theory in particular and the broader concept in general.

By convention, one of the key objective of modern portfolio theory is the reduction of portfolio return variance.  The mathematical concept is the idea that by combining assets with correlations of less than 1.0, the return variance is less than the weighted sums of each asset’s individual variance.

Antivariance assumes that there are underlying patterns explain why two or more assets should be somewhat less correlated (independent), but at times negatively correlated.  Consider the affects of major hurricanes like Andrew or Katrina.  Their effects were negative for insurance companies with large exposures, but were arguably positive for companies that manufactured and supplied building materials used in the subsequent rebuilds.  I mention Andrew because there was much more and more rapid rebuilding following Andrew than Katrina.  The disparate groups of stocks of (regional) insurance versus construction companies can be considered to exhibit paired antivariance to devastating weather events.

Nicholas Nassim Taleb coined the the term antifragile, because terms such as robust simply don’t convey the exact mental connections.  I am beginning to use the term antivariance because it conveys concepts not well captured by terms like “negatively correlated”, “less correlated”, “semi-independent”, etc.   In many respects antifragile systems should exhibit antivariance characteristics, and vice versa.

The concept of antivariance can be extended to related concepts such as anticovariance and anticorrelation.

Software Development Choices for Portfolio Optimization

The first phase of developing the HALO (Heuristic Algorithm Optimizer) Portfolio Optimizer was testing mathematical and heuristic concepts.  The second phase was teaming up with beta partners in the financial industry to exchange optimization work for feedback on the optimizer features and results.

For the first phase, my primary tool for software development was the Ruby language.  Because Ruby is a “high-level” extensible language I was able to quickly prototype and test many diverse and complex concepts.  This software development process is sometimes referred to as software prototyping.

For the second, beta phase of software development I kept most of the software in Ruby, but began re-implementing selected portions of the code in C/C++. The goal was to keep the high-change-rate code in Ruby, while coding the more stable portions in C/C++ for run-time improvement.  While a good idea in theory, it turned out that my ability to foresee beta-partner changes was mixed at best.  While many changes hit the the Ruby code, and were easily implemented, a significant fraction hit deep into the C/C++ code, requiring significant development and debugging effort.  In some cases, the C/C++ effort was so high, I switched back portions of the code to Ruby for rapid development and ease of debugging.

Now that the limited-beta period is nearly complete, software development has entered a third phase: run-time-performance optimization.  This process involves converting the vast majority of Ruby code to C.  Notice, I specifically say C, not C/C++.   In phase 2, I was surprised at the vast increase in executable code size with C++ (and STL and Boost).  As an experiment I pruned test sections of code down to pure C and saw the binary (and in-memory) machine code size decrease by 10X and more.

By carefully coding in pure C, smaller binaries were produced, allowing more of the key code to reside in the L1 and L2 caches.  Moreover, because C allows very precise control over memory allocation, reallocation, and de-allocation, I was able to more-or-less ensure than key data resided primarily in the L1 and/or L2 caches as well.  When both data and instructions live close to the CPU in cache memory, performance skyrockets.

HALO code is very modular, meaning that it is carefully partitioned into independent functional pieces.  It is very difficult, and not worth the effort, to convert part of a module from Ruby to C — it is more of an all-or-nothing process.  So when I finished converting another entire module to C today, I was eager to see the result.  I was blown away.  The speed-up was 188X.  That’s right, almost 200 times faster.

A purely C implementation has its advantages.  C is extremely close to the hardware without being tied directly to any particular hardware implementation.   This enables C code (with the help of a good compiler) to benefit from specific hardware advantages on any particular platform.  Pure C code, if written carefully, is also very portable — meaning it can be ported to a variety of different OS and hardware platforms with relative ease.

A pure C implementation has disadvantages.  Some include susceptibility to pointer errors, buffer-overflow errors, and memory leaks as a few examples.  Many of these drawbacks can be mitigated by software regression testing, particularly to a “golden” reference spec coded in a different software language.  In the case of HALO Portfolio-Optimization Software, the golden reference spec is the Ruby implementation.  Furthermore unit testing can be combined with regression testing to provide even better software test coverage and “bug” isolation.  The latest 188X speedup was tested against a Ruby unit test regression suite and proven to be identical (within five or more significant digits of precision) to the Ruby implementation.  Since the Ruby and C implementations were coded months apart, in different software languages, it is very unlikely that the same software “bug” was independently implemented in each.  Thus the C helps validate the “golden” Ruby spec, and vice versa.

I have written before about how faster software is greener software.  At the time HALO was primarily a Ruby implementation, and I expected about a 10X speed up for converting from Ruby to C/C++.  Now I am increasingly confident that an overall 100X speedup for an all C implementation is quite achievable.  For the SaaS (software as a service) implementation, I plan to continue to use Ruby (and possibly some PHP and/or Python) for the web-interface code.  However, I am hopeful I can create a pure C implementation of the entire number-crunch software stack.  The current plan is to use the right tool for the right job:  C for pure speed, Ruby for prototyping and as a golden regression reference, and Ruby/PHP/Python/etc for their web-integration capabilities.