by Maz Jadallah
In October 2017, the Wall Street Journal published an article largely critical of Morningstar’s Star-Rating system. The now infamous piece, “The Morningstar Mirage”, was well researched and written by seasoned journalists Kirsten Grind, Tom McGinty and Sarah Krouse. Its main assertions were that star ratings were not predictive of future performance and that Morningstar had not done enough to discourage advisors from using the system as a primary method of manager selection.
As you can imagine, the piece garnered a big response from both Morningstar and the advisor community. Morningstar’s CEO published a response to the article, and for a deeper dive you can read Morningstar’s own research on the performance of its star rating system. That research concludes that star ratings “had moderate predictive ability for risk-adjusted returns in the short term.”
My take on the WSJ piece is that, while thorough, it did not give enough credit to Morningstar for creating the star rating system in 1985, at a time when there was no easy-to-understand way for individual investors and analysts to compare one investment product with another. At the time, the company was well known for innovation and was on the cutting edge of functionality such as data visualization and presentation.
I also think the authors focused too much on the tool and not enough on the craftsman (the advisor). Morningstar has been clear from the beginning that the rating system is intended as a starting point for advisors. Unfortunately, star ratings also make it easier for advisors to sell funds to clients, which opens them up to misuse.
A Better Mousetrap
Now that we’ve got that out of the way, this article isn’t about the pros and cons of Morningstar’s star ratings. Instead, in the spirit of continuing to innovate in the area of manager skill assessment and selection, this article seeks to propose a holdings-based rating system as a long overdue evolution. Our core assertion is that holdings-based analysis (like our Clone Scores) is a better way to predict a manager’s future performance than rating a manager simply based on their reported returns. Why?
A manager’s reported performance can arise from many sources: stock selection, leverage, market timing, trading skill and even their fee structure. So, if you’re using reported performance to assess a manager’s stock selection skill, that’s a lot of noise to filter through to get to what really matters. On the other hand, a manager’s holdings as reported in mandated regulatory disclosures (Form 13F) know nothing about the manager’s use of leverage, market timing, trading skill or fee structure. They simply reveal the manager’s stock selections at the end of each quarter. So, assessing a manager using just their reported holdings is a great way to isolate the manager’s stock selection skill. As it turns out, it’s also a lot more predictive of the manager’s future performance. Read on.
To support our claim, we mimic Morningstar’s own research methodology as described in the paper above. Morningstar conducted two types of studies over the period January 2003[i] through December 2015: a regression analysis using Fama-French factors and an event study that “provides a picture for what the typical investor could expect to experience over a variety of holding periods.” We perform the same analysis using Clone Score and compare the results.
Let’s begin with the event study. The event study “is constructed by sorting funds into groups according to their star ratings each month, equally weighting them, and then tracking each group’s subsequent performance during several time periods: one month, three months, six months, 12 months, 36 months, and 60 months.”
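To make the mechanics concrete, here is a minimal sketch of an event study of this kind. It is not Morningstar’s code. The function name, the input layout and the toy numbers are all assumptions chosen for illustration, and funds without a full holding window are simply skipped, which a real study would need to handle more carefully.

```python
from statistics import mean

def event_study(ratings, returns, horizon):
    """Average forward cumulative return for each rating group.

    ratings: {sort_month: {fund: group}}    (hypothetical input layout)
    returns: {fund: [monthly returns as decimals, indexed by month]}
    horizon: holding period in months after the sort month
    """
    per_group = {}
    for month, fund_groups in ratings.items():
        forward = {}
        for fund, group in fund_groups.items():
            window = returns[fund][month + 1 : month + 1 + horizon]
            if len(window) < horizon:
                continue  # simplification: skip funds without a full window
            growth = 1.0
            for r in window:
                growth *= 1.0 + r  # compound the monthly returns
            forward.setdefault(group, []).append(growth - 1.0)
        # Equal-weight the funds inside each group for this sort month.
        for group, values in forward.items():
            per_group.setdefault(group, []).append(mean(values))
    # Average each group's result across all sort months.
    return {group: mean(values) for group, values in per_group.items()}

# Toy data (made up): three funds sorted once, at month 0, held for two months.
toy_ratings = {0: {"A": 5, "B": 5, "C": 1}}
toy_returns = {
    "A": [0.0, 0.10, 0.10],
    "B": [0.0, 0.00, 0.00],
    "C": [0.0, -0.10, 0.10],
}
result = event_study(toy_ratings, toy_returns, horizon=2)
```

In the toy example the 5-star group averages a 10.5% two-month return and the 1-star group loses 1%. Running the same function with different `horizon` values gives the one-month through 60-month views described above.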
Morningstar found that for equity funds, “the average three-year forward cumulative return was 27.74% for 5-star funds and 25.78% for 1-star funds – an annualized outperformance of 0.64 percentage points.”
Following the same process over the same time period, we constructed two distinct portfolios made up of the largest twenty holdings (equal weighted) from the top/bottom 10% of managers in our universe with the highest/lowest Clone Scores. Rather than perform monthly ranking/selection like the Morningstar study, we hewed closer to the investment process used for our AlphaClone Hedge Fund Manager Index, selecting managers semi-annually and rebalancing their holdings quarterly. Figure 2 summarizes our results compared to those from Morningstar.
We found that the average three-year forward cumulative return for managers in the top 10% was 43.05% and for managers in the bottom 10% it was 31.80% – an annualized outperformance of 3.75 percentage points. It is also worth noting the consistency in outperformance of the top 10% over the various time periods.
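For readers who want to check the arithmetic, the sketch below shows two common ways to turn a gap in three-year cumulative returns into an annual figure. The article does not state which convention was used, so both function names and the choice of conventions are assumptions. Only the two cumulative returns come from the text.

```python
def simple_annual_gap(top_cum, bottom_cum, years):
    """Spread the cumulative gap evenly across the years (no compounding)."""
    return (top_cum - bottom_cum) / years

def compound_annual_gap(top_cum, bottom_cum, years):
    """Annualize each cumulative return geometrically, then take the gap."""
    top = (1.0 + top_cum) ** (1.0 / years) - 1.0
    bottom = (1.0 + bottom_cum) ** (1.0 / years) - 1.0
    return top - bottom

# Three-year cumulative returns quoted above for the top and bottom 10%.
simple = simple_annual_gap(0.4305, 0.3180, 3)
compound = compound_annual_gap(0.4305, 0.3180, 3)
print(f"simple: {simple:.2%}  compound: {compound:.2%}")
```

The simple convention reproduces the 3.75 percentage points quoted above. The compound convention gives a somewhat smaller gap of about 3.0 percentage points.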
Our study tried to insulate itself as much as possible from survivorship bias. Our securities dataset includes dead securities, so the effects of their selection are incorporated in our results. Our manager dataset went live in December 2008, which means that before that date our results include only managers that were still active when the dataset went live. Thus, some amount of bias could be present in our results for that earlier period. After December 2008, our results do incorporate holdings from managers that are no longer active.
Perhaps the most useful way to compare the two event studies is simply to look at the relative size of the results. As constructed, the annualized three-year outperformance in our event study (3.75 percentage points) was more than 5x that in Morningstar’s (0.64 percentage points), which supports the efficacy of our approach. Again, the objective here is to highlight the power of holdings-based analysis in predicting a manager’s future returns, not to disparage Morningstar’s rating system.
Morningstar also ran a “more rigorous” regression analysis that “aims to discover whether higher-rated funds have superior one-month forward returns in the cross section.” The analysis controlled for expenses and, for equity funds, the following Fama-French-style risk factors: market, size, value and momentum.
Morningstar found that, “In the fixed-income and allocation asset classes, the t-test results strongly support the hypothesis that the star ratings had significant forecasting capability after controlling for other factors, such as market beta, size, style, momentum, and credit. Among equity funds (see Figure 3), we observe the correct directionality among the coefficients (for example, higher-rated funds have higher returns), but these results were not as significant or conclusive.”
While it was important to conduct both event studies over the same time period in order to make the results comparable, the same is not true for the regression study. As long as the regression factors are the same, one should be able to meaningfully compare the results of regression studies performed over different time periods. Since we are not constrained, like Morningstar, to begin our analysis in January 2003, we elected to run our analysis from August 2000 through December 2019, a period of just over nineteen years. We believe using a longer time period (more data) should give readers more confidence in our regression’s results.
Our analysis regresses the monthly returns of two separate hypothetical strategies that each invest in the five largest holdings (equal weighted) from each of 10 managers (~3% of universe) with the highest and lowest Clone Scores respectively. As in the event study, we select managers semi-annually and rebalance holdings quarterly. Figure 4 compares the regression results for the Top 10 model vs Bottom 10.
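The selection step can be sketched in a few lines. This is an illustration only. The function name, the input layout and the toy data are assumptions, the scores are taken as given (how a Clone Score is computed is not covered here), and summing weights when two managers hold the same stock is one possible convention rather than something the article specifies.

```python
def build_portfolio(scores, holdings, n_managers, n_holdings, best=True):
    """Equal-weighted portfolio from the largest holdings of selected managers.

    scores:   {manager: score}                    (hypothetical input layout)
    holdings: {manager: {ticker: position value}}
    best:     True picks the highest scores, False the lowest
    """
    chosen = sorted(scores, key=scores.get, reverse=best)[:n_managers]
    picks = []
    for manager in chosen:
        book = holdings[manager]
        picks.extend(sorted(book, key=book.get, reverse=True)[:n_holdings])
    slot = 1.0 / len(picks)
    weights = {}
    for ticker in picks:
        # Assumption: a stock picked via two managers gets two slots.
        weights[ticker] = weights.get(ticker, 0.0) + slot
    return weights

# Toy data (made up): three managers, three positions each.
toy_scores = {"M1": 0.9, "M2": 0.5, "M3": 0.1}
toy_holdings = {
    "M1": {"AAA": 300, "BBB": 200, "CCC": 100},
    "M2": {"AAA": 50, "DDD": 500, "EEE": 10},
    "M3": {"FFF": 10, "GGG": 5, "HHH": 1},
}
top = build_portfolio(toy_scores, toy_holdings, n_managers=2, n_holdings=2)
bottom = build_portfolio(toy_scores, toy_holdings, 1, 2, best=False)
```

With `n_managers=10` and `n_holdings=5` the same function would produce the Top 10 and Bottom 10 strategies described above. Re-running it on each quarter’s holdings, with managers re-selected every six months, gives the rebalancing schedule.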
The results from the Top 10 regression generated a monthly alpha of 0.35% or 4.24% annualized. It is interesting to note that the level of annualized alpha in the regression study is similar to that found in the event study. With a t-statistic above 2, the results were statistically significant at the 95% confidence level. Conversely, the results from the Bottom 10 study indicate no monthly alpha and strongly suggest that the model’s returns are fully explained by the factors in the regression.
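For readers unfamiliar with how alpha and its t-statistic come out of a factor regression, here is a minimal sketch using ordinary least squares. It runs on synthetic data generated inside the script, not on our strategy returns. The helper names, the factor loadings and the noise levels are all assumptions, and returns are assumed to be in excess of the risk-free rate.

```python
import math
import random

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def ols(y, columns):
    """OLS with an intercept. Returns (coefficients, t_stats); index 0 is alpha."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    t_stats = [beta[a] / math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta, t_stats

# Synthetic monthly data (made up): market, size, value and momentum factors.
random.seed(0)
months = 240
factors = [[random.gauss(0.0, 0.04) for _ in range(months)] for _ in range(4)]
loadings = [1.0, 0.2, -0.1, 0.1]  # hypothetical factor exposures
true_alpha = 0.0035               # hypothetical 0.35% per month
excess = [
    true_alpha
    + sum(b * f[i] for b, f in zip(loadings, factors))
    + random.gauss(0.0, 0.01)     # idiosyncratic noise
    for i in range(months)
]
beta, t_stats = ols(excess, factors)
print(f"monthly alpha: {beta[0]:.4%}  t-stat: {t_stats[0]:.2f}")
```

The intercept is the part of the monthly return that the four factors do not explain, which is what the article calls alpha. A t-statistic above roughly 2 is the usual threshold for significance at the 95% level.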
Taken together, the results strongly support the efficacy of a holdings-based approach to equity manager assessment. While there is no way a holdings-based assessment can or should replace star ratings given their applicability across asset classes, star ratings are weakest where a holdings-based assessment has been shown to be strongest: in the equity asset class. Is it possible then to incorporate a holdings-based assessment as part of Morningstar’s star rating system? The answer is almost certainly “yes,” and we would argue that it should be. After all, in the end, our objective is the same as Morningstar’s: to help investors make better investment decisions.
[i] According to Morningstar, in 2002, significant changes were made to the Morningstar Rating methodology, so it is not relevant to assess the data prior to this date.
© AlphaClone 2020. All Rights Reserved. The information, data, analyses and opinions presented herein do not constitute investment advice; are provided solely for informational purposes and therefore are not an offer to buy or sell a security; and are not warranted to be correct, complete or accurate. Performance results are hypothetical and, in some cases, based on back-tested performance. The opinions expressed are as of the date written and are subject to change without notice.