It is hard to know all the reasons behind Buch’s comment above, but I suspect a large part of it has to do with the pitfalls of backtests.
For any systematic/algo-based/quant strategy, backtesting is necessary, but it is easy to misuse, and too often is. Additionally, because concrete numbers are involved, a poorly done backtest can give a false sense of reliability and accuracy.
“One of the first things taught in introductory statistics textbooks is that correlation is not causation. It is also one of the first things forgotten.” – Thomas Sowell
A spurious correlation is a relationship between two variables that appears causal but is not. It is often the result of simple coincidence, or of an undiscovered third factor at play. Spurious correlations are the kryptonite of algo strategies.
Look at the graph below – no matter how low your opinion of Nicolas Cage may be, you will agree that people are not drowning in pools simply because he appeared in more movies that year.
Spurious correlations may be funny (you can check out https://www.tylervigen.com/spurious-correlations), but backtests are no laughing matter. They are crucial to understanding how a strategy performs in different conditions.
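Spurious correlation is easy to demonstrate numerically. The sketch below (a toy simulation, not from the article) generates pairs of completely independent random walks, the way prices drift over time, and measures how often they look strongly correlated purely by chance:

```python
import numpy as np

rng = np.random.default_rng(42)
n, trials = 500, 200

corrs = []
for _ in range(trials):
    # Two independent random walks -- by construction, no causal link.
    a = np.cumsum(rng.standard_normal(n))
    b = np.cumsum(rng.standard_normal(n))
    corrs.append(np.corrcoef(a, b)[0, 1])

corrs = np.abs(corrs)
print(f"Share of pairs with |correlation| > 0.5: {np.mean(corrs > 0.5):.0%}")
print(f"Strongest spurious correlation found:  {corrs.max():.2f}")
```

Trending series routinely produce large correlations with no causal connection at all, which is exactly why a high correlation in a backtest proves nothing by itself.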
So, the next time you get excited by backtesting results, look for the following to make sure they are replicable in real life.
Garbage in, garbage out
First and foremost, any model is only as good as the data you feed it. If your data is not accurate, your model certainly will not be, so utmost care must be taken that the data being considered is clean and accurate.
This is best understood with an example. While backtesting a strategy, one might simply assume exits at the closing price, or, for a day trading strategy, at whatever price prevailed during the day. In reality, however, there are many costs beyond the price itself: not only explicit charges such as brokerage, STT, etc., but also hidden slippages such as impact cost.
If one participates in the market, one influences it – especially when dealing in illiquid small-cap stocks. To deal with this, the direct charges should definitely be incorporated into the backtests, and the impact costs need to be modeled as well.
Even if you do all of this, beware that it will always be an estimate; in practice, your realized prices can be very different. So, any strategy's expected returns should have an in-built margin of safety.
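As a minimal sketch of what cost adjustment looks like, the function below deducts explicit charges and an estimated impact cost from a sell fill. The rates used (brokerage, STT, impact in basis points) are illustrative assumptions, not actual broker or exchange tariffs; plug in your real schedule:

```python
def net_sell_price(raw_price: float,
                   brokerage_rate: float = 0.0003,  # assumed 0.03% brokerage
                   stt_rate: float = 0.001,         # assumed 0.1% STT on sell
                   impact_bps: float = 25.0) -> float:
    """Price actually realised on a sell, after explicit charges and an
    estimated impact cost (slippage) expressed in basis points."""
    slippage = raw_price * impact_bps / 10_000
    charges = raw_price * (brokerage_rate + stt_rate)
    return raw_price - slippage - charges

# A backtest that assumes a clean exit at a close of 100.00
# actually realises noticeably less once costs are modeled.
print(net_sell_price(100.0))  # -> 99.62
```

The impact-cost parameter is the hardest to estimate honestly; for illiquid small caps it should scale with order size, which is exactly where the margin of safety comes in.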
Survivorship bias
An obvious mistake that many people make is to consider, when picking companies to invest in during the backtest, only those that are trading today. This ignores the many companies that have gone bust since the backtest period.
For example, Kingfisher Airlines is no longer listed today because it went bust. However, if you are backtesting your algo over the 2007 to 2018 period (before it was delisted), then Kingfisher Airlines should be included in the universe of stocks the algo can pick from.
By excluding bankrupt companies that are no longer around, one could be giving the algo an unfair advantage in the backtests. To deal with this, one must try and include all the companies that were trading during the backtest period.
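One way to enforce this is to build the tradable universe from a point-in-time securities master that retains delisted names. The sketch below uses hypothetical listing records and dates (including an illustrative record for Kingfisher Airlines; the exact dates are assumptions, not sourced):

```python
from datetime import date

# Hypothetical listing records: symbol -> (listed_on, delisted_on or None).
# In practice this comes from a data vendor that keeps delisted names.
LISTINGS = {
    "KINGFISHER":  (date(2006, 6, 1), date(2018, 6, 1)),  # went bust, delisted
    "SURVIVOR_CO": (date(2000, 1, 1), None),              # still trading
    "LATE_IPO":    (date(2021, 1, 1), None),              # listed after period
}

def universe_on(as_of: date) -> list[str]:
    """All symbols actually tradable on `as_of`, including names that were
    later delisted -- this is what avoids survivorship bias."""
    out = []
    for sym, (listed, delisted) in LISTINGS.items():
        if listed <= as_of and (delisted is None or as_of < delisted):
            out.append(sym)
    return sorted(out)

print(universe_on(date(2010, 1, 1)))  # Kingfisher must appear here
```

Rebuilding the universe at every rebalance date, rather than once at the start, is the key design choice: the algo then sees exactly the opportunity set (including future failures) that a live trader would have seen.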
Overfitting
Overfitting can result from your model doing well over a certain period because of specific conditions that are not generally true. Let’s say you run a momentum-based strategy and only test it over a period that was very bullish.
Naturally, your strategy will do well, but over the long term there will be downtrends as well. Because the period was not chosen correctly, you will mistakenly conclude the system is better than it actually is. Moreover, one can keep adding parameters until the model performs well over any chosen period, but such a model is liable to break down in real performance.
Hence, one should use as few parameters as possible and backtest over as long a period as possible.
Additionally, one should train and test over several different periods, and check the robustness of the system across different scenarios.
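Testing over several different periods is often done with a walk-forward split: fit on one window, evaluate on the next, then roll forward. A minimal sketch of the window generation (the window sizes are illustrative, and the fit/evaluate step is left to your own strategy code):

```python
def walk_forward_windows(n_periods: int, train: int, test: int):
    """Yield (train_range, test_range) index pairs that roll through history,
    so every test window is strictly after the data it was trained on."""
    start = 0
    while start + train + test <= n_periods:
        yield (range(start, start + train),
               range(start + train, start + train + test))
        start += test  # roll forward by one test window

# e.g. 10 periods of history, train on 4, test on the next 2
windows = list(walk_forward_windows(n_periods=10, train=4, test=2))
for tr, te in windows:
    print(list(tr), "->", list(te))
```

A strategy that only looks good on one particular window, but falls apart on the others, is a strong hint that it has been fitted to specific conditions rather than a real edge.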
Look-ahead bias
When creating the model and then testing its performance on the past, you have to make sure you are not using information from the test period itself.
For example, let’s say today (in Oct 2022) you want to test the performance of your model from 1-Jan-2020 to 31-Dec-2020.
Then, when generating the portfolio as of 1-Jan-2020, you cannot use any information from after that date, even though it is technically available to you sitting in Oct 2022.
Sometimes, it may not even be intentional: while in the throes of a complicated model, one might overlook this and “peek” at forward information.
This will obviously result in a model with very high backtest results that simply do not work in the real world unless you are a soothsayer or God!
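A simple structural guard against accidental peeking is to force every model input through a filter keyed on the portfolio formation date. The data and helper below are hypothetical stand-ins for your own price store:

```python
from datetime import date

# Hypothetical price observations keyed by the date they became known.
PRICES = {
    date(2019, 12, 31): 100.0,
    date(2020, 1, 1):   101.0,
    date(2020, 6, 30):  120.0,  # future info relative to 1-Jan-2020
}

def visible_prices(as_of: date) -> dict:
    """Only observations dated strictly before `as_of` may feed the model.
    Routing ALL data access through a guard like this makes it hard to
    'peek' at forward information by accident."""
    return {d: p for d, p in PRICES.items() if d < as_of}

snapshot = visible_prices(date(2020, 1, 1))
print(sorted(snapshot))  # the 30-Jun-2020 observation must not appear
```

Whether same-day data counts as "available" depends on your strategy (close-to-close vs. intraday); the strict `<` used here is the conservative choice.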
This is in no way an exhaustive list (underfitting, out-of-sample tests, etc. are also powerful tools), but it is a good place to start when evaluating backtests. It will certainly help you eliminate a large percentage of bad apples.
To conclude, I want to leave you with a warning that even if a backtest passes all these tests and more, the fact is that markets evolve. Even if your model was robust in the past, it might not be in the future. As with any skill, you have to keep updating it to make sure it passes the test of time.
“Most people use statistics like a drunk man uses a lamppost; more for support than illumination”
(Disclaimer: Recommendations, suggestions, views and opinions given by the experts are their own. These do not represent the views of Economic Times)