Strategy Performance Evaluation 9/12

Petr Tmej
Jul 15, 2022

A widespread mistake in algorithmic trading is paying attention only to the total return of a strategy. When developing, analyzing, and comparing strategies, you need a method for evaluating their performance. The most obvious and easiest measure is total return. You want to make the most money possible, right?

Well, I’d argue that the way you make money is just as important. What I am talking about here is stability.

Imagine two trading strategies. The first made a 100% total return on initial capital in 4 years, but it earned all of that return in a single year. The second made an 80% total return on initial capital in 4 years, earning 20% every single year. Even though the first strategy outperformed the second in total return, the second provides more stability.

So, we would trade the second strategy. Most traders would not be able to wait a few years to generate profits. This example shows that a trading strategy’s total return is not the best measure to evaluate and compare different strategies. Therefore, it’s essential to look at other metrics besides return.

E-mini Russell 2000 Strategy (“EXCELSIOR-RUS2000”)

For illustration, we present the EXCELSIOR-RUS2000 strategy, which trades the E-mini Russell 2000 (RTY). The period from 2015 onward is a valid out-of-sample period for this strategy. The strategy uses unique price patterns, produced by a genetic-algorithm building process, for its entry and exit signals. Its trading session runs from 8:15 a.m. to 3:15 p.m., and the bar interval is 60 minutes.

First, have a look at the equity curve from 2 January 2008 to 24 September 2020 (Figure 1):

Figure 1: Equity curve with number of trades for EXCELSIOR-RUS2000 on E-mini Russell 2000 market

As you can see, the strategy was profitable across the varied market conditions of the last twelve years on the E-mini Russell 2000 market.

Let’s have a look at the performance summary:

Figure 2: Performance summary of EXCELSIOR-RUS2000 strategy for E-mini Russell 2000

In Figure 1 and Figure 2 you can see the historical equity curve and performance summary of a very successful strategy that can be applied to the E-mini Russell 2000 (as well as the Micro contract).

The figure’s headline shows that the curve covers trades carried out between 14 January 2008 and 24 September 2020. The system executed more than 650 trades (653, to be exact). The vertical axis shows that the strategy earned approximately $170,000 trading one futures contract during the period.

You can see that this strategy’s equity curve keeps making new peaks. This means that after periods of losses (drawdowns), the trading system has the power to reach new equity peaks again, i.e., to enter into a series of profitable trades.

After approximately trade no. 600, the equity curve rose very steeply. This is an interesting phenomenon because the strategy made profits during the pandemic, from February 2020.

An economic crisis that causes steep moves, whether to the long side or the short side, is very favorable for this strategy.

You can also see that the last 60 trades were very profitable and that the curve made one new peak after another.

We must not forget that we have a very reasonable number of trades: more than 600, which is a good statistical sample.

Another positive fact is that we backtested 12 years of accurate, high-quality data, with the data from 2015 onward serving as a simulation of real live trading (the out-of-sample period). The data include all market situations: a crisis period, a stagnation period, and a period of strong economic growth. Never underestimate the quality of the data or the importance of a sufficiently long historical sample! This equity curve and performance summary were generated by TradeStation software, which provides very high-quality historical data.

The equity curve provides much useful information. At first glance, you can see the profitability and stability of a given trading strategy. For us, it is one of the most essential and valuable graphical tools for evaluating the performance of a system.

A problem arises when you have little trading capital. Not all traders can afford a drawdown in the tens of thousands of dollars, yet there is a potential solution. Let me present this strategy for the Micro E-mini Russell 2000 (M2Y) (Figure 3):

Figure 3: Performance summary of EXCELSIOR-RUS2000 strategy for Micro E-mini Russell 2000

Let’s explain which performance indicators we should pay attention to.

Net Profit (Total Net Profit)

Net profit is the sum of the results of all winning trades (Gross Profit) and all losing trades (Gross Loss). Its value can be positive or negative.

To illustrate, imagine backtests of two trading systems, both with 5-trade samples with the following results:

$320, -$260, $50, -$400, $720. Total profit: $430.
$200, -$750, -$3,000, $30, $5,200. Total profit: $1,680.

Question for novice traders: “Which of these two trading strategies would you trade with a real account?” Many of you would, seemingly logically, choose the system with the higher net profit of $1,680. With this simple example, we want to show that it is necessary to start thinking in a broader context. Of course, our primary goal will always be to earn the most money possible. Yet trading, just like any other business, is also inevitably about risk. So now let’s look at the two examples from the perspective of the accumulated loss:

In the first case, you would be exposed to an accumulated loss of -$260 + $50 - $400 = -$610.
In the second case, the accumulated loss is -$750 - $3,000 = -$3,750.
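The two figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `net_profit` and `max_drawdown` are illustrative, and the drawdown is measured on the cumulative profit curve starting from zero, which is an assumption about how the accumulated loss is counted.

```python
from itertools import accumulate

def net_profit(trades):
    """Sum of all trade results, winners and losers together."""
    return sum(trades)

def max_drawdown(trades):
    """Largest decline of the cumulative profit curve from a previous peak."""
    peak = 0    # highest cumulative profit so far (assumed to start at zero)
    worst = 0   # largest peak-to-trough decline seen so far
    for equity in accumulate(trades):
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

system_1 = [320, -260, 50, -400, 720]
system_2 = [200, -750, -3000, 30, 5200]

for name, trades in (("System 1", system_1), ("System 2", system_2)):
    print(name, "net profit:", net_profit(trades), "max drawdown:", max_drawdown(trades))
```

The second system earns roughly four times as much, but it also exposes the account to a decline roughly six times as deep.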

Now you surely understand that evaluating a trading strategy based on a single indicator would be very short-sighted and could end in an early bankruptcy. It is always necessary to assess trading strategies using more information and more indicators. Therefore, never, under any circumstances, treat net profit as the decisive and only relevant indicator of your trading strategy’s quality!

Remember that net profit should account for the costs associated with brokerage fees (commissions) and slippage in order execution.

Drawdown

Drawdown is the decline of our equity curve from a historical peak to a subsequent low. It does not necessarily mean a loss on the account; it may be only a retreat from previously accumulated profits. It can be expressed as an amount of money or as a percentage, and the maximum drawdown is the largest cumulative decline in capital in our historical trades or backtests.

Its value, or a multiple of its value, is often used to determine the size of the live trading account. For example, if our maximum historical backtest drawdown was $5,000, we can say that our trading strategy requires an account of at least three times the drawdown, i.e., $15,000.
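The sizing rule from this example can be written down as a short sketch. The function names are hypothetical, and the default multiple of 3 is simply the value used in the example above, not a recommendation.

```python
def minimum_account_size(max_drawdown, multiple=3.0):
    """Account size implied by a 'multiple of maximum drawdown' rule of thumb."""
    if max_drawdown < 0 or multiple <= 0:
        raise ValueError("max_drawdown must be >= 0 and multiple must be > 0")
    return multiple * max_drawdown

def drawdown_as_percent_of_account(max_drawdown, account_size):
    """How large the historical drawdown would be relative to the account."""
    return 100.0 * max_drawdown / account_size

account = minimum_account_size(5_000)   # the $5,000 drawdown from the example
print(account)
print(round(drawdown_as_percent_of_account(5_000, account), 1))
```

With a multiple of 3, a repeat of the historical maximum drawdown would cost about a third of the account, which is one way to judge whether the chosen multiple is comfortable.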

In the EXCELSIOR-RUS2000 strategy, the Maximum Intraday Drawdown was $13,640 for E-mini Russell 2000 (see Figure 2).

Profit Factor

Profit Factor is the ratio of Gross Profit (the sum of all winning trades) to the absolute value of Gross Loss (the sum of all losing trades). Its lowest value is 0, and its highest value is unlimited; a value above 1 means the strategy made money overall. From our experience, strategies with a sufficient number of trades (more than 500) and a Profit Factor higher than 1.5 are rather exceptional.

Some traders use the profit factor as an essential indicator. However, it does not reflect the stability of a trading strategy. So instead, take this metric as an additional one to have a better overview. In Figure 2, you can see that the Profit Factor of the EXCELSIOR-RUS2000 strategy is 1.74, and that is a desirable value.
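As a minimal sketch, Profit Factor can be computed from a list of per-trade results as follows. The function name is illustrative, the sample data are the two five-trade backtests from the Net Profit section, and returning infinity when there are no losing trades is a convention chosen for this sketch.

```python
def profit_factor(trades):
    """Gross Profit divided by the absolute value of Gross Loss."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)   # made positive
    if gross_loss == 0:
        # No losing trades: the ratio is unbounded (0.0 if there were no winners either).
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss

system_1 = [320, -260, 50, -400, 720]
system_2 = [200, -750, -3000, 30, 5200]

print(round(profit_factor(system_1), 2))
print(round(profit_factor(system_2), 2))
```

Here the system with the lower net profit has the higher Profit Factor, which is another reminder that a single metric does not tell the whole story.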

Total Number of Trades

Always remember one important rule: the larger the sample of trades, the better. Those familiar with probability theory and the basics of statistics do not need further explanation. The larger the statistical sample of data we have (backtest data in our case), the more reliable the conclusions we can draw.

A classic example is election polling. We all know the situation well: a polling agency estimates election results using a statistical sample. Such a sample typically contains a few thousand inhabitants, i.e., a tiny fraction of the total population.

The most important thing is that the sample includes all age groups, social classes, ethnicities, both genders, etc., in representative proportions. Based on this statistical sample, the overall results are estimated.

Please note that these surveys always report a statistical deviation (margin of error) of a few percent. There is one firm rule here: the closer the size of the statistical sample (subset) is to the total number (entire set), the smaller the statistical deviation can be.

In the Czech Republic, there are approximately 10 million inhabitants. Now imagine that two statistical samples (subsets) are chosen for an election survey, the first with 10,000 inhabitants and the second with 100,000 inhabitants. It is clear that the sample with 100,000 inhabitants will be more statistically relevant, provided both samples are equally representative.
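To put rough numbers on this comparison, here is a small sketch using the textbook margin-of-error formula for a sample proportion with a finite population correction. The assumptions are not from the article: simple random sampling, a 95% confidence level (z = 1.96), and the worst-case proportion p = 0.5.

```python
import math

def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate margin of error for an estimated proportion."""
    standard_error = math.sqrt(p * (1.0 - p) / sample_size)
    # Finite population correction: shrinks the error as the sample
    # approaches the size of the whole population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

population = 10_000_000
for n in (10_000, 100_000):
    print(n, f"{100 * margin_of_error(n, population):.2f}%")
```

Under these assumptions, the 10,000-person sample has a margin of error of roughly one percentage point, and the 100,000-person sample roughly a third of that.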

Similarly, let’s have a look at a trading strategy. A strategy with 30 trades has an entirely different informative value than a system with 6,000 trades. Of course, it depends on whether you want to trade an intraday strategy that holds positions for a few hours or a swing strategy that holds positions for days.

Sometimes it is necessary to be lenient and accept that a strategy naturally has a lower number of trades. If a strategy has at least 100 trades on out-of-sample data and covers different types of market regimes, we might consider it a relevant statistical sample.

To sum up, it is always necessary to backtest:

the longest period possible for the given market with the largest amount of trades possible,
the largest sample of market regimes possible (e.g., chops, strong trends, sudden price reversals, limit movements, high and low-volatility periods, extreme situations, etc.).

The EXCELSIOR-RUS2000 has 653 trades over a 12-year period, and that is a good statistical sample (Figure 2).

Percentage of profitable trades (Percent Profitable) and Ratio Avg. Win / Avg. Loss

If we take the number of winning trades, divide it by the total number of trades, and multiply the result by 100, we get the percentage of successful trades. The question is to what extent this is a strong indicator for us. There are strategies with only 30% profitable trades that generate much higher profits in the long run than strategies with a 60% success rate. There may even be strategies with 60% profitable trades that lose money overall. Therefore, Percent Profitable must be read hand in hand with the average profit, the average loss, and the ratio between them.

The EXCELSIOR-RUS2000 has a Percent Profitable of 48.39% and a Ratio Avg. Win / Avg. Loss of 1.86 (Figure 2).

Let’s explain this as an example. We have two strategies, each of which we tested on 1,000 trades:

a) The first strategy has an average profit of $1,000 and an average loss of $400.

The Ratio Avg. Win / Avg. Loss is 1,000 / 400 = 2.5.

The Percent Profitable value is 35%.

The total profit/loss is:

Total profit/loss = number of trades * (average profit * 0.35 - average loss * 0.65) = 1,000 * (350 - 260) = $90,000

Thus, despite the low percentage of successful trades, the strategy is profitable in the long run thanks to the high ratio of average profit to average loss.

b) The second strategy has an average profit of $1,000 and an average loss of $1,800.

The Ratio Avg. Win / Avg. Loss is, therefore, 1,000 / 1,800, or approximately 0.56.

The Percent Profitable value is 60%.

The total profit/loss is:

Total profit/loss = 1,000 * (average profit * 0.60 - average loss * 0.40) = 1,000 * (600 - 720) = -$120,000

Although the strategy has 60% profitable trades, it led to a substantial loss after 1,000 trades due to the very low Ratio Avg. Win / Avg. Loss.
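Both calculations follow the same pattern, which can be sketched in Python. The function names are illustrative, and the break-even win rate is an extra derived quantity that the article does not compute; it is the win rate at which the expected profit per trade is exactly zero.

```python
def expected_total_profit(num_trades, avg_win, avg_loss, win_rate):
    """Total profit/loss implied by the average win, average loss and win rate."""
    per_trade = avg_win * win_rate - avg_loss * (1.0 - win_rate)
    return num_trades * per_trade

def break_even_win_rate(avg_win, avg_loss):
    """Win rate at which the expected profit per trade is zero."""
    return avg_loss / (avg_win + avg_loss)

strategy_a = expected_total_profit(1000, avg_win=1000, avg_loss=400, win_rate=0.35)
strategy_b = expected_total_profit(1000, avg_win=1000, avg_loss=1800, win_rate=0.60)

print(round(strategy_a), round(strategy_b))
print(round(100 * break_even_win_rate(1000, 400), 1))
print(round(100 * break_even_win_rate(1000, 1800), 1))
```

Strategy a) needs only about 28.6% winners to break even, so 35% is enough for a profit. Strategy b) needs about 64.3%, so even 60% winners produce a loss.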

The example clearly shows that, as with other performance indicators, Percent Profitable cannot be seen in isolation, but always in the context of the ratio of the average profit to the average loss. Taken together, the two are a significant indicator, because all of trading is about searching for a “real edge” (alpha).

Strategies with a low Percent Profitable value (between 20% and 50%) and a high Ratio Avg. Win / Avg. Loss (1.8 or more) are usually trend-following strategies.

Therefore, EXCELSIOR-RUS2000 can be classified as a short-term trend-following strategy.

Conversely, strategies with a high Percent Profitable value (more than 60%) and a low Ratio Avg. Win / Avg. Loss (usually less than 1) are usually mean-reversion strategies. We will talk about this in detail in the last article.

Average profit/loss per one trade (Average Trade Net Profit)

This metric can give us a lot of useful information in trading. Its calculation is straightforward: it is the arithmetic average of the net profits of all trades, i.e., winning and losing ones. It can be positive or negative. If the indicator’s value is positive, the overall backtesting result was a profit, and vice versa. In the case of EXCELSIOR-RUS2000, the Average Trade Net Profit value is $250.23 (Figure 2). Never forget to deduct average slippage and costs from the result.

There is one truly fundamental rule: the higher the average profit per trade, the better. Yet again, you must also take other metrics into account, in particular the standard deviation of the system’s results. Under no circumstances should you choose a strategy simply because it has the highest average profit per trade!

Advice: Strategy performance evaluation is not only about returns. You always need to consider the volatility of the returns as well. That is why you should always use the Sharpe or Sortino ratio. Remember, trading is about stable returns, and stability is what these ratios measure.
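As a rough illustration of both ratios, the sketch below applies them to the two five-trade samples from the Net Profit section. Several simplifications are assumed here and are not from the article: the inputs are per-trade dollar results rather than periodic percentage returns, the risk-free rate and the Sortino target are both zero, and nothing is annualized. The function names are illustrative.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(results, risk_free=0.0):
    """Mean excess result divided by its sample standard deviation."""
    excess = [r - risk_free for r in results]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined when all results are equal")
    return mean(excess) / spread

def sortino_ratio(results, target=0.0):
    """Mean excess result divided by the downside deviation below the target."""
    excess = [r - target for r in results]
    downside = math.sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    if downside == 0:
        raise ValueError("Sortino ratio is undefined when no result is below target")
    return mean(excess) / downside

system_1 = [320, -260, 50, -400, 720]
system_2 = [200, -750, -3000, 30, 5200]

for name, trades in (("System 1", system_1), ("System 2", system_2)):
    print(name, round(sharpe_ratio(trades), 3), round(sortino_ratio(trades), 3))
```

On both measures the first system scores higher, even though its net profit is lower, because its results are far less volatile. Five trades are of course much too few for a meaningful estimate; the sample only shows the mechanics.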
