Discussion of article "R-squared as an estimation of quality of the strategy balance curve"

 

New article R-squared as an estimation of quality of the strategy balance curve has been published:

This article describes the construction of the custom optimization criterion R-squared. This criterion can be used to estimate the quality of a strategy's balance curve and to select the most smoothly growing and stable strategies. The work discusses the principles of its construction and statistical methods used in estimation of properties and quality of this metric.

Linear regression is a linear dependence of one variable y on another independent variable x, expressed by the formula y = ax+b. In this formula, a is the multiplier (slope) and b is the bias coefficient (intercept). In reality, there may be several independent variables, and such a model is called a multiple linear regression model. However, we will consider only the simplest case.
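
As a rough illustration of the formula y = ax+b, the coefficients can be found by ordinary least squares. The sketch below is not the article's code (the article relies on ALGLIB); FitLine is a hypothetical helper name, x is assumed to be the ordinal index of each value, and the sample values are made up.

```mql5
// Least-squares fit of y = a*x + b, where x is the ordinal index 0..n-1.
// FitLine is a hypothetical helper; returns false if fewer than two points are given.
bool FitLine(const double &y[], double &a, double &b)
{
   const int n = ArraySize(y);
   if(n < 2)
      return false;
   double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;
   for(int i = 0; i < n; i++)
   {
      sumX  += i;
      sumY  += y[i];
      sumXY += i * y[i];
      sumXX += (double)i * i;
   }
   const double denom = n * sumXX - sumX * sumX;  // non-zero for n >= 2
   a = (n * sumXY - sumX * sumY) / denom;         // slope
   b = (sumY - a * sumX) / n;                     // intercept
   return true;
}

void OnStart()
{
   double prices[] = {1.10, 1.12, 1.11, 1.14, 1.15};  // made-up values, not real EURUSD quotes
   double a = 0.0, b = 0.0;
   if(FitLine(prices, a, b))
      PrintFormat("a = %.5f, b = %.5f", a, b);
}
```

For these five made-up points the fit works out to roughly a = 0.012 and b = 1.1, i.e. a gently rising line.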

Linear dependence can be visualized in the form of a simple graph. Take the daily EURUSD chart from 2017.06.21 to 2017.09.21. This segment is not selected by chance: during this period, a moderate ascending trend was observed on this currency pair. This is how it looks in MetaTrader:

Fig. 11. Dynamics of the EURUSD price from 21.06.2017 to 21.08.2017, daily timeframe

Author: Vasiliy Sokolov

 

That is why the number of trades must be sufficiently large. But what is meant by sufficient? It is commonly accepted that any sample should contain at least 37 measurements. This is a magic number in statistics: it is the lower bound of a parameter's representativeness. Of course, this number of trades is not enough to evaluate a trading system. For a reliable result, it is desirable to make at least 100-150 trades. Moreover, for many professional traders even this is not enough. They design systems that make at least 500-1000 trades, and only then, based on these results, consider running the system in real trading...

Oh! I always thought it was 100. Thank you, interesting article.

Thus, it is safe to say that the R-squared coefficient of determination is an important addition to the existing set of MetaTrader 5 testing metrics. It allows you to evaluate the smoothness of a strategy's balance line curve, which is a non-trivial metric in itself. R-square is easy to use: its range of values is fixed and ranges from -1.0 to +1.0, signalling a negative trend of the strategy balance (values close to -1.0), no trend (values close to 0.0) and a positive trend (values tending to +1.0). Due to all these properties, reliability and simplicity, R-square can be recommended for use in building a profitable trading system.

Wow! I always thought that Ryx - Coefficient of Determination - is used to evaluate the quality of linear regression. The coefficient of determination for a model with a constant takes values from 0 to 1.

It is also common to perform significance tests on the regression coefficient. Even Alglib has them :-)

PearsonCorrelationSignificance(), SpearmanRankCorrelationSignificance().

 
we will use the ready-made Expert Advisor CImpulse 2.0, the work of which is described in the article "Universal Trading Expert Advisor: Working with Pending Orders". It was chosen for its simplicity and for the fact that, unlike Expert Advisors from the standard MetaTrader 5 delivery, it can be optimised, which is extremely important for the purposes of our article

What was meant?

There is a special statistical metric in the MetaTrader terminal report. It is called LR Correlation and it shows the correlation between the balance line and the linear regression found for this line.

The way of calculating the determination coefficient R^2 is similar to the way of calculating LR Correlation. But the final number is additionally squared.

The one number we are most interested in here is R-squared. This metric shows a value of 0.5903. Hence, linear regression explains 59.03% of all values and the remaining 41% remains unexplained.

From the above quotations, it follows that R^2 = LR^2. Thus the criterion for finding the linear function called "Linear Regression" is least squares (minimising the sum of squared deviations) or, which is the same thing, maximising the absolute value of Pearson's correlation coefficient, which is MathAbs(LR). And maximising MathAbs(LR) is the same as maximising R^2, since MathAbs(LR) = MathSqrt(R^2).


In total, we have that the linear regression is found by the maximisation criterion MathAbs(R)^n, where n is any positive number.

Well then what is the point of talking about 59.03% of all values explained by LR, when you can, for example, at n = 1 get 76.8%, and at n = 4 - 34.8%?
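
The figures in this argument can be reproduced with a small script. The sketch below is not taken from the article: CorrWithIndex is a hypothetical helper and the balance values are made up. It computes the Pearson correlation r between a series and its ordinal index (for a least-squares line, the correlation of the series with its fitted line equals MathAbs(r)) and prints MathAbs(r)^n for n = 1, 2 and 4. With the article's R^2 = 0.5903 the same powers give about 0.768, 0.590 and 0.348.

```mql5
// Pearson correlation between a series and its ordinal index 0..n-1.
// CorrWithIndex is a hypothetical helper, not part of the article or of ALGLIB.
double CorrWithIndex(const double &y[])
{
   const int n = ArraySize(y);
   if(n < 2)
      return 0.0;
   const double meanX = (n - 1) / 2.0;
   double meanY = 0.0;
   for(int i = 0; i < n; i++)
      meanY += y[i];
   meanY /= n;
   double sxy = 0.0, sxx = 0.0, syy = 0.0;
   for(int i = 0; i < n; i++)
   {
      const double dx = i - meanX;
      const double dy = y[i] - meanY;
      sxy += dx * dy;
      sxx += dx * dx;
      syy += dy * dy;
   }
   if(syy == 0.0)
      return 0.0;   // flat series: correlation is undefined, report 0
   return sxy / MathSqrt(sxx * syy);
}

void OnStart()
{
   double balance[] = {100.0, 103.0, 101.0, 106.0, 104.0, 109.0};  // made-up values
   const double r = MathAbs(CorrWithIndex(balance));
   PrintFormat("|r|^1 = %.4f, |r|^2 = %.4f, |r|^4 = %.4f",
               r, MathPow(r, 2.0), MathPow(r, 4.0));
}
```

Because x^n is monotonic for x in [0, 1] and any positive n, sorting runs by any of these powers gives the same order; only the printed percentage changes.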


Wrong statement

R^2 is nothing but the correlation between a graph and its linear model

   //-- Find R^2 and its sign
   double r2 = MathPow(corr, 2.0);
 
The way of calculating the determination coefficient R^2 is similar to the way of calculating LR Correlation. But the final number is additionally squared.

Plots of the LR Correlation and R^2 distributions for the 10,000 independent examples that are presented in the article show that R^2 != LR^2.

The amazing thing is that by a simple mathematical action (second degree) we have completely removed the undesirable marginal effects of the distribution.

I don't understand why the second degree of the original "concave" distribution makes it "flat"?
 
Now let's find the best run by the R-squared parameter. To do this, save the optimisation runs to an XML file. If Microsoft Excel is installed on your computer, the file will open in it automatically. We will work with sorting and filters, so let's select the table header and click the button of the same name (Home -> Sort and Filter -> Filter), after which you will be able to flexibly display the columns. Let's sort the runs by the custom optimisation criterion

Why use Excel for this, when everything is sorted in the Tester itself?

 
Disadvantages: Applicable only for evaluating linear processes, or systems trading with a fixed lot.
Solution: Do not apply to trading systems that use a capitalisation system (money management).

Equity for R^2 calculation should be counted not as AccountEquity ( == AccountBalance + Sum(Profit[i])), but as Sum(Profit[i] / Lots[i]) (for a single-symbol trading system).

 
Of all the MQL sources in the article, only one is useful
//+------------------------------------------------------------------+
//| Returns the R^2 score calculated based on the equity of the strategy |
//| Equity values are passed as an equity array |
//+------------------------------------------------------------------+
double CustomR2Equity(double& equity[], ENUM_CORR_TYPE corr_type = CORR_PEARSON)
{
   int total = ArraySize(equity);
   if(total == 0)
      return 0.0;
   //-- Fill the matrix Y - equity value, X - ordinal number of the value
   CMatrixDouble xy(total, 2);
   for(int i = 0; i < total; i++)
   {
      xy[i].Set(0, i);
      xy[i].Set(1, equity[i]);
   }
   //-- Find the coefficients a and b of the linear model y = a*x + b;
   int retcode = 0;
   double a, b;
   CLinReg::LRLine(xy, total, retcode, a, b);
   //-- Generate linear regression values for each X;
   double estimate[];
   ArrayResize(estimate, total);
   for(int x = 0; x < total; x++)
      estimate[x] = x*a+b;
   //-- Find the correlation coefficient of values with their linear regression
   double corr = 0.0;
   if(corr_type == CORR_PEARSON)
      corr = CAlglib::PearsonCorr2(equity, estimate);
   else
      corr = CAlglib::SpearmanCorr2(equity, estimate);
   //-- Find R^2 and its sign
   double r2 = MathPow(corr, 2.0);
   int sign = 1;
   if(equity[0] > equity[total-1])
      sign = -1;
   r2 *= sign;
   //-- Return the normalised estimate of R^2, to hundredths precision
   return NormalizeDouble(r2,2);
}

It is universal - it is suitable for any double arrays (not only Equity).

When you look at all the other MQL-codes, you don't understand why they are given, because they are not readable at all without knowledge of CStrategy.


Thanks to the author for the article, it made me think.


P.S. The lines highlighted in yellow in the source code are controversial.

 

Very interesting, thank you. I never thought about the fact that tester metrics are distorted by capitalisation, so optimisation will be less efficient than with fixed volume. And R^2 is of course very useful, I wonder if it will speed up the optimisation process compared to profit factor + max balance, for example.

 
fxsaber:
Of all the MQL sources in the article, only one is useful

I agree with that, all the rest will have to be ripped out of classes to add to your system... it would be better to have everything in separate functions or a separate include file.

 
fxsaber:

Equity for R^2 calculation should be calculated not as AccountEquity ( == AccountBalance + Sum(Profit[i])), but as Sum(Profit[i] / Lots[i]) (for a single-symbol trading system).

Code for calculating "equity" suitable for R^2. It is written in MT4 style, it is not difficult to translate it to MT5...

// Equity calculation without MM (variant for a single-symbol trading system)
class EQUITY
{
protected:
  int PrevHistoryTotal;
  double Balance;
  double PrevEquity;
  
  // Add an element to the end of an arbitrary array
  template <typename T>
  static void AddArrayElement( T &Array[], const T Value, const int Reserve = 0 )
  {
    const int Size = ::ArraySize(Array);
  
    ::ArrayResize(Array, Size + 1, Reserve);
  
    Array[Size] = Value;
  }

  static double GetOrderProfit( void )
  {
    return((OrderProfit()/* + OrderCommission() + OrderSwap()*/) / OrderLots()); // commission and swap are sometimes useful to ignore
  }
  
  static double GetProfit( void )
  {
    double Res = 0;
    
    for (int i = OrdersTotal() - 1; i >= 0; i--)
      if (OrderSelect(i, SELECT_BY_POS) && (OrderType() <= OP_SELL))
        Res += EQUITY::GetOrderProfit();
    
    return(Res);
  }
  
  double GetBalance( void )
  {
    const int HistoryTotal = OrdersHistoryTotal();
    
    if (HistoryTotal != this.PrevHistoryTotal)
    {
      for (int i = HistoryTotal - 1; i >= PrevHistoryTotal; i--)
        if (OrderSelect(i, SELECT_BY_POS, MODE_HISTORY) && (OrderType() <= OP_SELL) && OrderLots()) // OrderLots - CloseBy
          this.Balance += EQUITY::GetOrderProfit();
      
      this.PrevHistoryTotal = HistoryTotal;
    }
    
    return(this.Balance);
  }
  
public:
  double Data[];

  EQUITY( void ) : PrevHistoryTotal(0), Balance(0), PrevEquity(0)
  {
  }
  
  virtual void OnTimer( void )
  {
    const double NewEquity = this.GetBalance() + EQUITY::GetProfit();
    
    if (NewEquity != this.PrevEquity)    
    {
      EQUITY::AddArrayElement(this.Data, NewEquity, 1e4);
      
      this.PrevEquity = NewEquity;
    }
  }
};


Usage

EQUITY Equity;

void OnTimer()
{
  Equity.OnTimer();
}

double OnTester()
{
  return(CustomR2Equity(Equity.Data));
}
 
fxsaber:

Code for calculating "equity" suitable for R^2. Written in MT4 style, it is not difficult to translate it to MT5....


Usage


Cool, you can just call it at each new bar, so that the system is not loaded with a timer. For systems with new bar control.
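
The new-bar variant suggested here might look like the sketch below. It assumes the EQUITY class and the global Equity object from fxsaber's post are already declared in the same file; IsNewBar is a hypothetical helper name that does not appear in the thread.

```mql5
// Hypothetical helper: returns true once per bar on the current chart's symbol and timeframe.
bool IsNewBar()
{
   static datetime lastBarTime = 0;
   const datetime barTime = iTime(_Symbol, PERIOD_CURRENT, 0);  // open time of the current bar
   if(barTime == lastBarTime)
      return false;
   lastBarTime = barTime;
   return true;
}

// Assumes `EQUITY Equity;` from the post above is declared earlier in this file.
void OnTick()
{
   if(IsNewBar())
      Equity.OnTimer();  // sample the equity once per bar instead of on a timer
}
```

Sampling once per bar produces a coarser equity series than a timer would, so intrabar drawdowns are not reflected in the resulting R^2.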