Discussion of article "R-squared as an estimation of quality of the strategy balance curve"
This is exactly why the number of trades must be sufficiently large. But what does "sufficient" mean? It is commonly held that any sample should contain at least 37 measurements. This is a magic number in statistics; it is the lower bound of a parameter's representativeness. Of course, this number of trades is not enough to evaluate a trading system. For a reliable result, it is desirable to make at least 100-150 trades. Moreover, for many professional traders even this is not enough. They design systems that make at least 500-1000 trades, and only then, based on these results, consider launching the system in live trading...
Oh! I always thought it was 100. Thank you, interesting article.
Thus, it is safe to say that the R-squared coefficient of determination is an important addition to the existing set of MetaTrader 5 testing metrics. It allows you to evaluate the smoothness of a strategy's balance line curve, which is a non-trivial metric in itself. R-square is easy to use: its range of values is fixed and ranges from -1.0 to +1.0, signalling a negative trend of the strategy balance (values close to -1.0), no trend (values close to 0.0) and a positive trend (values tending to +1.0). Due to all these properties, reliability and simplicity, R-square can be recommended for use in building a profitable trading system.
Wow! I always thought that Ryx, the coefficient of determination, is used to evaluate the quality of linear regression. The coefficient of determination for a model with a constant takes values from 0 to 1.
It is also common to perform significance tests on the regression coefficient. Even Alglib has them :-)
PearsonCorrelationSignificance(), SpearmanRankCorrelationSignificance().
What was meant?
There is a special statistical metric in the MetaTrader terminal report. It is called LR Correlation and it shows the correlation between the balance line and the linear regression found for this line.
The way of calculating the determination coefficient R^2 is similar to the way of calculating LR Correlation. But the final number is additionally squared.
The number we are most interested in here is R-squared. This metric shows a value of 0.5903. Hence, linear regression explains 59.03% of all values, and the remaining 41% remains unexplained.
From the above quotations, it follows that R^2 = LR^2. Thus, the criterion for finding the linear function called "Linear Regression" is least squares (minimising the sum of squared deviations) or, which is the same thing, maximising the absolute value of Pearson's correlation coefficient, which is MathAbs(LR). And maximising MathAbs(LR) is the same as maximising R^2, since MathAbs(LR) = MathSqrt(R^2).
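For reference, the standard identities behind this argument can be written out. This is a sketch for a least-squares line with an intercept, $\hat{y}_i = a x_i + b$; the symbols $r_{xy}$ and $r_{y\hat{y}}$ are notation introduced here, not taken from the article.

```latex
R^2 \;=\; 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
    \;=\; r_{xy}^2 \;=\; r_{y\hat{y}}^2 ,
\qquad
r_{xy} \;=\; \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
                  {\sqrt{\sum_i (x_i - \bar{x})^2 \; \sum_i (y_i - \bar{y})^2}} ,
\qquad
r_{y\hat{y}} \;=\; |r_{xy}| .
```

The denominator $\sum_i (y_i - \bar{y})^2$ does not depend on $a$ or $b$, so minimising the residual sum of squares and maximising $R^2$ select the same line.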
In total, the linear regression is found by the criterion of maximising MathAbs(R)^n, where n is any positive number.
Well then, what is the point of talking about 59.03% of all values explained by LR, when you can get, for example, 76.8% at n = 1 and 34.8% at n = 4?
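The arithmetic behind these percentages can be checked with a short sketch. It is written in Python rather than MQL5 so that it runs outside the terminal; `power_metric` and the three candidate values are hypothetical names and numbers chosen for illustration, and only 0.5903 comes from the quoted article.

```python
# Rescaling |R| by different positive powers changes the reported number,
# but not the ordering of strategies ranked by it.

r_squared = 0.5903        # value quoted from the article's example
abs_r = r_squared ** 0.5  # |R|, i.e. MathAbs(LR)


def power_metric(abs_r: float, n: float) -> float:
    """Return |R|**n for a positive exponent n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return abs_r ** n


for n in (1, 2, 4):
    print(f"n = {n}: {power_metric(abs_r, n):.4f}")

# Hypothetical |R| values for three strategies (illustration only).
candidates = {"A": 0.95, "B": 0.77, "C": 0.40}
rankings = {
    n: sorted(candidates, key=lambda k: power_metric(candidates[k], n), reverse=True)
    for n in (1, 2, 4)
}
print(rankings)
```

Because |R|^n is monotonic in |R| for any positive n, every exponent gives the same ranking; only n = 2 carries the "share of variance explained" reading.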
Wrong statement
R^2 is nothing but the correlation between a graph and its linear model
    //-- Find R^2 and its sign
    double r2 = MathPow(corr, 2.0);
Plots of the LR Correlation and R^2 distributions for the 10,000 independent examples that are presented in the article show that R^2 != LR^2.
The amazing thing is that by a simple mathematical action (second degree) we have completely removed the undesirable marginal effects of the distribution.
Why use Excel for this, when everything is sorted in the Tester itself?
| Disadvantages | Solution |
|---|---|
| Applicable only for evaluating linear processes, or systems trading a fixed lot. | Do not use for trading systems that use a capitalization system (money management). |
Equity for the R^2 calculation should be counted not as AccountEquity ( == AccountBalance + Sum(Profit[i])), but as Sum(Profit[i] / Lots[i]) (for a single-symbol TS).
    //+------------------------------------------------------------------+
    //| Returns the R^2 score calculated based on the equity of the strategy |
    //| Equity values are passed as an equity array                      |
    //+------------------------------------------------------------------+
    double CustomR2Equity(double& equity[], ENUM_CORR_TYPE corr_type = CORR_PEARSON)
    {
       int total = ArraySize(equity);
       if(total == 0)
          return 0.0;
       //-- Fill the matrix Y - equity value, X - ordinal number of the value
       CMatrixDouble xy(total, 2);
       for(int i = 0; i < total; i++)
       {
          xy[i].Set(0, i);
          xy[i].Set(1, equity[i]);
       }
       //-- Find the coefficients a and b of the linear model y = a*x + b;
       int retcode = 0;
       double a, b;
       CLinReg::LRLine(xy, total, retcode, a, b);
       //-- Generate linear regression values for each X;
       double estimate[];
       ArrayResize(estimate, total);
       for(int x = 0; x < total; x++)
          estimate[x] = x*a+b;
       //-- Find the correlation coefficient of values with their linear regression
       double corr = 0.0;
       if(corr_type == CORR_PEARSON)
          corr = CAlglib::PearsonCorr2(equity, estimate);
       else
          corr = CAlglib::SpearmanCorr2(equity, estimate);
       //-- Find R^2 and its sign
       double r2 = MathPow(corr, 2.0);
       int sign = 1;
       if(equity[0] > equity[total-1])
          sign = -1;
       r2 *= sign;
       //-- Return the normalised estimate of R^2, to hundredths precision
       return NormalizeDouble(r2,2);
    }
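For readers who want to check the numbers outside the terminal, the same steps can be sketched in plain Python. This is an illustration of the Pearson branch only, not a port of the ALGLIB calls: `linear_fit`, `pearson` and `signed_r2` are hypothetical helper names, and returning 0.0 for fewer than two points or for a flat series is an assumption made here (the MQL5 function only guards against an empty array).

```python
from math import sqrt


def linear_fit(y):
    """Least-squares line y = a*x + b over x = 0, 1, ..., len(y) - 1."""
    n = len(y)
    mean_x = (n - 1) / 2
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in enumerate(y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def pearson(u, v):
    """Pearson correlation; returns 0.0 if either series is constant (assumption)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v))
    suu = sum((p - mean_u) ** 2 for p in u)
    svv = sum((q - mean_v) ** 2 for q in v)
    if suu == 0 or svv == 0:
        return 0.0
    return suv / sqrt(suu * svv)


def signed_r2(equity):
    """R^2 of the equity against its regression line, signed by net direction."""
    n = len(equity)
    if n < 2:  # assumption: too few points to fit a line
        return 0.0
    a, b = linear_fit(equity)
    estimate = [a * x + b for x in range(n)]
    corr = pearson(equity, estimate)
    # Same sign rule as the MQL5 code: negative if the series ends below its start.
    sign = -1 if equity[0] > equity[-1] else 1
    return round(sign * corr ** 2, 2)


print(signed_r2([0, 2, 1, 3]))
print(signed_r2([4, 3, 2, 1, 0]))
```

Note that the correlation between a series and its own fitted line is never negative, so the sign of the result comes entirely from the first-versus-last comparison.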
It is universal - it is suitable for any double arrays (not only Equity).
When you look at all the other MQL-codes, you don't understand why they are given, because they are not readable at all without knowledge of CStrategy.
Thanks to the author for the article, it made me think.
P.S. The lines highlighted in yellow in the source code are controversial.
Very interesting, thank you. I never thought about the fact that tester metrics are distorted by capitalisation, so optimisation will be less efficient than with a fixed volume. And R^2 is of course very useful; I wonder if it will speed up the optimisation process compared to profit factor + max balance, for example.
Of all the MQL sources in the article, only one is useful
I agree with that; all the rest will have to be ripped out of classes to add to your own system... It would be better to have everything in separate functions or in a separate include file.
Equity for the R^2 calculation should be calculated not as AccountEquity ( == AccountBalance + Sum(Profit[i])), but as Sum(Profit[i] / Lots[i]) (for a single-symbol TS).
Code for calculating "equity" suitable for R^2. It is written in MT4 style, it is not difficult to translate it to MT5...
    // Equity calculation without MM (variant for a single-symbol TS)
    class EQUITY
    {
    protected:
      int PrevHistoryTotal;
      double Balance;
      double PrevEquity;

      // Add an element to the end of an arbitrary array
      template <typename T>
      static void AddArrayElement( T &Array[], const T Value, const int Reserve = 0 )
      {
        const int Size = ::ArraySize(Array);

        ::ArrayResize(Array, Size + 1, Reserve);
        Array[Size] = Value;
      }

      static double GetOrderProfit( void )
      {
        return((OrderProfit()/* + OrderCommission() + OrderSwap()*/) / OrderLots()); // it is sometimes useful to ignore commission and swap
      }

      static double GetProfit( void )
      {
        double Res = 0;

        for (int i = OrdersTotal() - 1; i >= 0; i--)
          if (OrderSelect(i, SELECT_BY_POS) && (OrderType() <= OP_SELL))
            Res += EQUITY::GetOrderProfit();

        return(Res);
      }

      double GetBalance( void )
      {
        const int HistoryTotal = OrdersHistoryTotal();

        if (HistoryTotal != this.PrevHistoryTotal)
        {
          for (int i = HistoryTotal - 1; i >= PrevHistoryTotal; i--)
            if (OrderSelect(i, SELECT_BY_POS, MODE_HISTORY) && (OrderType() <= OP_SELL) && OrderLots()) // OrderLots - CloseBy
              this.Balance += EQUITY::GetOrderProfit();

          this.PrevHistoryTotal = HistoryTotal;
        }

        return(this.Balance);
      }

    public:
      double Data[];

      EQUITY( void ) : PrevHistoryTotal(0), Balance(0), PrevEquity(0)
      {
      }

      virtual void OnTimer( void )
      {
        const double NewEquity = this.GetBalance() + EQUITY::GetProfit();

        if (NewEquity != this.PrevEquity)
        {
          EQUITY::AddArrayElement(this.Data, NewEquity, 1e4);

          this.PrevEquity = NewEquity;
        }
      }
    };
Usage
    EQUITY Equity;

    void OnTimer()
    {
      Equity.OnTimer();
    }

    double OnTester()
    {
      return(CustomR2Equity(Equity.Data));
    }
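The effect of dividing each profit by its lot size can be seen on a toy example. The sketch below is in Python and covers closed deals only, whereas the EQUITY class above also adds the floating profit of open orders. `per_lot_equity` and the trade list are hypothetical, made up for illustration.

```python
def per_lot_equity(trades):
    """Cumulative sum of profit / lots over closed deals of one symbol.

    trades: iterable of (profit, lots) pairs in chronological order.
    """
    equity = []
    total = 0.0
    for profit, lots in trades:
        if lots <= 0:
            continue  # skip zero-volume records, similar to the OrderLots() check
        total += profit / lots
        equity.append(total)
    return equity


# Hypothetical deals: the lot doubles each time (money management),
# while the result per one lot is always +100 or -100.
trades = [(10.0, 0.1), (20.0, 0.2), (-40.0, 0.4), (80.0, 0.8)]

raw_balance = []
running = 0.0
for profit, _ in trades:
    running += profit
    raw_balance.append(running)

print(raw_balance)             # balance curve distorted by the growing lot
print(per_lot_equity(trades))  # curve normalised to one lot
```

The raw balance swings with the growing lot, while the per-lot curve moves in equal steps, which is the property the R^2 estimate relies on.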
New article R-squared as an estimation of quality of the strategy balance curve has been published:
This article describes the construction of the custom optimization criterion R-squared. This criterion can be used to estimate the quality of a strategy's balance curve and to select the most smoothly growing and stable strategies. The work discusses the principles of its construction and statistical methods used in estimation of properties and quality of this metric.
Linear regression is a linear dependence of one variable y on another independent variable x, expressed by the formula y = ax+b. In this formula, a is the multiplier, b is the bias coefficient. In reality, there may be several independent variables, and such a model is called a multiple linear regression model. However, we will consider only the simplest case.
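For the simplest case, the coefficients have a closed form. The block below is a sketch that assumes the line is fitted by ordinary least squares over $n$ points $(x_i, y_i)$; the excerpt itself does not state the fitting method, and $\bar{x}$, $\bar{y}$ denote sample means.

```latex
a \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b \;=\; \bar{y} - a\,\bar{x},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i .
```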
Linear dependence can be visualized in the form of a simple graph. Take the daily EURUSD chart from 2017.06.21 to 2017.09.21. This segment is not selected by chance: during this period, a moderate ascending trend was observed on this currency pair. This is how it looks in MetaTrader:
Fig. 11. Dynamics of the EURUSD price from 21.06.2017 to 21.08.2017, daily timeframe
Author: Vasiliy Sokolov