Trader's self-deception: distrust of forwards. - page 11

 
Yury Reshetov:

It's called cockroaches in the head. So speak for yourself, don't speak for everyone else. If you don't even know about something, it doesn't mean that others don't take advantage of it. See, for example, the RNN Expert Advisor.

If an EA is not profitable on 1/2 forwards (half backtest, half forward), then you should throw it away. Any fool will get a profit on backtests using tester GA.

I don't understand: what is it that I don't know? About some Expert Advisor? What is the point of your post? If you mean forward analysis, the image at the link shows only a single forward, and such an Expert Advisor, even if it is not worth discarding immediately, should be checked on many segments.
 

Youri Tarshecki:


I almost never see any analysis of the effectiveness of strategies and systems based on forwards.

What is it? Lack of tradition or avoidance of unpleasant emotions?

...

I don't understand: what is it that I don't know? What is the point of your post?

See Robert Pardo in PDF


Youri Tarshecki:
If you mean forward analysis, the picture at the link shows only a single forward.

CodeBase is not the Hermitage or even the Louvre to arrange Pinocchio's exhibition there.

Robert Pardo in PDF - MQL4 forum (www.mql5.com)
 
Youri Tarshecki:
I test all the options. Right now I run 12 segments with 12 forwards and look at the overall result. If the majority of forwards are unsatisfactory, the EA should not be used; it needs to be reworked. By reworking your EA and collecting information about the forwards, you can tell whether you are heading in the right direction.
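A minimal sketch of such a 12-segment walk-forward split, in Python. The post does not say how long each back window or forward is, so `back_len`, the bar counts, and the function names here are illustrative assumptions, and "majority of forwards" is read as "more than half ended in profit".

```python
from typing import List, Tuple


def walk_forward_windows(n_bars: int, n_segments: int,
                         back_len: int) -> List[Tuple[range, range]]:
    """Return (back, forward) index ranges for a rolling walk-forward test.

    Assumption: every forward has the same length and is preceded by a
    back (optimisation) window of back_len bars.
    """
    fwd_len = (n_bars - back_len) // n_segments
    if fwd_len <= 0:
        raise ValueError("not enough history for the requested split")
    windows = []
    for i in range(n_segments):
        fwd_start = back_len + i * fwd_len
        back = range(fwd_start - back_len, fwd_start)
        forward = range(fwd_start, fwd_start + fwd_len)
        windows.append((back, forward))
    return windows


def majority_of_forwards_ok(forward_profits: List[float]) -> bool:
    """True if more than half of the forward segments ended in profit."""
    winners = sum(1 for p in forward_profits if p > 0)
    return winners * 2 > len(forward_profits)


# Hypothetical history of 2400 bars: 1200 bars of backtest, 12 forwards.
windows = walk_forward_windows(n_bars=2400, n_segments=12, back_len=1200)
```

Each forward would be traded with the parameters optimised on its own back window, and the 12 forward profits then passed to `majority_of_forwards_ok`.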

Randomly splitting the history into back and forward segments won't do any good.

Random values, and even clusters, can turn out to be the best on some pieces of the backtest. That is, the EA may be good, but on the first backtest a random set of parameters turned out best and you chose it for the forward. Accordingly, it died on the forward)). The same goes for the rest: the more you split, the less statistically valid the results. So each backtest and each forward must contain enough trades for statistical validity. In fact, for most systems, even intraday ones, this means at least 20 years of testing with your splits)). And they don't usually live that long. So, as you have correctly been told, the best solution is to combine the back zones into one, and you will get an average but more reliable set of parameters. Or, more precisely, the optimum zones for each optimised parameter.

You don't have the luxury of dividing the history into 12 segments with enough trades in each for statistically reliable results. And acting on statistically unreliable results gives a random outcome.
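A rough Python illustration of why splitting costs statistical reliability. It uses the t-statistic of the mean trade result as one possible yardstick, which is an assumption of this sketch and not something named in the post. For a fixed total number of trades, the standard error of the mean within one of `k` equal segments is larger by a factor of sqrt(k).

```python
import math
from statistics import mean, stdev
from typing import Sequence


def t_statistic(trade_results: Sequence[float]) -> float:
    """Mean trade result divided by its standard error."""
    n = len(trade_results)
    if n < 2:
        raise ValueError("need at least two trades")
    sd = stdev(trade_results)
    if sd == 0:
        raise ValueError("all trades identical: standard error is zero")
    return mean(trade_results) / (sd / math.sqrt(n))


def standard_error_inflation(n_segments: int) -> float:
    """Growth factor of the standard error of the mean when the same
    trades are spread evenly over n_segments separate segments."""
    return math.sqrt(n_segments)


# Splitting into 12 segments inflates the standard error about 3.5 times.
inflation_12 = standard_error_inflation(12)
```

Put differently, under these assumptions each of the 12 segments would need as many trades as the whole unsplit test to reach the same precision.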

 
Слава:

Randomly splitting the history into back and forward segments won't do any good.

Random values, and even clusters, can turn out to be the best on some pieces of the backtest. That is, the EA may be good, but on the first backtest a random set of parameters turned out best and you chose it for the forward. Accordingly, it died on the forward)). The same goes for the rest: the more you split, the less statistically valid the results. So each backtest and each forward must contain enough trades for statistical validity. In fact, for most systems, even intraday ones, this means at least 20 years of testing with your splits)). And they don't usually live that long. So, as you have correctly been told, the best solution is to combine the back zones into one, and you will get an average but more reliable set of parameters. Or, more precisely, the optimum zones for each optimised parameter.

You don't have the luxury of dividing the history into 12 segments with enough trades in each for statistically reliable results. And acting on statistically unreliable results gives a random outcome.

And let's work out what statistical validity is. If you believe that the more averaging, the greater the validity, then why not test on ALL available history? The averaging would be absolute. But would it be reliable? Unlikely, precisely because the market changes, and does so with statistical reliability, and you can never predict whether a particular cluster is an anomaly or the beginning of a new trend. These two properties of the market are diametrically opposed and require a dialectical solution, since one calls for a longer segment and the other for a shorter one.

I solve this problem experimentally. In my particular case it turned out that the sum of 12 monthly forwards is greater than the sum of 4 three-month forwards, which in turn is greater than a single one-year forward, and the number of consecutive losses is also better.

Have you tested your back-to-forward proportions experimentally? On what basis do you consider your split optimal?

 
Слава:

Randomly splitting the history into back and forward segments won't do any good.

Random values, and even clusters, can turn out to be the best on some pieces of the backtest. That is, the EA may be good, but on the first backtest a random set of parameters turned out best and you chose it for the forward. Accordingly, it died on the forward)). The same goes for the rest: the more you split, the less statistically valid the results. So each backtest and each forward must contain enough trades for statistical validity. In fact, for most systems, even intraday ones, this means at least 20 years of testing with your splits)). And they don't usually live that long. So, as you have correctly been told, the best solution is to combine the back zones into one, and you will get an average but more reliable set of parameters. Or, more precisely, the optimum zones for each optimised parameter.

You don't have the luxury of dividing the history into 12 segments with enough trades in each for statistically reliable results. And acting on statistically unreliable results gives a random outcome.

I agree with every word. I will add that to identify the most stable regions of TS parameters I use R^2 calculated on the strategy's resulting equity. From my point of view, the best run combines a positive result, a good R^2 (above 0.8 to 0.9) and a statistically significant number of trades. Absolute profit matters less here than whether there were losing periods. All good strategies lose during certain periods; those losses just need to stay within the overall positive trend. It is also important to have on hand a dozen strategies that may be average but are stable (in terms of R^2) and whose unfavourable periods do not coincide exactly (full correlation is hard to achieve).
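A minimal Python sketch of an R^2 measure on an equity curve. The post does not give the exact formula, so this assumes the usual coefficient of determination of a least-squares straight line, with equity indexed by trade number (so idle periods do not count); the function name is illustrative.

```python
from typing import Sequence


def equity_r_squared(equity: Sequence[float]) -> float:
    """R^2 of a least-squares line fitted to equity vs. trade index."""
    n = len(equity)
    if n < 3:
        raise ValueError("need at least three equity points")
    x_mean = (n - 1) / 2
    y_mean = sum(equity) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(equity))
    syy = sum((y - y_mean) ** 2 for y in equity)
    if syy == 0:
        return 0.0  # flat equity: there is no trend to explain
    # For a one-variable linear fit, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)
```

R^2 alone does not see direction: a straight line going down also scores close to 1, which is why it is paired here with a positive final result.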
 
Youri Tarshecki:

And let's work out what statistical validity is. If you believe that the more averaging, the greater the validity, then why not test on ALL available history? The averaging would be absolute. But would it be reliable? Unlikely, precisely because the market changes, and does so with statistical reliability, and you can never predict whether a particular cluster is an anomaly or the beginning of a new trend. These two properties of the market are diametrically opposed and require a dialectical solution, since one calls for a longer segment and the other for a shorter one.

An Expert Advisor works for a certain period of time: its favourable phase, to put it crudely. So it makes no sense to take too much history, or to take it at random. And your method requires many times more statistics and a much longer testing period. The task, then, is to find a working idea as quickly as possible while eliminating curve fitting. Splitting the history into separate fragments on which decisions are made means that the history the system must already have worked through becomes very long (see the arguments about statistical validity in the previous post).

Youri Tarshecki:


I solve this problem experimentally. In my particular case it turned out that the sum of 12 monthly forwards is greater than the sum of 4 three-month forwards, which in turn is greater than a single one-year forward, and the number of consecutive losses is also better.

Have you tested your back-to-forward proportions experimentally? On what basis do you consider your split optimal?


I approach it differently) I don't use any proportions and I don't use a forward test. I judge the quality of a system by the quality of its equity.

The ideal system has equity going straight up.) That is, MO = const and variance = 0 (MO being the mathematical expectation). In reality the MO floats and the variance is not zero either; roughly speaking, the equity oscillates around a perfect straight line. A good system is one that shows small variance and a positive slope when tested reliably (the number of trades being one criterion). The profit factor (PF), for example, takes this into account. That is, a system with a good PF (and some other numerical characteristics of equity) is then examined for stability under reliable testing. This is already enough for it to pass your criteria too: split it into segments and they will be of good quality as well))
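A small Python sketch of the numbers mentioned in this post: the profit factor, and the slope and scatter of equity around a fitted straight line. The post gives no formulas, so this assumes the common definition of PF (gross profit divided by gross loss) and an ordinary least-squares line over the trade index; all names are illustrative.

```python
from typing import Sequence, Tuple


def profit_factor(trade_results: Sequence[float]) -> float:
    """Gross profit divided by gross loss (infinite if nothing was lost)."""
    gross_profit = sum(r for r in trade_results if r > 0)
    gross_loss = -sum(r for r in trade_results if r < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


def slope_and_residual_variance(equity: Sequence[float]) -> Tuple[float, float]:
    """Slope of the least-squares line through the equity curve and the
    variance of equity around that line."""
    n = len(equity)
    if n < 3:
        raise ValueError("need at least three equity points")
    x_mean = (n - 1) / 2
    y_mean = sum(equity) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(equity))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in enumerate(equity)]
    variance = sum(e * e for e in residuals) / (n - 2)
    return slope, variance
```

In these terms the "ideal" equity has a positive slope and zero residual variance; a real system is judged by how close it comes.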

And in general, you need to understand what the system earns on and examine each part of the system for stability separately. Each option should get its own stability study.

You also need a sufficiently non-lagging criterion for switching the system off, one that likewise rests on an understanding of the system's components and of what is decisive for its performance.

 
Vasiliy Sokolov:
I agree with every word. I would add that to identify the most stable regions of TS parameters I use R^2 calculated on the strategy's resulting equity. From my point of view, the best run combines a positive result, a good R^2 (above 0.8 to 0.9) and a statistically significant number of trades. Absolute profit matters less here than whether there were losing periods. All good strategies lose during certain periods; those losses just need to stay within the overall positive trend. It is also important to have on hand a dozen strategies that may be average but are stable (in terms of R^2) and whose unfavourable periods do not coincide exactly (full correlation is hard to achieve).
I agree) Except that having a dozen uncorrelated profitable strategies on hand is realistic)
 
Слава:

... Roughly speaking, the equity oscillates around a perfect straight line. A good system is one that, when tested reliably (the number of trades being one criterion), shows small variance and a positive slope. The profit factor (PF), for example, takes this into account.

I tried using PF at one time, but the problem is that it depends inversely (and very clearly) on the number of trades: the more trades, the lower the PF. R^2 computed on the net change of equity (periods when the trading system is idle are not counted) has no such drawback.

Слава:

That is, a system with a good PF (and some other numerical characteristics of equity) is then examined for stability under reliable testing. This is already enough for it to pass your criteria too: split it into segments and they will be of good quality as well))

Exactly. This statement can be proved formally: if a TS has a nearly perfect straight-line equity pointing upwards, then an arbitrary segment (forward) of that equity will also pass validation, since it too will show a positive result. Conversely, forward testing looks for the set of parameters that is profitable on all parts of the history, and therefore finds the set at which the TS, run over the entire history, gives the most stable positive result. But since the same set of parameters would be obtained by optimising over the whole sample, there is no need to divide the sample into N arbitrary parts.

Слава:

In general, you need to understand what the system earns on and examine each part of the system for stability separately. Each option should get its own stability study.

This is difficult. It is probably the holy grail of trading. We trade the consequence of a cause that almost always remains behind the scenes. For example, not every trend-following TS works in a trending market. A TS may show great results on some markets and at best merely avoid losing money on others, even though there are no obvious differences between those markets, whether in trendiness or in any other statistic.

Слава:

You also need a sufficiently non-lagging criterion for switching the system off, one that likewise rests on an understanding of the system's components and of what is decisive for its performance.

Yes. It's easier here, because you can formulate precisely what you want to see: a straight line with a positive slope. If a TS has stopped earning, its equity will sooner or later leave the bounds of the expected model, and the TS will have to be disabled.

But the main factor here is psychological: accepting the inevitable treading water, and even a certain loss, as standard behaviour of the TS within the chosen model.
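One way such a switch-off rule could be sketched in Python. The post only says that equity eventually leaves the expected model, so the concrete rule here is an assumption: a straight line is fitted to a reference equity curve, extended forward, and the TS is flagged once live equity drops more than `k` residual standard deviations below it. The value `k = 3.0` and all names are illustrative.

```python
import math
from typing import Sequence, Tuple


def fit_line(equity: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares slope, intercept and residual standard deviation."""
    n = len(equity)
    if n < 3:
        raise ValueError("need at least three equity points")
    x_mean = (n - 1) / 2
    y_mean = sum(equity) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(equity))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in enumerate(equity))
    return slope, intercept, math.sqrt(ss_res / (n - 2))


def should_disable(reference_equity: Sequence[float],
                   live_equity: Sequence[float], k: float = 3.0) -> bool:
    """True if any live equity point falls more than k sigma below the
    line extrapolated from the reference period."""
    slope, intercept, sigma = fit_line(reference_equity)
    start = len(reference_equity)
    for i, value in enumerate(live_equity):
        expected = intercept + slope * (start + i)
        if value < expected - k * sigma:
            return True
    return False
```

A wider band (larger `k`) tolerates more treading water before switching off, at the price of reacting later; that trade-off is the psychological part mentioned above.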

 
Vasiliy Sokolov:

This is difficult. It is probably the holy grail of trading. We trade the consequence of a cause that almost always remains behind the scenes. For example, not every trend-following TS works in a trending market. A TS may show great results on some markets and at best merely avoid losing money on others, even though there are no obvious differences between those markets, whether in trendiness or in any other statistic.


There are two ways, as always)): deduction and induction. Testing is induction: we find a pattern from statistical studies. Then there is deduction: starting from an understanding of what we earn on (or rather, on what others lose or under-earn), we look for how that should translate into a strategy that exploits it. The two approaches can be combined: induction gives insights, deduction refines them. Or vice versa))
 

A little more about R^2.

To me it is a very powerful indicator, but not a sufficient one. In practice I have found that some TSs can produce a very good, smooth, rising equity. Their R^2 is very high and their set of parameters can get through even the most sophisticated forward test. Here is an example of one such TS:


Its equity tempts you to go straight to the market, but it is not that simple. Fitted TSs have one remarkable feature: their set of parameters is almost always unstable, and any slight shift in the parameter values can change the result drastically. For example, a slight change in the closing rules of this TS leads to the following results:

You can see that a small change has led to disastrous results. Notably, this TS has only 2 optimisation parameters. The point is that a fit can easily be obtained with just two parameters, and a small number of parameters does not mean a TS cannot be fitted. Therefore, once the optimal set of parameters is determined, the parameters should be shifted by some amount in the multidimensional optimisation space, and the results of runs in the vicinity of the optimal point examined:

If we are in a stable region of parameters, shifting them will not dramatically change the behaviour of the TS. It is important to understand that in real trading this very shift will happen. On history, we move the TS parameters around a static market. In real trading, the market moves its characteristics around the parameters we found and fixed earlier.
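The neighbourhood check described above could be sketched in Python roughly as follows. Everything concrete here is an assumption: `evaluate` stands in for a hypothetical backtest that returns one score per parameter set, each parameter is shifted by one step in each direction, and a drop of more than `max_drop` (half of the optimum's score) counts as unstable.

```python
from itertools import product
from typing import Callable, Dict, List

Params = Dict[str, float]


def neighbourhood(optimum: Params, steps: Params) -> List[Params]:
    """All points that differ from the optimum by -1, 0 or +1 step in
    each parameter, excluding the optimum itself."""
    names = sorted(optimum)
    points = []
    for offsets in product((-1, 0, 1), repeat=len(names)):
        if all(o == 0 for o in offsets):
            continue
        points.append({name: optimum[name] + o * steps[name]
                       for name, o in zip(names, offsets)})
    return points


def is_stable(evaluate: Callable[[Params], float], optimum: Params,
              steps: Params, max_drop: float = 0.5) -> bool:
    """True if every neighbour keeps at least (1 - max_drop) of the
    optimum's score. evaluate is a hypothetical backtest function."""
    best = evaluate(optimum)
    if best <= 0:
        return False
    floor = best * (1 - max_drop)
    return all(evaluate(p) >= floor for p in neighbourhood(optimum, steps))
```

With two parameters this means 8 extra runs around the optimum; the count grows as 3^n - 1 with n parameters, so for larger n a random sample of neighbours may be more practical.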
