Machine learning in trading: theory, models, practice and algo-trading - page 322

 
Maxim Dmitrievsky:

You need to choose the forward with the highest profit factor, and the backtest should be about the same, meaning the net is able to make adequate predictions with those parameters.

No, that's not how it works. The forward simply shows the potential profit in real trading: take the best backtest result and look at its forward test, and that is roughly what you would actually earn.
During optimization it is quite possible to get at least one profitable forward variant, but such an EA will still lose no matter how you spin it. Genetics tries more than 10,000 parameter combinations, and some of them will always be profitable in both the backtest and the forward test purely by accident.

The forward can be used as a control when creating or modifying an Expert Advisor: replace those three RSI parameters with something else, genetically find the new best parameter values, and see what happens on the forward. If the best backtest results correspond to good forward results, and this holds when optimizing on different time intervals, then the EA is OK. The forward should not be made too long, since almost every EA loses over a long interval without re-optimization. For example, two months of backtest and a week of forward test is enough.
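As a rough illustration of that backtest/forward control (my sketch, not the poster's code): roll a two-month optimization window followed by a one-week forward window across the history and check how the best backtest parameters behave on the forward. Here optimize_ea and run_ea are hypothetical placeholders for the genetic optimization and the EA run.

```python
import pandas as pd

def walk_forward(prices, optimize_ea, run_ea, backtest="60D", forward="7D"):
    """Slide a backtest window plus a forward window over a price Series
    with a DatetimeIndex and collect the forward results."""
    results = []
    start = prices.index[0]
    while True:
        bt_end = start + pd.Timedelta(backtest)
        fw_end = bt_end + pd.Timedelta(forward)
        if fw_end > prices.index[-1]:
            break
        bt_data = prices[start:bt_end]       # ~2 months for genetic optimization
        fw_data = prices[bt_end:fw_end]      # ~1 week of unseen forward data
        best_params = optimize_ea(bt_data)   # best backtest parameters
        results.append((bt_end, run_ea(fw_data, best_params)))  # forward profit
        start = bt_end                       # slide the window and repeat
    return results
```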


Maxim Dmitrievsky:

I still don't quite understand whether it is better for the normalization function to take an array of 5000 bars at once instead of 50, so that it finds more correct max and min values from the very beginning and doesn't keep updating them over time; otherwise at the start of testing we get not quite correctly normalized input values, and only later do they become more and more accurate.

Yes, with 5000 it will be more accurate. Also, in real trading the min and max values will be reset after relaunching the Expert Advisor, restarting the terminal, etc. All the optimization will be lost. The deposit will be destroyed.
I'm also trying to change something in the code to make it profitable. For example, I simply took the raw linear-regression output without any additions, multiplied it by 1000 and added 0.5. The result will almost always be in [0;1] (if it goes outside those limits, I print an error to the log and later reduce the multiplier), the center will always be at 0.5, and it will never drift.
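In code form, that fixed scaling could look something like the sketch below (a minimal Python illustration, not the actual EA code; the multiplier of 1000 is just the value mentioned in the post).

```python
def scale_regression_output(value, multiplier=1000.0):
    """Fixed affine scaling into [0;1]: no rolling min/max, so nothing is
    reset on restart and the center is always at 0.5."""
    scaled = value * multiplier + 0.5
    if scaled < 0.0 or scaled > 1.0:
        print("warning: value outside [0;1], consider reducing the multiplier")
        scaled = min(max(scaled, 0.0), 1.0)   # clamp just in case
    return scaled
```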


Maxim Dmitrievsky:

I would still detrend the charts and feed the net more digestible values. I'm not sure yet how to handle, say, the regression slope and the autocorrelation of a stationary series, since I'm not very good at econometrics; for now I'm just watching video lectures.

The slope of a regression on a stationary series will be zero, there is nothing to look for there. In general, if you want the slope of the current trend over the last N bars, linear regression is fine.
Autocorrelation is more awkward, because it is not a single value but a long vector (correlation at lag 1, correlation at lag 2, correlation at lag 3, and so on). All of those values won't fit into the RNN.
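For reference, a small numpy sketch (my illustration, nothing from the thread) of the two features being discussed: the regression slope over the last N bars, and the autocorrelation as a whole vector of lags rather than a single number.

```python
import numpy as np

def regression_slope(prices, n=50):
    """Slope of a linear regression fitted to the last n bars."""
    y = np.asarray(prices[-n:], dtype=float)
    x = np.arange(n)
    slope, intercept = np.polyfit(x, y, 1)   # slope of the current trend
    return slope

def autocorrelation_vector(series, max_lag=10):
    """Autocorrelation at lags 1..max_lag: one value per lag, hence a vector."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])
```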

 
Dr. Trader:

The slope of a regression on a stationary series will be zero, there is nothing to look for there. In general, if you want the slope of the current trend over the last N bars, linear regression is fine; everything is already good in the code.
Autocorrelation is more awkward, because it is not a single value but a long vector (correlation at lag 1, correlation at lag 2, correlation at lag 3, and so on). All of those values won't fit into the RNN.


No, no, we compute the regression slope on the ordinary charts and look for autocorrelation on the detrended ones; maybe we also map the cycle period into [0;1], i.e. which phase of the cycle we are currently in.

That is, at the input we have the direction in the form of the regression slope, plus the cyclic component within that direction.

 
Maxim Dmitrievsky:

And what do you think of the idea of using the RNN as a sort of autoencoder for MLP?

There's something quite wrong with that phrase :)


An autoencoder is a neural network that can:
1) take some vector (for example, a price series) and output another vector that is shorter. A kind of data compression with small losses.
2) take the previously obtained short vector and reconstruct the original data from it (or nearly the original, depending on the losses of the first step). That is decompression.

A real-life example: we have a picture in BMP format, which takes a lot of disk space. The autoencoder takes its pixels and returns a new vector of JPG pixels: the same image, but it takes less disk space and looks a bit blurry.
Then, if you want, you can go from JPG back to BMP, but the original sharpness and vividness will not come back.
(You can't really put the JPG algorithm into a neural net; this is just for clarity.)
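As a bare-bones illustration of that compress/decompress idea (my sketch, assuming PyTorch is available; none of it comes from the thread), an autoencoder is just an encoder that shrinks the vector and a decoder trained to rebuild it:

```python
import torch
import torch.nn as nn

n_in, n_code = 50, 5                 # compress 50 inputs into a 5-value code

encoder = nn.Sequential(nn.Linear(n_in, n_code), nn.Tanh())
decoder = nn.Linear(n_code, n_in)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(1000, n_in)          # toy data standing in for windows of a series

for epoch in range(200):
    code = encoder(x)                # "compression": the short vector
    x_hat = decoder(code)            # "decompression": the reconstruction
    loss = loss_fn(x_hat, x)         # reconstruction error = information lost
    opt.zero_grad()
    loss.backward()
    opt.step()
```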


The RNN here takes not the time series but the RSI values, and it returns only one value, from which the original prices cannot be restored.



Maxim Dmitrievsky:

No, no, we compute the regression slope on the ordinary charts and look for autocorrelation on the detrended ones; maybe we also map the cycle period into [0;1], i.e. which phase of the cycle we are currently in.

That is, at the input we have the direction in the form of the regression slope, plus the cyclic component within that direction.

Ah, I see.
 
Dr. Trader:

There's something quite wrong with that phrase :)

The RNN here takes not the time series but the RSI values, and it returns only one value, from which the original prices cannot be restored.

But we can reconstruct the 3 RSI readings back :) it just compressed them and output a probability, no? )

An autoencoder also loses information... I don't see the difference yet. Maybe the difference is purely in the architecture, and what we have is a kind of simplified version.

 
Yuriy Asaulenko:
Also took a look. Imho, this is not our subject area.


Well, why not? I've seen publications for EURUSD on M1.

Look at rugarch.

There are a great many of these GARCHes. They have three groups of parameters: the model itself, the type of the mean, and the type of the residual distribution. For each of these parameter groups there are the most up-to-date variants. Detrending was discussed above: in GARCH, detrending is done with ARFIMA, i.e. with fractional differencing (Hurst).
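For a rough sense of those three parameter groups, here is a sketch in Python using the arch package as a stand-in for R's rugarch (my illustration only; rugarch's ARFIMA mean with fractional differencing has no direct equivalent here, so a plain AR mean is used instead):

```python
import numpy as np
from arch import arch_model

returns = np.random.normal(0, 0.001, 5000) * 100   # toy returns, in percent

model = arch_model(
    returns,
    mean="AR", lags=1,        # group 1: the mean model (rugarch: ARFIMA, etc.)
    vol="GARCH", p=1, q=1,    # group 2: the volatility model itself
    dist="t",                 # group 3: the residual distribution
)
fit = model.fit(disp="off")
print(fit.summary())
```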

 
SanSanych Fomenko:


Well, why not? I've seen publications for EURUSD on M1.

Look at rugarch.

There are a great many of these GARCHes. They have three groups of parameters: the model itself, the type of the mean, and the type of the residual distribution. For each of these parameter groups there are the most up-to-date variants. Detrending was discussed above: in GARCH, detrending is done with ARFIMA, i.e. with fractional differencing (Hurst).


Well, why shove all that stuff into the net as inputs? After all, the net itself should build the model inside itself.
 
Maxim Dmitrievsky:

Well, why shove all that stuff into the net as inputs? After all, the net itself should build the model inside itself.

Spit on the net and don't shove decent things into an indecent one.
 
SanSanych Fomenko:

Spit on the net and don't shove decent things into an indecent one.

No, it's worth using them out of interest; one can certainly come up with applications for nets.
 
Maxim Dmitrievsky:

No, it's worth using them out of interest; one can certainly come up with applications for nets.
All machine learning and NN models are extremely dependent on the predictors, which have to be matched to the target variable. All of this has been discussed many times above. The main labour goes into data mining; the net itself does not matter much.
 
SanSanych Fomenko:
All machine learning and NN models are extremely dependent on the predictors, which have to be matched to the target variable. All of this has been discussed many times above. The main labour goes into data mining; the net itself does not matter much.

Well, that's exactly what I was trying to discuss above: predictor options :) I'll give it a try, then.