Machine learning in trading: theory, models, practice and algo-trading - page 126

 
Andrey Dik:
You seem to have misunderstood me. I don't tell the net where to enter, neither with a ZZ nor with any other indicator. A trained net chooses where to enter by itself.

Oh, man... So I don't get it...

 
mytarmailS:

What should correlate with what? How do you do it? I don't get it either...

I think no one did it here but you)

Let me explain once again and encourage you to read about nested cross-validation.

This is an example from my work. I was building a regression model that makes a prediction of some engineering parameter in a very complex system.

I go through the model's training parameters, select the best model on the cross-validation test folds, and then validate it. In total, I selected 100 models on the test (the dots on the graph). These are the best models on the test sample. What makes them differ is that they use different predictors.

You can see that a model that is underfit on the test turns out to be underfit on validation as well. A model that is fully fit on the test is also fully fit on validation. The overfit state, where quality is high on the test and low on validation, does not occur at all.

We have a correlation between the performance of the selected models on the test and the performance on validation.

As we vary the number of predictors, the model grows from underfit to fully fit. And this growth is characteristic of both the data where the best model is selected and the data where the selected best model is validated. There is consistency!

That is, I didn't just pick one model that was best on validation (out of sample); I did multiple model trainings, selected the models by the test, and compared the quality metric on validation. This is nested cross-validation. Such a model is not overfitted. I can take the best model on cross-validation and get one of the best out-of-sample metrics.

And if, as with forex, the variation in model performance on the test does not explain the variation in model performance on validation, then the samples on which we select the best model (in this case, the average quality metric on the cross-validation test folds) do not let us predict out-of-sample quality.

So, by doing model selection but not testing the selection procedure itself out of sample, we are simply fitting the model.

A picture like mine comes out on stationary, consistent data: such data contain stable dependencies. For example, sensor overheating degrades the modeled value in all cases, and this has a physical explanation.

When modeling financial time series, as I have shown before, the picture is different: with 2,000 selected models, their quality metrics on the test samples do not correlate with those on validation.

Nested cross-validation involves repeatedly training different models - or models with different inputs or parameters - on unique training samples, followed by testing. For each unique sample, the best model is selected. It is then tested again on a unique validation sample. This process is repeated many times. The outer layer of testing is needed to show that the model itself and its selection procedure give consistent in-sample and out-of-sample results.

I pointed this out to SanSanych, Dr., and others. Dr. understood me. SanSanych didn't get it.

So, if we achieve this picture for forex or any other financial market, we can put the model that is best on the test into production.
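A rough illustration of the consistency check described above - a minimal sketch assuming numpy and purely hypothetical scores (the arrays stand in for the 100 selected models; they are not the author's data):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical quality metrics for 100 selected models (placeholders, not real results):
# the mean metric on the cross-validation test folds and the metric on a held-out
# validation sample for the same models.
test_scores = rng.uniform(0.3, 0.9, size=100)
validation_scores = test_scores + rng.normal(0.0, 0.05, size=100)

# Pearson correlation between the two: close to 1 means quality on the test predicts
# out-of-sample quality (the "consistency" in the post); close to 0 means selecting
# models by the test tells us nothing about validation.
r = np.corrcoef(test_scores, validation_scores)[0, 1]
print(f"test-vs-validation correlation: {r:.2f}")
```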

 
Alexey Burnakov:

Let me explain once again and encourage you to read about nested cross-validation.

This is an example from my work. I was building a regression model that predicts some engineering parameter in a very complex system.

I go through the model's training parameters, select the best model on the cross-validation test folds, and then validate it. In total, I selected 100 models on the test (the dots on the graph). These are the best models on the test sample. What makes them differ is that they use different predictors.

You can see that a model that is underfit on the test turns out to be underfit on validation as well. A model that is fully fit on the test is also fully fit on validation. The overfit state, where quality is high on the test and low on validation, does not occur at all.

We have a correlation between the performance of the selected models on the test and the performance on validation.

As we vary the number of predictors, the model grows from underfit to fully fit. And this growth is characteristic of both the data where the best model is selected and the data where the selected best model is validated. There is consistency!

That is, I didn't just pick one model that was best on validation (out of sample); I did multiple model trainings, selected the models by the test, and compared the quality metric on validation. This is nested cross-validation. Such a model is not overfitted. I can take the best model on cross-validation and get one of the best out-of-sample metrics.

And if, as with forex, the variation in model performance on the test does not explain the variation in model performance on validation, then the samples on which we select the best model (in this case, the average quality metric on the cross-validation test folds) do not let us predict out-of-sample quality.

So, by doing model selection but not testing the selection procedure itself out of sample, we are simply fitting the model.

A picture like mine comes out on stationary, consistent data: such data contain stable dependencies. For example, sensor overheating degrades the modeled value in all cases, and this has a physical explanation.

When modeling financial time series, as I have shown before, the picture is different: with 2,000 selected models, their quality metrics on the test samples do not correlate with those on validation.

Nested cross-validation involves repeatedly training different models - or models with different inputs or parameters - on unique training samples, followed by testing. For each unique sample, the best model is selected. It is then tested again on a unique validation sample. This process is repeated many times. The outer layer of testing is needed to show that the model itself and its selection procedure give consistent in-sample and out-of-sample results.

I pointed this out to SanSanych, Dr., and others. Dr. understood me. SanSanych didn't get it.

So, if we achieve this picture for forex or any other financial market, we can put the model that is best on the test into production.

I still don't get it, sorry.

Validation folds: are they in the same file as the test folds, or is the validation done on a new file altogether?

PS.

By cross-validation I mean the following algorithm: the file is divided, for example, into 10 folds. The model is trained on the first 9 and validated on the 10th. Then it is trained on folds 2-10 and validated on fold 1. And that is how the validation fold is moved along. Right?
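A minimal sketch of that scheme, assuming scikit-learn and a placeholder Ridge model on dummy data (none of these names come from the thread):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge  # placeholder model, not the one discussed

X, y = np.random.rand(500, 5), np.random.rand(500)  # dummy data for illustration

scores = []
for train_idx, val_idx in KFold(n_splits=10).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])      # train on 9 folds
    scores.append(model.score(X[val_idx], y[val_idx]))   # R^2 on the held-out fold

# Average quality over the 10 rotations of the validation fold.
print(f"mean quality over 10 folds: {np.mean(scores):.3f}")
```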

 
SanSanych Fomenko:

I still don't understand, sorry.

Validation folds: are they in the same file as the test folds, or is the validation done on a new file altogether?

PS.

By cross-validation I mean the following algorithm: the file is divided, for example, into 10 folds. The model is trained on the first 9 and validated on the 10th. Then it is trained on folds 2-10 and validated on fold 1. And that is how the validation fold is moved along. Right?

Yes.

One training cycle with cross-validation on 10 folds - you understand it correctly. For each combination of learning parameters: train on 9 folds, check on the held-out one; do this 10 times. We obtain the average value of the quality metric over the 10 folds. Let us call it m1.

Let's repeat the procedure N times (each time taking new data for training and testing).

Nested cross-validation:

We repeat cycle M N times. Each cycle M uses a unique training sample. We obtain quality metrics m1, m2, ..., mn, produced during training and selection of the best models, all on different data.

Outer layer. Each model selected in cycle M is tested on a unique validation sample. We obtain out-of-sample results k1, k2, ..., kn.

Construct a scatter plot of m vs. k. We obtain an estimate of how a change in model quality on cross-validation predicts out-of-sample quality.

About predictor selection. If you cannot get such a huge amount of data, just give the model a unique set of predictors in each of the N cycles. You will then test whether model performance, as a function of the selected predictors, is consistent between test and validation. Roughly speaking, a model that is underfit on the test should give worse results on validation as well, and a model that is overfit on the test will give much worse results on validation.
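A minimal sketch of the whole nested scheme under the same assumptions (scikit-learn, a placeholder Ridge model, synthetic data): N cycles, each with its own training sample, validation sample, and predictor subset; the inner 10-fold cross-validation gives m_i, the outer validation gives k_i, and the m-vs-k correlation is the consistency estimate described above.

```python
import numpy as np
from sklearn.linear_model import Ridge                     # placeholder model
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X, y = rng.normal(size=(3000, 20)), rng.normal(size=3000)  # synthetic data

N = 30
m, k = [], []
for _ in range(N):
    cols = rng.choice(X.shape[1], size=5, replace=False)   # unique predictor set
    idx = rng.permutation(len(y))
    tr, va = idx[:1000], idx[1000:1500]                    # unique train / validation samples

    # Inner layer: mean quality over the 10 cross-validation test folds -> m_i.
    m.append(cross_val_score(Ridge(), X[np.ix_(tr, cols)], y[tr],
                             cv=KFold(n_splits=10)).mean())

    # Outer layer: fit on the whole training sample, score on the validation sample -> k_i.
    k.append(Ridge().fit(X[np.ix_(tr, cols)], y[tr]).score(X[np.ix_(va, cols)], y[va]))

# Scatter-plot m against k (or just compute the correlation) to see whether
# quality on cross-validation predicts out-of-sample quality.
print("m-vs-k correlation:", round(float(np.corrcoef(m, k)[0, 1]), 2))
```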

 
Alexey Burnakov:


I wasted half of 2015 on this illusion. Model validation should only be done on data that has NOTHING to do with the training, testing, and validation procedure. I'm too lazy to dig up the results of the relevant calculations. But because of rattle, which does exactly what you describe, I lost half a year.

 
SanSanych Fomenko:

Model validation should only be done on data that has NOTHING to do with the training, testing, and validation procedure.

Brrrrr.

But that is exactly what is done! Validation is performed on a deferred (held-out) sample - or rather, samples, if we're talking about the nested approach.

What illusion? This approach is in any case more objective than fitting a model on a single sample.

 
Alexey Burnakov:

Brrrrr.

But that is exactly what is done! Validation is performed on a deferred (held-out) sample - or rather, samples, if we're talking about the nested approach.

What illusion? This approach is in any case more objective than fitting a model on a single sample.

You know best.

Everything works for me. If I remove the noise predictors, a model trained on June data works on July data; and when I train a model on the July data, its training error on July is about the same as the prediction error I got in July from the model trained in June. This is what I call the absence of overfitting.
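A minimal sketch of that June/July check, with assumed names (X_june, y_june, X_july, y_july are hypothetical feature/target arrays with the noise predictors already removed; the Ridge model is a placeholder):

```python
from sklearn.linear_model import Ridge            # placeholder model
from sklearn.metrics import mean_squared_error

def overfitting_check(X_june, y_june, X_july, y_july):
    """Compare the out-of-sample error of a June-trained model on July data
    with the in-sample training error of a model trained on July itself.
    Similar values are what the post calls the absence of overfitting."""
    june_model = Ridge().fit(X_june, y_june)
    forward_error = mean_squared_error(y_july, june_model.predict(X_july))   # June model on July
    july_model = Ridge().fit(X_july, y_july)
    insample_error = mean_squared_error(y_july, july_model.predict(X_july))  # July model on July
    return forward_error, insample_error
```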

 
SanSanych Fomenko:

You know best.

Everything works for me. If I remove the noise predictors, a model trained on June data works on July data; and when I train a model on the July data, its training error on July is about the same as the prediction error I got in July from the model trained in June. This is what I call the absence of overfitting.

Let's say this works for you all the time, and not just on a 2-month example, which may be a special case.

What are you training it on - membership in a ZigZag leg? I don't rule out that this particular target learns consistently well, but belonging to a leg doesn't give you accurate entries. That's the problem. I can predict volatility a day ahead fairly accurately, but that gives me nothing in trading.

 
Alexey Burnakov:

Let's say this works for you all the time, and not just on a 2-month example, which may be a special case.

What are you training it on - membership in a ZigZag leg? I don't rule out that this particular target learns consistently well, but belonging to a leg doesn't give you accurate entries. That's the problem. I can predict volatility a day ahead fairly accurately, but that gives me nothing in trading.

Shortcomings of the target have nothing to do with the methodology for detecting model overfitting. I have completed several commissioned jobs using targets and predictors that were unknown to me. The result is the same everywhere once the noise predictors are removed.
 
SanSanych Fomenko:
Shortcomings of the target have nothing to do with the methodology for detecting model overfitting.

I think you are mistaken. Noisy labels create a dissonance between what you see on your test and what you will see in the future. It is precisely for such cases that all sorts of devices like nested validation are introduced. There are even approaches arguing that, out of several alternative models, you should choose the one that performs worse on the test.

The result is the same everywhere once the noise predictors are removed.

How did you determine this? Did you track the performance of your predictors on the then unknown future?
