Machine learning in trading: theory, models, practice and algo-trading - page 1304

 
Maxim Dmitrievsky:

10% error on test and train with ~10k examples; it grows smoothly as the sample size increases.

With such an error the models started to behave differently on new data on validation, so you have to go through the variants.

I'm not revealing the algorithms any more, just chatting.

Suspiciously small. Perervenko didn't reach such a result in his articles even on zigzags.

It's also suspicious that the test and the train have 10% each, while the validation is "different". I.e. higher, or what? The worst should be the test, not the validation.

 
elibrarius:

Suspiciously small. Perervenko didn't reach such a result in his articles even on zigzags.

It's also suspicious that the test and the train have 10% each, while the validation is "different". I.e. higher, or what? The worst should be the test, not the validation.

"Even on zigzags" )))

the worst should be only the validation, which was not involved in training in any way, not even indirectly
 
Maxim Dmitrievsky:

"Even on zigzags" )))

the worst should be only the validation, which was not involved in training in any way, not even indirectly
But why not the test set? It, too, "was not involved in training in any way, not even indirectly".
 
elibrarius:
But why not the test set? It, too, "was not involved in training in any way, not even indirectly".

the test set is always indirectly involved in training; take the same CatBoost, for example...

 

Ah... Or maybe we just name the datasets differently.

I call them:

1 train
2 valid - what many packages use for control during training and for early stopping. There it is called valid.
3 test - for evaluating the system on new data.

You must have called the 2nd dataset the test set.
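(For reference, a minimal Python sketch of this three-way split; the names, proportions and synthetic data are illustrative only, not something from the thread.)

# Illustrative split: train fits the model, valid controls training /
# early stopping, test is kept aside for the final honest estimate.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                      # ~10k examples, as in the discussion
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, shuffle=False)
X_valid, X_test, y_valid, y_test = train_test_split(X_rest, y_rest, test_size=0.5, shuffle=False)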

 
Maxim Dmitrievsky:

the test set is always indirectly involved in training; take the same CatBoost, for example...

I don't know CatBoost. Here is a quote from the XGBoost documentation:

early_stopping_rounds
If NULL, the early stopping function is not triggered. If set to an integer k, training with a validation set will stop if the performance doesn't improve for k rounds.
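(As an illustration, a hedged sketch of the same early-stopping mechanism in the Python xgboost API; the quote above is from the R package, but the parameter name is the same there. Data and sizes are synthetic.)

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

dtrain = xgb.DMatrix(X[:6000], label=y[:6000])
dvalid = xgb.DMatrix(X[6000:8000], label=y[6000:8000])   # watched set, drives early stopping
dtest  = xgb.DMatrix(X[8000:],    label=y[8000:])        # never shown to the training loop

booster = xgb.train(
    params={"objective": "binary:logistic", "eval_metric": "error"},
    dtrain=dtrain,
    num_boost_round=1000,
    evals=[(dtrain, "train"), (dvalid, "valid")],
    early_stopping_rounds=50,   # stop if "valid" error does not improve for 50 rounds
)

# Because "valid" chose the stopping point, it is indirectly involved in training;
# only the untouched "test" set gives an unbiased estimate.
pred = (booster.predict(dtest) > 0.5).astype(int)
print("test error:", float(np.mean(pred != y[8000:])))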

 
elibrarius:

Ah... Or maybe we just name the datasets differently.

I call them:

1 train
2 valid - what many packages use for control during training and for early stopping. There it is called valid.
3 test - for evaluating the system on new data.

You must have called the 2nd dataset the test set.

In my opinion it's the other way round: validation is the new data, where, as they write...

Well, never mind, you get the idea.

https://tech.yandex.com/catboost/doc/dg/concepts/cli-reference_train-model-docpage/

-t, --test-set

A comma-separated list of input files that contain the validation dataset description (the format must be the same as used in the training dataset).

Default value: Omitted. If this parameter is omitted, the validation dataset isn't used.


)))) you can call it whatever you like
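(For comparison, in the CatBoost Python API the same dataset is passed to fit() as eval_set, while the CLI, as quoted above, accepts it via -t / --test-set. A minimal sketch with synthetic data; names and sizes are illustrative.)

import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

model = CatBoostClassifier(iterations=1000, eval_metric="Accuracy", verbose=False)
model.fit(
    X[:8000], y[:8000],
    eval_set=(X[8000:], y[8000:]),   # the set the CLI docs above call the validation dataset (--test-set)
    early_stopping_rounds=50,        # the stopping point is chosen on this set
)
print("best iteration:", model.get_best_iteration())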

 
Maxim Dmitrievsky:

In my opinion it's the other way round: validation is the new data, where, as they write...

Well, never mind, you get the idea.

At first I didn't understand you, because we use different terms.

It would be better to stick to one terminology.

 
elibrarius:

At first I didn't understand you, because we use different terms.

It would be better to stick to one terminology.

Show me the documentation of any package where the second dataset (the one used for training control and/or early stopping) is called test rather than validation.

I showed you above, and here's more

https://tech.yandex.com/catboost/doc/dg/concepts/output-data_training-log-docpage/

 
I've seen it. )
In general, it's just confusion over terminology.