Machine learning in trading: theory, models, practice and algo-trading - page 1304

 
Maxim Dmitrievsky:

10% error on test and train with ~10k examples; it grows smoothly as the sample size increases.

With such an error the models started to behave differently on new data on validation, so you have to go through the variants.

I'm not revealing the algorithms any more, just chatting.

Suspiciously small. Perervenko didn't reach such a result in his articles even on zigzags.

It's also suspicious that the test and the train have 10% each, while the validation is "different". I.e. higher, or what? The worst should be the test, not the validation.

 
elibrarius:

Suspiciously small. Perervenko didn't reach such a result in his articles even on zigzags.

It's also suspicious that the test and the train have 10% each, while the validation is "different". I.e. higher, or what? The worst should be the test, not the validation.

"Even on zigzags" )))

the worst should be only the validation, which was not involved in training in any way, not even indirectly
 
Maxim Dmitrievsky:

"Even on zigzags" )))

the worst should be only the validation, which was not involved in training in any way, not even indirectly
But why not the test set? It, too, "was not involved in training in any way, not even indirectly".
 
elibrarius:
But why not the test set? It, too, "was not involved in training in any way, not even indirectly".

the test set is always indirectly involved in training; take the same CatBoost, for example...

 

Ah... Or maybe we just name the datasets differently.

I call them:

1 train
2 valid - what many packages use for control during training and for early stopping. There it is called valid.
3 test - for evaluating the system on new data.

You must have called the 2nd dataset the test set.
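(For reference, a minimal Python sketch of this three-way split; the names, proportions and synthetic data are illustrative only, not something from the thread.)

# Illustrative split: train fits the model, valid controls training /
# early stopping, test is kept aside for the final honest estimate.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                      # ~10k examples, as in the discussion
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, shuffle=False)
X_valid, X_test, y_valid, y_test = train_test_split(X_rest, y_rest, test_size=0.5, shuffle=False)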

 
Maxim Dmitrievsky:

the test set is always indirectly involved in training; take the same CatBoost, for example...

I don't know CatBoost. Here is a quote from the XGBoost documentation:

early_stopping_rounds
If NULL, the early stopping function is not triggered. If set to an integer k, training with a validation set will stop if the performance doesn't improve for k rounds.
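(As an illustration, a hedged sketch of the same early-stopping mechanism in the Python xgboost API; the quote above is from the R package, but the parameter name is the same there. Data and sizes are synthetic.)

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

dtrain = xgb.DMatrix(X[:6000], label=y[:6000])
dvalid = xgb.DMatrix(X[6000:8000], label=y[6000:8000])   # watched set, drives early stopping
dtest  = xgb.DMatrix(X[8000:],    label=y[8000:])        # never shown to the training loop

booster = xgb.train(
    params={"objective": "binary:logistic", "eval_metric": "error"},
    dtrain=dtrain,
    num_boost_round=1000,
    evals=[(dtrain, "train"), (dvalid, "valid")],
    early_stopping_rounds=50,   # stop if "valid" error does not improve for 50 rounds
)

# Because "valid" chose the stopping point, it is indirectly involved in training;
# only the untouched "test" set gives an unbiased estimate.
pred = (booster.predict(dtest) > 0.5).astype(int)
print("test error:", float(np.mean(pred != y[8000:])))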

 
elibrarius:

Ah... Or maybe we just name the datasets differently.

I call them:

1 train
2 valid - what many packages use for control during training and for early stopping. There it is called valid.
3 test - for evaluating the system on new data.

You must have called the 2nd dataset the test set.

In my opinion it's the other way round: validation is the new data, where, as they write...

Well, never mind, you get the idea.

https://tech.yandex.com/catboost/doc/dg/concepts/cli-reference_train-model-docpage/

-t, --test-set

A comma-separated list of input files that contain the validation dataset description (the format must be the same as used in the training dataset).

Default value: Omitted. If this parameter is omitted, the validation dataset isn't used.


)))) you can call it whatever you like
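(For comparison, in the CatBoost Python API the same dataset is passed to fit() as eval_set, while the CLI, as quoted above, accepts it via -t / --test-set. A minimal sketch with synthetic data; names and sizes are illustrative.)

import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + rng.normal(size=10_000) > 0).astype(int)

model = CatBoostClassifier(iterations=1000, eval_metric="Accuracy", verbose=False)
model.fit(
    X[:8000], y[:8000],
    eval_set=(X[8000:], y[8000:]),   # the set the CLI docs above call the validation dataset (--test-set)
    early_stopping_rounds=50,        # the stopping point is chosen on this set
)
print("best iteration:", model.get_best_iteration())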

 
Maxim Dmitrievsky:

In my opinion it's the other way round: validation is the new data, where, as they write...

Well, never mind, you get the idea.

At first I didn't understand you, because we use different terms.

It would be better to stick to one terminology.

 
elibrarius:

At first I didn't understand you, because we use different terms.

It would be better to stick to one terminology.

Show me the documentation of any package where the second dataset (the one used for training control and/or early stopping) is called test rather than validation.

I showed you above, and here's more

https://tech.yandex.com/catboost/doc/dg/concepts/output-data_training-log-docpage/

 
I've seen it. )
In general, it's just confusion over terminology.