Machine learning in trading: theory, models, practice and algo-trading - page 3523

 
Maxim Dmitrievsky #:

10 models (each is really two: a base model and a meta model).

And an immediately ready TS (trading system).


I run retraining in batches of 20-100 with different parameters. The markup has the biggest impact.

So I want to find a way to obtain the most correct markup.

What is being sought here is not the "correct" markup, but a markup that is convenient for training a model with specific settings.

But even so, it is quite realistic to go through 1kk (a million) variants as well.

 
Aleksey Vyazmikin #:

What is being sought here is not the "correct" markup, but a markup that is convenient for training a model with specific settings.

But even so, it is quite realistic to go through 1kk (a million) variants as well.

Well, the features are the same, but the markups are different, so you get different models.
 
Maxim Dmitrievsky #:
Well, the features are the same, but the markups are different, so you get different models.

Well, of course it makes sense. My method simply lets you set aside the model-settings factor.

If you claim that your CB (CatBoost) settings do not significantly affect the learning process, then share such a sample for reproduction; I would be interested to look at it.

In any case, this is all talk about randomness. As long as there is no way to quickly detect the model's "breakdown", it will drain quickly on new data.

 
Aleksey Vyazmikin #:

Well, of course it makes sense. My method simply lets you set aside the model-settings factor.

If you claim that your CB (CatBoost) settings do not significantly affect the learning process, then share such a sample for reproduction; I would be interested to look at it.

In any case, this is all talk about randomness. As long as there is no way to quickly detect the model's "breakdown", it will drain quickly on new data.

There are 2 models, not set up in the usual way, and 2 different datasets.
 
Maxim Dmitrievsky #:
There are 2 models, not set up in the usual way, and 2 different datasets.

Well, that's clear; I'm interested in the sample with the markup after the first model.

 
Aleksey Vyazmikin #:

Well, that's clear; I'm interested in the sample with the markup after the first model.

There are 2 models working at once. The second one is trained simply to trade / not to trade. I can upload it tomorrow, though I don't know what for :)
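The two-model scheme described here resembles meta-labeling: a base model predicts direction, and a second model learns only whether to act on that signal. A minimal sketch, assuming scikit-learn and synthetic data; the features, the markup rule, and the model choice are all illustrative, not the author's actual pipeline:

```python
# Sketch of a two-model (base + meta) scheme: the base model predicts
# trade direction; the meta model is trained on a second dataset, only
# on trade / don't trade, i.e. whether the base signal can be trusted.
# Synthetic data and model choice are assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                              # shared features
y_dir = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)   # direction markup

X_base, X_meta = X[:600], X[600:]           # disjoint sets to limit leakage
base = GradientBoostingClassifier().fit(X_base, y_dir[:600])

# Second markup: 1 where the base model was right on unseen data, else 0.
y_meta = (base.predict(X_meta) == y_dir[600:]).astype(int)
meta = GradientBoostingClassifier().fit(X_meta, y_meta)

# Inference: trade in the base direction only when the meta model allows.
direction = np.where(base.predict(X_meta) == 1, 1, -1)
signal = direction * meta.predict(X_meta)   # -1 sell, 0 stay out, 1 buy
print(np.bincount(signal + 1))              # counts of sell / flat / buy
```

A walk-forward split rather than a single holdout would be closer to real use; the point here is only the shape of the two markups.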
 
Maxim Dmitrievsky #:
There are 2 models working at once. The second one is trained simply to trade / not to trade. I can upload it tomorrow, though I don't know what for :)

Still relevant. I'm really interested, as my conclusions are quite different; perhaps the sample is just that different.

 
Aleksey Vyazmikin #:

Still relevant. I'm really interested, as my conclusions are quite different; perhaps the sample is just that different.

A separate file for each cluster.
Files:
output_csvs.zip  5242 kb
 

I managed to reduce entropy (logloss) through the markup by adding some "rules", that is, by combining ML and the TS at the logic level.

For example, with random partitioning, Accuracy was normal, but logloss left much to be desired:

{'learn': {'Accuracy': 0.8438783894823336, 'Logloss': 0.4787490774779375}, 'validation': {'Accuracy': 0.7420178799489144, 'Logloss': 0.5603823600397243}}

And with the new markup it comes out like this:

{'learn': {'Accuracy': 0.9840909090909091, 'Logloss': 0.12419709401710959}, 'validation': {'Accuracy': 0.9470899470899471, 'Logloss': 0.2028722652115128}}

I'm really happy; it's a real improvement. It was not for nothing that I brought up entropy.

{'learn': {'Accuracy': 0.9907674552798615, 'Logloss': 0.09702284179278793}, 'validation': {'Accuracy': 0.955585464333782, 'Logloss': 0.15982284254600834}}

:)
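The Accuracy/Logloss pairs above can be reproduced for any classifier's predicted probabilities with standard metrics. A minimal sketch, assuming scikit-learn and toy numbers (not the author's data):

```python
# Computing the two metrics quoted above from predicted probabilities.
# Toy values; they only illustrate how a decent Accuracy can coexist
# with a mediocre Logloss when probabilities are poorly calibrated.
import numpy as np
from sklearn.metrics import accuracy_score, log_loss

y_true = np.array([1, 0, 1, 1, 0, 1])
p = np.array([0.9, 0.2, 0.8, 0.6, 0.4, 0.45])   # predicted P(class = 1)

acc = accuracy_score(y_true, (p >= 0.5).astype(int))
ll = log_loss(y_true, p)    # cross-entropy; lower is better
print({'Accuracy': acc, 'Logloss': ll})
```

Accuracy only looks at the thresholded decision, while Logloss penalizes every probability that sits close to 0.5, which is why improving the markup can move Logloss much more than Accuracy.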

 
Maxim Dmitrievsky #:

I managed to reduce entropy (logloss) through the markup by adding some "rules", that is, by combining ML and the TS at the logic level.

For example, with random partitioning, Accuracy was normal, but logloss left much to be desired.

And with the new markup it comes out like this.

I'm really happy; it's a real improvement. It was not for nothing that I brought up entropy.

:)

Of course, it is possible to get such a result with random sampling. But by rough calculation you would need at least 10,000 restarts of the markup, given the sample length and the parameter ranges. And that is the minimum; judging by the probability of the same markup falling out, it is more like in the neighbourhood of a million.

That's why I wanted to find a fast way to check a markup, but doing it directly through entropy did not work, and checking through the model takes a long time.
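One cheap reading of "checking directly through entropy" is the Shannon entropy of the markup's label distribution, computed without training any model. The sketch below is an interpretation of that idea, not the author's code:

```python
# Shannon entropy (in nats) of a markup's empirical label distribution,
# as a fast model-free check. As noted above, in practice such a direct
# check may not correlate with the quality of the trained model.
import numpy as np

def label_entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

balanced = np.array([0, 1] * 500)            # 50/50 markup
skewed = np.array([0] * 900 + [1] * 100)     # 90/10 markup
print(label_entropy(balanced))               # ln(2) ≈ 0.693
print(label_entropy(skewed))                 # ≈ 0.325
```

Label entropy only measures class balance, not how learnable the markup is, which may be why the direct check fails while the slower check through a trained model works.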