Machine learning in trading: theory, models, practice and algo-trading - page 2611

 
Renat Akhtyamov #:

99% probability or 0.99 without percentage

You're a scary man!

To have that kind of probability and still talk to mere mortals...? That can't be real...

 
Serqey Nikitin #:

You're a scary man!

To have that kind of probability and still talk to mere mortals...? That can't be real...

come on ;)

The point is that no matter how hard traders try, in the end most of them end up trading against the trend.

Look at the distribution of volumes on the CME (they are published online in near real time) alongside the behaviour of the price.

Again it says just one thing: the price moves against the majority.

They buy, the price goes down, and vice versa.

So it has been and so it will always be,

because:

https://www.mql5.com/ru/forum/86386/page2605#comment_28636383

 

It's not a good idea to build a strategy on information from the CME,

because

as soon as they notice, they are quite capable of publishing misleading data.

Been there, done that ;)

 

It turns out to be a kind of boosting, as Alexei pointed out.

Improvement at each iteration, measured on the exam sample:

Iteration: 0, R^2: 0.187883200953193
Iteration: 1, R^2: 0.23135332833695177
Iteration: 2, R^2: 0.5069635195005324
Iteration: 3, R^2: 0.6549692113098968
Iteration: 4, R^2: 0.49450581772674385
Iteration: 5, R^2: 0.727741771152099
Iteration: 6, R^2: 0.7155342473909062
Iteration: 7, R^2: 0.7577880020333465
Iteration: 8, R^2: 0.7519731839574526
Iteration: 9, R^2: 0.6484696911159258
Iteration: 10, R^2: 0.7919754252032625
Iteration: 11, R^2: 0.7434806103697286
Iteration: 12, R^2: 0.7829611167594436
Iteration: 13, R^2: 0.8423847977639594
Iteration: 14, R^2: 0.8755566220080022
Iteration: 15, R^2: 0.8073736447495541
Iteration: 16, R^2: 0.7756062175823373
Iteration: 17, R^2: 0.8767667338484959
Iteration: 18, R^2: 0.8658089653482818
Iteration: 19, R^2: 0.7976304450279426
Iteration: 20, R^2: 0.8335757510984808
Iteration: 21, R^2: 0.8236019726095158
Iteration: 22, R^2: 0.8590437311223307
Iteration: 23, R^2: 0.8425455355207566
Iteration: 24, R^2: 0.7897953478024325

But the backtest (on the left) is not good, although sometimes it comes out better.

There are many settings; I won't go into detail. I have described the idea as best I could.
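The shape of that log can be reproduced with any residual-fitting loop. A minimal sketch of boosting with regression stumps, logging R^2 on a held-out "exam" sample each round (synthetic data and all names are my illustration, not the author's setup):

```python
import numpy as np

def fit_stump(x, residual):
    """Depth-1 regression tree: the single threshold split that minimizes SSE."""
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda z: np.where(z <= t, lv, rv)

def r2(y, p):
    return 1 - ((y - p) ** 2).sum() / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 400)
y = np.sin(x) + rng.normal(0, 0.1, 400)
x_tr, y_tr = x[:300], y[:300]        # training sample
x_ex, y_ex = x[300:], y[300:]        # held-out "exam" sample

pred_tr, pred_ex = np.zeros(300), np.zeros(100)
history = []
for it in range(25):
    stump = fit_stump(x_tr, y_tr - pred_tr)   # each round fits the current residuals
    pred_tr += 0.5 * stump(x_tr)
    pred_ex += 0.5 * stump(x_ex)
    history.append(r2(y_ex, pred_ex))
    print(f"Iteration: {it}, R^2: {history[-1]}")
```

As in the log above, exam R^2 climbs noisily rather than monotonically: each stump helps on the training residuals but only usually on the exam sample.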


 

And if you let it run for 100 iterations:


 
Maxim Dmitrievsky #:
Regularity implies repeatability. You're not looking for a pattern, you're fitting to validation.
Your algorithm does not take into account the repeatability of the dependencies it finds, hence it does not check whether there is a pattern at all...

Here's a simple example.
You have a sample of 100 observations.
You can build 100 rules, each used once per forecast, or find one rule that is used 100 times...

Which approach should you bet on?
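The repeatability argument can be quantified: a rule's win-rate estimate is only as trustworthy as the number of times the rule fired. A quick way to see it is a confidence lower bound on the success rate (the Wilson interval is my choice of illustration, not from the thread):

```python
import math

def wilson_lower(k, n, z=1.96):
    """Lower bound of the 95% Wilson score interval for a success rate of k wins in n tries."""
    if n == 0:
        return 0.0
    p = k / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

# one rule that fired 100 times with 60 wins vs a rule that fired once and won
print(wilson_lower(60, 100))  # ~0.50: some evidence of an edge
print(wilson_lower(1, 1))     # ~0.21: a single hit proves nothing
```

A rule used once has a lower bound far below 50% no matter how well it did, which is the sense in which 100 single-use rules carry no checkable pattern.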

 
Maxim Dmitrievsky #:

It turns out to be a kind of boosting, as Alexei pointed out.

Improvement at each iteration, measured on the exam sample.

But the backtest (on the left) is not good, although sometimes it comes out better.

There are many settings; I won't go into detail. I have described the idea as best I could.


Basically, you just need to look at two charts (equity curves), both on pure OOS: 1 - the first model, trained without any extra features; 2 - after all the described procedures. You can also use PF, RF and winrate metrics. As it stands, the effect is unclear; the beautiful learning curve is, as I understand it, on IS?
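For reference, the metrics named in the reply can all be computed from a list of per-trade returns. A minimal sketch (function names are mine; RF is taken here as net profit over maximal drawdown):

```python
def profit_factor(returns):
    """PF: gross profit divided by gross loss."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return float("inf") if losses == 0 else gains / losses

def recovery_factor(returns):
    """RF: net profit divided by the maximal equity drawdown."""
    equity = peak = max_dd = 0.0
    for r in returns:
        equity += r
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    return float("inf") if max_dd == 0 else equity / max_dd

def winrate(returns):
    """Share of profitable trades."""
    return sum(1 for r in returns if r > 0) / len(returns) if returns else 0.0

trades = [0.5, -0.2, 0.3, -0.1, 0.4]
print(profit_factor(trades))    # ~4.0
print(recovery_factor(trades))  # ~4.5
print(winrate(trades))          # 0.6
```

Comparing these numbers on pure OOS before and after the filtering procedure is exactly the check being asked for.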

 
Replikant_mih #:

In fact, you just need to look at two charts (equity curves), both on pure OOS: 1 - the first model, trained without any frills; 2 - after all the described procedures. You can also use PF, RF and winrate metrics. As it stands, the effect is unclear; the beautiful learning curve is, as I understand it, on IS?

The first third of the graph is new data that was not involved in training.

The pictures with 25 and 100 iterations show an improvement at 100, although the maximum was around 70.
 
Maxim Dmitrievsky #:

Here's a question:

Two models are used: one predicts whether to buy or sell, the other whether to trade or not to trade.

First the first model is trained; then we look at where it predicts poorly, mark those examples as "don't trade" and the remaining good ones as "trade", and teach this to the second model.

The first model is tested not only on the training area but also on an additional area, and the second model is trained on both areas.

We repeat this several times, retraining both models on the same dataset. The results gradually improve on these samples, but not always on the control sample.

In parallel we keep a log of bad trades, cumulative over all passes: every "don't trade" example is collected in it for training the second model and filtered by a principle like "the more copies of a bad trade across all passes, the higher the chance of marking it as 'don't trade'".

For example, for each date some number of bad trades accumulates over all training iterations; where that number exceeds a threshold (say, the mean), those trades are marked "do not trade". The rest are kept, otherwise with many training iterations it would be possible to exclude all trades.

A coefficient lets you adjust the number of trades at the output: the lower it is, the more trades are filtered out.
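The cumulative log plus mean threshold plus coefficient step can be sketched in a few lines (the data and names here are hypothetical, just to pin down the mechanics):

```python
from collections import Counter

def mark_no_trade(bad_log, coef=1.0):
    """bad_log: dates of bad trades accumulated over all training passes.
    Dates whose bad-trade count exceeds coef * mean count are marked 'do not trade';
    lowering coef filters out more trades."""
    counts = Counter(bad_log)
    threshold = coef * (sum(counts.values()) / len(counts))
    return {d for d, c in counts.items() if c > threshold}

log = ["d1", "d1", "d1", "d2", "d3", "d3", "d4"]   # d1 was bad in 3 passes, d3 in 2, etc.
print(sorted(mark_no_trade(log, coef=1.0)))        # ['d1', 'd3']
print(sorted(mark_no_trade(log, coef=0.5)))        # lower coef -> more dates excluded
```

With coef = 1.0 only the repeat offenders are excluded; pushing coef down drags the threshold below every count, which is the "could exclude all trades" failure mode mentioned above.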

...by this point I am already tired of writing...

How can such a combination of models be improved so that it also improves its results on a new, independent section?
Is there any philosophy as to why this might work, other than that the models naturally improve each other (the error drops) with each round of retraining? And how do we get rid of the fitting?

Illustration: the graph is split into three parts. The last part trains the first model, the penultimate and last parts train the second, and the first third is the exam sample. Naturally, the last section looks best and the first third worst.

Here there were 15 iterations of retraining both models using the bad-trades log.
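The whole two-model procedure can be condensed into a toy sketch: model 1 picks a direction, its failures over several passes feed a bad-trade log, and model 2 learns to allow trading only where model 1 tends to be right. Everything here (the data, both "models", the split sizes) is my simplification, not the author's code:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy data: direction label is sign(x + noise), so it is predictable only when |x| is large
n = 600
x = rng.normal(0.0, 1.0, n)
y = (x + rng.normal(0.0, 0.8, n) > 0).astype(int)

a, b = 300, 400        # [0:a) trains model 1; [0:b) trains model 2; [b:n) is the exam area

def fit_m1(xs, ys):
    # model 1 (buy/sell): threshold halfway between the class means
    t = (xs[ys == 1].mean() + xs[ys == 0].mean()) / 2
    return lambda z: (z > t).astype(int)

# several passes: refit model 1 on a bootstrap each time and log where it predicts badly
bad = np.zeros(b)
for _ in range(5):
    i = rng.integers(0, a, a)
    m1 = fit_m1(x[i], y[i])
    bad += m1(x[:b]) != y[:b]

ok = bad <= bad.mean()   # "trade" label: examples model 1 got right in most passes

def fit_m2(xs, ok):
    # model 2 (trade / don't trade): allow trading when |x| is above a learned threshold
    cands = np.quantile(np.abs(xs), np.linspace(0.0, 0.9, 19))
    accs = [((np.abs(xs) > t) == ok).mean() for t in cands]
    t = cands[int(np.argmax(accs))]
    return lambda z: np.abs(z) > t

m2 = fit_m2(x[:b], ok)

pred, mask = m1(x[b:]), m2(x[b:])
acc_all = (pred == y[b:]).mean()
acc_filtered = (pred[mask] == y[b:][mask]).mean()
print(acc_all, acc_filtered)   # filtering by model 2 raises accuracy on the exam area
```

The filter helps on the exam area here only because the toy data has a genuine, stable region of predictability (large |x|); when the "bad" regions are bad merely by chance, the same loop reduces to fitting the validation set, which is exactly the objection raised above.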

Sounds like trivial multi-label classification: it's not the combination of models that should be varied but the combination of predictors, first of all splitting the predictors into features of smart-money vs retail actions... of course there will be contrary signals, but OTF entry points (for breakouts of levels) are already an edge for choosing the model (DTF or OTF action in the market)... imho

==========

Or without labelling, just with an LSTM and its forget gate, so you don't have to filter with a separate second model... but it's all a matter of taste...

[chart: ibm]

I got a regression on IBM (test data from late 2021; the right tail of the price chart appears in both the train and test charts)... simply on Close...

[chart: pred]

...What I've really got is a trivial MA, and it will always work in a trend (one way or another), not in a flat market; smart vs retail behaviour should be filtered additionally (and the model redesigned to classify entries and exits...)
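The "it's really just an MA" point is easy to demonstrate with a trivial baseline: on a random-walk price, predicting the next close with the last close already scores a very high R^2 while carrying no information about returns. A sketch on synthetic data (my construction, not the IBM experiment):

```python
import numpy as np

rng = np.random.default_rng(7)
close = 100 + np.cumsum(rng.normal(0, 1, 1000))   # synthetic random-walk "price"

target = close[1:]
naive = close[:-1]                                # persistence forecast: tomorrow = today

ss_res = ((target - naive) ** 2).sum()
ss_tot = ((target - target.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
print(r2)   # close to 1, although the forecast says nothing about the next move's direction
```

Any smoother that tracks the level (an MA, or an LSTM trained on Close) inherits this flattering R^2, which is why a price-level regression looks good in a trend and useless in a flat market.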

 
JeeyCi #:

Sounds like trivial multi-label classification: it's not the combination of models that should be varied but the combination of predictors, first of all splitting the predictors into features of smart-money vs retail actions... of course there will be contrary signals, but OTF entry points (for breakouts of levels) are already an edge for choosing the model (DTF or OTF action in the market)... imho

==========

Or without labelling, just with an LSTM and its forget gate, so you don't have to filter with a separate second model... but it's all a matter of taste...

I got a regression on IBM (test data from late 2021; the right tail of the price chart appears in both the train and test charts)... simply on Close... What I've really got is a trivial MA, and it will always work in a trend (one way or another), not in a flat market; smart vs retail behaviour should be filtered additionally (and the model redesigned to classify entries and exits...)

It's not multi-label; the meaning is different: exclude bad signals iteratively, keep in the common pile those that the main model predicts well, and let the second model learn to separate the bad from the good, forbidding or allowing the first one to trade.

An LSTM always ends up producing an MA; I tested that a long time ago.
