Machine learning in trading: theory, models, practice and algo-trading - page 2215

 

I have returned to the question of visualizing a CatBoost model in order to assess its prospects.

This is what the model looks like on the training sample:

The x-axis shows the probability produced by the logistic function, split into bins of width 0.05; the y-axis shows the percentage of values that fall into each bin:

- Razdel (blue): all values in the sample.

- Target=1 (magenta): examples with target 1.

- Target=0 (aqua): examples with target 0.

- Balans+ (light blue): the financial result of profitable examples relative to all profits and losses; scaled to fit the chart.

- Balans- (brick): the financial result of losing examples relative to all profits and losses; scaled to fit the chart.

- Circles: the scaled balance value, read against the zero level of the x-axis; added for clarity.

Vertical aqua line: the maximum of Target=0.

Vertical magenta line: the maximum of Target=1.

Vertical red line: the conventional 0.5 threshold that CatBoost uses by default to classify into 1 and 0; shown for clarity.
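
For anyone who wants to rebuild a chart like this, here is a minimal Python sketch of the per-bin aggregation described in the legend. It is not the code behind the screenshots. The names `bin_statistics` and `peak_edge`, the field names, and the normalization are all assumptions: each target series is divided by its own count, and the balance series by total profit plus total absolute loss.

```python
def bin_statistics(probs, targets, results, step=0.05):
    """Aggregate model outputs into probability bins of width `step`.

    probs   - predicted probability of class 1 for each example
    targets - true class (0 or 1) for each example
    results - financial result of each example (positive = profit)
    """
    n_bins = round(1 / step)
    bins = [{"all": 0, "t1": 0, "t0": 0, "profit": 0.0, "loss": 0.0}
            for _ in range(n_bins)]
    for p, t, r in zip(probs, targets, results):
        b = bins[min(int(p * n_bins), n_bins - 1)]  # p == 1.0 goes to the last bin
        b["all"] += 1
        b["t1" if t == 1 else "t0"] += 1
        if r > 0:
            b["profit"] += r
        else:
            b["loss"] -= r  # store losses as positive magnitudes

    n_all = max(len(probs), 1)
    n_t1 = max(sum(b["t1"] for b in bins), 1)
    n_t0 = max(sum(b["t0"] for b in bins), 1)
    turnover = sum(b["profit"] + b["loss"] for b in bins) or 1.0

    return [{
        "left_edge": i * step,
        "razdel_pct": 100.0 * b["all"] / n_all,
        "target1_pct": 100.0 * b["t1"] / n_t1,
        "target0_pct": 100.0 * b["t0"] / n_t0,
        "balans_plus": b["profit"] / turnover,
        "balans_minus": b["loss"] / turnover,
        "delta": (b["profit"] - b["loss"]) / turnover,  # what the circles show
    } for i, b in enumerate(bins)]


def peak_edge(rows, key):
    """Left edge of the bin where a series peaks (the vertical lines)."""
    return max(rows, key=lambda row: row[key])["left_edge"]
```

Calling `peak_edge(rows, "target1_pct")` and `peak_edge(rows, "target0_pct")` gives the positions of the magenta and aqua vertical lines under these assumptions.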

I assume that the further the aqua and magenta lines are from the red vertical line, the more confidently the model has separated the classes. The balance lines are also worth watching: on the training sample they are likewise spread to opposite sides. This matters especially for models where profits and losses differ in size. Such a model may filter out small losses well yet lose money on large ones, even though its classification accuracy is above 0.5.
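
A small worked example of the last point, with made-up numbers: seven wins of +1 and three losses of -3 give a hit rate of 0.7 but a total result of -2, because with these trade sizes the break-even hit rate is 0.75. The helper names are hypothetical, and the hit rate of taken trades is used here as a stand-in for classification accuracy.

```python
def hit_rate_and_total(results):
    """Share of profitable trades and the summed financial result."""
    wins = sum(1 for r in results if r > 0)
    return wins / len(results), sum(results)


def break_even_hit_rate(avg_win, avg_loss):
    """Hit rate p at which expected profit is zero: p*avg_win = (1-p)*avg_loss."""
    return avg_loss / (avg_win + avg_loss)


# Hypothetical trades: 7 small wins of +1 and 3 large losses of -3.
results = [1.0] * 7 + [-3.0] * 3
hit_rate, total = hit_rate_and_total(results)                    # 0.7 and -2.0
needed = break_even_hit_rate(avg_win=1.0, avg_loss=3.0)          # 0.75
```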

Next, let's look at the test sample:

The red and magenta vertical lines have moved closer together, but their relative position has not changed, which is already good (sometimes the magenta line ends up in the <0.5 zone). The balance lines have also moved closer together, which is somewhat disappointing. There is a loss-making area above probability 0.5, which points to insufficient model quality.

Next, we can look at the results on the exam sample.

On the right side (probability above 0.5) the situation looks better than on the test sample. This may mean that the test sample covers an atypical period with few similar examples in training, or that the model is undertrained. The second explanation is supported by the fact that below probability 0.5 there are regions where the Balans+ line, which shows positive financial results, crosses the Balans- line. The same can be seen from the circles, which in effect show the delta between profit and loss in a given probability range.

Now let's take a look at the balance on the exam sample.

You can clearly see that the character of the market has changed, which is visible at about 2/3 of the graph. The model needs further study.

And here is an example of a clearly bad model:

Already on the training sample we can see a strong shift of the whole distribution to the left, i.e. the model knows very little about the sample: recall is low, and the peak of the target 1 accumulation lies in the left part of the probability range. Note that there is still a profit on the training sample.

Let's look at the test and exam samples:

Already on the test sample all lines beyond probability 0.5 are heavily intertwined, and on the exam sample the balance lines have swapped places.

 
mytarmailS:


Basically, I have an empty network (I train it only to initialize it, because it is not self-written but comes from a package).

I can come up with any abstraction, any target, and write a fitness function for it.

Then I let the genetic algorithm change the network weights so that on both train and test I (the network) get something as close as possible to my goal.

And this is a thousand times "deeper" than creating labels and fitting a regression or classification yourself.

You have gone back two years, to when we were criticizing the training of neural networks through the MT5 optimizer.

And I wrote bots like that. It is ordinary optimization with a lot of parameters.

Read more here:

https://www.mql5.com/ru/articles/497

Neural networks: from theory to practice
  • www.mql5.com
These days probably every trader has heard of neural networks and knows how cool they are. Most people imagine those who understand them as something close to superhuman. In this article I will try to explain how a neural network is built and what can be done with it, and show practical examples of its use. The concept of neural networks...
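
To make the approach under discussion concrete (network weights treated as plain parameters and searched by a genetic algorithm against a user-written fitness function), here is a minimal pure-Python sketch. It is not the code of either poster. The tiny network, the profit-style fitness, and every name and default value (`forward`, `fitness`, `evolve`, `pop_size`, `sigma`) are assumptions made for illustration.

```python
import math
import random


def forward(w, x, n_hidden):
    """Tiny network: len(x) inputs -> n_hidden tanh units -> linear output."""
    n_in = len(x)
    out = 0.0
    for h in range(n_hidden):
        base = h * (n_in + 1)  # weights of hidden unit h, bias stored last
        s = w[base + n_in] + sum(w[base + j] * x[j] for j in range(n_in))
        out += w[n_hidden * (n_in + 1) + h] * math.tanh(s)
    return out


def fitness(w, X, returns, n_hidden):
    """Hypothetical goal: total profit when trading the sign of the output."""
    return sum((1 if forward(w, x, n_hidden) > 0 else -1) * r
               for x, r in zip(X, returns))


def evolve(X, returns, n_hidden=3, pop_size=30, generations=40, sigma=0.3, seed=0):
    """Genetic search over weight vectors: elitism, uniform crossover, mutation."""
    rng = random.Random(seed)
    size = n_hidden * (len(X[0]) + 1) + n_hidden
    pop = [[rng.gauss(0, 1) for _ in range(size)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, X, returns, n_hidden), reverse=True)
        elite = pop[: pop_size // 5]  # the best fifth survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(elite), rng.choice(elite)
            # Each gene comes from one of the two parents, plus Gaussian noise.
            children.append([rng.choice(pair) + rng.gauss(0, sigma)
                             for pair in zip(a, b)])
        pop = elite + children
    return max(pop, key=lambda w: fitness(w, X, returns, n_hidden))
```

Any other goal can be swapped in by replacing `fitness`. To score on train and test together, one option (again an assumption) is to return the smaller of the two fitness values. Either way this remains, as noted above, ordinary optimization over many parameters.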
 
Maxim Dmitrievsky:

You have gone back two years, to when we were criticizing the training of neural networks through the MT5 optimizer.

And I wrote bots like that. It is ordinary optimization with a lot of parameters.

Read more here:

https://www.mql5.com/ru/articles/497

Well, yes, but I only tried it with max profit; you can train on something else.

Listen, if you're not too lazy, try training CatBoost on max profit. I'm not sure it works there.

There you have to feed in X (the data) and Y (the target).

Maybe all this "customization" is just a cosmetic modification of the existing functions.
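
For context on the question above: CatBoost's Python package accepts a user-defined objective as an object with a `calc_ders_range(approxes, targets, weights)` method that returns, per example, the first and second derivative of the loss with respect to the raw prediction. Total profit as such has no useful per-example gradient, so the sketch below only reweights the usual logloss derivatives by a per-example weight. The class name and the idea of passing the absolute trade result as that weight are assumptions, not a tested recipe for "max profit".

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class ProfitWeightedLogloss:
    """Logloss derivatives scaled by a per-example weight.

    Follows the calc_ders_range protocol for custom objectives; the weights
    are assumed to be supplied as sample weights (e.g. |trade result|).
    """

    def calc_ders_range(self, approxes, targets, weights):
        ders = []
        for i in range(len(targets)):
            p = sigmoid(approxes[i])          # approxes are raw scores (logits)
            w = 1.0 if weights is None else weights[i]
            der1 = (targets[i] - p) * w       # first derivative
            der2 = -p * (1.0 - p) * w         # second derivative
            ders.append((der1, der2))
        return ders
```

An instance would be passed as `loss_function=ProfitWeightedLogloss()` together with an explicit `eval_metric`. Note that this should behave like the built-in Logloss with sample weights, which rather supports the suspicion that such customization stays within the familiar X and Y scheme.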

 
Aleksey Vyazmikin:

I have returned to the question of visualizing a CatBoost model in order to assess its prospects.

I think such large studies are better posted in a blog, with a copy left here. In half a year you won't be able to find it here...
 
elibrarius:
I think such large studies are better posted in a blog, with a copy left here. In half a year you won't be able to find it here...

Maybe. I just don't use the blog, so the idea never occurred to me.

I'm thinking that all these points from the graph (20 per curve) could be put into a sample and a model trained on them; perhaps that way models with potential stability could be identified with higher probability.

 
Aleksey Vyazmikin:

Maybe. I just don't use the blog, so the idea never occurred to me.

I'm thinking that all these points from the graph (20 per curve) could be put into a sample and a model trained on them; perhaps that way models with potential stability could be identified with higher probability.

What will the target be? How do you label each example? Or by self-learning?
 
elibrarius:
What will the target be? How do you label each example? Or by self-learning?

The target will be the financial result of the model on the exam sample.
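
A rough sketch of how such a sample could be assembled: one row per trained model, with the chart points as features and the exam-sample financial result as the target. The layout of the `charts` dictionary, the curve names, and the choice to take curves only from the training and test charts (so the exam sample is not used on the feature side) are assumptions, not something stated in the thread.

```python
CURVES = ("razdel", "target1", "target0", "balans_plus", "balans_minus")


def model_features(charts, n_points=20):
    """Flatten the per-bin curves of one model into a single feature row."""
    row = []
    for sample in ("train", "test"):
        for name in CURVES:
            points = charts[sample][name]
            if len(points) != n_points:
                raise ValueError(f"{sample}/{name}: expected {n_points} points")
            row.extend(points)
    return row


def build_meta_dataset(models):
    """X: one feature row per model; y: its financial result on the exam sample."""
    X = [model_features(m["charts"]) for m in models]
    y = [m["exam_result"] for m in models]
    return X, y
```

With five curves, two samples and 20 points per curve, each model becomes a row of 200 features, so a useful meta-model would need a fairly large collection of trained models.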

 
mytarmailS:

Well, yes, but I only tried it with max profit; you can train on something else.

Listen, if you're not too lazy, try training CatBoost on max profit. I'm not sure it works there.

There you have to feed in X (the data) and Y (the target).

Maybe all this "customization" is just a cosmetic modification of the existing functions.

I'm too lazy to write new metrics for now... and when I do, it certainly won't be max profit but something more meaningful,

e.g. Lyapunov stability ))

 
Maxim Dmitrievsky:

I'm too lazy to write new metrics for now... and when I do, it certainly won't be max profit but something more meaningful,

e.g. Lyapunov stability ))

You don't even need that, or density ))). In our business it's a rarity.

 
Aleksey Vyazmikin:

I have returned to the question of visualizing a CatBoost model in order to assess its prospects.

Yes, distributions usually show everything. You can simply build the same kind of plots for features versus the target, without any boosting, and see it right away.
