Machine learning in trading: theory, models, practice and algo-trading - page 2412

 
mytarmailS:

Alexei, you should have learned Python or R and tried to code there... Believe me, a thousand questions would disappear...

A fine wish; it would also be useful to learn a couple of foreign languages and master every other competence, so as not to depend on other people. However, I am not talented at everything, and I realize that I will not achieve high results in coding, while too much effort would be spent on it.

mytarmailS:

What's the point of testing the effectiveness of feature selection methods if they have already been tested and work? Otherwise they wouldn't exist.

Here it was more about the effectiveness of the method, i.e. how much it can eventually improve the result compared to feeding in the sample without excluding predictors. That is, an actual experiment.


mytarmailS:

The problem is not in feature selection but in the features themselves: if you feed in 10 indicators, you can select until you're blue in the face and you will get the same result from ANY selection algorithm...

I'm in the neighborhood of 5k predictors right now, which is why this approach is interesting.

mytarmailS:

Did you listen to the video? They're selecting among tens of thousands of features, and they even mention GMDH (Group Method of Data Handling), where they talk about generating and trying billions of features.

That's what's worth talking about: the systems that generate millions of ideas and test them automatically. That's the essence of it, those are the individual solutions; feature selection is a small final part of the process and there's nothing interesting in it. You just take any algorithm and go ahead; there's nothing to discuss, it's simply not interesting.

I do work with a large number of features, and I'm developing methods for generating them from a pattern: binarizing features, perhaps preserving different indicators inside the new predictor, which would turn 5,000 features into 50,000. These then need to be studied for mutual relations in order to create new, saturated features, from which the model is finally built.
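For illustration, binarization of this kind could look roughly as follows (a sketch only; the quantile grid and names are my assumptions, not the actual code being discussed):

    import numpy as np
    import pandas as pd

    def binarize(df: pd.DataFrame, n_thresholds: int = 10) -> pd.DataFrame:
        """Expand every numeric feature into n_thresholds binary features
        of the form (x > quantile q), so ~5,000 columns become ~50,000."""
        out = {}
        # interior quantiles only, to avoid all-zero / all-one columns
        qs = np.linspace(0.0, 1.0, n_thresholds + 2)[1:-1]
        for col in df.columns:
            for q in qs:
                out[f"{col}_gt_q{q:.2f}"] = (df[col] > df[col].quantile(q)).astype(np.int8)
        return pd.DataFrame(out, index=df.index)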

All in all, I don't know where such a primitive view of my activity comes from...

 
Aleksey Vyazmikin:

All in all, I don't know where such a primitive view of my activity comes from...

Alexey, don't you understand that all your 5k binary features can be replaced by 2-3 principal components, i.e. 2-3 features, and that's it)) But you have to do it in order to know...

Also, you don't understand that your whole cool model with 5k features can be just one feature among hundreds of others for a higher-ranked model, and that in turn will be a feature for an even higher-ranked model...

This is my current thinking.


Read Ivakhnenko's GMDH: so many elaborated and deep concepts. When I read it I feel like a first-grader in ML...

 
mytarmailS:
Alexey, how can you not understand that all your 5k binary features can be replaced by 2-3 principal components, i.e. 2-3 features, and that's it)) But you have to do it in order to know...

Where do you get such conclusions about what I do or don't understand? I haven't touched the topic of GMDH because I have no real experience applying it. Are you willing to shrink my features down to 2-3? I'd be interested to see it and compare it with my approach. Since you already have it all honed, I don't think it would be a problem, would it?
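For reference, the compression being proposed can be tried in a few lines of scikit-learn (a sketch on random stand-in data; whether 2-3 components actually retain what matters for classification is exactly the open question here):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    # stand-in for a matrix of 5k binary features
    X_bin = (rng.random((1000, 5000)) > 0.5).astype(np.float32)

    pca = PCA(n_components=3)
    X_small = pca.fit_transform(X_bin)           # (1000, 3)
    print(pca.explained_variance_ratio_.sum())   # variance kept by 3 components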

mytarmailS:
Also, don't you realize that your whole cool model with 5k features can be just one feature among hundreds of others for a higher-ranked model, and that in turn will be a feature for an even higher-ranked model...

This is my current thinking.

I've been putting this into practice for a long time: pulling leaves out of models, which serve as the saturated component for more global models.
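A sketch of this "leaves as features" idea using CatBoost's calc_leaf_indexes (synthetic data; the actual pipeline discussed here isn't public, so this is only an assumed shape of it, and calc_leaf_indexes availability depends on the catboost version):

    import numpy as np
    from catboost import CatBoostClassifier, Pool

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 20))
    y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).astype(int)
    X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], y[:1500], y[1500:]

    # level-0 model on the raw features
    base = CatBoostClassifier(iterations=100, depth=4, verbose=False)
    base.fit(X_tr, y_tr)

    # "pull the leaves out": one leaf index per tree per sample,
    # used as categorical inputs for a more global (level-1) model
    leaves_tr = base.calc_leaf_indexes(Pool(X_tr))
    leaves_te = base.calc_leaf_indexes(Pool(X_te))

    meta = CatBoostClassifier(iterations=100, verbose=False,
                              cat_features=list(range(leaves_tr.shape[1])))
    meta.fit(leaves_tr, y_tr)
    print(meta.score(leaves_te, y_te))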

A lot of what I've come up with turns out to have other names and ready implementations for general use, but when you do everything from scratch, you gain an underlying understanding of how and why it works, not just the theory.

 
mytarmailS:
Read the same Ivakhnenko GMDH: so many elaborated and deep concepts. When I read it I feel like a first-grader in ML...

I have things to do: there are already enough ideas to verify, I need to code and test them.

 
mytarmailS:

Also, you don't realize that your cool model with 5k features can be just one feature among hundreds of others for a higher-ranked model, which in turn will be a feature for an even higher-ranked model...

A fan of the movie The Matrix?

 

I've been thinking about how to improve the method of selecting predictors/features by analyzing the resulting model.

I've sketched out some ideas for implementing the algorithm, but decided to share them with the esteemed community first: perhaps, before work on the implementation begins, there will be constructive criticism or additions/clarifications. Even a reasoned argument that nothing will come of it would be interesting.


Selecting predictors by their frequency of use (Feature importance) when creating a CatBoost model

The idea is that each algorithm has its own peculiarities in building trees, and we will select the predictors that the algorithm, in this case CatBoost, uses most often.

However, to evaluate uniformity over time, we will use multiple sub-samples and combine their results into a single table. This approach allows us to sift out random events that strongly influence the choice of a predictor in one particular model. The regularities on which the model is built should occur throughout the whole sample, which should make correct classification on new data easier. This matters for market data, i.e. incomplete data, including data with hidden cyclicality that is event-driven rather than temporal. At the same time, it is desirable to penalize predictors that fall out of the top 30%-50% on any one of the segments; this selects the predictors that are most consistently in demand when building models on different time segments.

Also, to reduce the randomness factor, models with different Seed values should be used; I think there should be between 25 and 100 such models. Whether a coefficient depending on the quality of the obtained model should be added, or whether all results should simply be averaged across predictors, I don't know yet, but I think we should start with the simple option, i.e. a plain average.
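A sketch of how this aggregation might be coded (the segment splitting, iteration count and the hard zeroing of penalized features are my assumptions; the post itself leaves the form of the penalty open):

    import pandas as pd
    from catboost import CatBoostClassifier

    def select_by_frequency(segments, n_seeds=25, top_share=0.5):
        """segments: list of (X, y) time slices of one sample.
        Train one model per (segment, seed), put all feature importances
        into a single table, average them, and penalize any feature that
        falls out of the top `top_share` on at least one segment."""
        imp = {}
        for seg_id, (X, y) in enumerate(segments):
            for seed in range(n_seeds):
                m = CatBoostClassifier(iterations=300, random_seed=seed,
                                       verbose=False)
                m.fit(X, y)
                imp[(seg_id, seed)] = m.get_feature_importance()
        table = pd.DataFrame.from_dict(imp, orient="index")
        table.index = pd.MultiIndex.from_tuples(table.index,
                                                names=["segment", "seed"])

        mean_imp = table.mean()                          # plain average over all models
        seg_mean = table.groupby(level="segment").mean()
        k = int(table.shape[1] * top_share)
        in_top = (seg_mean.rank(axis=1, ascending=False) <= k).all()
        return mean_imp * in_top.astype(float)           # zero out penalized features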

The question of the quantization table is important; it may play a decisive role in predictor selection. If the table is not fixed, each model will create its own table for its sub-sample, which makes the results incomparable, so the table should be common to all samples.

There are several ways to obtain a quantization table:

  1. Set CatBoost hyperparameters for the type and number of quantization partitions over the entire training sample, and save the resulting borders to csv.
  2. Set CatBoost hyperparameters for the type and number of quantization partitions on one selected area of the sample, say the best one, and save the results to csv.
  3. Obtain a table using a separate script that picks the best variants from a set of tables.
The previously obtained table is then forced on each sample by loading it during training (a sketch of this follows below).
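With the Python package this forcing could be sketched as follows (save_borders and the input_borders training parameter exist in recent catboost builds, with CLI equivalents --output-borders-file / --input-borders-file; exact availability depends on the version):

    import numpy as np
    from catboost import CatBoostClassifier

    rng = np.random.default_rng(0)
    X_full = rng.normal(size=(5000, 50))
    y_full = (X_full[:, 0] > 0).astype(int)

    # 1) build one common border table on the whole training sample
    ref = CatBoostClassifier(iterations=50, border_count=254,
                             feature_border_type="Median", verbose=False)
    ref.fit(X_full, y_full)
    ref.save_borders("borders.tsv")        # the quantization table

    # 2) force every per-segment model to reuse that table so their
    #    feature importances remain comparable
    seg = CatBoostClassifier(iterations=300, input_borders="borders.tsv",
                             verbose=False)
    seg.fit(X_full[:2000], y_full[:2000])  # e.g. the first time segment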
 
Maxim Dmitrievsky:

You can hook SHAP values up to the boosting and see the interaction of features in the output on any data; that's for those who like to dig through the dirty laundry, like Alexey :) There are also similar libraries, like LIME, that don't depend on a particular model. Of course, if you analyze hundreds of meaningless features, any such venture is doomed. It's plain drudgery, and no one will do it for you for free, because it is an incredible time-killer with a known outcome.
 
Maxim Dmitrievsky:
You can hook SHAP values up to the boosting and see the interaction of features in the output on any data; that's for those who like to dig through the dirty laundry, like Alexey :)

The question of metrics is open; there are different options and you need to experiment to find which indicator works better: the impact on the model, the number of splits, the number of correctly classified examples after a split. The question is whether they are used correctly for the task at hand. By the way, as far as I remember, SHAP values could not be used in the early command-line builds, but one can write a script for visualization.
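For completeness, with the Python package SHAP values come straight out of get_feature_importance (a sketch on synthetic data):

    import numpy as np
    from catboost import CatBoostClassifier, Pool

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = (X[:, 0] - X[:, 3] > 0).astype(int)

    model = CatBoostClassifier(iterations=200, verbose=False)
    model.fit(X, y)

    # one row per object: per-feature contributions plus the expected value
    shap = model.get_feature_importance(data=Pool(X, y), type="ShapValues")
    print(shap.shape)                         # (1000, 11)
    print(np.abs(shap[:, :-1]).mean(axis=0))  # mean |contribution| per feature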

 
Maxim Dmitrievsky:
Of course, if you analyze hundreds of meaningless features, any such venture is doomed. It's plain drudgery, and it is unlikely anyone will do it for you for free, because it is an incredible time-killer with a known outcome.

Why the pessimism? The point is precisely to generate a set of features that are in theory suitable for any target/basic strategy, and then select the best of them for a particular target.

Do you doubt there will be a gain in classification quality after these manipulations?
 
Aleksey Vyazmikin:

Why the pessimism? The point is precisely to generate a set of features that are in theory suitable for any target/basic strategy, and then select the best of them for a particular target.

Do you doubt there will be a gain in classification quality after these manipulations?
I don't see the full picture of why it should work.