Machine learning in trading: theory, models, practice and algo-trading - page 3179

 

I asked boost developers what to do with multicollinearity and feature selection (preprocessing).

I got an unambiguous answer: dude, just forget it :)

 
Maxim Dmitrievsky #:

And why would I ask, when the conversion from floats to ints is needed mainly for speed on very large data?

As a side effect, the model may end up slightly recalibrated, for better or worse, depending on luck.
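To make the float-to-int point concrete, here is a minimal sketch. It assumes the conversion means binning (quantizing) each feature into integer bin indices. The function names and the sample values are hypothetical and do not come from any particular boosting library.

```python
from bisect import bisect_right


def quantile_borders(values, n_bins):
    """Pick up to n_bins - 1 borders at evenly spaced ranks of the sorted data."""
    ordered = sorted(values)
    borders = []
    for k in range(1, n_bins):
        border = ordered[k * len(ordered) // n_bins]
        # Skip repeated borders so every bin stays distinct.
        if not borders or border > borders[-1]:
            borders.append(border)
    return borders


def quantize(values, borders):
    """Replace each float with the integer index of the bin it falls into."""
    return [bisect_right(borders, v) for v in values]


# Hypothetical feature column, for illustration only.
feature = [0.11, 0.52, 0.48, 0.95, 0.13, 0.70, 0.33, 0.81]
borders = quantile_borders(feature, n_bins=4)
codes = quantize(feature, borders)
print(borders)  # [0.33, 0.52, 0.81]
print(codes)    # [0, 2, 1, 3, 0, 2, 1, 3]
```

Values that fall into the same bin become indistinguishable to the model. That is one plausible reason the border placement can shift the result slightly in either direction.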

They will just give you the same answer, so you're probably afraid to ask because it would devalue all your years of hard work :)

How can an answer to a question devalue my work if I judge it by the result?

Yes, the result is not spectacular in terms of metric improvement, but it is there, and it shows up in other ways too.

For example, there are samples where, without preprocessing with my method, I could not get a model that was profitable on new data at all.

 
Maxim Dmitrievsky #:

I asked the boost developers what to do about multicollinearity and feature selection (preprocessing).

To which I got an unambiguous answer: dude, just forget it :)

And if there are a billion features, should we just forget about it? Or should we still select the ones that are not correlated?
 
mytarmailS #:
And if there are a billion features, should we just forget about it? Or should we still select the ones that are not correlated?
As you wish.
 
Maxim Dmitrievsky #:
As you wish.
More out of necessity: they have to be filtered, there is no other option.
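A minimal sketch of the kind of filtering meant here, assuming a plain Pearson-correlation threshold. The thread does not say which method anyone actually uses, and the feature names, the values and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated (an assumption)
    return cov / sqrt(vx * vy)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold with every kept one."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical features, for illustration only.
features = {
    "ret_1": [0.1, -0.2, 0.3, 0.0, -0.1, 0.2],
    "ret_1_x2": [0.2, -0.4, 0.6, 0.0, -0.2, 0.4],  # exact multiple of ret_1
    "range": [1.0, 1.0, 3.0, 2.0, 3.0, 2.0],
}
selected = drop_correlated(features)
print(selected)  # ['ret_1', 'range']
```

The number of pairwise comparisons grows quadratically with the number of features, so a filter like this is only practical for moderate feature counts.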
 
Maxim Dmitrievsky #:

I asked the boost developers what to do about multicollinearity and feature selection (preprocessing).

I got an unambiguous answer: dude, just forget it :)

There is functionality there that few people discuss at all, judging by how little information there is about real examples of its use.

For example, the ability to group predictors and give them weights. I also see potential there for improving the model, but I am not able to experiment with it: it requires a lot of searching.

And it is not certain that the person who originally kept the whole project in his head is still working on it; quite possibly it is now colleagues and others who improve the algorithm's execution speed and fix bugs. And small features get added now and then.

 
mytarmailS #:
More out of necessity: they have to be filtered, there is no other option.
It's probably easier not to make a billion features in the first place
 
Aleksey Vyazmikin #:

There is functionality there that few people discuss at all, judging by how little information there is about real examples of its use.

For example, the ability to group predictors and give them weights. I also see potential there for improving the model, but I am not able to experiment with it: it requires a lot of searching.

And it is not certain that the person who originally kept the whole project in his head is still working on it; quite possibly it is now colleagues and others who improve the algorithm's execution speed and fix bugs. And small features get added now and then.

This is all optional
 
Maxim Dmitrievsky #:
It's probably easier not to make a billion features in the first place.
To realise that a feature is bad you have to check it, to check it you have to have the feature, and to have the feature you have to create it....
 
mytarmailS #:
To realise that a feature is bad you have to check it, to check it you have to have the feature, and to have the feature you have to create it....
You can't have all the features in the world.