Machine learning in trading: theory, models, practice and algo-trading - page 2804
I'm doing it now, including for the forum thread, to see if it makes sense for that sample.
It doesn't
There's no point
You think that sample is hopeless?
CatBoost randomly selects a subset of predictors at each split or tree-building iteration (it depends on the settings), which means that strongly correlated predictors have a better chance of getting into that random draw; i.e. it is not the individual predictors that benefit, but the information they carry.
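For reference, a minimal sketch of the setting presumably being referred to: CatBoost's rsm parameter (random subspace method) controls the fraction of features sampled each time a split is chosen. The numbers are illustrative, not from the thread.

from catboost import CatBoostClassifier

# Sample a random 50% of the features every time a split is selected,
# so correlated predictors take turns carrying the same information.
model = CatBoostClassifier(
    iterations=500,
    depth=6,
    rsm=0.5,
    random_seed=42,
    verbose=False,
)
# model.fit(X_train, y_train)  # X_train, y_train are placeholders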
Yeah, and the creators of boosting don't know that...
They also don't know that features can be filtered by correlation))) How would they know, the method is only about 50 years old))))
Do you really believe that you know more than they do?
Do you think that sample is hopeless?
Sure... Boost takes it all into account.
And don't give me a hard time, I'm probably younger than you.)
You think that sample is hopeless?
https://datascience.stackexchange.com/questions/12554/does-xgboost-handle-multicollinearity-by-itself
Decision trees are by nature immune to multicollinearity. For example, if you have two features
that are 99% correlated, the tree will choose only one of them when deciding on a split. Other models,
such as logistic regression, will use both features.
Since boosted trees use individual decision trees, they are also unaffected by multicollinearity.
========
You can use this approach: evaluate the importance of each feature and keep only the best features for your final model.
Which is actually what I was telling you earlier.
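A minimal sketch of the approach described in the quoted answer (the function name and parameters are illustrative, not from the thread): train a booster, rank the features by importance, and keep only the top ones for the final model.

import pandas as pd
import xgboost as xgb

def keep_top_features(X: pd.DataFrame, y, top_n: int = 20) -> pd.DataFrame:
    # Fit a first-pass booster just to obtain feature importances.
    booster = xgb.XGBClassifier(n_estimators=200, max_depth=4)
    booster.fit(X, y)
    importance = pd.Series(booster.feature_importances_, index=X.columns)
    # Keep the top_n most important features for the final model.
    best = importance.sort_values(ascending=False).head(top_n).index
    return X[best]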
Yeah, and the creators of boosting don't know that...
They also don't know that features can be filtered by correlation))) How would they know, the method is only about 50 years old))))
Do you really believe that you know more than they do?
Sure... Boost takes it all into account.
And don't give me a hard time, I'm probably younger than you.)
I analyse the results of the models and I see that they grab highly correlated predictors, for example time-based predictors, even when they differ only by a small time lag.
I think they know all this perfectly well, but they are not obliged to spell out truisms that are decades old....
As for the formal versus informal "you": I think it's best to let everyone address the other person however suits them, as long as it isn't offensive and doesn't get in the way of constructive dialogue.
https://datascience.stackexchange.com/questions/12554/does-xgboost-handle-multicollinearity-by-itself
Decision trees are by nature immune to multicollinearity. For example, if you have two features
that are 99% correlated, the tree will choose only one of them when deciding on a split. Other models,
such as logistic regression, will use both features.
Since boosted trees use individual decision trees, they are also unaffected by multicollinearity.
========
You can use this approach: evaluate the importance of each feature and keep only the best features for your final model.
Which is actually what I was telling you earlier.
That's just it: yes, it will choose one, but how many times will that choice be repeated....
Besides, CatBoost differs from XGBoost in a number of ways, and the results differ across samples; on average CatBoost is faster and even better, but not always.
Plus, I have my own method of grouping similar predictors and picking the best one from each group, and I need correlation as a control group...
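As an illustration of the correlation "control group" idea (a generic sketch, not the poster's own grouping method): go through the predictors and keep only one representative from each cluster of highly correlated columns.

import pandas as pd

def drop_correlated(X: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    # Absolute pairwise correlations between predictors.
    corr = X.corr().abs()
    keep = []
    for col in corr.columns:
        # Keep the column only if it is not strongly correlated
        # with anything already kept.
        if all(corr.loc[col, kept] < threshold for kept in keep):
            keep.append(col)
    return X[keep]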
CatBoost randomly selects a subset of predictors at each split or tree-building iteration (it depends on the settings), which means that strongly correlated predictors have a better chance of getting into that random draw; i.e. it is not the individual predictors that benefit, but the information they carry.
Are you sure it picks predictors at random? I haven't used CatBoost; I was looking at the code of basic boosting examples. All the predictors are used there, i.e. the best one is taken. The correlated one will be right next to it, only slightly worse. But at some other split levels, or in the correcting trees, another of the correlated predictors may turn out to be better.
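A quick sanity check of that claim (my own toy example, not from the thread): with two nearly identical features and no feature subsampling, the booster leans on one of them and the other contributes very little.

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
x1 = rng.normal(size=5000)
x2 = x1 + rng.normal(scale=0.01, size=5000)  # ~99% correlated copy of x1
y = (x1 + rng.normal(scale=0.5, size=5000) > 0).astype(int)

X = np.column_stack([x1, x2])
model = xgb.XGBClassifier(n_estimators=100, max_depth=3, colsample_bytree=1.0)
model.fit(X, y)
print(model.feature_importances_)  # one of the two features typically dominates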