Machine learning in trading: theory, models, practice and algo-trading - page 2627

 
elibrarius #:
I was comparing several ways of assessing feature importance. As a benchmark I took the most resource-intensive one: retraining the model while removing features one at a time.
The fast methods don't agree with the benchmark, and they don't agree with each other. fselector is even faster; I suspect it won't match anything either.
Cool...
Now compute your importance on market data, with 500 rows and 1000 features...
In 20 years, tell me what you got.

And what does this have to do with the problem of features changing over time?
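
A minimal sketch of the "remove features one by one" benchmark described above (drop-column importance), assuming a scikit-learn style setup; the model, scoring scheme, and names are placeholders, not anyone's actual code:

```python
# Drop-column importance: retrain with each feature removed and measure
# how much the cross-validated score degrades. Everything here
# (model, CV scheme, names) is an assumption for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def drop_column_importance(X: pd.DataFrame, y, cv=5):
    baseline = cross_val_score(
        RandomForestClassifier(random_state=0), X, y, cv=cv).mean()
    importance = {}
    for col in X.columns:
        score = cross_val_score(
            RandomForestClassifier(random_state=0),
            X.drop(columns=[col]), y, cv=cv).mean()
        importance[col] = baseline - score  # big drop => important feature
    return importance
```

With 1000 features this means 1000 full retrains, which is exactly why it serves only as a benchmark for the fast methods.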
 
mytarmailS #:

Feature importance in a moving window (indicators and prices)

At one moment an indicator may be 10% important, and at another 0.05%; such is the truth of life)

If you think that cross-validation solves everything, it's time to blush...

It's not clear what cross-validation has to do with it.
The data in the sliding window is used for each model.
Cross-validation is used to reconcile the training results of several models trained on different chunks of the data.
Models on non-sliding-window data can also be trained on different chunks of that data, which gives you cross-validation as well.
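
To make the "importance drifts in a moving window" point concrete, a sketch that recomputes importance in a sliding window so the drift becomes visible; the window and step sizes are assumptions:

```python
# Impurity-based importance recomputed in a sliding window, to watch a
# feature's weight drift over time. Window/step sizes are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rolling_importance(X: np.ndarray, y: np.ndarray, window=500, step=100):
    rows = []
    for start in range(0, len(X) - window + 1, step):
        model = RandomForestClassifier(random_state=0)
        model.fit(X[start:start + window], y[start:start + window])
        rows.append(model.feature_importances_)
    return np.array(rows)  # one row of importances per window position
```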
 
elibrarius #:
It's not clear what cross-validation has to do with it.
The data in the sliding window is used for each model.
Cross-validation is used to reconcile the training results of several models trained on different chunks of the data.
Models on non-sliding-window data can also be trained on different chunks of that data, which gives you cross-validation as well.

The point here is that a sliding window of fixed width does not solve the problem. A better idea is to increase the number of runs per dimension, changing the window width at each step. And there's the curse of dimensionality again)))
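
A sketch of that suggestion, under the same assumptions as the sliding-window example above: rerun the measurement with a different window width each time; the widths and step are placeholders:

```python
# Repeat the sliding-window importance run for several window widths.
# Widths, step, and model are assumptions for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def importance_by_width(X, y, widths=(250, 500, 1000), step=100):
    result = {}
    for w in widths:
        rows = []
        for start in range(0, len(X) - w + 1, step):
            m = RandomForestClassifier(random_state=0)
            m.fit(X[start:start + w], y[start:start + w])
            rows.append(m.feature_importances_)
        result[w] = np.array(rows)
    return result  # width -> matrix of per-window importances
```

The number of runs multiplies with every extra width, which is the "curse" the post jokes about.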

 
elibrarius #:
It's not clear what cross-validation has to do with it.
The data in the sliding window is used for each model.
Cross-validation is used to reconcile the training results of several models trained on different chunks of the data.
Models on non-sliding-window data can also be trained on different chunks of that data, which gives you cross-validation as well.
Not awake yet?))
If you understand that feature importance is highly variable, then there is no point in cross-validation; that's what I wrote, what's not clear?
 
mytarmailS #:
Cool...
Now compute your importance on market data, with 500 rows and 1000 features...

In 20 years, tell me what you got.
A test on small data shows that the fast methods don't work well.
What is the purpose of measuring importance? So that, by removing the unimportant features, the model can be trained faster in the future without losing quality. That is just fine-tuning of data and a model that already work. And neither you nor I (I assume) have anything to tune yet.

So I just train the model. The model itself will use the important features and ignore the unimportant ones.
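
A sketch of that pruning use case: drop the features a trained model barely uses, then check that quality holds; the 1% cutoff and the model choice are assumptions:

```python
# Prune low-importance features and compare cross-validated quality
# before and after. The cutoff and model choice are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def prune_and_compare(X: np.ndarray, y, cutoff=0.01, cv=5):
    full = RandomForestClassifier(random_state=0).fit(X, y)
    keep = full.feature_importances_ >= cutoff
    full_score = cross_val_score(
        RandomForestClassifier(random_state=0), X, y, cv=cv).mean()
    slim_score = cross_val_score(
        RandomForestClassifier(random_state=0), X[:, keep], y, cv=cv).mean()
    return full_score, slim_score, int(keep.sum())
```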

 
mytarmailS #:
Not awake yet?))
If you understand that feature importance is highly variable, then there is no point in cross-validation; that's what I wrote, what's not clear?
Awake)
I disagree.
Cross-validation is a way to throw out a model that happens to be successful on one chunk of history: testing it on a few other chunks may show that it won't work there.
It is precisely cross-validation that shows that the features and the model are floating.
You see this "float" through another method; I see it through cross-validation.
 
I don't use pure cross-validation myself, but walk-forward. That is, the folds don't wrap around in a circle; the window only shifts forward.
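
A minimal sketch of that walk-forward scheme using scikit-learn's TimeSeriesSplit, where the training data always precedes the test data and nothing wraps around; the model and metric are assumptions:

```python
# Walk-forward evaluation: each split trains on the past and tests on
# the block that follows it. Model and metric are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import TimeSeriesSplit

def walk_forward_scores(X: np.ndarray, y: np.ndarray, n_splits=5):
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = RandomForestClassifier(random_state=0)
        model.fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))
    return scores  # one out-of-sample score per forward step
```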
 
Valeriy Yastremskiy #:

The point here is that a sliding window of fixed width does not solve the problem. A better idea is to increase the number of runs per dimension, changing the window width at each step. And there's the curse of dimensionality again)))

Damn it all, the sun is out; time to put on swimming trunks and head to the garden

 
elibrarius #:
A test on small data shows that the fast methods don't work well.
What is the purpose of measuring importance? So that, by removing the unimportant features, the model can be trained faster in the future without losing quality. That is just fine-tuning of data and a model that already work. And neither you nor I (I assume) have anything to tune yet.

So I just train the model. The model itself will use the important features and ignore the unimportant ones.

And what if I want to build a neural network that generates quality features as its output?
I'm sure this hasn't even occurred to you, yet you've already drawn all the conclusions for me.
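
One plausible reading of a "network that generates features" is a bottleneck autoencoder whose hidden layer becomes the new features; a very rough sketch under that assumption, with the sizes, solver, and the MLPRegressor choice all being placeholders:

```python
# Autoencoder-style feature generator: train a network to reconstruct X
# and take its bottleneck activations as new features. The whole setup
# (this reading of the idea, sizes, ReLU, solver) is an assumption.
import numpy as np
from sklearn.neural_network import MLPRegressor

def generated_features(X: np.ndarray, n_new=10):
    ae = MLPRegressor(hidden_layer_sizes=(n_new,), activation='relu',
                      max_iter=2000, random_state=0)
    ae.fit(X, X)  # target == input: learn to reconstruct
    # recompute the hidden (bottleneck) layer by hand: ReLU(X W + b)
    return np.maximum(0.0, X @ ae.coefs_[0] + ae.intercepts_[0])
```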
 
mytarmailS #:
And what if I want to build a neural network that generates quality features as its output?
I'm sure this hasn't even occurred to you, yet you've already drawn all the conclusions for me
It hasn't occurred to me. I draw conclusions only after running my own experiments. Good luck with yours.
As for cross-validation (walk-forward), you still haven't explained why it's bad. My experiments show it is a working method for weeding out bad models/ideas.
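
As a usage note, reusing walk_forward_scores from the sketch above: one simple rejection rule; the 0.5 floor assumes a balanced binary up/down target:

```python
# Reuses walk_forward_scores from the earlier sketch; the 0.5 floor is
# an assumption for a balanced binary up/down label.
scores = walk_forward_scores(X, y, n_splits=5)
if min(scores) < 0.5:
    print("reject: worse than chance on at least one forward step")
```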