Machine learning in trading: theory, models, practice and algo-trading - page 596

 
Aleksey Terentev:
Try cross-validation (K-fold).
How will it help to increase the impact of fresh data?
 
elibrarius:
How does it help strengthen the impact of fresh data?
Think about it: you train the model by feeding it separate blocks of data, which gives it some independence from the time-series order, so the new data will be evaluated without "bias".
 
Aleksey Terentev:
Think about it: you train the model by feeding it separate blocks of data, which gives it some independence from the time-series order, so the new data will be evaluated without "bias".

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

And the question is how, with shuffling, to increase the importance of the freshest data, so that the model picks up new market trends faster.

 
elibrarius:

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

And the question is how, with shuffling, to increase the importance of the freshest data, so that the model picks up new market trends faster.

Pre-training is done on old data. The final stages of training are conducted on new data.
 

So, training in two steps?
Train on a large amount of data, then fine-tune the resulting model on fresh data.
Worth trying.


I had another idea: just add the fresh data to the overall training set 2-3 times. Even with shuffling, its weight will increase.
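The two-step scheme discussed above (pretrain on old data, then fine-tune on fresh data) can be sketched with any model that supports incremental fitting. A minimal sketch assuming scikit-learn's `SGDClassifier`; the synthetic data and the number of fine-tuning passes are purely illustrative:

```python
# Two-stage training sketch: pretrain on a large block of old data,
# then run a few extra passes over the fresh data only, so the newest
# examples have the last word on the weights. Illustrative data.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_old = rng.normal(size=(1000, 5))
y_old = (X_old[:, 0] > 0).astype(int)
X_new = rng.normal(size=(100, 5))
y_new = (X_new[:, 0] > 0).astype(int)

model = SGDClassifier(random_state=0)

# Stage 1: pretrain on the old data.
model.partial_fit(X_old, y_old, classes=np.array([0, 1]))

# Stage 2: fine-tune on the fresh data only.
for _ in range(3):
    model.partial_fit(X_new, y_new)

print(model.score(X_new, y_new))
```

Whether the extra passes help or cause catastrophic forgetting of the old regime depends on the data; that is exactly the trade-off the thread is debating.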

 
elibrarius:

So I thought, if everything is shuffled, how can we make the fresh data have a stronger effect on the training?

There is a trick of duplicating the most recent training examples several times.
And, for example, the gbm package lets you set an importance weight for each training example. That's not a neural network, I just gave it as an example.
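The post refers to R's gbm package; per-example weights exist in other boosting libraries too. A minimal sketch of the same idea with scikit-learn's `sample_weight` (the data and the weight of 3 are illustrative assumptions, chosen to mimic adding the fresh rows three times):

```python
# Up-weighting the freshest examples instead of physically duplicating them.
# Assumes scikit-learn; R's gbm has an analogous `weights` argument.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Give the last 50 (freshest) rows triple weight -- the same effect
# as duplicating them 3x in the training set, without the copies.
w = np.ones(len(X))
w[-50:] = 3.0

model = GradientBoostingClassifier(random_state=0)
model.fit(X, y, sample_weight=w)
print(model.score(X, y))
```

Weighting survives shuffling, which is the property being asked for: the order of examples changes, but each example's contribution to the loss does not.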


elibrarius:

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

Most models have no notion of sequence dependence at all. In neural networks, for example, an error is computed for each training example, and the sum of all the errors drives the weight updates. That sum does not change when the summands are reordered.

But models often have a batch.size parameter or something similar, which controls what fraction of the training data is used per step. If you take a very small fraction and turn off shuffling, the model will see the same small subset every time, and everything will end badly. I don't know about darch specifically, but turning off shuffling by itself shouldn't cause a complete failure; something must be wrong with your other parameters.
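The order-invariance claim above is easy to check numerically: for a full-batch gradient, permuting the training examples permutes the summands but leaves the sum unchanged. A toy NumPy sketch (linear model with squared error, made-up data):

```python
# Full-batch gradient of squared error for a linear model:
# the sum over examples does not depend on their order.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = rng.normal(size=200)
w = rng.normal(size=3)

def batch_grad(X, y, w):
    # d/dw sum_i (x_i . w - y_i)^2 = 2 * X^T (X w - y)
    return 2 * X.T @ (X @ w - y)

perm = rng.permutation(len(X))
g1 = batch_grad(X, y, w)
g2 = batch_grad(X[perm], y[perm], w)
print(np.allclose(g1, g2))  # True: reordering the summands changes nothing
```

With mini-batches the story differs, of course: the *composition* of each batch now depends on the ordering, which is exactly why a tiny unshuffled batch keeps seeing the same subset.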


Aleksey Terentev:
Try cross validation (K-fold).

I fully support that. Whatever loud claims a model's author makes about its resistance to overfitting, only k-fold will show whether they are true.
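For the record, k-fold cross-validation just means scoring the model on k held-out folds so that every example is used for validation exactly once, and an overfit model cannot hide behind one lucky train/test split. A minimal sketch with scikit-learn (model and data are placeholder assumptions; for actual time series a walk-forward split may be more appropriate than a shuffled one):

```python
# K-fold cross-validation: train on k-1 folds, score on the held-out fold,
# repeat k times, then average the out-of-fold scores. Illustrative data.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(int)

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print(np.mean(scores))  # average out-of-fold accuracy
```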

 
It's time for you to wrap up and draw a conclusion.
And show us some practice.
 
Alexander Ivanov:
It's time for you to wrap up and draw a conclusion.
And show us some practice.

Coming soon... "almost done."

I have never done anything this hardcore in my life.

 
Maxim Dmitrievsky:

coming soon... "almost done."

I've never done anything this hardcore in my life.

Heh, rubbing my hands in anticipation of trying the demo 😀👍👍👍👍 — like a fresh, delicious grandma's pie 😂😀
 
IMHO, of course, but every page of this thread should start with SanSanych's slogan: "garbage in, garbage out". All your cognitive and creative talents should first of all be aimed at reducing the garbage at the input, and only then at pushing the computer hardware to its limits.