Machine learning in trading: theory, models, practice and algo-trading - page 596

 
Aleksey Terentev:
Try cross-validation (K-fold).
How will it help to increase the impact of fresh data?
 
elibrarius:
How does it help strengthen the impact of fresh data?
Think about it: you train the model by feeding it separate blocks of data, which gives it some independence from the time-series order, so the new data will be evaluated without "bias".
 
Aleksey Terentev:
Think about it: you train the model by feeding it separate blocks of data, which gives it some independence from the time-series order, so the new data will be evaluated without "bias".

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

And the question is how, with shuffling, to increase the importance of the freshest data, so that the model picks up new market trends faster.

 
elibrarius:

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

And the question is how, with shuffling, to increase the importance of the freshest data, so that the model picks up new market trends faster.

Pre-training is done on old data. The final stages of training are conducted on new data.
 

So, training in two steps?
Train on a large amount of data, then fine-tune the resulting model on fresh data.
Worth trying.


I had another idea: just add the fresh data to the overall training set 2-3 times. Even with shuffling, its weight will increase.
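The two-step scheme discussed above (pretrain on old data, then fine-tune on fresh data) can be sketched with any model that supports incremental fitting. A minimal sketch assuming scikit-learn's `SGDClassifier`; the synthetic data and the number of fine-tuning passes are purely illustrative:

```python
# Two-stage training sketch: pretrain on a large block of old data,
# then run a few extra passes over the fresh data only, so the newest
# examples have the last word on the weights. Illustrative data.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_old = rng.normal(size=(1000, 5))
y_old = (X_old[:, 0] > 0).astype(int)
X_new = rng.normal(size=(100, 5))
y_new = (X_new[:, 0] > 0).astype(int)

model = SGDClassifier(random_state=0)

# Stage 1: pretrain on the old data.
model.partial_fit(X_old, y_old, classes=np.array([0, 1]))

# Stage 2: fine-tune on the fresh data only.
for _ in range(3):
    model.partial_fit(X_new, y_new)

print(model.score(X_new, y_new))
```

Whether the extra passes help or cause catastrophic forgetting of the old regime depends on the data; that is exactly the trade-off the thread is debating.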

 
elibrarius:

So I thought, if everything is shuffled, how can we make the fresh data have a stronger effect on the training?

There is a trick of duplicating the most recent training examples several times.
And, for example, the gbm package lets you set an importance weight for each training example. That's not a neural network, I just gave it as an example.
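The post refers to R's gbm package; per-example weights exist in other boosting libraries too. A minimal sketch of the same idea with scikit-learn's `sample_weight` (the data and the weight of 3 are illustrative assumptions, chosen to mimic adding the fresh rows three times):

```python
# Up-weighting the freshest examples instead of physically duplicating them.
# Assumes scikit-learn; R's gbm has an analogous `weights` argument.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Give the last 50 (freshest) rows triple weight -- the same effect
# as duplicating them 3x in the training set, without the copies.
w = np.ones(len(X))
w[-50:] = 3.0

model = GradientBoostingClassifier(random_state=0)
model.fit(X, y, sample_weight=w)
print(model.score(X, y))
```

Weighting survives shuffling, which is the property being asked for: the order of examples changes, but each example's contribution to the loss does not.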


elibrarius:

"Independence from the time-series order" is provided by shuffling. Without it, the model gets nowhere at all.

Most models have no notion of sequence dependence at all. In neural networks, for example, an error is computed for each training example, and the sum of all the errors drives the weight updates. That sum does not change when the summands are reordered.

But models often have a batch.size parameter or something similar, which controls what fraction of the training data is used per step. If you take a very small fraction and turn off shuffling, the model will see the same small subset every time, and everything will end badly. I don't know about darch specifically, but turning off shuffling by itself shouldn't cause a complete failure; something must be wrong with your other parameters.
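The order-invariance claim above is easy to check numerically: for a full-batch gradient, permuting the training examples permutes the summands but leaves the sum unchanged. A toy NumPy sketch (linear model with squared error, made-up data):

```python
# Full-batch gradient of squared error for a linear model:
# the sum over examples does not depend on their order.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = rng.normal(size=200)
w = rng.normal(size=3)

def batch_grad(X, y, w):
    # d/dw sum_i (x_i . w - y_i)^2 = 2 * X^T (X w - y)
    return 2 * X.T @ (X @ w - y)

perm = rng.permutation(len(X))
g1 = batch_grad(X, y, w)
g2 = batch_grad(X[perm], y[perm], w)
print(np.allclose(g1, g2))  # True: reordering the summands changes nothing
```

With mini-batches the story differs, of course: the *composition* of each batch now depends on the ordering, which is exactly why a tiny unshuffled batch keeps seeing the same subset.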


Aleksey Terentev:
Try cross validation (K-fold).

I fully support that. Whatever loud claims a model's author makes about its resistance to overfitting, only k-fold will show whether they are true.
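For the record, k-fold cross-validation just means scoring the model on k held-out folds so that every example is used for validation exactly once, and an overfit model cannot hide behind one lucky train/test split. A minimal sketch with scikit-learn (model and data are placeholder assumptions; for actual time series a walk-forward split may be more appropriate than a shuffled one):

```python
# K-fold cross-validation: train on k-1 folds, score on the held-out fold,
# repeat k times, then average the out-of-fold scores. Illustrative data.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(int)

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

print(np.mean(scores))  # average out-of-fold accuracy
```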

 
It's time for you to wrap up and draw a conclusion.
And show us some practice.
 
Alexander Ivanov:
It's time for you to wrap up and draw a conclusion.
And show us some practice.

Coming soon... "almost done."

I have never done anything this hardcore in my life.

 
Maxim Dmitrievsky:

coming soon... "almost done."

I've never done anything this hardcore in my life.

Heh, rubbing my hands in anticipation of trying the demo 😀👍👍👍👍 — like a fresh, delicious grandma's pie 😂😀
 
IMHO, of course, but every page of this thread should start with SanSanych's slogan: "garbage in, garbage out". All your cognitive and creative talents should first of all be aimed at reducing the garbage at the input, and only then at pushing the computer hardware to its limits.