How to improve the quality of time series-predictors and their error by error - General

Alexey Burnakov 2016.08.03 16:41 #861

SanSanych Fomenko:

All packages (models) can be divided into two categories:

good in principle
unsuitable in principle

The packages that are "good in principle" perform about the same; the differences are not significant.

The problems lie not in the model but in the set of predictors and their preprocessing. For a given set of predictors, both the possibility of building a model that is NOT overfitted and the size of the error depend little on which model is used. That is why you should take the simplest and fastest model among those that are "good in principle".

PS.

From my own experience: more than 75% of the effort in building a trading system (TS) goes into selecting predictors, if I manage to find such a set for a particular target variable at all.

San Sanych, hello.

But if, by your method, training on three non-intersecting data segments gives different predictor significance on each, does it follow that the predictors are non-stationary (noise, etc.)?

mytarmailS 2016.08.03 17:00 #862

SanSanych Fomenko:

All packages (models) can be divided into two categories:

good in principle
unsuitable in principle

The packages that are "good in principle" perform about the same; the differences are not significant.

The problems lie not in the model but in the set of predictors and their preprocessing. For a given set of predictors, both the possibility of building a model that is NOT overfitted and the size of the error depend little on which model is used. That is why you should take the simplest and fastest model among those that are "good in principle".

PS.

From my own experience: over 75% of the effort in building a trading system (TS) goes into selecting predictors, if I manage to find such a set for a particular target variable at all.

What models? What are you talking about... It's like someone asking "what time is it?" and getting the answer "what would you like me to dance?" :)

Please, never do that again. Apparently it's easier to write 10 lines of text than to read two lines of a question.

mytarmailS 2016.08.03 19:07 #863

mytarmailS:

Someone may find this interesting: I found a package called quantstrat that can simulate trading and build trading systems.

http://www.rinfinance.com/agenda/2013/workshop/Humme+Peterson.pdf

repost

СанСаныч Фоменко 2016.08.03 19:36 #864

Alexey Burnakov:

San Sanych, hi.

But if, according to your methodology, we get different significance of predictors on three non-intersecting data segments during training, then they are non-stationary (noise, etc.), right?

Predictor significance is obtained only once, when the model is trained. After that the model is APPLIED, not trained.
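
To make the question about segments concrete, here is a minimal sketch that measures how strongly each predictor relates to the target on three non-intersecting segments. It is not anyone's method from this thread: absolute Pearson correlation is swapped in as a simple stand-in for model-based significance, and the data, the column names `stable` and `drifting`, and the helper `relevance_by_segment` are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def relevance_by_segment(columns, target, n_segments=3):
    """|correlation| of each predictor with the target on contiguous segments."""
    n = len(target)
    bounds = [n * k // n_segments for k in range(n_segments + 1)]
    return {
        name: [abs(pearson(col[a:b], target[a:b]))
               for a, b in zip(bounds, bounds[1:])]
        for name, col in columns.items()
    }


# Synthetic data (assumption): "stable" always drives the target,
# "drifting" drives it only in the first third of the series.
rng = random.Random(0)
n = 3000
stable = [rng.gauss(0, 1) for _ in range(n)]
drifting = [rng.gauss(0, 1) for _ in range(n)]
target = [
    stable[i] + (drifting[i] if i < n // 3 else 0.0) + rng.gauss(0, 1)
    for i in range(n)
]

scores = relevance_by_segment({"stable": stable, "drifting": drifting}, target)
for name, values in scores.items():
    print(name, [round(v, 2) for v in values])
```

A predictor whose score collapses from one segment to the next, like `drifting` here, is the kind of case the question is about.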

Alexey Burnakov 2016.08.03 19:36 #865

SanSanych Fomenko:
Predictor significance is obtained only once, when the model is trained. After that the model is APPLIED, not trained.

But you have to train it several times there, as I recall?

СанСаныч Фоменко 2016.08.03 19:48 #866

Alexey Burnakov:
But you have to train it several times there, as I recall?

No way!

Once again.

1. We take a large chunk of time-series predictors, for example 10 000 observations (rows).

2. We split it into two parts, strictly mechanically: the first 7000 observations and the remaining 3000.

3. We split the first part into three parts at random: training, testing and validation.

4. We fit the model on the training sample.

5. We apply the trained model to the testing and validation samples.

6. If the error is approximately equal on all three samples (training, testing and validation), we go to step 7.

7. We apply the model to the second part, which is an unbroken time series in its original time order.

8. If the error on this part is also roughly equal to the previous three, then:

this set of predictors does not lead to overfitting of the model
the error obtained on all FOUR sets (three random and one sequential) is one that is very difficult to reduce by tuning the model.

Ranked by error, my models come out as follows: ada, randomforest, SVM and their many variants. nnet is much worse.
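
The eight steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual setup used in the post: the 60/20/20 fractions for the random split, the 0.03 tolerance for "approximately equal", the synthetic data and the toy threshold model (standing in for ada, randomforest or SVM) are all made up for the example.

```python
import random


def split_rows(n_rows, n_first=7000, fractions=(0.6, 0.2, 0.2), seed=0):
    """Steps 2-3: mechanical split by position, then a random three-way split.

    The 60/20/20 fractions are an assumption; the post does not give them.
    """
    first = list(range(n_first))              # first part, shuffled below
    holdout = list(range(n_first, n_rows))    # second part, kept in time order
    random.Random(seed).shuffle(first)
    n_train = round(fractions[0] * n_first)
    n_test = round(fractions[1] * n_first)
    train = first[:n_train]
    test = first[n_train:n_train + n_test]
    valid = first[n_train + n_test:]
    return train, test, valid, holdout


def fit_threshold_model(X, y, idx):
    """Step 4: toy stand-in for a real model.

    Puts a threshold on the first predictor, halfway between the class means.
    """
    ones = [X[i][0] for i in idx if y[i] == 1]
    zeros = [X[i][0] for i in idx if y[i] == 0]
    threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda row: 1 if row[0] > threshold else 0


def error_rate(predict, X, y, idx):
    """Steps 5 and 7: share of misclassified rows among the given indices."""
    return sum(predict(X[i]) != y[i] for i in idx) / len(idx)


def errors_roughly_equal(errors, tolerance=0.03):
    """Steps 6 and 8: the tolerance is an assumption, not from the post."""
    return max(errors) - min(errors) <= tolerance


# Step 1: synthetic data (assumption): 10 000 rows, one noisy predictor.
rng = random.Random(1)
X = [[rng.gauss(0, 1)] for _ in range(10_000)]
y = [1 if row[0] + rng.gauss(0, 1) > 0 else 0 for row in X]

train, test, valid, holdout = split_rows(len(X))
model = fit_threshold_model(X, y, train)
errors = [error_rate(model, X, y, idx) for idx in (train, test, valid, holdout)]
print("errors (train, test, validation, sequential):", errors)
print("roughly equal:", errors_roughly_equal(errors))
```

Only the second part is evaluated in time order; the three random samples come from shuffled rows of the first part, as in step 3.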

Alexey Burnakov 2016.08.03 20:19 #867

SanSanych Fomenko:

Absolutely not!

Once again.

1. We take a large chunk of time-series predictors, for example 10 000 observations (rows).

2. We split it into two parts, strictly mechanically: the first 7000 observations and the remaining 3000.

3. We split the first part into three parts at random: training, testing and validation.

4. We fit the model on the training sample.

5. We apply the trained model to the testing and validation samples.

6. If the error is approximately equal on all three samples (training, testing and validation), we go to step 7.

7. We apply the model to the second part, which is an unbroken time series in its original time order.

8. If the error on this part is also roughly equal to the previous three, then:

this set of predictors does not lead to overfitting of the model
the error obtained on all FOUR sets (three random and one sequential) is one that is very difficult to reduce by tuning the model.

Ranked by error, my models come out as follows: ada, randomforest, SVM and their many variants. nnet is much worse.

I see. Thank you.

My results on training are much better than on the other samples, and the cross-validation result is much closer to the final out-of-sample one.

I think your thesis of equal errors on all samples points to an underfitted model. That is, it is equally so-so everywhere.
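
A small sketch of this point, with made-up data and two hypothetical toy models: a model that ignores the predictor entirely gives nearly identical errors on every segment, yet those errors are no better than guessing. Equal errors therefore only mean something when compared against such a naive baseline.

```python
import random


def error_rate(predict, xs, ys):
    """Share of observations where the prediction differs from the label."""
    return sum(predict(x) != y for x, y in zip(xs, ys)) / len(ys)


def constant_model(x):
    """Weak model: ignores the predictor and always predicts class 1."""
    return 1


def sign_model(x):
    """Stronger model: actually uses the predictor."""
    return 1 if x > 0 else 0


# Synthetic data (assumption): one informative predictor plus noise.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(10_000)]
ys = [1 if x + rng.gauss(0, 0.5) > 0 else 0 for x in xs]

# Three non-intersecting segments; the boundaries are arbitrary.
segments = {"train": (0, 4000), "test": (4000, 7000), "holdout": (7000, 10_000)}


def errors_by_segment(model):
    """Error of the given model on each named segment."""
    return {
        name: error_rate(model, xs[a:b], ys[a:b])
        for name, (a, b) in segments.items()
    }


weak_errors = errors_by_segment(constant_model)
strong_errors = errors_by_segment(sign_model)
print("constant model:", weak_errors)
print("sign model:    ", strong_errors)
```

Both models show stable errors across segments; only the level of the error tells them apart.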

СанСаныч Фоменко 2016.08.03 20:25 #868

Alexey Burnakov:

....That is, it is equally so-so everywhere.

So-so just means not enough brains and time.

You have to start with the target variable, then select predictors for it, then double-check them with mathematics, and so on. In any case, the process is painful and I can't formalize it.

Alexey Burnakov 2016.08.03 21:36 #869

SanSanych Fomenko:

So-so just means not enough brains and time.

You have to start with the target variable, then select predictors for it, then double-check them with mathematics, and so on. In any case, the process is painful and I can't formalize it.

Finding the meaning is the especially painful part. But that's not what I mean.

If the model is equally good everywhere, that is an achievement. But more often than not it will be equally bad, which is exactly what a weak model gives you.

mytarmailS 2016.08.03 23:32 #870

It looks like the thread is dead....

Machine learning in trading: theory, models, practice and algo-trading - page 87