Machine learning in trading: theory, models, practice and algo-trading - page 87

 
SanSanych Fomenko:

All packages (models) can be divided into two categories:

  • basically good
  • fundamentally unsuitable

The performance of the packages that are "basically good" is approximately the same; the differences are not significant.

The problems lie not in the model but in the set of predictors and their preliminary preparation. For a given set of predictors, both the possibility of building a model that is NOT overfitted and the magnitude of the error depend little on the choice of model. That is why you should take the simplest and fastest of the models that are "basically good".

PS.

From my own experience: more than 75% of the labor in building a trading system goes into selecting predictors, if a suitable set for a particular target variable can be found at all.
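A minimal sketch of the claim above, not SanSanych's actual code: several "basically good" models are fitted to the same predictor set, and their cross-validated errors come out in the same ballpark. The thread uses R packages (ada, randomForest, nnet); here scikit-learn stands in, and the data is synthetic.

```python
# Compare a few "basically good" classifiers on one predictor set.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a real predictor set.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=1)

errors = {}
for name, model in [("adaboost", AdaBoostClassifier(random_state=1)),
                    ("random forest", RandomForestClassifier(random_state=1)),
                    ("svm", SVC())]:
    # 5-fold cross-validated error for each model on the same predictors.
    errors[name] = 1 - cross_val_score(model, X, y, cv=5).mean()

print(errors)
```

With informative predictors, the spread between the models is typically small compared with the effect of changing the predictor set itself.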

San Sanych, hello.

But if, by your method, we get different predictor significance on three non-overlapping data segments, does that mean the predictors are non-stationary (noise, etc.)?

 
SanSanych Fomenko:

All packages (models) can be divided into two categories ...

What models? What are you talking about... It's like someone asking "What time is it?" and getting the answer "What would you like me to dance?" :)

Please, never do that again; it's easier to write ten lines of text than to read two lines of questions.

 
mytarmailS:

In case anyone is interested: I found a package called quantstrat that can simulate trading and build trading systems.

http://www.rinfinance.com/agenda/2013/workshop/Humme+Peterson.pdf

repost
 
Alexey Burnakov:

San Sanych, hi.

But if, according to your methodology, we get different significance of predictors on three non-overlapping data segments during training, then they are non-stationary (noise, etc.), right?

The significance of predictors is obtained only once, when the model is trained. After that the model is APPLIED, not retrained.
 
SanSanych Fomenko:
The significance of predictors is obtained only once, when the model is trained. After that the model is APPLIED, not retrained.
But you have to train it several times there, as I recall?
 
Alexey Burnakov:
But you have to train it several times, as I remember?

No way!

Once again.

1. We take a big chunk of the time series of predictors, for example 10,000 observations (rows).

2. We divide it into two parts, strictly mechanically: the first 7,000 and the second 3,000.

3. We divide the first part into three parts at random: for training, testing, and validation.

4. We train (fit) the model on the training sample.

5. We apply the trained model to the testing and validation samples.

6. If the error is approximately equal on all three samples — training, testing, and validation — then go to step 7.

7. We apply the model to the second part, which is an unbroken time series in its original time sequence.

8. If the error on this part is also roughly equal to the previous three, then:

  • this set of predictors does not lead to overfitting of the model
  • the error obtained on all FOUR sets (three random and one sequential) is an error that is very difficult to reduce by model tuning.

In my experience, the models rank by error as follows: ada, randomForest, SVM and their many variants are about equal; nnet is much worse.
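The eight steps above can be sketched as follows — assuming scikit-learn in place of the R packages named in the thread, with synthetic data standing in for real predictors. Sizes follow the post: 10,000 rows, the first 7,000 split at random into train/test/validation, the last 3,000 kept as a sequential holdout.

```python
# Sketch of the four-sample check described in the post above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=20, n_informative=6,
                           random_state=7)

# Steps 1-2: strictly mechanical split, preserving time order.
X_first, y_first = X[:7000], y[:7000]
X_hold, y_hold = X[7000:], y[7000:]

# Step 3: random split of the first part into train / test / validation.
X_tr, X_rest, y_tr, y_rest = train_test_split(X_first, y_first,
                                              train_size=0.6, random_state=7)
X_te, X_va, y_te, y_va = train_test_split(X_rest, y_rest,
                                          test_size=0.5, random_state=7)

# Step 4: fit the model on the training sample only.
model = RandomForestClassifier(random_state=7).fit(X_tr, y_tr)

# Steps 5-8: compare errors on all four sets. Roughly equal errors on
# test, validation and holdout suggest the predictor set does not lead
# to overfitting (a random forest's train error is always optimistic).
errors = {name: 1 - model.score(Xp, yp)
          for name, (Xp, yp) in {"train": (X_tr, y_tr),
                                 "test": (X_te, y_te),
                                 "validation": (X_va, y_va),
                                 "holdout": (X_hold, y_hold)}.items()}
print(errors)
```

The sequential holdout is the important part: it preserves the time ordering that the random splits destroy, so a large gap between it and the random samples points to non-stationarity or leakage.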

 
SanSanych Fomenko:

No way! Once again ...

Here you go. Thank you.

My results on the training sample are much better than on the other samples, and on cross-validation the result is much closer to the final out-of-sample one.

I think your thesis of equal errors on all samples points to an underfitted model. That is, it's equally so-so everywhere.
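Alexey's point can be illustrated with a small sketch, again assuming scikit-learn and synthetic data rather than anything from the thread: the training error is optimistic, while the cross-validated error on the training part is usually much closer to the error on held-out data.

```python
# Training error vs cross-validated error vs out-of-sample error.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=3)
X_tr, y_tr = X[:1500], y[:1500]      # training part
X_out, y_out = X[1500:], y[1500:]    # final out-of-sample part

model = GradientBoostingClassifier(random_state=3).fit(X_tr, y_tr)
train_err = 1 - model.score(X_tr, y_tr)                      # optimistic
cv_err = 1 - cross_val_score(GradientBoostingClassifier(random_state=3),
                             X_tr, y_tr, cv=5).mean()        # more honest
out_err = 1 - model.score(X_out, y_out)                      # the truth
print(train_err, cv_err, out_err)
```

Typically train_err comes out well below cv_err, while cv_err tracks out_err much more closely — which is why a large train/test gap signals overfitting rather than a good model.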
 
Alexey Burnakov:
....That is, everywhere is equally so-so.

"So-so" just means not enough brains and time.

You have to start with the target variable, then select predictors for it, then double-check with mathematics, and so on. In any case, the process is painful and I cannot formalize it.

 
SanSanych Fomenko:

"So-so" just means not enough brains and time.

You have to start with the target variable, then select predictors for it, then double-check with mathematics, and so on. In any case, the process is painful and I cannot formalize it.

Yes, especially painful in terms of meaning. But that's not what I meant.

If it's equally good everywhere, that's an achievement. But more often than not it will be equally bad, and that is what a weak model lets you achieve.
 
It looks like the thread is dead...