Machine learning in trading: theory, models, practice and algo-trading - page 2382

 
Evgeni Gavrilovi:

Randomly? I.e. as stated here - test on a random 50% sample?

Yes, you can read about it in the sklearn documentation.

I checked the same thing (as in the video) on the seasonal version... it doesn't seem to improve anything much.
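For reference, a random 50% holdout like the one mentioned above is a one-liner in sklearn. This is only a minimal sketch with synthetic placeholder data, not the poster's actual setup:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 1000 rows with 5 features and a binary label.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# Random (shuffled) 50% holdout, as discussed above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, shuffle=True, random_state=42
)
```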

 
elibrarius:

You're describing some kind of standard / ancient cross-validation.
First, don't shuffle the rows; take the blocks as they are: train on 0-90, test on 90-100; then train on 10-100, test on 0-10; then train on 20-100 plus 0-10, test on 10-20, and so on.
Second, following Prado's advice, you need to leave a gap (purging/embargo) between train and test, so that neighboring examples from train and test don't get used together: a training example adjacent to examples from the test set acts as a hint / a peek at the test data. Read more here: https://dou.ua/lenta/articles/ml-vs-financial-math/
Or here's a picture:

You can use 20%, or as much as you want.

And finally, instead of cross-validation you can use walking forward, which doesn't take the test segment from anywhere in the sample, but only from ahead of the training data.
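As an illustration, here is a minimal sketch of the two schemes described above: blocked splits with a purge gap, and walking forward. The function names, block count and gap size are assumptions for the example, not code from the thread; sklearn's TimeSeriesSplit covers the forward-only case (recent versions also take a gap parameter).

```python
import numpy as np

def purged_blocked_splits(n_rows, n_blocks=10, gap_frac=0.02):
    """Contiguous test blocks (no shuffling); a gap of rows is dropped on
    both sides of each test block so that adjacent train/test rows can't
    leak information into each other (the purge/embargo idea)."""
    idx = np.arange(n_rows)
    block = n_rows // n_blocks
    gap = int(n_rows * gap_frac)
    for b in range(n_blocks):
        test_start, test_stop = b * block, (b + 1) * block
        test_idx = idx[test_start:test_stop]
        train_mask = (idx < test_start - gap) | (idx >= test_stop + gap)
        yield idx[train_mask], test_idx

def walk_forward_splits(n_rows, train_size, test_size, gap=0):
    """Walking forward: the test block always lies ahead of the train block."""
    start = 0
    while start + train_size + gap + test_size <= n_rows:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size + gap,
                             start + train_size + gap + test_size)
        yield train_idx, test_idx
        start += test_size  # slide the whole window forward by one test block
```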

Everything you describe here I already use in my experiments anyway.

The purpose of these tricks with splitting the sample into chunks is to find the chunk where the pattern inherent in the whole sample is least noisy. The less noise / the more pronounced the rules that drive the classification, the better the model will be. Yes, this method has a right to exist, but it works well when you know that the predictors related to the target outnumber the random ones, and when the sample is large enough to contain as many combinations of predictors as possible - and the more predictors there are, the larger the sample has to be. My sample rarely exceeds 20k rows (100%), while there are more than 2k predictors, so obviously not all combinations will appear in the sample or be accounted for by the model; hence Recall will never be much above ±50%.

That is why my binarization method takes a different approach: each quantum of a predictor's quantization grid is evaluated for stability over time and for its relation to the target, and the selected quanta are then combined into one binary predictor, thereby filtering out noisy quanta/splits - most predictors simply don't pass the selection. The binarized sample is built from the results of this selection, so the predictors behave similarly on all training segments, which should contribute to the model's stability when events similar to those seen in history occur again.
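The post describes the method only in outline, so the following is a rough, hypothetical sketch of the general idea (quantize a predictor, keep only the quanta whose relation to the target holds in every time segment, and merge the kept quanta into one binary column). The thresholds and scoring rule are illustrative assumptions, not the author's actual algorithm:

```python
import numpy as np
import pandas as pd

def binarize_predictor(x, y, n_bins=10, n_segments=5, min_lift=1.1):
    """Quantize one predictor into `n_bins` quanta, keep only the quanta whose
    target rate beats the overall base rate by `min_lift` in every chronological
    segment, and OR the kept quanta into a single binary column.
    Returns None if no quantum passes the selection."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bins = np.asarray(pd.qcut(x, q=n_bins, duplicates="drop", labels=False))
    segments = np.array_split(np.arange(len(y)), n_segments)  # time order, no shuffling
    base_rate = y.mean()
    kept = []
    for b in np.unique(bins[~np.isnan(bins)]):
        in_bin = bins == b
        seg_rates = [y[seg][in_bin[seg]].mean()
                     for seg in segments if in_bin[seg].any()]
        if seg_rates and all(r > min_lift * base_rate for r in seg_rates):
            kept.append(b)  # this quantum is stable across all segments
    if not kept:
        return None  # the predictor as a whole does not pass the selection
    return np.isin(bins, kept).astype(int)  # combined binary predictor
```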

 
Aleksey Vyazmikin:

Everything you describe here I already use in my experiments anyway.

The purpose of these tricks with splitting the sample into chunks is to find the chunk where the pattern inherent in the whole sample is least noisy.

No - the purpose is to get the model's average metrics (error, etc.) across all the test chunks. Or the sum of the balances.

Cross-validation will work for you if it's acceptable to use early rows as a test.
Walking forward probably won't anymore: 20,000 rows are hard to divide into many chunks for testing ahead.

You have an atypical scheme, so it's hard to advise anything)
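A quick back-of-the-envelope, with illustrative window sizes, of why 20,000 rows leave few forward test chunks, and of how the per-fold results are then aggregated:

```python
# With a fixed 10,000-row initial training window and 2,000-row forward test
# chunks (both sizes illustrative), only 5 forward folds fit into 20,000 rows:
n_rows, train_size, test_size = 20_000, 10_000, 2_000
n_folds = (n_rows - train_size) // test_size          # -> 5

# The cross-validation / walk-forward score is then just the average of the
# per-fold errors (or the sum of the per-fold balances); the numbers below
# are hypothetical:
fold_errors = [0.41, 0.44, 0.39, 0.46, 0.43]
cv_error = sum(fold_errors) / len(fold_errors)        # -> 0.426
```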
 
elibrarius:

No - the purpose is to get the model's average metrics (error, etc.) across all the test chunks. Or the sum of the balances.

So for that to work, you still need to identify the segment dominated by relationships between significant predictors and the target that will remain stable in the future.

elibrarius:

Cross-validation will work for you if it's acceptable to use early rows as a test.

Walking forward probably won't anymore: 20,000 rows are hard to divide into many chunks for testing ahead.

You have an atypical scheme, so it's hard to advise anything)

Using early rows is unacceptable because they were already used to evaluate the quanta - 60% of the sample. I could redo the whole evaluation procedure on individual chunks, but what would be the point of that - globally there is none.

The Lasso method showed better results than CatBoost - of course I will compare them on other samples later, but apparently Lasso generalizes highly sparse binary predictors better, where ones make up only 10-20%. How to make that work for extracting income is the question.
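It isn't stated which Lasso implementation was used, so purely as an illustration of the setup, here is a sketch with sklearn's L1-penalized logistic regression on synthetic sparse binary predictors (ones around 15% of each column, as described above); the data sizes and parameters are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the data described above (the real sample is ~20k rows
# and ~2k predictors): sparse binary columns with ~15% ones, few of them informative.
rng = np.random.default_rng(0)
X = (rng.random((5_000, 500)) < 0.15).astype(np.int8)
w = rng.normal(size=500) * (rng.random(500) < 0.05)   # ~5% informative columns
signal = X @ w
y = (signal + rng.normal(scale=1.0, size=len(signal)) > np.median(signal)).astype(int)

# No shuffling: the last 30% of rows serve as the out-of-time test.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, shuffle=False)

# An L1 ("lasso"-style) penalty drives most coefficients to exactly zero, which
# is one plausible reason it copes well with many sparse binary predictors.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
print("non-zero coefficients:", np.count_nonzero(clf.coef_))
```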

 
Aleksey Vyazmikin:

Reducing L2 regularization gave no improvement. So Lasso is better.

Well, it's better... both are bad, and the difference is a couple of percent.

 
Maxim Dmitrievsky:

Well, it's better... both are bad, and the difference is a couple of percent.

4% of accuracy is a lot in monetary terms - it will increase profitability and the expected payoff!
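A quick illustration of the arithmetic, assuming a symmetric +1R / -1R payoff per trade (an assumption; real payoffs differ):

```python
# Expected payoff per trade with a symmetric +1R win / -1R loss (assumed):
#   E = p * R - (1 - p) * R = (2p - 1) * R
for p in (0.52, 0.56):
    print(f"accuracy {p:.0%}: expectation {2 * p - 1:+.2f} R per trade")
# 52% -> +0.04 R, 56% -> +0.12 R: the same 4% of accuracy triples the expectation.
```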

 
Whoever has 10 years of EUR 5-minute data, please send me the txt or csv.
 
I plotted neural network forecasts in the browser. There are indicators + I tried to mark entry points.
The link is in my profile.
 
mytarmailS:
Whoever has 10 years of EUR 5-minute data, please send me the txt or csv.

Can't you download it from the terminal?

 
Maxim Dmitrievsky:

Can't you download it from the terminal?

I've been testing on 10 years of M5 quotes... if anything, I should be hiding the terminal from them before they do any damage to the family budget.
