Machine learning in trading: theory, models, practice and algo-trading - page 2380

 
Aleksey Vyazmikin:

I mistakenly thought it was about columns.

And yet, can't you do all the training on the sample file, and checking on another file?

Alexei, anything is possible!

But I'm not interested.

Learn R! It's a fascinating language, especially for trading...
 
mytarmailS:

Alexei, anything is possible!

But I'm not interested.

Learn R! It's a fascinating language, especially for trading...

Thanks for the help.

The accuracy (Precision) and completeness (Recall) are much better than with CatBoost.

I merged all samples into one file.

So, can we think more in this direction?

 
Aleksey Vyazmikin:

Thank you for your help.

The accuracy (Precision) and completeness (Recall) are much better than with CatBoost.

I merged all the samples into one file.

So, can we still think in this direction?

Is it better on new data or on training data?

What are the numbers there and there?

 
elibrarius:

Is it better on new data or on training data?

What are the numbers there and there?

Alas, I was wrong, the accuracy is worse, not better.



However, this is a difficult sample - I can't get a properly trained model on it. Tomorrow I'll try another one, on which CatBoost produces good models. Also, I don't really understand the model's parameters, so the comparison may not be entirely fair.

Given its large Recall, this model could even be turned into a separate predictor. But I don't know how to save it to a file :)
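For reference, Precision and Recall fall straight out of a 2x2 confusion matrix. A minimal base-R sketch - the tp/fp/fn/tn counts here are illustrative, not the thread's numbers:

```r
# Illustrative counts only - not taken from the thread's confusion matrix
tp <- 8; fp <- 2; fn <- 4; tn <- 6

precision <- tp / (tp + fp)  # of predicted positives, how many were correct
recall    <- tp / (tp + fn)  # of actual positives, how many were found

precision  # 0.8
recall     # ~0.667
```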

 
Maxim Dmitrievsky:

It's a paradoxical situation: even if you accidentally get it right, no one will appreciate it,

because there are no evaluation criteria)

No, I don't need appreciation and acknowledgement at all - then they certainly won't leave me alone).

Rather, I take it as a kind of exercise or puzzle - trying to find common sense in some forum idea)

In this particular case, lasso seems quite applicable if logistic regression is used for classification.
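A minimal sketch of that combination - lasso-penalized logistic regression via glmnet. The toy X/y below are my stand-ins, not the thread's sample; alpha = 1 gives the L1 (lasso) penalty and family = "binomial" turns it into a classifier:

```r
library(glmnet)

set.seed(1)
# toy stand-in data: 10 features, binary 0/1 target
X <- matrix(rnorm(500 * 10), ncol = 10)
y <- rbinom(500, 1, plogis(X[, 1] - X[, 2]))

# alpha = 1 -> lasso penalty; family = "binomial" -> logistic regression
cv  <- cv.glmnet(X, y, family = "binomial", alpha = 1, nfolds = 5)
fit <- glmnet(X, y, family = "binomial", alpha = 1, lambda = cv$lambda.min)

p   <- predict(fit, newx = X, type = "response")  # class probabilities
cls <- as.integer(p > 0.5)                        # hard 0/1 labels
```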

 

I tried it on a different sample - here it is


# load the sample; read.csv2 expects ";" separators and "," decimals
X <- read.csv2("F:\\FX\\Открытие Брокер_Demo\\MQL5\\Files\\ZZ_Po_Vektoru_TP4_SL4_Si_QMini_02_Bi\\Si_cQq\\Setup\\xxx.csv")
Y <- X$Target_100
# drop the target and service columns, keep a numeric feature matrix
X <- as.matrix(within(X, rm("Time","Target_P","Target_100",
                            "Target_100_Buy","Target_100_Sell")))

library(glmnet)
library(TTR)    # for SMA()
library(caret)  # for confusionMatrix()

tr <- 1:14112  # train indices; the rest is held out

# pick the lasso penalty by 5-fold cross-validation (alpha = 1 is the lasso)
best_lam <- cv.glmnet(x = X[tr,],
                      y = Y[tr], alpha = 1,
                      lambda = 10^seq(2, -2, by = -.1),
                      nfolds = 5)$lambda.min

lasso_best <- glmnet(x = X[tr,], y = Y[tr], alpha = 1, lambda = best_lam)
pred <- predict(lasso_best, s = best_lam, newx = X[-tr,])

# binarize the forecast by its position relative to a 20-period moving average
sma <- TTR::SMA(pred, 20)
pred2 <- c(pred - sma); pred2[pred2 > 0] <- 1; pred2[pred2 <= 0] <- 0

# score the last 3528 held-out observations
yy <- tail(Y[-tr], 3528)
pp <- tail(pred2, 3528)
caret::confusionMatrix(as.factor(yy), as.factor(pp))
    Reference
Prediction    0    1
         0 1063  860
         1  567 1019
The question is how to extract the model and, to begin with, how to save the classification to a file.
Files:
xxx.zip  482 kb
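On saving: a minimal sketch (file names and the stand-in objects are mine, not from the thread). saveRDS() persists any R object, including a fitted glmnet model, and write.csv2() is the mirror of the read.csv2() used to load the sample:

```r
# stand-ins for the forecast vector and its 0/1 classification
pred  <- c(0.3, 0.7, 0.1, 0.9)
pred2 <- as.integer(pred > 0.5)

f_model <- tempfile(fileext = ".rds")
f_csv   <- tempfile(fileext = ".csv")

saveRDS(pred, f_model)        # works for any R object, incl. a glmnet fit
restored <- readRDS(f_model)  # ... and load it back

# semicolon-separated CSV, matching read.csv2's conventions
write.csv2(data.frame(forecast = pred, class = pred2), f_csv, row.names = FALSE)
```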
 
Aleksey Vyazmikin:

I tried it on another sample - here


The question is how to extract the model and, to begin with, how to save the classification to a file.

CatBoost has pretty strong regularization; moreover, if the features are categorical, they should be declared as such in the boosting.

 
Maxim Dmitrievsky:

CatBoost has pretty strong regularization; moreover, if the features are categorical, they should be declared as such in the boosting.

For binary features it doesn't matter whether they are categorical or not.

We can try to reduce regularization - good idea - thanks.

So far, lasso has shown better results on the exam part of the sample.
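If reducing regularization is the plan, the main knob in CatBoost is l2_leaf_reg (default 3). A hedged sketch of a parameter list for the catboost R package - the values are illustrative, and the training call is left commented out:

```r
# Illustrative parameters; l2_leaf_reg below CatBoost's default of 3
# weakens the L2 regularization on leaf values
params <- list(
  loss_function = "Logloss",
  iterations    = 500,
  depth         = 6,
  l2_leaf_reg   = 1
)
# model <- catboost.train(learn_pool, params = params)  # requires `catboost`
```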

 
Maxim Dmitrievsky:

I.e. label the deals by some wave with a period of 5, or by the price difference, and see what happens

the features will also be smoothed in the process of training


Try it the same way. It looks good in my custom tester; I have a problem when exporting the model - I'll look for the error later.

 
Aleksey Vyazmikin:

For binary features it doesn't matter whether they are categorical or not.

You can try to reduce regularization - good idea - thanks.

So far Lasso has shown better results on the exam part of the sample.

Maybe it's just a lucky chunk of the exam sample, and you are fitting to it by picking the model with the best parameters for it.

I now always test with cross-validation (or walk-forward): no fitting to a small chunk, but to all the data at once - I think this is the best training option.
Doc also advised checking it this way before he disappeared from the forum.
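The walk-forward idea above can be sketched in base R. This is a hypothetical splitter (the function and its names are mine): train on a sliding window, test on the block right after it, then step forward:

```r
# Hypothetical walk-forward splitter: each fold trains on `train_len`
# consecutive observations and tests on the `test_len` that follow
walk_forward <- function(n, train_len, test_len) {
  starts <- seq(1, n - train_len - test_len + 1, by = test_len)
  lapply(starts, function(s) list(
    train = s:(s + train_len - 1),
    test  = (s + train_len):(s + train_len + test_len - 1)
  ))
}

folds <- walk_forward(n = 100, train_len = 60, test_len = 10)
# 4 folds; the first trains on 1:60 and tests on 61:70,
# the last trains on 31:90 and tests on 91:100
```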
