Machine learning in trading: theory, models, practice and algo-trading - page 3402

 
Maxim Dmitrievsky #:
It's not the coffee that keeps you going, it's the ML. It's a disease.

Hmm... maybe it's a disease; there were three of us in the ward...

 
Aleksey Vyazmikin #:

Hmm... maybe it's a disease; there were three of us in the ward...

It's a good disease, benign.
 
Three nutcases))))
 
Alexey, send me your data so I can understand what the problem is and write proper code. There are too many ways to misunderstand each other otherwise.
 
mytarmailS #:
Alexey, send me your data so I can understand what the problem is and write proper code. There are too many ways to misunderstand each other otherwise.

I'll upload the sample within an hour.

Download link: https://transfiles.ru/5fgge

In general, it uses the public predictors described in my articles, and the target is from the latest articles. The sample is experimental - I test ideas on it.

 
Aleksey Vyazmikin #:

I'll upload the sample within an hour.

Download link: https://transfiles.ru/5fgge

In general, it uses the public predictors described in my articles, and the target is from the latest articles. The sample is experimental - I test ideas on it.

OK, I'll try it once I'm properly awake.
 
mytarmailS #:
OK, I'll try it once I'm properly awake.

I rewrote the code; it had errors. Don't trust GPT, it's rubbish!

I tried it with and without normalisation, and with and without class balancing.

The most features are found with normalisation and class balancing, but the results can differ between runs because the balancing is random. On average it finds 15-20 features.

#  load the sample and keep the original column names for index lookups later
data <- data.table::fread("D:\\train.csv", sep = ";") |> as.data.frame()
original_colum_names <- colnames(data)

#  split off the target and drop the time/target columns from the predictors
target <- data$Target_100
data <- data[, !(names(data) %in% c("Time", "Target_P", "Target_100", "Target_100_Buy", "Target_100_Sell"))]

#  normalisation: centre and scale every predictor column
data <- scale(data)

#  table(target)  #  target class balance


#################################################
###  without class balancing (no upsampling)  ##
#################################################

library(abess)

#  best-subset selection: golden-section search over the support size,
#  with early stopping once the tuning criterion stops improving
model <- abess(y = target, x = data, tune.path = "gsection", early.stop = TRUE)
ex <- extract(model)  #  extract the best model found
ex$support.vars       #  names of the selected predictors

#  saving the selected column names
#  write.csv(ex$support.vars, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Pred.csv", row.names = FALSE)
best_colums_idx <- which(original_colum_names %in% ex$support.vars)  #  indices in the original file
#  saving the indices to a CSV file
#  write.csv(best_colums_idx, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Оставшиеся_предикторы.csv", row.names = FALSE)





################################################
###  with class balancing (upsampling)  ##
################################################

#  upsample the minority classes by randomly duplicating rows
x <- caret::upSample(x = data, y = as.factor(target), list = TRUE)
x$y <- as.numeric(as.character(x$y))  #  factor back to numeric for abess

#  table(x$y)  #  target class balance

model <- abess(y = x$y, x = x$x, tune.path = "gsection", early.stop = TRUE)
ex <- extract(model)  #  extract the best model found
ex$support.vars       #  names of the selected predictors

#  saving the selected column names
#  write.csv(ex$support.vars, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Pred.csv", row.names = FALSE)
best_colums_idx <- which(original_colum_names %in% ex$support.vars)  #  indices in the original file
#  saving the indices to a CSV file
#  write.csv(best_colums_idx, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Оставшиеся_предикторы.csv", row.names = FALSE)




For example:

ex$support.vars
 [1] "iATR_B0_S1_M1"         "iStdDev_B0_S1_M1"      "iVolumes_B0_S1_M1"     "iStdDev_B0_S1_M2"     
 [5] "iATR_B0_S1_M3"         "iATR_B0_S1_M4"         "iATR_B0_S1_M5"         "iATR_B0_S1_M6"        
 [9] "iATR_B0_S2_M1"         "iStdDev_B0_S2_M1"      "iATR_B0_S2_M2"         "iATR_B0_S2_M3"        
[13] "iATR_B0_S2_M5"         "iATR_B0_S2_M6"         "iATR_B0_S15_M1"        "iATR_B0_S15_M2"       
[17] "iADX_B2_S15_M12"       "iADXWilder_B2_S15_M20" "iVolumes_B0_S15_H4"   
 

Also, there are a lot of linearly dependent variables in the dataset. Try training your model on this cleaned data: only 500 features remain of the original 2400. In theory the result should be the same as with all 2400.

data <- data.table::fread("D:\\train.csv", sep = ";") |> as.data.frame()
target <- data$Target_100
data <- data[, !(names(data) %in% c("Time", "Target_P", "Target_100", "Target_100_Buy", "Target_100_Sell"))]

#  columns that are exact linear combinations of others, detected on the
#  first 500 rows (so at most 500 columns can be linearly independent)
bad_colums <- caret::findLinearCombos(data[1:500,])$remove

ncol(data)         #  total number of columns: 2408
length(bad_colums) #  number of linearly dependent columns: 1908


#  rebuild the dataset: the target plus the non-redundant predictors
clear_data <- cbind.data.frame(target, data[, -bad_colums])

write.csv(clear_data, "E:\\FX......Setup\\не_избыточный_датасет.csv", row.names = FALSE)
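
A quick sanity check, as a sketch on the same 500 rows used above: re-running findLinearCombos on the cleaned predictors (column 1 of clear_data is the target) should find nothing left to remove.

#  expect 0: no exact linear combinations remain among the kept columns
length(caret::findLinearCombos(clear_data[1:500, -1])$remove)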
 
mytarmailS #:
I rewrote the code; it had errors. Don't trust GPT, it's rubbish!

What were the errors? The result was identical in the "without balancing" case.

I'll try training with the balanced version.

 
mytarmailS #:
Try training your model on this cleaned data.

I'll try, but I need the indices of the columns to exclude...
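
A minimal sketch of how the removed columns can be mapped back to column indices in the original train.csv (this assumes the same column layout as above; the output path is only a placeholder):

data <- data.table::fread("D:\\train.csv", sep = ";") |> as.data.frame()
original_colum_names <- colnames(data)

#  predictors only, same service columns dropped as above
predictors <- data[, !(names(data) %in% c("Time", "Target_P", "Target_100", "Target_100_Buy", "Target_100_Sell"))]

#  columns flagged as linear combinations on the first 500 rows
bad_colums <- caret::findLinearCombos(predictors[1:500, ])$remove

#  positions of those columns in the original csv column order
bad_colums_idx <- which(original_colum_names %in% colnames(predictors)[bad_colums])
#  write.csv(bad_colums_idx, "D:\\bad_columns_idx.csv", row.names = FALSE)  #  placeholder path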
