Machine learning in trading: theory, models, practice and algo-trading - page 3401

 
mytarmailS #:
https://youtu.be/Ipw_2A2T_wg?si=U03oigHFfaFxwjbs

These are ML heroes.

That reminds me.


 
mytarmailS #:

Try this one.

binary classification

Thank you. Now it works fast!

GPT won't replace a human, of course, but it helps quite a bit.

mytarmailS #:

50,000 features / columns

it found a subset of the best features in under 3 seconds.


all the features relevant to the target were found, and none of the 50,000 noise features were selected
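
A claim like that is easy to sanity-check on synthetic data. A minimal sketch, with the sizes cut down so it runs in seconds (the column names, coefficients and sample size here are made up for illustration, not taken from the experiment above):

library(abess)
set.seed(1)
n <- 500; p <- 2000                       # 2,000 columns, only the first 5 informative
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("f", 1:p)
beta <- c(rep(2, 5), rep(0, p - 5))
y <- rbinom(n, 1, plogis(x %*% beta))     # binary target driven by f1..f5
fit <- abess(x, y, family = "binomial")
extract(fit)$support.vars                 # ideally returns only "f1".."f5"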

So, it found six predictors out of the whole list. Hmm, now I will train 100 CatBoost models on them and see the average result.
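
A hedged sketch of that averaging loop, assuming the catboost R package and the Pred/target objects produced by the script below (it scores the training pool only for brevity; a real comparison would score a holdout):

library(catboost)
pool <- catboost.load_pool(data[, Pred], label = target)
acc <- sapply(1:100, function(seed) {
  params <- list(loss_function = "Logloss", iterations = 300, random_seed = seed)
  m <- catboost.train(pool, params = params)
  p <- catboost.predict(m, pool, prediction_type = "Probability")
  mean((p > 0.5) == target)               # accuracy of this seed's model
})
mean(acc)                                 # the average result over 100 models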


mytarmailS #:
Logistic regression is a classification algorithm; texts, for example, are classified with it.

Yes, of course it's a classification algorithm. I see no contradiction between your arguments and my earlier words - in general, just a misunderstanding.
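
For what it's worth, a minimal logistic-regression classifier in base R (purely illustrative, on a built-in dataset):

# Classify iris flowers as versicolor / not versicolor from two features
fit <- glm(I(Species == "versicolor") ~ Sepal.Length + Sepal.Width,
           data = iris, family = binomial)
pred <- predict(fit, type = "response") > 0.5   # class decision at the 0.5 cutoff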

This code saves the indices of the predictors to be excluded and the list of predictors selected by the method.

#  Install and load the abess package
#install.packages("abess")
library(abess)

#  Load the data from a CSV file
data <- read.csv("E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\train.csv", sep = ";")

#  Specify the target variable column
target_column <- "Target_100"
target <- data[, target_column]

#  Exclude columns by their names
столбцы_исключения <- c("Time","Target_P","Target_100","Target_100_Buy","Target_100_Sell")
data_without_excluded <- data[, !names(data) %in% столбцы_исключения]

#  Select only the first 500 columns (optional)
#data_without_excluded <- data[, 1:500]

#  Apply the abess method
#  Specify your model and the abess method settings here
#  For example:
#model <- abess(y = target, x = data_without_excluded, method = "lasso")
model <- abess(y = target, x = data_without_excluded, tune.path = "gsection", early.stop = TRUE)

#  Extract the model results - various statistics
ex <- extract(model)

#  Get the names of the selected predictors (columns)
Pred <- ex$support.vars

#  Save the selected predictors to a CSV file
write.csv(Pred, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Pred.csv", row.names = FALSE)

#  Get the indices of all predictors in the dataset
все_предикторы <- colnames(data_without_excluded)
индексы_всех_предикторов <- seq_along(все_предикторы)

#  Get the indices of the predictors that were not selected (the exclusion list)
индексы_оставшихся_предикторов <- setdiff(индексы_всех_предикторов, match(Pred, все_предикторы))

#  Decrease the indices by 1
индексы_оставшихся_предикторов <- индексы_оставшихся_предикторов - 1

#  Save the indices to a CSV file
write.csv(индексы_оставшихся_предикторов, "E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Оставшиеся_предикторы.csv", row.names = FALSE)
 

So let's compare: 100 CatBoost models are trained. The first picture shows the result with abess selection, the second without it, on the train sample.


The test sample is the one we stop training on (early stopping).

The exam sample is a delayed sample that does not participate in the training process.
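
A sketch of that three-way chronological split (the 60/20/20 fractions are my assumption, not the author's):

n <- nrow(data)
train <- data[1:floor(0.6 * n), ]                     # models are fitted here
test  <- data[(floor(0.6 * n) + 1):floor(0.8 * n), ]  # training is stopped on this part
exam  <- data[(floor(0.8 * n) + 1):n, ]               # delayed sample, never touches training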


It seems the abess selection method does not work very effectively....

I should note that my classes are very unbalanced - about 16% ones - maybe that affects the selection.
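
One hedged way to account for the imbalance at the selection stage is to upweight the minority class through abess's observation-weight argument (a sketch; the ~16% share and the binomial family come from the discussion, not from the original script):

library(abess)
pos_share <- mean(target == 1)                        # about 0.16 here
w <- ifelse(target == 1, (1 - pos_share) / pos_share, 1)
model <- abess(y = target, x = data_without_excluded,
               family = "binomial", weight = w,
               tune.path = "gsection", early.stop = TRUE)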

 
Aleksey Vyazmikin #:

The abess selection method does not work very effectively....

My classes are very unbalanced - about 16% ones - maybe that affects the selection.

I have 5 options

1. Something is wrong with the data saving. I have BIG questions about the code - it is very strange and I don't understand it. Is it GPT again?

2. Perhaps you need to normalise the data

3. Maybe there is something wrong in the data itself.

4. Maybe there's an imbalance.

5. Maybe it's actually working worse
 
mytarmailS #:
I have 5 options

1. Something is wrong with the data saving. I have BIG questions about the code - it is very strange and I don't understand it. Is it GPT again?

2. Perhaps you need to normalise the data

3. Maybe there is something wrong in the data itself

4. Maybe there's an imbalance

5. Maybe it's actually working worse

1. It saves correctly - it matches the output in the log. The code, on the contrary, is clear to me: it is a compilation of the original code and the one you suggested here earlier.

2. Probably needed for this method. I will try to do it.

3. Well, there may be something wrong in the data - that's what the selection process is for. The data itself is produced correctly, if that is what we are talking about.

4. That's what I'm writing about. Maybe auto-balancing needs to be set in the parameters? (See the sketch after this list.)

5. So far it turns out like this. Maybe with stationary data it would be OK.
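
On point 4: CatBoost does have a built-in auto-balancing training parameter, auto_class_weights. A sketch of how it might be set from R (parameter values per the CatBoost docs; the rest of the wiring is illustrative):

library(catboost)
pool <- catboost.load_pool(data_without_excluded, label = target)
params <- list(loss_function = "Logloss",
               auto_class_weights = "Balanced",       # or "SqrtBalanced"
               iterations = 500)
model_cb <- catboost.train(pool, params = params)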

 
Aleksey Vyazmikin #:

1. It saves correctly - it matches the output in the log. The code, on the contrary, is clear to me: it is a compilation of the original code and the one you suggested here earlier.

2. Probably needed for this method. I will try to do it.

3. Well, there may be something wrong in the data - that's what the selection process is for. The data itself is produced correctly, if that is what we are talking about.

4. That's what I'm writing about. Maybe auto-balancing needs to be set in the parameters?

5. So far it turns out like this. Maybe with stationary data it would be OK.

#  Get the indices of the predictors that were not selected (the exclusion list)
индексы_оставшихся_предикторов <- setdiff(индексы_всех_предикторов, match(Pred, все_предикторы))

#  Decrease the indices by 1
индексы_оставшихся_предикторов <- индексы_оставшихся_предикторов - 1

I don't understand this part at all.



Normalisation is easy - just apply scale(data) to the data before all the procedures. This normalises the matrix by columns.
 
mytarmailS #:
That's what I don't understand at all.

For CatBoost you need to supply the list of predictors to exclude, i.e. those that were not selected. Decreasing the value by 1 is necessary because indices in CatBoost are counted from zero, as in many other languages.
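
A hedged sketch of how those zero-based indices are consumed on the CatBoost side: the ignored_features training parameter takes zero-based column indices, which is why the script subtracts 1 (the R wiring here is illustrative):

library(catboost)
ignored <- read.csv("E:\\FX\\MT5_CB\\MQL5\\Files\\00_Standart_50\\Setup\\Оставшиеся_предикторы.csv")$x
pool <- catboost.load_pool(data_without_excluded, label = target)
params <- list(loss_function = "Logloss",
               ignored_features = as.integer(ignored),  # columns CatBoost will skip
               iterations = 500)
model_cb <- catboost.train(pool, params = params)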

mytarmailS #:
Normalisation is easy - just apply scale(data) to the data before all the procedures. This normalises the matrix by columns.
#  Normalisation of the selected predictors
normalized_data <- scale(data_without_excluded)
data_without_excluded <- normalized_data

Is this acceptable?

In general, the result is identical to that without normalisation.

 
Aleksey Vyazmikin #:

For CatBoost you need to supply the list of predictors to exclude, i.e. those that were not selected. Decreasing the value by 1 is necessary because indices in CatBoost are counted from zero, as in many other languages.


Is this acceptable?

In general, the result is identical to that without normalisation.

Yes, it is acceptable.

So the problem is something else. I don't understand the code I highlighted, but without a computer in front of me I'm slow on the uptake - I'll have a look tomorrow....

I had coffee at 6.30pm, it's almost 3.40am and I'm lying here staring at the ceiling.
 
mytarmailS #:

Shit, I had coffee at 6.30pm, it's almost 3.40am now and I'm lying here staring at the ceiling.
It's not the coffee that's keeping you up, it's the ML - it's the disease.
 
mytarmailS #:
Yes, it is acceptable.

So the problem is something else. I don't understand the code I highlighted, but without a computer in front of me I'm slow on the uptake - I'll have a look tomorrow....

I had coffee at 6.30pm, it's almost 3.40am and I'm lying here staring at the ceiling.

So the coffee wasn't fake.
