How to predict the direction of a currency pair? - General

Elvin Nasirov 2023.01.15 21:20 #29021

Hey, everybody!

Maybe someone can give me some advice. I am trying to predict the direction of a currency pair for the day (up or down) using the "DecisionTreeClassifier" model.

I use only 5 predictors, and the prediction target is an upward (1) or downward (-1) move. Dataset size: 999 rows and 6 columns (dataset attached).

But I ran into a problem: increasing "max_depth" keeps raising the accuracy on the training and test samples at the same time. The accuracy on the test sample stops growing and levels off at max_depth=22, at 0.780000. Results at different values of max_depth:

1) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=3)

Accuracy on training set: 0.539424 Accuracy on test set: 0.565000

2) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=5)

Accuracy on training set: 0.579474 Accuracy on test set: 0.585000

3) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=7)

Accuracy on training set: 0.637046 Accuracy on test set: 0.640000

4) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=9)

Accuracy on training set: 0.667084 Accuracy on test set: 0.700000

5) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=11)

Accuracy on training set: 0.700876 Accuracy on test set: 0.710000

6) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=13)

Accuracy on training set: 0.720901 Accuracy on test set: 0.720000

7) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=15)

Accuracy on training set: 0.734668 Accuracy on test set: 0.740000

8) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=17)

Accuracy on training set: 0.747184 Accuracy on test set: 0.760000

9) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=19)

Accuracy on training set: 0.755945 Accuracy on test set: 0.765000

10) clf_20=DecisionTreeClassifier(criterion='entropy', max_depth=22)

Accuracy on training set: 0.760951 Accuracy on test set: 0.780000

I am extremely confused by this, because I have heard that you should not use max_depth above 3-4 because of the risk of overfitting. But is this how a model behaves when it overfits? It looks more like an underfitted model.

In this situation I don't understand what decision tree depth to choose, or even which model, and whether it is worth working further in this direction at all. Maybe something is missing (though the dataset is not just 100 rows). Is it possible to add more predictors, and how many more can be added for a dataset of this size? (I would add 2-5 more.)

The code is simple, I also attach it together with the dataset:

Files:

df_mql5.csv 32 kb

1y0ati_o0zr2d_2023-01-16_q_00.09.49.png 228 kb


Aleksey Vyazmikin 2023.01.15 22:06 #29022

Elvin Nasirov #:

I am extremely confused by this, because I have heard that you should not use max_depth above 3-4 because of the risk of overfitting. But is this how a model behaves when it overfits? It looks more like an underfitted model.

In this situation I don't understand what decision tree depth to choose, or even which model, and whether it is worth working further in this direction at all. Maybe something is missing (though the dataset is not just 100 rows). Is it possible to add more predictors, and how many more can be added for a dataset of this size? (I would add 2-5 more.)

The code is simple, I also attach it together with the dataset:

Hello.

More splits means more memory, which means a risk of memorizing the sample.

I'm not proficient in Python, but:

1. Try splitting the sample without shuffling.

2. It still seems to me that you are training on the whole sample, not on the training subset.
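A minimal sketch of the first suggestion, assuming the rows are already in chronological order. The arrays are synthetic stand-ins, not the attached dataset, and the 80/20 ratio is an assumption. `shuffle=False` is the `train_test_split` option that keeps the original row order, so the test part is simply the most recent rows.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 999 rows, 5 predictors, labels -1/1.
rng = np.random.default_rng(0)
X = rng.normal(size=(999, 5))
y = rng.choice([-1, 1], size=999)

# shuffle=False preserves row order: the earliest rows go to train and
# the latest rows go to test, so the model is never evaluated on days
# that precede its training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

print(X_train.shape, X_test.shape)
```

With shuffling left on (the default), neighbouring days end up on both sides of the split, which can make the test score look better than it would on genuinely unseen future data.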


Elvin Nasirov 2023.01.15 22:27 #29023

Aleksey Vyazmikin #:

Hello.

More splits means more memory, which means a risk of memorizing the sample.

I'm not proficient in Python, but:

1. Try splitting the sample without shuffling.

2. It still seems to me that you are training on the whole sample, not on the training subset.

Thank you! It seems that you are right.

I replaced "clf_20.fit(X, y)" with "clf_20.fit(X_train, y_train)" in the above code and the picture changed: accuracy is now almost 50/50.

Aleksey Vyazmikin 2023.01.15 22:32 #29024

Elvin Nasirov #:

Thank you! I think you're right.

I replaced "clf_20.fit(X, y)" with "clf_20.fit(X_train, y_train)" in the above code and the picture changed: accuracy is now almost 50/50.

Such a result is normal. A result that looks too good is always a reason to start looking for a bug in the code.

Elvin Nasirov 2023.01.15 22:46 #29025

Aleksey Vyazmikin #:

Such a result is normal. A result that looks too good is always a reason to start looking for a bug in the code.

I have another question, if I may.

It turns out that the best result is achieved at max_depth=1 and looks like this:

Accuracy on training set: 0.515021 Accuracy on test set: 0.503333

This seems extremely bad, no better than flipping a coin. Or can we consider it a good result and conclude that we have found a formalisation that equates the probability of a forex move with the probability of a coin-flip outcome?

That is, for each combination of predictors there are two equally likely outcomes, up or down, so the dataset needs to be supplemented with something that could indicate whether a given combination leads up or down.


Aleksey Vyazmikin 2023.01.15 23:03 #29026

Elvin Nasirov #:

Another question came up, if I may.

It turns out that the best result is achieved at max_depth=1 and looks like this:

Accuracy on training set: 0.515021 Accuracy on test set: 0.503333

This seems extremely bad, no better than flipping a coin. Or can we consider it a good result and conclude that we have found a formalisation that equates the probability of a forex move with the probability of a coin-flip outcome?

That is, for each combination of predictors there are two equally likely outcomes, up or down, so the dataset needs to be supplemented with something that could indicate whether a given combination leads up or down.

First, read about other metrics for evaluating training results: Recall and Precision. They are especially relevant for imbalanced samples. A strategy can produce a positive financial outcome even when the classifier has an equal chance of correct and incorrect results.
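To make these two metrics concrete, here is a small hand-rolled sketch. The helper name `precision_recall`, the sample labels, and the pip values are all made up for illustration. Precision is the share of predicted "up" days that really were up. Recall is the share of actual "up" days that the model caught. The last lines show why a 50% hit rate can still pay: what matters is the average win against the average loss.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels: 1 = up day, -1 = down day.
y_true = [1, 1, -1, -1, 1, -1, 1, -1]
y_pred = [1, -1, -1, 1, 1, -1, -1, -1]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")

# Expected result per trade with a 50% hit rate (hypothetical pip values).
hit_rate = 0.5
avg_win_pips = 30.0
avg_loss_pips = 20.0
expectancy = hit_rate * avg_win_pips - (1 - hit_rate) * avg_loss_pips
print(f"expectancy per trade: {expectancy} pips")
```

scikit-learn provides the same metrics as `precision_score` and `recall_score` in `sklearn.metrics`, so the manual version is only there to show the definitions.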

Consider a more complex but logical target labeling. Predicting at the open how a day will close is harder than predicting the probability of a rise or fall by some percentage from the day's open. With the latter there is a chance of identifying an intraday pattern.
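One possible reading of this suggestion, sketched under assumptions: the thread does not give a labeling rule, so the function `label_day`, the 0.3% threshold, and the sample bars below are all hypothetical. A day is labeled 1 if price reached the upper threshold from the open without reaching the lower one, -1 in the mirror case, and 0 otherwise.

```python
def label_day(open_, high, low, pct=0.003):
    """Label a daily bar by whether price moved pct away from the open."""
    up_hit = high >= open_ * (1 + pct)
    down_hit = low <= open_ * (1 - pct)
    if up_hit and not down_hit:
        return 1
    if down_hit and not up_hit:
        return -1
    # Neither level reached, or both reached. With daily bars alone the
    # order in which both levels were touched is unknown, so the day is
    # left unlabeled.
    return 0


# Hypothetical (open, high, low) bars.
days = [
    (1.0800, 1.0850, 1.0790),  # only the upper level is reached
    (1.0800, 1.0810, 1.0750),  # only the lower level is reached
    (1.0800, 1.0815, 1.0790),  # neither level is reached
    (1.0800, 1.0850, 1.0750),  # both levels are reached
]
labels = [label_day(o, h, l) for o, h, l in days]
print(labels)
```

Days labeled 0 can be dropped or kept as a third class. Resolving the "both reached" case properly would need intraday data.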

In my opinion, the sample is too small.

Think about creating predictors that can describe the market. From the predictor values, in my opinion, it should be possible to reconstruct the situation on the chart without looking at it.

I recommend trying CatBoost for training: it builds models quickly, and transferring models into code that runs in MT5 without workarounds is already solved.


Forester 2023.01.16 07:24 #29027

Elvin Nasirov #:

It turns out that the best result is achieved when max_depth=1 and it looks like this:

Accuracy on training set: 0.515021 Accuracy on test set: 0.503333

I also often see that the best result is at depth=1, which means that only one split on one of the features was made. Further splitting of the tree leads to overfitting on the training set and worse results on the test set.


Elvin Nasirov 2023.01.16 07:35 #29028

elibrarius #:

I also often see that the best result is at depth=1, which means that only one split on one of the features was made. Further splitting of the tree leads to overfitting on the training set and worse results on the test set.

I checked the results yesterday. It turned out that the model predicted "1" in every case, hence the roughly 50/50 accuracy. You could do without the model entirely and just say "up" every time.
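A quick check for this kind of degenerate model, as a sketch with made-up labels: count how often each class is predicted, then compare the model's accuracy with a baseline that always predicts the most frequent class. If the two numbers match, the model adds nothing.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of positions where prediction and label agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up test labels: six up days (1) and four down days (-1).
y_test = [1, -1, 1, 1, -1, 1, 1, -1, 1, -1]
# A model that always says "up", as described in the post above.
y_pred = [1] * len(y_test)

# A single key here means the model never predicts the other class.
pred_counts = Counter(y_pred)

# Baseline: always predict the most frequent class in the test labels.
majority_class = Counter(y_test).most_common(1)[0][0]
baseline_pred = [majority_class] * len(y_test)

model_acc = accuracy(y_test, y_pred)
baseline_acc = accuracy(y_test, baseline_pred)
print(pred_counts, model_acc, baseline_acc)
```

scikit-learn offers the same baseline as `DummyClassifier(strategy="most_frequent")`, and `sklearn.metrics.confusion_matrix` shows the one-sided predictions at a glance.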


mytarmailS 2023.01.18 08:23 #29029

Trading as a professional pro trader

https://youtu.be/RS9jRVmW1j4

This is what support and resistance levels are in my understanding.....

Not everyone will understand it, but if they do, kudos to them....

EARNING SEASON KICKS OFF - Trading Futures Live



Aleksey Vyazmikin 2023.01.18 09:38 #29030

mytarmailS #:
Trading as a professional pro trader

https://youtu.be/RS9jRVmW1j4

This is what support and resistance levels are in my understanding.....

Not everyone will understand it, but if they do, I commend them...

If you do, you can trade like this.

Have you already put these levels into code? There are so many levels there that it is not realistic to trade them by hand....

Machine learning in trading: theory, models, practice and algo-trading - page 2903