Machine learning in trading: theory, models, practice and algo-trading - page 949

 
SanSanych Fomenko:

What trend can there be in classification? Errors in class prediction will break up the trend; nothing will be left of it.

Well, why not? I only define the entries; the exits will be handled by a trailing stop, not by the ML results.

 
SanSanych Fomenko:

Of course, for fuck's sake!

What others?

Let me count them.

I'm waiting with interest!

 
Aleksey Vyazmikin:

I look forward to it with interest!

Here.

Number of observations used to build the model: 20276
Missing value imputation is active.

Call:
 randomForest(formula = as.factor(arr_Buy) ~ .,
              data = crs$dataset[crs$sample, c(crs$input, crs$target)],
              ntree = 500, mtry = 7, importance = TRUE, replace = FALSE, na.action = randomForest::na.roughfix)

               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 7

        OOB estimate of  error rate: 17.76%
Confusion matrix:
     -1    0    1 class.error
-1 4429 1157   84   0.2188713
0   498 8288  501   0.1075697
1   102 1259 3958   0.2558752

Variable Importance
===================

                                -1     0     1 MeanDecreaseAccuracy MeanDecreaseGini
arr_LastBarPeresekD_Down_M15 56.11 64.27 64.49                69.56           211.39
arr_DonProc_M15              60.79 63.68 57.48                67.77           298.46
Levl_High_H4                 59.69 61.76 57.36                69.74           195.16
Levl_Close_W1                54.44 58.90 57.08                64.35           234.41
arr_Regresor                 56.51 55.71 56.09                61.40           212.89
Levl_Low_H4                  50.14 52.47 55.56                57.38           203.09
Levl_Low_D1                  51.00 50.52 55.24                57.91           192.80
arr_Den_Nedeli               47.86 50.18 55.22                53.55           214.23
arr_DonProcVisota            53.91 58.53 55.15                58.61           305.84
Levl_Close_MN1               51.68 51.71 54.70                58.12           228.30
Levl_Close_D1                53.13 51.06 51.83                57.80           267.86
arr_LastBarPeresekD_Up_M15   52.46 52.45 49.94                56.53           218.22
arr_DonProc                  47.96 69.45 49.35                60.33           322.91
Levl_Support_D1              52.28 52.42 49.21                56.50           253.82
Levl_Support_W1              47.90 50.98 47.38                53.37           219.35
Levl_High_W1                 48.68 47.64 47.35                52.45           144.54
Levl_Low_H1                  46.62 53.94 46.72                54.10           208.75
Levl_Support_MN1             41.67 44.57 46.52                46.83           198.77
arr_TimeH                    44.65 46.73 45.21                47.78           183.06
Levl_High_D1                 43.69 42.56 45.17                46.79           169.77
Levl_first_H4                41.65 44.09 43.92                46.57           121.20
Levl_High_MN1                37.88 40.27 42.96                42.52           142.87
X_USE_Filter_MA_02           38.67 43.46 42.57                49.19            84.23
Levl_first_H1                38.36 40.30 40.51                44.00           135.97
Levl_Low_MN1                 36.20 39.33 39.59                40.68           149.38
Levl_High_H1                 36.34 39.67 39.02                40.28           196.14
arr_LastBarPeresekD_Down     40.51 39.81 37.87                43.21           232.92
Levl_first_D1                33.94 36.19 36.47                38.99            78.20
Levl_first_MN1               30.33 33.31 35.62                34.03            99.66
arr_LastBarPeresekD_Up       32.66 40.83 35.21                38.65           238.36
Levl_Low_W1                  33.29 34.25 35.02                35.13           175.21
X_Use_Donchianf              31.06 34.26 33.54                36.21            97.49
Levl_Support_H4              33.55 38.03 33.15                36.91           248.48
Use_Filter_MA_Prirost        31.89 31.93 31.42                38.92            63.63
Levl_Close_H1                32.25 34.31 31.06                34.08           242.26
X_Use_Filter_Fibo_in_Day     29.56 30.80 30.99                36.89            71.70
Levl_Close_H4                34.27 33.26 30.79                34.17           272.58
X_USE_Filter_MA              25.90 31.13 29.25                33.87            66.11
X_Use_BarPeresek_iMA_TF      26.07 23.12 28.88                32.17            31.87
Levl_first_W1                26.50 28.50 27.21                28.70            83.33
arr_Vektor_Week              25.93 25.76 26.68                29.62            44.61
arr_Vektor_Don_M15           29.11 28.28 26.27                31.15            53.36
arr_RSI_Open_H1              30.82 29.38 25.75                36.56            45.64
Levl_Support_H1              26.88 27.87 25.56                27.72           215.06
arr_Vektor_Day               22.67 24.33 24.70                26.31            43.21
arr_Vektor_Don               23.32 25.04 21.89                25.40            65.35
arr_BB_Up                    10.94 11.71 16.86                15.05            21.55
arr_BB_Center                16.63 17.40 16.01                17.13            58.55
X_Use_ChanelEvaProc          13.36 17.13 12.74                23.63           106.51
arr_RSI_Open_M1               8.95 11.16 12.15                13.34            33.44
arr_BB_Down                  13.49 13.31  6.84                13.36            24.11
USE_Filter_MA_Donchian        3.60 -1.85  5.00                 3.82             2.32


The required number of trees has grown; it is certainly more than 100.


The error is computed incorrectly: it should be calculated by column. It is worse than before, but still very decent.
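
To make the row-versus-column distinction concrete, here is a minimal sketch in Python (chosen only for illustration; the output above comes from R). It assumes that "by column" means the share of wrong answers among the bars predicted as each class, while R's class.error is computed by row, that is, per actual class. The function names are made up for this example.

```python
# OOB confusion matrix from the randomForest output above.
# Rows are actual classes, columns are predicted classes, both ordered -1, 0, 1.
LABELS = [-1, 0, 1]
OOB = [
    [4429, 1157, 84],
    [498, 8288, 501],
    [102, 1259, 3958],
]


def row_errors(matrix):
    """Share of each actual class that was misclassified (R's class.error)."""
    return [1 - row[i] / sum(row) for i, row in enumerate(matrix)]


def column_errors(matrix):
    """Share of each predicted class that actually belonged to another class."""
    errors = []
    for j in range(len(matrix)):
        column_total = sum(row[j] for row in matrix)
        errors.append(1 - matrix[j][j] / column_total)
    return errors


by_row = row_errors(OOB)
by_column = column_errors(OOB)

for label, r, c in zip(LABELS, by_row, by_column):
    print(f"class {label:>2}: by row {r:.1%}, by column {c:.1%}")
```

For a trading signal the column figure is arguably the more relevant one, since it answers "when the model says buy or sell, how often is it wrong?".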

 
Error matrix for the Random Forest model on Pred_027_2016_H2_T.csv [validate] (counts):

      Predicted
Actual    -1     0     1 Error
    -1 20157  5167   292  21.3
    0   2222 37861  2060  10.2
    1    373  5502 17608  25.0

Error matrix for the Random Forest model on Pred_027_2016_H2_T.csv [validate] (proportions):

      Predicted
Actual   -1    0    1 Error
    -1 22.1  5.7  0.3  21.3
    0   2.4 41.5  2.3  10.2
    1   0.4  6.0 19.3  25.0

Overall error: 17.1%, Averaged class error: 18.83333%.


Error matrix for the Random Forest model on Pred_027_2016_H2_T.csv [test] (counts):


      Predicted
Actual    -1     0     1 Error
    -1 19963  5131   328  21.5
    0   2259 37753  2104  10.4
    1    404  5703 17597  25.8


Error matrix for the Random Forest model on Pred_027_2016_H2_T.csv [test] (proportions):


      Predicted
Actual   -1    0    1 Error
    -1 21.9  5.6  0.4  21.5
    0   2.5 41.4  2.3  10.4
    1   0.4  6.3 19.3  25.8


Overall error: 17.4%, Averaged class error: 19.23333%
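
For anyone who wants to check the two summary figures, a small Python sketch (for illustration only; the report itself comes from R) that recomputes them from the count matrices above. It assumes "overall error" is the share of all misclassified rows and "averaged class error" is the plain mean of the per-row errors; the function names are invented for this example.

```python
# Count matrices copied from the [validate] and [test] reports above.
# Rows are actual classes, columns are predicted classes, both ordered -1, 0, 1.
VALIDATE = [
    [20157, 5167, 292],
    [2222, 37861, 2060],
    [373, 5502, 17608],
]
TEST = [
    [19963, 5131, 328],
    [2259, 37753, 2104],
    [404, 5703, 17597],
]


def overall_error(matrix):
    """Misclassified rows divided by all rows."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 1 - correct / total


def averaged_class_error(matrix):
    """Unweighted mean of the per-class (row-wise) error rates."""
    per_class = [1 - row[i] / sum(row) for i, row in enumerate(matrix)]
    return sum(per_class) / len(per_class)


for name, matrix in (("validate", VALIDATE), ("test", TEST)):
    print(
        f"{name}: overall {overall_error(matrix):.2%}, "
        f"averaged class {averaged_class_error(matrix):.2%}"
    )
```

The recomputed values can differ from the printed ones in the last digit, presumably because the report works with rounded percentages.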


The amazing stability of the error is very encouraging.

 
SanSanych Fomenko:

Here.


The required number of trees has grown; it is certainly more than 100.


The error is computed incorrectly: it should be calculated by column. It is worse than before, but still very decent.

Thank you! So the set of predictors is not so bad, and it makes sense to expand it!


SanSanych Fomenko:

The amazing stability of the error is very reassuring.

Or maybe the sample is just very typical? I think the model should be trained on the 2015 file and tested on the 2016 file: the global trends there run in opposite directions, and I doubt the system would work as effectively.

Eh, I wish I knew how to do that... I wonder whether Maxim's forests and the ones here are built by the same logic or not?

 
Aleksey Vyazmikin:

Thank you! So the set of predictors is not so bad, and it makes sense to expand it!


Or maybe the sample is just very typical? I think the model should be trained on the 2015 file and tested on the 2016 file: the global trends there run in opposite directions, and I doubt the system would work as effectively.

Eh, I wish I knew how to do that... I wonder whether Maxim's forests and the ones here are built by the same logic or not?

I wrote above, and I'll repeat it:

  • Check the predictive power
  • split the file and see the error on the second half.


PS.

There are too many predictors.

 
SanSanych Fomenko:

I wrote above, and I'll repeat it:

  • check the predictive ability
  • divide the file and see the error on the second half.

Why split the file if everything is already split into two files? I just don't know how to do it in R; no one has been able to explain it to me. Apparently I'm stupid.

SanSanych Fomenko:

PS.

There are too many predictors as it is.

Not really. This is not even everything I use in real trading, including in the ATS.

I very much hope that the network can outperform an optimized EA on history :)

 

Where did you get so many features? Did you pick them by hand for the strategy? Crazy :)

The logic of the forests should be roughly the same.

 

But here is a different model:

Error matrix for the SVM model on Pred_027_2016_H2_T.csv [validate] (counts):

      Predicted
Actual   -1     0    1 Error
    -1 6176 18666  774  75.9
    0  2242 38585 1316   8.4
    1  1333 17683 4467  81.0

Error matrix for the SVM model on Pred_027_2016_H2_T.csv [validate] (proportions):

      Predicted
Actual  -1    0   1 Error
    -1 6.8 20.5 0.8  75.9
    0  2.5 42.3 1.4   8.4
    1  1.5 19.4 4.9  81.0

Overall error: 46%, Averaged class error: 55.1

The result is QUANTITATIVELY different, although the model is qualitatively different; it should work poorly on your data.


We need to improve the randomForest.

 
Aleksey Vyazmikin:


I really hope that the network can outperform the optimized Expert Advisor on history :)

Why split the file if everything is already split into two files? I just don't know how to do it in R; no one has been able to explain it to me. Apparently I'm stupid.

Splitting is a piece of cake; the problem is the prejudice against R.
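
To show how little is involved, a minimal sketch of such a split, written in Python purely for illustration rather than in R. It assumes the rows are already in time order and simply cuts them in two without shuffling, so the model can be fitted on the earlier part and the error checked on the later one. The function name and the toy rows are hypothetical; only the arr_Buy column name comes from the output above.

```python
def chronological_split(rows, train_fraction=0.5):
    """Split time-ordered rows into an earlier and a later part, no shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical toy data standing in for the rows of a predictor file.
rows = [{"bar": i, "arr_Buy": (-1, 0, 1)[i % 3]} for i in range(10)]

first_half, second_half = chronological_split(rows)
print(len(first_half), len(second_half))
print(first_half[-1]["bar"], second_half[0]["bar"])
```

Keeping the order intact matters here: a random split would mix later bars into the training part and make the error on the second part look better than it is.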


I very much hope that the network will be able to outperform an optimized Expert Advisor on history :)

What is the network for?
