Machine learning in trading: theory, models, practice and algo-trading - page 3009

 

First you have to realise that the model is full of rubbish inside...

If you decompose a trained tree-based model into its internal rules and the statistics on those rules,

like this:

     len  freq   err                                                                                 condition pred
315    3 0.002 0.417    X[,1]>7.49999999999362e-05 & X[,2]<=-0.00026499999999996 & X[,4]<=0.000495000000000023    1
483    3 0.000 0.000     X[,1]<=0.000329999999999941 & X[,8]>0.000724999999999976 & X[,9]>0.000685000000000047    1
484    3 0.002 0.273      X[,1]>0.000329999999999941 & X[,8]>0.000724999999999976 & X[,9]>0.000685000000000047   -1
555    3 0.001 0.333   X[,5]<=0.000329999999999941 & X[,7]>0.000309999999999921 & X[,8]<=-0.000144999999999951   -1
687    3 0.001 0.250 X[,2]<=-0.00348499999999996 & X[,7]<=-0.000854999999999939 & X[,9]<=-4.99999999999945e-05    1
734    3 0.003 0.000    X[,7]>-0.000854999999999939 & X[,8]>0.000724999999999865 & X[,9]<=0.000214999999999965    1
1045   3 0.003 0.231   X[,1]<=-0.000310000000000032 & X[,4]>0.000105000000000022 & X[,4]<=0.000164999999999971   -1
1708   3 0.000 0.000    X[,3]>0.00102499999999994 & X[,6]<=0.000105000000000022 & X[,7]<=-0.000650000000000039    1
1709   3 0.002 0.250     X[,3]>0.00102499999999994 & X[,6]<=0.000105000000000022 & X[,7]>-0.000650000000000039   -1
1984   3 0.001 0.000     X[,1]<=0.000329999999999941 & X[,8]>0.000724999999999976 & X[,9]>0.000674999999999981    1
2654   3 0.003 0.000        X[,4]<=0.00205000000000011 & X[,5]>0.0014550000000001 & X[,9]<=0.00132999999999994    1
2655   3 0.000 0.000         X[,4]<=0.00205000000000011 & X[,5]>0.0014550000000001 & X[,9]>0.00132999999999994   -1
2656   3 0.001 0.200         X[,3]<=0.00245499999999998 & X[,4]>0.00205000000000011 & X[,5]>0.0014550000000001   -1
2657   3 0.000 0.000          X[,3]>0.00245499999999998 & X[,4]>0.00205000000000011 & X[,5]>0.0014550000000001    1
2852   3 0.000 0.000                X[,2]<=-0.001135 & X[,8]>-0.000130000000000075 & X[,8]>0.00128499999999998   -1
2979   3 0.001 0.200     X[,1]>0.000930000000000097 & X[,1]>0.00129000000000012 & X[,8]<=-0.000275000000000025   -1


and analyse how the rule error (err) depends on the frequency (freq) with which the rule occurs in the sample.
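A minimal sketch of how such a rule table can be obtained and those statistics computed. The column layout (len, freq, err, condition, pred) matches what the inTrees package prints, so the sketch assumes a random forest plus inTrees; the actual model and tools are not named above, so this is only a guess at the workflow.

library(randomForest)
library(inTrees)

# X - feature matrix, Y - class labels (-1 / 1), as in the table above
rf <- randomForest(x = X, y = as.factor(Y), ntree = 100)

treeList <- RF2List(rf)                        # forest -> list of trees
ruleExec <- unique(extractRules(treeList, X))  # pull the split conditions out of the trees
rules    <- getRuleMetric(ruleExec, X, Y)      # adds len, freq, err and pred for every rule

rules <- as.data.frame(rules, stringsAsFactors = FALSE)
rules$freq <- as.numeric(rules$freq)
rules$err  <- as.numeric(rules$err)

# dependence of the rule error on how often the rule fires
plot(rules$freq, rules$err, xlab = "freq", ylab = "err")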


We get a scatter of err against freq.


Then we are interested in this area:

where the rules work very well, but they fire so rarely that it makes sense to doubt the reliability of their statistics, because 10-30 observations is not statistics.
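One way to put a number on that doubt (my own choice of check, not something stated above) is to treat each rule as a small binomial sample and look at the confidence interval of its error rate: with only 10-30 firings, even an observed error of 0 leaves a wide range of plausible true error.

# k - number of wrong predictions by the rule, n - number of times the rule fired
rule_err_ci <- function(k, n) {
  binom.test(k, n)$conf.int   # exact 95% confidence interval for the true error rate
}

rule_err_ci(0, 15)    # 0 errors out of 15 firings: the upper bound is still about 22%
rule_err_ci(0, 300)   # 0 errors out of 300 firings: the upper bound drops to about 1.2%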

 
mytarmailS #:

First you have to realise that the model is full of rubbish inside...

If you decompose a trained tree-based model into its internal rules and the statistics on those rules,

like this:

and analyse how the rule error (err) depends on the frequency (freq) with which the rule occurs in the sample.

We get a scatter of err against freq.

Just a ray of sunshine in the darkness of recent posts
If you parse the model's errors properly, you can find something interesting. And it can be done very quickly, without any GPU, SMS or registration.
 
Maxim Dmitrievsky #:
Just a ray of sunshine in the darkness of recent posts
If you parse the model's errors properly, you can find something interesting. And it can be done very quickly, without any GPU, SMS or registration.

There will be an article about it, if it works out.

 
mytarmailS #:

There will be an article about it, if it works out.

Fine, my last article was about much the same thing. But if your way is faster, that's a plus.
 
Maxim Dmitrievsky #:
Fine, my last article was about much the same thing. But if your way is faster, that's a plus.

What do you mean, faster?

 
mytarmailS #:

What do you mean, faster?

In terms of speed.
 
Maxim Dmitrievsky #:
In terms of speed.

about 5-15 seconds on a 5k sample

 
mytarmailS #:

about 5-15 seconds on a 5k sample.

I mean the whole process, from the beginning to getting the TS (trading system).

I have 2 models being retrained several times, so it's not very fast, but it's acceptable.

and at the end I don't know what exactly they've screened out.

 
Maxim Dmitrievsky #:

I mean the whole process, from the beginning to getting the TS (trading system).

I have 2 models being retrained several times, so not very fast, but acceptable

and at the end, I don't know what exactly they screened out.

Train: 5k
Valid: 60k

model training: 1-3 seconds
rule extraction: 5-10 seconds
checking each rule (20-30k rules) against the 60k validation set: 1-2 minutes


Of course, everything is approximate and depends on the number of features and the data.
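A sketch of what the rule-checking step might look like with the same inTrees-style rule format as above (the variable names and thresholds here are illustrative, not from the post): re-measure every rule's statistics on the separate 60k validation set and keep only the rules whose error and frequency hold up out of sample.

# Xval, Yval - the 60k validation set; ruleExec - conditions extracted on the training set
rulesVal <- getRuleMetric(ruleExec, Xval, Yval)   # err/freq of every rule on validation data
rulesVal <- as.data.frame(rulesVal, stringsAsFactors = FALSE)
rulesVal$err  <- as.numeric(rulesVal$err)
rulesVal$freq <- as.numeric(rulesVal$freq)

# keep rules that still look good out of sample and fire often enough to be trusted
good <- subset(rulesVal, err < 0.35 & freq * nrow(Xval) >= 100)   # thresholds are illustrative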

 
Forester #:

Unfortunately no one found it, otherwise I'd be on tropical islands instead of here))))

Yes. Even 1 tree or regression can find a pattern if it's there and doesn't change.

Easy. I can generate dozens of such datasets. Right now I am exploring TP=50 and SL=500. The error against the teacher's markup averages about 10%; if it were 20%, the model would be a losing one.
So the point is not the classification error itself, but the result of adding up all the profits and losses.

As you can see, the top model has an error of 9.1%, and you can earn something with an error of 8.3%.
The charts show only the OOS, obtained by Walking Forward with retraining once a week, 264 retrainings in total over 5 years.
It is interesting that the model broke even with a classification error of 9.1%, while 50/500 = 0.1, i.e. it should be 10%. It turns out that about 1% was eaten by the spread (the minimum per bar; the real one will be bigger).
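For reference, the break-even arithmetic itself (my own quick check, not from the post), under the simplifying assumption that every trade closes exactly at TP or SL and ignoring the spread:

tp <- 50; sl <- 500
tp / sl          # rough break-even error rate, ignoring the (1 - e) factor: 0.10
tp / (tp + sl)   # break-even error from (1 - e) * tp = e * sl: about 0.091

Spread and commission per trade push the actual break-even error below these figures.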

That test used real volumes from CME for EURUSD: cumulative volume, delta, divergence and convergence over 100 bars. 400 columns in total, plus 5 more of some kind.
Without changing any model settings, I simply deleted the 405 columns with CME data (the price deltas and zigzags remained, 115 columns in total) and got slightly better results. So the volumes do sometimes get selected in splits, but on the OOS they turn out to be noise. And with them, training is 3.5 times slower.

For comparison, I left the charts with volumes at the top and without volumes at the bottom.

I had hoped that the CME volumes would bring additional information/regularities that would improve learning. But as you can see, the models without volumes are slightly better, even though the charts are very similar.
This was my second attempt at CME data (I first tried it 3 years ago), and again it was unsuccessful.
It turns out that everything is already accounted for in the price.

Has anyone else tried adding volumes to the training? Are your results the same, or do they give an improvement?
