Machine learning in trading: theory, models, practice and algo-trading - page 1280

 

I stand by my opinion: here we have two indisputable relatives of the venerable KsanKsanych (Fa): 1) Alyoshenka the son, who is being chased by angry investors, and 2) Kesha the grandson, who promises billions to everyone who reads his grandfather's creations.

Please do not confuse the two!

 

An interesting opinion from a professional StarCraft 2 player on what is going on, especially about the cheating in the last match. We should not forget that the organization of such spectacles by large companies is primarily a marketing move. The right thing to do would have been to buy their shares intraday around this event.


 

If you are interested, you can compare the importance tables obtained by permutation and by actual removal of a predictor.

Importance of predictors by brute force (removing 1 at a time)
rank, feature, absolute value, relative value * 100
1) 17 0.01097643069603077 99
2) 30 0.006790004907923086 61
3) 61 0.004684715336508855 42
4) 2 -0.0002692516957934765 -2
5) 59 -0.0006465367565449825 -5
6) 34 -0.0006503517167333328 -5
7) 5 -0.001340840857516234 -12
8) 41 -0.001504570905518282 -13
9) 15 -0.001971414359495396 -17
10) 49 -0.002008411960897655 -18
11) 6 -0.002027305543154334 -18
12) 55 -0.002292162160081906 -20
13) 47 -0.002398304141661728 -21
14) 29 -0.003010337993465118 -27
15) 51 -0.004160368206123241 -37
16) 45 -0.004454751375256194 -40
17) 31 -0.004888451443569572 -44
18) 0 -0.00493201061731692 -44
19) 48 -0.005610904510929521 -51
20) 3 -0.005764515487066274 -52
21) 57 -0.005965409431599886 -54
22) 10 -0.006056332510674986 -55
23) 35 -0.006367565963429744 -58
24) 58 -0.006638024809636447 -60
25) 43 -0.007371220115761079 -67
26) 9 -0.007420288551508419 -67
27) 21 -0.007838972444520739 -71
28) 4 -0.007840269966254226 -71
29) 44 -0.008004942292835771 -72
30) 16 -0.008290498838290847 -75
31) 36 -0.008995332552560964 -81
32) 50 -0.009024243316015798 -82
33) 27 -0.009105675807931257 -82
34) 24 -0.01027361001595535 -93
35) 7 -0.01052719088846928 -95
36) 26 -0.01082406611271462 -98
37) 18 -0.01155880619525071 -105
38) 60 -0.01156309946744785 -105
39) 56 -0.01203862169736691 -109
40) 1 -0.01203862169736691 -109
41) 25 -0.0122272134638268 -111
42) 38 -0.01241174339783128 -113
43) 62 -0.01249635462233889 -113
44) 28 -0.01266702047388507 -115
45) 11 -0.01359028620740281 -123
46) 39 -0.01404126970316556 -127
47) 20 -0.01439737068264699 -131
48) 52 -0.01439756725211659 -131
49) 42 -0.01444571512808378 -131
50) 22 -0.01551886866180208 -141
51) 33 -0.01615798882405024 -147
52) 12 -0.01905830020505599 -173
53) 14 -0.01926462731981513 -175
54) 37 -0.01995084300903066 -181
55) 40 -0.020510512124551 -186
56) 19 -0.021415509666178 -195
57) 63 -0.02151966963894812 -196
58) 54 -0.02355949029687353 -214
59) 64 -0.02507021252693609 -228
60) 32 -0.02702794503628224 -246
61) 8 -0.02803580711831312 -255
62) 13 -0.03090123190409769 -281
63) 46 -0.03344678821960098 -304
64) 53 -0.03558721250407129 -324
65) 23 -0.04407219798162174 -401

Importance of predictors by the permutation method
rank, feature, absolute value, relative value * 100
1) 55 0.04340158682225395 99
2) 61 0.02562763893643727 59
3) 58 0.02546470705535522 58
4) 56 0.02529445125891924 58
5) 59 0.02513377163594621 57
6) 57 0.02208166602125552 50
7) 64 0.02019285632774162 46
8) 60 0.0160907362360114 37
9) 43 0.0125324616278514 28
10) 35 0.01239249171969528 28
11) 13 0.01233138008911674 28
12) 24 0.01170363669371338 26
13) 62 0.01162424331038356 26
14) 63 0.01149019906346291 26
15) 45 0.01127777161657609 25
16) 34 0.01085020622422195 24
17) 46 0.01061844113396632 24
18) 20 0.01007598993178244 23
19) 2 0.009874770749918993 22
20) 19 0.00973881761283335 22
21) 1 0.009100774421598679 20
22) 32 0.009027289557555301 20
23) 9 0.008970631365350451 20
24) 54 0.00802484531062575 18
25) 8 0.007874015748031482 18
26) 53 0.007388216046985141 17
27) 41 0.006952887365763216 16
28) 12 0.0065631543248105 15
29) 21 0.006511968996697037 15
30) 31 0.006445981174562854 14
31) 30 0.005790682414698156 13
32) 42 0.005742446472030011 13
33) 22 0.003590654957257189 8
34) 4 0.003590358440616087 8
35) 38 0.00350243104857792 8
36) 10 0.00350243104857792 8
37) 29 0.003392223030944636 7
38) 5 0.003253553701826867 7
39) 52 0.003019071994331074 6
40) 11 0.002622140078149371 6
41) 15 0.001506974549529611 3
42) 49 0.001178236999850979 2
43) 27 0.000646877104963639 1
44) 23 0.0001088642328799794 0
45) 0 -0.0007427642973199949 -1
46) 36 -0.0008086747680855211 -1
47) 18 -0.001719116017552688 -3
48) 16 -0.003868408494392753 -8
49) 7 -0.004264601904658535 -9
50) 25 -0.004436590312574581 -10
51) 44 -0.004549722466056144 -10
52) 17 -0.005094229165450173 -11
53) 33 -0.007112771718937178 -16
54) 50 -0.008009653155771651 -18
55) 6 -0.008725562553674474 -20
56) 26 -0.01000190433609049 -23
57) 47 -0.01158648521535965 -26
58) 3 -0.01809942562041326 -41
59) 51 -0.01843159353630121 -42
60) 39 -0.02375369534904158 -54
61) 40 -0.02659139305699997 -61
62) 37 -0.02970174182772609 -68
63) 48 -0.031083105562031 -71
64) 14 -0.03323633066169551 -76
65) 28 -0.03952723165321592 -91

By permutation, the first 10 lines show that removing the predictor worsens the error by 2-6%; in the brute-force table, the first 10 worsen it by only 0.1-0.2%. That is because in practice the tree will always find another predictor that gives almost as good a split (primarily among those correlated with the removed one, but even if those are removed beforehand too, something will still be found).

What is interesting is that almost half of the predictors show negative importance when actually removed, i.e. removing them decreases the tree error, so they are clearly noise. Yet even the noisiest one only worsens the result by 0.5%.
And the fact that the two importance orderings are not at all similar suggests that it is still better to sift out the noise predictors by enumeration (actual removal).
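For readers who want to reproduce this kind of comparison, below is a minimal Python sketch of the permutation measurement (the removal-based benchmark is sketched further down the thread). The model, the synthetic data and all names are illustrative assumptions, not the poster's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Toy data just so the sketch runs end to end; replace with your own sample.
X = rng.normal(size=(2000, 65))
y = (X[:, 55] + 0.5 * X[:, 61] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
base_error = 1.0 - accuracy_score(y, model.predict(X))

perm_importance = np.empty(X.shape[1])
for j in range(X.shape[1]):
    X_shuffled = X.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])  # destroy only column j
    err = 1.0 - accuracy_score(y, model.predict(X_shuffled))
    perm_importance[j] = err - base_error                 # positive = error got worse

# Print the ten most important features, analogous to the table above.
for rank, j in enumerate(np.argsort(perm_importance)[::-1][:10], 1):
    print(f"{rank}) feature {j}: {perm_importance[j]:.5f}")
```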

 

Maybe because you have to compare against some benchmark or known example, rather than comparing apples with oranges.

Plus, speed is very important. Since alglib doesn't have feature importances built in, I think permutation is optimal for now (I've tried a bunch of brute-force methods).

 
elibrarius:

By permutation, the first 10 lines show that removing the predictor worsens the error by 2-6%; in the brute-force table, the first 10 worsen it by only 0.1-0.2%. That is because in practice the tree will always find another predictor that gives almost as good a split (primarily among those correlated with the removed one, but even if those are removed beforehand too, something will still be found).

Why do you need the overall error, do you have a balanced binary sample? I'm leaning more toward finding ways to improve class 1 accuracy.

 
Aleksey Vyazmikin:

Why do you need the overall error, do you have a balanced binary sample?

The overall error is not for an individual leaf, it is for the whole tree/forest.

Aleksey Vyazmikin:

I'm leaning more toward finding ways to improve Class 1 accuracy.

Me too)
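Since the interest here is class 1 rather than a single overall error number, a hedged sketch of how one might look at per-class quality instead (scikit-learn assumed; the toy labels and predictions are placeholders, not anyone's results):

```python
from sklearn.metrics import classification_report, precision_score, recall_score

# Toy labels/predictions just so the sketch runs; substitute the model's output.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 0]

# Full per-class breakdown instead of one aggregate error.
print(classification_report(y_true, y_pred, digits=3))

# Or just the class-1 numbers, if that is the class being traded on.
print("class 1 precision:", precision_score(y_true, y_pred, pos_label=1))
print("class 1 recall:   ", recall_score(y_true, y_pred, pos_label=1))
```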

 
Maxim Dmitrievsky:

Maybe because you have to compare against some benchmark or known example, rather than comparing apples with oranges.

Plus, speed is very important. Since alglib doesn't have feature importances built in, I think permutation is optimal for now (I've tried a bunch of brute-force methods).

Brute force (removal/addition of one predictor at a time) is the benchmark against which all other methods should be compared. It is slow, I agree. But if it adds at least 5%, I'm willing to wait.
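A rough sketch of that benchmark, i.e. retraining with one predictor removed at a time, assuming a scikit-learn style forest (the function and parameter names are mine, not the poster's). It is slow precisely because every feature costs a full refit:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def drop_column_importance(X, y, n_estimators=100, cv=3, seed=0):
    """Importance of feature j = baseline CV accuracy minus CV accuracy without j."""
    def cv_accuracy(X_):
        clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
        return cross_val_score(clf, X_, y, cv=cv).mean()

    base = cv_accuracy(X)
    importance = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_drop = np.delete(X, j, axis=1)       # actually remove predictor j and refit
        importance[j] = base - cv_accuracy(X_drop)
    return importance                          # positive = removing the feature hurt

# usage: importance = drop_column_importance(X, y)
```

Negative values from such a function correspond to the "noise" predictors in the first table: removing them makes the cross-validated error smaller.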
 
Another little experiment with permutation.
With different runs on the same tree, the order of importance also changes because of the randomness of the shuffling.
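One way to damp that run-to-run reordering, sketched below, is to repeat the shuffle several times per feature and average the error deltas (again assuming a scikit-learn style model; the names are illustrative). scikit-learn's own sklearn.inspection.permutation_importance does the same thing through its n_repeats argument.

```python
import numpy as np
from sklearn.metrics import accuracy_score

def permutation_importance_repeated(model, X, y, n_repeats=10, seed=0):
    """Average the shuffled-column error increase over several random shuffles."""
    rng = np.random.default_rng(seed)
    base_error = 1.0 - accuracy_score(y, model.predict(X))
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            err = 1.0 - accuracy_score(y, model.predict(X_shuffled))
            importance[j] += (err - base_error) / n_repeats
    return importance
```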
 
elibrarius:
Another little experiment with permutation.
With different runs on the same tree, the order of importance also changes because of the randomness of the shuffling.

I wanted to clarify: on which sample do you evaluate the result of the permutation method, the one the model was trained on, or the test sample?

I understand noise as something that stops working at all on a sample outside of training. But I think it's not about a single predictor, but rather about relationships/leaves. That is, there are two possibilities: the predictor is garbage, or it's just not being used correctly, i.e. the leaves are garbage.

 
Aleksey Vyazmikin:

I wanted to clarify: on which sample do you evaluate the result of the permutation method, the one the model was trained on, or the test sample?

I understand noise as something that stops working at all on a sample outside of training. But I think it's not about a single predictor, but rather about relationships/leaves. That is, there are two possibilities: the predictor is garbage, or it's just not being used correctly, i.e. the leaves are garbage.

On the training one, since the trees are undertrained. For overtrained trees it should be the test sample, because the tree would have memorized the noise.
For undertrained ones I think it doesn't matter.
But sample size is important: the larger it is, the more representative it is. And my training sample is three times larger.
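If one wants to check whether the choice of sample actually matters for a given model, a possible sketch is to compute the same permutation importance on the training part and on a held-out part and compare the rankings (synthetic data and scikit-learn's built-in permutation importance; everything here is illustrative, not the poster's setup):

```python
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=65, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

imp_train = permutation_importance(model, X_train, y_train, n_repeats=10, random_state=0)
imp_test = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

# If the forest has memorized noise, the two rankings diverge; if it is not
# overtrained, they should agree reasonably well.
rho, _ = spearmanr(imp_train.importances_mean, imp_test.importances_mean)
print(f"Spearman rank correlation, train vs test importance: {rho:.2f}")
```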

---------

According to the tutorial at https://www.mql5.com/ru/blogs/post/723619 ("Do trees and forests need class balancing?"), a large representative sample makes balancing across classes unnecessary, which at the same time reduces temporal randomness. I carried this over to undertrained trees.
But I may be wrong, and I need to check the importance of the predictors on the test sample.
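A hedged way to check the balancing claim on one's own data is to train the same forest with and without class weighting and compare per-class recall. The sketch below is only an illustration: scikit-learn's class_weight option stands in for whatever balancing scheme the tutorial actually discusses, and the imbalanced data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data; substitute the real sample.
X, y = make_classification(n_samples=20000, n_features=20, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

for cw in (None, "balanced"):
    clf = RandomForestClassifier(n_estimators=100, class_weight=cw, random_state=0)
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(f"class_weight={cw}: "
          f"recall class 0 = {recall_score(y_te, pred, pos_label=0):.3f}, "
          f"recall class 1 = {recall_score(y_te, pred, pos_label=1):.3f}")
```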
