Machine learning in trading: theory, models, practice and algo-trading - page 1181

 

The translator works: either translate the whole page, or copy-paste the text into the translator.

But for a single word or a paragraph - nothing.

 
Maxim Dmitrievsky:

There are too many settings there, you'd need a lot of bottles to figure them all out... :) Maybe the sample is just too small - tree-based methods are mostly designed for large samples, so something needs tweaking.


Of course it can surely be tweaked - I even suspect that the share of the sample given to each tree is reduced by default - but two times two is quite an indicator...)
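If you want to check that guess, a minimal sketch (my illustration, not a recipe from the thread) of forcing CatBoost to give every tree the full sample and all features; defaults differ between CatBoost versions, so treat the parameter choices as assumptions:

from catboost import CatBoostRegressor

x = [[i, 2] for i in range(1, 10)]
y = [2 * i for i in range(1, 10)]

# Bernoulli bootstrap with subsample=1.0 feeds the whole sample to every tree,
# rsm=1.0 gives every split all of the features (values chosen for illustration)
cat = CatBoostRegressor(iterations=100, depth=2, bootstrap_type='Bernoulli',
                        subsample=1.0, rsm=1.0, verbose=False).fit(x, y)
print(cat.predict([[5, 2]]))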

 
Maxim Dmitrievsky:

Translate one word at a time, through the Google Translate plugin for Chrome. Without English there is no way around it. Even if you only get through 1-2 words, the overall meaning will be clear. I use it myself when I forget words - just click on the word. You can also highlight phrases / sentences.

Of course it is pointless to translate all the text at once - that way you will never remember the words and you will never grasp the meaning of the text.

Thanks, I should try translating with your method - maybe it will be more productive than making up my own hypotheses, but languages are a weak spot for me...

 
Ivan Negreshniy:

I do not understand why you would need to manually edit the splits and leaves of decision trees. Yes, I have all branches automatically converted into logical operators, but frankly I do not remember ever having corrected them myself.

Because what is the point of using leaves with less than 50-60% prediction probability? That is essentially random - it is better for the model not to react to the situation at all than to react on a guess.
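A minimal sketch of that rule (my illustration, not Aleksey's actual code): train a classifier and act only when the leaf's class probability clears the bar; the toy data and the 0.6 threshold are assumptions.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# toy data: two features, binary target (illustration only)
X = np.random.RandomState(0).normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

proba = clf.predict_proba(X)              # per-sample class probabilities from the leaves
confident = proba.max(axis=1) >= 0.6      # keep only predictions above the 50-60% bar

# act only on confident predictions; abstain (-1 = do nothing) on the rest
signals = np.where(confident, clf.predict(X), -1)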


Ivan Negreshniy:

And is it even worth digging into the CatBoost code at all - how can you be sure of it?

For example, above I posted a Python test of my neural network learning the multiplication table of two, and now I used the same test on trees and forests (DecisionTree, RandomForest, CatBoost),

and here is the result - you can clearly see it is not in favor of CatBoost: something like two times two equals zero point five...:)


True, if you take thousands of trees, the results improve.

I'm not sure that trees are better than neural networks, but trees take fewer resources to build. For example, right now I have about 400 predictors, and a network with 400 input neurons and (however many layers) would take too long to train.

I can share my sample - maybe use it to see which method is better?

The settings, yes, do matter - I'm digging into them right now and trying to figure them out.
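As a side note on the "thousands of trees" remark: on the same two-times-two toy data used in the code below, simply giving CatBoost more iterations and a smaller step already tightens the fit on the training points. A hedged sketch; the iteration count and learning rate are arbitrary assumptions, not a recommendation:

from catboost import CatBoostRegressor

x = [[i, 2] for i in range(1, 10)]
y = [2 * i for i in range(1, 10)]

# more trees, smaller learning rate; with enough iterations the training points are fit much more closely
cat = CatBoostRegressor(iterations=2000, learning_rate=0.03, depth=2, verbose=False).fit(x, y)
for ix in x:
    print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0], ix[1], cat.predict([ix])[0]))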

 
Ivan Negreshniy:

I do not understand why you would need to manually edit the splits and leaves of decision trees. Yes, I have all branches automatically converted into logical operators, but honestly I do not remember ever having corrected them myself.

And is it even worth digging into the CatBoost code at all - how can you be sure of it?

For example, above I posted a Python test of my neural network learning the multiplication table of two, and now I used the same test on trees and forests (DecisionTree, RandomForest, CatBoost),

and here is the result - you can see it is not in favor of CatBoost: something like two times two equals zero point five...:)


It's true, if you take thousands of trees, the results improve.
import catboost
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from catboost import CatBoostRegressor
from sklearn.ensemble import GradientBoostingRegressor

x = [[1,2],[2,2],[3,2],[4,2],[5,2],[6,2],[7,2],[8,2],[9,2]]
y = [2,4,6,8,10,12,14,16,18]

print('-------- 1 DecisionTree')
tree = DecisionTreeRegressor().fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],tree.predict([ix])[0]))

print('-------- RandomForest 10 Tree')
regr = RandomForestRegressor(bootstrap=True).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],regr.predict([ix])[0]))

print('-------- CatBoost 10 Tree')
cat = CatBoostRegressor(iterations=100, learning_rate=0.1, depth=2, verbose=False).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],cat.predict([ix])[0]))

print('-------- Gboost 100 Trees')
gboost  = GradientBoostingRegressor(n_estimators=100, verbose = False).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],gboost.predict([ix])[0]))
-------- 1 DecisionTree
 1.00*2.00=2.00 
 2.00*2.00=4.00 
 3.00*2.00=6.00 
 4.00*2.00=8.00 
 5.00*2.00=10.00 
 6.00*2.00=12.00 
 7.00*2.00=14.00 
 8.00*2.00=16.00 
 9.00*2.00=18.00 
-------- RandomForest 10 Tree
 1.00*2.00=3.60 
 2.00*2.00=4.40 
 3.00*2.00=6.00 
 4.00*2.00=8.00 
 5.00*2.00=9.20 
 6.00*2.00=11.80 
 7.00*2.00=13.20 
 8.00*2.00=15.60 
 9.00*2.00=17.40 
-------- CatBoost 10 Tree
 1.00*2.00=2.97 
 2.00*2.00=2.97 
 3.00*2.00=5.78 
 4.00*2.00=8.74 
 5.00*2.00=10.16 
 6.00*2.00=12.88 
 7.00*2.00=14.67 
 8.00*2.00=15.77 
 9.00*2.00=15.77 
-------- Gboost 100 Trees
 1.00*2.00=2.00 
 2.00*2.00=4.00 
 3.00*2.00=6.00 
 4.00*2.00=8.00 
 5.00*2.00=10.00 
 6.00*2.00=12.00 
 7.00*2.00=14.00 
 8.00*2.00=16.00 
 9.00*2.00=18.00 

I tweaked it a little and added gradient boosting - it works best out of the box.

The rest is rather feeble, of course...

 
Maxim Dmitrievsky:
About a year ago I saw the code of a simple NN showing very decent results on the multiplication table. At the time it surprised me.
What's it for now?
 
import catboost
import lightgbm as gbm
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from catboost import CatBoostRegressor
from sklearn.ensemble import GradientBoostingRegressor

x = [[1,2],[2,2],[3,2],[4,2],[5,2],[6,2],[7,2],[8,2],[9,2]]
y = [2,4,6,8,10,12,14,16,18]

print('-------- 1 DecisionTree')
tree = DecisionTreeRegressor().fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],tree.predict([ix])[0]))

print('-------- RandomForest 10 Tree')
regr = RandomForestRegressor(bootstrap=True, n_estimators=100).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],regr.predict([ix])[0]))

print('-------- CatBoost 10 Tree')
cat = CatBoostRegressor(iterations=100, learning_rate=0.1, depth=2, verbose=False).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],cat.predict([ix])[0]))

print('-------- Gboost 100 Trees')
gboost  = GradientBoostingRegressor(n_estimators=100, verbose = False).fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],gboost.predict([ix])[0]))

print('-------- LGBM 100 Trees')
gbbm = gbm.LGBMRegressor(n_estimators=100,boosting_type='dart').fit(x,y)
for ix in x: print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0],ix[1],gbbm.predict([ix])[0]))
-------- 1 DecisionTree
 1.00*2.00=2.00 
 2.00*2.00=4.00 
 3.00*2.00=6.00 
 4.00*2.00=8.00 
 5.00*2.00=10.00 
 6.00*2.00=12.00 
 7.00*2.00=14.00 
 8.00*2.00=16.00 
 9.00*2.00=18.00 
-------- RandomForest 10 Tree
 1.00*2.00=2.84 
 2.00*2.00=3.74 
 3.00*2.00=5.46 
 4.00*2.00=7.70 
 5.00*2.00=9.66 
 6.00*2.00=11.44 
 7.00*2.00=13.78 
 8.00*2.00=15.46 
 9.00*2.00=16.98 
-------- CatBoost 10 Tree
 1.00*2.00=2.97 
 2.00*2.00=2.97 
 3.00*2.00=5.78 
 4.00*2.00=8.74 
 5.00*2.00=10.16 
 6.00*2.00=12.88 
 7.00*2.00=14.67 
 8.00*2.00=15.77 
 9.00*2.00=15.77 
-------- Gboost 100 Trees
 1.00*2.00=2.00 
 2.00*2.00=4.00 
 3.00*2.00=6.00 
 4.00*2.00=8.00 
 5.00*2.00=10.00 
 6.00*2.00=12.00 
 7.00*2.00=14.00 
 8.00*2.00=16.00 
 9.00*2.00=18.00 
-------- LGBM 100 Trees
 1.00*2.00=10.00 
 2.00*2.00=10.00 
 3.00*2.00=10.00 
 4.00*2.00=10.00 
 5.00*2.00=10.00 
 6.00*2.00=10.00 
 7.00*2.00=10.00 
 8.00*2.00=10.00 
 9.00*2.00=10.00 
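A plausible reason LGBM prints a flat 10.00 for every row (my reading, not stated in the thread): with only 9 training rows, LightGBM's default min_child_samples=20 forbids any split, so every tree stays a stump and the prediction collapses to the target mean (10). A hedged sketch of a workaround for this toy case only; the parameter values are assumptions:

import lightgbm as gbm

x = [[i, 2] for i in range(1, 10)]
y = [2 * i for i in range(1, 10)]

# relax the minimum-leaf and per-bin constraints so splits become possible on 9 rows
gbbm = gbm.LGBMRegressor(n_estimators=100, min_child_samples=1, min_data_in_bin=1).fit(x, y)
for ix in x:
    print(' {:2.2f}*{:2.2f}={:2.2f} '.format(ix[0], ix[1], gbbm.predict([ix])[0]))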
 
Yuriy Asaulenko:

But a single word or a paragraph - nothing at all.

https://www.mql5.com/ru/forum/86386/page1180#comment_9543249

Machine learning in trading: theory and practice (trading and not only)
  • 2018.11.29
  • www.mql5.com
Good day everyone. I know there are machine learning and statistics enthusiasts on this forum...
 
Maxim Dmitrievsky:

There CatBoost has iterations=100, i.e. 100 trees, not 10, and GBM is a beauty :)

 
Aleksey Vyazmikin:

Because what is the point of using leaves with less than 50-60% prediction probability? That is essentially random - it is better for the model not to react at all than to react on a guess.


I'm not sure that trees are better than neural networks, but trees take fewer resources to build. For example, right now I have about 400 predictors, and a network with 400 input neurons and (however many layers) would take too long to train.

I can share my sample - maybe use it to see which method is better?

And the settings, yes, do matter - I'm digging into them now and trying to understand their essence.

Sure, dig in and choose as carefully as you can while you are still at the early stage.

Besides the confusion with two-times-two, try to disable CatBoost's intrusive habit of creating its temporary directories at every startup, because in a protected environment it crashes due to this.

And in general, with these glitches it somehow looks not very professional, so if you cannot beat them, then personally, in my opinion, even for free it is not worth it - better to give up on this product right away :)
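On the temporary-directory complaint: CatBoost does have switches for this, and a hedged sketch of the two that usually matter is below (whether they cover the protected-environment crash described here is an assumption):

from catboost import CatBoostRegressor

# allow_writing_files=False stops CatBoost from creating catboost_info/ and tmp files at fit time;
# train_dir redirects anything it still writes to a directory you choose (the path here is hypothetical)
cat = CatBoostRegressor(iterations=50, depth=2, verbose=False,
                        allow_writing_files=False,
                        train_dir='catboost_tmp').fit([[1, 2], [2, 2], [3, 2]], [2, 4, 6])
print(cat.predict([[2, 2]]))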
