Machine learning in trading: theory, models, practice and algo-trading - page 514

 
Yandex has released a new machine learning library as open source
  • 2017.07.18
  • Oksana Mamchueva
  • www.searchengines.ru
Yandex has developed a new machine learning method, CatBoost. It can efficiently train models on heterogeneous data, such as a user's location, transaction history, and device type. The CatBoost machine learning library has been released as open source, and anyone can use it. To get started with CatBoost it is enough to install...
 

The R package is there, great.


2)
# install the CatBoost R package from GitHub via devtools
install.packages('devtools')
devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')
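
For anyone who does take the R route, a minimal regression example with the installed package looks roughly like this. This is a sketch based on the catboost R API (catboost.load_pool / catboost.train / catboost.predict); the data and all parameter values are placeholders, not recommendations:

library(catboost)

# toy data: numeric feature matrix and target (placeholders)
X <- matrix(rnorm(1000 * 10), ncol = 10)
y <- rnorm(1000)

# wrap the data in a CatBoost pool
train_pool <- catboost.load_pool(data = X, label = y)

# train a small regression model; parameter values are illustrative only
model <- catboost.train(train_pool,
                        params = list(loss_function = 'RMSE',
                                      iterations = 100,
                                      learning_rate = 0.1))

# predict on the same pool (in practice, use a separate test pool)
pred <- catboost.predict(model, train_pool)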

 

Why R? I don't like it... a command-line tool or a DLL would suit me better :)

 

I made a neural network regression predictor that displays, as a histogram, the current price-prediction model for n bars ahead (15 in this case); it trains on 5000 bars and retrains itself every 500 bars. It looks good at first glance, but of course it doesn't run as fast as I would like, because I actually want to train several of them :)


And if you look at the minute bars, the dispersion is fairly small: it is of course high on the extreme outliers, but on average it stays within about 100 points (on 5-digit quotes).

The most interesting spots are marked with the arrows.
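
A rough sketch of the retraining scheme described above (train on 5000 bars, predict 15 bars ahead, refit every 500 bars), with R's nnet standing in for the ALGLIB MLP; the price series, the lag count and the network size are assumptions chosen purely for illustration:

library(nnet)

horizon    <- 15    # bars ahead to predict
train_len  <- 5000  # training window length
lags       <- 10    # lagged prices used as features (an assumption)

price <- cumsum(rnorm(20000))  # placeholder price series

refit <- function(t) {
  win    <- price[(t - train_len + 1):t]  # last train_len bars
  X_full <- embed(win, lags)              # rows of lags consecutive bars
  n <- nrow(X_full) - horizon
  X <- X_full[1:n, ]
  y <- win[(1:n) + lags - 1 + horizon]    # price horizon bars after each row
  nnet(X, y, size = 5, linout = TRUE, maxit = 200, trace = FALSE)
}

t <- train_len
model <- refit(t)
for (t in (train_len + 1):(length(price) - horizon)) {
  if ((t - train_len) %% 500 == 0) model <- refit(t)  # retrain every 500 bars
  x_now <- embed(price[(t - lags + 1):t], lags)       # latest feature row
  pred  <- predict(model, x_now)                      # forecast for t + horizon
}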

 
Maxim Dmitrievsky:

Of course it doesn't run as fast as I would like,

On ALGLIB?

 
elibrarius:

On ALGLIB?


Yep

Of course you could fiddle with an external NN or forests, for example CatBoost on GPU, but so far I'm too lazy and there's no time,

and the more elaborate the model, the harder it is to run it in the tester.

 

ALGLIB is terribly slow at training.

I fed a 240-50-1 network to ALGLIB, waited two days, gave up and switched it off.

A 70-5-1 network trained in half an hour, while R's nnet trained in under a minute on the same data. So now I'm sitting with R to figure it out.
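
For comparison, a 70-5-1 run with nnet can be timed like this (a sketch with random filler data of the same 70-input shape; the sample count and maxit are assumptions):

library(nnet)

X <- matrix(rnorm(5000 * 70), ncol = 70)  # 5000 samples, 70 inputs (filler)
y <- rnorm(5000)

system.time(
  m <- nnet(X, y, size = 5, linout = TRUE, maxit = 500, trace = FALSE)
)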

 
elibrarius:

ALGLIB is terribly slow at training.

I fed a 240-50-1 network to ALGLIB, waited two days, gave up and switched it off.

A 70-5-1 network trained in half an hour, while R's nnet trained in under a minute on the same data. So now I'm sitting with R to figure it out.


RF is more or less OK: 50 inputs by 5000 samples, 100 trees, 25 seconds on average (on a laptop). But that is still too long for optimization. Yes, the NN is really slow, but it's a plain MLP; you shouldn't expect anything else from it.

I'd need everything to train in a second at most. Where do I get that? )

 

Once again I was convinced that forests cannot extrapolate, no matter how many exclamations there are that this is not so:

Above the red line are the 150 training prices (entries and exits). After that, the market began to fall and new prices appeared that were not in the training sample (were never fed as inputs). The forest began outputting the lowest price it knew at training time, i.e. 1.17320, which corresponds to the horizontal line. Because of this, the residuals histogram was skewed.

Forests do NOT know how to EXTRAPOLATE. All the smart guys who claim otherwise should be held back a year to re-learn the math.


  • Like decision trees, the algorithm is totally incapable of extrapolation
http://alglib.sources.ru/dataanalysis/decisionforest.php
 

Raw prices should not be fed into a model without some transformation in the first place.

When extrapolating, forests just return the nearest known value. A neural network or a linear regression will compute something from its internal formulas when extrapolating. But in fact all of these models will lose money in this situation, so there is no difference.
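
This clamping is easy to reproduce: train a random forest on a bounded range and predict outside it; the forest returns roughly the edge value, while a linear model extrapolates along its fitted line. A sketch using the randomForest package on synthetic data:

library(randomForest)

set.seed(1)
x <- runif(500, 1.1740, 1.1840)        # training range of "prices"
y <- 2 * x + rnorm(500, sd = 0.0005)   # synthetic linear target

rf  <- randomForest(data.frame(x = x), y, ntree = 100)
lin <- lm(y ~ x, data = data.frame(x = x, y = y))

new <- data.frame(x = c(1.1700, 1.1900))  # points outside the training range
predict(rf, new)   # clamped near the values at the range edges
predict(lin, new)  # extrapolates along the fitted line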