Discussion of article "Gradient boosting in transductive and active machine learning"

 

New article Gradient boosting in transductive and active machine learning has been published:

In this article, we will consider active machine learning methods utilizing real data, as well as discuss their pros and cons. Perhaps you will find these methods useful and will include them in your arsenal of machine learning models. Transduction was introduced by Vladimir Vapnik, who is the co-inventor of the Support-Vector Machine (SVM).

Let us go straight to active learning and test its effectiveness on our data.

There are several libraries for active learning in the Python language, the most popular of them being:

  • modAL is a simple, easy-to-learn package that acts as a kind of wrapper around the popular machine learning library scikit-learn (the two are fully compatible). The package provides the most popular active learning methods.
  • Libact uses the multi-armed bandit strategy over existing query strategies for a dynamic selection of the best query.
  • Alipy is a kind of laboratory from package providers, which contains a large number of query strategies.

I have selected the modAL library as the most intuitive one and the most suitable for getting acquainted with the active learning philosophy. It offers greater freedom in designing models, whether you assemble them from standard blocks or create your own.
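To make the idea of a query strategy concrete, here is a minimal sketch of pool-based uncertainty sampling. It deliberately uses only the Python standard library and does not reproduce modAL's API; the 1-D threshold classifier and the names fit_threshold, predict_proba, most_uncertain and active_learning are hypothetical, chosen for illustration only.

```python
import math


def fit_threshold(labeled):
    """Fit a toy 1-D classifier: the midpoint between the two class means."""
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2.0


def predict_proba(threshold, x, scale=10.0):
    """Probability of class 1 as a logistic function of the distance to the threshold."""
    return 1.0 / (1.0 + math.exp(-scale * (x - threshold)))


def most_uncertain(threshold, pool):
    """Uncertainty sampling: index of the pool point whose probability is closest to 0.5."""
    return min(range(len(pool)),
               key=lambda i: abs(predict_proba(threshold, pool[i]) - 0.5))


def active_learning(pool, oracle, seed_labeled, n_queries):
    """Repeatedly refit, query the most uncertain point, and ask the oracle for its label."""
    labeled = list(seed_labeled)   # must contain at least one example of each class
    pool = list(pool)
    for _ in range(n_queries):
        threshold = fit_threshold(labeled)
        i = most_uncertain(threshold, pool)
        x = pool.pop(i)                 # remove the queried point from the pool
        labeled.append((x, oracle(x)))  # the oracle stands in for a human labeler
    return fit_threshold(labeled), labeled


# Toy usage: the "true" boundary is at 0.3, unknown to the learner.
oracle = lambda x: int(x > 0.3)
pool = [i / 10 for i in range(-9, 10)]
seed = [(-1.0, 0), (1.0, 1)]
threshold, labeled = active_learning(pool, oracle, seed, n_queries=5)
print(threshold, labeled[2:])
```

The learner only asks for labels where it is least sure, so the labeling budget is spent near the decision boundary. Libraries such as modAL package this loop behind a learner object and offer other query strategies besides uncertainty sampling.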

Let us consider the above-described process using the scheme below, which needs no further explanation:

Author: Maxim Dmitrievsky

 

Hi Maxim,

Thank you for the English version. I have 3 questions about specific parts of the code, and I would appreciate specific answers, since I am a basic-level programmer and still find it difficult to understand everything from the explanation.

1. May I know where and how you got the numbers below, and whether they apply only to the "EURUSD" pair or to all currency pairs?

double catboost_model(const double &features[]) { 

    unsigned int TreeDepth[161] = {6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, };

    unsigned int TreeSplits[966] = {393, 730, 93, 54, 352, 313, 540, 591, 217, 12, 576, 757, 208, 574, 756, 446, 505, 10, 487, 791, 210, 673, 125, 647, 286, 593, 523, 706, 566, 510, 575, 754, 325, 450, 470, 321, 438, 589, 48, 257, 283, 745, 707, 520, 564, 296, 702, 27, 524, 223, 404, 755, 60, 218, 387,  };

    unsigned int BorderCounts[20] = {36, 44, 40, 41, 42, 40, 30, 30, 36, 35, 43, 45, 27, 37, 52, 55, 45, 40, 43, 38};

    float Borders[799] = {-0.0103283636f, -0.00538144633f, -0.00438116584f, -0.00384822348f, -0.00290416228f, -0.00226776977f, -0.00186691666f, -0.00173427281f, -0.00136242132f, , -0.00866030902f, -0.0083276052f, -0.00821269862f, -0.00758890808f, -0.0072928248f, -0.00716711534f, -0.00640411209f, -0.00561416801f, -0.0053433096f,  };

2. May I know where and how you got the numbers below, and whether they apply only to the "EURUSD" pair or to all currency pairs?

 

/* Aggregated array of leaf values for trees. Each tree is represented by a separate line: */

    double LeafValues[10304] = {

        -0.02908022041210655, 0, -0.005608946748068618, 0.005129329514937164, 0.03600027378169195, 0, 0.02578289590577986, 0.09444611655822675, 0.03646431117733154, 0.09977346533319338, -0.05595880296318598, -0.069314407568676, 0.08718389822649918, -0.1200338438496052, 0.0693147185156002, 0.01000834600443637, 0, 0.06059264820464737, ,

 

3. Can you please tell me precisely which parts of the code I need to edit to make it work for other currency pairs, or what exactly I need to do to test it on other pairs?

I have tried other pairs, but I am not sure whether I am doing something wrong or the results are simply bad for other pairs, whereas it works fine for EURUSD. I would appreciate it if you could post an example for another currency pair, to give a better idea of how and what to implement to make it work for other pairs.
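
For readers following questions 1 and 2: the array sizes in the snippet are what you would expect from an exported set of decision trees. 161 trees of depth 6 need 161 × 6 = 966 split indices and 161 × 2^6 = 10304 leaf values, and the 20 BorderCounts entries sum to 799, the length of Borders. That suggests the numbers are parameters learned from the training data rather than hand-picked constants. Below is a minimal Python sketch of how such arrays are typically evaluated, assuming the layout of CatBoost's C++ model export. The toy values and the function name apply_oblivious_trees are hypothetical and not taken from the article.

```python
def apply_oblivious_trees(features, tree_depth, tree_splits,
                          border_counts, borders, leaf_values):
    """Evaluate an ensemble of oblivious trees stored as flat arrays."""
    # Step 1: binarize. Feature i owns border_counts[i] consecutive borders;
    # each border yields one bit: 1 if the feature value exceeds it, else 0.
    binary = []
    border_idx = 0
    for value, count in zip(features, border_counts):
        for _ in range(count):
            binary.append(1 if value > borders[border_idx] else 0)
            border_idx += 1

    # Step 2: walk the trees. Every level of a tree tests one binary feature,
    # and the bits together form the index of the leaf inside that tree.
    result = 0.0
    split_ptr = 0   # position of the current tree in tree_splits
    leaf_ptr = 0    # position of the current tree in leaf_values
    for depth in tree_depth:
        index = 0
        for level in range(depth):
            index |= binary[tree_splits[split_ptr + level]] << level
        result += leaf_values[leaf_ptr + index]
        split_ptr += depth
        leaf_ptr += 1 << depth   # a tree of depth d has 2**d leaves
    return result


# Hypothetical toy model: 2 features, 3 borders, 2 trees (depths 2 and 1).
TREE_DEPTH = [2, 1]
TREE_SPLITS = [0, 2, 1]                 # 2 + 1 split indices
BORDER_COUNTS = [2, 1]                  # sums to len(BORDERS)
BORDERS = [0.0, 1.0, 0.5]
LEAF_VALUES = [0.0, 0.25, 0.5, 0.75,    # tree 0: 2**2 leaves
               -1.0, 1.0]               # tree 1: 2**1 leaves

score = apply_oblivious_trees([0.5, 0.7], TREE_DEPTH, TREE_SPLITS,
                              BORDER_COUNTS, BORDERS, LEAF_VALUES)
print(score)  # -0.25
```

If that reading is right, testing another currency pair would mean retraining the model on that pair's data and exporting a fresh set of arrays, since editing the numbers by hand would not be meaningful.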

 
Awesome, thank you! Of course I DID take the chance to use your method of model export to MQL... great results on new data!