Taking Neural Networks to the next level - page 23

 
Brian Rumbles:
I think Enrique's idea of using a NN to classify whether it recognizes the condition or not sounds like it could be very useful/profitable

Definitely interesting. I'm curious which condition this is, in practice.

___

NELODI: I realize that your function is nothing less than an extremely nested function (only limited by the number of "y") with a (pseudo)random result from each of these subfunctions. So basically it's an attempt to get from pseudo-random one step closer to actual randomness.

This is a hard nut to crack, as producing irreproducible results is probably the whole purpose of this function.

But no excuses: I wrote something, although the training time is expected to be very long ... I think I'll let it run overnight...

ANN_challenge

The basic code looks like this (#include files not shown here!):

// +------------------------------------------------------------------+
// |                     ANN Challenge                                |
// |                                                 username Chris70 |
// |                                             https://www.mql5.com |
// +------------------------------------------------------------------+
#property   copyright     "username Chris70"
#property   version       "1.0"

#include    "Main.mqh"
#include    "MLP6.mqh"

// +------------------------------------------------------------------+
// | declare adjustable EA properties                                 |
// +------------------------------------------------------------------+
sinput string  HeaderMLP                                 = "";          // =============================== MLP ===============================
sinput ushort  mlp_layers                                = 3;           // MLP layers (on top of input layer)
sinput ushort  mlp_neurons                               = 100;         // MLP neurons per hidden layer
sinput ENUM_ACT_FUNCT mlp_hidden_actfunct                = f_sigmoid;   // MLP hidden layers activation function
sinput ENUM_ACT_FUNCT mlp_output_actfunct                = f_sigmoid;   // MLP output layer activation function
sinput ENUM_LOSS_FUNCTION mlp_loss                       = MSE;         // MLP loss function
sinput ENUM_OPTIMIZATION_METHOD mlp_optimizer            = Nesterov;    // MLP optimizer method
sinput ENUM_WEIGHT_INIT mlp_weight_init                  = Chris_uniform;// MLP weight initialization method
sinput double  mlp_lr                                    = 0.001;       // MLP learning rate
sinput double  mlp_lr_decay                              = 0.0001;      // MLP learning rate time decay (Vanilla / Nesterov / ADAM / RMSprop)
sinput double  mlp_momentum                              = 0.3;         // MLP learning rate momentum (Vanilla)
sinput double  mlp_beta1                                 = 0.99;        // MLP beta1 coefficient (ADAM)
sinput double  mlp_beta2                                 = 0.9;         // MLP beta2 coefficient (Nesterov / RMSprop / ADAM / ADADELTA)
sinput double  mlp_dropout                               = 0;           // MLP dropout level (hidden layers; only affects training mode)
sinput ENUM_SCALING mlp_input_scaling                    = minmax_method;// MLP input scaling method
sinput double  mlp_input_stdev                           = 1;           // MLP input scaling standard deviations (=if stdev method selected)
sinput double  mlp_input_radius                          = 10;          // MLP input scaling radius (minmax:stretch,stdev/tanh:clip)
sinput ENUM_SCALING mlp_label_scaling                    = minmax_method;// MLP label scaling method
sinput double  mlp_label_stdev                           = 10;          // MLP label standard deviations (=if stdev method selected)
sinput double  mlp_label_radius                          = 1;           // MLP label scaling radius (minmax:stretch,stdev/tanh:clip)
sinput string  mlp_filename                              = "ANN_challenge";// name of neural network data file
sinput string  Header_myfn                               = "";          // ========================== myfn FUNCTION ===========================
sinput int     min_x                                     = 0;           // min. value for x
sinput int     max_x                                     = 100;         // max. value for x
sinput int     min_y                                     = 0;           // min. value for y
sinput int     max_y                                     = 100;         // max. value for y
sinput int     min_z                                     = 0;           // min. value for z
sinput int     max_z                                     = 100;         // max. value for z
sinput string  HeaderTraining                            = "";          // ============================= TRAINING =============================
input bool     make_report                               = true;        // make test report during training
input int      train_counter                             = 1;           // train counter
input bool     plot_loss_curve                           = true;        // plot loss curve during training
input bool     plot_feature_distribution                 = true;        // plot feature distributions during training

// global variables
int x_value=0,y_value=0,z_value=0,result_value=0;
double result_sum=0;
int predictions_counter=0;

// initialize class objects
CMain          main;
CMLP           mlp;

// ############################# end of global scope #######################################

//+------------------------------------------------------------------+
//| function for the challenge                                       |
//+------------------------------------------------------------------+

int myfn(int x, int y, int z)
  {
  srand(x);
  result_value=0;
  for(int i=0; i<y; i++)
    { 
    if (rand()>z) result_value++; 
    }
  return result_value;
  }

// +------------------------------------------------------------------+
// | OnInit                                                           |
// +------------------------------------------------------------------+

int OnInit()
  {
   ResetLastError();
   
// build network model (MLP)
   mlp.network_name="MT5 forum challenge";
   mlp.network_shortname="MLP";
// -   1. add layers
   mlp.neurons[0]=3;         // input layer: variables x,y and z
   mlp.neurons[mlp_layers]=1;     // output layer: r
   mlp.actfunct[mlp_layers]=mlp_output_actfunct;
   for(int l=1; l<mlp_layers;l++)
     {
      mlp.neurons[l]=mlp_neurons;      // hidden layers
      mlp.actfunct[l]=mlp_hidden_actfunct;
     }

// -   2. feature scaling parameters
   mlp.input_scaling=mlp_input_scaling;
   mlp.input_scaling_stdev=mlp_input_stdev;
   mlp.input_scaling_radius=mlp_input_radius;

// -   3. label scaling parameters
   mlp.label_scaling=mlp_label_scaling;
   mlp.label_scaling_stdev=mlp_label_stdev;
   mlp.label_scaling_radius=mlp_label_radius;
   
   mlp.scaling_iterations=10000;

// -   4. set learning parameters
   mlp.weight_init_method=mlp_weight_init;
   ArrayInitialize(mlp.lr,mlp_lr);
   mlp.lr_decay=mlp_lr_decay;
   mlp.optimization_method=mlp_optimizer;
   mlp.lr_momentum=mlp_momentum;
   mlp.opt_beta1=mlp_beta1;
   mlp.opt_beta2=mlp_beta2;
   mlp.dropout_level=mlp_dropout;
   
   if(MQLInfoInteger(MQL_VISUAL_MODE))
     {
      if(plot_loss_curve && !plot_feature_distribution)
        {
         mlp.plot_loss=true;
         mlp.loss_curve_xpos=0;
         mlp.loss_curve_ypos=0;
         mlp.loss_curve_xsize=600;
         mlp.loss_curve_ysize=940;
        }
      if(!plot_loss_curve && plot_feature_distribution)
        {
         mlp.plot_feature_distribution=true;
         mlp.distrb_curve_xpos=0;
         mlp.distrb_curve_xsize=1200;
         mlp.distrb_curve_ypos=600;
         mlp.distrb_curve_ysize=340;
        }
      if(plot_loss_curve && plot_feature_distribution)
        {
         mlp.plot_loss=true;
         mlp.loss_curve_xpos=0;
         mlp.loss_curve_ypos=0;
         mlp.loss_curve_xsize=1000;
         mlp.loss_curve_ysize=500;
         mlp.plot_feature_distribution=true;
         mlp.distrb_curve_xpos=0;
         mlp.distrb_curve_xsize=1000;
         mlp.distrb_curve_ypos=500;
         mlp.distrb_curve_ysize=440;
        }
     }
// -   5. assign MLP label name
   mlp.labelstring[1]="myfn function result 'r'";

// -   6. load older weight matrix (will be initialized if there is none yet)
   mlp.load(mlp_filename);
   
   return(INIT_SUCCEEDED);
  } // end of OnInit

// +------------------------------------------------------------------+
// | Expert tick function                                             |
// +------------------------------------------------------------------+
void OnTick()
  {   
   // update label
   mlp.label[1]=myfn(x_value,y_value,z_value);

   // model assessment for testreport purposes
   if(make_report){mlp.assess();}

   // backpropagation
   mlp.backpropagate();
   result_sum+=mlp.label[1];
   predictions_counter++;
   
   // show results
   if(MQLInfoInteger(MQL_VISUAL_MODE))
     {
      main.chart_print("loss: "+DoubleToStr(mlp.loss[mlp.bw_iterations-1]),0,mlp.loss_curve_xpos+100,30,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("iterations: "+IntegerToString(mlp.bw_iterations),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("random x: "+IntegerToString(x_value),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("random y: "+IntegerToString(y_value),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("random z: "+IntegerToString(z_value),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("predicted result r: "+DoubleToStr(mlp.out[1]),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("correct result r: "+DoubleToStr(mlp.label[1]),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
      main.chart_print("average error for r: "+DoubleToStr(result_sum/predictions_counter,2),-1,-1,-1,18,5,main.infopanel.headers_clr,main.infopanel.headers_font);
     }

   // get new random input variables for myfn function
   MathSrand(int(1000*MathMod(double(GetTickCount()),M_PI)));
   x_value=rand_int(min_x,max_x);
   y_value=rand_int(min_y,max_y);
   z_value=rand_int(min_z,max_z);
   
   // fill inputs into network
   mlp.x[0][1]=x_value;
   mlp.x[0][2]=y_value;
   mlp.x[0][3]=z_value;
        
   // make prediction (forward pass)
   mlp.forwardpass();        

  } // OnTick close
  
//+------------------------------------------------------------------+
//| Expert deinitialization function                                 |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
   mlp.save(mlp_filename);
   if(make_report){mlp.testreport(mlp_filename);}
   ObjectsDeleteAll(0,0,-1);
   Print("uninit reason: ",_UninitReason);
  }

//+------------------------------------------------------------------+
//| custom tester result                                             |
//+------------------------------------------------------------------+
double OnTester()
  {
   return 1000000*mlp.loss[mlp.bw_iterations-1];
  }

I didn't find anything about .ex5 files in the forum rules, and I also wonder why we have the option to attach such files if this is not wanted. If this should indeed be a problem, then I apologize, and any moderator should feel free to delete it.


Chris70:

NELODI: I realize that your function is nothing less than an extremely nested function (only limited by the number of "y") with a (pseudo)random result from each of these subfunctions.

Actually, the function is only one level deep, not extremely nested. And the number of iterations in the loop is fixed (it equals the "y" parameter). Anyway ... I've tested how long it takes the function to return a result (see my code below) and at least on my PC, the test code prints =>650, which is in microseconds and means that each call takes less than 0.7 microseconds on average for parameters ranging from 0 to 1000 (since the loop makes 1000 calls).

  ulong tc=GetMicrosecondCount();
  for (int i=0;i<1000;i++) myfn(i,i,i);
  Print("=>",GetMicrosecondCount()-tc);
Anyway ... I've downloaded your ex5 file and dropped it onto a new EURUSD chart, but I don't see anything. Is there something specific I should do to let it train on my PC and to see what you see?
 
NELODI:

Actually, the function is only one level deep, not extremely nested. And the number of iterations in the loop is fixed (it equals the "y" parameter). Anyway ... I've tested how long it takes the function to return a result (see my code below) and at least on my PC, the test code prints =>650, which is in microseconds and means that each call takes less than 0.7 microseconds on average for parameters ranging from 0 to 1000 (since the loop makes 1000 calls).

Yes, I understand why you don't consider it a classic nested function (not a nested loop!), but the result of the rand() call in each iteration of the y loop (fixed or not) is taken into account for the result of the next iteration. So we still somehow have y functions inside a function, although of course it isn't the last preliminary result of "r" that goes into the next iteration, but the last rand() state, which doesn't make it any easier. Any repeated call of rand() is like a nested function (again, not a nested loop). But I'm not a professional programmer; please correct me if I'm wrong.

 
NELODI:

Actually, the function is only one level deep, not extremely nested. And the number of iterations in the loop is fixed (it equals the "y" parameter). Anyway ... I've tested how long it takes the function to return a result (see my code below) and at least on my PC, the test code prints =>650, which is in microseconds and means that each call takes less than 0.7 microseconds on average for parameters ranging from 0 to 1000 (since the loop makes 1000 calls).

Anyway ... I've downloaded your ex5 file and dropped it onto a new EURUSD chart, but I don't see anything. Is there something specific I should do to let it train on my PC and to see what you see?

It's written as an EA, not a script; it should work if you run it in the strategy tester.

 
Chris70:

Yes, I understand why you don't consider it a classic nested function (not a nested loop!), but the result of the rand() call in each iteration of the y loop (fixed or not) is taken into account for the result of the next iteration. So we still somehow have y functions inside a function, although of course it isn't the last preliminary result of "r" that goes into the next iteration, but the last rand() state, which doesn't make it any easier. Any repeated call of rand() is like a nested function (again, not a nested loop). But I'm not a professional programmer; please correct me if I'm wrong.

The rand function (as a piece of code) is NOT nested, but it does have a state, which is updated after each call and affects the output of the next call. But the same is true for any function that uses a variable which is being updated at certain points of the calculation and contributes in any way to the final results.

 

Which is why this single function returns the final result of y repeated, interdependent function calls in a row. It's many functions inside one function. Instead of just rand(), which sets the seed for the next call, we're basically looking at something like rand(rand(rand(rand(...)))) --> edit: of course rand() is parameter-free; I'm just trying to indicate that we're handing over the state/seed to the next call.

I'm aware that the word nested is often used in a different way (for those who understand German I'd say "verschachtelt"; "nested" is used here for lack of a better English word coming to mind), especially with loops, where it exponentially deteriorates performance. This is not the case here. Again, semantics... I guess we mean the same. My point wasn't about performance, but why this equation is so hard to solve that you might win this challenge if I don't have a year of training time ;-) especially with high values for y; let's say we set it to 1 million...
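Maybe a small sketch makes the point clearer. This is purely illustrative: MQL5 doesn't document MathRand's internal algorithm, so the generator below is just a generic LCG standing in for it (algorithm and constants are assumptions), but once the hidden state is passed around explicitly, the chain of interdependent calls becomes visible:

// Illustrative only: a generic 15-bit LCG as a stand-in for rand()'s hidden state.
// (MQL5 doesn't document MathRand's internals, so algorithm and constants are assumptions.)
int NextRand(uint &state)
  {
   state=state*214013+2531011;              // one state update per call
   return (int)((state>>16) & 0x7FFF);      // value in 0..32767, like MathRand()
  }

// same structure as myfn(), but with the PRNG state made explicit:
// iteration i consumes the state produced by iteration i-1, so the final
// result depends on a chain of y interdependent "sub-functions"
int myfn_explicit(int x,int y,int z)
  {
   uint state=(uint)x;                      // srand(x): the seed is the initial state
   int r=0;
   for(int i=0; i<y; i++)
      if(NextRand(state)>z)
         r++;
   return r;
  }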

When I offered the challenge, I had some complicated combination of your sin/cos/log/exp... example in mind (as you suggested), not looped repetitions of functions - but I didn't say that, so I guess you already won ;-)

 
Chris70:

Which is why this single function returns the final result of y repeated, interdependent function calls in a row. It's many functions inside one function. Instead of just rand(), which sets the seed for the next call, we're basically looking at something like rand(rand(rand(rand(...)))). I'm aware that the word nested is often used in a different way, especially with loops, where it exponentially deteriorates performance. This is not the case here. Again, semantics... I guess we mean the same. My point wasn't about performance, but why this equation is so hard to solve that you might win this challenge if I don't have a year of training time ;-) especially with high values for y; let's say we set it to 1 million...

When I offered the challenge, I had some complicated combination of your sin/cos/log/exp... example in mind (as you suggested), not looped repetitions of functions - but I didn't say that, so I guess you already won ;-)

I deliberately chose a pseudo-random function, because I believe that the market behaves like a random walk.

Anyway ... I finally got your EA to run in my Strategy Tester and show me the results live. I am also going to leave it running over the night.

 
NELODI:

I deliberately chose a pseudo-random function, because I believe that the market is a random walk.

Anyway ... I finally got your EA to run in my Strategy Tester and show me the results live. I am also going to leave it running over the night.

You can also run it in the optimizer --> it should be faster in non-visual mode; just set the "training counter" to e.g. 1-100 and optimize "slow and complete" with all but one thread disabled; the weight matrix will be saved to the common file folder and re-used upon the next training cycle. But be aware that if "every tick" is chosen, you get as many iterations per cycle as there are ticks, so then it's probably better not to choose "years" of training history. A shorter time per cycle means you get more preliminary reports (also in the common file folder).
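For anyone wondering how the re-use between cycles can work in principle: the CMLP class handles this internally via mlp.save()/mlp.load(), but the underlying idea is simply to write the weight array to the common Files folder with the FILE_COMMON flag, so every tester/optimizer pass can pick up where the previous one stopped. A minimal sketch (not the actual CMLP implementation; helper names, array layout and file name are made up here):

double weights[];                                      // flattened weight matrix (illustrative)

bool LoadWeights(const string name)
  {
   if(!FileIsExist(name+".bin",FILE_COMMON))
      return(false);                                   // first cycle: nothing saved yet
   int h=FileOpen(name+".bin",FILE_READ|FILE_BIN|FILE_COMMON);
   if(h==INVALID_HANDLE)
      return(false);
   FileReadArray(h,weights);                           // continue from the previous cycle's weights
   FileClose(h);
   return(true);
  }

void SaveWeights(const string name)
  {
   int h=FileOpen(name+".bin",FILE_WRITE|FILE_BIN|FILE_COMMON);
   if(h==INVALID_HANDLE)
      return;
   FileWriteArray(h,weights);                          // persist progress for the next cycle
   FileClose(h);
  }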

By the way, it's going to be interesting to see whether different network architectures make a difference (if anything useful comes out of it at all within reasonable training time). I figured that for such a complicated task to be reflected in a network, we probably need many weights, so I set the input parameters to 10 layers with 500 neurons each. Activations: all sigmoid. Learning rate 0.001 with time decay set to 0.0001, "vanilla" stochastic gradient descent, and dropout set to zero (because overfitting isn't a risk here).
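Just to put that into numbers, here is a rough back-of-the-envelope count, assuming fully connected layers as built by the OnInit loop above (input = 3, nine hidden layers of 500 when mlp_layers = 10, output = 1; whether and how CMLP adds biases is an assumption on my part):

void CountParameters()
  {
   int layers[11];
   layers[0]=3;                              // input layer: x, y, z
   for(int l=1; l<10; l++) layers[l]=500;    // hidden layers
   layers[10]=1;                             // output layer: r
   long weights=0, biases=0;
   for(int l=1; l<=10; l++)
     {
      weights+=(long)layers[l-1]*layers[l];  // fully connected weights into layer l
      biases +=layers[l];                    // one bias per neuron (if used)
     }
   PrintFormat("weights: %I64d  biases: %I64d  total: %I64d",weights,biases,weights+biases);
   // -> roughly 2.0 million weights (2,002,000) plus ~4.5k biases
  }

So with these settings the net has on the order of two million trainable parameters.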
 

The biggest appeal of neural networks applied to trading is precisely to avoid optimization and reoptimization and re-reoptimization.

 
Icham Aidibe:

The biggest appeal of neural networks applied to trading is precisely to avoid optimization and reoptimization and re-reoptimization.

This was a misunderstanding: I suggested the "optimizer" here only in order to abuse it for an automated program restart. It doesn't "optimize" anything. The training counter in this example does nothing but count and is irrelevant for the network. Apart from that: this is now just a fun exercise for the sake of the challenge; it has nothing to do with trading, and if the equation is solvable, then it does have an "optimal" solution. Even if it isn't realistically achievable with this hard nut, optimisation/good fitting is part of the quest.
