Taking Neural Networks to the next level - page 13

NELODI

Chris70:

If you compare to manual trading: do you have 10000 candles on your chart at any time? Probably not. But you have an understanding of general market behaviour / trading rules / current market phase. Regarding neural networks, all of this is stored in the network weights.

No, I do not look at 10000 candles on a chart, because I use indicators. And the indicators I use are calculated based on the 10000 most recent candles. The alternative (which most traders do) is to periodically switch to several higher time-frames, in order to keep their perspective. But ... when you are Scalping manually on multiple Symbols, switching time-frames becomes an obstacle, so using indicators that combine all that data into the time-frame you are actively trading is an absolute requirement.

As for Neural Networks, I have to admit that my knowledge is outdated, since my most "recent" experience with ANNs was about 30 years ago ;)

Anyway ... did you get any useful results from your ANN so far? I mean ... something you'd actually use to trade with (your) real money? That was the whole point of this Exercise. Or am I wrong?

Bayne
NELODI:

No, I do not look at 10000 candles on a chart, because I use indicators. And the indicators I use are calculated based on the 10000 most recent candles. The alternative (which most traders do) is to periodically switch to several higher time-frames, in order to keep their perspective. But ... when you are Scalping manually on multiple Symbols, switching time-frames becomes an obstacle, so using indicators that combine all that data into the time-frame you are actively trading is an absolute requirement.

As for Neural Networks, I have to admit that my knowledge is outdated, since my most "recent" experience with ANNs was about 30 years ago ;)

Anyway ... did you get any useful results from your ANN so far? I mean ... something you'd actually use to trade with (your) real money? That was the whole point of this Exercise. Or am I wrong?

While Chris rejects this idea, I still see a use in feeding a time series of higher-period indicators (e.g. a 200-period blablabla) into the net, because training on more than 100 bars per sequence would be relatively costly in terms of computing performance. (Surely feeding a bigger timeframe would be possible too.)

Btw the result a few pages ago looked quite convincing if it was done out of sample.

However, I still think that adding specific indicators could save training time and even help find new (less dominant) patterns faster, though a simple LSTM could find most of them too.

@Chris70

Also, the MetaLabeling part suddenly reminds me of GANs, where a discriminator competes against the generator. Implementing a CNN as the discriminator could even improve the predictions or make them more realistic :/

Btw, especially because your broker is an ECN, you should be able to use any kind of ("ECN"-like) FOREX data you can get. Maybe even Forex futures data...

NELODI
Bayne:
Btw the result a few pages ago looked quite convincing if it was done out of sample.

You mean the Screenshot with Bars and the projected high/low/close prices? Unless my eyes are deceiving me, these results don't look any better than a 3-period moving average. Why would you need an ANN for that?

Bayne
Pretty Sure it was #97.
NELODI
Bayne:
Pretty Sure it was #97.

Neural Network weight distribution charts? I do see how that could be useful to visualize the data currently stored by the ANN, but I fail to see the usefulness of that data when trading. Or am I missing the point?  As for the "results", if the range is between 0% and 64% accuracy, I find it rather disappointing. Throwing a dice would probably give you similar results. IMO, this was a failed experiment, unless the goal was to confirm the "random walk" theory.

Bayne
NELODI:

Neural Network weight distribution charts? I do see how that could be useful to visualize the data currently stored by the ANN, but I fail to see the usefulness of that data when trading. Or am I missing the point?  As for the "results", if the range is between 0% and 64% accuracy, I find it rather disappointing. Throwing a dice would probably give you similar results. IMO, this was a failed experiment, unless the goal was to confirm the "random walk" theory.

A 64% strike rate and an always positive CRV (reward-to-risk ratio) is not too bad. Show me something better based only on price action with a solid foundation (/working long term).
Enrique Dangeroux
Bayne:
A 64% strike rate and an always positive CRV is not too bad. Show me something better based only on price action with a solid foundation (/working long term).

The 64% is not fixed long term either, so you are comparing apples with oranges. 

Chris did not establish (or post) a baseline, so there is no way of knowing whether this "not too bad" 64% is actually an improvement over the baseline.

The baseline could be anything: from classic statistical prediction models to simply counting the percentage of red/green candles in the training data, for example.
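
To make the candle-counting idea concrete, here is a minimal sketch of such a baseline: always predict whichever candle colour was more frequent in the training data, then score that fixed guess on the unseen data. It is written in plain C++ rather than MQL5 so it compiles outside MetaTrader. The function name and the 1/0 encoding of green/red candles are assumptions made for illustration, not anything taken from Chris's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: accuracy of "always predict the majority class".
// Labels are assumed to be 1 (green/up candle) or 0 (red/down candle).
double majorityBaselineAccuracy(const std::vector<int> &train,
                                const std::vector<int> &test)
{
   assert(!train.empty() && !test.empty());
   // Count green candles in the training data only (no peeking at the test set)
   std::size_t upTrain = 0;
   for (int label : train)
      upTrain += (label == 1);
   const int prediction = (2 * upTrain >= train.size()) ? 1 : 0;
   // Score that single fixed prediction on the unseen data
   std::size_t hits = 0;
   for (int label : test)
      hits += (label == prediction);
   return static_cast<double>(hits) / test.size();
}
```

Whatever number this returns for the same test period is the figure a network's accuracy would have to beat; if the test period happened to be strongly one-sided, even this trivial guess can land well above 50%.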

Chris70

Okay, there's something going on here... First of all, it's not a dispute - we all want the same thing. That being said, I recall having been very open-minded and cautious about overly optimistic claims, especially in my introductory words at the beginning of this thread. If it doesn't work, I walk away without regrets, and I also don't see the point in convincing anybody. You all trade with your own money, not mine. We all fight our own struggle, and I'm okay with being wrong.

The main question is how much the markets can be anticipated AT ALL, based on history and current price action. With that question in mind, I'm just trying to find the best possible information there is. Somehow, we're all trying to "predict", with whatever method is in place - indicators, fundamentals, machine learning... If, on the other hand, we had no opinion about the market at all, we should all stop at this point, walk away and never trade again.

With this in mind, the problem with classic "indicators" is, that we're trying to impose a "formula" onto the recent market data, that we BELIEVE to reveal something useful to us (and maybe it does), instead of trying to find the BEST possible formula under the given circumstances.

I'm less of an advocate for "belief" as long as there is a mathematical method for better knowledge. Machine learning is nothing less.

And if there just is no knowledge to be found? Okay, then machine learning shows us exactly that! But luckily (at least this is what my personal experience is showing me) there actually is some useful hidden knowledge.

Selections of indicators are usually based on personal judgement, trial and error and lack objectivity (at least if we neglect genetic indicator selection methods at this point).

This is where I don't share Bayne's opinion: indicators are an unnecessary limitation on the available information. Just take something as simple as a moving average or RSI: you can always calculate those values from a series of prices, but not the other way around. Neural networks, on the other hand, might even find relationships within the raw(!), quite possibly redundant information that just didn't occur to us.
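
A tiny illustration of the "not the other way around" part, as a sketch in plain C++ (the helper name and the two 4-bar windows are made up for the example): two windows with opposite price action produce exactly the same moving average, so the average alone cannot tell them apart.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simple moving average over the whole window (hypothetical helper name).
double sma(const std::vector<double> &prices)
{
   assert(!prices.empty());
   double sum = 0.0;
   for (double p : prices)
      sum += p;
   return sum / prices.size();
}

// Two made-up 4-bar windows: one trends up, one trends down.
const std::vector<double> rising  = {1.0, 2.0, 3.0, 4.0};
const std::vector<double> falling = {4.0, 3.0, 2.0, 1.0};
// sma(rising) == sma(falling) == 2.5, although the price action is opposite.
```

A network fed the raw windows sees the difference; a network fed only the averaged value does not.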

Again, you can all do what you want with your time and money, I'm not trying to prove anything. This is an open discussion about methods.

Accuracy results beyond 60% (on unseen data!) are possible, and those figures are averages (64% ain't 0-64%). Of course, concrete results also depend on the timeframe and the exact strategy. I'm trying different things here and they're not all the same. So far, I can say that looking at multicurrency data has helped a lot.

When I began this thread, I also didn't know that performance is so little of an issue with big neural networks under Mql5. When I chose the autoencoder+LSTM approach, I wasn't really aware of this. I have learned that Mql5 is powerful enough that we don't really need the autoencoder method. Mql5 can very well handle big recurrent networks in reasonable training time. Anybody want to implement machine learning in Mql5? Just do it, it works.

Do I currently use any of the mentioned preliminary results for actual real money trading? Clear honest answer: No (for that at this time I have other methods in place, mostly based on polynomial regression and momentum). But the results are promising enough to tell that I will soon. I just try a lot of things in order to find the best solution. Who's not? Lastly, it ain't a pissing contest, is it?

Happy trading everybody (after a hopefully nice weekend)!

Cheers,

Chris.

NELODI

Ok, then ... here's my "contribution" to this effort ;) I don't find it very useful, but ... here is a very simple Neural Network, written in pure MQL5, without any external dependencies (untested) ...

// Neural Network Parameters ...
#define DEFAULT_LEARNING_RATE 0.001
#define DEFAULT_ERROR_THRESHOLD 1.0
#define DEFAULT_WEIGHTS_INIT_FACTOR 0.5
#define INPUT_NEURONS  100
#define HIDDEN_NEURONS 80
#define OUTPUT_NEURONS 5

//- Neurons and their weights -
double FInputLayer[INPUT_NEURONS+1];    // the last element is used for BIAS = 1.0
double FHiddenLayer[HIDDEN_NEURONS+1];  // the last element is used for BIAS = 1.0
double FTargetLayer[OUTPUT_NEURONS];    // target output when training the network
double FOutputLayer[OUTPUT_NEURONS];    // output produced by the network

double FHiddenLayerWeights[HIDDEN_NEURONS+1][INPUT_NEURONS+1];
double FOutputLayerWeights[OUTPUT_NEURONS+1][HIDDEN_NEURONS+1];

double FErrorThreshold = DEFAULT_ERROR_THRESHOLD;
double FLearningRate = DEFAULT_LEARNING_RATE;
double FWeightsInitFactor = DEFAULT_WEIGHTS_INIT_FACTOR;

int FNNeuronError = 0;
double FTrainingError = 0;

//+------------------------------------------------------------------+
void bpInit()
  {
   FNNeuronError = 0;
   FTrainingError = 0;
   bpInitWeights();
  }
//+------------------------------------------------------------------+
void bpInitWeights()
  {
   int i, j, k;
   MathSrand(GetTickCount());
//--- Initializes the hidden layer weights
   for(j=0; j<=HIDDEN_NEURONS; j++)
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayerWeights[j, i] = (MathRand()/32767.0 - 0.5) * (FWeightsInitFactor * 2.0);
//--- Initializes the output layer weights
   for(k=0; k<=OUTPUT_NEURONS; k++)
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayerWeights[k, j] = (MathRand()/32767.0 - 0.5) * (FWeightsInitFactor * 2.0);
  }
//+------------------------------------------------------------------+
void bpApply()
  {
   int i, j, k;
   FInputLayer[INPUT_NEURONS] = 1.0; // input layer's bias
   FHiddenLayer[HIDDEN_NEURONS] = 1.0;  // hidden layer's bias
//--- Feedforwards from INPUT to HIDDEN layer
   for(j=0; j<HIDDEN_NEURONS; j++)
     {
      FHiddenLayer[j] = 0;
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayer[j] += FInputLayer[i] * FHiddenLayerWeights[j, i];
      FHiddenLayer[j] = BipolarSigmoid(FHiddenLayer[j]);
     }
//--- Feedforwards from HIDDEN to OUTPUT layer
   for(k=0; k<OUTPUT_NEURONS; k++)
     {
      FOutputLayer[k] = 0;
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayer[k] += FHiddenLayer[j] * FOutputLayerWeights[k, j];
      FOutputLayer[k] = BipolarSigmoid(FOutputLayer[k]);
     }
  }
//+------------------------------------------------------------------+
double HiddenErrors[HIDDEN_NEURONS+1];
double HiddenWeightsCorrection[HIDDEN_NEURONS+1][INPUT_NEURONS+1];
double OutputErrors[OUTPUT_NEURONS];
double OutputWeightsCorrection[OUTPUT_NEURONS][HIDDEN_NEURONS+1];
//+------------------------------------------------------------------+
void bpTrain()
  {
   int i, j, k;
   double Error;
//--- The feedforward phase
   bpApply();
//--- The backpropagation of error phase: output layer errors
   for(k=0; k<OUTPUT_NEURONS; k++)
     {
      Error = FTargetLayer[k] - FOutputLayer[k];
      if(MathAbs(Error) >= FErrorThreshold)
         FNNeuronError++;
      FTrainingError += Error * Error;
      OutputErrors[k] = Error * BipolarSigmoidDerivation(FOutputLayer[k]);

      for(j=0; j<=HIDDEN_NEURONS; j++)
         OutputWeightsCorrection[k, j] = FLearningRate * OutputErrors[k] * FHiddenLayer[j];
     }
//--- Computes hidden layer error information (uses the output weights before they are updated)
   for(j=0; j<HIDDEN_NEURONS; j++)
     {
      HiddenErrors[j]=0;
      //- sum the error over ALL output neurons first ...
      for(k=0; k<OUTPUT_NEURONS; k++)
         HiddenErrors[j] += OutputErrors[k] * FOutputLayerWeights[k, j];
      //- ... then apply the derivative once
      HiddenErrors[j] *= BipolarSigmoidDerivation(FHiddenLayer[j]);
      for(i=0; i<=INPUT_NEURONS; i++)
         HiddenWeightsCorrection[j, i] = FLearningRate * HiddenErrors[j] * FInputLayer[i];
     }
//--- Updates output layer weights and bias
   for(k=0; k<OUTPUT_NEURONS; k++)
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayerWeights[k, j] += OutputWeightsCorrection[k, j];
//--- Updates hidden layer weights and bias
   for(j=0; j<HIDDEN_NEURONS; j++)
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayerWeights[j, i] += HiddenWeightsCorrection[j, i];
  }
//+------------------------------------------------------------------+
double BipolarSigmoid(double x)
  {
   double e=MathExp(-x);
   if(e!=-1.0)
      return(2.0/(1.0+e))-1.0;
   else
      return(0);
  }
//+------------------------------------------------------------------+
double BipolarSigmoidDerivation(double Fx)
  {
   return((1.0+Fx)*(1.0-Fx))/2.0;
  }
//+------------------------------------------------------------------+
Chris70

Hey, thanks a lot for your contribution.

Just a few thoughts:

1. Why only one fixed hidden layer? It's easy at this point to add a layers dimension "l" to your weights and hidden neurons and it just adds one more 'for' loop. It's harder if you change this once the code gets more complicated.

Another suggestion: don't choose different variable names for input layers, hidden layers and outputs. Just assign index [0] to the input layer, [layers-1] to the output layer (assuming that "layers" is the total number of layers) and anything in between to the hidden layers. This seems unnecessary with only one hidden layer, but makes things easier with more complex networks. When you then do the backpropagation, you can still declare the output errors separately before cycling through the hidden layers, i.e. by specifically referring to [layers-1] for the outputs. Different names are really unnecessary. By the way, I also first did it just like you did and changed it when I came upon some disadvantages later. Yes, of course there are differences in how the layers are handled, e.g. an input layer has no inputs of its own, but this can all be addressed by array indices.
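
To show what the layer index buys, here is a rough sketch of the forward pass with one array per quantity and a layer dimension in front. It is plain C++ with `std::vector` so it compiles outside MetaTrader; the struct name, the member names and the use of tanh as the activation are all illustrative assumptions, not taken from either posted code. The bias is handled as one extra weight per neuron instead of an extra neuron.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the "one index per layer" layout: layer 0 is the input,
// layer layers-1 is the output, everything in between is hidden.
struct LayeredNet
{
   std::vector<std::vector<double>> neuron;               // neuron[l][j]
   std::vector<std::vector<std::vector<double>>> weight;  // weight[l][j][i]; the last i is the bias weight

   explicit LayeredNet(const std::vector<std::size_t> &sizes)
   {
      assert(sizes.size() >= 2);
      neuron.resize(sizes.size());
      weight.resize(sizes.size());
      for (std::size_t l = 0; l < sizes.size(); l++)
      {
         neuron[l].assign(sizes[l], 0.0);
         if (l > 0) // the input layer has no incoming weights
            weight[l].assign(sizes[l], std::vector<double>(sizes[l - 1] + 1, 0.0));
      }
   }

   // Forward pass: the loop over l replaces the separate
   // input->hidden and hidden->output blocks.
   void forward()
   {
      for (std::size_t l = 1; l < neuron.size(); l++)
         for (std::size_t j = 0; j < neuron[l].size(); j++)
         {
            const std::size_t inputs = neuron[l - 1].size();
            double sum = weight[l][j][inputs]; // bias weight times 1.0
            for (std::size_t i = 0; i < inputs; i++)
               sum += neuron[l - 1][i] * weight[l][j][i];
            neuron[l][j] = std::tanh(sum); // any activation would do here
         }
   }
};
```

Going from one hidden layer to three is then only a change of the size list, e.g. {100, 80, 5} versus {100, 80, 40, 20, 5}; the loops stay the same.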

2. What is the purpose of the "error threshold"? I get that you calculate the error of each output neuron by the statement 

Error = FTargetLayer[k] - FOutputLayer[k];

because this is the derivative of the MSE error function (or more precisely: the correct derivative is Error = FOutputLayer[k] - FTargetLayer[k]; i.e. the difference is just the other way around, -(label-output), which is the same as +(output-label). So you get the opposite sign, but this detail doesn't matter as long as you handle the sign consistently later).

but then with MSE the total training error should be something like

double MSE=0;
for (int k=0;k<OUTPUT_NEURONS;k++)
  {MSE+=0.5*pow(FTargetLayer[k]-FOutputLayer[k],2);}

I can't find this in your code. Adding a threshold and just taking the square of the error instead of half the square seems wrong (because the derivative of x² is 2x, so the derivative of 0.5x² is just x).
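
The derivative claim is easy to cross-check numerically. The following is a small sketch in plain C++ (the function names are made up for the example): it compares the analytic derivative of the half squared error with a central finite difference.

```cpp
#include <cassert>
#include <cmath>

// Half squared error for a single output neuron.
double halfSquaredError(double output, double target)
{
   return 0.5 * std::pow(target - output, 2);
}

// Analytic derivative with respect to the output: (output - target).
double halfSquaredErrorDrv(double output, double target)
{
   return output - target;
}

// Central finite difference, used only to cross-check the formula above.
double numericDrv(double output, double target, double h = 1e-6)
{
   assert(h > 0.0);
   return (halfSquaredError(output + h, target) -
           halfSquaredError(output - h, target)) / (2.0 * h);
}
```

With target 1.0 and output 0.3, both give about -0.7. A negative gradient means the output has to go up, so subtracting learning rate times gradient moves the weights in the same direction as adding learning rate times (target - output), which is the form used in the posted bpTrain().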

3. Why the e!=-1 statement in the activation function? MathExp(-x) can never be negative (so never -1.0), because e^x ranges from plus infinity for large x values down to asymptotically approaching zero for very negative values ( https://en.wikipedia.org/wiki/Exponential_function).

By the way: a nice website for visualization of functions and computing derivatives is http://www.derivative-calculator.net. For example just input your (2.0/(1.0+exp(-x)))-1.0 there. It's fun to create your own custom activation functions there.
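
The same check can also be done in a few lines of code. This is a sketch in plain C++ with made-up function names: it takes the bipolar sigmoid formula from the posted network (without the e != -1.0 branch), computes the derivative from the function value the way BipolarSigmoidDerivation() does, and compares it with a finite difference.

```cpp
#include <cassert>
#include <cmath>

// Same formula as the posted BipolarSigmoid(), without the e != -1.0 branch.
double bipolarSigmoid(double x)
{
   return 2.0 / (1.0 + std::exp(-x)) - 1.0;
}

// Derivative expressed through the function value fx = bipolarSigmoid(x),
// as in the posted BipolarSigmoidDerivation().
double bipolarSigmoidDrvFromOutput(double fx)
{
   return (1.0 + fx) * (1.0 - fx) / 2.0;
}

// Central finite difference for comparison.
double bipolarSigmoidNumericDrv(double x, double h = 1e-5)
{
   assert(h > 0.0);
   return (bipolarSigmoid(x + h) - bipolarSigmoid(x - h)) / (2.0 * h);
}
```

The two derivatives agree, which also confirms that the derivation function must be fed the neuron's output and not its raw input sum. As a side note, this bipolar sigmoid is mathematically identical to tanh(x/2).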

All in all: seems like a good starting point for a growing project - with neural networks you never run out of possibilities... (saving/loading the weights to/from a file, different weight initialization methods, more flexible network architecture, memory cells, other loss functions... you get the idea).

Here's something if you want some more activation functions to play with (I shared an earlier version of this code somewhere on the forum, but this version is a little better). The advantage: you can keep the choice of activation function flexible, for example as an input variable such as "input ENUM_ACT_FUNCT actfunct"; later in your code you just write Activate(x,actfunct), or e.g. Activate(x,f_sigmoid), Activate(x,f_ReLU)..., and get the result for whatever activation function you selected (accordingly: DeActivate(x,actfunct) for the corresponding derivative).

//+------------------------------------------------------------------+
//|      list of available activation functions                      |
//+------------------------------------------------------------------+
enum ENUM_ACT_FUNCT
  {
   f_ident=0,        // identity function
   f_sigmoid=1,      // sigmoid (logistic)
   f_ELU=2,          // exponential linear unit (ELU)
   f_ReLU=3,         // rectified linear unit (ReLU)
   f_LReLU=4,        // leaky ReLU
   f_tanh=5,         // hyperbolic tangent (tanh)
   f_arctan=6,       // arcus tangent (arctan)
   f_arsinh=7,       // area sin. hyperbolicus (inv. hyperbol. sine)
   f_softsign=8,     // softsign (Elliot)
   f_ISRU=9,         // inverse square root unit (ISRU)
   f_ISRLU=10,       // inv.squ.root linear unit (ISRLU)
   f_softplus=11,    // softplus
   f_bentident=12,   // bent identity
   f_sinusoid=13,    // sinusoid
   f_sinc=14,        // cardinal sine (sinc)
   f_gaussian=15,    // gaussian
   f_differentiable_hardstep=16, // differentiable hardstep (custom)
   f_inv_diff_hardstep=17, // inverted differentiable hardstep (custom)
   f_softmax=18,     // normalized exponential (softmax)
   f_oblique_sigmoid=19 // oblique sigmoid (custom)
   // note: the softmax function can't be part of this library because it has no single input
   //       but needs the other neurons of the layer, too, so it needs to be defined from
   //       within the neural network code
  };
   
string CustomError="no nan or inf numbers found";

// +------------------------------------------------------------------+
// |       function Activate / DeActivate                             |
// +------------------------------------------------------------------+
double Activate(double x,ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return x;
      case f_sigmoid:   return sigmoid(x);
      case f_ELU:       return ELU(x);
      case f_ReLU:      return ReLU(x);
      case f_LReLU:     return LReLU(x);
      case f_tanh:      return modtanh(x);
      case f_arctan:    return arctan(x);
      case f_arsinh:    return arsinh(x);
      case f_softsign:  return softsign(x);
      case f_ISRU:      return ISRU(x);
      case f_ISRLU:     return ISRLU(x);
      case f_softplus:  return softplus(x);
      case f_bentident: return bentident(x);
      case f_sinusoid:  return sinusoid(x);
      case f_sinc:      return sinc(x);
      case f_gaussian:  return gaussian(x);
      case f_differentiable_hardstep: return diff_hardstep(x);
      case f_inv_diff_hardstep: return inv_diff_hardstep(x);
      case f_oblique_sigmoid: return oblique_sigmoid(x);
      default:          return x; //="identity function"
     }
  }

double DeActivate(double x,ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return 1;
      case f_sigmoid:   return sigmoid_drv(x);
      case f_ELU:       return ELU_drv(x);
      case f_ReLU:      return ReLU_drv(x);
      case f_LReLU:     return LReLU_drv(x);
      case f_tanh:      return modtanh_drv(x);
      case f_arctan:    return arctan_drv(x);
      case f_arsinh:    return arsinh_drv(x);
      case f_softsign:  return softsign_drv(x);
      case f_ISRU:      return ISRU_drv(x);
      case f_ISRLU:     return ISRLU_drv(x);
      case f_softplus:  return softplus_drv(x);
      case f_bentident: return bentident_drv(x);
      case f_sinusoid:  return sinusoid_drv(x);
      case f_sinc:      return sinc_drv(x);
      case f_gaussian:  return gaussian_drv(x);
      case f_differentiable_hardstep: return diff_hardstep_drv(x);
      case f_inv_diff_hardstep: return inv_diff_hardstep_drv(x);
      case f_oblique_sigmoid: return oblique_sigmoid_drv(x);
      default:          return 1; //="identity function" derivative
     }
  }

//+------------------------------------------------------------------+
//|      return act.function as string variable equivalent           |
//+------------------------------------------------------------------+
string actfunct_string(ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return "identity";
      case f_sigmoid:   return "sigmoid (logistic)";
      case f_ELU:       return "exponential linear unit (ELU)";
      case f_ReLU:      return "rectified linear unit (ReLU)";
      case f_LReLU:     return "leaky ReLU";
      case f_tanh:      return "hyperbolic tangent (tanh)";
      case f_arctan:    return "arcus tangent (arctan)";
      case f_arsinh:    return "area sinus hyperbolicus (inv. hyperbol.sine)";
      case f_softsign:  return "softsign";
      case f_ISRU:      return "inverse square root unit (ISRU)";
      case f_ISRLU:     return "inverse square root linear unit (ISRLU)";
      case f_softplus:  return "softplus";
      case f_bentident: return "bent identity";
      case f_sinusoid:  return "sinusoid";
      case f_sinc:      return "cardinal sine (sinc)";
      case f_gaussian:  return "gaussian";
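      case f_differentiable_hardstep: return "differentiable hardstep (custom)";
      case f_inv_diff_hardstep: return "inverted differentiable hardstep (custom)";
      case f_softmax:   return "normalized exponential (softmax)";
      case f_oblique_sigmoid: return "oblique sigmoid (custom)";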
      default:          return "";
     }
  }

// +------------------------------------------------------------------+
// |       function ELU / ELU_drv                                     |
// +------------------------------------------------------------------+
double ELU(double z)
  {
   if (z>0)
     {return z;}
   else
     {return ValidNumber(exp(z)-1,DBL_EPSILON-1);}
  }   

double ELU_drv(double z)
  {
   if (z>0)
     {return 1;} 
   else
     {return ValidNumber(exp(z),DBL_MIN);}
  }   

// +------------------------------------------------------------------+
// |       function sigmoid / sigmoid_drv                             |
// +------------------------------------------------------------------+   
 double sigmoid(double z)
  {
   if (z>0)
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MIN)),1);}
   else
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MAX)),0);}
  }

double sigmoid_drv(double z)
  {
   if (z>0)
     {return ValidNumber((1/(1+ValidNumber(exp(-z),DBL_MIN)))*(1-1/(1+ValidNumber(exp(-z),DBL_MIN))),0);}
   else
     {return ValidNumber((1/(1+ValidNumber(exp(-z),DBL_MAX)))*(1-1/(1+ValidNumber(exp(-z),DBL_MAX))),0);}
  }   

// +------------------------------------------------------------------+
// |       function oblique sigmoid / obl. sigmoid_drv (custom)       |
// +------------------------------------------------------------------+   
 double oblique_sigmoid(double z)
  {
   double alpha=0.01;
   if (z>0)
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MIN))+alpha*z,DBL_MAX);}
   else
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MAX))+alpha*z,-DBL_MAX);}
  }

double oblique_sigmoid_drv(double z)
  {
   double alpha=0.01;
   if (z>0)
     {return ValidNumber(ValidNumber(exp(-z),DBL_MIN)/pow(ValidNumber(exp(-z),DBL_MIN)+1,2)+alpha,alpha);}
   else
     {return ValidNumber(ValidNumber(exp(-z),DBL_MAX)/pow(ValidNumber(exp(-z),DBL_MAX)+1,2)+alpha,alpha);}
  } 

// +------------------------------------------------------------------+
// |       function ReLU / ReLU_drv                                   |
// +------------------------------------------------------------------+      
double ReLU(double z)
  {
   return MathMax(z,0);
  }
      
double ReLU_drv(double z)
  {
   return z>0;
  }

// +------------------------------------------------------------------+
// |       function LReLU / LReLU_drv                                 |
// +------------------------------------------------------------------+
double LReLU(double z)
  {
   if (z>0)
     {return z;}
   else
     {return 0.01*z;}
  }
      
double LReLU_drv(double z)
  {
   if (z>0)
     {return 1;}
   else
     {return 0.01;}
  }

// +------------------------------------------------------------------+
// |       function modtanh / modtanh_drv                             |
// +------------------------------------------------------------------+
double modtanh(double z)
  {
   if (z>0)
     {return ValidNumber(tanh(z),1);}
   else
     {return ValidNumber(tanh(z),-1);}
  }

double modtanh_drv(double z)
  {
   return ValidNumber(1-pow(tanh(z),2),0);
  }

//+------------------------------------------------------------------+
//|      function arctan / arctan_drv                                |
//+------------------------------------------------------------------+
double arctan(double z)
  {
   if (z>0)
     {return ValidNumber(atan(z),0.5*M_PI);}
   else
     {return ValidNumber(atan(z),-0.5*M_PI);}
  }
  
double arctan_drv(double z)
  {
   return ValidNumber(1/(pow(z,2)+1),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function arsinh / arsinh_drv                                |
//+------------------------------------------------------------------+
double arsinh(double z)
  {
   if (z>0)
     {return ValidNumber(asinh(z),DBL_MAX);}
   else
     {return ValidNumber(asinh(z),-DBL_MAX);}
  }
  
double arsinh_drv(double z)
  {
   return ValidNumber(1/sqrt(pow(z,2)+1),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function softsign / softsign_drv                            |
//+------------------------------------------------------------------+
double softsign(double z)
  {
   if (z>0)
     {return ValidNumber(z/(1+fabs(z)),1-DBL_EPSILON);}
   else
     {return ValidNumber(z/(1+fabs(z)),DBL_EPSILON-1);}
  }
  
double softsign_drv(double z)
  {
   return ValidNumber(1/pow(1+fabs(z),2),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function ISRU / ISRU_drv                                    |
//+------------------------------------------------------------------+
double ISRU(double z)
  {
   double alpha=1;
   if (z>0)
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),1/sqrt(alpha)-DBL_EPSILON);}
   else
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),DBL_EPSILON-1/sqrt(alpha));}
  }
  
double ISRU_drv(double z)
  {
   double alpha=1;
   return ValidNumber(pow(1/sqrt(1+alpha*pow(z,2)),3),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function ISRLU / ISRLU_drv                                  |
//+------------------------------------------------------------------+
// note: ISRLU="inverse square root linear unit"
double ISRLU(double z)
  {
   double alpha=1;
   if (z<0)
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),DBL_EPSILON-1/sqrt(alpha));}
   else
     {
      return z;
     }
  }
  
double ISRLU_drv(double z)
  {
   double alpha=1;
   if (z<0)
     {return ValidNumber(pow(1/sqrt(1+alpha*pow(z,2)),3),DBL_MIN);}
   else
     {return 1;}
  }

//+------------------------------------------------------------------+
//|      function softplus / softplus_drv                            |
//+------------------------------------------------------------------+
double softplus(double z)
  {
   if (z>0)
     {return ValidNumber(log(1+exp(z)),DBL_MAX);}
   else
     {return ValidNumber(log(1+exp(z)),DBL_MIN);}
  }
  
double softplus_drv(double z)
  {
   if (z>0)
     {return 1/(1+exp(-z));} //all positive results are valid (gradient is close to 1)
   else
     {return ValidNumber(1/(1+exp(-z)),DBL_MIN);}
  }

//+------------------------------------------------------------------+
//|      function bentident / bentident_drv                          |
//+------------------------------------------------------------------+
double bentident(double z)
  {
   if (z>0)
     {return ValidNumber((sqrt(pow(z,2)+1)-1)/2+z,DBL_MAX);}
   else
     {return ValidNumber((sqrt(pow(z,2)+1)-1)/2+z,-DBL_MAX);}
  }
  
double bentident_drv(double z)
  {
   return z/(2*sqrt(pow(z,2)+1))+1;
  }

//+------------------------------------------------------------------+
//|      function sinusoid / sinusoid_drv                            |
//+------------------------------------------------------------------+
double sinusoid(double z)
  {
   return sin(z);
  }
  
double sinusoid_drv(double z)
  {
   return cos(z);
  }

//+------------------------------------------------------------------+
//|      function sinc / sinc_drv                                    |
//+------------------------------------------------------------------+
double sinc(double z)
  {
   if (z==0)
     {return 1;}
   else
     {return ValidNumber(sin(z)/z,DBL_MIN);}
  }
  
double sinc_drv(double z)
  {
   if (z==0)
     {return 0;}
   else
     {
      if (z>0)
        {return ValidNumber(cos(z)/z-sin(z)/pow(z,2),-DBL_MIN);}
      else
        {return ValidNumber(cos(z)/z-sin(z)/pow(z,2),DBL_MIN);} 
     }
  }     

//+------------------------------------------------------------------+
//|      function gaussian / gaussian_drv                            |
//+------------------------------------------------------------------+
double gaussian(double z)
  {
   return ValidNumber(exp(-pow(z,2)),DBL_MIN);
  }
  
double gaussian_drv(double z)
  {
   if (z>0)
     {return ValidNumber(-2*z*pow(M_E,-pow(z,2)),-DBL_MIN);}
   else
     {return ValidNumber(-2*z*pow(M_E,-pow(z,2)),DBL_MIN);}
  }

//+------------------------------------------------------------------+
//|      function differentiable hardstep (custom) / _drv            |
//+------------------------------------------------------------------+
double diff_hardstep(double z)
  {
   double alpha=0.1;
   if (z>=0)
     {return ValidNumber(1+alpha*pow(z,2),DBL_MAX);}
   else
     {return ValidNumber(-alpha*pow(z,2),-DBL_MAX);} // result is <=0, so the overflow fallback must be negative
  }
double diff_hardstep_drv(double z)
  {
   double alpha=0.1;
   return ValidNumber(fabs(alpha*2*z),DBL_MAX);
  }
  
//+------------------------------------------------------------------+
//|      function inverted differentiable hardstep (custom) / _drv   |
//+------------------------------------------------------------------+
double inv_diff_hardstep(double z)
  {
   double alpha=0.1;
   if (z>=0)
     {return ValidNumber(1-alpha*pow(z,2),-DBL_MAX);} // overflows towards minus infinity
   else
     {return ValidNumber(alpha*pow(z,2),DBL_MAX);}    // overflows towards plus infinity
  }
double inv_diff_hardstep_drv(double z)
  {
   double alpha=0.1;
   return ValidNumber(-fabs(alpha*2*z),-DBL_MAX); // derivative is never positive
  }

//+------------------------------------------------------------------+
//|      ValidNumber                                                 |
//+------------------------------------------------------------------+
// purpose: returning the valid result (= no NaN or inf) that is closest to the max. expected mathematically correct result but can still be expressed as a double precision floating point number; return alternate if invalid;
// the equation doesn't need to be calculated twice as in e.g. if (MathIsValidNumber(a+b)){c=a+b;}else{c=d;}
// --> performance benefit with complicated equations
double ValidNumber(double equation,double alternate=0)
  {
   if (MathIsValidNumber(equation)){return equation;}else{return alternate;}
  }
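
For anyone who wants to try the guard pattern outside MetaTrader, here is a rough C++ stand-in. It assumes that `std::isfinite` behaves like MQL5's `MathIsValidNumber` (false for NaN and for plus/minus infinity); the lower-case function names are made up for this sketch, and the softplus example mirrors the fallbacks used in the library above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// C++ stand-in for the ValidNumber() helper: std::isfinite plays the role
// of MathIsValidNumber, so NaN and +/-infinity are replaced by the alternate.
double validNumber(double value, double alternate = 0.0)
{
   return std::isfinite(value) ? value : alternate;
}

// Example of the pattern: softplus overflows for large z because exp(z)
// becomes infinite, so the fallback is returned instead.
double softplusGuarded(double z)
{
   const double fallback = (z > 0) ? std::numeric_limits<double>::max()
                                   : std::numeric_limits<double>::min();
   return validNumber(std::log(1.0 + std::exp(z)), fallback);
}
```

The expression is evaluated once and only its result is inspected, which is the performance point made in the comment above. One design remark: for large positive z, softplus is practically equal to z itself, so returning z would be a much tighter fallback than the largest representable double.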