How to use an ANNs in your real money? - Trading Systems

Taking Neural Networks to the next level

Chris70 · 2025-08-12T05:22:55.0000000Z

This thread isn't about a question or a problem, but rather the announcement, presentation and documentation of an exciting trading concept. I plan to do a series of postings here in order to keep you guys updated. Anybody who has an opinion on the topic, please don't hesitate to comment, even if you don't have profound machine learning knowledge (I'm still learning, too, and that never ends).

To those of you who are more familiar with machine learning: the particular topic of this series is Forex price FORECASTING with AUTO-ENCODERs combined with MULTIVARIATE FULLY-CONNECTED STACKED LSTM networks.

To those who are already intimidated by these fancy words: don't worry, it's not so complicated after all, and I'm pretty sure that you will grasp the concept after a few introductory explanations. To keep it easily understandable, I won't go into any calculus details. This is more about the idea. I still remember how it was when I first encountered neural networks and how abstract and complicated it all seemed. Believe me: it's not, once you are familiar with some basic terms.

I know that there are many EAs on the market that work with neural networks. Most of them work with "multilayer perceptrons" in their simplest form, which is nothing bad per se, but none of them is the holy grail. They usually suffer from the "garbage in / garbage out" problem, and any good results often come from the same overfitting as in many other "less intelligent" EAs. If you feed the network with data from lagging indicators, don't expect any real magic to happen.

Some of you who remember me from earlier posts might recall that I have a strong opinion about the limitations of predicting the future when it comes to trading. As of today, I think that there is more money to be made by reacting to the status quo, i.e. to statistical anomalies as they happen, than by forecasting tomorrow's anomalies. This particularly includes the various break-out and mean reversion techniques that I personally much prefer.

When it comes to forecasting, the task can, statistically speaking, be broken down to a time series analysis problem, just like we know them from many other fields, such as weather forecasting or forecasting of future sales, flights, etc. For time series that have some kind of repetitive pattern, methods like the so-called ARIMA model or the fast Fourier transform can do the job very well, as can some kinds of special neural networks (recurrent networks like GRU and LSTM). However, the problem with stock prices or currency pairs is the immense amount of noise and randomness, which makes valid predictions so difficult.

In my earlier experiences with time series forecasting in trading (also with LSTM networks), my final conclusion was that the method does in fact work, but that there is not much money left after subtracting spreads/commissions, and that the method is not superior to other trading methods such as polynomial regression channel break-outs, which I have good practical experience with. This is why I left the idea of price forecasting alone for some time. However, it's never a bad idea to put one's opinion to a validity retest.

In this project, I want to test whether I can make better predictions by making some adjustments to the classic LSTM forecasting concept. The combination of autoencoders with stacked LSTMs is nothing new and therefore not my invention, but I don't know of any realisation in a dedicated trading environment like Metatrader. I don't know what the outcome will be, and I might stop the project at any time if I realize that it doesn't work, so please understand that this project is more like a fun "scientific" investigation that stands apart from my real trading and is not (yet?) a ready-made expert advisor.

I am very well aware that "Python" is the go-to programming language when it comes to machine learning, especially with its powerful "Keras" library. I have some Python knowledge, so I could also do the same thing purely in Python; it's a conscious personal choice to realize it all in Metatrader only. I will also do it this way because I already have my own libraries for MLP and LSTM networks complete and working from earlier projects, so it won't be that much additional work.

Okay... with these words out of the way, let's start with a few topics that I plan to write about in the next posts, so that anybody, even without any previous machine learning knowledge, will understand what it is about:

1. What is a "neuron" and what is it good for?
2. What is a "multilayer perceptron"?
3. What is "backpropagation" and how do neural networks learn?
4. What is an "autoencoder" and how can it be used in trading and time series analysis?
5. What is a recurrent neural network (LSTM, GRU...) and what are the benefits?
6. Putting it all together

Next steps:

- practical realisation, debugging and making the networks "learn"
- hyperparameter optimization
- implementation of the networks in a trading system
- backtesting and forward-testing on unseen data

Have fun following me on the journey with the upcoming postings ... and please excuse any mistakes in my mediocre "Netflix English" (German is my main language, but the German part of the forum is less active, which is why I decided to post it here).

Chris.

NELODI 2019.10.12 11:46 #121

Chris70:

If you compare to manual trading: do you have 10000 candles on your chart at any time? Probably not. But you have an understanding of general market behaviour / trading rules / current market phase. With neural networks, all of this is stored in the network weights.

No, I do not look at 10000 candles on a chart, because I use indicators. And the indicators I use are calculated based on the 10000 most recent candles. The alternative (which most traders do) is to periodically switch to several higher time-frames, in order to keep their perspective. But ... when you are Scalping manually on multiple Symbols, switching time-frames becomes an obstacle, so using indicators that combine all that data into the time-frame you are actively trading is an absolute requirement.

As for Neural Networks, I have to admit that my knowledge is outdated, since my most "recent" experience with ANNs was about 30 years ago ;)

Anyway ... did you get any useful results from your ANN so far? I mean ... something you'd actually use to trade with (your) real money? That was the whole point of this Exercise. Or am I wrong?


Bayne 2019.10.12 14:25 #122

NELODI:

No, I do not look at 10000 candles on a chart, because I use indicators. And the indicators I use are calculated based on the 10000 most recent candles. The alternative (which most traders do) is to periodically switch to several higher time-frames, in order to keep their perspective. But ... when you are Scalping manually on multiple Symbols, switching time-frames becomes an obstacle, so using indicators that combine all that data into the time-frame you are actively trading is an absolute requirement.

As for Neural Networks, I have to admit that my knowledge is outdated, since my most "recent" experience with ANNs was about 30 years ago ;)

Anyway ... did you get any useful results from your ANN so far? I mean ... something you'd actually use to trade with (your) real money? That was the whole point of this Exercise. Or am I wrong?

While Chris rejects this idea, I still see a use in feeding a time series of higher-period indicators (e.g. a 200-period blablabla) into the net, because training on more than 100 bars per sequence would be relatively costly in terms of computing performance. (Surely feeding a bigger timeframe would be possible too.)

Btw the result a few pages ago looked quite convincing if it was done out of sample.

However, I still think that adding specific indicators could save training time and even find new (less dominant) patterns (faster), though a simple LSTM could find most of them too.

@Chris70

Also, the meta-labeling part suddenly reminds me of GANs, where a discriminator competes against the generator. Implementing a CNN as the discriminator could even improve the predictions or make them more realistic :/

Btw, especially because your broker is an ECN, you should be able to use any kind of ("ECN"-like) Forex data you can get. Maybe even Forex futures data...


NELODI 2019.10.12 14:58 #123

Bayne:
Btw the result a few pages ago looked quite convincing if it was done out of sample.

You mean the Screenshot with Bars and the projected high/low/close prices? Unless my eyes are deceiving me, these results don't look any better than a 3-period moving average. Why would you need an ANN for that?


Bayne 2019.10.12 15:19 #124

Pretty Sure it was #97.

NELODI 2019.10.12 15:29 #125

Bayne:
Pretty Sure it was #97.

Neural Network weight distribution charts? I do see how that could be useful to visualize the data currently stored by the ANN, but I fail to see the usefulness of that data when trading. Or am I missing the point? As for the "results", if the range is between 0% and 64% accuracy, I find it rather disappointing. Throwing a dice would probably give you similar results. IMO, this was a failed experiment, unless the goal was to confirm the "random walk" theory.
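One way to put a number on the "no better than chance" concern is a z-score of the observed hit rate against a 50% coin flip. The following is a minimal sketch in standalone C++ (not MQL5), using the normal approximation to the binomial distribution. The function name and its inputs are hypothetical; the thread does not state how many predictions the 64% figure is based on, so no conclusion about that figure is implied.

```cpp
#include <cmath>

// z-score of an observed hit rate against a 50% coin-flip null hypothesis,
// using the normal approximation to the binomial distribution.
// hits and trades are hypothetical inputs, not numbers taken from this thread.
double HitRateZScore(int hits, int trades)
  {
   if(trades <= 0)
      return 0.0;
   double p  = (double)hits / trades;
   double se = std::sqrt(0.25 / trades); // sqrt(p0*(1-p0)/n) with p0 = 0.5
   return (p - 0.5) / se;
  }
```

The same hit rate gives a larger z-score the more predictions it is based on, which is why a percentage alone says little without the sample size.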


Bayne 2019.10.12 17:20 #126

NELODI:

Neural Network weight distribution charts? I do see how that could be useful to visualize the data currently stored by the ANN, but I fail to see the usefulness of that data when trading. Or am I missing the point? As for the "results", if the range is between 0% and 64% accuracy, I find it rather disappointing. Throwing a dice would probably give you similar results. IMO, this was a failed experiment, unless the goal was to confirm the "random walk" theory.

64% strike rate and always a positive CRV is not too bad. Show me something better based only on price action with a solid foundation (/working long term)

Enrique Dangeroux 2019.10.12 18:21 #127

Bayne:
64% strike rate and always a positive CRV is not too bad. Show me something better based only on price action with a solid foundation (/working long term)

The 64% is not fixed long term either, so you are comparing apples with oranges.

Chris did not establish (or post) a baseline, so there is no way of knowing whether this "not too bad" 64% is actually an improvement over the baseline.

The baseline could be anything, from classic statistical prediction models to simply counting the percentage of red/green candles in the training data, for example.
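The candle-counting baseline mentioned above could be sketched like this, in standalone C++ rather than MQL5. The function name and the two price vectors are hypothetical, and candles that close exactly at the open are counted as "not up", which is an arbitrary choice.

```cpp
#include <cstddef>
#include <vector>

// Majority-class baseline: the share of candles that moved in the more
// frequent direction. Always predicting that direction would have scored
// this accuracy on the same data.
// open/close are hypothetical candle price arrays; a candle with
// close == open is counted as "not up" (arbitrary choice).
double MajorityBaselineAccuracy(const std::vector<double> &open,
                                const std::vector<double> &close)
  {
   std::size_t n = open.size() < close.size() ? open.size() : close.size();
   if(n == 0)
      return 0.0;
   std::size_t up = 0;
   for(std::size_t i = 0; i < n; i++)
      if(close[i] > open[i])
         up++;
   std::size_t majority = (up > n - up) ? up : n - up;
   return (double)majority / n;
  }
```

A model's accuracy on unseen data would then be compared against this number instead of against 50%.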


Chris70 2019.10.12 19:01 #128

Okay, there's something going on here... First of all, it's not a dispute - we all want the same thing. That being said, I recall having been very open-minded and conservative about overly optimistic claims, especially with my introductory words at the beginning of this thread. If it doesn't work, I walk away without regrets, and I also don't see the point in convincing anybody. You all trade with your own money, not mine. We all fight our own struggle, just as I'm okay with being wrong.

The main question is how much the markets can be anticipated AT ALL - based on history and current price action. Based on this question, I'm just trying to find the best possible information that there is. Somehow, we're all trying to "predict", with whatever method is in place - indicators, fundamentals, machine learning... If, on the other hand, we had no opinion about the market at all, we should all stop at this point, walk away and never trade again.

With this in mind, the problem with classic "indicators" is, that we're trying to impose a "formula" onto the recent market data, that we BELIEVE to reveal something useful to us (and maybe it does), instead of trying to find the BEST possible formula under the given circumstances.

I'm less of an advocate for "belief" as long as there is a mathematical method for better knowledge. Machine learning is nothing less.

And if there just is no knowledge to be found? Okay, then machine learning shows us exactly that! But luckily (at least this is what my personal experience is showing me) there actually is some useful hidden knowledge.

Selections of indicators are usually based on personal judgement, trial and error and lack objectivity (at least if we neglect genetic indicator selection methods at this point).

This is where I don't share Bayne's opinion: indicators are an unnecessary limitation upon the available information. Just take something as simple as a moving average or RSI: you can always calculate those values from a series of prices, but not the other way around. Neural networks, on the other hand, might even find relationships within the raw(!), quite possibly redundant information that simply didn't occur to us.
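The "not the other way around" point can be illustrated with a simple moving average. This is a minimal sketch in standalone C++ (not MQL5); the function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Simple moving average over the last `period` values of `prices`.
// Returns 0.0 if there are fewer than `period` prices.
double SimpleMovingAverage(const std::vector<double> &prices, std::size_t period)
  {
   if(period == 0 || prices.size() < period)
      return 0.0;
   double sum = 0.0;
   for(std::size_t i = prices.size() - period; i < prices.size(); i++)
      sum += prices[i];
   return sum / period;
  }
```

The windows {1, 2, 3} and {2, 2, 2} both give the same 3-period average, so the average alone cannot tell a rising window from a flat one, whereas the raw prices can.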

Again, you can all do what you want with your time and money, I'm not trying to prove anything. This is an open discussion about methods.

Accuracy results (on unseen data!) beyond 60% are possible on average (64% ain't 0-64%). Of course, concrete results also depend on the timeframe and the exact strategy. I'm trying different things here and they're not all the same. So far, I can say that looking at multicurrency data has helped a lot.

When I began this thread, I also didn't know that performance is so little of an issue with big neural networks under Mql5. When I chose the autoencoder+LSTM approach, I wasn't really aware of this. I have learned that Mql5 is powerful enough that we don't really need the autoencoder method. Mql5 can very well handle big recurrent networks in reasonable training time. Anybody want to implement machine learning in Mql5? Just do it, it works.

Do I currently use any of the mentioned preliminary results for actual real money trading? Clear honest answer: No (for that, I currently have other methods in place, mostly based on polynomial regression and momentum). But the results are promising enough to say that I will soon. I just try a lot of things in order to find the best solution. Who doesn't? Lastly, it ain't a pissing contest, is it?

Happy trading everybody (after a hopefully nice weekend)!

Cheers,

Chris.


NELODI 2019.10.12 23:24 #129

Ok, then ... here's my "contribution" to this effort ;) I don't find it very useful, but ... here is a very simple Neural Network, written in pure MQL5, without any external dependencies (untested) ...

// Neural Network Parameters ...
#define DEFAULT_LEARNING_RATE 0.001
#define DEFAULT_ERROR_THRESHOLD 1.0
#define DEFAULT_WEIGHTS_INIT_FACTOR 0.5
#define INPUT_NEURONS  100
#define HIDDEN_NEURONS 80
#define OUTPUT_NEURONS 5

//- Neurons and their weights -
double FInputLayer[INPUT_NEURONS+1];    // the last element is used for BIAS = 1.0
double FHiddenLayer[HIDDEN_NEURONS+1];  // the last element is used for BIAS = 1.0
double FTargetLayer[OUTPUT_NEURONS];    // target output when training the network
double FOutputLayer[OUTPUT_NEURONS];    // output produced by the network

double FHiddenLayerWeights[HIDDEN_NEURONS+1][INPUT_NEURONS+1];
double FOutputLayerWeights[OUTPUT_NEURONS+1][HIDDEN_NEURONS+1];

double FErrorThreshold = DEFAULT_ERROR_THRESHOLD;
double FLearningRate = DEFAULT_LEARNING_RATE;
double FWeightsInitFactor = DEFAULT_WEIGHTS_INIT_FACTOR;

int FNNeuronError = 0;
double FTrainingError = 0;

//+------------------------------------------------------------------+
void bpInit()
  {
   FNNeuronError = 0;
   FTrainingError = 0;
   bpInitWeights();
  }
//+------------------------------------------------------------------+
void bpInitWeights()
  {
   int i, j, k;
   MathSrand(GetTickCount());
//--- Initializes the hidden layer weights
   for(j=0; j<=HIDDEN_NEURONS; j++)
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayerWeights[j, i] = (MathRand()/32767.0 - 0.5) * (FWeightsInitFactor * 2.0);
//--- Initializes the output layer weights
   for(k=0; k<=OUTPUT_NEURONS; k++)
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayerWeights[k, j] = (MathRand()/32767.0 - 0.5) * (FWeightsInitFactor * 2.0);
  }
//+------------------------------------------------------------------+
void bpApply()
  {
   int i, j, k;
   FInputLayer[INPUT_NEURONS] = 1.0; // input layer's bias
   FHiddenLayer[HIDDEN_NEURONS] = 1.0;  // hidden layer's bias
//--- Feedforwards from INPUT to HIDDEN layer
   for(j=0; j<HIDDEN_NEURONS; j++)
     {
      FHiddenLayer[j] = 0;
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayer[j] += FInputLayer[i] * FHiddenLayerWeights[j, i];
      FHiddenLayer[j] = BipolarSigmoid(FHiddenLayer[j]);
     }
//--- Feedforwards from HIDDEN to OUTPUT layer
   for(k=0; k<OUTPUT_NEURONS; k++)
     {
      FOutputLayer[k] = 0;
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayer[k] += FHiddenLayer[j] * FOutputLayerWeights[k, j];
      FOutputLayer[k] = BipolarSigmoid(FOutputLayer[k]);
     }
  }
//+------------------------------------------------------------------+
double HiddenErrors[HIDDEN_NEURONS+1];
double HiddenWeightsCorrection[HIDDEN_NEURONS+1][INPUT_NEURONS+1];
double OutputErrors[OUTPUT_NEURONS];
double OutputWeightsCorrection[OUTPUT_NEURONS][HIDDEN_NEURONS+1];
//+------------------------------------------------------------------+
void bpTrain()
  {
   int i, j, k;
   double Error;
//--- The feedforward phase
   bpApply();
//--- The backpropagation of error phase: output layer
   for(k=0; k<OUTPUT_NEURONS; k++)
     {
      Error = FTargetLayer[k] - FOutputLayer[k];
      if(MathAbs(Error) >= FErrorThreshold)
         FNNeuronError++;
      FTrainingError += Error * Error;
      OutputErrors[k] = Error * BipolarSigmoidDerivation(FOutputLayer[k]);

      for(j=0; j<=HIDDEN_NEURONS; j++)
         OutputWeightsCorrection[k, j] = FLearningRate * OutputErrors[k] * FHiddenLayer[j];
     }
//--- Computes hidden layer weights error information
//    (separate loop: it needs all output errors and must not reuse the outer k)
   for(j=0; j<HIDDEN_NEURONS; j++)
     {
      HiddenErrors[j]=0;
      for(k=0; k<OUTPUT_NEURONS; k++)
         HiddenErrors[j] += OutputErrors[k] * FOutputLayerWeights[k, j];
      // apply the activation derivative once, after summing over all outputs
      HiddenErrors[j] *= BipolarSigmoidDerivation(FHiddenLayer[j]);
      for(i=0; i<=INPUT_NEURONS; i++)
         HiddenWeightsCorrection[j, i] = FLearningRate * HiddenErrors[j] * FInputLayer[i];
     }
//--- Updates output layer weights and bias
   for(k=0; k<OUTPUT_NEURONS; k++)
      for(j=0; j<=HIDDEN_NEURONS; j++)
         FOutputLayerWeights[k, j] += OutputWeightsCorrection[k, j];
//--- Updates hidden layer weights and bias
   for(j=0; j<HIDDEN_NEURONS; j++)
      for(i=0; i<=INPUT_NEURONS; i++)
         FHiddenLayerWeights[j, i] += HiddenWeightsCorrection[j, i];
  }
//+------------------------------------------------------------------+
double BipolarSigmoid(double x)
  {
   double e=MathExp(-x);
   if(e!=-1.0)
      return(2.0/(1.0+e))-1.0;
   else
      return(0);
  }
//+------------------------------------------------------------------+
double BipolarSigmoidDerivation(double Fx)
  {
   return((1.0+Fx)*(1.0-Fx))/2.0;
  }
//+------------------------------------------------------------------+


Chris70 2019.10.13 02:35 #130

Hey, thanks a lot for your contribution.

Just a few thoughts:

1. Why only one fixed hidden layer? It's easy at this point to add a layer dimension "l" to your weights and hidden neurons, and it just adds one more 'for' loop. It's harder to change this once the code gets more complicated.

Another suggestion: don't choose different variable names for input layers, hidden layers and outputs. Just assign index [0] to the input layer, [layers-1] to the output layer (assuming that "layers" is the total number of layers) and anything in between to the hidden layers. This seems unnecessary with only one hidden layer, but makes things easier with more complex networks. When you then do the backpropagation, you can still declare the output errors separately before cycling through the hidden layers, i.e. by specifically referring to [layers-1] for the outputs. Different names are really unnecessary. By the way, I also first did it just like you did and changed it when I came upon some disadvantages later. Yes, of course there are differences in how the layers are handled, e.g. an input layer has no inputs of its own, but this can all be addressed by array indices.
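The layer-index idea can be sketched as follows, in standalone C++ rather than MQL5. The struct, its field names and the use of tanh as the activation are all assumptions for illustration; this is not the code from either poster's library.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical layer-indexed network: layer 0 is the input layer,
// layer (layers-1) is the output layer, everything in between is hidden.
struct LayeredNet
  {
   // w[l][j][i]: weight from neuron i in layer l-1 to neuron j in layer l;
   // the last entry of each row is the bias weight. w[0] is unused.
   std::vector<std::vector<std::vector<double>>> w;
   // out[l][j]: output of neuron j in layer l
   std::vector<std::vector<double>> out;
  };

// Feedforward pass: the same code handles any number of hidden layers,
// at the cost of one extra loop over the layer index l.
// tanh is a stand-in activation (assumption).
void FeedForward(LayeredNet &net, const std::vector<double> &inputs)
  {
   if(net.w.empty())
      return;
   net.out.resize(net.w.size());
   net.out[0] = inputs;
   for(std::size_t l = 1; l < net.w.size(); l++)
     {
      const std::vector<double> &prev = net.out[l - 1];
      net.out[l].assign(net.w[l].size(), 0.0);
      for(std::size_t j = 0; j < net.w[l].size(); j++)
        {
         double sum = net.w[l][j][prev.size()]; // bias weight (bias input = 1.0)
         for(std::size_t i = 0; i < prev.size(); i++)
            sum += prev[i] * net.w[l][j][i];
         net.out[l][j] = std::tanh(sum);
        }
     }
  }
```

After the call, the network's result is in out[layers-1], so adding a hidden layer only means adding one more entry to w.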

2. What is the purpose of the "error threshold"? I get that you calculate the error of each output neuron by the statement

Error = FTargetLayer[k] - FOutputLayer[k];

because this is the derivative of the MSE error function. More precisely, the correct derivative is Error = FOutputLayer[k] - FTargetLayer[k]; i.e. the difference the other way around: -(label-output), which is the same as +(output-label). So you get the opposite sign, but this detail doesn't matter as long as you handle the sign consistently later.

But then, with MSE, the total training error should be something like

double MSE=0;
for (int k=0;k<OUTPUT_NEURONS;k++)
  {MSE+=0.5*pow(FTargetLayer[k]-FOutputLayer[k],2);}

I can't find this in your code. Adding a threshold and just taking the square of the error instead of half the square seems wrong (because the derivative of x² is 2x, so the derivative of 0.5x² is just x).
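The sign and the factor 0.5 can be checked numerically. This is a minimal sketch in standalone C++ (not MQL5); all three function names are hypothetical. It compares the analytic derivative of the half squared error with a central-difference estimate.

```cpp
#include <cmath>

// Half squared error for one output neuron.
double HalfSquaredError(double target, double output)
  {
   return 0.5 * (target - output) * (target - output);
  }

// Analytic derivative with respect to the output: output - target.
double HalfSquaredErrorDrv(double target, double output)
  {
   return output - target;
  }

// Central-difference estimate of the same derivative, for checking.
double NumericDrv(double target, double output, double h = 1e-6)
  {
   return (HalfSquaredError(target, output + h)
           - HalfSquaredError(target, output - h)) / (2.0 * h);
  }
```

With the 0.5 dropped from the loss, the numeric estimate would come out twice as large as output - target, which is the factor of 2 mentioned above.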

3. Why the e!=-1 statement in the activation function? MathExp(-x) can never be negative (so never -1.0), because the range for e^x is between plus infinity for high x values and asymptotically approaching zero for very negative values ( https://en.wikipedia.org/wiki/Exponential_function).

By the way: a nice website for visualization of functions and computing derivatives is http://www.derivative-calculator.net. For example just input your (2.0/(1.0+exp(-x)))-1.0 there. It's fun to create your own custom activation functions there.
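Following point 3, the activation can be written without the branch. This is a minimal sketch in standalone C++ (not MQL5) that reuses the two function names from the code under discussion. The bipolar sigmoid 2/(1+e^-x) - 1 is algebraically the same as tanh(x/2), and its derivative can be written through the function value Fx as (1+Fx)(1-Fx)/2.

```cpp
#include <cmath>

// Bipolar sigmoid without the e != -1 branch: exp(-x) is never negative,
// so the denominator 1 + exp(-x) can never be zero.
double BipolarSigmoid(double x)
  {
   return 2.0 / (1.0 + std::exp(-x)) - 1.0;
  }

// Derivative expressed through the function value Fx = BipolarSigmoid(x).
double BipolarSigmoidDerivation(double Fx)
  {
   return (1.0 + Fx) * (1.0 - Fx) / 2.0;
  }
```

Note that this derivative takes the neuron's output, not its input, which matches how the training code above calls it with FOutputLayer[k] and FHiddenLayer[j].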

All in all: seems like a good starting point for a growing project - with neural networks you never run out of possibilities... (saving/loading the weights to/from a file, different weight initialization methods, more flexible network architecture, memory cells, other loss functions... you get the idea).

Here's something if you want some more activation functions to play with (I shared an earlier version of this code somewhere on the forum, but this version is a little better). The advantage: you can keep the choice of activation function flexible, for example as an input variable such as "input ENUM_ACT_FUNCT actfunct". Later in your code you just write "Activate(x,actfunct)" or e.g. Activate(x,f_sigmoid), Activate(x,f_ReLU)... and get the result for whatever activation function you selected (accordingly: DeActivate(x,actfunct) for the corresponding derivative).

//+------------------------------------------------------------------+
//|      list of available activation functions                      |
//+------------------------------------------------------------------+
enum ENUM_ACT_FUNCT
  {
   f_ident=0,        // identity function
   f_sigmoid=1,      // sigmoid (logistic)
   f_ELU=2,          // exponential linear unit (ELU)
   f_ReLU=3,         // rectified linear unit (ReLU)
   f_LReLU=4,        // leaky ReLU
   f_tanh=5,         // hyperbolic tangent (tanh)
   f_arctan=6,       // arcus tangent (arctan)
   f_arsinh=7,       // area sin. hyperbolicus (inv. hyperbol. sine)
   f_softsign=8,     // softsign (Elliot)
   f_ISRU=9,         // inverse square root unit (ISRU)
   f_ISRLU=10,       // inv.squ.root linear unit (ISRLU)
   f_softplus=11,    // softplus
   f_bentident=12,   // bent identity
   f_sinusoid=13,    // sinusoid
   f_sinc=14,        // cardinal sine (sinc)
   f_gaussian=15,    // gaussian
   f_differentiable_hardstep=16, // differentiable hardstep (custom)
   f_inv_diff_hardstep=17, // inverted differentiable hardstep (custom)
   f_softmax=18,     // normalized exponential (softmax)
   f_oblique_sigmoid=19 // oblique sigmoid (custom)
   // note: the softmax function can't be part of this library because it has no single input
   //       but needs the other neurons of the layer, too, so it needs to be defined from
   //       within the neural network code
  };
   
string CustomError="no nan or inf numbers found";

// +------------------------------------------------------------------+
// |       function Activate / DeActivate                             |
// +------------------------------------------------------------------+
double Activate(double x,ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return x;
      case f_sigmoid:   return sigmoid(x);
      case f_ELU:       return ELU(x);
      case f_ReLU:      return ReLU(x);
      case f_LReLU:     return LReLU(x);
      case f_tanh:      return modtanh(x);
      case f_arctan:    return arctan(x);
      case f_arsinh:    return arsinh(x);
      case f_softsign:  return softsign(x);
      case f_ISRU:      return ISRU(x);
      case f_ISRLU:     return ISRLU(x);
      case f_softplus:  return softplus(x);
      case f_bentident: return bentident(x);
      case f_sinusoid:  return sinusoid(x);
      case f_sinc:      return sinc(x);
      case f_gaussian:  return gaussian(x);
      case f_differentiable_hardstep: return diff_hardstep(x);
      case f_inv_diff_hardstep: return inv_diff_hardstep(x);
      case f_oblique_sigmoid: return oblique_sigmoid(x);
      default:          return x; //="identity function"
     }
  }

double DeActivate(double x,ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return 1;
      case f_sigmoid:   return sigmoid_drv(x);
      case f_ELU:       return ELU_drv(x);
      case f_ReLU:      return ReLU_drv(x);
      case f_LReLU:     return LReLU_drv(x);
      case f_tanh:      return modtanh_drv(x);
      case f_arctan:    return arctan_drv(x);
      case f_arsinh:    return arsinh_drv(x);
      case f_softsign:  return softsign_drv(x);
      case f_ISRU:      return ISRU_drv(x);
      case f_ISRLU:     return ISRLU_drv(x);
      case f_softplus:  return softplus_drv(x);
      case f_bentident: return bentident_drv(x);
      case f_sinusoid:  return sinusoid_drv(x);
      case f_sinc:      return sinc_drv(x);
      case f_gaussian:  return gaussian_drv(x);
      case f_differentiable_hardstep: return diff_hardstep_drv(x);
      case f_inv_diff_hardstep: return inv_diff_hardstep_drv(x);
      case f_oblique_sigmoid: return oblique_sigmoid_drv(x);
      default:          return 1; //="identity function" derivative
     }
  }

//+------------------------------------------------------------------+
//|      return act.function as string variable equivalent           |
//+------------------------------------------------------------------+
string actfunct_string(ENUM_ACT_FUNCT f)
  {
   switch(f)
     {
      case f_ident:     return "identity";
      case f_sigmoid:   return "sigmoid (logistic)";
      case f_ELU:       return "exponential linear unit (ELU)";
      case f_ReLU:      return "rectified linear unit (ReLU)";
      case f_LReLU:     return "leaky ReLU";
      case f_tanh:      return "hyperbolic tangent (tanh)";
      case f_arctan:    return "arcus tangent (arctan)";
      case f_arsinh:    return "area sinus hyperbolicus (inv. hyperbol.sine)";
      case f_softsign:  return "softsign";
      case f_ISRU:      return "inverse square root unit (ISRU)";
      case f_ISRLU:     return "inverse square root linear unit (ISRLU)";
      case f_softplus:  return "softplus";
      case f_bentident: return "bent identity";
      case f_sinusoid:  return "sinusoid";
      case f_sinc:      return "cardinal sine (sinc)";
      case f_gaussian:  return "gaussian";
      case f_softmax:   return "normalized exponential (softmax)";
      default:          return "";
     }
  }

// +------------------------------------------------------------------+
// |       function ELU / ELU_drv                                     |
// +------------------------------------------------------------------+
double ELU(double z)
  {
   if (z>0)
     {return z;}
   else
     {return ValidNumber(exp(z)-1,DBL_EPSILON-1);}
  }   

double ELU_drv(double z)
  {
   if (z>0)
     {return 1;} 
   else
     {return ValidNumber(exp(z),DBL_MIN);}
  }   

// +------------------------------------------------------------------+
// |       function sigmoid / sigmoid_drv                             |
// +------------------------------------------------------------------+   
double sigmoid(double z)
  {
   if (z>0)
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MIN)),1);}
   else
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MAX)),0);}
  }

double sigmoid_drv(double z)
  {
   if (z>0)
     {return ValidNumber((1/(1+ValidNumber(exp(-z),DBL_MIN)))*(1-1/(1+ValidNumber(exp(-z),DBL_MIN))),0);}
   else
     {return ValidNumber((1/(1+ValidNumber(exp(-z),DBL_MAX)))*(1-1/(1+ValidNumber(exp(-z),DBL_MAX))),0);}
  }   

// +------------------------------------------------------------------+
// |       function oblique sigmoid / obl. sigmoid_drv (custom)       |
// +------------------------------------------------------------------+   
double oblique_sigmoid(double z)
  {
   double alpha=0.01;
   if (z>0)
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MIN))+alpha*z,DBL_MAX);}
   else
     {return ValidNumber(1/(1+ValidNumber(exp(-z),DBL_MAX))+alpha*z,-DBL_MAX);}
  }

double oblique_sigmoid_drv(double z)
  {
   double alpha=0.01;
   if (z>0)
     {return ValidNumber(ValidNumber(exp(-z),DBL_MIN)/pow(ValidNumber(exp(-z),DBL_MIN)+1,2)+alpha,alpha);}
   else
     {return ValidNumber(ValidNumber(exp(-z),DBL_MAX)/pow(ValidNumber(exp(-z),DBL_MAX)+1,2)+alpha,alpha);}
  } 

// +------------------------------------------------------------------+
// |       function ReLU / ReLU_drv                                   |
// +------------------------------------------------------------------+      
double ReLU(double z)
  {
   return MathMax(z,0);
  }
      
double ReLU_drv(double z)
  {
   return (z>0) ? 1.0 : 0.0; // explicit double instead of an implicit bool conversion
  }

// +------------------------------------------------------------------+
// |       function LReLU / LReLU_drv                                 |
// +------------------------------------------------------------------+
double LReLU(double z)
  {
   if (z>0)
     {return z;}
   else
     {return 0.01*z;}
  }
      
double LReLU_drv(double z)
  {
   if (z>0)
     {return 1;}
   else
     {return 0.01;}
  }

// +------------------------------------------------------------------+
// |       function modtanh / modtanh_drv                             |
// +------------------------------------------------------------------+
double modtanh(double z)
  {
   if (z>0)
     {return ValidNumber(tanh(z),1);}
   else
     {return ValidNumber(tanh(z),-1);}
  }

double modtanh_drv(double z)
  {
   return ValidNumber(1-pow(tanh(z),2),0);
  }

//+------------------------------------------------------------------+
//|      function arctan / arctan_drv                                |
//+------------------------------------------------------------------+
double arctan(double z)
  {
   if (z>0)
     {return ValidNumber(atan(z),0.5*M_PI);}
   else
     {return ValidNumber(atan(z),-0.5*M_PI);}
  }
  
double arctan_drv(double z)
  {
   return ValidNumber(1/(pow(z,2)+1),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function arsinh / arsinh_drv                                |
//+------------------------------------------------------------------+
double arsinh(double z)
  {
   if (z>0)
     {return ValidNumber(asinh(z),DBL_MAX);}
   else
     {return ValidNumber(asinh(z),-DBL_MAX);}
  }
  
double arsinh_drv(double z)
  {
   return ValidNumber(1/sqrt(pow(z,2)+1),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function softsign / softsign_drv                            |
//+------------------------------------------------------------------+
double softsign(double z)
  {
   if (z>0)
     {return ValidNumber(z/(1+fabs(z)),1-DBL_EPSILON);}
   else
     {return ValidNumber(z/(1+fabs(z)),DBL_EPSILON-1);}
  }
  
double softsign_drv(double z)
  {
   return ValidNumber(1/pow(1+fabs(z),2),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function ISRU / ISRU_drv                                    |
//+------------------------------------------------------------------+
double ISRU(double z)
  {
   double alpha=1;
   if (z>0)
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),1/sqrt(alpha)-DBL_EPSILON);}
   else
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),DBL_EPSILON-1/sqrt(alpha));}
  }
  
double ISRU_drv(double z)
  {
   double alpha=1;
   return ValidNumber(pow(1/sqrt(1+alpha*pow(z,2)),3),DBL_MIN);
  }

//+------------------------------------------------------------------+
//|      function ISRLU / ISRLU_drv                                  |
//+------------------------------------------------------------------+
// note: ISRLU="inverse square root linear unit"
double ISRLU(double z)
  {
   double alpha=1;
   if (z<0)
     {return ValidNumber(z/sqrt(1+alpha*pow(z,2)),DBL_EPSILON-1/sqrt(alpha));}
   else
     {
      return z;
     }
  }
  
double ISRLU_drv(double z)
  {
   double alpha=1;
   if (z<0)
     {return ValidNumber(pow(1/sqrt(1+alpha*pow(z,2)),3),DBL_MIN);}
   else
     {return 1;}
  }

//+------------------------------------------------------------------+
//|      function softplus / softplus_drv                            |
//+------------------------------------------------------------------+
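double softplus(double z)
  {
   // for z>0 use log(1+exp(z)) = z+log(1+exp(-z)): exp() can no longer overflow,
   // so large z returns ~z instead of falling back to DBL_MAX
   if (z>0)
     {return ValidNumber(z+log(1+exp(-z)),DBL_MAX);}
   else
     {return ValidNumber(log(1+exp(z)),DBL_MIN);}
  }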
  
double softplus_drv(double z)
  {
   if (z>0)
     {return 1/(1+exp(-z));} //all positive results are valid (gradient is close to 1)
   else
     {return ValidNumber(1/(1+exp(-z)),DBL_MIN);}
  }

//+------------------------------------------------------------------+
//|      function bentident / bentident_drv                          |
//+------------------------------------------------------------------+
double bentident(double z)
  {
   if (z>0)
     {return ValidNumber((sqrt(pow(z,2)+1)-1)/2+z,DBL_MAX);}
   else
     {return ValidNumber((sqrt(pow(z,2)+1)-1)/2+z,-DBL_MAX);}
  }
  
double bentident_drv(double z)
  {
   return z/(2*sqrt(pow(z,2)+1))+1;
  }

//+------------------------------------------------------------------+
//|      function sinusoid / sinusoid_drv                            |
//+------------------------------------------------------------------+
double sinusoid(double z)
  {
   return sin(z);
  }
  
double sinusoid_drv(double z)
  {
   return cos(z);
  }

//+------------------------------------------------------------------+
//|      function sinc / sinc_drv                                    |
//+------------------------------------------------------------------+
double sinc(double z)
  {
   if (z==0)
     {return 1;}
   else
     {return ValidNumber(sin(z)/z,DBL_MIN);}
  }
  
double sinc_drv(double z)
  {
   if (z==0)
     {return 0;}
   else
     {
      if (z>0)
        {return ValidNumber(cos(z)/z-sin(z)/pow(z,2),-DBL_MIN);}
      else
        {return ValidNumber(cos(z)/z-sin(z)/pow(z,2),DBL_MIN);} 
     }
  }     

//+------------------------------------------------------------------+
//|      function gaussian / gaussian_drv                            |
//+------------------------------------------------------------------+
double gaussian(double z)
  {
   return ValidNumber(exp(-pow(z,2)),DBL_MIN);
  }
  
double gaussian_drv(double z)
  {
   // same expression as before, written with exp() to match gaussian()
   if (z>0)
     {return ValidNumber(-2*z*exp(-pow(z,2)),-DBL_MIN);}
   else
     {return ValidNumber(-2*z*exp(-pow(z,2)),DBL_MIN);}
  }

//+------------------------------------------------------------------+
//|      function differentiable hardstep (custom) / _drv            |
//+------------------------------------------------------------------+
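double diff_hardstep(double z)
  {
   double alpha=0.1;
   if (z>=0)
     {return ValidNumber(1+alpha*pow(z,2),DBL_MAX);}
   else
     {return ValidNumber(-alpha*pow(z,2),-DBL_MAX);} // an overflow here runs towards -inf, so fall back to -DBL_MAX (DBL_MIN is a tiny positive number)
  }
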
double diff_hardstep_drv(double z)
  {
   double alpha=0.1;
   return ValidNumber(fabs(alpha*2*z),DBL_MAX);
  }
  
//+------------------------------------------------------------------+
//|      function inverted differentiable hardstep (custom) / _drv   |
//+------------------------------------------------------------------+
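double inv_diff_hardstep(double z)
  {
   double alpha=0.1;
   // the fallbacks follow the sign of the overflow: -inf for z>=0, +inf for z<0
   if (z>=0)
     {return ValidNumber(1-alpha*pow(z,2),-DBL_MAX);}
   else
     {return ValidNumber(alpha*pow(z,2),DBL_MAX);}
  }

double inv_diff_hardstep_drv(double z)
  {
   double alpha=0.1;
   return ValidNumber(-fabs(alpha*2*z),-DBL_MAX); // the result is never positive, so fall back to -DBL_MAX
  }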

//+------------------------------------------------------------------+
//|      ValidNumber                                                 |
//+------------------------------------------------------------------+
// purpose: return the result if it is a valid number (= no NaN or inf); otherwise return 'alternate',
// which the caller should set to the representable double closest to the mathematically expected result
// the equation doesn't need to be calculated twice as in e.g. if (MathIsValidNumber(a+b)){c=a+b;}else{c=d;}
// --> performance benefit with complicated equations
double ValidNumber(double equation,double alternate=0)
  {
   if (MathIsValidNumber(equation)){return equation;}else{return alternate;}
  }
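
A quick way to catch a wrong sign or a wrong formula in one of the `_drv` functions is to compare it against a numerical derivative. The sketch below is only an illustration and not part of the original code: it is written in plain C++ (so it can be compiled outside MetaTrader), `std::isfinite` stands in for `MathIsValidNumber`, and the helpers `numeric_drv` and `drv_matches` as well as the step size and tolerance of 1e-6 are hypothetical names and values chosen for the example. Only three of the function pairs from above are ported here.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Stand-in for ValidNumber(): std::isfinite plays the role of MQL5's MathIsValidNumber.
double ValidNumber(double equation, double alternate = 0.0) {
    return std::isfinite(equation) ? equation : alternate;
}

// Ports of three activation/derivative pairs from the listing above.
double softsign(double z) {
    return ValidNumber(z / (1 + std::fabs(z)), z > 0 ? 1 - DBL_EPSILON : DBL_EPSILON - 1);
}
double softsign_drv(double z) {
    return ValidNumber(1 / std::pow(1 + std::fabs(z), 2), DBL_MIN);
}
double ISRU(double z) {
    const double alpha = 1;
    const double bound = 1 / std::sqrt(alpha) - DBL_EPSILON;
    return ValidNumber(z / std::sqrt(1 + alpha * z * z), z > 0 ? bound : -bound);
}
double ISRU_drv(double z) {
    const double alpha = 1;
    return ValidNumber(std::pow(1 / std::sqrt(1 + alpha * z * z), 3), DBL_MIN);
}
double bentident(double z) {
    return ValidNumber((std::sqrt(z * z + 1) - 1) / 2 + z, z > 0 ? DBL_MAX : -DBL_MAX);
}
double bentident_drv(double z) {
    return z / (2 * std::sqrt(z * z + 1)) + 1;
}

// Central-difference estimate of f'(z); the default step h is an assumption.
double numeric_drv(double (*f)(double), double z, double h = 1e-6) {
    assert(h > 0);
    return (f(z + h) - f(z - h)) / (2 * h);
}

// True if the analytic derivative agrees with the numeric estimate within tol.
bool drv_matches(double (*f)(double), double (*f_drv)(double), double z, double tol = 1e-6) {
    return std::fabs(f_drv(z) - numeric_drv(f, z)) <= tol;
}
```

Calling `drv_matches(softsign, softsign_drv, 0.5)` should return true, whereas a deliberately mismatched pair such as `drv_matches(softsign, ISRU_drv, 0.5)` should return false. Piecewise functions such as ReLU are best checked away from the kink at z=0, where a central difference averages the two one-sided slopes.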
