Discussion of article "Neural Networks Made Easy" - page 3

 
Maxim Dmitrievsky:


what is "5 why's" and what do 4 layers have to do with it? Looked it up, a simple decision tree will answer this question. The universal approximator is the NS of 2 layers, which will answer any number of "why" :) The other layers are mainly used for data preprocessing in complex designs. For example, to compress an image from a large number of pixels and then recognise it.


"The 5 whys" is a technique for determining causality Wikipedia. The article gives for example how essentially a neural network is designed to find a causal relationship between past price movements and the future direction of price movement.

 
Andrey Azatskiy:
only the input signal must also be in this interval. By input signal I mean exactly the input signal to the neuron, not to the function under discussion.

Theoretically, the input signal to a neuron can be anything; its influence is corrected by the weight coefficient. If the input signal is small but has a significant impact on the overall solution, its weight coefficient will be increased during learning. If the signal is large but its influence on the result is negligible, its weight coefficient will be reduced, down to "0" (breaking the connection between neurons).
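As a rough sketch of this idea (my own illustration in Python, not code from the article), a single linear neuron trained with the simplest delta rule shows both effects: the weight of a small but important input grows, and the weight of a large but irrelevant input decays towards zero.

import random

w = [0.0, 0.0]  # weights of the two inputs
lr = 0.5        # learning rate
for _ in range(20000):
    x1 = random.uniform(-0.1, 0.1)  # small but important signal
    x2 = random.uniform(-1.0, 1.0)  # large but irrelevant signal
    target = 10.0 * x1              # the target depends only on x1
    y = w[0] * x1 + w[1] * x2       # the neuron's weighted sum
    err = target - y
    w[0] += lr * err * x1           # delta-rule weight update
    w[1] += lr * err * x2
print(w)  # w[0] ends up near 10, w[1] near 0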

 
Dmitriy Gizlyk:

"The 5 whys" is a technique for determining causality Wikipedia. The article is given for example as essentially a neural network designed to find a causal relationship between past price movements and the future direction of price movement.

I just don't understand the correlation between the number of questions and the number of layers. The input is a set of attributes, each of which must be "answered" (roughly speaking). The output is an aggregated result. One hidden layer may be enough, not necessarily 4. It is believed that an NN with 2 hidden layers can approximate any function; this is just for reference.

 

And to develop the topic, a question about the architecture of an NN: what does it depend on?

 
Maxim Dmitrievsky:

I just don't understand the correlation between the number of questions and the number of layers. The input is a set of attributes, each of which must be "answered" (roughly speaking). The output is an aggregated result. One hidden layer may be enough, not necessarily 4. It is believed that an NN with 2 hidden layers can approximate any function; this is just for reference.

The 5 Whys technique is based on a sequence of questions, where each question asks for the cause behind the previous answer. For example, we look at a rising price chart and construct questions (the answers are abstract, just to explain the technique):
1. Which way to trade? - Buy.
2. Why buy? - Because the trend is rising.
3. Why is the trend rising? - Because the MA50 is rising.
4. Why is the MA50 rising? - Because the average closing price of 50 candles with a shift of 1 is lower than the average closing price of the last 50 candles (see the code sketch below).

etc.
Since the questions are sequential and linked by cause and effect, we create layers to preserve this relationship. If we use only 2 layers, the cause-and-effect relationship is lost: the neural network analyses a set of independent options and chooses the best one.
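For illustration, here is a minimal sketch of the check from question 4 (my own Python code on a hypothetical series of closes, not code from the article): the MA50 is rising when the 50-candle average taken one candle back is below the current 50-candle average.

def sma(closes, period, shift=0):
    # simple moving average over `period` candles, taken `shift` candles back
    end = len(closes) - shift
    return sum(closes[end - period:end]) / period

closes = [1.1000 + 0.0001 * i for i in range(60)]  # hypothetical rising close prices
ma50_now = sma(closes, 50)             # average of the last 50 closes
ma50_prev = sma(closes, 50, shift=1)   # the same average one candle earlier
print("MA50 rising:", ma50_prev < ma50_now)  # True for this rising series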

 
Denis Kirichenko:

And to develop the topic, a question about the architecture of an NN: what does it depend on?

On the architect's understanding of the process. The article gives the simplest version of a neural network and does not consider convolutional and other architectures.

 
Dmitriy Gizlyk:

The 5 Whys technique is based on a sequence of questions, where each question asks for the cause behind the previous answer. For example, we look at a rising price chart and construct questions (the answers are abstract, just to explain the technique):
1. Which way to trade? - Buy.
2. Why buy? - Because the trend is rising.
3. Why is the trend rising? - Because the MA50 is rising.
4. Why is the MA50 rising? - Because the average closing price of 50 candles with a shift of 1 is lower than the average closing price of the last 50 candles.

etc.
Since the questions are sequential and linked by cause and effect, we create layers to preserve this relationship. If we use only 2 layers, the cause-and-effect relationship is lost: the neural network analyses a set of independent options and chooses the best one.

It makes no difference in what order we ask these questions; the result will be the same. There is no need to separate them into layers.

 
Dmitriy Gizlyk:

Good evening, Peter.
Inside, a neuron consists of 2 functions:
1. First we calculate the sum of all incoming signals, taking into account their weight coefficients. That is, we take the value at each input of the neuron, multiply it by the corresponding weight coefficient, and add up the resulting products.


Thus, we obtain a value that is fed to the input of the activation function.

2. The activation function converts the resulting sum into a normalised output signal. This can be either a simple logical (step) function or one of various sigmoid functions. The latter are more common, as they give a smoother transition between states.

The connection between neurons is organised as a direct transfer of the output value of one neuron to the input of the next neuron. In that case, referring back to point 1, the value arriving at the neuron's input is taken into account according to its weight coefficient.
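As a minimal sketch of points 1 and 2 in Python (my own illustration, not the article's code), a neuron is a weighted sum of its inputs passed through a sigmoid activation function:

import math

def neuron(inputs, weights):
    # point 1: weighted sum of the incoming signals
    s = sum(x * w for x, w in zip(inputs, weights))
    # point 2: the sigmoid activation normalises the sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-s))

out = neuron([0.5, -0.2, 0.8], [0.1, 0.4, -0.3])
print(out)  # this output is what gets passed to the next neuron's input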

Thanks. The article and the references helped me understand the essential purpose of neural networks - finding and processing an invariant within the variation space of a data set - and the simplest method of its technical implementation, which I have yet to fully understand. But the explanations are very lucid.
 
Maxim Dmitrievsky:

It makes no difference in what order we ask these questions; the result will be the same. There is no need to separate them into layers.

The questions here are given only as an illustration. The technique is used to find the root causes of an event, and the connection between the first and the last question is not always immediately obvious.
However, this example is given in the article to demonstrate the connection between the root-cause-analysis approach and the network architecture.
 
Peter Konow:
Thanks. The article and the references helped me understand the essential purpose of neural networks - finding and processing an invariant within the variation space of a data set - and the simplest method of its technical implementation, which I have yet to fully understand. But the explanations are very lucid.

If you want to understand the MLP structure, IMHO you'd be better off looking at this code:

#danila_zaytcev mlp 2018

import random
import math


class mlp:

    class activefunc:
        def __init__(self, func, derive):
            self.func = func      # activation function
            self.derive = derive  # its derivative, expressed through the activation output

    def __init__(self, structure, af, learnRate, moment, epohs):
        self.leanRate = learnRate
        self.af = af
        self.moment = moment
        self.epohs = epohs

        # structure[0] is the number of inputs, the remaining entries are layer sizes
        self.layerCount = len(structure) - 1
        self.weightCount = [None] * self.layerCount
        self.neuronsCount = [None] * self.layerCount
        self.Out = [None] * self.layerCount
        self.Err = [None] * self.layerCount
        self.Drv = [None] * self.layerCount
        self.Inputs = [None] * self.layerCount
        self.Weigthts = [None] * self.layerCount

        for l in range(self.layerCount):
            nLen = structure[l + 1]  # neurons in this layer
            wLen = structure[l] + 1  # inputs per neuron plus one bias weight
            self.weightCount[l] = wLen
            self.neuronsCount[l] = nLen

            self.Out[l] = [0.0] * nLen
            self.Err[l] = [0.0] * nLen
            self.Drv[l] = [0.0] * nLen
            self.Weigthts[l] = [None] * nLen

            for n in range(nLen):
                self.Weigthts[l][n] = [None] * wLen
                for w in range(wLen):
                    # small random initial weights
                    self.Weigthts[l][n][w] = (random.random() * 2 - 1) / wLen

    def forward(self, input):
        # weighted sum plus bias, then the activation function, layer by layer
        for l in range(self.layerCount):
            self.Inputs[l] = input
            for n in range(self.neuronsCount[l]):
                wcount = self.weightCount[l] - 1
                out = 0.0
                for w in range(wcount):
                    out += self.Weigthts[l][n][w] * input[w]
                out += self.Weigthts[l][n][wcount]  # bias weight
                out = self.af.func(out)
                self.Out[l][n] = out
                self.Drv[l][n] = self.af.derive(out)
            input = self.Out[l]  # this layer's outputs feed the next layer

    def backward(self, output):
        # output-layer error, smoothed with the momentum coefficient
        last = self.layerCount - 1
        for n in range(self.neuronsCount[last]):
            self.Err[last][n] *= self.moment
            self.Err[last][n] += (output[n] - self.Out[last][n]) * self.Drv[last][n] * (1.0 - self.moment)

        # propagate the error back through the hidden layers
        for l in range(last - 1, -1, -1):
            for n in range(self.neuronsCount[l]):
                backProp = 0
                for w in range(self.neuronsCount[l + 1]):
                    backProp += self.Weigthts[l + 1][w][n] * self.Err[l + 1][w]
                self.Err[l][n] = backProp * self.Drv[l][n]

    def update(self):
        # gradient step on every weight, including the bias weight
        for l in range(self.layerCount):
            for n in range(self.neuronsCount[l]):
                G = self.Err[l][n] * self.leanRate
                for w in range(self.weightCount[l] - 1):
                    self.Weigthts[l][n][w] += self.Inputs[l][w] * G
                self.Weigthts[l][n][self.weightCount[l] - 1] += G

    def learn(self, inputs, outputs):
        # stochastic training: a random sample is drawn on every pass
        for e in range(self.epohs):
            for i in range(len(inputs)):
                index = random.randint(0, len(inputs) - 1)
                self.forward(inputs[index])
                self.backward(outputs[index])
                self.update()

    def compute(self, vector):
        self.forward(vector)
        return self.Out[self.layerCount - 1]


def test_mlp():
    # outputs are 1.0 when both inputs have the same sign, -1.0 otherwise
    inputs = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
    outputs = [[1.0], [1.0], [-1.0], [-1.0]]

    # tanh activation; its derivative is written in terms of the output value y
    af = mlp.activefunc(math.tanh, lambda y: 1.0 - y ** 2)
    ml = mlp([2, 2, 1], af, 0.01, 0.1, 10000)
    ml.learn(inputs, outputs)

    for i in inputs:
        print(str(i) + " = " + str(ml.compute(i)))

It is five times smaller, guaranteed to do what you need, and in Python, which is much easier to understand.

But kudos to the author of the article, of course; it's cool to write an MLP yourself :)