Neural networks, how to master them, where to start? - page 12

 
Neutron >> :

No matter how you spin the NS, no matter what you put in its inputs, there are certainly no miracles!

So, here is what we get: on the one hand, the more layers the NS has, the higher its predictive power; on the other hand, it is pointless to build more than three layers - a three-layer network is already a universal approximator.

In general, no, I'm not going to argue - I'm bored.

It follows that the more layers the NS has, the longer the training sample needed to train it. Not only does the training complexity grow as P^3, we may also simply run out of data!


That's the second time I've seen that exponent. Let's do the math.


Let's take as an example a network m - n - k, where the letters stand for the number of neurons in the input, hidden and output layers respectively.

The complexity of forward signal propagation is O(m*n + n*k) for a fully connected network.

The complexity of back propagation is similar.

Now let us introduce an additional hidden layer of the same size.

The complexity is O(m*n + n*n + n*k).

Let's take the ratio -- we get (m + n + k)/(m + k).
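
To make the arithmetic concrete, here is a tiny check of that ratio (my own sketch; the sizes m = 10, n = 8, k = 1 are arbitrary examples, not taken from the post):

```python
# Per-pass multiply count of an m-n-k network vs. the same net with a
# second hidden layer of n neurons, and the ratio between the two.
def cost_two_layer(m, n, k):
    return m * n + n * k                # O(m*n + n*k)

def cost_three_layer(m, n, k):
    return m * n + n * n + n * k        # O(m*n + n*n + n*k)

m, n, k = 10, 8, 1                      # example sizes (arbitrary)
print(cost_three_layer(m, n, k) / cost_two_layer(m, n, k))  # 1.727...
print((m + n + k) / (m + k))            # same value: (m + n + k)/(m + k) = 19/11
```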


In addition, introducing a 2nd hidden layer allows you to noticeably reduce the size of the 1st one.

To check this, I built three networks - with 1, 2 and 3 layers - in Mathcad and compared the results of predicting the sign of the quote increment one sample ahead (I gathered statistics over 100 independent experiments). The results are as follows:

1 layer - p = 10% (the fraction of correctly guessed signs is 1/2 + p).

2 layers - 15-16%

3 layers - 12%

There are some free parameters here: the input dimension and the number of neurons in the layer(s). The first was the same for all architectures, the second was tuned individually for each. We see that a 3-layer NS is not a panacea, and perhaps for us as traders the best option for the analytical block of an MTS is a two-layer network - from the viewpoint of maximum forecast accuracy and minimum training requirements (computing power, the amount of history needed, and not having that amount grow).
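
For anyone who wants to reproduce this kind of test without Mathcad, here is a rough sketch in Python (my own reconstruction, not the original code): synthetic increments stand in for real quotes, "1/2/3 layers" are counted as no / one / two hidden layers, and the input dimension (10) and hidden sizes (8 and 4) are arbitrary choices, so the numbers it prints will not match the percentages above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

def one_experiment(rng, d=10, n_train=500, n_test=200):
    # Toy AR(1)-style increments as a stand-in for real quote increments.
    x = rng.standard_normal(n_train + n_test + d) * 0.01
    for i in range(1, len(x)):
        x[i] += 0.2 * x[i - 1]
    X = np.array([x[i:i + d] for i in range(len(x) - d)])   # last d increments
    y = (x[d:] > 0).astype(int)                             # sign of the next one
    Xtr, ytr = X[:n_train], y[:n_train]
    Xte, yte = X[n_train:], y[n_train:]
    models = {
        "1 layer":  LogisticRegression(max_iter=1000),
        "2 layers": MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
        "3 layers": MLPClassifier(hidden_layer_sizes=(8, 4), max_iter=2000, random_state=0),
    }
    return {name: m.fit(Xtr, ytr).score(Xte, yte) for name, m in models.items()}

rng = np.random.default_rng(0)
runs = [one_experiment(rng) for _ in range(100)]            # 100 independent experiments
for name in runs[0]:
    p = np.mean([r[name] for r in runs]) - 0.5              # excess over coin-flipping
    print(f"{name}: p = {p:+.3f}")
```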

It is high time to think of fractal NS with fractional number of layers :)) . 2.5 would be just right.

 

A hypothetical way to master NS technology
step 1.

build an NS with a single buy/sell output, feed it Close[x], look at the chart and see - the network is noisy!
step 2.

Now we feed it something smoother than the raw quotes, but the NS is noisy anyway.
Why? Because the teacher (the target series) is ragged, and I'm too lazy to build one by hand. (You need a spiralizer here.)
step 3.

Read Reshetov's article, set up his NS, train it in the tester and notice - there is no explicitly defined error function at all.
So the Strategy Tester rumbles, the developer purrs, says Reshetov is clever, he figured it all out and invented a real Teacher.
However, clever as Reshetov is, my computer does not get along well with MT-4, so where is MT-5?
And with 4 inputs this "NS" is noisy again. Now it is the historical data that turns out to be uneven - it contains different types of markets, and we don't know which ones.)
.... so we repeat steps 1-3 in a loop.
step 4.

we realize that we are stuck - the network cannot be grown any further, MQL is slow, and training neural nets turns out to be rather far from trading.
step 5.

Thinking at a crossroads - by now we have started to get a feel for NS, we know that NS is not so much mathematics as technology.
Maybe NS Trader will save us - it has a better tester.
well and...
what's all this for?
step 6

While we invent a network and train it, in the process it becomes clearer what we really need, and the real thing ends up being done without an NS -
without an NS at all.
It turns out
the NS is only needed the way a dumb person is: you come to understand the thing yourself while you are explaining it to them )))

 
TheXpert wrote >>

Let's do the math.

On page 7 of this topic I posted an archive with the article; it gives an estimate of training complexity (pp. 65-66): C = P*w^2 = d*P^2 = (w^4)/d. Which means I was slightly off (a little bit pregnant): the complexity is proportional to d*P^2, or, expressed through the number of synapses, to (w^4)/d.
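
A quick consistency check on that chain of equalities (pure algebra on the quoted formula; I am assuming here that P is the training-sample length, w the number of weights/synapses and d the input dimension): all three expressions agree if the optimal sample length is taken as P = w^2/d, since

$$ C \;=\; P\,w^2 \;=\; \frac{w^2}{d}\,w^2 \;=\; \frac{w^4}{d}, \qquad d\,P^2 \;=\; d\left(\frac{w^2}{d}\right)^{2} \;=\; \frac{w^4}{d}. $$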

In addition, introducing a 2nd hidden layer allows you to noticeably reduce the size of the 1st one.

Where does that come from?
 
Neutron >> :
Where does that come from?

If not strictly, then indirectly - from the number of adjustable parameters. Strictly, I couldn't manage it. I'm not good at proofs.

And if I may add an idea of my own, I have actually thought for some time that the nonlinear perceptron structure with the best convergence speed is a herringbone, but I haven't checked it yet.

When I get around to it, I'll draw it.


Some New Year's ideas :)) .

 
Korey wrote >>
)) while he is explaining something to it, he comes to understand what he is explaining.)

There's an important point you chose not to notice - you only have to explain things to the NS once (and you come to understand them yourself in the process, as you correctly noted), and then, like a lathe, it will keep working on the changing world (the quotes), without requiring you to fully understand what it all means!

TheXpert wrote >>

perceptron structure -- a herringbone

What are you smoking?

Oh, I get it! Well, it sure looks like a Christmas tree. I have the same thing on an intuitive level.

 
Neutron >> :

...

Where does that come from?

read Haykin -------> section 4.15, Network simplification methods

 

It doesn't make any difference which learning algorithm you take - the result is the same.) You don't need to dig into the neural network, you need to look for inputs.

 
PraVedNiK wrote >>

read Haykin -------> section 4.15, Network simplification methods

No problem. Let's read it!

 

A spruce! Intuition has nothing to do with it: an NS is a pyramidal coder, + it is similar to the FFT.
For example, based on the pyramidal-coder model it is elementary to calculate the minimum number of neurons.

I.e. the pyramidal coder is the minimum covering of the NS being designed.
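
One possible reading of that remark (my own interpretation, not a claim about what was meant): if each layer halves the previous one, FFT-style, then the minimum layer sizes, total neuron count and depth follow immediately from the input dimension, as in this small sketch.

```python
# Layer sizes of a halving "pyramid" from a given input dimension down to a
# single output neuron (rounding up at each step).
def pyramid_layers(d_in):
    sizes = []
    n = d_in
    while n > 1:
        n = (n + 1) // 2
        sizes.append(n)
    return sizes

for d in (8, 16, 100):
    layers = pyramid_layers(d)
    print(f"{d} inputs -> layers {layers}, {sum(layers)} neurons, depth {len(layers)}")
```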

 
Neutron >> :

What are you smoking?

Quit :)

Oh, I get it! Well, it sure looks like a Christmas tree. I have the same thing on an intuitive level.
