Market etiquette or good manners in a minefield - page 15

 
Neutron >> :

You can look up the formulas yourself in the literature, which is abundant on the Internet.

Let's not rush. And don't try to complicate your life with all sorts of contrivances such as "non-linear learning" and the like; that's from the Evil One. Beauty and reliability lie in simplicity and harmony!

>> Save and have mercy...

 
:-)
 
Neutron >> :

Let's not rush. And don't try to complicate your life with all sorts of contrivances such as "non-linear learning" and the like; that's from the Evil One. Beauty and reliability lie in simplicity and harmony!

Ы? It's a classic. Anyway, I'll leave you to it.

 
Neutron >> :

You have a vector of input signals (let it be one-dimensional) of length n samples, and let sample n+1 be the test of how well the Network has been trained. You feed it this vector (the n samples), having set all weights to random values in the range +/-1 with a uniform probability density, and look at what the grid outputs. Suppose it gives +5.1, while test sample n+1 (the value the grid trained on the training vector should aim for) is +1.1. Then you take the difference between the obtained value and the desired one, +4, and add this value, keeping its sign, to each weight of the output neuron (if it is without FA), or you find the derivative of FA at this value and add that to the weights of the input neuron (if it has FA). And so on.
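To make the arithmetic concrete, that single step looks roughly like this (only a sketch; x is just a stand-in input vector):

    import numpy as np

    x = np.random.uniform(-1.0, 1.0, 10)       # stand-in input vector of n samples
    w = np.random.uniform(-1.0, 1.0, x.size)   # weights set to random values in +/-1

    out = np.dot(w, x)        # what the grid outputs, e.g. +5.1 in the example
    test = 1.1                # sample n+1, the value the trained grid should aim for
    delta = out - test        # the error: obtained minus desired, +4 in the example

    # Output neuron without FA: add the error, with its sign, to each weight.
    # With FA, it is the error times the derivative of FA instead.
    # (The textbook delta rule takes desired minus obtained and also scales by each input.)
    w += delta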

If you digest this piece, I'll tell you how to push further the error to the input weights of the first (input) layer.

1. As I understand it, the grid must have two modes of operation: 1 - learning, 2 - recognition, and these modes are incompatible, i.e. at any given time the grid is in only one of them.


2. A vector of input signals of length n is, for example, an array V[n,3] (for a grid with three inputs) of RSI values on n bars - right? Then sample n+1 is the same RSI on bar n+1. In that case I am training the grid to predict the future behaviour of RSI based on its previous behaviour.

If so, the weights are all clear, up to the point where I need to take the derivative of the non-smooth FA function. (I mean, I just don't know how to do that yet... Maybe just take the slope between two adjacent FA(RSI(i)) points? Well, OK, it's a technical issue - it'll be solved.)

 
paralocus wrote >>

1. As I understand it, the grid must have two modes of operation: 1 - learning, 2 - recognition, and these modes are incompatible, i.e. at any given time the grid is in only one of them.

2. A vector of input signals of length n is, for example, an array V[n,3] (for a grid with three inputs) of RSI values on n bars - right? Then sample n+1 is the same RSI on bar n+1. In that case I am training the grid to predict the future behaviour of RSI based on its previous behaviour.

If so, the weights are all clear, up to the point where I need to take the derivative of the non-smooth FA function. (I mean, I just don't know how to do that yet... Maybe just take the slope between two adjacent FA(RSI(i)) points? Well, OK, it's a technical issue - it'll be solved.)

It's very well explained in F. Wasserman, Neurocomputing: Theory and Practice.

 
Vinin >> :

Very well explained in F. Wasserman, Neurocomputing: Theory and Practice.

Yeah, thanks, I found one.

 
paralocus wrote >>

1. As I understand it, the grid must have two modes of operation: 1 - learning, 2 - recognition, and these modes are incompatible, i.e. at any given time the grid is in only one of them.

2. A vector of input signals of length n is, for example, an array V[n,3] (for a grid with three inputs) of RSI values on n bars - right? Then sample n+1 is the same RSI on bar n+1. In that case I am training the grid to predict the future behaviour of RSI based on its previous behaviour.

If so, the weights are all clear, up to the point where I need to take the derivative of the non-smooth FA function. (I mean, I just don't know how to do that yet... Maybe just take the slope between two adjacent FA(RSI(i)) points? Well, OK, it's a technical issue - it'll be solved.)

1. With the arrival of each new datum, the grid is trained on the new training vector and, immediately after training, produces a prediction one step ahead, and so on to infinity. I.e. we are talking about additional training of the NN at each step.

2. A grid with three inputs reads the last three samples of a vector of length n+1+3, and is trained on it by shifting sequentially one step at a time, n times in all.

There is no problem with the derivative. If we take the hyperbolic tangent FA = th(x) as the FA, then it is easy to find its derivative, dFA = 1 - th(x)^2, and the weight correction at the input of this neuron will be dw = delta*(1 - th(s)^2), where delta is the error between the grid output and the actual value of the sample, and s is the grid output.
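As a one-line sketch of that correction (here I read s as the neuron's weighted sum, i.e. the argument of th):

    import math

    def dw(delta, s):
        # dw = delta * (1 - th(s)^2): the error times the derivative of th at s.
        # If s were already th(x), i.e. the value after FA, the derivative would simply be 1 - s^2.
        return delta * (1.0 - math.tanh(s) ** 2)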

 
Neutron >> :

1. With the arrival of each new datum, the grid is trained on the new training vector and, immediately after training, produces a prediction one step ahead, and so on to infinity. I.e. we are talking about additional training of the NN at each step.

2. A grid with three inputs reads the last three samples of a vector of length n+1+3, and is trained on it by shifting sequentially one step at a time, n times in all.

There is no problem with the derivative. If we take the hyperbolic tangent FA = th(x) as the FA, then it is easy to find its derivative, dFA = 1 - th(x)^2, and the weight correction at the input of this neuron will be dw = delta*(1 - th(s)^2), where delta is the error between the grid output and the actual value of the sample, and s is the grid output.

That's it! So there's no need to retrain it from scratch. Oh, great!

2. A grid with three inputs reads the last three samples of a vector of length n+1+3, and is trained on it by shifting sequentially one step at a time, n times in all.


I must have misspoken here. Not a grid with three inputs, but a neuron with three synapses, each of which receives an input signal:

synapse 1 - th(RSI(i))

synapse 2 - th(RSI(i+dt))

synapse 3 - th(RSI(i+dt*2))

1. We initialize the synapse weights with random values (+/-1), which makes sense - there are only three of them.

2. Then we need to feed each input with n samples of the input signal, i.e. a sequence of input signals on n previous bars.

3. Then take the neuron's output on the n-th bar and compare it with the n+1-th value of the input signal; the difference (error), with its sign, should be added to each weight if the neuron's output has no FA.

If the output goes through FA, then we add to the weights the error multiplied by the derivative of FA (at the n-th bar).

4. Move one bar forward (the n+1-th becomes the n-th) and repeat steps 2 to 4.
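In code I picture steps 1 to 4 roughly like this (only a sketch to check my understanding: the input series is stand-in data, and I have added a small learning rate and scaled the correction by each synapse's input, textbook-style, since the steps above don't spell that part out):

    import numpy as np

    dt = 1                                        # lag between the three synapse inputs
    n = 100                                       # length of the training vector
    sig = np.tanh(np.random.randn(n + 2*dt + 1))  # stand-in for the th(RSI(i)) series

    w = np.random.uniform(-1.0, 1.0, 3)           # step 1: three synapse weights in +/-1
    eta = 0.1                                     # small learning rate (my addition)

    for i in range(n):                            # steps 2-4: shift one bar at a time, n times
        x = np.array([sig[i], sig[i + dt], sig[i + 2*dt]])  # the three synapse inputs
        s = np.dot(w, x)                          # weighted sum at the neuron
        out = np.tanh(s)                          # output through FA = th
        delta = sig[i + 2*dt + 1] - out           # error: next sample minus obtained output
        w += eta * delta * (1.0 - out**2) * x     # error * dFA * input, added to each weight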


Questions:

1. n = k*w*w/d ?

2. To calculate the error value, do I always subtract the test sample from the grid output value?

 

Yes!

Then let's move on to training epochs. There will be from 10 to 1000 of them, depending on goals and means, and we will look in more detail at how to form the weight-correction vector (it is in fact cumulative within one epoch), which has a length equal to the number of synapses w, over a training sample of n samples.
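Schematically the idea is something like this (only a sketch with made-up data; the point is that the corrections are summed into one vector of length w over the n samples, and the weights are touched only once per epoch):

    import numpy as np

    epochs = 100                                   # anywhere from 10 to 1000
    n, nw = 100, 3                                 # sample length and number of synapses w
    x = np.tanh(np.random.randn(n, nw))            # stand-in inputs for the n samples
    d = np.tanh(np.random.randn(n))                # stand-in desired values

    w = np.random.uniform(-1.0, 1.0, nw)           # synapse weights

    for epoch in range(epochs):
        corr = np.zeros(nw)                        # correction vector, one entry per synapse
        for i in range(n):
            s = np.dot(w, x[i])
            out = np.tanh(s)
            delta = d[i] - out                     # error on this sample
            corr += delta * (1.0 - out**2) * x[i]  # accumulate within the epoch
        w += corr / n                              # apply once per epoch (dividing by n is my choice)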

In short, don't bother yet.

 
Neutron >> :

Yes!

Then let's move on to training epochs. There will be from 10 to 1000 of them, depending on goals and means, and we will look in more detail at how to form the weight-correction vector (it is in fact cumulative within one epoch), which has a length equal to the number of synapses w, over a training sample of n samples.

In short, don't get too excited just yet.

Neutron, I'll take a short time-out. I need to rethink it all once more and put it into code, at least for a single neuron. A day or two, and then we'll continue.

Thank you very much!
