Neural network - page 5

 
gpwr wrote >> we don't understand each other.

In fact, you have written everything correctly. We just disagreed on the definitions.

gpwr wrote >> All textbooks say that training a network with a teacher (supervised training) is done by splitting the data into a training sample and a test sample. The network learns by minimizing the error on the training sample while the error on the test sample is monitored (out-of-sample test, or verification). Training stops when the error on the test sample stops decreasing (shown with a dashed line below), while the error on the training sample may continue to decrease, as shown in this figure.

With all this research, many people forget about the most important thing - profit. That is why there is one "but" that is not written in textbooks: achieving the minimum error on the OOS does not guarantee profit. Why? Because minimal error and profit are two different things; they may not be connected at all. The network does not need to reproduce the market on the OOS or on a real account - it is enough for it to give correct buy or sell signals at certain moments, i.e. to exceed a certain trigger threshold. Between these signals the network can behave as it pleases, as long as it does not cross that threshold. That is why the profit can be large even with a large error on the OOS.
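To make the point concrete, here is a small numpy sketch (all names, thresholds and numbers are invented for illustration): a model with a much larger pointwise error can still earn more, because only the thresholded buy/sell signals end up mattering.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" future returns and two hypothetical network outputs.
true_ret = rng.normal(0.0, 1.0, size=500)

# Network A tracks the returns closely, so its MSE is small.
out_a = true_ret + rng.normal(0.0, 0.1, size=500)

# Network B is very noisy between signals (large MSE), but it still gets the
# direction right whenever it crosses the trigger threshold.
out_b = 3.0 * np.sign(true_ret) * (np.abs(true_ret) > 1.0) + rng.normal(0.0, 1.0, size=500)

threshold = 1.5  # trade only when |output| exceeds this trigger level

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def signal_profit(pred, target, thr):
    # Sum of returns captured by the thresholded buy/sell signals.
    signals = np.where(pred > thr, 1, np.where(pred < -thr, -1, 0))
    return float(np.sum(signals * target))

print("A: mse=%.2f  profit=%6.1f" % (mse(out_a, true_ret), signal_profit(out_a, true_ret, threshold)))
print("B: mse=%.2f  profit=%6.1f" % (mse(out_b, true_ret), signal_profit(out_b, true_ret, threshold)))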

muallch wrote >>

At the beginning there is an out-of-sample - to set up the net. And then it is gone - the real future lies ahead, and it has to be predicted. What is the criterion for stopping the training - a certain error, the number of training runs, or something else?

And the question of the future is, of course, still open: even the OOS is a known future where we can control the profit, while we trade in a future that is unknown to us, and the main thing there is not to get the minimum error but to get the maximum profit. What the error turns out to be does not matter.
 
muallch wrote >>

At the beginning there is an out-of-sample - to set up the net. And then it is gone - the real future lies ahead, and it has to be predicted. What is the criterion for stopping the training - a certain error, the number of training runs, or something else?

For example, 100 epochs without a new minimum of the error on the control sample.
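A sketch of that stopping rule; train_one_epoch and validation_error are placeholders for whatever network code is actually used:

def train_with_early_stopping(train_one_epoch, validation_error,
                              patience=100, max_epochs=10000):
    # Stop when the error on the control (validation) sample has not set a
    # new minimum for `patience` consecutive epochs.
    best_err = float("inf")
    best_epoch = 0
    for epoch in range(max_epochs):
        train_one_epoch()             # one pass over the training sample
        err = validation_error()      # error on the control sample
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_err, best_epoch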

 
LeoV wrote >>

And the question of the future is, of course, still open: even the OOS is a known future where we can control the profit, while we trade in a future that is unknown to us, and the main thing there is not to get the minimum error but to get the maximum profit. What the error turns out to be does not matter.

You probably think that way because you have never had really stable results. Error and profit are interrelated; in principle, for each task it is possible to determine what error has to be achieved to get an acceptable trading system...
 
StatBars wrote >>
You probably think that way because you have never had really stable results. Error and profit are interrelated; in principle, for each task it is possible to determine what error has to be achieved to get an acceptable trading system...

What does "really stable results" mean to you?

 
StatBars wrote >> Error and profit are interrelated

They may or may not be related - that is the big question. But it is certain that the minimum error on the OOS does not mean and does not lead to the maximum profit on a real account, just as the maximum profit on a real account does not mean the minimum error on it.

 
gpwr wrote >>

You have misunderstood the essence of my reasoning. I was not talking about the relationship between an "untrained" network and trading results. It is written everywhere that the network must be trained until the error on the test sample stops decreasing. I agree with that and do not want to argue about it. The essence of my reasoning was to show how the parallel structure of a network leads to difficulties in its optimization, and how a non-linear model based on a power series can achieve the same goal as a neural network, but with a much simpler mathematical apparatus and a fast learning process that leads to a unique result.
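For illustration only (this is not gpwr's actual model), a toy power-series fit: the model is linear in its coefficients, so ordinary least squares finds them in a single step and the solution is unique, unlike iterative network training.

import numpy as np

rng = np.random.default_rng(1)

# Toy data: y is a nonlinear function of x plus noise.
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)

# Power-series model y ~ c0 + c1*x + c2*x^2 + ... + c5*x^5.
degree = 5
X = np.vander(x, degree + 1, increasing=True)   # columns x^0 .. x^5
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # unique least-squares solution

print("coefficients:", np.round(coeffs, 3))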

Eh, how many opinions have been expressed here ;-). I will add my five kopecks.

You also misunderstood me. I did not mean that the network is "untrained". I meant that you should not expect miracles from the net: it is not a panacea, and if it does give a winning percentage, it is a very small one, which is why we need committees. The network configuration for a committee and the input/output data structure can be searched for long and hard. Imho, you were too quick to write off networks, without actually trying even 10% of what you should have (judging simply by how soon you started working directly on your own project). With your mathematical background you have options for what to try as a replacement for the net ;-). You are welcome to try.

But it seems to me that in criticising the net you focus your attention on the wrong points. In particular, what difference does it make which synapse learns which input factor in a particular network instance? Do you really need to know that? You don't. This intrinsic uncertainty in how the network distributes the signal across neurons is there "by design". But if you train a dozen networks and prune them, you will see that the pattern of connections - the very non-linear series you mentioned - has formed by itself and is close to, or exactly, one and the same. If you build a manual analogue, then as a mathematician you know which methods to use, and how labour-intensive they are, to uncover the dependencies that the network finds in the data stream - and only after you have uncovered those dependencies can you create your power series.
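A rough sketch of that experiment, using scikit-learn's MLPRegressor purely as a stand-in network: train several nets from different starting points, keep only the strong input connections, and compare which inputs survive.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Toy data: the target depends on the first two inputs only.
X = rng.normal(size=(1000, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

surviving_inputs = []
for seed in range(5):
    net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                       random_state=seed).fit(X, y)
    w_in = net.coefs_[0]                              # input-to-hidden weights
    strong = np.abs(w_in) > 0.5 * np.abs(w_in).max()  # crude "pruning" mask
    surviving_inputs.append(strong.any(axis=1))       # inputs with strong links

# Nets trained with different seeds tend to agree on which inputs
# carry the strong connections.
print(np.array(surviving_inputs).astype(int))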

What I would like to say about committees is that they are not assembled simply from any N networks, but only from, say, the best 10 of the hundred obtained. To continue the analogy with meetings of people: only those who are more or less able to listen to each other get heard. Also, you have apparently forgotten that we have more than 2 outcomes; in fact there are four: profit, no profit, loss, no loss. So the probability has to be calculated (I deliberately simplify): no-loss(1) * no-loss(2) = 0.4 * 0.4 = 0.16. I.e. the best configuration is not the one with the maximum profit probability, but the one with the minimum loss probability. By analogy, we look at the drawdown. It makes no sense to take a super-profitable configuration if its drawdown is already 50%, because that almost guarantees a loss.
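A toy illustration of that selection rule (all numbers are synthetic): rank candidate networks by their loss probability rather than by average profit; under an independence assumption, the probability that committee members lose together is the product of their individual loss probabilities.

import numpy as np

rng = np.random.default_rng(3)

# Hypothetical backtest: profit of each of 100 candidate networks on 250 trades.
results = rng.normal(0.1, 1.0, size=(100, 250))

# Per-network probability of a losing trade.
p_loss = (results < 0).mean(axis=1)

# Committee: the 10 networks with the lowest loss probability,
# not the 10 with the highest average profit.
committee = np.argsort(p_loss)[:10]

# Joint loss probability of the two best members (independence assumed).
p_joint = p_loss[committee[0]] * p_loss[committee[1]]
print("committee:", committee)
print("joint loss probability of the top two: %.3f" % p_joint)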

 

Again.

joo wrote >>

With a teacher it is only possible to train a net on a function that we already know, such as a sine wave. Here we can, in good conscience, feed the net the point that follows the trained point as the teacher. This will not work on the market.


Because we always know in advance which point will come next on the sine wave. We know the future of the sine wave!

That is why it is legitimate to train on historical (sinusoidal) data, i.e. to train with a teacher.

But we do not know the future of the market, which is why training with a teacher becomes a pointless process.
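A minimal sketch of what is described above for the sine wave: each training pair uses a window of already known values as the input and the next point as the teacher.

import numpy as np

# Supervised training pairs from a known function: a window of past sine
# values is the input, the next point is the "teacher" (target).
t = 0.1 * np.arange(200)
series = np.sin(t)

window = 5
inputs = np.array([series[i:i + window] for i in range(len(series) - window)])
targets = series[window:]           # the point that follows each window

print(inputs.shape, targets.shape)  # (195, 5) (195,)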

 
LeoV wrote >>

What does "really stable results" mean to you?

For example: an Expert Advisor is optimized on 2 months of history with only 3 parameters, and 80% of the profitable optimization passes remain profitable on the whole history.

It's the same with networks...
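A sketch of how such a stability check could be computed (the profit figures are invented): take the passes that were profitable on the optimization window and see what share of them stays profitable on the whole history.

import numpy as np

# Hypothetical optimizer output: profit of each parameter set on the
# 2-month optimization window and on the full history.
opt_profit  = np.array([120.0, 80.0, -30.0, 45.0, 200.0, 10.0, -5.0, 60.0])
full_profit = np.array([300.0, 150.0, -90.0, -20.0, 410.0, 25.0, -40.0, 95.0])

profitable_on_opt = opt_profit > 0
stable_share = (full_profit[profitable_on_opt] > 0).mean()
print("share of profitable passes that stay profitable: %.0f%%" % (100 * stable_share))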

 
LeoV wrote >>

They may or may not be related - that is the big question. But it is certain that the minimum error on the OOS does not mean and does not lead to the maximum profit on a real account, just as the maximum profit on a real account does not mean the minimum error on it.

Generally you are talking about the stability of results, not about the error. If the network stably recognizes or predicts something, and this prediction is enough for profit, we will have profit both on the forward test and on the real account.

If the error is satisfactorily small, then it will lead to profit. What does "satisfactorily" mean? For every problem this condition is set separately; I only know the empirical way to find it.
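One possible empirical approach, sketched with synthetic data: degrade a forecast with growing noise and watch at which error level the backtest profit stops being acceptable.

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical future returns and a forecast corrupted by increasing noise.
true_ret = rng.normal(0.0, 1.0, size=2000)

for noise in (0.2, 0.5, 1.0, 2.0, 4.0):
    pred = true_ret + rng.normal(0.0, noise, size=true_ret.size)
    error = np.mean((pred - true_ret) ** 2)
    profit = np.sum(np.sign(pred) * true_ret)   # trade in the predicted direction
    print("mse=%5.2f  profit=%7.1f" % (error, profit))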

 
joo wrote >>

I'll say it again.

Because we always know in advance which point will come next on the sine wave. We know the future of the sine wave!

That is why it is legitimate to train on historical (sinusoidal) data, i.e. to train with a teacher.

But we do not know the future of the market, which is why training with a teacher becomes a pointless process.

If we know the sine wave and can therefore predict it with networks, then take a more complex formula whose analytic form is also known to you - we can predict that too. The market is the same kind of formula, only even more complicated, and it is not known to us...
