NS + indicators. Experiment. - page 11

 
Figar0 >>:

Time to bump this little thread; after all, it is devoted to NS inputs, and I'm not going to start a new topic...

A couple of quite simple theoretical questions/musings have come up.

*

One). Feeding the NS input variants which, exaggerating somewhat, look roughly like this:

а) Open[0]-Open[1], Open[0]-Open[2], Open[0]-Open[3], Open[0]-Open[4],..., Open[0]-Open[N]

б) Open[0]-Open[1], Open[1]-Open[2], Open[2]-Open[3], Open[3]-Open[4],..., Open[N-1]-Open[N]

I expected to get roughly the same result. After all, the inputs are quite tautological. But no: the results are often not even close, both on the training interval and outside it. I have tried different network variants and different neural-network packages; the picture is roughly the same everywhere... Why? An imperfect training mechanism? A "weak" network? Am I asking too much of it?
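A minimal MQL4 sketch of how the two variants could be assembled for a network (the helper name BuildInputs and its parameters are illustrative, not from the post):

// Fills both input variants over the last N differences.
// shift is the bar taken as the reference point (0 = current bar).
void BuildInputs(int shift, int N, double &a[], double &b[])
{
   ArrayResize(a, N);
   ArrayResize(b, N);
   for(int k = 1; k <= N; k++)
   {
      a[k-1] = Open[shift]     - Open[shift+k];   // variant a): everything measured from Open[0]
      b[k-1] = Open[shift+k-1] - Open[shift+k];   // variant b): neighbouring-bar differences
   }
}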

*

Two). I'd like to hear the opinion of experienced neural-network practitioners (or those who consider themselves such): how far from the price should the inputs be in our applied problem of forecasting a price time series? My experiments show that using all sorts of indicators, "Nth-order derivatives" of the price, at best gives essentially nothing. Price, at most some smoothing, extrema; everything else is opium for the NS... Don't you think?

One). The information is "tautological" for you, for me, and for many other MQL programmers... Have you tried running a mutual correlation to see how correlated the inputs actually are? Outside the training interval, in my personal experience, the difference appears only because of the non-stationarity of the inputs...

Two). Regarding "how far the inputs should be from the price": for me, on the contrary, using derivatives improves the criterion... So personally, I don't think so. What the inputs should be depends on the output (the task), and the task depends on how you formulate it (roughly speaking).
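For the correlation check mentioned above, a rough sketch of a Pearson correlation between two input series might look like this (the helper name Corr is made up for the example):

// Rough Pearson correlation between two input series of equal length n.
double Corr(double &x[], double &y[], int n)
{
   if(n <= 0) return(0);
   double mx = 0, my = 0, sxy = 0, sxx = 0, syy = 0;
   int i;
   for(i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
   mx /= n;  my /= n;
   for(i = 0; i < n; i++)
   {
      sxy += (x[i] - mx)*(y[i] - my);
      sxx += (x[i] - mx)*(x[i] - mx);
      syy += (y[i] - my)*(y[i] - my);
   }
   if(sxx == 0 || syy == 0) return(0);   // one of the series is constant
   return(sxy/MathSqrt(sxx*syy));
}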

 
Figar0 >>:

One). Feeding the NS input variants which, exaggerating somewhat, look roughly like this:

а) Open[0]-Open[1], Open[0]-Open[2], Open[0]-Open[3], Open[0]-Open[4],..., Open[0]-Open[N]

б) Open[0]-Open[1], Open[1]-Open[2], Open[2]-Open[3], Open[3]-Open[4],..., Open[N-1]-Open[N]

I expected to get roughly the same result. After all, the inputs are quite tautological. But no: the results are often not even close, both on the training interval and outside it. I have tried different network variants and different neural-network packages; the picture is roughly the same everywhere... Why? An imperfect training mechanism? A "weak" network? Am I asking too much of it?

At first glance, the information fed to the network input in cases a) and b) is indeed tautological. The brain, through the eyes, takes in graphical information in the form of written formulas.

The symbols of the formulas themselves carry no useful information for the brain; the information is contained in the values behind those symbols.

At first it sees practically no difference. I didn't see it either. :)

It does not see the difference until it starts to understand what numbers hide behind the symbols of the formulas. That is when it immediately becomes clear what the difference is. A big difference.


Here is a screenshot of graph a).

We see an almost perfect line running from infinity down to the value of the difference Open[0]-Open[1]:



Let's now look at graph b):



The difference, as they say, is obvious! In the second case the value of our function "jumps" around 0, within well-defined bounds.

Any NS has input sensitivity limits, determined by the neuron activation function and the limits on the weights. You cannot demand that a network compute something that lies outside the patterns of the training function and outside its sensitivity limits. In other words, the NS can compute correctly on input data it has never seen in the training sample only if the values fed to the input belong to the domain of the training function and lie within the network's sensitivity range. These are the two necessary conditions under which any NS works correctly.
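A small illustration of the second condition: if the inputs are rescaled using the minimum/maximum taken from the training sample, out-of-sample values at least have a chance of staying inside the activation function's working range (the function and parameter names are made up for the example):

// Rescales an input vector into [-1, 1] using the min/max observed on the
// TRAINING sample. Values outside the training range end up outside [-1, 1],
// i.e. outside the domain the network was actually taught on.
void NormalizeToTrainRange(double &v[], int n, double trainMin, double trainMax)
{
   double range = trainMax - trainMin;
   if(range == 0) return;                          // degenerate training range, nothing to scale
   for(int i = 0; i < n; i++)
      v[i] = 2.0*(v[i] - trainMin)/range - 1.0;
}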

Figar0 >>:
Two). I'd like to hear the opinion of experienced neural-network practitioners (or those who consider themselves such): how far from the price should the inputs be in our applied problem of forecasting a price time series? My experiments show that using all sorts of indicators, "Nth-order derivatives" of the price, at best gives essentially nothing. Price, at most some smoothing, extrema; everything else is opium for the NS... Don't you think?

Indicators derived from the price can and should be used, provided their calculation principles differ fundamentally from one another. Individual neighbouring bars by themselves carry no useful information. Useful information can only be obtained by considering the interaction of "groups" of bars, or, in other words, "windows".

 
joo wrote >>

At first glance, the information fed to the network input in cases a) and b) is indeed tautological.

Here is a screenshot of graph a),

Something tells me that you have not graphically interpreted formula a) correctly.

Your graph is something like (one of the variants):

f[0] = Open[0]-Open[1] I see.

f[1] = Open[0]-Open[2] I.e. looking ahead?

Or you substitute the current value in place of Open[0], i.e.

f[1] = Open[1]-Open[2] i.e. formula b)

Or ahem.

f[1] = Open[1]-Open[3] But this does not follow from the context

As I understand it (at least this is what I had in mind in similar "experiments" with inputs), it is a whole array of N-1 numbers at "point 0", i.e. f[0][N-1]

and at point 1, f[1][N-1] will be Open[1]-Open[2], Open[1]-Open[3], ...

Which means one would have to draw an (N-1)-dimensional plane.

The fact that the results are different is natural. But I haven't decided which inputs are "more correct" (for the same outputs).

P.S. I did my searching long ago, and in "black boxes".

 
SergNF >>:

Something tells me that you have not graphically interpreted formula a) correctly.

And you're taking advice from that "Something" for nothing.

Make an indicator and see.

SergNF >>

The fact that the results are different is natural. But which inputs are "correct" (for the same outputs) I haven't decided yet.

What do you mean, different results? As the inputs are, so are the outputs! And which of them are "correct" I described in the post above.

 
joo wrote >>

And you're taking advice from that 'Something' for nothing.

You're probably right :)

Make an indicator and see.

Top "graph" of input vector!!!!! DBuffer[i]=Close[i]-Close[i+DPeriod];

Bottom "graph" of the input vector!!!!! DBuffer[i]=Close[i+DPeriod]-Close[i+DPeriod+1];


Next comes a linguistic parsing of the original post and... back again to the question of correct wording in the ToR. ;)

 
SergNF >>:

You're probably right :)

Top "graph" of the input vector!!!!! DBuffer[i]=Close[i]-Close[i+DPeriod];

Bottom "graph" of the input vector!!!!! DBuffer[i]=Close[i+DPeriod]-Close[i+DPeriod+1];


Next comes a linguistic parsing of the original post and... back again to the question of correct wording in the ToR. ;)

Yeah, so the questioner's ToR is not good, since two people have interpreted the question differently. :)

We should wait for Figar0 to see what he has to say.

 
StatBars писал(а) >>

What the inputs should be depends on the output (the task), and the task (the output) depends on how you formulate it (roughly speaking).

What outputs can there be? Signals that we hope to trade on to get a smooth growth of equity without drawdowns) Although the task for the NS can indeed be set in different ways....

joo wrote >>

Well, it means the questioner's ToR is not good, since two people have interpreted the question differently. :)

We should wait for Figar0 to see what he has to say.

Honestly, I did not expect such a divergence of opinion... That's the price of each of us stewing over NS in our own cramped little world. SergNF understood my "ToR" correctly) Two input vectors, where one can easily be obtained from the other, yet they give different results at the output. And I think I'm beginning to understand why they differ; tomorrow I'll run a couple of experiments and try to put that understanding into words...

 
Figar0 >>:

What outputs can there be? Signals that we hope to trade on to get a smooth growth of equity without drawdowns) Although the task for the NS can indeed be set in different ways....

Signals from which we hope to get a smooth equity rise are one type of problem; there can be many ways of formulating it, but many of them will still be very similar (at least in terms of correlation). And that is exactly the context in which I meant "the output".

What do you think about it?

I agree that it's a waste not to share experience; in any case, on the one hand many things would become easier, while the important information about the project and developments would still remain hidden. For example, I have things to share that would make life easier for others, but at the same time would not reveal my secrets in any way :)

 
Figar0 wrote >>

And I think I'm beginning to understand why they differ; tomorrow I'll run a couple of experiments and try to put that understanding into words...

Well, a definite understanding seems to have emerged, and with it a conclusion:

- Inputs a) and b) are essentially the same; the different results come from the imperfection of the training mechanism and from computational errors in the data conversion. That is, if I feed the network increments of anything (indicators, price), it makes no difference whether I take the differences from the zero reference point or between adjacent points. I could only establish this experimentally, by simplifying the NS to the limit and making it possible to train it by exhaustive enumeration of the weights. The results then coincide to a cent...
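The equivalence can also be seen directly: vector b) is simply the first difference of vector a), and a) is the running sum of b), so a linear layer can turn one into the other. A sketch of both conversions (the function names are illustrative):

// a[k] = Open[0]-Open[k+1], b[k] = Open[k]-Open[k+1]; both carry the same information.
void AtoB(double &a[], double &b[], int n)
{
   ArrayResize(b, n);
   b[0] = a[0];
   for(int k = 1; k < n; k++) b[k] = a[k] - a[k-1];          // first difference
}

void BtoA(double &b[], double &a[], int n)
{
   ArrayResize(a, n);
   double sum = 0;
   for(int k = 0; k < n; k++) { sum += b[k]; a[k] = sum; }   // running sum
}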

Having digested this conclusion, I've plunged into the study of GA.

StatBars wrote >>

I agree that it's a waste not to share experience; in any case, on the one hand many things would become easier, while the important information about the project and developments would still remain hidden. For example, I have things to share that would make life easier for others, but at the same time would not reveal my secrets in any way :)

Definitely a waste... It's not a bad idea to synchronize our watches at least once in a while)

I'll start with myself. At the moment I build my nets in MQL and train them only in the terminal's tester/optimizer, although I investigate some components of the inputs in Statistica. This approach has its drawbacks as well as certain advantages. The biggest plus is that the network is trained directly in its working environment, and you can train it towards the final goal - profit. As for architecture, I have experimentally settled on an MLP with few layers: an input layer, an output layer and one or two hidden ones. A more than sufficient approximator, IMHO. At the moment I'm thinking of adding feedback connections to the MLP. The number of inputs I use is ~50 to 200; more would complicate the architecture and training, and I think would be redundant.

Maybe some of my description sounds strange, maybe not, but I'm a self-taught neural networker, and one who, moreover, jealously guards his brain from an excess of dusty literature)
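For readers wondering how "training only in the terminal's tester/optimizer" can work at all, here is a deliberately tiny sketch of the idea: the weights are exposed as ordinary extern parameters, so the genetic optimizer searches them while maximizing the usual tester criteria. All names and sizes are illustrative; the actual nets described above are of course much larger.

// Weights exposed as extern inputs so the strategy tester's optimizer can search them.
extern double w1_11 = 0.0, w1_12 = 0.0, w1_13 = 0.0;   // hidden neuron 1
extern double w1_21 = 0.0, w1_22 = 0.0, w1_23 = 0.0;   // hidden neuron 2
extern double w2_1  = 0.0, w2_2  = 0.0;                // output neuron

// Bipolar sigmoid, output in (-1, 1)
double Sigmoid(double x) { return(2.0/(1.0 + MathExp(-x)) - 1.0); }

// Forward pass of a 3-2-1 MLP; the sign of the output could serve as a trade signal
double NetOutput(double x1, double x2, double x3)
{
   double h1 = Sigmoid(w1_11*x1 + w1_12*x2 + w1_13*x3);
   double h2 = Sigmoid(w1_21*x1 + w1_22*x2 + w1_23*x3);
   return(Sigmoid(w2_1*h1 + w2_2*h2));
}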

 

I use MQL for preprocessing the data, converting it into the internal format, and analysing the network's results. I train networks only in NSH2. Of course, I also have networks in MQL and C++, but NSH2 is better...

My nets are usually no more than the usual 3-layer MLP; more is not required.

By the way, about inputs: the most I have fed in is 128. From some experiments I found that more than 20-30 inputs is already a significant excess, and most likely removing the unimportant ones would greatly improve the training criterion. Below 40 inputs the question may still be debatable, but more than 40 inputs is 100% overkill.

The training criterion is the standard one in NSH2 - the RMS error (SKO).

Be sure to use a test sample to avoid fitting the network. Moreover, split the sample sequentially: the first part, say 2005.01.01 - 2006.01.01, is the training sample, and 2006.01.01 - 2006.04.01 is the test sample. Many may doubt that this gets rid of overtraining, but I am 100% sure it does; if it doesn't, the problem lies elsewhere, i.e. I do not recommend changing the sample split - rather, train exactly this way (with 2 samples). Decide which is more important to you: either the network is stable to begin with, and then you work on making it profitable enough as well :), or the network is profitable, but you don't know when it will become unprofitable, i.e. first we see the profit and only then figure out the network's stability. By stability here I did not mean smooth equity growth, but the fact that the net shows different results on different forward tests (drains the deposit, and vice versa).
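A sketch of that sequential split, using the dates from the post (the helper name SampleOf is made up for the example):

// Sequential, non-shuffled split: training period first, test period right after it.
datetime trainFrom = D'2005.01.01';
datetime trainTo   = D'2006.01.01';
datetime testTo    = D'2006.04.01';

// Returns 0 for the training sample, 1 for the test sample, -1 if the bar is not used.
int SampleOf(datetime barTime)
{
   if(barTime >= trainFrom && barTime < trainTo) return(0);
   if(barTime >= trainTo   && barTime < testTo ) return(1);
   return(-1);
}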

The first is more important to me personally, and that is the direction I am developing in. I have already achieved certain results, for example (I've said this on the forum before, but I'll repeat it): there is a "general sample"; I train the network on about 20-30% of it, and after training the network works equally well everywhere across the whole sample, where the "general sample" covers dates from 2003 to the present.

As for the profitability of the net, I have also come very close to a solution. The net currently recognizes 84% of the samples, while 87% is the minimum threshold for stable, smooth equity growth. When I first started solving this problem, the recognition rate was 70%.
