Machine learning in trading: theory, models, practice and algo-trading - page 437

 
elibrarius:

I'm not yet sure that it is correct to consider the graphs similar with such a large difference in the slope angles. Using the same example:

The variant found shows a pullback from the top of a trend, or the end of the trend. Transferred onto the pattern chart, it predicts a continuation of the declining trend rather than a reversal, which is essentially an inverse signal. Something's not right here...

But if you run it in the tester, it rarely finds variants with a very large difference in slope. Past a certain limit the candidate is a rising pattern rather than a falling one, its structure correlates poorly with the current one, and so it never shows up in the results.
 
Maxim Dmitrievsky:
That's true, there is simply a large difference in the slope of the patterns. You can add a search restriction: if the slope differs too much, do not take such variants into account. But if you run it in the tester, it rarely finds variants with a very large difference in slope; past a certain limit the candidate is really a rising pattern rather than a falling one, and its structure correlates poorly with the current one.

Well, it's 10-15 degrees (by eye); how much smaller should it be? And it is better to find nothing than to get a false signal.

PS. measured in photoshop - 18 degrees

 
elibrarius:
Well, it's 10-15 degrees (by eye); how much smaller should it be? And it is better to find nothing than to get a false signal.


Even with a perfect 50% coincidence it predicts in the wrong direction.)

And by the way, yes, the forecast curve is calculated incorrectly here; I messed up somewhere... and I've lost the old version.

 
Maxim Dmitrievsky:

Even with a perfect 50% coincidence it predicts in the wrong direction.)

Because you make the prediction from a single variant; with a hundred similar variants the prediction would be more reliable. But the average forecast would be zero.)

It's bad with one and bad with many. This task should be handed to the optimizer.

 
elibrarius:

Because you make the prediction from a single variant; with a hundred similar variants the prediction would be more reliable. But the average forecast would be zero.)

It's bad with one and bad with many. This task should be handed to the optimizer.


Yes, but first it has to be calculated correctly. The forecast should not draw the lines like that; it merely shows the direction correctly.)

But through correlation it's all bullshit in any case, which is why I gave up...

 
Maxim Dmitrievsky:

At a minimum, affine transformations of the charts are needed, because patterns occur at different slope angles (self-affine structures); secondly, the search should run on different timeframes. But that doesn't help when correlation is used: it finds patterns that are very different.

If the main problem of correlation is that "it finds patterns that are not very similar", you can simply tighten the requirement on the acceptable error, and then only very similar patterns will be found. But that will not happen on every bar, only occasionally (once every few hours, as in your Expert Advisor with the slope angle). The acceptable error will again be chosen by the optimizer.
Besides, my version does not compute the Pearson correlation directly, as yours does, but the total error (rejecting variants that exceed the maximum acceptable error on any bar). This probably finds the variants most correlated with the pattern, which is why I compared it to correlation.
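A rough sketch of the "total error with a per-bar cutoff" idea described above, in Python rather than MQL. The function names, the `max_bar_error` parameter and the sample numbers are made up for illustration; this is not the actual Expert Advisor code, and it assumes the windows are already on a comparable price scale.

```python
def total_error(pattern, candidate, max_bar_error):
    """Sum of absolute per-bar differences.

    Returns None as soon as any single bar differs by more than
    max_bar_error (the candidate is sifted out).
    """
    total = 0.0
    for p, c in zip(pattern, candidate):
        err = abs(p - c)
        if err > max_bar_error:
            return None  # one bar is too far off: reject the whole variant
        total += err
    return total


def find_matches(series, pattern, max_bar_error):
    """Slide over the history and keep (start_index, total_error) of accepted windows."""
    n = len(pattern)
    matches = []
    for start in range(len(series) - n + 1):
        err = total_error(pattern, series[start:start + n], max_bar_error)
        if err is not None:
            matches.append((start, err))
    return matches


# Hypothetical data: only two windows pass the per-bar cutoff of 1.
history = [1, 2, 3, 10, 1, 2, 4]
print(find_matches(history, [1, 2, 3], max_bar_error=1))
```

The tighter `max_bar_error` is, the fewer windows survive, which is the trade-off mentioned above: more similar patterns, but found less often. The value itself would be left to the optimizer.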

 
Dr. Trader:

Suppose there are two arrays of prices, 5 prices in each
The first is a1,a2,a3,a4,a5.
The second is b1,b2,b3,b4,b5.

1) The price chart can be detrended, i.e. rotated from its tilted position to horizontal. This can be done with a linear regression: fit it, and use the array of residuals instead of the original price series. Whether this step helps in searching for patterns I don't know; I haven't studied its effect in detail. So far I don't use this step myself.

2) It is questionable to call a row of prices a pattern; there has to be a mathematical description of the shape formed by these prices. For example, we can take the price increment on every bar and use these increments as the description of the pattern.
The first pattern is then given by a5-a4, a4-a3, a3-a2, a2-a1
and the second by b5-b4, b4-b3, b3-b2, b2-b1.

3) "Similarity" of patterns: either correlation (I have not checked it myself) or the Euclidean distance, i.e. the Pythagorean theorem (I checked it, and it worked very well):
sqrt( ((a5-a4)-(b5-b4))^2 + ((a4-a3)-(b4-b3))^2 + ((a3-a2)-(b3-b2))^2 + ((a2-a1)-(b2-b1))^2 )
or something else; I think there must be better options.
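The three steps above can be sketched in Python roughly as follows. The function names are mine, the detrending uses an ordinary least-squares line over the bar index, and nothing here is claimed to be the code actually used in the thread.

```python
import math


def detrend(prices):
    """Step 1: subtract the least-squares line and return the residuals."""
    n = len(prices)
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    den = sum((i - mean_x) ** 2 for i in range(n))
    k = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(prices)) / den
    return [p - (mean_y + k * (i - mean_x)) for i, p in enumerate(prices)]


def increments(prices):
    """Step 2: per-bar price increments a2-a1, a3-a2, ..."""
    return [later - earlier for earlier, later in zip(prices, prices[1:])]


def euclidean_distance(a, b):
    """Step 3: Euclidean distance between the increment vectors of two windows."""
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(increments(a), increments(b)))
    )


a = [1, 2, 3, 4, 5]
print(euclidean_distance(a, [11, 12, 13, 14, 15]))  # same shape, shifted level
print(euclidean_distance(a, [1, 2, 3, 4, 7]))       # last increment differs by 2
print(detrend(a))                                   # a straight line leaves ~0 residuals
```

The increments are listed in the opposite order to the post (oldest first), which does not change the distance. Because only increments are compared, a constant price offset between the two windows has no effect on the result.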

1. This is done by increasing the error tolerance as you go deeper into the history.

2. Bar-by-bar error calculation (the sum of the absolute deltas on every bar). The charts must first be aligned at the zero bar.
abs(a5-b5)+abs(a4-b4)+abs(a3-b3)+abs(a2-b2)+abs(a1-b1)

Calculating the error according to your variant:
abs((a5-a4)-(b5-b4))+abs((a4-a3)-(b4-b3))+....
Transform the first term:
abs((a5-a4)-(b5-b4)) = abs((a5-b5)+(b4-a4))

(a5-b5)+(b4-a4) = delta5 + (-delta4). This resembles a sum of deltas, i.e. of errors, but it is not the sum of the absolute delta values: it is a plain sum in which the deltas enter with opposite signs! If the errors on neighboring bars have the same sign, they cancel each other (because the second delta enters with a minus sign). Even a huge pair of errors, +1000 pts and +1000 pts, is reduced to zero by your formula, so a chart with a +1000 pts outlier on 2 bars is marked as similar. Admittedly, in the next term only one of these outliers is counted, and the resulting error will eliminate that variant.
All the same, this error function may pass as similar a series with deltas such as 0, +10, +15, +12, +5. For this combination your formula gives a smaller error (25 pts) than the plain sum of the absolute deltas on each bar (42 pts).

3. This is the same formula as in item 2 with the same drawbacks.
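The arithmetic in item 2 can be checked with a few lines of Python. Here `deltas` holds the per-bar differences a_i - b_i between the two charts; the function names are mine, and the second function uses the absolute-value form of the increment formula, as in the calculation above, rather than the squared form.

```python
def sum_abs_deltas(deltas):
    """Bar-by-bar error: sum of |a_i - b_i|."""
    return sum(abs(d) for d in deltas)


def sum_abs_delta_differences(deltas):
    """Increment-based error: sum of |(a_i - a_{i-1}) - (b_i - b_{i-1})|,
    which equals the sum of |delta_i - delta_{i-1}|."""
    return sum(abs(cur - prev) for prev, cur in zip(deltas, deltas[1:]))


deltas = [0, 10, 15, 12, 5]
print(sum_abs_deltas(deltas))             # 42
print(sum_abs_delta_differences(deltas))  # 25

# Two neighboring bars that are both off by +1000 pts cancel in the second formula.
print(sum_abs_deltas([1000, 1000]))             # 2000
print(sum_abs_delta_differences([1000, 1000]))  # 0
```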

 
elibrarius:

The easiest way is to slide a window as wide as the target example across the entire sequence, one step at a time, and compute the sum of the absolute values of the deltas:

0,0,0 and 1,2,3 error = (1-0)+(2-0)+(3-0)=6

0,0,1 and 1,2,3 error = (1-0)+(2-0)+(3-1)=5

0,1,2 and 1,2,3 error = (1-0)+(2-1)+(3-2)=3

1,2,3 and 1,2,3 error = (1-1)+(2-2)+(3-3)=0

2,3,1 and 1,2,3 error = (2-1)+(3-2)+Abs(1-3) = 4

The minimum error corresponds to the maximum similarity.
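The same walk-through as a short Python sketch. The sequence 0,0,0,1,2,3,1 is the one implied by the five windows listed above; the function name is an assumption of mine.

```python
def sliding_abs_error(sequence, pattern):
    """Sum of absolute differences between the pattern and every window of the sequence."""
    n = len(pattern)
    return [
        sum(abs(s - p) for s, p in zip(sequence[i:i + n], pattern))
        for i in range(len(sequence) - n + 1)
    ]


sequence = [0, 0, 0, 1, 2, 3, 1]
pattern = [1, 2, 3]

errors = sliding_abs_error(sequence, pattern)
best = min(range(len(errors)), key=errors.__getitem__)

print(errors)  # [6, 5, 3, 0, 4]
print(best)    # 3, the window 1,2,3 with zero error
```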


And convolution is the same thing, only instead of subtraction and absolute value there is a single multiplication, and the maximum is chosen rather than the minimum; it is faster.

0,0,0 and 1,2,3 error = 0*1+0*2+0*3 = 0
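For comparison, a sketch of the multiply-and-sum variant on the same example sequence (strictly speaking a sliding dot product, i.e. cross-correlation; the function name is mine):

```python
def sliding_dot_product(sequence, pattern):
    """Sum of element-wise products between the pattern and every window."""
    n = len(pattern)
    return [
        sum(s * p for s, p in zip(sequence[i:i + n], pattern))
        for i in range(len(sequence) - n + 1)
    ]


sequence = [0, 0, 0, 1, 2, 3, 1]
pattern = [1, 2, 3]

scores = sliding_dot_product(sequence, pattern)
best = max(range(len(scores)), key=scores.__getitem__)

print(scores)  # [0, 3, 8, 14, 11]
print(best)    # 3, the window 1,2,3

# Caveat: the raw product grows with amplitude, so a window 3,3,3
# would score 18 and beat the exact match unless the windows are normalized.
print(sliding_dot_product([3, 3, 3], pattern))  # [18]
```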

 
Gianni:

And convolution is the same thing, only instead of subtraction and absolute value there is a single multiplication, and the maximum is chosen rather than the minimum; it is faster.

0,0,0 and 1,2,3 = 0*1+0*2+0*3 = 0

That's a cool processor you have! ))
Mine adds and subtracts faster than it multiplies, and takes the absolute value simply by setting the 64th (sign) bit to zero.
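The sign-bit trick can be illustrated in Python, assuming the usual IEEE 754 64-bit double, where the highest bit (bit 63 counting from zero) is the sign. This only demonstrates the bit layout; it says nothing about the relative speed of the operations on any particular processor.

```python
import struct


def abs_by_sign_bit(x):
    """Absolute value of a double obtained by clearing its sign bit."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))  # reinterpret as a 64-bit integer
    bits &= 0x7FFFFFFFFFFFFFFF                           # set the top (sign) bit to zero
    (result,) = struct.unpack("<d", struct.pack("<Q", bits))
    return result


print(abs_by_sign_bit(-3.5))  # 3.5
print(abs_by_sign_bit(2.0))   # 2.0
```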
 
elibrarius:

3. This is the same formula as in item 2 with the same disadvantages.


It's all one formula; I just wrote it in three steps to make it clearer. So the signs will not be a problem, because the differences are squared.
