Machine learning in trading: theory, models, practice and algo-trading - page 131

 

Version 10 of the ternary classifier jPrediction has been released

The new version implements cross-validation.

Thanks to cross-validation, generalization on out-of-sample (OOS) data has improved markedly compared with version 9, especially on non-stationary data. At the same time, the generalization ability reported as a model characteristic, which is measured on different parts of the sample, is practically unchanged between versions 9 and 10.

Since the changes do not affect the user interface, the instructions for users of version 9 remain valid. Links to download the latest version of jPrediction, as well as its source code, can be found in the instructions.

 

Awl Writer:
1) It simply compares the two time series by absolute values, so you first need to normalize shift and scale along the vertical axis, and a lot depends on the particular implementation.

2) For example, here https://www.mql5.com/ru/code/10755 two fragments of fixed length are taken for comparison

3) and it is not taken into account that one of them may be longer and the other shorter

4) and the volume of calculations can be significantly reduced, etc. One could also talk about clustering by DTW-specific parameters: you can calculate not only the "degree of similarity" of the two fragments, but also the ratio of their horizontal scales.

1) Well, normalizing before comparing two series with the DTW algorithm is of course standard practice, i.e. mapping the absolute price values into a range of, say, 0 to 1. What do you mean by normalization of shift and scale? Please explain.

I came up with exactly the same idea as the author, did exactly the same research, and went through exactly the same evolution, from conventional correlation to DTW. It gives me goosebumps... We were doing the same thing and thinking about the same thing, just at different times in different places, it's hard...

3) Yes, it is not taken into account, although it would be right to do so. But I can guess why the author did not implement it: once you start thinking more deeply about how to implement it, a lot of questions come up that have no answers...

because it is no longer just a search for similarities with DTW on fixed-length segments, as I did in clustering or as the author of the article did in his algorithm; it is much more complicated...

Here are a few of the questions that come up:

1. How do you compare the current price pattern with historical patterns if you not only have to walk through the history looking for similarities, but also dynamically stretch/shrink both the current pattern and the historical one it is compared with?

2. How do you cope with the lack of computing power? Personally, I do not have enough power to run DTW in full even on two fixed-length segments, and with variant (1.) the load grows thousands of times, without exaggeration...

4) How???

 
mytarmailS:

1. You can bring the values of the series into the range [0;1], but imho it is better to eliminate the constant component by subtracting the mean (MA) from each element and then dividing each element by the standard deviation. The browser, the pest, ate part of the text.
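To make the two normalization options concrete, here is a minimal sketch in plain Python. It assumes the mean and standard deviation are taken over the fragment itself (if "MA" means a moving average over some other window, the subtraction step would differ); the function names and the toy series are made up for illustration.

```python
from statistics import fmean, pstdev

def min_max_scale(series):
    """Map the series into [0, 1] using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # flat fragment: no scale to recover
    return [(x - lo) / (hi - lo) for x in series]

def z_normalize(series):
    """Subtract the fragment mean, then divide by its standard deviation."""
    mean = fmean(series)
    std = pstdev(series)
    if std == 0:
        return [0.0 for _ in series]  # flat fragment: only the shift is removed
    return [(x - mean) / std for x in series]

# Toy fragments (illustrative only): b has the same shape as a,
# but is shifted up by about 100 and stretched vertically by 2.
a = [1.0, 2.0, 3.0, 2.0, 1.0]
b = [101.0, 103.0, 105.0, 103.0, 101.0]

za = z_normalize(a)
zb = z_normalize(b)
```

After either transformation the two toy fragments coincide, so a distance such as DTW sees only the shape. The practical difference is that min-max scaling is driven entirely by the two extreme values, while the mean and standard deviation use every point of the fragment.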

3-4. If you look into how the algorithm works and how the matrix is filled in, much will become clear. Two segments of equal length are compared, with a fixed beginning and end. We can fix only the beginning, let the end float, and add a constraint on the scale factor, from 0.5 to 2; this gives DTW with constraints. The result is then not one number but two, which gives us an additional predictor. To reduce the amount of computation, we find "singular points" (extrema) and pull them toward each other, thus excluding most of the matrix area. See also the Wikipedia article on Dynamic Time Warping, section References.
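The "fixed beginning, floating end" idea can be sketched roughly as follows. This is one possible reading, not the implementation from the linked code: the whole cost matrix is filled in the usual way, and the scale limit of 0.5 to 2 is applied only to the choice of the end point. The function name, the normalization of the cost by path-length bound, and the toy series are assumptions; the speed-up via extrema is not shown.

```python
import math

def open_end_dtw(query, candidate, min_scale=0.5, max_scale=2.0):
    """Return (distance, scale) for the best match of `query` against a
    prefix of `candidate`. The start is fixed, the end floats, and the
    matched prefix length is limited to [min_scale, max_scale] * len(query)."""
    n, m = len(query), len(candidate)
    inf = math.inf
    # cost[i][j]: cheapest warping path aligning query[:i+1] with candidate[:j+1]
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(query[i] - candidate[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best_prev = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best_prev

    shortest = max(1, math.ceil(min_scale * n))
    longest = min(m, math.floor(max_scale * n))
    if shortest > longest:
        raise ValueError("candidate is too short for the allowed scale range")

    best = None
    for length in range(shortest, longest + 1):
        # Divide by n + length so that longer prefixes are not penalized
        # merely for containing more points.
        score = cost[n - 1][length - 1] / (n + length)
        if best is None or score < best[0]:
            best = (score, length / n)
    return best

# Toy example (illustrative only): the candidate is the query stretched
# horizontally by roughly a factor of two.
query = [0.0, 1.0, 2.0, 1.0, 0.0]
candidate = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
distance, scale = open_end_dtw(query, candidate)
```

The second returned value is the ratio of horizontal scales mentioned above, so each comparison yields two numbers that could both be used as predictors. Filling the full matrix costs O(n*m) per comparison, which is exactly the load that the extrema trick is meant to cut down.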

 
Alexey Burnakov:

Gentlemen, a new task from me:

Here is a dataset in .R format: https://drive.google.com/open?id=0B_Au3ANgcG7CcjZVRU9fbUZyUkE

The set has about 40,000 rows and 101 columns. The rightmost column is the target variable; the 100 columns to its left are the inputs.

I suggest you try to build a regression model that predicts the value of the 101st column from the other 100 columns, using the first 20,000 observations.

On the remaining 20,000+ observations, the model should show an R^2 of at least 0.5.
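For anyone who wants to check their result the same way, here is a minimal sketch of the evaluation protocol: split the rows in their original order, fit only on the first part, and compute R^2 on the rest. The toy rows and the one-variable least-squares line are placeholders made up for illustration; they are not the dataset from the link and not a proposed solution.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def split_in_order(rows, n_train):
    """Keep the row order: first n_train rows for fitting, the rest for testing."""
    return rows[:n_train], rows[n_train:]

def fit_line(xs, ys):
    """Ordinary least squares for a single input (placeholder model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Toy rows (illustrative only): one input column, target in the last column.
rows = [[float(i), 2.0 * i + 1.0] for i in range(10)]
train, test = split_in_order(rows, 5)

slope, intercept = fit_line([r[0] for r in train], [r[-1] for r in train])
predictions = [slope * r[0] + intercept for r in test]
r2_out_of_sample = r_squared([r[-1] for r in test], predictions)
passes_threshold = r2_out_of_sample >= 0.5
```

With the real data, the split point would be row 20,000 and the model would take all 100 inputs. Note that R^2 computed this way can be negative out of sample, when the model does worse than predicting the test mean.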

I will then reveal how the data was generated and give my solution.

A hint: this is time series data. The input is 100 samples, and the prediction is 1 step ahead. These are not prices, quotes, or derivatives thereof.

Alexei

Has anyone tried? My colleagues and I want to train a convolutional NN. There's some mapping going on. We hope.

It is a somewhat non-standard application of the method. On the other hand, we simply present a one-dimensional "picture" as input, and the network can map neighboring "pixels" and their interactions.
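To illustrate what treating the input as a one-dimensional "picture" means, here is a minimal sketch of a single 1D convolution filter followed by ReLU, in plain Python. The kernel weights are hand-picked for illustration; in a real convolutional network they would be learned, and there would be many filters and layers. The signal is a made-up toy, not the 100-sample input from the task.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal ('valid' mode, stride 1).
    Each output value combines a few neighboring 'pixels'."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def relu(values):
    """Keep positive responses, zero out the rest."""
    return [max(0.0, v) for v in values]

# Toy signal and a hand-picked kernel that responds to local increases.
signal = [0.0, 1.0, 3.0, 6.0, 6.0, 5.0]
rise_kernel = [-1.0, 0.0, 1.0]

feature_map = relu(conv1d(signal, rise_kernel))
```

The same small set of weights is applied at every position, which is what lets the network pick up a local pattern wherever it occurs among the 100 inputs.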

 
Alexey Burnakov:

Has anyone tried? My colleagues and I want to train a convolutional NN. There's some mapping going on. We hope.

It seems to be a non-standard application of the method. On the other hand, we simply present a one-dimensional "picture" as input, and the network can map neighboring "pixels" and their various interactions.

So let your colleagues try it, or are you not up to it?
 
Alexey Burnakov:

My colleagues and I want to train a convolutional NN. There's some mapping going on. We hope.

Interesting, I'm waiting for the impressions...
 
mytarmailS:


1. How do you compare the current price pattern with historical patterns if you not only have to walk through the history looking for similarities, but also dynamically stretch/shrink both the current pattern and the historical one it is compared with?


Why is that needed? If a pattern has an analogue in history, it should also match in duration. At least, I looked for proportional sections when I did a pattern search.
 
Youri Tarshecki:
Why is that needed? If a pattern has an analogue in history, it should also match in duration. At least, I looked for proportional sections when I did a pattern search.

1) Well, if only because no two patterns on the market are exactly the same,

2) and because DTW gives such a great opportunity

3) and because we all know what searching for patterns of identical size leads to, you included... or will you surprise me? :)

 
Event:
So let your colleagues try it, or are you not up to it?
What are you getting at? If you don't want to try, then don't. I'm working on my own task and I'm curious.
 
mytarmailS:
Interesting, looking forward to the impressions...
What else would you say? Are you waiting for the results?