Machine learning in trading: theory, models, practice and algo-trading - page 33

 
mytarmailS:

On this "wonderful" forum I can neither insert a picture nor attach a file. Does anyone know what the problem is?

I attached a picture today; see the post on the previous page. To do this while composing a post, press Ctrl+Alt+I.
 
Yury Reshetov:

Unfortunately, the RAR archive doesn't unpack for me. IMHO it's better to pack everything in ZIP: ZIP unpackers exist on all platforms, and many users don't use RAR.



I will have a look at it. I don't know R well enough though.

Was the port performed manually or using some automatic tool?

A newer version of WinRAR is needed: the archive format changed starting with version 5. I couldn't pack it into ZIP right away because it was too big. Here are the same files again, compressed with an old version of WinRAR.

I did the port by hand: I took the files from https://sourceforge.net/projects/libvmr/ and copied them with corrections. The syntax is mostly similar; the main problem is that array indices in R start at 1, not 0. I purposely left out the Separator class, since R already has a built-in function (sample) for splitting the data into training and test samples.
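As an illustration of the kind of random split that `sample` gives in R, here is a minimal sketch. It is written in Python rather than R, and the function name `random_split`, the 70% training share and the fixed seed are assumptions made for the example; they are not taken from the port or from libvmr.

```python
import random

def random_split(n_rows, train_share=0.7, seed=42):
    """Randomly split row indices into a training and a test set.

    Indices are 0-based here (Python); the same rows in R would be 1..n_rows.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_train = round(n_rows * train_share)
    train = sorted(indices[:n_train])  # sorted only for readability
    test = sorted(indices[n_train:])
    return train, test

train_idx, test_idx = random_split(10)
print(len(train_idx), len(test_idx))   # 7 3
```

Every row ends up in exactly one of the two sets, but the rows of each set are scattered over the whole file, which matters for time-ordered data.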


Thanks Yuri!

In the course of my research it turned out that the worst targets are those that predict price direction, such as the next candle or the zigzag in its classical application. Such targets are ambiguous and often confuse the network or other algorithm, because the price does not always rise while the zigzag is rising; a network trained on such data finds it extremely hard to recognize anything in new data. So the conclusion is this: the target must be defined very clearly and consistently...

The same goes for predictors (I wrote what I mean by inconsistency several pages back), such as indicators and various "smoothers" (which basically include the PCA method as well). Everything must be precise, distinct, unambiguous and informative, not soft, smooth and vague.

Briefly about the model:

Target - the fact of a zigzag reversal (not the direction)

Predictors - candlesticks and levels, about 110 in total (no conflicting data)

5-minute data

I trained two random forest (RF) models, one for buys and one for sells, although a single model would also work

The model at work on new data:

[Image: model on new data]

I want to note that this is one of the best pictures; in reality the situation is much worse, but this is the best I've achieved so far...

I also want to note that if you add indicators, or reduce the dimensionality of the features with PCA, the accuracy of the entries does not merely drop, it disappears altogether: everything immediately starts to drift. That is what I call inconsistency in the features.


p.s.

I think I have exhausted the classical approach to training a network on the market, and I am not satisfied with the results. I hope these rules will help someone, but even they will not lead to an acceptable result: the very concept of searching for predictors has to change, and one needs to look much deeper. I have such a concept, but unfortunately only at the level of an idea. If anyone is interested in it and has good programming skills, I will be glad to talk, not for the sake of talking but for the sake of implementation, since my own programming skills are still at a beginner level.

 
Yury Reshetov:

I'll watch it for sure. I don't know R well enough, though.

Did you port it manually or through some automatic porting?

Try rattle; it is very useful for a beginner. It has a GUI, and you can master it in an hour.

It covers the whole modeling cycle at once: data mining, model fitting (6 types of models, including SVM, which is similar to yours) and evaluation. In addition, the R log lets newcomers see the finished code, which can be reused later. It reads Excel files, i.e. you can prepare the data in Excel, or export from MQL to Excel... In general, there are no problems at all with the source data.

It is also useful for advanced users: it gives you things to think over and try.

Writing code in R becomes necessary once you have managed to select predictors. Then you can change the settings of the models used in rattle, try other models... and in general use at least caret. But first come predictors that have predictive power for the particular target variable.

PS.

The post above recommends using sample.

I don't recommend it.

Rattle itself splits the file in a rather intricate way. If the goal is to create sets for training, testing and validation, rattle does this itself and sample is not needed. But to also check outside these sets, which is very important for modeling future trades, the source file must be split mechanically by index: the first part goes into rattle, and the second part is used for comparison with the results from rattle. This can be done in rattle as well. If all four errors match and are below 20%, it is a grail for years to come.
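The procedure described above can be sketched roughly as follows. This is a Python illustration, not rattle's actual code: the function names, the 80% in-sample share, the 70/15/15 partition and the 0.02 tolerance for "matching" errors are all assumptions chosen for the example.

```python
import random

def split_by_index(rows, in_sample_share=0.8):
    """Mechanical split by index: earlier rows for modeling, later rows held out."""
    cut = round(len(rows) * in_sample_share)
    return rows[:cut], rows[cut:]

def random_partition(rows, shares=(0.7, 0.15, 0.15), seed=1):
    """Random train/validation/test partition of the in-sample part only."""
    rng = random.Random(seed)
    shuffled = rows[:]                 # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * shares[0])
    n_valid = round(len(shuffled) * shares[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def errors_agree(errors, max_error=0.20, tolerance=0.02):
    """True if all errors are below max_error and differ by at most tolerance."""
    return max(errors) < max_error and max(errors) - min(errors) <= tolerance

rows = list(range(1000))               # stand-in for time-ordered observations
in_sample, out_of_sample = split_by_index(rows)
train, valid, test = random_partition(in_sample)

# Hypothetical error rates: train, validation, test, out-of-sample
print(errors_agree([0.18, 0.19, 0.185, 0.19]))   # True
print(errors_agree([0.05, 0.06, 0.05, 0.30]))    # False
```

The point of the design is that the held-out part is never shuffled: every out-of-sample row comes later than every row the model has seen, which is closer to how the model would meet future trades.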

PPS.

There is an example of using rattle in my article. A file is attached there as well; you can use it directly or as a template.

 
mytarmailS:

target - the very fact that the zigzag turns (not the direction)

To get the teacher's (target) values we still take the zigzag: an up leg is 1, a down leg is 0.

In your terminology: is it "the fact of reversal itself" or not? If not, what do you mean?

 
SanSanych Fomenko:


rattle itself divides the file quite intricately, but if you mean creating sets for training, testing, validation - rattle will do it,

As far as I remember, that "intricate" way is just a call to the "sample" function ;)
 
SanSanych Fomenko:

To get the teacher's (target) values we still take the zigzag: an up leg is 1, a down leg is 0.

In your terminology: is it "the fact of reversal itself" or not? If not, what do you mean?

I'm sorry, I must have misunderstood... I mean that the target candlestick is only the one on which the reversal occurs; all the rest belong to another class, "not a reversal".
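To make the difference between the two targets concrete, here is a small Python sketch. It assumes the zigzag pivots have already been found and are given as `(bar_index, kind)` pairs; this input format and both function names are hypothetical, introduced only for illustration.

```python
def direction_labels(n_bars, pivots):
    """Classical target: 1 on every bar of an up leg, 0 on every bar of a down leg.

    pivots: list of (bar_index, kind), kind is "low" or "high", sorted by index.
    A leg that starts at a "low" pivot is an up leg until the next pivot.
    """
    labels = [None] * n_bars           # bars before the first pivot stay unlabeled
    for i, (start, kind) in enumerate(pivots):
        end = pivots[i + 1][0] if i + 1 < len(pivots) else n_bars
        value = 1 if kind == "low" else 0
        for bar in range(start, end):
            labels[bar] = value
    return labels

def reversal_labels(n_bars, pivots):
    """Reversal target: 1 only on the pivot bar, 0 ("not a reversal") elsewhere."""
    labels = [0] * n_bars
    for bar, _kind in pivots:
        labels[bar] = 1
    return labels

pivots = [(2, "low"), (5, "high"), (8, "low")]
print(direction_labels(10, pivots))  # [None, None, 1, 1, 1, 0, 0, 0, 1, 1]
print(reversal_labels(10, pivots))   # [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
```

With the reversal target the positive class is rare, so the two classes are strongly imbalanced. Splitting the pivots into lows and highs would give two separate targets, which could fit a setup with separate buy and sell models.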
 
mytarmailS:
I'm sorry, I must have misunderstood... I mean that the target candlestick is only the one on which the reversal occurs; all the rest belong to another class, "not a reversal".
Very interesting. I tried that, but for some reason I gave it up... I don't remember why.
 
mytarmailS:
As far as I remember, that "intricate" way is just a call to the "sample" function ;)
Exactly; rattle simply does it itself, and you may not even know about it. But the second part, which is crucial for evaluating overfitting, must NOT be obtained with sample.
 
SanSanych Fomenko:
Very interesting, tried it, but for some reason I discarded it... I don't remember.

What predictors did you feed there?

As I see it, the predictors and the target form a single whole, and everything in it should be interconnected. It is very naive to hope to catch exact bounces while feeding in a 200-period moving average as a predictor. Those are like two different universes that don't intersect.
