Machine learning in trading: theory, models, practice and algo-trading - page 2029

 
elibrarius:
What is the data?

Every minute, buy trades are opened with SL and TP from 100 to 2500 points, in steps of 100, on a 5-digit quote. That is, 625 buys are opened every minute, all at the minute's opening price.

After decoding it should look like this:

#, entry price (×10^5), unix time, year, month, day, minute, second, day of the week, day of the year, ..., all the same for the exit, ..., SL in points, SL as a price, TP in points, TP as a price, trade result.

A random forest should take this data well. It would also be interesting to break it into clusters.
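
For concreteness, a minimal sketch (my reconstruction, not the poster's actual script) of this brute-force grid in Python:

```python
from itertools import product

POINT = 1e-5                                 # 5-digit quote: 1 point = 0.00001
GRID = range(100, 2600, 100)                 # 25 SL values x 25 TP values = 625 pairs

def open_buys(entry_price: float) -> list[dict]:
    """All 625 hypothetical buys opened at one minute's opening price."""
    trades = []
    for sl_pp, tp_pp in product(GRID, GRID):
        trades.append({
            "entry": entry_price,
            "sl_pp": sl_pp, "sl_price": round(entry_price - sl_pp * POINT, 5),
            "tp_pp": tp_pp, "tp_price": round(entry_price + tp_pp * POINT, 5),
        })
    return trades

print(len(open_buys(1.12345)))               # 625
```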

 
Rorschach:

Every minute, buy trades are opened with SL and TP from 100 to 2500 points, in steps of 100, on a 5-digit quote. [...]

I did the same thing with TP/SL.
I didn't use a 10-point step, though; I used a grid of 100, 200, 300, 500, 700, 1000, 2000. That gives 49 combinations per minute instead of 625, almost 13 times less data (625/49 ≈ 12.8).
I don't think a fine step is needed at high TP/SL values: 2400 and 2500 will differ only slightly, and the resource savings are 13-fold.
I also did separate passes for each TP/SL combination, rather than all of them at once in one file.
With equal TP and SL you can immediately assess the system's profitability, since anything consistently above 50% winners (plus costs) is an edge. For example, at TP = SL = 50 I got 53-55% winning trades in the tester.

Since March, when volatility increased, TP = SL = 50 has been unusable; before that it was a valid combination for many years. Before March it took 10-20 minutes to gain 50 points, while in March it took 1-5 minutes.
I have decided to drop fixed TP/SL; now I just search for entry points and estimate trade results on the fly with my tester. I think that will be more universal across volatility levels.

Try taking a single combination, say TP = SL = 100, and train the model on it; you should have enough computing power for that. If you get above 55%, your data is better than mine) And it's better to check not on OOS (which can just be lucky) but with cross-validation, which is more reliable.
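
A minimal sketch of that check (my illustration; the file and column names are assumptions, since the thread never fixes a format): keep only the TP = SL = 100 slice and score a forest by time-ordered cross-validation:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

df = pd.read_csv("trades.csv")                        # hypothetical decoded file
df = df[(df["sl_pp"] == 100) & (df["tp_pp"] == 100)]  # one combination only

X = df.drop(columns=["result"])                       # entry-time features
y = (df["result"] > 0).astype(int)                    # 1 = trade closed in profit

# TimeSeriesSplit keeps each validation fold after its training data,
# so the score is not inflated by peeking into the future
scores = cross_val_score(RandomForestClassifier(n_estimators=300, n_jobs=-1),
                         X, y, cv=TimeSeriesSplit(n_splits=5))
print(f"mean CV accuracy: {scores.mean():.3f}")       # the 55% bar to beat
```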
 
elibrarius:
I did the same thing with TP/SL, but on a grid of 100, 200, 300, 500, 700, 1000, 2000. [...] Try taking a single combination, say TP = SL = 100, and train the model on it. [...]

Most likely you will have to thin out your data.

 
Rorschach:

Most likely you will have to thin out your data.

If you do train with TP = SL = 50 or 100, write back with the cross-validation result; I wonder if you will get more than 55%.
 
elibrarius:
If you do train with TP = SL = 50 or 100, write back with the cross-validation result; I wonder if you will get more than 55%.

I ran a system in the tester over 10 years that opened one buy per day with various SL/TP ratios. Among the survivors there was not a single system with equal SL and TP; most had TP < SL. You can try it yourself.

I also ran an emulation, which showed that systems with TP < SL reach profit faster.
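
A toy version of such an emulation (my sketch, not the poster's code): on a driftless walk the chance of hitting TP before SL is roughly SL / (TP + SL), so TP < SL wins more often per trade and equity tends to cross into the plus sooner:

```python
import random

def trades_until_profit(tp: int, sl: int, cap: int = 10_000) -> int:
    """Number of trades until cumulative equity first turns positive."""
    equity, n = 0.0, 0
    while equity <= 0 and n < cap:
        win = random.random() < sl / (tp + sl)   # zero-edge approximation
        equity += tp if win else -sl
        n += 1
    return n

random.seed(1)
for tp, sl in [(50, 150), (100, 100), (150, 50)]:
    runs = [trades_until_profit(tp, sl) for _ in range(500)]
    print(f"TP={tp:3d} SL={sl:3d} -> avg trades to go positive: {sum(runs)/len(runs):.1f}")
```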

 
mytarmailS:

Something wrong with the network again?)

Finally found the bug in the tester. Now everything falls into place.

Most likely he made the same mistake in his tester.

Closing the topic ) I jumped the gun; the Lambo is postponed

 
Rorschach:

I've got the data; now I don't know how to process it, I don't have that much computing power((((

Right now it's a CSV with a single column of indices, and it weighs a gigabyte. After "decoding", the binary will weigh 17 times more.

Who needs it?
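
A hedged guess at what that "decoding" could look like: if each row is just a running trade index, the (minute, SL, TP) triple can be recovered arithmetically, since 625 trades are opened per minute in a fixed 25 × 25 grid order. The exact encoding here is my assumption, not Rorschach's:

```python
def decode(index: int) -> tuple[int, int, int]:
    """Assumed layout: index = minute * 625 + sl_slot * 25 + tp_slot."""
    minute, combo = divmod(index, 625)
    sl_slot, tp_slot = divmod(combo, 25)
    return minute, (sl_slot + 1) * 100, (tp_slot + 1) * 100  # SL, TP in points

print(decode(0))      # (0, 100, 100)   first combo of minute 0
print(decode(1874))   # (2, 2500, 2500) last combo of minute 2
```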

CatBoost will chew through this data; it works well with large files. I just don't understand what the target is...

If you prepare the data as a CSV with a target column, I can run it on my machine.
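
If it does go to CatBoost, a minimal sketch of the kind of run meant here (file names and the column-description file are hypothetical; the target is assumed to be "trade closed in profit", which is exactly the open question above):

```python
from catboost import CatBoostClassifier, Pool

# columns.cd marks which column is the Label; CatBoost reads big TSVs directly
train = Pool("trades_train.tsv", column_description="columns.cd")
valid = Pool("trades_valid.tsv", column_description="columns.cd")

model = CatBoostClassifier(iterations=500, depth=6, eval_metric="Accuracy")
model.fit(train, eval_set=valid, verbose=100)
print(model.get_best_score())
```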

 
Rorschach:

I ran a system in the tester over 10 years that opened one buy per day with various SL/TP ratios. [...] Systems with TP < SL reach profit faster.

I managed to build a system where TP is 2-3 times larger than SL, but there is another problem: only 20-25% of trades are profitable, and I cannot train the model well enough to sift out the losing entries.
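
For context (my addition, not from the thread): ignoring spread and commission, the breakeven win rate for an asymmetric payoff is SL / (TP + SL), so 20-25% winners sits right at or below breakeven for TP = 2-3 × SL, which is why the entry filter matters so much:

```python
# Breakeven win rate p* solves p* * TP - (1 - p*) * SL = 0  =>  p* = SL / (TP + SL)
for ratio in (2, 3):
    tp, sl = ratio, 1
    print(f"TP = {ratio}x SL -> breakeven win rate = {sl / (tp + sl):.1%}")
# TP = 2x SL -> 33.3%, TP = 3x SL -> 25.0%
```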

 
Maxim Dmitrievsky:

Checked it; it works better in the tester. I'll keep digging into it.)

Just got into recurrent networks... very cool stuff to study.

P.S. Turns out I did it right; it just learns very sadly.

I thought it was love, but no, experience again))

Why is it sad? Maxim, do you have links to documentation on recurrent networks? I am interested in how exactly the weights are updated during training
 
Alexander Alekseyevich:
Why is it sad? Maxim, do you have links to documentation on recurrent networks? I am interested in how exactly the weights are updated during training

They are trained the same way as other networks: by backpropagation, just unrolled through time (BPTT).

There's plenty of information on Google.

Here's a good one:

https://towardsdatascience.com/gate-recurrent-units-explained-using-matrices-part-1-3c781469fc18

Gated Recurrent Units explained using Matrices: Part 1, by Sparkle Russell-Puleri (towardsdatascience.com)
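
Since the question was how exactly the weights get updated, here is a minimal numpy sketch of backpropagation through time for a plain RNN. This is my illustration, not from the linked article; a GRU adds gates on top, but the gradient mechanics are the same:

```python
import numpy as np

rng = np.random.default_rng(0)
H, T, lr = 16, 20, 0.01
Wx = rng.normal(0, 0.3, (H, 1))        # input -> hidden
Wh = rng.normal(0, 0.3, (H, H))        # hidden -> hidden (the recurrent weights)
Wy = rng.normal(0, 0.3, (1, H))        # hidden -> output

xs = np.sin(np.linspace(0, 4 * np.pi, T + 1))   # toy task: next-value prediction

for epoch in range(2000):
    # forward pass, storing every hidden state for the backward pass
    hs, ys, loss = [np.zeros((H, 1))], [], 0.0
    for t in range(T):
        h = np.tanh(Wx * xs[t] + Wh @ hs[-1])
        hs.append(h)
        ys.append(Wy @ h)
        loss += 0.5 * float((ys[-1] - xs[t + 1]) ** 2)

    # backward pass *through time*: the same Wx/Wh/Wy are reused at every
    # step, so their gradients are summed over all T steps
    dWx, dWh, dWy = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wy)
    dh_next = np.zeros((H, 1))
    for t in reversed(range(T)):
        dy = ys[t] - xs[t + 1]                 # dLoss/dy at step t
        dWy += dy * hs[t + 1].T
        dh = Wy.T * dy + dh_next               # from this step's output + the future
        dz = (1.0 - hs[t + 1] ** 2) * dh       # back through tanh
        dWx += dz * xs[t]
        dWh += dz @ hs[t].T
        dh_next = Wh.T @ dz                    # carried one step further back
    for W, dW in ((Wx, dWx), (Wh, dWh), (Wy, dWy)):
        np.clip(dW, -1.0, 1.0, out=dW)         # crude guard against exploding grads
        W -= lr * dW                           # plain SGD update

print(f"final loss: {loss:.4f}")
```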