Machine learning in trading: theory, models, practice and algo-trading - page 391

 
Dr. Trader:

This is from Reshetov's RNN, a probabilistic model.

And then there's jPredictor, which Mikhail uses: Reshetov's neuron with a lot of inputs, and some kind of training other than gradient descent.


I tried to solve the problem from the first post with the NS from Alglib. The network is 20-5-1. It worked, but it takes very long: your solution takes about 2 sec, while my calculations average 10-20 min. There are variants that finish in 2 min with 1 training cycle, but those apparently succeed by accident; to be reliable I have to set 20 training cycles... or 100,000 iterations, as in the variant below.

Alert: Average error in training (60.0%) section =0.000 (0.0%) nLearns=2 NGrad=68213 NHess=0 NCholesky=0 codResp=2
Alert: Average error in validation (20.0%) section =0.000 (0.0%) nLearns=2 NGrad=68213 NHess=0 NCholesky=0 codResp=2
Alert: Average error on test (20.0%) section =0.000 (0.0%) nLearns=2 NGrad=68213 NHess=0 NCholesky=0 codResp=2

Calculation time=22.30 min
0 sum weight=3.2260
1 sum weight=0.0000
2 sum weight=3.2258
3 sum weight=0.0000
4 sum weight=8.7035
5 sum weight=0.0000
6 sum weight=3.2253
7 sum weight=0.0000
8 sum weight=3.2258
9 sum weight=0.0000
10 sum weight=3.2251
11 sum weight=0.0000
12 sum weight=0.0000
13 sum weight=0.0000
14 sum weight=0.0000
15 sum weight=0.0000
16 sum weight=0.0000
17 sum weight=0.0000
18 sum weight=0.0000
19 sum weight=0.0000

I want this to be quicker...
If the problem has 200 inputs instead of 20, it will take dozens of hours.
Sifting out inputs that correlate weakly with the output, or strongly with other inputs, removes important inputs. I even tried Fisher's LDA, and it also removes important inputs. In other words, sifting with these methods doesn't help; if anything, it gets in the way.
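For illustration, here is a minimal sketch in plain Java of the correlation sifting described above: drop inputs that correlate weakly with the output or strongly with an already-kept input (the threshold values passed in are my assumptions, not numbers from this thread):

```java
import java.util.ArrayList;
import java.util.List;

public class CorrelationSift {

    // Pearson correlation between two equal-length series.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // Keep input i only if |corr(input_i, target)| >= minToTarget and
    // |corr(input_i, input_j)| <= maxToOther for every input j kept so far.
    static List<Integer> sift(double[][] inputs, double[] target,
                              double minToTarget, double maxToOther) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < inputs.length; i++) {
            if (Math.abs(pearson(inputs[i], target)) < minToTarget) continue;
            boolean redundant = false;
            for (int j : kept)
                if (Math.abs(pearson(inputs[i], inputs[j])) > maxToOther) {
                    redundant = true;
                    break;
                }
            if (!redundant) kept.add(i);
        }
        return kept;
    }
}
```

Being a purely linear filter, it shares exactly the failure mode described above: inputs that only matter in combination look uncorrelated one by one and get thrown away.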

Apparently that leaves me with one long run over all inputs, then sifting inputs by their sum of weights and keeping the resulting model for future use. Then retraining, say, once a week on the reduced set of inputs.
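A minimal sketch of that sum-of-weights sifting, assuming the trained network's first-layer weights can be exported as a hidden-by-input matrix (the cutoff value is an illustrative assumption):

```java
import java.util.ArrayList;
import java.util.List;

public class WeightSift {

    // Per-input sum of absolute first-layer weights, i.e. the
    // "N sum weight=..." numbers printed above.
    static double[] sumWeights(double[][] firstLayer /* [hidden][input] */) {
        double[] sums = new double[firstLayer[0].length];
        for (double[] neuron : firstLayer)
            for (int i = 0; i < sums.length; i++)
                sums[i] += Math.abs(neuron[i]);
        return sums;
    }

    // Indices of inputs worth keeping for the weekly retrain.
    static List<Integer> keepAbove(double[] sums, double cutoff) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < sums.length; i++)
            if (sums[i] > cutoff) kept.add(i);
        return kept;
    }
}
```

On the printout above, this would keep inputs 0, 2, 4, 6, 8 and 10 and drop the rest.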

I was thinking: maybe, for speed, hand this task to MS Azure to get the total input weights, and then use them in my model. I experimented with it a little, but I don't see where to get the weights from...

 
That's right, bros!!! That's right, I use Reshetov's optimizer. So I'd like to run the calculations on the GPU. Has anyone ever done such a thing? Since JPrediction is parallelized, it should just be a matter of running the program on the GPU. Does anyone know how to run a JAVA program on the GPU? That would be useful knowledge, I think...
 
Mihail Marchukajtes:
That's right, bros!!! That's right, I use Reshetov's optimizer. So I'd like to run the calculations on the GPU. Has anyone ever done such a thing? Since JPrediction is parallelized, it should just be a matter of running the program on the GPU. Does anyone know how to run a JAVA program on the GPU? That would be useful knowledge, I think...

Can you give me a link to a working version and a description?
 
Mihail Marchukajtes:
Hi all!!! I'm glad this thread hasn't died out and is still going, so I have a question for the public. I have a dataset for training, but unfortunately it has grown so big that training takes too long. Maybe someone can build a model on it with their own tools, and then we'll see how it works together!!!
Your set, on the contrary, is VERY small: 111 features, 452 points. But if the data are collected sensibly (the target is not confused with the features), then there is a 3-4% advantage (accuracy 53.5%). For a large investment fund or bank trading on the medium term that is enough; for intraday with giant leverage and a couple of k$ of deposit, certainly not.
 
Alyosha:
There is a 3-4% advantage (accuracy - 53.5%)
What model was used, and in what configuration? And why did you decide this is not a random result? My runs on this dataset don't converge: I get 47%, then 50%, then 53%.
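As a rough sanity check on whether 53.5% can be chance: on 452 points the standard error of accuracy around a 50% coin flip is sqrt(0.5·0.5/452) ≈ 2.4%, so results wandering between 47% and 53% sit within about two sigma of random. A minimal sketch of that calculation (numbers taken from this thread):

```java
public class AccuracyNoise {
    public static void main(String[] args) {
        int n = 452;          // points in the dataset, per the thread
        double acc = 0.535;   // reported accuracy
        double p0 = 0.5;      // coin-flip baseline for a binary target
        double se = Math.sqrt(p0 * (1 - p0) / n); // ~0.0235
        double z = (acc - p0) / se;               // ~1.49 sigma
        System.out.printf("se = %.4f, z = %.2f%n", se, z);
        // z below ~2 means the 3-4% edge is not yet statistically
        // distinguishable from luck on a sample this small.
    }
}
```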
 
Aliosha:
Your set, on the contrary, is VERY small: 111 features, 452 points. But if the data are collected sensibly (the target is not confused with the features), then there is a 3-4% advantage (accuracy 53.5%). For a large investment fund or bank trading on the medium term that is enough; for intraday with giant leverage and a couple of k$ of deposit, certainly not.

I think it would be enough for intraday as well, if the entry is 50 pips better than the signal. I'd rather trade with delays; you'd still earn more than the spread.
 
Maxim Dmitrievsky:

Can you give me a link to a working version and a description?

What do you mean? A link to JPrediction?
 
Aliosha:
Your set, on the contrary, is VERY small: 111 features, 452 points. But if the data are collected sensibly (the target is not confused with the features), then there is a 3-4% advantage (accuracy 53.5%). For a large investment fund or bank trading on the medium term that is enough; for intraday with giant leverage and a couple of k$ of deposit, certainly not.

I don't know about it being small. It's a whole futures contract over 3 months. The question is different: I have two more weeks of data that the network hasn't seen. So I thought I'd build a model and run it on that sample. But with JPrediction the training will take weeks, which is no good. That's why I wanted to get a model using other algorithms and see how it works.
 
Again, this set is meant for classification; the output variable already carries a prediction. If you use a robust model, you don't need to forecast the output variable, you only need to fit the model to it, because the output already lies in the future. That's the point, if you understand me correctly.
 

The other thing is to run the program on the GPU and speed up the calculations at least 10-20 times. That, I think, would be real progress... But the information about this on the internet is very old, and I can't work out from it how to do it. I'm not much good at programming :-)
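For what it's worth, one commonly cited way to get plain Java onto the GPU is the Aparapi library, which translates a Kernel's run() method into OpenCL at runtime and falls back to a Java thread pool when no GPU is available. A minimal sketch, assuming the com.aparapi:aparapi artifact is on the classpath (an illustration of the approach, not JPrediction's actual code):

```java
import com.aparapi.Kernel;
import com.aparapi.Range;

public class GpuSketch {
    public static void main(String[] args) {
        final int n = 1_000_000;
        final float[] a = new float[n], b = new float[n], out = new float[n];
        for (int i = 0; i < n; i++) { a[i] = i; b[i] = 2.0f * i; }

        // run() is translated to an OpenCL kernel; each work item
        // handles one index. The body here is a stand-in for the
        // real per-candidate computation.
        Kernel kernel = new Kernel() {
            @Override
            public void run() {
                int i = getGlobalId();
                out[i] = a[i] * b[i];
            }
        };
        kernel.execute(Range.create(n));
        kernel.dispose();
        System.out.println("out[42] = " + out[42]);
    }
}
```

The catch is that only a restricted subset of Java (primitive arrays, no objects inside the kernel) can be translated, so the hot loops of a program like JPrediction would likely need rewriting in that style.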

The idea behind all this fuss is the following. It doesn't really matter which algorithm is used (although I'm lying, of course it matters; what matters is that overfitting is kept to a minimum). What is IMPORTANT is the nature of the data: how it is collected and prepared for training. That is what I wanted to check: is there actually any fish in the data I collect? Here is an example.
