Machine learning in trading: theory, models, practice and algo-trading - page 1439

 
Maxim Dmitrievsky:


Realistically, rewriting the whole trading system (TS) for CatBoost just to try it is also a lot of trouble. But the fact remains: trained on a small dataset, for example 2-5k samples, the forest generalizes well and works. Make the dataset only 2 times larger, on the same data, and it overfits completely. That's a fact.

I tried short datasets: one week the error is 30%, and the next week it is 60-70%, which averages out to about 50%.
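To make the point about the average concrete, here is a minimal sketch of a per-week error check. The function name `weekly_error_rates`, the record layout and the data are all hypothetical (synthetic numbers chosen to mirror the 30% and 70% weeks mentioned above), not anyone's actual test results.

```python
from collections import defaultdict
from statistics import mean


def weekly_error_rates(records):
    """records: iterable of (week, predicted_label, true_label) tuples."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for week, predicted, actual in records:
        total[week] += 1
        wrong[week] += int(predicted != actual)
    # Share of misclassified samples per week, in week order.
    return {week: wrong[week] / total[week] for week in sorted(total)}


# Synthetic data (assumption): 10 samples per week,
# 3 wrong in week 1 and 7 wrong in week 2.
records = (
    [(1, 1, 1)] * 7 + [(1, 1, 0)] * 3
    + [(2, 1, 1)] * 3 + [(2, 1, 0)] * 7
)
rates = weekly_error_rates(records)
print(rates)
print(mean(rates.values()))
```

Looking at the per-week rates rather than only their mean shows the week-to-week swing that a single averaged figure hides.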

 
elibrarius:

I tried short datasets: one week the error is 30%, and the next week it is 60-70%, which averages out to about 50%.

For example, if I train on one month of data, the model works almost as well for a year on new data. If I train on 2-3 months, it doesn't work any more... some kind of bullshit.

And the model errors are the same.
 
Maxim Dmitrievsky:

For example, if I train on one month of data, the model works almost as well for a year on new data. If I train on 2-3 months, it doesn't work any more... some kind of bullshit.

Are these results from your self-training system?
 
elibrarius:
Are these results from your self-training system?

Yes, on it, with some tricks. I'll share one: add intermediate samples to the training set. For example, when there is a signal to open a buy trade, then for as long as the trade stays open, add another sample on each new bar with the same buy label and with the new feature values for that bar. This greatly reduces the error. It's a kind of sample duplication.

It may not reduce the error in some models, but it does in mine.
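A minimal sketch of how the intermediate samples could be generated, under stated assumptions: the `Trade` class, the function `expand_with_intermediate_samples`, the label encoding (1 = buy, 0 = sell) and the choice to include the entry bar but exclude the exit bar are all hypothetical and not taken from the actual system discussed here.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Trade:
    entry_bar: int  # index of the bar where the signal opened the trade
    exit_bar: int   # index of the bar where the trade was closed
    label: int      # hypothetical encoding: 1 = buy, 0 = sell


def expand_with_intermediate_samples(
    features: Sequence[Sequence[float]],
    trades: Sequence[Trade],
) -> Tuple[List[List[float]], List[int]]:
    """Return (X, y) with one sample per bar on which each trade is open."""
    X: List[List[float]] = []
    y: List[int] = []
    for trade in trades:
        # Entry bar plus every later bar up to, but not including, the exit bar
        # (whether the exit bar belongs in the set is an assumption).
        for bar in range(trade.entry_bar, trade.exit_bar):
            X.append(list(features[bar]))  # feature values at this bar
            y.append(trade.label)          # same label as the original signal
    return X, y


# Synthetic feature rows, one per bar (illustration only).
features = [[float(i), float(i) * 0.5] for i in range(10)]
trades = [Trade(entry_bar=0, exit_bar=3, label=1),
          Trade(entry_bar=5, exit_bar=7, label=0)]

X, y = expand_with_intermediate_samples(features, trades)
print(len(X), y)  # 5 [1, 1, 1, 0, 0]
```

Here two signals turn into five training samples: each bar in an open trade contributes its own feature row with the label of the signal that opened the trade.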
 
Maxim Dmitrievsky:

Yes, on it, with some tricks. I'll share one: add intermediate samples to the training set. For example, when there is a signal to open a buy trade, then for as long as the trade stays open, add another sample on each new bar with the same buy label and with the new feature values for that bar. This greatly reduces the error. It's a kind of sample duplication.

Well, this is sort of like selecting the target on the first run. The remaining cycles are essentially just supervised learning on the labels from the first run.
With this trick, you will test more variations.
 
elibrarius:
Well, this is sort of like selecting the target on the first run. The remaining cycles are essentially just supervised learning on the labels from the first run.
With this trick, you will test more variations.

I don't really get it. Rather, it is a duplication of samples. Usually people just supply buy and sell labels without caring how the market behaves between those signals. If you add intermediate supporting samples, the model automatically classifies better.

For example, if I actually have 1000 signal samples, then together with the intermediate supporting samples it's 5k or more.
 
Aleksey Vyazmikin:

Pruning should control completeness, i.e. cut back until each leaf covers at least 0.5-1% of the sample.

Completeness of what? Just cut empirically to the right depth.

 
Maxim Dmitrievsky:

Completeness of what? Just cut empirically to the right depth.

A leaf should contain at least a given percentage of the examples from the sample; if it contains fewer, we cut off the splits. The more examples, the more likely the pattern is real. It's that simple.
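The coverage rule could be sketched roughly as follows. This is an illustration on a toy binary tree that stores only sample counts; the `Node` class and the functions `prune_by_coverage` and `leaf_sizes` are hypothetical and not part of any particular library or of the pruning code discussed here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    n_samples: int                  # training examples that reached this node
    left: Optional["Node"] = None   # both children set, or both None (leaf)
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def prune_by_coverage(node: Node, total: int, min_share: float) -> Node:
    """Drop every split that would leave a leaf below min_share of the sample."""
    if node.is_leaf():
        return node
    min_count = min_share * total
    # If either child is too small, anything below it is too small as well,
    # so the whole split is cut and this node becomes a leaf.
    if node.left.n_samples < min_count or node.right.n_samples < min_count:
        return Node(node.n_samples)
    return Node(
        node.n_samples,
        prune_by_coverage(node.left, total, min_share),
        prune_by_coverage(node.right, total, min_share),
    )


def leaf_sizes(node: Node) -> List[int]:
    if node.is_leaf():
        return [node.n_samples]
    return leaf_sizes(node.left) + leaf_sizes(node.right)


# Toy tree over 1000 examples; one split isolates only 5 of them.
tree = Node(1000,
            Node(700, Node(695), Node(5)),
            Node(300, Node(200, Node(120), Node(80)), Node(100)))

pruned = prune_by_coverage(tree, total=1000, min_share=0.01)
print(leaf_sizes(pruned))  # [700, 120, 80, 100]
```

With a 1% threshold, the split that separated 5 examples is removed and its parent becomes a leaf of 700, while every remaining leaf covers at least 10 examples.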

 
Maxim Dmitrievsky:

Yes, on it, with some tricks. I'll share one: add intermediate samples to the training set. For example, when there is a signal to open a buy trade, then for as long as the trade stays open, add another sample on each new bar with the same buy label and with the new feature values for that bar. This greatly reduces the error. It's a kind of sample duplication.

It may not reduce the error in some models, but in mine it reduces it a lot.

I started with this approach too, but I did the opposite: I tried to reveal how smooth the curve of correct classification is from the entry point to the exit point. My approach requires a lot of computing power, so I had to abandon it. You have the reverse, which is interesting; there's potential for counter-trend trading... I'm just thinking about how something similar could be implemented in ML so that it works on my signals. I don't know how to train it, but there is obviously some potential there.

 
Maxim Dmitrievsky:

I don't really get it. Rather, it is a duplication of samples. Usually people just supply buy and sell labels without caring how the market behaves between those signals. If you add intermediate supporting samples, the model automatically classifies better.

For example, if I actually have 1000 signal samples, then together with the intermediate supporting samples it's 5k or more.

I also noticed this and applied it, but as far as I understand, it works because the data is crap, and this trick helps the model train on a range of outliers. Without it, the model fits one broker, or sometimes even stops working after the data is reloaded or updated in the same terminal.
