Machine learning in trading: theory, models, practice and algo-trading - page 2606

 
Maxim Dmitrievsky #:

There is such a formulation of the question:

Two models are used. One predicts whether to buy or sell, the other whether to trade or not.

First the first model is trained; then we see where it predicts badly, mark those examples as "do not trade" and the good ones as "trade", and on that markup we train the second model.

The first model is tested not only on the training section but also on an additional section, while the second model is trained on both sections.

We repeat this several times, retraining both models on the same dataset. The results gradually improve on those samples, but not always on the control sample.

In parallel, a cumulative log of bad trades is kept across all passes: all "bad" deals are collected in it for training the second model and are filtered by some rule, e.g. the more copies of a bad deal accumulate over all passes, the higher the chance of marking it as "do not trade".

For example, for each date a certain number of bad trades accumulates over all training iterations; where this number exceeds a threshold (say, the mean), those trades are marked as "do not trade". The rest are left alone, otherwise with many training iterations it would be possible to exclude all trades.

A coefficient adjusts the number of excluded trades: the lower it is, the more trades are filtered out.
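Roughly, the whole loop could look something like this in Python (a sketch only: the scikit-learn classifier, the per-sample pnl array used to define "bad", and the mean-based threshold are my assumptions here, and the training/additional-section split is left out for brevity):

```python
import numpy as np
from collections import Counter
from sklearn.ensemble import GradientBoostingClassifier

def two_model_loop(X, y_dir, pnl, n_iter=5, coef=1.0):
    """Iteratively train a direction model and a trade / do-not-trade filter.

    X     : feature matrix, shape (n_samples, n_features)
    y_dir : direction labels, 1 = buy, 0 = sell
    pnl   : result of each hypothetical trade, used to mark "bad" examples
    coef  : the lower it is, the more trades get filtered out
    """
    bad_log = Counter()                    # cumulative log of bad examples over all passes
    allow = np.ones(len(X), dtype=bool)    # samples the direction model still trains on

    for _ in range(n_iter):
        # 1) train the first (direction) model only on samples still marked "trade"
        m_dir = GradientBoostingClassifier().fit(X[allow], y_dir[allow])

        # 2) see where it predicts badly: wrong direction or a losing trade
        bad = (m_dir.predict(X) != y_dir) | (pnl <= 0)
        for i in np.where(bad)[0]:
            bad_log[i] += 1                # accumulate "bad" hits across all passes

        # 3) mark "do not trade" only where the accumulated count exceeds a
        #    mean-based threshold, so many iterations cannot exclude everything
        counts = np.array([bad_log[i] for i in range(len(X))])
        y_meta = (counts <= coef * counts.mean()).astype(int)  # 1 = trade, 0 = do not trade

        # 4) train the second (filter) model on this markup
        m_meta = GradientBoostingClassifier().fit(X, y_meta)

        # 5) drop the "do not trade" examples from the next pass of the first model
        allow = y_meta.astype(bool)

    return m_dir, m_meta
```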

... by this point i'm already tired of writing ...

How can such a combination of models be improved so that it also improves its results on a new, independent section of data?
Is there any philosophy as to why this might work? Beyond the fact that the models naturally improve each other (the error drops) on each round of retraining; the question is how to get rid of the fitting.

Interesting concept!

1. How to get rid of the fitting. I didn't quite understand the part about the iterations. Why not just train the filtering (second) model once and evaluate whether it improves the performance of the first one or not? You could simply filter the signals of model 1 through model 2, or feed the output of 2 to the input of 1 (a rough sketch of the one-pass variant is below, after point 3).

2. How to improve.
2.1. You could try turning the per-trade markup into a cluster markup. Surely bad signals pile up together, and so do the good ones, so we mark up the clusters. You can train per trade (features built around the entries/exits; target: whether we end up in a good or bad cluster), or you can train per cluster (features per cluster, one training object = one cluster; target: either the same, i.e. the next candle falls into a good or bad cluster, or whether the next cluster is good or bad, which is almost the same thing, I suppose).
2.2. The feature sets should probably be different for these two models, otherwise I think the marginal utility of the second model would be low.


3. The philosophy behind the concept. Who needs it? The effectiveness of the model, the profit: that is the criterion. Experiments rule, not philosophy. :)
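A rough sketch of the one-pass variant from point 1 (names, the classifier choice and the pnl-based definition of a "good" signal are assumptions, not the exact setup discussed above):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def train_once_and_filter(X_train, y_dir_train, pnl_train, X_new, p_min=0.5):
    """One pass: model 1 gives the direction, model 2 gates whether to trade at all."""
    # model 1: direction (1 = buy, 0 = sell)
    m_dir = GradientBoostingClassifier().fit(X_train, y_dir_train)

    # model 2: trained once on whether model 1's signal actually paid off
    good = ((m_dir.predict(X_train) == y_dir_train) & (pnl_train > 0)).astype(int)
    m_meta = GradientBoostingClassifier().fit(X_train, good)

    # filter the signals of model 1 through model 2
    direction = m_dir.predict(X_new)
    p_trade = m_meta.predict_proba(X_new)[:, 1]      # probability the signal is worth taking
    return np.where(p_trade > p_min, direction, -1)  # -1 = no trade
```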

 
Replikant_mih #:

Interesting concept!

1. How to get rid of the fitting. I didn't quite understand the part about the iterations. Why not just train the filtering (second) model once and evaluate whether it improves the performance of the first one or not? You could simply filter the signals of model 1 through model 2, or feed the output of 2 to the input of 1.

2. How to improve.
2.1. You could try turning the per-trade markup into a cluster markup. Surely bad signals pile up together, and so do the good ones, so we mark up the clusters. You can train per trade (features built around the entries/exits; target: whether we end up in a good or bad cluster), or you can train per cluster (features per cluster, one training object = one cluster; target: either the same, i.e. the next candle falls into a good or bad cluster, or whether the next cluster is good or bad, which is almost the same thing, I suppose).
2.2. The feature sets should probably be different for these two models, otherwise I think the marginal utility of the second model would be low.


3. The philosophy behind the concept. Who needs it? The effectiveness of the model, the profit: that is the criterion. Experiments rule, not philosophy. :)

We want to improve the generalization ability of the first model (and of the second one too). If we just filter its signals with the second model, the classification error of the first model itself does not decrease. So we run the two trained models over the dataset and drop the bad examples from the training set of the first one so that its error gets lower; the error of the second one drops as well. We repeat this several times, and it should get better each time. I would like it to also get better on the test samples each time, but there the variation is large.

I'm thinking about what else to add to it; maybe some of these ideas will work too :)

 
Maxim Dmitrievsky #:

We want to improve the generalization ability of the first model (and of the second one too). If we just filter its signals with the second model, the classification error of the first model itself does not decrease. So we run the two trained models over the dataset and drop the bad examples from the training set of the first one so that its error gets lower; the error of the second one drops as well. We repeat this several times, and it should get better each time. I would like it to also get better on the test samples each time, but there the variation is large.

I'm thinking about what else to add to it; maybe some of these ideas will work too :)

Are you sure you need 2 models, and that they will improve the result on OOS?
You could simply set the decision boundary of the first model not at 0.5 but at 0.3 and 0.7, or even 0.1 and 0.9: the low-probability deals will then drop out, there will be fewer trades, and only one model is needed.
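In code that is just a cutoff on the predicted probability; the 0.3/0.7 boundaries below are the example values from the post, and the function name is made up:

```python
import numpy as np

def signals_with_cutoff(model, X, lo=0.3, hi=0.7):
    """Single-model variant: trade only when the predicted probability is far from 0.5."""
    p_buy = model.predict_proba(X)[:, 1]   # probability of the "buy" class
    signal = np.full(len(X), -1)           # -1 = no trade
    signal[p_buy >= hi] = 1                # confident buy
    signal[p_buy <= lo] = 0                # confident sell
    return signal
```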
 
Aleksey Nikolayev #:

Don't get me wrong... Therefore, I prefer to rely on verifiable assertions.

Don't get me wrong. I only pointed out the inaccuracy in your logical construction: "there are no long-lived algorithms on the market, otherwise sooner or later they would be the only ones left on it". I showed exactly in which niche such algorithms exist, and why they exist there in isolation but do not capture the entire market. And I gave a verifiable example.

About the author | QuantAlgos
  • www.quantalgos.ru
Welcome to my blog. My name is Vitaly; I have been writing exchange trading robots since 2008. In 2009 and 2010 I took part in the LChI contest (best private investor), run by the RTS exchange (now the Moscow Exchange), under the nickname robot_uralpro, using HFT robots written in C#. The results were as follows: At present I continue...
 
elibrarius #:
Are you sure you need 2 models, and that they will improve the result on OOS?
You could simply set the decision boundary of the first model not at 0.5 but at 0.3 and 0.7, or even 0.1 and 0.9: the low-probability deals will then drop out, there will be fewer trades, and only one model is needed.

With 2 there is more flexibility; those probabilities are iffy... the cutoff just reduces the number of deals, it doesn't improve the stability...

 
Maxim Dmitrievsky #:

With 2 there is more flexibility; those probabilities are iffy... the cutoff just reduces the number of deals, it doesn't improve the stability...

You don't have stability with 2 either...
 
Doctor #:

Don't get me wrong. I only pointed out the inaccuracy in your logical construction: "there are no long-lived algorithms on the market, otherwise sooner or later they would be the only ones left on it". I showed exactly in which niche such algorithms exist, and why they exist there in isolation but do not capture the entire market. And I gave a verifiable example.

Verifying that a statement was in fact made is not the same as verifying its content.

Even assuming it is proven (though there can be, and often are, problems with that) that someone makes steady money year after year, it is not at all clear what a proof that this is done by the same algorithm would even look like. I would like to see more substantive options than "take my word for it" and "that's what I'm telling you".

 
Maxim Dmitrievsky #:

With 2 there is more flexibility; those probabilities are iffy... the cutoff just reduces the number of deals, it doesn't improve the stability...

it's better with 3.

;)

 
elibrarius #:
You don't have stability with 2 either...

many options, it's hard to compare

 
Maxim Dmitrievsky #:

many options, it's hard to compare

You already have an example of a working bundle of 2 models. A variant with just the 1st model (with a cutoff at 0.1-0.9 or 0.2-0.8) is easy to build from it, and then you can compare their stability on the OOS.
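For the comparison itself, something like this would do, assuming both variants produce a per-bar signal aligned with OOS returns (the Sharpe-like score is only one possible stability measure, and the variable names in the usage comments are hypothetical):

```python
import numpy as np

def oos_stability(signals, returns):
    """Crude stability score of a signal stream on out-of-sample data.

    signals : 1 = buy, 0 = sell, -1 = no trade
    returns : bar-to-bar returns aligned with the signals
    """
    pnl = np.where(signals == 1, returns,
                   np.where(signals == 0, -returns, 0.0))
    return pnl.mean() / (pnl.std() + 1e-9)   # Sharpe-like ratio per bar

# hypothetical usage: compare the 2-model bundle vs. a single model with a 0.2-0.8 cutoff
# oos_stability(signals_two_models, oos_returns)
# oos_stability(signals_one_model_cutoff, oos_returns)
```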