Machine learning in trading: theory, models, practice and algo-trading - page 691

 
Maxim Dmitrievsky:

But all these statistical approaches are not relevant to forex :)

except maybe as a way to exercise the brain

It depends on the approach to the market in general. For example, after selecting predictors and training a model, you can apply the same metrics to the trained model's results. If several models have been trained, those metrics can then be used to pick the right one. That is largely the whole problem: after obtaining 10 models, you have to choose the one that will be best in the future. And that can be done by computing mutual information, or by building the same random forests, on the results of the obtained models.... IMHO
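A minimal sketch of that idea in R, assuming each trained model's out-of-sample predictions are collected in a list (the names preds and target are placeholders, not anyone's actual code): score each model by the mutual information between its predictions and the realized outcome, and keep the winner. Uses the infotheo package.

library(infotheo)

# preds: list of numeric prediction vectors, one per trained model,
#        all made on the same out-of-sample window
# target: the realized class labels on that window
select_best_model <- function(preds, target) {
  mi <- sapply(preds, function(p) mutinformation(discretize(p), target))
  names(mi)[which.max(mi)]  # the model whose output says most about the outcome
}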

 
Mihail Marchukajtes:

It depends on the approach to the market in general. For example, after selecting predictors and training a model, you can apply the same metrics to the trained model's results. If several models have been trained, those metrics can then be used to pick the right one. That is largely the whole problem: after obtaining 10 models, you have to choose the one that will be best in the future. And that can be done by computing mutual information, or by building the same random forests, on the results of the obtained models.... IMHO

imo, that is too labor-intensive in a constantly changing market... more labor-intensive than trading by hand, for me personally

If we talk about efficiency, this approach is simply not efficient

And I'm not interested in data mining for the sake of data mining

 
Mihail Marchukajtes:

Sincerely put!!!! To continue the topic.... As you know, I started digging into R, and the most I managed was to compute the mutual information (MI) between each input and the output. BUT even that was enough to cut the number of inputs from 110 down to 20-30, while keeping the inputs that carry the maximum information about the output. As a result, the models began to pass my own tests more and more successfully. Let's see how it goes with the CB. The week will show.

But I think that the MI metric alone will not be enough here. I should also try to measure redundancy and reduce the number of columns further.

Maybe there are already ready-made functions for estimating the relevance of inputs to the output, besides mutual information????
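A rough sketch of the relevance-plus-redundancy idea quoted above (an mRMR-style filter) in R with the infotheo package, assuming the 110 inputs sit in a data frame X and the output in y (both placeholders):

library(infotheo)

Xd <- discretize(X)                                  # MI needs discretized data
relevance <- sapply(Xd, mutinformation, Y = y)       # MI of each input with the output

# redundancy: average MI of each input with all the other inputs
mi_mat <- sapply(Xd, function(a) sapply(Xd, mutinformation, Y = a))
redundancy <- (rowSums(mi_mat) - diag(mi_mat)) / (ncol(Xd) - 1)

score <- relevance - redundancy                      # informative but not duplicated
keep <- names(sort(score, decreasing = TRUE))[1:30]  # e.g. 110 -> 30 columns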

And I've written about it more than once in this thread.

Predictor selection algorithms that work WITHOUT the machine learning model give good results. The importance measures built into a model are part of that particular algorithm: they only tell you how the predictors were used inside that algorithm, not how important they are to the target variable.

The algorithms in caret are very effective; there are three of them. In general, you should take this package, because it has everything in one place: data mining (which is more than just predictor selection), a large set of models, and model selection and evaluation. If nothing else, caret can serve as a textbook on "what's out there".

I once made a review for myself; it may be useful.
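For reference, a minimal example of one of caret's selection tools, sbf (selection by filter): predictors are scored against the target before any final model is fit, which matches the "without the model" idea; rfe and gafs/safs are the wrapper-style alternatives. X and y are placeholders:

library(caret)

ctrl <- sbfControl(functions = rfSBF,   # univariate filter scores + rf as the outer model
                   method = "cv", number = 5)
fit <- sbf(X, y, sbfControl = ctrl)
predictors(fit)                         # the inputs that survived the filter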

 
SanSanych Fomenko:

I've written about it more than once in this thread.

Predictor selection algorithms that work WITHOUT the machine learning model give good results. The importance measures built into a model are part of that particular algorithm: they only tell you how the predictors were used inside that algorithm, not how important they are to the target variable.

Did a review for myself once, might be useful.

And if you think about it? With this approach you enter an endless loop of feature selection

and the examples you cited mostly build the models on trees, lol :)

 
Maxim Dmitrievsky:

imo, that is too labor-intensive in a constantly changing market... more labor-intensive than trading by hand, for me personally

And I'm not interested in data mining for the sake of data mining

If you leave aside the computing power spent and count only the time it takes to prepare for trading, quite an interesting picture emerges. Here is what I do.

On Saturday I spend 4-8 hours (a working day) building a model, and more than one, while leaving Friday as a piece of OOS to check whether the TS is in working condition. So Saturday goes to preparing for the next week. And you are absolutely right that the market changes too quickly, so building models on a 5-year window is simply stupid. As a rule, if a TS keeps working for 50% of the length of its training period, that is a decent result. Realizing that big models make no sense (the longer the training period, the worse the model), I choose a training window of two weeks, so that the TS can work for at least a week. As a result I get about 10 models, run them through all sorts of tests, and now all sorts of metrics.... I pick exactly those that passed and throw the whole thing onto the VPS and...... I'm free until next Saturday. The robot works on its own; I only monitor it with respect to order execution. So... I just keep an eye on it to make sure it isn't stuck. I used to have to log in every morning and set one parameter, but I've got rid of that problem and now look in only once every two or three days, and even then only if there have been no deals in that time; otherwise, to hell with it all. As a result I judge my work not from deal to deal but by the week: a week is either in the plus or in the minus; the main thing is that there are more plus weeks. But the fact is:

I spent 5 hours on Saturday so that for the whole next week I can walk around with my hands in my pockets, not think about the market, and teach students all sorts of computer wisdom. Sitting down to trade by hand has one drawback: you can sit in front of the monitor all day and lose, which costs not only money but also time. And as you know, time is a non-renewable resource.

The main thing I wanted to say: if you trade with a robot, try to spend as little time on the market as possible, so that if it fails, the loss can be covered by earning in the real sector (a job, a workshop, etc.).

There is NO point in building big models in an ever-changing market. However big the model, it will become outdated as quickly as a small one, while small models usually train better and are built faster.

Adaptive models that follow the market, where new data introduces corrections into the model's structure, do not live long either. Unless it is a self-training system that automatically retrains itself after a time interval, selects itself, and so on. That already smells of intelligence, and I think it is still far away. IMHO, naturally!!!!
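A rough outline of that weekly routine in R-flavored pseudocode; data, make_model, passes_tests and deploy are all hypothetical placeholders, not a real API:

# two-week training window, with Friday held out as OOS
train_set <- data[data$date >= saturday - 13 & data$date < friday, ]
oos_set   <- data[data$date == friday, ]

candidates <- lapply(1:10, function(i) make_model(train_set, seed = i))  # ~10 models
keep <- sapply(candidates, function(m) passes_tests(m, oos_set))         # tests/metrics

deploy(candidates[keep])   # onto the VPS until next Saturday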

 
Mihail Marchukajtes:

If we talk about adaptive models that follow the market, where new data introduces corrections into the model's structure, such models do not live long either. Unless it is a self-training system that automatically retrains itself after a time interval, selects itself, and so on. That already smells of intelligence, and I think it is still far away. IMHO, naturally!!!!

All of that has existed for a long time :) it works and is constantly retrained; then the results of its "activity" are approximated by a neural network, and those estimates are then used, with a certain probability, to make new decisions and correct them afterwards

At the very least, that approach is more logical for forex

Roughly speaking, such a system is constantly poking at different states, remembering what it has done, analyzing the consequences and making decisions based on its experience... some things it forgets, in some it improves... It's a kind of AI, and it trades almost like a real trader :) that is real learning, not the usual approximation we have had so far
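A toy illustration of that "poke, remember, forget" loop: tabular epsilon-greedy learning over discretized market states with a forgetting factor so old experience decays. Purely a sketch of the principle (reward_fn is a placeholder), not anyone's actual system:

n_states <- 100; n_actions <- 3                 # e.g. buy / sell / stay out
Q <- matrix(0, n_states, n_actions)             # remembered value of each action
eps <- 0.1; alpha <- 0.05; forget <- 0.999

step <- function(s, reward_fn) {
  a <- if (runif(1) < eps) sample(n_actions, 1) else which.max(Q[s, ])  # poke around
  r <- reward_fn(s, a)                          # consequence of the action
  Q[s, a] <<- Q[s, a] + alpha * (r - Q[s, a])   # learn from experience
  Q <<- Q * forget                              # slowly forget the old
  a
}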
 
SanSanych Fomenko:

I've written about it more than once in this thread.

Predictor selection algorithms that work WITHOUT the machine learning model give good results. The importance measures built into a model are part of that particular algorithm: they only tell you how the predictors were used inside that algorithm, not how important they are to the target variable.

The algorithms in caret are very effective; there are three of them. In general, you should take this package, because it has everything in one place: data mining (which is more than just predictor selection), a large set of models, and model selection and evaluation. If nothing else, caret can serve as a textbook on "what's out there".

I once made a review for myself; it may be useful.

Thank you! caret is installed; I will have a think. But the other day something dawned on me. I currently have about 110 inputs; that is the maximum I was able to formulate and assemble. I did it long ago, three or more years back, and I thought: what if these inputs are not as good as I believe they are? That led me to the idea of resuming the search for inputs for my TS!!!! Especially since it is much easier to do now with statistical metrics: first throw everything into one big pile, then sift it and keep only what is important by one criterion or another.

I got in touch with Denis from KD, and he more or less promised to help me get some more data of a completely different nature, but still related to the market. However, I think that taking data over a period of N bars is wrong in principle: that way we work on the time scale, while we earn on the price scale. So the market should be analyzed on the price scale (a profile), not the time scale. Denis promised to help with building a delta profile, and that data will be far more interesting for the AI than, say, the delta over N bars. Plus he also pulls the order book from CME, so it will be possible to get to OM, and together with volume that is already GOGOOOOOO!!!! Of course OM alone will not be decisive, but a 5-10% addition to the TS's performance will not hurt; sometimes it is exactly those percentages that are missing......
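A minimal sketch of what aggregating on the price scale instead of the time scale looks like in R: sum signed volume (delta) into price bins to get a delta profile. The trades data frame with price and delta columns, and the 0.0005 bin width, are assumptions for illustration:

bins <- cut(trades$price,
            breaks = seq(min(trades$price), max(trades$price) + 0.0005, by = 0.0005),
            include.lowest = TRUE)
delta_profile <- tapply(trades$delta, bins, sum)   # net delta at each price level
barplot(delta_profile, horiz = TRUE)               # the profile runs along price, not time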

 
Maxim Dmitrievsky:

And if you think about it? With this approach you enter an endless loop of feature selection

and the examples you cited mostly build the models on trees, lol :)

There is nothing for me to think about - this is a stage I have already passed, with a rather large archive of experimental results.

I'll repeat what I've written many times:

1. The target is ZZ (ZigZag).

2. I invented about 200 predictors for this target.

3. Out of those 200, 27 predictors were chosen using an "influence on the target" algorithm.

4. On every bar I select predictors from those 27. The number selected varies from 6-7 to 15 out of 27.

5. I fit rf. The fitting error is just under 30%.
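A minimal sketch of step 5 with the randomForest package; train, selected and the ZZ-based target column y are placeholders, and the per-bar re-selection of step 4 is omitted:

library(randomForest)

rf <- randomForest(x = train[, selected],   # the 6-15 predictors chosen for this bar
                   y = as.factor(train$y),  # ZZ-based class target
                   ntree = 500)
tail(rf$err.rate[, "OOB"], 1)               # OOB error, analogous to the ~30% cited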


No infinite loops. 30% is a very good result, but only in theory. I have not managed to build a practical Expert Advisor from that result; I had to add trend indicators. At present I am replacing the indicators (that stuff) with GARCH.
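And since GARCH came up: fitting one in R is short with the rugarch package. A standard GARCH(1,1) on a return series (returns is a placeholder); nothing here is specific to the setup above:

library(rugarch)

spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                   mean.model = list(armaOrder = c(1, 0)))
fit <- ugarchfit(spec, data = returns)
plot(sigma(fit), type = "l")   # conditional volatility path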

 
Maxim Dmitrievsky:

All of that has existed for a long time :) it works and is constantly retrained; then the results of its "activity" are approximated by a neural network, and those estimates are then used, with a certain probability, to make new decisions and correct them afterwards

At the very least, that approach is more logical for forex

Roughly speaking, such a system is constantly poking at different states, remembering what it has done, analyzing the consequences and making decisions based on its experience... some things it forgets, in some it improves... It's a kind of AI, and it trades almost like a real trader :)

That is the first option; the second is to build small models, without adaptation, for a relatively short period. A raid on the market, so to speak: come in, optimize, take a couple of good deals away from the crowd, and stay away until next time....

 

Predictors can be selected, extracted, or created. Don't forget that besides the so-called "noise" predictors there are also "noise" examples, which likewise need to be either relabeled or deleted (a rough sketch follows the articles below). You can read about all of this, and reproduce the examples, in the articles:

Deep Neural Networks (Part III). Selecting examples and dimensionality reduction

Deep Neural Networks (Part II). Predictor development and selection

Deep Neural Networks (Part I). Data preparation

Evaluation and choice of variables for machine learning models

Good luck
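A rough illustration (not taken from the articles above) of pruning "noise" examples: drop training rows whose out-of-bag votes strongly contradict their own label, using randomForest's votes matrix. X and the factor target y are placeholders, and the 0.3 cutoff is arbitrary:

library(randomForest)

rf <- randomForest(X, y, ntree = 500)
p_own <- rf$votes[cbind(seq_along(y), as.integer(y))]  # OOB support for each row's own label
clean <- p_own > 0.3                                   # keep rows not consistently misclassified
X2 <- X[clean, ]; y2 <- y[clean]                       # refit on the cleaned set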
