Machine learning in trading: theory, models, practice and algo-trading - page 1810

 
Evgeny Dyuka:

YES! that's right and nothing else.

I don't agree, but I'm not going to impose anything...

 
mytarmailS:

I don't agree, but I'm not going to impose anything...

You can spend three days on visually different instruments. In any case, we examine the series first and then draw conclusions. And it is always better to find some understanding in a test than to fix a loss on a real account.)

 
Valeriy Yastremskiy:

It is better to test without emotions))) And to trade, all the more so)))))))

To be honest, the topic of predictors has not been covered. Nor has the logic of models: which ones to apply when, and what the criterion for their selection is.

Recommendations on how to prepare the data say nothing about the result. Although without them nothing even begins)))))

The logic of models, selection criteria and data preparation are the key issues; no one will give you a working solution for them. If it has been published, it means it does not work.

It does not matter whether that is good or bad in human terms; you just have to accept that these are the rules of the topic we are sitting in.

 
Valeriy Yastremskiy:

It is better to test without emotions))) And to trade, all the more so)))))))

To be honest, the topic of predictors has not been covered. Nor has the logic of models: which ones to apply when, and what the criterion for their selection is.

Recommendations on how to prepare the data say nothing about the result. Although without them nothing even begins)))))

That it is not covered is no reproach to this thread; it is the current state of things, unfortunately. There are no works or conclusions on how to determine which model is better for a particular series, other than by comparing the results)

 
Evgeny Dyuka:

The logic of models, selection criteria and data preparation are the key issues; no one will give you a working solution for them. If it has been published, it means it does not work.


Well, probably not exactly so. There are simply mathematical methods; previously their use was not available to everyone, now it is. But there is no solution other than to look, choose and try. Maximum likelihood is a method, of course, but it is subjective, and the problem is the subjectivity of choosing the significant parameters for the analysis.

It is better to discuss logic, models and predictors, together with the features and logic of their application.

It makes no difference whether it works or not. It is a proven fact that it is not 100%. And even one such case is enough to blow the account)))) The main thing is the hands!!!!! Or the tail)))))

 
mytarmailS:

please...

200 or 300 in absolute values.

What ranges are you interested in?


Or maybe study R a little? ;)


5 lines of code, and you get what you want.

I think we need to look at the balance of errors (+1 for a correct entry and -1 for a wrong one), or at least normalize the balance in order to damp the outliers.
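
A minimal base-R sketch of such an error balance (the vectors and the normalization are illustrative assumptions, not the poster's code):

pred   <- c(1, 1, 0, 1, 0, 0, 1, 1)   # hypothetical model entries
actual <- c(1, 0, 0, 1, 1, 0, 1, 1)   # hypothetical true outcomes

score   <- ifelse(pred == actual, 1, -1)  # +1 correct entry, -1 wrong entry
balance <- cumsum(score)                  # running balance of errors

# normalize by trade count to damp single outliers
normalized <- balance / seq_along(score)
plot(normalized, type = "l")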

R is hard for me to learn: it is quite unlike MQL, I am far from being a programmer, and there is no good Help like MQL has.


I saw that you are interested in the efficiency of partitioning. This topic interests me too; in particular, I want to understand whether a breakdown can be made better than mine. I can prepare a sample of predictor values with and without discretization, so that you can check the efficiency of the package: if it learns better after the automatic breakdown than after my logical one, then the package is more effective than a human.

 
Aleksey Vyazmikin:

I think we need to look at the balance of errors (+1 for a correct entry and -1 for a wrong one), or at least normalize the balance in order to damp the outliers.

R is hard for me to learn: it is quite unlike MQL, I am far from being a programmer, and there is no good Help like MQL has.


I saw that you are interested in the efficiency of partitioning. This topic interests me too; in particular, I want to understand whether a breakdown can be made better than mine. I can prepare a sample of predictor values with and without discretization, so that you can check the efficiency of the package: if it learns better after the automatic breakdown than after my logical one, then the package is more effective than a human.

I'm not a programmer either; in fact, I started by studying C#, understood nothing and gave up, then I tried R and everything went fine :)


I'm sure it won't yield any gain in quality; on the contrary, quality will rather decrease; the main thing is that it does not decrease too much.

I need it to turn numeric variables with a range of thousands of values into categorical variables that have, say, only 20 levels.

I need it to generate rules that will repeat...

Why do I need all this? Forests work on the principle of voting: the output probability is the sum of the tree votes. It often happens that the algorithm shows a high probability but the prediction is bad, and sometimes the probability is high and the prediction is good. So I figure: if I know exactly which rules are involved in the voting at a given moment, I can distinguish the fair rules from the noisy ones...
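
A hedged illustration of that voting mechanics with the randomForest package (the data here are random placeholders; predict.all = TRUE exposes each tree's individual vote instead of only the aggregate):

library(randomForest)

set.seed(1)
X <- data.frame(matrix(rnorm(200 * 5), ncol = 5))  # placeholder predictors
y <- factor(sample(0:1, 200, replace = TRUE))      # placeholder target

rf <- randomForest(X, y, ntree = 100)

# per-tree votes instead of only the aggregated probability
pr    <- predict(rf, X, predict.all = TRUE)
votes <- pr$individual                 # matrix: rows = samples, cols = trees

mean(votes[1, ] == "1")                # share of trees voting "1" for sample 1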

 
mytarmailS:

I'm not a programmer either; in fact, I started by studying C#, understood nothing and gave up, then I tried R and everything went fine :)


I'm sure it won't yield any gain in quality; on the contrary, quality will rather decrease; the main thing is that it does not decrease too much.

I need it to turn numeric variables with a range of thousands of values into categorical variables that have, say, only 20 levels.

I need it to generate rules that will repeat...

Why do I need all this? Forests work on the principle of voting: the output probability is the sum of the tree votes. It often happens that the algorithm shows a high probability but the prediction is bad, and sometimes the probability is high and the prediction is good. So I figure: if I know exactly which rules are involved in the voting at a given moment, I can distinguish the fair rules from the noisy ones...

In my case discretization improves the result, and yes, my predictors are closer to categorical in meaning, almost all of them, with between 2 and 20 distinct values.
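
A minimal sketch of such a discretization in base R, assuming quantile-based breaks (the thread does not say which binning was actually used):

x <- rnorm(10000) * 1000                      # hypothetical predictor, thousands of values

# 20 quantile bins -> a categorical variable with ~20 levels
br    <- unique(quantile(x, probs = seq(0, 1, length.out = 21)))
x_cat <- cut(x, breaks = br, include.lowest = TRUE)

nlevels(x_cat)     # ~20
head(table(x_cat)) # roughly equal bin counts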

In fact, to evaluate a model like this you need to check the similarity of the activation points of the voting leaves, and remove or down-weight the leaves that are constantly activated at similar points of the sample. Such noisy trees fit the history well because of their excess memory.

Ideally, each leaf should carry one piece of meaning, and the neighbouring ones should add to it while describing something different: for example, one determines that there is a ball in front of us, another determines its colour, and together they classify the ball as belonging to a particular type of game. Simplified.

Decompose the forest into leaves with tree indices, look at the activation of each leaf on the sample, and then discard the garbage.
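
A hedged sketch of that decomposition with the randomForest package: nodes = TRUE returns, for every sample and every tree, the terminal leaf it lands in, so leaf activations can be compared across trees (data are placeholders):

library(randomForest)

set.seed(2)
X <- data.frame(matrix(rnorm(200 * 5), ncol = 5))  # placeholder predictors
y <- factor(sample(0:1, 200, replace = TRUE))      # placeholder target
rf <- randomForest(X, y, ntree = 50)

pr    <- predict(rf, X, nodes = TRUE)
leafs <- attr(pr, "nodes")             # matrix: rows = samples, cols = trees

# activation sets of the leaf that sample 1 hits in tree 1 and in tree 2;
# a high Jaccard overlap suggests duplicated, possibly noisy, rules
a1 <- leafs[, 1] == leafs[1, 1]
a2 <- leafs[, 2] == leafs[1, 2]
sum(a1 & a2) / sum(a1 | a2)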
 
mytarmailS:

Vladimir, could you please tell me how in R one can train an ML algorithm not for, say, classification or regression, but for something more vague...

I don't know what it should look like or what values it should take, and that is not important to me; I can only describe the anticipation and let the algorithm maximize the anticipation criterion in a function that it creates itself.

Or is this purely an optimization problem that has nothing to do with machine learning?


1. Any model requires optimization of its hyperparameters; with the defaults, the result will not be the best. When optimizing, set the criterion that matters to you. In all the examples in the literature these criteria are statistical metrics (Acc, F1, etc.). In our case these criteria do not always lead to the expected result in trading (strange as it may seem). For example, as the optimization criterion and the indicator of model performance I use the average reward per bar over a certain period of time (usually one week). While it is not below the minimum value (for example, 5 points on 4-digit quotes), we keep working. If it has fallen, we fine-tune the model on fresh data. I use only Bayesian optimization; it produces candidate variants.

The model must be constantly improved along the way, taking the changing market conditions into account. It is a great illusion that you can train a model on a huge range of past data and then use it for a long time without retraining.

2. Synthesizing a function you cannot specify in advance is a sort of "go I don't know where, fetch I don't know what". There are several packages that implement genetic programming; I don't have the exact names at hand at the moment. But this is a very hard area. Try it.

3. Discretization. The main purpose of discretization is to make the predictor-target relationship as linear as possible. There is, of course, some information loss. But in some cases it gives quite good results.

Good luck
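
A hedged sketch of point 1, assuming the rBayesianOptimization package; backtest() here is a hypothetical stand-in for training the model and measuring the average reward per bar over the evaluation week:

library(rBayesianOptimization)

# hypothetical stand-in: train with the given hyperparameters, run a
# one-week backtest, return the average reward per bar
backtest <- function(mtry, ntree) rnorm(1, mean = 5, sd = 2)

fit_score <- function(mtry, ntree) {
  reward_per_bar <- backtest(round(mtry), round(ntree))
  list(Score = reward_per_bar, Pred = 0)   # Score is what gets maximized
}

opt <- BayesianOptimization(
  fit_score,
  bounds      = list(mtry = c(2L, 10L), ntree = c(100L, 1000L)),
  init_points = 5,
  n_iter      = 20,
  acq         = "ei"
)
opt$Best_Par

The same score can then double as the weekly health check: keep trading while it stays above the chosen floor (e.g. 5 points on 4-digit quotes), and fine-tune on fresh data once it falls below.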

Link: Genetic programming (Генетическое программирование) - ru.qwe.wiki. ("In artificial intelligence, genetic programming (GP) is a technique whereby computer programs are encoded as a set of genes that are then modified (evolved) using an evolutionary algorithm, often a genetic algorithm, 'GA'...")
 
Discretization is nonsense; you can use regularization instead. Fine-tuning the model in the course of trading is also nonsense; it will not work.
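
For contrast with discretization, a minimal sketch of the regularization route using the glmnet package (placeholder data; L1 shrinkage is one common choice, not necessarily what the poster meant):

library(glmnet)

set.seed(3)
X <- matrix(rnorm(500 * 20), ncol = 20)        # placeholder numeric predictors
y <- factor(sample(0:1, 500, replace = TRUE))  # placeholder target

# L1 (lasso) regularization shrinks weak coefficients to zero instead of
# coarsening the predictors themselves
cv <- cv.glmnet(X, y, family = "binomial", alpha = 1)
coef(cv, s = "lambda.min")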