Machine learning in trading: theory, models, practice and algo-trading - page 3384

 
Andrey Dik #:

OpenCL is for the more or less advanced.

For the less advanced - run agents on separate charts. Each chart runs in its own thread, so between them all the processor cores get used.

Besides, the terminal's agents themselves can be used to parallelise applied calculations on a terminal chart - not many people know this.

In this article I will show how to write a binary GA in MQL5 that covers all significant digits of a double with an arbitrarily small parameter step (in practice it is limited to about 16 significant decimal digits for a double). And even that is not the limit: you can write extensions of the standard numeric types in MQL5.
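A quick sketch of the binary-coverage arithmetic behind that claim (in R, the language used for the examples later in this thread; the encode/decode helpers are hypothetical illustrations, not the MQL5 implementation the post refers to): with n bits per parameter the step over a range [lo, hi] is (hi - lo) / (2^n - 1), so enough bits cover all the significant digits a double can hold.

# minimal sketch: binary encoding of a double parameter for a GA (hypothetical helpers)
encode <- function(value, lo, hi, n_bits) {
  idx <- round((value - lo) / (hi - lo) * (2^n_bits - 1))  # map the value to an integer index
  as.integer(intToBits(idx))[1:n_bits]                     # index -> bit vector (LSB first, n_bits <= 31 here)
}
decode <- function(bits, lo, hi) {
  idx <- sum(bits * 2^(seq_along(bits) - 1))               # bit vector -> integer index
  lo + idx * (hi - lo) / (2^length(bits) - 1)              # index -> double; step = (hi - lo)/(2^n - 1)
}

bits <- encode(0.12345, lo = 0, hi = 1, n_bits = 24)
decode(bits, lo = 0, hi = 1)                               # ~0.12345, accurate to about 1/(2^24 - 1)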

That's what I'm saying - you'll end up reinventing the wheel, propped up with a lot of crutches.

And OpenCL is so good that MetaQuotes didn't write their GPU optimiser on it - probably only because any of us can easily write our own.

Okay, I'll stop, or I'll be banned like Maxim if I say everything I think about your demagogy.

 
Aleksey Nikolayev #:

That's what I'm saying - you'll end up reinventing the wheel, propped up with a lot of crutches.

And OpenCL is so good that MetaQuotes didn't write their GPU optimiser on it - probably only because any of us can easily write our own.

Okay, I'll stop, or I'll be banned like Maxim if I say everything I think about your demagogy.

So what exactly is the snag? If you have problems with MQL5, it's not the language's fault; there are dedicated threads for those who have questions.

What is my "demagogy"? At your request I gave you a huge reading list to broaden your horizons, and I describe and show specific implementations and search strategies in MQL5. What else do you need from me, so that I don't get slop thrown in my face and you don't have to be afraid of being banned?

I am very surprised by people.

 

A little bit about the redundancy of the random forest

take the iris dataset + train a forest + extract rules from the forest + create a dataset where each rule is a feature.

we get a matrix with rules in the columns (about 700 of them).

library(inTrees)   # rule extraction from tree ensembles
library(RRF)       # regularised random forest

X      <- iris[, -5]          # features
target <- iris[, "Species"]   # class labels

rules_dataset <- target |>
                  RRF(x = X) |>                         # train a random forest (RRF)
                  RF2List() |>                          # convert it to an inTrees tree list
                  extractRules(X = X) |>                # pull the rule conditions out of the trees
                  sapply(\(r) eval(str2expression(r)))  # evaluate each rule on X: one logical column per rule
ncol(rules_dataset)
[1] 698

Now find all linearly dependent rules and remove them as redundant.

# find rule columns that are linear combinations of the other columns
remove_lin_comb     <- caret::findLinearCombos(rules_dataset)$remove
clear_rules_dataset <- rules_dataset[, -remove_lin_comb]   # drop the redundant rules

and we get

ncol(clear_rules_dataset)
[1] 32
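A small caveat on this step (a defensive tweak, not part of the original snippet): if findLinearCombos() finds nothing to remove, $remove comes back empty, and negative indexing with an empty vector silently drops every column, so it is safer to guard the subsetting:

if (length(remove_lin_comb) > 0) {
  clear_rules_dataset <- rules_dataset[, -remove_lin_comb]  # drop only the redundant columns
} else {
  clear_rules_dataset <- rules_dataset                      # nothing redundant to drop
}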


The whole dataset can be described by 32 rules instead of 698.


That's the way it is...

The forest is 698/32 = 21.8125 times more redundant than it needs to be.

 
mytarmailS #:

A little bit about the redundancy of the random forest

take the iris dataset + train a forest + extract rules from the forest + create a dataset where each rule is a feature.

we get a matrix with rules in the columns (about 700 of them)

Now identify all linearly related rules and remove them as redundant.

and we get


The whole dataset can be described by 32 rules instead of 698.


That's the way it is.

The forest is 698/32 = 21.8125 times more redundant than it needs to be.

Where do the rules come from? That's right: mountains of information at the input are compressed to obtain rules, which are then used for prediction instead of the original information. That's why it's called a model.

 
СанСаныч Фоменко #:

Where do the rules come from? That's right: mountains of information at the input are compressed to obtain rules, which are then used for prediction instead of the original information. That's why it's called a model.

Read carefully what was written

 
mytarmailS #:
Didn't you want to write an article on rules, or have you changed your mind? It's probably an interesting topic, more interesting than minimising test functions. Or do you have problems with their validation on OOS? Or are there no problems and you are just too lazy to write it?
 
Some kind of general approach to rule selection. Say, here is how to break a tree down into rules - and what then, in the context of a TS? Best practices and insights. That would be interesting.

Just not random functions and random wolves, but something closer to profit.
 
Maxim Dmitrievsky #:
Some kind of general approach to rule selection. Say, here is how to break a tree down into rules - and what then, in the context of a TS? Best practices and insights. That would be interesting.

Just not random functions and random wolves, but something closer to profit.

Isn't "closer to the profit" synonymous with "overtraining"?
We get a nice even balance on random profit, since the basis is a random incremental value. And where does the beauty of balance come from?

Balance is the evaluation of TS in the terminal, where this balance is influenced not only by classification error.

But if we stay within the MOE, then the valuation is NOT the profile
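A minimal sketch of that effect (a hypothetical R illustration, not anyone's actual TS): generate pure-noise increments, try many random long/flat rules, and keep the one with the best in-sample result - the selected balance curve trends up convincingly even though there is nothing to learn.

set.seed(7)
inc <- rnorm(1000)                         # pure-noise "price increments"
best_final <- -Inf; best_curve <- NULL
for (i in 1:500) {                         # 500 random long/flat "rules"
  sig   <- sample(c(0L, 1L), 1000, replace = TRUE)
  curve <- cumsum(sig * inc)               # in-sample "balance" of this rule
  if (tail(curve, 1) > best_final) {       # keep the best-looking one
    best_final <- tail(curve, 1)
    best_curve <- curve
  }
}
plot(best_curve, type = "l")               # a solid-looking balance built entirely on noise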

 
СанСаныч Фоменко #:

Isn't "closer to profit" synonymous with "overtraining"?
We get a beautiful even balance on a random profit, because the basis is a random increment value. Where does the beauty of the balance come from?

Balance is the evaluation of TS in the terminal, where this balance is influenced not only by classification error.

And if we stay within the MO, then the evaluation is NOT a profit

Closer to profit means closer to the quotes, not training on something meaningless. There are plenty of such tests on the internet, and the peculiarities of the different ML methods have been known for a long time - what is worse and what is better.

I just don't understand where rule extraction fits in the hierarchy.
 
Maxim Dmitrievsky #:
Didn't you want to write an article on rules, or have you changed your mind? It's probably an interesting topic, more interesting than minimising test functions. Or do you have problems with their validation on OOS? Or are there no problems and you are just too lazy to write it?
I don't know, there's nothing to write.
Suppose I write how to break a tree-based model into rules - so what?
In fact, my post has already shown everything.

Or are you referring to my old post? If so, I didn't find any miraculous properties in the splitting itself, but there are advantages that an unsplit model cannot give.

1. You can drastically reduce the dimensionality of the model.



2. You can know the statistics of each rule (this is really important).

For example, we have a tree-based model with 100 rules, and we never know whether each of the 100 rules fired once (there is no pattern) or whether 10 rules fired 50 times each (there is a pattern).
If we don't break the model apart, we won't know, and both models will look the same to us.
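A minimal sketch of that second point, reusing the rules_dataset matrix built above (one logical column per rule, TRUE where the rule fires on a row): counting how often each rule fires shows whether the model rests on a few heavily used rules or on many one-off ones. inTrees also provides getRuleMetric() to compute frequency and error per rule directly from the extractRules() output.

rule_hits <- colSums(rules_dataset)                  # how many rows each rule covers
summary(rule_hits)                                   # distribution of rule usage
head(sort(rule_hits, decreasing = TRUE), 10)         # the ten most frequently firing rules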
