Machine learning in trading: theory, models, practice and algo-trading - page 1034

 
Maxim Dmitrievsky:

So you decided to cram in the uncrammable and multiply predictors for no good reason.

Where did you get so many predictors from? What is their importance? A third of them did not even make it into the forest during training, and 95% have low importance. And what response time does the system have now with so many predictors, 3 hours for 1 forecast? )

You won't know until you build it.

The result in itself: this is feasible with Apache Spark.

I don't use more than 1000 predictors now. But fewer than 100 predictors is not enough.

 
Roffild:

You won't know until you build it...

The result in itself: this is feasible with Apache Spark.

I don't use more than 1000 predictors now. But fewer than 100 predictors is not enough.

Are you deliberately ignoring the questions?

I repeat: where do so many predictors come from?

 
Maxim Dmitrievsky:

Are you deliberately ignoring the questions?

I repeat: where do so many predictors come from?

You could make even a billion predictors: momentum with the period stepped by one, and so on )

But they will not add any new information.
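A minimal sketch of what "momentum in increments of one" could look like, and of why neighbouring lags overlap so heavily. The function name, the `mom_N` naming scheme and the prices are made up for illustration; they are not taken from anyone's actual script in this thread.

```python
def momentum_predictors(close, max_lag):
    """Momentum of the latest bar over lags 1..max_lag.

    close[0] is the most recent bar (series-style indexing).
    The 'mom_N' names are a hypothetical naming scheme.
    """
    if max_lag >= len(close):
        raise ValueError("need more than max_lag bars")
    return {f"mom_{lag}": close[0] - close[lag] for lag in range(1, max_lag + 1)}


closes = [1.1050, 1.1042, 1.1047, 1.1031, 1.1025, 1.1030]  # made-up prices
features = momentum_predictors(closes, max_lag=5)

# Each extra lag only adds one more bar-to-bar increment to the previous one:
# mom_2 = mom_1 + (close[1] - close[2])
increment = closes[1] - closes[2]
print(len(features), features["mom_2"] - features["mom_1"] - increment)
```

Since every new lag differs from the previous one by a single increment, the count of such predictors can grow without limit while the genuinely new information per predictor stays small. Whether a forest can still benefit from them is the point under dispute here.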

 
Maxim Dmitrievsky:

Are you deliberately ignoring the questions?

I repeat: where do so many predictors come from?

Most of these predictors are the result of a comparison such as bool(high[0] > high[1]), so they take the value 0 or 1.

Of course, I generate the list of predictors with a script.

There is no point in hand-picking predictors, because it is easy to exclude an important one by mistake.
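A rough sketch of how a script might generate 0/1 comparison predictors of the bool(high[0] > high[1]) kind. The actual script is not shown in the thread, so the function, the choice of fields, the `depth` parameter and the prices below are all assumptions made for illustration.

```python
from itertools import combinations


def boolean_predictors(bars, fields=("open", "high", "low", "close"), depth=3):
    """Compare every pair of (field, shift) values; each comparison is a 0/1 predictor.

    bars maps a field name to a list of prices, index 0 being the newest bar.
    Hypothetical helper, not the script used by the poster.
    """
    points = [(f, s) for f in fields for s in range(depth)]
    out = {}
    for (f1, s1), (f2, s2) in combinations(points, 2):
        out[f"{f1}[{s1}]>{f2}[{s2}]"] = int(bars[f1][s1] > bars[f2][s2])
    return out


bars = {  # made-up prices, newest bar first
    "open": [1.1040, 1.1050, 1.1045],
    "high": [1.1060, 1.1055, 1.1049],
    "low": [1.1035, 1.1041, 1.1040],
    "close": [1.1052, 1.1040, 1.1050],
}
predictors = boolean_predictors(bars)
print(len(predictors), predictors["high[0]>high[1]"])
```

With n price points there are n(n-1)/2 pairwise comparisons, so 4 fields over 3 bars already give 66 predictors, and the count grows quadratically with the look-back depth.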

 
Roffild:

Most of these predictors are the result of a comparison such as bool(high[0] > high[1]), so they take the value 0 or 1.

Of course, I generate the list of predictors with a script.

There is no point in hand-picking predictors, because it is easy to exclude an important one by mistake.

You should win not through quantity but by transforming a few original predictors until the classes are well separable, with the error checked out of sample (OOS).
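One possible reading of "error control on the OOS", as a minimal sketch: fit on the earlier part of the data and measure the error only on the later, unseen part. The one-feature threshold classifier, the 70/30 split and the data are stand-ins chosen for brevity. They are not the method anyone in this thread actually uses.

```python
def chronological_split(rows, train_fraction=0.7):
    """Split time-ordered rows into in-sample and out-of-sample parts, without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def error_rate(rows, threshold):
    """Share of rows where the rule 'predict 1 when x > threshold' is wrong."""
    wrong = sum(int(x > threshold) != y for x, y in rows)
    return wrong / len(rows)


def fit_threshold(rows):
    """Pick the threshold with the lowest in-sample error (toy stand-in for a model)."""
    best_t, best_err = None, float("inf")
    for t in sorted({x for x, _ in rows}):
        err = error_rate(rows, t)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


# Made-up (transformed feature, class label) pairs in time order.
rows = [(0.1, 0), (0.4, 0), (0.35, 0), (0.8, 1), (0.7, 1),
        (0.2, 0), (0.9, 1), (0.6, 1), (0.3, 0), (0.75, 1)]

in_sample, out_of_sample = chronological_split(rows)
threshold = fit_threshold(in_sample)
oos_error = error_rate(out_of_sample, threshold)
print(threshold, oos_error)
```

The idea is to keep a transformation only if the out-of-sample error stays low. A low in-sample error on its own says nothing about overfitting.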

 
forexman77:

You could make even a billion predictors: momentum with the period stepped by one, and so on )

But they will not add any new information.

If a predictor does not end up in the forest, no harm is done. It may still show up in another version of the random forest.

We are discussing an array of 7000 doubles, which takes up little RAM and is traversed in a matter of nanoseconds. There is no noticeable slowdown in interpreting 500 trees with 7000 predictors. If you don't believe me, install Spark and check it yourself.
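The memory side of this is simple arithmetic: one row of 7000 double-precision values is 7000 × 8 bytes. The snippet below checks only that payload size with Python's standard `array` module. It says nothing about traversal time or about how long it takes to evaluate 500 trees.

```python
from array import array

N_PREDICTORS = 7000  # figure quoted in the thread

# 'd' is a C double; itemsize is 8 bytes on common platforms.
row = array("d", [0.0] * N_PREDICTORS)
payload_bytes = row.itemsize * len(row)

print(payload_bytes, round(payload_bytes / 1024, 1))  # bytes and KiB of raw data
```

That is about 55 KiB of raw data per forecast row, which supports the "little RAM" part of the claim.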

 
Maxim Dmitrievsky:

You should win not through quantity but by transforming a few original predictors until the classes are well separable, with the error checked out of sample (OOS).

Evaluating the quality of the forest is a completely different topic.
 
Roffild:

Evaluating the quality of the forest is a completely different topic.

The topic is the effectiveness of the approach, and the one you propose is, to put it bluntly, not effective.

 
Maxim Dmitrievsky:

The topic is the effectiveness of the approach, and the one you propose is, to put it bluntly, not effective.

The effectiveness still needs to be proven by practical results. Perhaps my method of evaluation for price charts does not coincide with the classical quality indicators, but in the end everything is decided by profit.

I was answering the question "why do we need Spark?" I answered it. Or are you going to accuse me again of ignoring questions?

 
Roffild:

The effectiveness still needs to be proven with practical results. Perhaps my method of evaluation for price charts does not coincide with the classical indicators of quality, but in the end everything is decided by the profit.

I was answering the question "why do we need Spark?" I answered it. Or are you going to accuse me of ignoring questions again?

Spark has long been clear to me; I did not ask about it. I asked about the idea. This approach with Spark is simply no good because of the inefficient way of training and the computing power it requires.

The same can be done through optimization in the MT5 cloud, without any forests. I don't know what the output is or whether there is any profit, but in theory there is none, and this algorithm will always fail because of overfitting.

IMHO
