Machine learning in trading: theory, models, practice and algo-trading - page 1038

 
Roffild:
What does this have to do with the "threshold" for a random forest?

I don't remember exactly what threshold was meant anymore; I mean your threshold for entering a trade, something like 0.75 (i.e. a probability).

 
Roffild:
What does the "threshold" have to do with random forest?

In the case of logit regression I can imagine what the probabilities of assignment to one class or another mean; in the case of the forest, alas, I can't. So these are most likely pseudo-probabilities, and it shouldn't work that way: a threshold of 0.75 shouldn't mean that the probability of assignment to a class is higher than at 0.6, for example.

At least I haven't read anything confirming that it does.
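A minimal sketch of the point under discussion, using scikit-learn (not the AlgLib or Spark implementations mentioned in this thread): a random forest's "probability" is essentially a vote share averaged over the trees, not a calibrated probability, and raising the entry threshold simply demands a more unanimous committee. All names below are standard scikit-learn API; the "trade signal" interpretation is this thread's, not the library's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Averaged per-tree estimates (each tree reports its leaf's class fraction):
proba = forest.predict_proba(X)[:, 1]

# Hard-vote share across the 100 trees -- close to, but not identical to,
# the averaged predict_proba above:
votes = np.mean([t.predict(X) for t in forest.estimators_], axis=0)

# "Entering a trade" only when the vote share clears a stricter threshold:
threshold = 0.75
signal = proba >= threshold
print(signal.sum(), (proba >= 0.5).sum())  # stricter threshold -> fewer signals
```

Whether a 0.75 vote share is actually "more probable" than a 0.6 one is exactly the calibration question raised above; the raw vote fraction makes no such guarantee.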
 

And I treat "probability" as an important part of the random forest algorithm, because the formula for combining the results from all the trees is based on it.

I even chose the number of trees with this "probability" in mind.

 
Roffild:

And I treat "probability" as an important part of the random forest algorithm, because the formula for combining the results from all the trees is based on it.

I even chose the number of trees with this "probability" in mind.

Do you take into account how much of the sample each leaf covers? That is, the size of the committee actually voting on a given situation, and how likely each leaf of such a tree is to be competent?

 
Aleksey Vyazmikin:

Do you take into account how much of the sample each leaf covers? That is, the size of the committee actually voting on a given situation, and how likely each leaf of such a tree is to be competent?

Each terminal leaf is responsible for at least 25 samples of the training set. This is set in Spark's parameters. AlgLib has no such parameter.
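The "at least 25 training samples per terminal leaf" constraint mentioned above corresponds to Spark MLlib's `minInstancesPerNode` parameter; scikit-learn's equivalent knob is `min_samples_leaf`, used here as an illustrative sketch (AlgLib, as noted, exposes no such setting):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Forbid leaves that cover fewer than 25 training samples:
tree = DecisionTreeClassifier(min_samples_leaf=25, random_state=0).fit(X, y)

# Verify: count how many training samples land in each leaf.
leaf_ids = tree.apply(X)
counts = np.bincount(leaf_ids)
leaf_counts = counts[np.unique(leaf_ids)]
print(leaf_counts.min())  # no leaf covers fewer than 25 samples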
 
Roffild:
Each terminal leaf is responsible for at least 25 samples of the training set. This is set in Spark's parameters. AlgLib has no such parameter.

Perhaps I didn't phrase it well.

Suppose we have 100 trees, and each tree's leaf makes a classification (in the simple case, between 2 classes). Do we take into account that the vote may include trees with a very small margin, for example 49/51, which will significantly distort the averaged forecast? Maybe such leaves should be excluded from the vote altogether, since their lack of predictive ability says more about a poor leaf model on the specific data.
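A hypothetical sketch of the filtering idea proposed above: keep only the per-tree estimates whose margin from 50/50 exceeds a cutoff, and average only those confident votes. This is an illustration of the suggestion, not a standard option in scikit-learn, Spark, or AlgLib; the `margin_cut` parameter is invented for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=1)
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

# Per-tree class-1 estimate for one observation (each tree's leaf fraction):
x = X[:1]
per_tree = np.array([t.predict_proba(x)[0, 1] for t in forest.estimators_])

# Drop near-49/51 leaves from the committee (hypothetical cutoff):
margin_cut = 0.1
confident = per_tree[np.abs(per_tree - 0.5) > margin_cut]

plain_vote = per_tree.mean()
filtered_vote = confident.mean() if confident.size else plain_vote
print(round(plain_vote, 3), round(filtered_vote, 3))
```

Note that fully grown trees often produce near-0 or near-1 leaf fractions, so whether this filter changes anything depends on how deep the trees are and on the leaf-size constraint discussed earlier.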

 
New rating of programming languages, including Python and R
 
SanSanych Fomenko:
New ranking of programming languages, including Python and R
Good material. But your own conclusions about R and Python are far-fetched. You can't compare them by your criteria at all; it's like comparing warm with soft.
 
Yuriy Asaulenko:
Good material. But your own conclusions about R and Python are far-fetched. You can't compare them by your criteria at all; it's like comparing warm with soft.

Can you be more specific?

I'm comparing the languages' reference documentation.

How else would you do it?

 
SanSanych Fomenko:

Can you be more specific?

I'm comparing the languages' reference documentation.

How else would you do it?

Quantitatively there are more cars than tractors, but comparing them to each other is nonsense.
As for help, it is extensive in both. Their modules partially overlap.
There is a Python port in R, I think, and an R port in Python.
If you have a complex task, then working in any environment and in absolutely any language you will somehow have to port something from outside. Like it or not, you will also need a truck, a tractor and an excavator, even if they rank 30th in applicability.