Discussion of article "Advanced resampling and selection of CatBoost models by brute-force method" - page 7

 
Maxim Dmitrievsky

In that notebook, only this code block gives an error


pr = get_prices(look_back=LOOK_BACK)
pr = add_labels(pr, 10, 25)
rep = tester(pr, MARKUP)
plt.plot(rep)
plt.show()


ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

What could be the reason?

[Deleted]  
Evgeni Gavrilovi:

In that notebook, only this code block gives an error


pr = get_prices(look_back=LOOK_BACK)
pr = add_labels(pr, 10, 25)
rep = tester(pr, MARKUP)
plt.plot(rep)
plt.show()


ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

What could be the reason?

The dataframe is empty.

Check whether the quotes were received.
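A minimal sketch of that check, reusing the notebook's get_prices and LOOK_BACK: if the terminal returns no quotes, add_labels is handed an empty dataframe and fails with exactly this ValueError.

pr = get_prices(look_back=LOOK_BACK)
# an empty frame here means no quotes came back from the terminal
assert not pr.empty, 'dataframe is empty: check that quotes are received'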

[Deleted]  
elibrarius:
You try it. It won't take long. Wouldn't it be interesting to test it in an experiment? Breiman didn't do it in his random forest.

It's slow. I'll try it later.

 
Maxim Dmitrievsky:

It's slow. I'll try it later.

It'll be interesting to see the result. I think we can split the test period in half: half for the test and half for the exam. Or add a couple of years.
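A minimal sketch of that split, reusing the notebook's pr dataframe (the test/exam names follow the article's convention):

half = len(pr) // 2
test = pr.iloc[:half]   # first half: test sample
exam = pr.iloc[half:]   # second half: exam sample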
 
Maxim Dmitrievsky:

The dataframe is empty.

Check whether the quotes were received.

That's right, I didn't pay attention to the fact that the broker appends an "m" to the EURUSD pair: EURUSDm.
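For what it's worth, a minimal sanity check, assuming quotes are pulled through the MetaTrader5 Python package as in the article (the SYMBOL constant here is ours):

import MetaTrader5 as mt5

SYMBOL = 'EURUSDm'  # broker-specific suffix 'm', not plain 'EURUSD'
mt5.initialize()
if mt5.symbol_info(SYMBOL) is None:
    print('symbol not found: check the broker-specific suffix')
mt5.shutdown()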

[Deleted]  
elibrarius:
It will be interesting to see the result. I think we can split the test period in half: half for the test and half for the exam. Or add a couple of years.

I've done something like that before, stacking forests. Actually, it didn't give anything remarkable.

I doubt it in this case, too. But I'll check later.

 
Maxim Dmitrievsky:

I've done something like this before, stacking forests. As a matter of fact, it did not give anything remarkable.

I doubt it in this case, too. But I'll check it later.

I agree, in a forest the averaging of the best results is built in from the start. But it doesn't hurt to check)

 
Valeriy Yastremskiy:

I agree, in a forest the averaging of the best results is built in from the start. But it doesn't hurt to check)

No, all of them.

And it's called a random forest because all the random trees are summed.
If it took only the best ones, it would be called a best forest, not a random forest. )))
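A small sketch of that point using scikit-learn (illustration only, not the article's code): the forest's prediction is the average over all of its trees, not just the best ones.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# averaging every tree's class probabilities reproduces the forest output
avg = np.mean([tree.predict_proba(X) for tree in rf.estimators_], axis=0)
assert np.allclose(avg, rf.predict_proba(X))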

 
elibrarius:

No, all of them.

And it's called a random forest because all the random trees are summed.
If it took only the best ones, it would be called a best forest, not a random forest. )))

Apparently we have different ideas about random boosting. A decision tree is about features selected from a random set. The point is that the sets are random, but the selection/sorting into bad and good ones was there from the start. It's like Buffon's needle: throwing a needle, measuring angles and calculating pi)

From the wiki:

  1. Let's build a decision tree that classifies the samples of the given subsample; during the creation of the next node of the tree we will choose the set of features on the basis of which the partitioning is performed (not from all M features, but only from m randomly chosen ones). The selection of the best of these m features can be done in different ways. The original Breiman code uses the Gini criterion, which is also used in the CART decision tree algorithm. Some implementations of the algorithm use the information gain criterion instead. [3]
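An illustrative sketch of the step quoted above (all names here are ours, not from the article): at each node only m of the M features are considered, and the best split among them is chosen by the Gini criterion.

import numpy as np

def gini(y):
    # Gini impurity of a label vector
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y, m, rng):
    # pick the best (feature, threshold) among m randomly chosen features
    features = rng.choice(X.shape[1], size=m, replace=False)  # m of M
    best = (None, None, np.inf)
    for f in features:
        for t in np.unique(X[:, f])[:-1]:  # drop max so the right side is never empty
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            # weighted Gini impurity of the two children
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (f, t, score)
    return best

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 10)), rng.integers(0, 2, 100)
print(best_split(X, y, m=3, rng=rng))  # (feature, threshold, impurity)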
[Deleted]  
Valeriy Yastremskiy:

Apparently we have different ideas about random boosting. A decision tree is about features selected from a random set. The point is that the sets are random, but the selection/sorting into bad and good ones was there from the start. It's like Buffon's needle: throwing a needle, measuring angles and calculating pi)

From the wiki:

  1. Let's build a decision tree that classifies the samples of the given subsample; during the creation of the next node of the tree we will choose the set of features on the basis of which the partitioning is performed (not from all M features, but only from m randomly chosen ones). The selection of the best of these m features can be done in different ways. The original Breiman code uses the Gini criterion, which is also used in the CART decision tree algorithm. Some implementations of the algorithm use the information gain criterion instead. [3]

Yes, there are many trees, but each one tries to learn as well as it can on different features. That is not the same as combining multiple forests (including bad ones)