Machine learning in trading: theory, models, practice and algo-trading - page 96

 

The iris petal data is not a signal, so this table is not suitable for testing ForeCA at all. The package only suits time series, where new values arrive at regular intervals and you combine them into a vector. For this reason you can't change the order of the rows in the data table for ForeCA, and you can't randomly hold out some of the rows for validation: everything must stay in strict order, first the data for training, then the data for validation. No sampling.

With iris, the best you can do is use the maximum number of components, min(dim(forec.dt)) = 14, but I think the accuracy will still be below 100%.
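The "strict order, no sampling" rule can be sketched as follows. This is only an illustration in Python rather than the R used in the thread; the helper name `chronological_split` and the toy series are hypothetical, not part of ForeCA.

```python
def chronological_split(rows, n_validation):
    """Split an ordered sequence into (train, validation) without shuffling.

    The last n_validation rows become the validation set, so every
    training row is older than every validation row.
    """
    if not 0 <= n_validation <= len(rows):
        raise ValueError("n_validation must be between 0 and len(rows)")
    cut = len(rows) - n_validation
    return rows[:cut], rows[cut:]


# Hypothetical series of 10 observations, oldest first.
series = list(range(10))
train, validation = chronological_split(series, n_validation=3)
print(train)       # [0, 1, 2, 3, 4, 5, 6]
print(validation)  # [7, 8, 9]
```

The row order is never touched, which is the property a time-series method needs; a random hold-out would mix future rows into the training part.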

 
Dr.Trader:

With iris, the best you can do is use the maximum number of components, min(dim(forec.dt)) = 14, but I think the accuracy will still be below 100%.

I tried it both ways: accuracy was about 85%, while a plain random forest gave 95%.
 
Dr.Trader:

The iris petal data is not a signal, so this table is not suitable for testing ForeCA at all. The package only suits time series, where new values arrive at regular intervals and you combine them into a vector. For this reason you can't change the order of the rows in the data table for ForeCA, and you can't randomly hold out some of the rows for validation: everything must stay in strict order, first the data for training, then the data for validation. No sampling.

With iris, the best you can do is use the maximum number of components, min(dim(forec.dt)) = 14, but I think the accuracy will still be below 100%.

I think the post about iris is very important.

The point is that RF is phenomenally prone to overfitting.

And here it turns out that ForeCA has no such tendency. So it's a very useful package.

 
Dr.Trader:


What are your results with BP there?
 
SanSanych Fomenko:

I think the post about iris is very important.

The point is that RF is phenomenally prone to overfitting.

And here it turns out that ForeCA has no such tendency. So it's a very useful package.

Even though the forest overfits, if you add 10 more columns of random values to the 4 iris predictors, it still predicts new data with almost 100% accuracy. I'm surprised, and glad the forest did so well. I haven't run such an experiment myself before; I'll keep it in mind for the future.
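For anyone who wants to repeat the experiment, here is a rough sketch. It uses Python with scikit-learn instead of the R packages from the thread, and the noise distribution, the 70/30 split, the number of trees and the seed are all assumptions of mine, so the exact accuracy will differ from the figures quoted above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def forest_accuracy_with_noise(n_noise=10, seed=0):
    """Train a random forest on iris plus n_noise random columns.

    Returns the accuracy on a held-out 30% of the rows.
    """
    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(seed)
    # Assumption: the noise columns are standard normal values.
    noise = rng.normal(size=(X.shape[0], n_noise))
    X_noisy = np.hstack([X, noise])  # 4 real predictors + n_noise random ones

    # Iris is not a time series, so a random stratified hold-out is fine here.
    X_train, X_test, y_train, y_test = train_test_split(
        X_noisy, y, test_size=0.3, stratify=y, random_state=seed
    )
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    return forest.score(X_test, y_test)


accuracy = forest_accuracy_with_noise()
print(f"held-out accuracy with 10 noise columns: {accuracy:.3f}")
```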

ForeCA, in turn, rated all the predictors as noise with predictability of about 1% (both the petal lengths and the random-value predictors) and tried to extract some signal from all of it. Extracting a signal from where there shouldn't be one is pointless in my opinion, so this experiment says nothing about ForeCA.

mytarmailS:
What are your results there with BP ?

The model is still training. I probably fed it too much data, but I don't want to cancel it, so I'll let it run to the end. I'll write about the results later, when it finishes.

 
Of course, I don't want to get ahead of myself, but Reshetov made such a cool thing in the new release... He'll figure out your problems in no time. I gave him the idea, but he was already thinking about it himself; fools think alike, as they say, and the result is a powerful thing. You shouldn't be criticizing him...
 
Mihail Marchukajtes:
I certainly don't want to get ahead of myself, but Reshetov made such a cool thing in the new release... You shouldn't be criticizing him...

Cool talk about cool stuff...

And will we see at least one comparison with something generally accepted, well known and recognized?

 
SanSanych Fomenko:

Cool talk about cool stuff...

And will we see at least one comparison with something generally accepted, well known and recognized?

Someday you will, why not.....
 
Dr.Trader:

Even though the forest overfits, if you add 10 more columns of random values to the 4 iris predictors, it still predicts new data with almost 100% accuracy. I'm surprised, and glad the forest did so well. I haven't run such an experiment myself before; I'll keep it in mind for the future.

Yes, I'm surprised myself that it ignored the noise so brilliantly and told it apart from the real predictors. I had never tried that either, I was just curious...

Until today I had absolutely no confidence in the importance function,

but this made me believe.

 
Keep not trusting importance when you use it for forex. Iris is very simple data, with direct relationships between the available data and the classes. It's enough for RF to find a minimal set of predictors that determine the iris classes, and it's done.
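To see why iris is such an easy case for importance, one can compare the total importance of the 4 real predictors with that of the added random columns. This is a sketch in Python with scikit-learn, not the R importance function discussed above; the noise distribution, number of trees and seed are my own assumptions, and it says nothing about how importance behaves on forex data.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def importance_real_vs_noise(n_noise=10, seed=0):
    """Return (importance of the 4 iris predictors, importance of the noise).

    Both values are sums of the forest's impurity-based feature
    importances, which scikit-learn normalizes to add up to 1.
    """
    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(seed)
    # Assumption: the noise columns are standard normal values.
    noise = rng.normal(size=(X.shape[0], n_noise))
    X_noisy = np.hstack([X, noise])

    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_noisy, y)

    importances = forest.feature_importances_
    n_real = X.shape[1]
    return importances[:n_real].sum(), importances[n_real:].sum()


real_share, noise_share = importance_real_vs_noise()
print(f"real predictors: {real_share:.3f}, noise columns: {noise_share:.3f}")
```

On iris the real predictors should take most of the importance, precisely because the classes depend on them directly; that clean separation is not something to expect from market data.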