Machine learning in trading: theory, models, practice and algo-trading - page 2589

 
mytarmailS #:
https://stats.stackexchange.com/questions/31513/new-revolutionary-way-of-data-mining

Some very interesting ideas are touched on in this question...

By the way, the people answering it still haven't grasped the essence of the question.

When choosing a model, it is suggested to optimize not by the profit on OOS, but by the ratio of that profit to the profit on the train (in-sample). Or to throw out models where this ratio is small and, from the remaining ones, take the one with the maximum profit on OOS. That is, if you read the quotes literally, without reading anything into them.

 
Aleksey Nikolayev #:

When choosing a model, it is suggested to optimize not by the profit on OOS, but by the ratio of that profit to the profit on the train (in-sample). Or to throw out models where this ratio is small and, from the remaining ones, take the one with the maximum profit on OOS. That is, if you read the quotes literally, without reading anything into them.

Aleksey, can you show me the part of the quote that talks about profit, maximum profit, throwing out models....

Because so far it sounds like pure guesswork on your part, and yet you present it as literal, without reading anything in.
 
Aleksey Nikolayev #:

When choosing a model, it is suggested to optimize not by the profit on OOS, but by the ratio of that profit to the profit on the train (in-sample). Or to throw out models where this ratio is small and, from the remaining ones, take the one with the maximum profit on OOS. That is, if you read the quotes literally, without reading anything into them.

Take my earlier example with coins and 10,000 people. Let heads be 1 and tails 0. If we act on this algorithm, we still won't get anything; that is clear for the context described. In other words, if we have come across some edge, it doesn't matter much whether we take the profit ratio between IS and OOS or something else, and if there is no edge, none of these methods will work.


Exactly! First we need to assess whether an edge is present at all, and only then think about how to select. For example, we can do the following: on IS we look at the share of models whose value of some metric is above a certain threshold - say, 45% of models have a win rate above 55%. Ranking by win rate, we take some TOP. Then we check the OOS results for this TOP. If, among the selected models, the share giving a win rate above 55% on OOS is the same 45% (the ratio of models reaching that win rate on OOS to all models selected), then I think this group of models can safely be thrown out. If, on the other hand, selecting the TOP visibly works, it means there is an edge, and by how strongly the effect is expressed we can estimate the quality of the pattern. Once we decide it is strong enough, the rest is a matter of technique: select even by the same win rate or PF directly on the IS, without bothering with complicated metrics and logic.
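
A rough sketch of that check in Python (my own illustration, not anything from the thread; the threshold, the TOP fraction and all the names are made up):

import numpy as np

def top_selection_works(is_winrate, oos_winrate, threshold=0.55, top_frac=0.10):
    is_winrate = np.asarray(is_winrate)
    oos_winrate = np.asarray(oos_winrate)

    # Share of models clearing the threshold on IS (the "45%" in the example).
    is_share = np.mean(is_winrate > threshold)

    # TOP of models ranked by IS win rate.
    n_top = max(1, int(len(is_winrate) * top_frac))
    top_idx = np.argsort(is_winrate)[::-1][:n_top]

    # Share of that TOP which also clears the threshold on OOS.
    top_oos_share = np.mean(oos_winrate[top_idx] > threshold)

    # If the TOP is no better than the base share, IS selection tells us
    # nothing about OOS -> no edge in this group of models.
    return is_share, top_oos_share, top_oos_share > is_share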

 
mytarmailS #:
Aleksey, can you show me the part of the quote that talks about profit, maximum profit, throwing out models....

Because so far it sounds like pure guesswork on your part, and yet you present it as literal, without reading anything in.

My translation is loose) The point is that many models are trained initially, and in the end a working one has to be chosen (model selection). The author claims that everyone usually just picks the model that gives the maximum result on OOS, and that this is the wrong approach. His second quote says how it should be done.

"You know you are doing well if the average for the out-of-sample models is a significant percentage of the in-sample score." I read this as maximizing the ratio of profit on OOS to profit on the train.

"Generally speaking, you are really getting somewhere if the out-of-sample results are more than 50 percent of the in-sample." This can be read as discarding models where the ratio of profit on OOS to profit on the train is less than 0.5.
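
In code the rule reads something like this (a minimal sketch of my own, assuming per-model IS and OOS profits are already known; the 0.5 cut-off comes from the quote, everything else is made up):

def select_model(is_profit, oos_profit, min_ratio=0.5):
    # Keep only models whose OOS profit is at least min_ratio of their IS profit.
    candidates = [
        (i, oos) for i, (ins, oos) in enumerate(zip(is_profit, oos_profit))
        if ins > 0 and oos / ins >= min_ratio
    ]
    if not candidates:
        return None                                # nothing survives the filter
    # Among the survivors, take the one with the best OOS profit.
    return max(candidates, key=lambda c: c[1])[0]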
 

Well, it's kind of a question of selecting models, yes, as in optimization. You can come up with your own subjective criteria.

It's a good sign if there is a cluster of models with slightly different parameters, i.e. allowing for some variation, and they all pass the OOS. But it's not a panacea, of course.

 
Aleksey Nikolayev #:
Alexey, are there any methods for reconstructing the optimization surface?
You run a parameter-search algorithm, it finds something, and then you use the data accumulated during the search to reconstruct the optimization surface...
We're talking about heuristic algorithms, not an exhaustive search, naturally...
I googled it, but found nothing.
 
mytarmailS #:
Alexey, are there any methods for reconstructing the optimization surface?
You run a parameter-search algorithm, it finds something, and then you use the data accumulated during the search to reconstruct the optimization surface...
We're talking about heuristic algorithms, not an exhaustive search, naturally...
I googled it, but found nothing.

You mean filling in the model-quality metric for the missing (hypothetical) sets of hyperparameter values? Well, that's simple enough - train a boosting model to do it. But what would you need that for?
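
If I understand the suggestion correctly, it would look something like this (a rough sketch of my own in Python with scikit-learn; the data here is a random placeholder, in reality it would be the points the search algorithm actually evaluated):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hyperparameter sets the search has already evaluated and the metric at each one.
X_eval = np.random.rand(200, 3)                  # placeholder for real search data
y_eval = -np.sum((X_eval - 0.5) ** 2, axis=1)    # placeholder metric values

# Fit a surrogate of the optimization surface on the evaluated points.
surrogate = GradientBoostingRegressor().fit(X_eval, y_eval)

# Predict the metric at parameter sets the search never visited.
X_new = np.random.rand(1000, 3)
surface_estimate = surrogate.predict(X_new)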

 
Replikant_mih #:

You mean filling in the model-quality metric for the missing (hypothetical) sets of hyperparameter values? Well, that's simple enough - train a boosting model to do it. But what would you need that for?

Maybe simple interpolation will do, we'll see; first I wanted to check whether something ready-made already exists...
What for? I'm pretty sure I can predict whether a model will work on new data if I can see its optimization surface.


 
mytarmailS #:
Alexey, are there any methods for reconstructing the optimization surface?
You run a parameter-search algorithm, it finds something, and then you use the data accumulated during the search to reconstruct the optimization surface...
We're talking about heuristic algorithms, not an exhaustive search, naturally...
I googled it, but found nothing.

In the model's parameter space? That's a huge dimensionality. It is only feasible for very simple models with a small number of predictors.

It is not very clear how a surface could be built in a space of such huge dimensionality - we simply have far too few points compared to that dimensionality. Unless via some dimensionality-reduction visualization like PCA etc., but the point of that is unclear.
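
For what it's worth, the PCA-style view mentioned above could look something like this (my own sketch with placeholder random data; it only shows the sampled part of the surface, nothing more):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

X_eval = np.random.rand(200, 20)                 # placeholder: evaluated high-dim parameter sets
y_eval = np.random.rand(200)                     # placeholder: metric at those points

# Project the evaluated points to 2D and colour them by the metric value.
xy = PCA(n_components=2).fit_transform(X_eval)
plt.scatter(xy[:, 0], xy[:, 1], c=y_eval)
plt.colorbar(label="metric")
plt.show()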

 
Maxim Dmitrievsky #:

Well, it's kind of a question of selecting models, yes, as in optimization. You can come up with your own subjective criteria.

It's a good sign if there is a cluster of models with slightly different parameters, i.e. allowing for some variation, and they all pass the OOS. But it's not a panacea, of course.

Earlier you had an idea of combining standard and custom metrics, which I understood as follows: the models are trained using standard ones, while the selection is done using custom ones.
