Machine learning in trading: theory, models, practice and algo-trading - page 109

 
I'll second that; the same thought occurred to me over the weekend. Indeed, the more often the network says "I don't know", the closer it is to needing retraining. At the beginning of the OOS period almost all signals are interpreted unambiguously, but as time goes by the network says "I don't know" more and more often, which indicates the arrival of new data that is difficult to interpret. When a certain level of "I don't know" is reached, the network is retrained. A very useful thing...
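A minimal sketch of how such a retraining trigger could be tracked, assuming the network's answer arrives as -1, 1, or `None` for "I don't know". The class name, the window length and the 0.4 threshold are illustrative assumptions, not values from this thread.

```python
from collections import deque


class AbstentionMonitor:
    """Track the share of "I don't know" answers over a sliding window."""

    def __init__(self, window=50, threshold=0.4):
        # Both defaults are hypothetical; tune them on your own data.
        self.answers = deque(maxlen=window)  # True means the network abstained
        self.threshold = threshold

    def rate(self):
        """Fraction of abstentions among the answers currently in the window."""
        return sum(self.answers) / len(self.answers) if self.answers else 0.0

    def update(self, signal):
        """Record one answer (None = "I don't know"); return True if retraining is due."""
        self.answers.append(signal is None)
        window_full = len(self.answers) == self.answers.maxlen
        return window_full and self.rate() >= self.threshold
```

Waiting until the window is full before raising the flag avoids triggering a retrain on the first few OOS bars, where a single abstention would dominate the rate.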
 
All the talk about the model gradually fading on the OOS, and about how useful that information is in trading, looks unconvincing without a discussion of predictor preselection.
 
SanSanych Fomenko:
All the talk about the model gradually fading on the OOS, and about how useful that information is in trading, looks unconvincing without a discussion of predictor preselection.
So what choice is there? We select the model with the maximum level of generalization and see how it performed on the training interval. The equity should grow evenly... And that's where luck comes in. You can't get anywhere without it...
 
Andrey Dik:

You bet.

If the model gives wrong signals on the OOS, that indicates improper training, not a change in the market.

I agree with that. But how do you analyze the signals of the two networks? That part is not quite clear. How much do they diverge, or do they move in unison?
 
Alexey Burnakov:
I agree with that. But how do you analyze the signals of the two networks? That part is not quite clear. How much do they diverge, or do they move in unison?

I bring the signals to a common scheme like this:

SELL   BUY   Interpretation
 -1     0    sell
  0     0    fence
  0     1    buy
 -1     1    fence

In a well-trained model the signals rarely contradict each other. The two networks are not required to produce an equal number of signals on the training interval; as a rule the counts differ, which is understandable because the market may have prolonged global trends. But I require that the number of signals from one network not exceed twice the number from the other. This is an empirical ratio and may be different for someone else. For example, when the trend changes from ascending to descending, the number of sell signals increases and the buy signals start to lie; contradictions appear and the number of deals declines. This is a sign that retraining is required.
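The scheme above can be sketched roughly as follows. The function names are hypothetical, "fence" is read here as "no trade", and the handling of the case where one network gives no signals at all is my own assumption, since the post does not cover it.

```python
def combine(sell, buy):
    """Merge the SELL network output (-1 or 0) with the BUY network output (0 or 1)."""
    if sell == -1 and buy == 0:
        return "sell"
    if sell == 0 and buy == 1:
        return "buy"
    return "fence"  # both silent (0, 0) or contradicting each other (-1, 1)


def balance_ok(sell_signals, buy_signals, max_ratio=2.0):
    """Check that neither network fires more than max_ratio times as often as the other."""
    n_sell = sum(1 for s in sell_signals if s == -1)
    n_buy = sum(1 for b in buy_signals if b == 1)
    if min(n_sell, n_buy) == 0:
        # Assumption: a silent network only passes if the other one is silent too.
        return max(n_sell, n_buy) == 0
    return max(n_sell, n_buy) / min(n_sell, n_buy) <= max_ratio


def contradiction_rate(sell_signals, buy_signals):
    """Share of bars on which both networks fire at once (-1 and 1)."""
    pairs = list(zip(sell_signals, buy_signals))
    conflicts = sum(1 for s, b in pairs if s == -1 and b == 1)
    return conflicts / len(pairs) if pairs else 0.0
```

`balance_ok` would be checked on the training interval, while a rising `contradiction_rate` on the OOS is the kind of symptom described above as a reason to retrain.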

 
Andrey Dik:

I bring the signals to a common scheme like this:

SELL   BUY   Interpretation
 -1     0    sell
  0     0    fence
  0     1    buy
 -1     1    fence

In a well-trained model the signals rarely contradict each other. The two networks are not required to produce an equal number of signals on the training interval; as a rule the counts differ, which is understandable because the market may have prolonged global trends. But I require that the number of signals from one network not exceed twice the number from the other. This is an empirical ratio and may be different for someone else. For example, when the trend changes from ascending to descending, the number of sell signals increases and the buy signals start to lie; contradictions appear and the number of deals declines. This is a sign that retraining is required.

Thank you. This may be a working idea.
 
Combinator:

Yes, but not on the neuron configuration.

Andrey is apparently alluding to inputs so powerful that any model trained on them will, by default, give a good result without overfitting.

Or maybe he is referring to something else. But I would like to get a more detailed answer.

 

The new jPrediction 9.00 release is out

Quote from the user manual:

"Differences between jPrediction and other machine learning software

The main difference of jPrediction is the absence of any user-defined settings, which removes the human factor, that is, human error, both in configuring and choosing algorithms and in choosing the neural network architecture. The entire machine learning process in jPrediction is fully automated and requires neither special knowledge nor intervention from the user.

Functions performed by jPrediction in automatic mode

  1. Reading and parsing a file containing a set of examples for building a mathematical classification model;
  2. Normalizing the data before machine learning;
  3. Dividing the full set of examples into two subsets: a training subset and a test subset;
  4. Balancing the examples in the training subset;
  5. Forming the neural network architecture;
  6. Training a set of models on the training subset, each with a different combination of predictors (factors);
  7. Reducing the neural network architecture, i.e. removing superfluous elements;
  8. Testing the set of models on the test subset and calculating their generalization ability;
  9. Selecting the best model by the criterion of maximum generalization ability.

Since only the model with the maximum generalization ability is selected from a set of models that each use a different combination of predictors, the reduction (selection) of the most significant predictors is performed automatically."

It should be said that, starting with version 8, jPrediction places no limit on the number of predictors in the training sample. Before version 8, the number of predictors was limited to ten.

Before version 8, jPrediction was single-model. That is, a sample was taken and only one single model was trained and tested on it.

Since version 8, jPrediction has been multi-model: it trains and tests many different models on different parts of the sample, each part containing a different combination of predictors. One of these models will give the maximum generalization ability on the test part of the sample.

The problem was that an exhaustive search over combinations of predictors leads to a so-called combinatorial "explosion": each additional predictor doubles the number of models that must be trained and tested. Clearly, when the number of predictors in the sample runs to tens or even hundreds, waiting for every combination to be trained and tested in a reasonable time becomes impractical.
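As a rough count, assuming every non-empty subset of the n predictors counts as one candidate model, with M(n) denoting the number of such models (notation introduced here for illustration):

```latex
M(n) = \sum_{k=1}^{n} \binom{n}{k} = 2^{n} - 1,
\qquad
M(n+1) = 2^{n+1} - 1 = 2\,M(n) + 1
```

For n = 10 that is 1,023 models, for n = 20 it is 1,048,575, and for n = 30 it is already 1,073,741,823.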

The problem of combinatorial "explosion" in jPrediction was solved not by going through all possible combinations, but by sequential search. The essence of the method is as follows:

Suppose that, by trying all possible combinations of N and fewer predictors, we have found the combination of N predictors with the maximum generalization ability. We now need to add an (N+1)-th predictor to it. To do this, we take the predictors from the sample that are not yet in the combination, add them to it one at a time, and measure the generalization ability of each resulting combination. If this search yields a combination of N+1 predictors whose generalization ability exceeds that of the best combination of N predictors, a combination of N+2 predictors can then be sought in the same way. If no such combination is found, there is evidently no point in searching further, and the algorithm stops at the best combination of N predictors.

As a result, the search for predictor combinations stops much earlier than a complete enumeration of all possible combinations would. Computational resources are also saved because the search starts with a small number of predictors and moves toward larger numbers: the fewer predictors are needed for training, the less time and computing power it takes to build the models.
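The search described above resembles greedy forward selection, and a minimal sketch of that idea might look as follows. This is not jPrediction's actual code: `forward_select`, the `score` callable (standing in for "train a model on these predictors and measure its generalization ability") and the toy scoring function are all hypothetical, and starting from an empty combination is an assumption.

```python
def forward_select(predictors, score, start=()):
    """Grow the predictor set one predictor at a time while the score improves."""
    best = tuple(start)
    best_score = score(best) if best else float("-inf")
    while True:
        # Every combination obtained by adding one unused predictor to the best so far.
        candidates = [best + (p,) for p in predictors if p not in best]
        if not candidates:
            break
        top_score, top = max(((score(c), c) for c in candidates), key=lambda sc: sc[0])
        if top_score <= best_score:
            break  # no N+1 combination beats the best N combination: stop
        best, best_score = top, top_score
    return best, best_score


# Toy stand-in for training and testing: "a" and "c" help, anything else hurts slightly.
useful = {"a": 0.3, "c": 0.2}


def toy_score(combo):
    return 0.5 + sum(useful.get(p, -0.05) for p in combo)


selected, ability = forward_select(["a", "b", "c", "d"], toy_score)
print(selected)  # ('a', 'c')
```

On this toy example the search evaluates 4 + 3 + 2 = 9 combinations instead of all 15 non-empty subsets. Like any greedy search, it can miss a combination whose predictors are only useful together.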

So that's how things stand.

If you're interested, the attached ZIP archive contains the jPrediction 9 user manual in Russian, in PDF format.

 
Cool! Are you doing all of this on your own? The term "reduction" is unclear, though. In engineering it means reducing something by some multiple, but you use it to mean selection.
 
Yury Reshetov:

The new jPrediction 9.00 release is out


Everything is fine except for one small thing: there is no comparison with other models.

I offer my services for a comparison:

1. You prepare an input Excel file containing the predictors and the target variable.
2. You do your calculations.
3. You send the input file to me.
4. I do the calculations using randomforest, ada and SVM.

Then we compare.
