Machine learning in trading: theory, models, practice and algo-trading - page 58

 
I have a question for Yury. When checking the results of the ternary model by entering data manually, the output sometimes shows a dash symbol. That is, there is a 0, there is a 1, and there is a dash. So what does the dash mean?
 

I tried to classify the zigzag, yes; not the pivot points, but the whole trend the zigzag shows: the result is 0 if the current ZZ trend is going down and 1 if it is going up. The ZZ trends look pretty unbalanced, but that's not why I moved away from them. What I didn't like is that the model needs very high precision. If the model makes a mistake or two within a trend and reverses the trade at the wrong time, even for just one bar, it usually leads to extra losses, plus paying the spread and a commission each time. The model is profitable only if it opens a trade, waits for the end of the trend, and reverses, without a single error within each trend.

If it predicts the next bar rather than the trend, each error will result in less money lost.


I don't do balancing; when predicting the next bar the class spread is minimal, and I don't think a ±10% skew in one class will greatly affect the result.

The article below argues that balancing can be replaced by a correct evaluation metric (F-measure or R-Precision). It is the Russian counterpart of the article SanSanych linked earlier.

http://bazhenov.me/blog/2012/07/21/classification-performance-evaluation.html

...

Nevertheless, this metric [accuracy] has one caveat. It gives equal weight to all documents, which may be inappropriate if the distribution of documents in the training set is strongly skewed towards one or a few classes. In that case the classifier has more information about those classes and, accordingly, will make more adequate decisions within them. In practice this leads to situations where the overall accuracy is, say, 80%, yet within some particular class the classifier performs disproportionately badly, not getting even a third of that class's documents right.

One way out of this situation is to train the classifier on a specially prepared, balanced corpus of documents. The disadvantage of this solution is that you take away information from the classifier about the relative frequency of documents. This information, all other things being equal, can be very helpful in making the right decision.

Another way out is to change the approach to formal quality assessment.

Precision and recall

Precision and recall are the metrics used to evaluate most information extraction algorithms. Sometimes they are used on their own, sometimes as the basis for derived metrics such as the F-measure or R-Precision. The essence of precision and recall is very simple.

The precision of the system within a class is the proportion of documents that actually belong to that class among all documents the system assigned to that class. Recall is the proportion of the class's documents found by the classifier, relative to all documents of that class in the test sample.
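A minimal sketch of these two definitions in Python; the toy labels are invented here so as to also reproduce the pitfall described above, where overall accuracy looks fine while only a third of one class is found:

```python
def precision_recall(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
    assigned = sum(1 for p in y_pred if p == cls)  # documents assigned to cls
    actual = sum(1 for t in y_true if t == cls)    # documents truly in cls
    return (tp / assigned if assigned else 0.0,
            tp / actual if actual else 0.0)

# skewed sample: 8 of 10 answers correct (80% accuracy), but for class 1
# the classifier finds only 1 of 3 documents (recall = 0.33)
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(precision_recall(y_true, y_pred, 1))  # (1.0, 0.333...)
```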

...

F-measure

Clearly, the higher the precision and recall, the better. But in real life maximum precision and maximum recall are not achievable simultaneously, so a balance has to be found. That is why we want a single metric that combines information about both the precision and the recall of our algorithm; it then becomes easier to decide which implementation to put into production (the one with the higher value wins). The F-measure is exactly such a metric.

The F-measure is the harmonic mean of precision and recall. It tends to zero if either precision or recall tends to zero.
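In code the harmonic mean looks like this; the beta parameter, which trades precision against recall, is the standard F-beta generalization, added here for illustration rather than taken from the article:

```python
def f_measure(precision, recall, beta=1.0):
    # harmonic mean of precision and recall; beta > 1 favors recall,
    # beta < 1 favors precision; zero if either input is zero
    if precision == 0.0 or recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(f_measure(0.8, 0.4))  # 0.533..., dragged toward the weaker of the two
```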


etc., there are various beautiful graphs in the article

 
Dr.Trader:

I tried to classify the zigzag, yes; not the pivot points, but the whole trend the zigzag shows: the result is 0 if the current ZZ trend is going down and 1 if it is going up. The ZZ trends look pretty unbalanced, but that's not why I moved away from them. What I didn't like is that the model needs very high precision. If the model makes a mistake or two within a trend and reverses the trade at the wrong time, even for just one bar, it usually leads to extra losses, plus paying the spread and a commission each time. The model is profitable only if it opens a trade, waits for the end of the trend, and reverses, without a single error within each trend.

If it predicts the next bar rather than the trend, each error will result in less money lost.


I don't do balancing; when predicting the next bar the class spread is minimal, and I don't think a ±10% skew in one class will greatly affect the result.

The article below argues that balancing can be replaced by a correct evaluation metric (F-measure or R-Precision). It is the Russian counterpart of the article SanSanych linked earlier.

http://bazhenov.me/blog/2012/07/21/classification-performance-evaluation.html

etc., there are various beautiful charts in the article

A little advice. Any system comes down to one single phenomenon: the signal!!! The totality of all conditions leads to an accomplished fact, the decision point. That is, any system, no matter how complex, produces signals, either buy or sell, so it makes sense to classify them. Take a crossover of moving averages, say: the crossover happens and there is a buy signal; a crossover in the opposite direction gives a sell signal. For a correct classification you have to do it separately for sell and separately for buy; that way you can double the training interval while keeping the level of generalization.

At first my models rarely rose above 40-50% generalization, but then I thought hard about what to do with the data and what the model obtained after classification actually means. On the same data I now get models no lower than 70%, on average 80-90%, and later, on unseen data, about 1-2 errors out of 10-12 signals. That is quite enough to make money. For the confidence interval I take 30% of the training interval: if I take 100 buy signals and 100 sell signals, I know I can work through 30-50 signals without retraining the model.

In the first versions of the predictor, six inputs were optimized in about 40 minutes, which was extremely inconvenient; now nine inputs take 10 minutes, and the quality of the model has only improved. Now the problem is where to find that many inputs. But never mind, we still have something to offer the predictor :-)
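For illustration, a hedged sketch of this "one model per signal side" idea in Python; the random forest, the synthetic features, and the profit labels are stand-ins, not the actual predictor or inputs described above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))                # nine inputs, as in the post
side = rng.choice(np.array(['buy', 'sell']), size=200)
profitable = rng.integers(0, 2, size=200)    # 1 = the signal paid off

# train one binary model per signal side, so each model sees only "its" signals
models = {}
for s in ('buy', 'sell'):
    mask = side == s
    models[s] = RandomForestClassifier(n_estimators=100, random_state=0)
    models[s].fit(X[mask], profitable[mask])

# a new buy signal is then evaluated only by the buy model
print(models['buy'].predict(X[:1]))
```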
 
Mihail Marchukajtes:
I have a question for Yury. When checking the results of the ternary model by entering data manually, the output sometimes shows a dash symbol. That is, there is a 0, there is a 1, and there is a dash. So what does the dash mean?

The same as the famous Socratic phrase "I know what I do not know." The ternary classifier, answering with a minus, reports that the training sample contained no examples similar to the pattern being classified, so it cannot unambiguously assign it to any class, i.e. it cannot give a definite answer for the pattern presented. It honestly admits its lack of competence in certain areas of knowledge instead of confidently answering, with a straight face, questions it does not know the answers to.

 
Yury Reshetov:

The same as the famous Socratic phrase "I know what I do not know." The ternary classifier, answering with a minus, reports that the training sample contained no examples similar to the pattern being classified, so it cannot unambiguously assign it to any class, i.e. it cannot give a definite answer for the pattern presented.

Hmm. Tell me, is there any chance in the foreseeable future to export the ternary model to a file so it can be used in MQL, the same as the binary one? Because when you enter it by hand there is a chance of making a mistake, and all that...
 
Mihail Marchukajtes:
Hmm. Well, I see... Tell me, is there any chance in the foreseeable future to export the ternary model to a file so it can later be used in MQL, the same as the binary one? Because when you enter it by hand there is a chance of making a mistake, and all that...
I'm working on that now. That is, the code generator is not finished yet; at the moment it produces the source code of only one of the binary classifiers, not the whole ternary classifier.
 
Yury Reshetov:

The same as the famous Socratic phrase "I know what I do not know." The ternary classifier, answering with a minus, reports that the training sample contained no examples similar to the pattern being classified, so it cannot unambiguously assign it to any class, i.e. it cannot give a definite answer for the pattern presented. It honestly admits its lack of competence in certain areas of knowledge instead of confidently answering, with a straight face, questions it does not know the answers to.

Judging by the attached picture, do I understand the point correctly? On the left is a binary classifier; on the right is a ternary classifier (the white zone is "minus")

If so, then I think the idea is good; for some reason I've never seen it before. Could you recommend some articles on the ternary classifier?



Added later:

Intuitively, the task is fairly simple. Suppose there are 2 predictors (X and Y); then we work in a 2-dimensional space (as in the pictures above). We need to fence off a region of that space that contains all the "buy" examples (the blue fill), and then a second region that contains all the "sell" examples (the red fill). The two fenced regions must not overlap. To classify new data, we just look into which fenced region the point falls. If it falls into neither (the white area in the right picture), the model clearly cannot say anything about that point and should not trade at that moment.

With 3 predictors it becomes a 3-dimensional space where the classes are enclosed by three-dimensional volumes, and so on: the more predictors, the higher-dimensional the figures.

Do such models exist? Usually classifiers find some hyperplane in space that separates classes. But here we need two closed hyperfigures.
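One way to get exactly this behavior, as a hedged sketch rather than a reference method: fit a separate one-class model per class (scikit-learn's OneClassSVM here) and answer with a minus whenever a point falls into neither fenced region, or into both:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
buy = rng.normal(loc=(2, 2), scale=0.5, size=(100, 2))     # "buy" cloud
sell = rng.normal(loc=(-2, -2), scale=0.5, size=(100, 2))  # "sell" cloud

buy_region = OneClassSVM(nu=0.05, gamma='scale').fit(buy)
sell_region = OneClassSVM(nu=0.05, gamma='scale').fit(sell)

def classify(point):
    in_buy = buy_region.predict([point])[0] == 1
    in_sell = sell_region.predict([point])[0] == 1
    if in_buy and not in_sell:
        return 1      # buy region
    if in_sell and not in_buy:
        return 0      # sell region
    return '-'        # white zone: the model abstains, no trade

print(classify([2, 2]), classify([-2, -2]), classify([0, 0]))  # typically: 1 0 -
```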

 

Mihail Marchukajtes:

...

In the first versions of the predictor, six inputs were optimized in about 40 minutes, which was extremely inconvenient; now nine inputs take 10 minutes, and the quality of the model has only improved. Now the problem is where to find that many inputs. But never mind, we still have something to offer the predictor :-)
Yeah, I'm trying to classify strictly buy/sell too. But how did you get the original 6 inputs? Did you just take them from some known strategy? Adequate inputs are one of the most important things here. I have the opposite problem: thousands of inputs (prices and indicators over a hundred bars), and I need to sift them down to a couple of dozen, because with that many inputs any model overfits.
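A hedged sketch of that sifting step, using random-forest importances as just one of many possible filters; the data here is synthetic, and only two of the thousand candidate inputs actually matter:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 1000))           # thousands of raw inputs
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # only columns 0 and 1 are informative

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:20]  # keep a couple of dozen
print(top[:5])  # columns 0 and 1 should rank near the top
```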
 
Dr.Trader:

Judging by the attached picture, do I understand the point correctly?


Binary classifier on the left; ternary classifier on the right (the white zone is "minus")

Put primitively, for dummies: yes, as a visual aid it will do.

Dr.Trader:
If so, then I think the idea is good; for some reason I've never seen it before. Could you recommend some articles on the ternary classifier?

If you are not banned from Google, you can search for the phrase "ternary classifier machine learning".

 
Yury Reshetov:

If you are not banned from Google, you can search for the phrase "ternary classifier machine learning".

In other words, "follow the first Google link, which leads to my site" :)

I found it; it turns out you have a committee of two models there, which is not at all how I understood it and described it above.
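Reduced to its skeleton, that committee scheme looks roughly like this (a toy sketch; the two stand-in models below are not Reshetov's actual classifiers):

```python
def ternary(model_a, model_b, x):
    a, b = model_a(x), model_b(x)   # each binary model returns 0 or 1
    return a if a == b else '-'     # agreement gives the class, disagreement the dash

# toy stand-ins for two trained binary classifiers
model_a = lambda x: int(x > 0)
model_b = lambda x: int(x > 1)
print(ternary(model_a, model_b, 2), ternary(model_a, model_b, 0.5))  # 1 -
```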
