Discussion of article "Metamodels in machine learning and trading: Original timing of trading orders" - page 13

 
Maxim Dmitrievsky #:
There x[0] is the probability of the zero class, while the model gives probabilities for two classes. That is, if the probability of the zero class is less than 0.5, then class one is predicted. So True == 1 and vice versa. Therefore, there is no error.

Hmm, I haven't heard of that. The probability there ranges from 0 to 1, and in binary classification the default is: above 0.5 means "1", otherwise "0". Although the translator renders it strangely:
"

  • One object is a one-dimensional numpy.ndarray with probabilities for each class.

"

But then how do I get the probability for class "1"? A one-dimensional array can't hold separate probabilities for each class, or am I missing something?
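
For reference, a minimal sketch of what that documentation line means in practice (the model, data, and variable names here are placeholders, not taken from the discussion): for a single object predict_proba returns a one-dimensional array with one entry per class, while for a whole dataset it returns a two-dimensional array with one column per class, so the class "1" probability is simply the second column.

from catboost import CatBoostClassifier
import numpy as np

# Toy data, purely illustrative
X = np.random.rand(100, 5)
y = (np.random.rand(100) > 0.5).astype(int)

model = CatBoostClassifier(iterations=50, verbose=False)
model.fit(X, y)

proba = model.predict_proba(X)       # shape (100, 2): column 0 = P(class 0), column 1 = P(class 1)
p_class1 = proba[:, 1]               # probability of class "1" for every row
single = model.predict_proba(X[0])   # one object -> one-dimensional array [P(class 0), P(class 1)]
print(proba.shape, single)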

 
Maxim Dmitrievsky #:
Maybe the labels in your dataset are inverted.

"1" is the right deal/profitable/good. Should it be the other way round?

Aleksey Vyazmikin #:

"1" is the right deal/profitable/good. Should it be otherwise?

No, I mean the buys and sells in your dataset: which is zero and which is 1, buy or sell?
Aleksey Vyazmikin #:

Hmm, I haven't heard of that. The probability there ranges from 0 to 1, and in binary classification the default is: above 0.5 means "1", otherwise "0". Although the translator renders it strangely:
"

  • One object is a one-dimensional numpy.ndarray with probabilities for each class.

"

But then how do I get the probability for class "1"? A one-dimensional array can't hold separate probabilities for each class, or am I missing something?

If the probability for class 0 < 0.5, then class 1 is predicted. That code just translates the probabilities back into class labels for the tester. Everything is fine there.
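
A minimal sketch of the conversion described above (the values are made up): if only the zero-class probability x[0] is kept, turning it back into a class label just means checking whether it falls below 0.5.

# x is one row of predict_proba output: [P(class 0), P(class 1)]
x = [0.25, 0.75]

p0 = x[0]                      # probability of the zero class
label = 1 if p0 < 0.5 else 0   # class 1 is predicted when P(class 0) < 0.5
print(label)                   # -> 1, since P(class 1) = 1 - p0 = 0.75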
 
Maxim Dmitrievsky #:
No, I mean the buys and sells in your dataset: which is zero and which is 1, buy or sell?

The labeling there is already based on the financial result. In the close column I put the outcome of the trade, i.e. for training it does not matter whether it is a buy or a sell.
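
A hypothetical illustration of that kind of labeling (the DataFrame and its values are made up; only the idea of storing the trade outcome in the close column comes from the post): the label is 1 for a profitable trade and 0 for a losing one, regardless of direction.

import pandas as pd

# Made-up example: 'close' holds the financial result of each trade, not a closing price
trades = pd.DataFrame({'close': [12.5, -3.0, 7.1, -0.4]})
trades['label'] = (trades['close'] > 0.0).astype(int)   # 1 = profitable, 0 = losing
print(trades)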

Maxim Dmitrievsky #:
If the probability for class 0 < 0.5, then class 1 is predicted. That code simply translates probabilities back into class labels for the tester. Everything is fine there.

I don't want to sound pushy, but still, there are three options:
1. I have been doing it wrong all along, assuming that CatBoost estimates the probability of class "1".
2. I don't understand your code.
3. You are wrong in assuming that a probability below 0.5 should be treated as class "1".

Aleksey Vyazmikin #:

The labeling there is already based on the financial result. In the close column I put the outcome of the trade, i.e. for training it does not matter whether it is a buy or a sell.

I don't want to sound pushy, but still, there are three options:
1. I have been doing it wrong all along, assuming that CatBoost estimates the probability of class "1".
2. I don't understand your code.
3. You are wrong in assuming that a probability below 0.5 should be treated as class "1".

I don't understand this at all; the close column should contain the closing prices.

The probabilities always sum to one. If the probability of one class is less than 0.5, then the other class is predicted.

 
Maxim Dmitrievsky #:

I don't understand this at all; the close column should contain the closing prices.

Look at the code I attached; it may be clearer there. I don't have classification on every bar.

Maxim Dmitrievsky #:

The probabilities always sum to one. If the probability of one class is less than 0.5, then the other class is predicted.

In the code, if the probability is 0.4, you get class "1". Why?

Aleksey Vyazmikin #:

Look at the code I attached; it may be clearer there. I don't have classification on every bar.

In the code, if the probability is 0.4, you get class "1". Why?

Can I get the dataset as a zip? I don't have RAR.

Because the class 1 probability is 0.6.

In general, that algorithm is supposed to take the data exactly the way it is done there.
 
Maxim Dmitrievsky #:

Can I get the dataset as a zip? I don't have RAR.

I can re-upload it. Although there is command-line support for Mac...

Maxim Dmitrievsky #:
Because the class 1 probability is 0.6.
predict_proba p
[[0.74864123 0.25135877]
 [0.81097595 0.18902405]
 [0.81477042 0.18522958]
 ...
 [0.83347862 0.16652138]
 [0.84273186 0.15726814]
 [0.84617344 0.15382656]]

I couldn't understand it until I printed it out; the console version differs in this respect.

Then everything makes sense, and I commented out the flipping code, leaving only the labeling logic.

# ------- Restoring the previously flipped labels
# if pred_meta > 0.5: pred_meta = 0
# else: pred_meta = 1
# if pred > 0.5: pred = 0
# else: pred = 1

# ------- Redo the labeling and count the balance
if pred_meta == 1:  # meta model classified the example as class "1"
    if pred < 0.5 and Target_100 < 0.0:
        meta_labels[i] = 1
    if pred < 0.5 and Target_100 > 0.0:
        meta_labels[i] = 0
    if pred >= 0.5 and Target_100 > 0.0:
        meta_labels[i] = 1
        report.append(report[-1] + Target_100)
    if pred >= 0.5 and Target_100 < 0.0:
        meta_labels[i] = 0
        report.append(report[-1] + Target_100)
if pred_meta == 0:  # meta model classified the example as class "0"
    if pred < 0.5 and Target_100 < 0.0:
        meta_labels[i] = 1
    if pred < 0.5 and Target_100 > 0.0:
        meta_labels[i] = 0
    if pred >= 0.5 and Target_100 > 0.0:
        meta_labels[i] = 1
    if pred >= 0.5 and Target_100 < 0.0:
        meta_labels[i] = 0
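
For context, a hedged sketch of how the pred and pred_meta values used in the fragment above could be obtained from predict_proba output like the one printed earlier (the array names proba and meta_proba, and the choice of column, are assumptions, not taken from the attached code):

import numpy as np

# proba:      predict_proba output of the main model, shape (n, 2)
# meta_proba: predict_proba output of the meta model, shape (n, 2)
proba = np.array([[0.74864123, 0.25135877],
                  [0.18902405, 0.81097595]])
meta_proba = np.array([[0.30, 0.70],
                       [0.60, 0.40]])

for i in range(len(proba)):
    pred = proba[i, 1]                        # taken here as the class-1 probability; use [i, 0] if the zero class is meant
    pred_meta = int(meta_proba[i, 1] >= 0.5)  # hard 0/1 label from the meta model
    # ... the relabeling block above would run here ...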
 
Maxim Dmitrievsky #:
Can a zipped dataset be provided?

Link