Machine learning in trading: theory, models, practice and algo-trading - page 261

And what are the results in forecasting, dare I ask?
Here is a prepared spreadsheet for model training.
Prices are converted to deltas column by column; columns without changes or with NA values were removed.
I packed 3 lags into each row for each column (current row minus previous; previous minus the one before; the one before minus the one before that).
Target: the increase/decrease of the value on the next row (next row minus current row, rounded to -1/+1, or 0 if there is no change; 3 classes in total). All targets carry the prefix "target_".
*You can't use other targets to predict any target, or you'll be looking into the future. No column with the "target_" prefix may be used as a predictor or input.
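A minimal sketch of that preparation in R for a single column (the file name prices.csv and the column SiH7.bid are illustrative, not from the actual dataset):

# deltas, 3 lags and a 3-class target for one column, as described above
prices <- read.csv("prices.csv")            # hypothetical source file
p <- prices$SiH7.bid
d <- diff(p)                                # current row minus previous
n <- length(d)
dataset <- data.frame(
    lag1 = d[3:n],                          # current delta
    lag2 = d[2:(n - 1)],                    # previous delta
    lag3 = d[1:(n - 2)],                    # delta one step further back
    target_SiH7.bid = c(sign(d[4:n]), NA)   # next delta rounded to -1/0/+1
)
dataset <- head(dataset, -1)                # drop the last row, its target is unknown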
The problem:
I divided the csv file into two parts by rows, in a 10:1 ratio between training data and fronttest. I trained a model for target_SiH7.bid (from the table above) and got a classification accuracy of 62% on the training data and 74% on new data. I was happy, but then I double-checked: it turned out that class 0 is heavily unbalanced relative to the others, and its share alone is 60% and 74%. That is, the model has learned to detect 0 well, but has given up on classes -1 and 1.
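That double-check can be sketched in a couple of lines of base R (act and pred are illustrative names for the actual and predicted classes of the test set):

table(act) / length(act)                    # class shares: class 0 dominates
cm <- table(actual = act, predicted = pred) # confusion matrix
diag(cm) / rowSums(cm)                      # per-class recall: high for 0, poor for -1 and 1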
A different score is needed. For two unbalanced classes this metric works well: https://en.wikipedia.org/wiki/Cohen's_kappa, but in our case there are three unbalanced classes, not two. Is there a Kappa analogue for 3?
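For what it's worth, Cohen's kappa itself is already defined for any number of categories, so it can be computed for three classes directly; a minimal base-R sketch (act and pred as above):

cm <- table(actual = act, predicted = pred)
n  <- sum(cm)
po <- sum(diag(cm)) / n                     # observed agreement
pe <- sum(rowSums(cm) * colSums(cm)) / n^2  # agreement expected by chance
kappa <- (po - pe) / (1 - pe)               # 0 = random, 1 = perfect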
If you ask me, I probably can't give an unequivocal answer; many different things are being predicted and the results all differ. But on average it is about 60-65%.
Interesting, can you describe it in more detail...
Right now I am busy with a completely different area of forecasting and cannot afford to spread myself thin, so I can't run experiments on this market data together with you, but it is very interesting for me to read and follow along; please write more...
it turned out that class 0 is heavily unbalanced relative to the others, and its share alone is 60% and 74%. That is, the model has learned to detect 0 well, but has given up on classes -1 and 1.
I had the same problem when I trained my random forest to detect reversals; naturally there were far fewer reversals than non-reversals. The more trees I grew, the more the model gave up on the reversal classes and concentrated on the non-reversal class.
There are several methods in caret for balancing classes, but they are all trivial: either we duplicate observations of the class that has fewer of them until the class counts are equal, or, conversely, we drop surplus observations from the class that has more (see the sketch below).
Neither method turned out more profitable than no balancing at all (but that is only in my case).
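A sketch of the two caret helpers meant here (train_data, predictor_cols and target are illustrative names):

library(caret)
# duplicate minority-class rows until all classes have equal counts
balanced_up <- upSample(x = train_data[, predictor_cols], y = factor(train_data$target))
# or drop rows from the majority class instead
balanced_down <- downSample(x = train_data[, predictor_cols], y = factor(train_data$target))
# both return a data frame with the balanced target in a column named "Class"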
Everything is negotiable. I suggested FORTS futures, for example Si, RI, BR, etc., the most liquid ones in general. As output I propose a signal (-1, 0, 1) (short, cash, long); a signal is less ambiguous than a probability and is not distorted by the MM the way orders are. Post-processing, features, and targets are up to the owner, or to order.
After some thought I came to the conclusion that at least one more order book snapshot {price:vol, ... || ..., price:vol} needs to be added as-is, the last one in each second, for each predicted instrument; then the delta is not needed, and neither are bid and ask. This is, IMHO, mandatory. If the tape comes per second with separate volumes and the OI change, it is more or less informative, but for the order book a single delta is far too little; at the very least one should see the different "wall" distributions and so on. That is enough for a start. Add the order book, post a training dataset for a couple of days, and we'll play. :)
In the end I settled on the Scott's Pi metric: https://en.wikipedia.org/wiki/Scott's_Pi
Here are my results with this metric. I trained the model on the first 91% of rows, then ran the fronttest on the remaining data (backtest : fronttest ratio = 10:1).
The "class_0_rate" column in the table is the share of class 0 relative to classes -1 and 1, so that in Excel I could sift out the results where this value is too high.
The last two columns are the Scott's Pi metric for training and for testing; the value ranges from -1 to 1, where 0 means the result is random and the model is useless, and 1 means everything is fine. A negative value is no good either: it indicates inverse correlation, and you can try inverting the prediction. Inverse correlation sometimes works fine with two classes, trading exactly opposite to the prediction; but with three classes it is hard to define an "opposite", since every class has two opposing outcomes, so here a negative value is also bad.
I think we need to choose the currency whose bid (or ask) predictions have similar and high values in both the backtest and the fronttest, for example xagusd. A score of 0.18 on a scale from 0 to 1 is not much. And predicting one tick ahead is of little use in real trading anyway. In general there is a result, but it is not applicable :)
R code for Scott's Pi:
ScottsPi <- function(act, pred){
    # act - vector of actual classes, pred - vector of predicted classes
    if(length(act) != length(pred)){
        stop("different length")
    }
    n_observ <- length(act)
    all_levels <- unique(c(act, pred))
    n_levels <- length(all_levels)
    # confusion matrix: rows = actual class, columns = predicted class
    marginal_matrix <- matrix(NA, ncol=n_levels, nrow=n_levels)
    colnames(marginal_matrix) <- all_levels
    rownames(marginal_matrix) <- all_levels
    for(i in 1:n_levels){
        for(j in 1:n_levels){
            marginal_matrix[i,j] <- sum((act==all_levels[i]) & (pred==all_levels[j]))
        }
    }
    # observed agreement: share of observations on the diagonal
    diagSum <- 0
    for(i in 1:n_levels){
        diagSum <- diagSum + marginal_matrix[i,i]
    }
    diagSum <- diagSum / n_observ
    # expected agreement: squared joint marginal proportion of each class
    marginalSum <- 0
    for(i in 1:n_levels){
        marginalSum <- marginalSum + ((sum(marginal_matrix[i,]) + sum(marginal_matrix[,i]))/n_observ/2)^2
    }
    p <- marginalSum
    # Scott's Pi: (observed - expected) / (1 - expected)
    return((diagSum - p)/(1 - p))
}
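A quick check on toy data (the vectors are made up for illustration):

act  <- c(-1, 0, 0, 1, 0, -1, 1, 0, 0, 1)
pred <- c(-1, 0, 0, 0, 0, -1, 1, 0, 1, 1)
ScottsPi(act, pred)   # close to 1 = good agreement, around 0 = random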