Machine learning in trading: theory, models, practice and algo-trading - page 643

 
Yousufkhodja Sultonov:

Moving averages were created by nature to fool those who do not understand the laws of the market. And everyone has fallen for this trick. It's amazing, but it's a fact. Look around and you will realize that the MA is a property of any numerical series, regardless of whether it is market data or random. Wake up, traders, don't let yourselves be fooled.

I would call it paranoia.)

An MA is just an ordinary filter; it is not guilty of anything. Any mathematical method is good where it is applicable.

 
Maxim Dmitrievsky:

So what's the question about finding features?

In our case there is only price. Any transformation of price is an a priori regularity, in the form of a certain "memory" of the process (indicators computed over n periods). That is, if we do not know the regularities, we can only feed in price, or increments with different periods, to account for memory in the process.

What could there be other than price increments? Or is there something? What are you selecting there so scrupulously, is there really anything? :)

There is the autoregressive process with its order; you can do the same thing through a neural network. In my opinion, this is the only thing that can be learned. I mean, take econometric models and extend them.

IMHO... that's why I don't even try to pick features :) and my nerves are fine (well, not really)

In other words, what can we find in price: trend, seasonality, cyclicality, noise.
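
A rough sketch of what "take an econometric model and extend it" could look like: fit a linear AR model on increments, then feed the same lags to a small neural network. The data here is simulated and the nnet settings are arbitrary assumptions, not anything attached in this thread.

library(nnet)

# simulated increments with a weak AR structure; replace with real price increments
set.seed(1)
x <- arima.sim(model = list(ar = c(0.3, -0.2)), n = 2000)

# linear benchmark: autoregressive model with the order chosen by AIC
ar_fit <- ar(x, order.max = 10)
ar_fit$order

# the same lags fed to a one-hidden-layer network ("extended" AR)
p    <- max(ar_fit$order, 1)
lags <- embed(x, p + 1)                       # column 1 = current value, the rest = lags
dat  <- data.frame(y = lags[, 1], lags[, -1, drop = FALSE])
nn_fit <- nnet(y ~ ., data = dat, size = 3, linout = TRUE, maxit = 500, trace = FALSE)

# in-sample residual variance of both models, just for a quick comparison
var(ar_fit$resid, na.rm = TRUE)
var(dat$y - as.numeric(predict(nn_fit, dat)))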

You yourself posted an example a couple of pages ago of a neural network learning to recognize a spiral. With the standard two features it needs three hidden layers; add more features and one layer is enough.
So here too: you can feed the network a hundred past increments and grind them through a dozen hidden layers, or come up with a few good hand-made features that a one-hidden-layer network from the 90s can handle.
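
To illustrate that point (this is not the original spiral example from earlier pages, just a sketch with my own choice of hand-made features: radius, angle and the phase difference between them):

library(nnet)

set.seed(1)
n   <- 1000
t   <- runif(n, 0.5, 3 * pi)
cls <- sample(0:1, n, replace = TRUE)
# two interleaved spirals (class 1 rotated by pi) plus a little noise
x <- t * cos(t + cls * pi) + rnorm(n, sd = 0.1)
y <- t * sin(t + cls * pi) + rnorm(n, sd = 0.1)

raw  <- data.frame(x, y, cls = factor(cls))
# hand-made features: radius, angle and how far the angle lags behind the radius
feat <- data.frame(r     = sqrt(x^2 + y^2),
                   a     = atan2(y, x),
                   phase = cos(atan2(y, x) - sqrt(x^2 + y^2)),
                   cls   = factor(cls))

fit_raw  <- nnet(cls ~ ., data = raw,  size = 5, maxit = 1000, trace = FALSE)
fit_feat <- nnet(cls ~ ., data = feat, size = 5, maxit = 1000, trace = FALSE)

mean(predict(fit_raw,  raw,  type = "class") == raw$cls)   # usually mediocre
mean(predict(fit_feat, feat, type = "class") == feat$cls)  # usually close to 1

With the raw coordinates one small hidden layer struggles; with the engineered features the same small layer separates the classes.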

 

I found another interesting package for screening predictors. It is called FSelector. It offers about a dozen methods for selecting predictors, including entropy-based ones.

I took the file with the predictors and the target from here: https://www.mql5.com/ru/forum/86386/page6#comment_2534058


library(FSelector)
library(corrplot)

# data: predictors in columns 1..27, target in column 28
load("ALL_cod.RData")
trainTable <- Rat_DF1

PREDICTOR_COLUMNS_SEQ <- 1:27
TARGET_COLUMN_ID      <- 28

# formula of the form target ~ X1 + X2 + ... for the FSelector functions
targetFormula <- as.simple.formula(colnames(trainTable)[PREDICTOR_COLUMNS_SEQ],
                                   colnames(trainTable)[TARGET_COLUMN_ID])

# one row per selection method, one column per predictor
impMatrix <- matrix(NA, nrow = 0, ncol = length(PREDICTOR_COLUMNS_SEQ))

# cfs() and consistency() return the names of the selected predictors, so their
# rows hold TRUE/FALSE; the other methods return a numeric score per predictor
impMatrix <- rbind(impMatrix, colnames(trainTable)[PREDICTOR_COLUMNS_SEQ] %in% cfs(targetFormula, trainTable))
rownames(impMatrix)[nrow(impMatrix)] <- "cfs"
impMatrix <- rbind(impMatrix, chi.squared(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "chi.squared"
impMatrix <- rbind(impMatrix, colnames(trainTable)[PREDICTOR_COLUMNS_SEQ] %in% consistency(targetFormula, trainTable))
rownames(impMatrix)[nrow(impMatrix)] <- "consistency"
# correlation-based scores only make sense for a numeric (non-factor) target
if(!is.factor(trainTable[, TARGET_COLUMN_ID])){
  impMatrix <- rbind(impMatrix, linear.correlation(targetFormula, trainTable)[[1]])
  rownames(impMatrix)[nrow(impMatrix)] <- "linear.correlation"
  impMatrix <- rbind(impMatrix, rank.correlation(targetFormula, trainTable)[[1]])
  rownames(impMatrix)[nrow(impMatrix)] <- "rank.correlation"
}
impMatrix <- rbind(impMatrix, information.gain(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "information.gain"
impMatrix <- rbind(impMatrix, gain.ratio(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "gain.ratio"
impMatrix <- rbind(impMatrix, symmetrical.uncertainty(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "symmetrical.uncertainty"
impMatrix <- rbind(impMatrix, oneR(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "oneR"
impMatrix <- rbind(impMatrix, random.forest.importance(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "random.forest.importance"
impMatrix <- rbind(impMatrix, relief(targetFormula, trainTable)[[1]])
rownames(impMatrix)[nrow(impMatrix)] <- "relief"

impMatrix

# rescale every row to [-1, 1] so corrplot can display all methods on one scale
for(i in 1:nrow(impMatrix)){
  if(length(unique(impMatrix[i, ])) == 1){
    impMatrix[i, ] <- 0
  }else{
    impMatrix[i, ] <- -1 + (impMatrix[i, ] - min(impMatrix[i, ])) / (max(impMatrix[i, ]) - min(impMatrix[i, ])) * 2
  }
}

# pad with zero rows/columns to make the matrix square for corrplot
while(nrow(impMatrix) < ncol(impMatrix)){
  impMatrix <- rbind(impMatrix, 0)
}
while(ncol(impMatrix) < nrow(impMatrix)){
  impMatrix <- cbind(impMatrix, 0)
}

colnames(impMatrix) <- colnames(trainTable)[PREDICTOR_COLUMNS_SEQ]

corrplot(impMatrix)

The score each method assigns to each predictor is shown on the plot at the end.

Blue is good, red is bad (for corrplot the results were scaled to [-1, 1]; for the exact scores see the results of the calls cfs(targetFormula, trainTable), chi.squared(targetFormula, trainTable), etc.)
You can see that X3, X4, X5, X19, X20 are rated well by almost all methods; you can start with them and then try adding/removing others.

However, with these 5 predictors the models in rattle did not pass the test on Rat_DF2; once again the miracle did not happen. That is, even with the remaining predictors you still have to tweak model parameters, do cross-validation, and add/remove predictors yourself.
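
A rough sketch of that check (a cross-validated random forest on just those five predictors, then a test on Rat_DF2); the model choice and the caret settings are my assumptions, this is not what rattle does internally, and the target is simply treated as a class label:

library(caret)

# the predictors that most FSelector methods agreed on
selected   <- c("X3", "X4", "X5", "X19", "X20")
targetName <- colnames(Rat_DF1)[28]

ctrl <- trainControl(method = "cv", number = 5)      # 5-fold cross-validation
fit  <- train(x = Rat_DF1[, selected],
              y = factor(Rat_DF1[[targetName]]),
              method = "rf",
              trControl = ctrl)
fit                                                   # CV accuracy on Rat_DF1

# honest check on the second sample
pred <- predict(fit, Rat_DF2[, selected])
mean(as.character(pred) == as.character(Rat_DF2[[targetName]]))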

 
SanSanych Fomenko:

Could you run a window over the merged (spliced) series and post graphs of:

  • entropy values
  • adfTest results
  • ArchTest results

I just took EURUSD M1 for roughly January of this year, with a sliding window of 1 day.

Logically, if entropy rises, trading should be suspended and resumed while entropy is low. But here, for some reason, low entropy corresponds to a trend; it would be easier to trade a flat market, so this is unusual.

(corrected a typo in the attached code; download it again if you already grabbed the old version)
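
Roughly how such a window can be run (this is not the attached code; the entropy estimator, the window step and the packages are my assumptions, with `price` standing for a vector of EURUSD M1 close prices):

library(zoo)        # rollapply
library(fUnitRoots) # adfTest
library(FinTS)      # ArchTest
library(entropy)    # histogram entropy

increments <- diff(log(price))
winLen     <- 1440                 # one day of M1 bars
step       <- 60                   # shift the window by one hour to keep it fast

entWindow  <- rollapply(increments, winLen, by = step, align = "right",
                        FUN = function(w) entropy(discretize(w, numBins = 20)))
adfWindow  <- rollapply(increments, winLen, by = step, align = "right",
                        FUN = function(w) adfTest(w)@test$p.value)
archWindow <- rollapply(increments, winLen, by = step, align = "right",
                        FUN = function(w) ArchTest(w, lags = 12)$p.value)

par(mfrow = c(3, 1))
plot(entWindow,  type = "l", main = "entropy")
plot(adfWindow,  type = "l", main = "adfTest p-value")
plot(archWindow, type = "l", main = "ArchTest p-value")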

 
Dr. Trader:

You yourself posted an example a few pages ago of a neural network learning to recognize a spiral. With the standard two features it needs three hidden layers; add more features and one layer is enough.
So here too: you can feed the network a hundred past increments and grind them through a dozen hidden layers, or come up with a few good hand-made features that a one-hidden-layer network from the 90s can handle.

That is clear, but the spiral does not change over time... think about what an artificial problem you are solving, when over time the spiral turns into a square or an ellipse.

And cross-validation won't help, because the state transitions are random.

 
Dr. Trader:

Logically, if entropy rises, trading should be suspended and resumed while entropy is low. But here, for some reason, low entropy corresponds to a trend; it would be easier to trade a flat market, so this is unusual.


That's why I say - take your time.

At high entropy we obtain the normal distribution where counter-trend trading is going on.

At low entropy - Pareto distribution, trend, "memory" - whatever you want to call it.

It turns out you have some ready-made tools in R, so it's easier for you. I now have a lot of work to do on accounting for negentropy, so I have dropped out of the discussion on the forum.

I stand by my opinion - entropy accounting is the key to everything.
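
That statement can at least be poked at numerically: for a fixed variance the normal distribution has the maximum differential entropy, so heavy-tailed increments should show a lower value. A small check with a naive plug-in estimator; the estimator is crude and I use a Student-t with 3 degrees of freedom as a stand-in for a "Pareto-like" heavy tail:

# naive plug-in estimate of differential entropy from a kernel density
diffEntropy <- function(x) {
  d <- density(x, n = 4096)
  f <- approx(d$x, d$y, xout = x)$y
  -mean(log(f))
}

set.seed(1)
n     <- 1e5
gauss <- rnorm(n)                    # variance 1
heavy <- rt(n, df = 3) / sqrt(3)     # heavy tails, rescaled to variance 1

diffEntropy(gauss)   # close to 0.5 * log(2 * pi * e), about 1.42
diffEntropy(heavy)   # noticeably lower despite the same variance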

 
Maxim Dmitrievsky:

and cross-validation will not help, because state transitions are random

If state transitions are random, then the process is Markovian, and this whole forum thread can be deleted for uselessness :)

But I, for example, believe that the process is non-Markovian. Alexander seems to agree, he knows statistics much better than I do, I'd trust him.

 
Dr. Trader:

If state transitions are random, then the process is Markovian, and this whole forum thread can be deleted for uselessness :)

But I, for example, believe that the process is non-Markovian. Alexander seems to agree, he knows statistics much better than me, I would trust him.

I already wrote: the transitions are random at the local level; you cannot take them all into account without moving to a larger lag or another scale, where the process becomes predictable again. One BUT: the general population is unknown and the number of transitions to another scale is limited. That's why Alexander takes ticks. So it is, but even that will not always work, when we run into insufficient history simply because it does not exist, and as a consequence have no clear picture of the patterns of the time series under study.

In short, some transitions at the local level cannot be predicted at all; for that you have to move to another level of representation.

 
Dr. Trader:

If state transitions are random, then the process is Markovian, and this whole forum thread can be deleted for uselessness :)

But I, for example, believe that the process is non-Markovian. Alexander seems to agree, he knows statistics much better than I do, I would trust him.

Although I do not use neural networks, I do read the thread, because Feynman was convinced that it is possible to predict the further movement of a particle from state A to state B (precisely from state to state, not just extrapolating to infinity).

For this purpose he used as input the ordinary increments between the current and previous states, and took into account a number of additional parameters. Shelepin L.A. was the first to start using negentropy, and then died for some reason... He did not finish his work. So it is up to us to finish this subject.

 

Yes! I forgot to tell you.

A state is understood as a set of data that characterizes the particle almost completely. That is, it is a data set, simply put a sample of a certain size, with its characteristics: kurtosis, skewness, negentropy, etc.

That is, with Feynman's confidence one can assert: having correctly determined the sample size for a particular pair, and having calculated from history the characteristic average values of these coefficients for that sample, one can predict that, given a certain set of parameters at the present moment, within a certain time interval the system will pass into a state with its own stable parameters.

This is what I expect from this thread. If you need help in determining the required sample size, write and I will try to help.
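
In code terms, such a "state" could be a rolling sample of increments with its descriptive coefficients; the sample size, the skewness/kurtosis-based negentropy approximation (the classic one from the ICA literature) and the packages are my assumptions:

library(zoo)      # rollapply
library(moments)  # skewness, kurtosis

# crude negentropy approximation from skewness and excess kurtosis
negentropyApprox <- function(x) {
  z <- (x - mean(x)) / sd(x)
  skewness(z)^2 / 12 + (kurtosis(z) - 3)^2 / 48
}

# `increments` is assumed to hold the increments of some price series
sampleLen <- 500   # the "sample volume" that defines a state

stateStats <- rollapply(increments, sampleLen, by = sampleLen, align = "right",
                        FUN = function(w) c(skew  = skewness(w),
                                            kurt  = kurtosis(w),
                                            negen = negentropyApprox(w)))
head(stateStats)
colMeans(stateStats)   # the "characteristic average values" to compare the current sample against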
