Machine learning in trading: theory, models, practice and algo-trading - page 268

 
mytarmailS:

received a warning

That is logical: the script tried to read the previously created table from the rdata file and failed, hence the warning. The next time you run the script, the rdata file with the table will be read and there will be no warning.


NA - also logical: the data for building the model has to be prepared, not taken as raw output from those indicators. There is a lot more you can do.

<NA> at the beginning of the table is fine, candlesticks do require at least 23 bars. For the first 23 bars you can always expect NA.
I started filling the table before the window was full, so besides the NA values you can expect slightly off results in the early rows because of the shallower depth of the indicator calculations.
It's better to cut off all the first rows up to the window width:
trainData <- trainData[-(1:indicatorDepth), ]

Fix the <NA> in the column names, except for the target: colnames(trainData)[-ncol(trainData)] <- paste0("pred",1:(ncol(trainData)-1))

Replace the target with 1 for all positive values and -1 for all negative ones. Or use {0;1} if you have a neural network.
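A minimal sketch of both target codings, assuming the raw target is a numeric next-bar change. The vector rawTarget and its values are made up for illustration, and how to treat exact zeros is left open (here they fall into the negative class).

```r
# Hypothetical raw target: next-bar price change (made-up numbers)
rawTarget <- c(0.0012, -0.0005, 0.0030, -0.0021, 0.0007)

# {-1; 1} coding. Zeros need their own rule; in this sketch they become -1.
targetPM <- ifelse(rawTarget > 0, 1, -1)

# {0; 1} coding, e.g. for a neural network with a sigmoid output
target01 <- as.integer(rawTarget > 0)

print(targetPM)
print(target01)
```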

Indicators whose values are close to prices should be scaled to 0-1 or converted to deltas. (For example, MA values are always close to the price, so they need to be scaled or turned into a delta. RSI is always in its own 0-100 range, which is fine as it is. If the indicator values can go beyond the range seen during training, take the delta, it will not make things worse.)

For neural networks in general it's better to scale all indicators to 0-1.
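A rough sketch of the two options for a price-like indicator, under assumptions: the prices are made up, a 3-bar simple moving average stands in for "MA", and the helper scale01 is a hypothetical name. The scaling range is taken from the training rows only, which also shows what happens when later values leave that range.

```r
# Made-up prices and a 3-bar simple moving average (NA for the first 2 bars)
price <- c(1.10, 1.12, 1.11, 1.15, 1.18, 1.17, 1.20)
ma3   <- as.numeric(stats::filter(price, rep(1 / 3, 3), sides = 1))

# Option 1: delta between price and MA; stays near zero at any price level
maDelta <- price - ma3

# Option 2: min-max scaling to 0-1, range estimated on the training rows only
scale01  <- function(x, lo, hi) (x - lo) / (hi - lo)
trainIdx <- 3:5
lo <- min(ma3[trainIdx])
hi <- max(ma3[trainIdx])
maScaled <- scale01(ma3, lo, hi)

print(round(maDelta, 4))
print(round(maScaled, 4))   # rows after the training range can exceed 1
```

In this toy data the scaled MA leaves the 0-1 range as soon as the price trends past the training rows, while the delta keeps the same scale.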

And so on.

But NA in columns 46-51 - there is really something wrong there. Either these indicators return everything in a different format and we need separate code to insert them into the table.
Or these indicators just return NA by themselves; maybe they need a larger window width; or they always return NA for the last bar and then replace the NA based on the new bar's data, which is redrawing, and that is bad.
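One way to tell these cases apart is to count NA per column and check which columns are NA on the very last row. A small sketch on a made-up table (the column names p1 and p2 are hypothetical):

```r
# Made-up table: p1 has warm-up NAs only, p2 is also NA on the last row
trainData <- data.frame(
  p1 = c(NA, NA, 1, 2, 3, 4),
  p2 = c(NA, 5, 6, 7, 8, NA)
)

# Number of NA values in each column
naPerColumn <- colSums(is.na(trainData))

# Columns that are NA on the last bar: candidates for redrawing
lastIsNA    <- sapply(trainData, function(col) is.na(col[length(col)]))
naInLastRow <- names(lastIsNA)[lastIsNA]

print(naPerColumn)
print(naInLastRow)
```

NA only at the top of a column points to the warm-up window; NA on the last row is the suspicious case.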

 
SanSanych Fomenko:

Here comes a new bar, which is a herald of a market reversal. But we, continuing to feed the sacred cow, don't change our view of history for the sake of some idea taken from the "analysis" section.

For plain manual trading this is quite fine, I agree. The indicator has found some pattern that only just became apparent and showed it to us, all is well.

But we need to prepare the data and train the model. If an indicator redraws, it usually means that the values of past bars keep changing based on newer data, i.e. there is a look into the future. The model ends up learning from these future-looking values, and nothing good can come of it.
However, such indicators can be used as a target for training. Take the zigzag for example - it looks 100 bars ahead, which is why it draws trends so beautifully.

 
Dr.Trader:

But NA in columns 46-51 - there is really something wrong there. Either these indicators return everything in a different format and we need separate code to insert them into the table.

Or these indicators just return NA by themselves; maybe they need a larger window width; or they always return NA for the last bar and then replace the NA based on the new bar's data, which is redrawing, and that is bad.

Checked, it looks like a redraw. For the last bar, the nextCandlePosition indicator always returns NA, and then on the next bar it replaces the NA with the actual value. @mytarmailS, try your first code again but without this indicator and train the model, I think the result will be worse.

I corrected my script to take the penultimate value of nextCandlePosition instead of the last one, so now there will be no NA in the last rows of the table.
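The script itself is not shown here, so the following is only one possible reading of "take the penultimate value": shift the indicator column by one bar, so that each row holds the value that was already final when that bar closed. The vector ind and the helper lag1 are hypothetical.

```r
# Hypothetical indicator column: the last value is NA until the next bar arrives
ind <- c(0.2, -0.1, 0.4, 0.3, NA)

# Shift by one bar: row t now holds the value that was final at bar t - 1
lag1 <- function(x) c(NA, x[-length(x)])
indLagged <- lag1(ind)

print(indLagged)
```

The NA moves from the last row to the first one, where it is dropped together with the other warm-up rows.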

 
Dr.Trader:

For plain manual trading this is quite fine, I agree. The indicator has found some pattern that only just became apparent and showed it to us, all is well.

But we need to prepare the data and train the model. If an indicator redraws, it usually means that the values of past bars keep changing based on newer data, i.e. there is a look into the future. The model ends up learning from these future-looking values, and nothing good can come of it.
However, such indicators can be used as a target for training. Take the zigzag for example - it looks 100 bars ahead, which is why it draws trends so beautifully.

Let's take a simple example.

1. We draw a Hodrick-Prescott smoothing. It redraws.

On the current bar the tangent points up. On the next bar the tangent points down. The indicator has redrawn because it takes the latest changes into account. Forecast one step ahead - down.

2. We draw a lagging indicator that does not redraw.

On the current bar the tangent points up. On the next bar the tangent still points up - the indicator has not caught up with the changes yet.

Note that the HP indicator stops redrawing somewhere around 10-15 bars back.
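The difference between the two indicators can be checked numerically. The sketch below is an illustration under assumptions, not the setup from the post: prices are a made-up random walk, the HP trend is computed in base R from its penalized least-squares form with lambda = 1600 (an arbitrary choice here), and a trailing 10-bar mean stands in for the lagging indicator. The helpers hpTrend and trailingMean are hypothetical names. How far back the HP revisions stay noticeable depends on lambda and on the data.

```r
# HP trend via the penalized least-squares solution:
# trend = (I + lambda * D'D)^(-1) y, where D takes second differences
hpTrend <- function(y, lambda = 1600) {
  n <- length(y)
  D <- diff(diag(n), differences = 2)   # (n - 2) x n second-difference matrix
  as.numeric(solve(diag(n) + lambda * crossprod(D), y))
}

# Trailing mean: uses only the current and past bars
trailingMean <- function(y, k = 10) {
  as.numeric(stats::filter(y, rep(1 / k, k), sides = 1))
}

set.seed(42)
price <- cumsum(rnorm(120))             # made-up random-walk "prices"

# HP trend as seen at bar 100, and the same 100 bars recomputed at bar 120
hpOld <- hpTrend(price[1:100])
hpNew <- hpTrend(price[1:120])[1:100]
hpRevision <- abs(hpNew - hpOld)        # > 0 means past values were redrawn

# The same comparison for the trailing mean
maOld <- trailingMean(price[1:100])
maNew <- trailingMean(price[1:120])[1:100]
maRevision <- abs(maNew - maOld)        # NA for the first k - 1 bars

print(round(hpRevision[c(1, 50, 90, 100)], 4))
print(max(maRevision, na.rm = TRUE))
```

The HP values near the end of the first window change once new bars arrive, while the trailing mean of the same bars does not change at all.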

Your choice and why?

 
SanSanych Fomenko:

Your choice and why?

In manual trading - I can assume there are good strategies that use either of these indicators. I don't know any such strategies, so I would not trade with them.

In machine learning - I'll choose the second one. A dumb lagging indicator is better than any redrawing indicator.

 
Dr.Trader:

Checked, it looks like a redraw. For the last bar, the nextCandlePosition indicator always returns NA, and then on the next bar it replaces the NA with the actual value. @mytarmailS, try your first code again but without this indicator and train the model, I think the result will be worse.

I corrected my script to take the penultimate value of nextCandlePosition instead of the last one, so now there will be no NA in the last rows of the table.

Yes, I wrote that I had removed about six of the best predictors just to get rid of the ones that might redraw, but the accuracy dropped by literally 3%, so probably all of them redraw...

So, have you already trained the model? Maybe you should try it on a few thousand rows first instead of computing on 50k at once?

 
Although I don't know what could redraw there: the vast majority of candlestick patterns have only three possible outputs, TRUE, FALSE and NA.
 
mytarmailS:

Yes, I wrote that I had removed about six of the best predictors just to get rid of the ones that might redraw, but the accuracy dropped by literally 3%, so probably all of them redraw...

I took your code, removed the 6 indicators derived from nextCandlePosition (X27), and got an accuracy of 52% instead of 100%. With slightly different row indices for training, the accuracy is sometimes below 50%. In general - random.

mytarmailS:
Although I don't know what could redraw there: the vast majority of candlestick patterns have only three possible outputs, TRUE, FALSE and NA.

With nextCandlePosition you get values that actually belong to the next bar, so this is a look into the future by 1 step.

So overall it doesn't work.
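The effect of a one-step look-ahead is easy to reproduce on pure noise. This is a toy sketch, not the code discussed above: the returns are random numbers, the columns honest and leaky are hypothetical, and a logistic regression stands in for the model. The leaky column is simply the next bar's return, i.e. the same information the target is built from.

```r
set.seed(7)
n   <- 2000
ret <- rnorm(n)                      # made-up bar returns, pure noise

d <- data.frame(
  honest = ret,                      # known when the current bar closes
  leaky  = c(ret[-1], NA),           # next bar's return: a 1-step look-ahead
  target = as.integer(c(ret[-1], NA) > 0)   # 1 if the NEXT bar is up
)
d <- d[-n, ]                         # the last row has no next bar

train <- 1:1000
test  <- 1001:(n - 1)

# Fit on the training rows, report accuracy on the test rows
acc <- function(formula) {
  fit <- glm(formula, data = d[train, ], family = binomial)
  p   <- predict(fit, d[test, ], type = "response")
  mean((p > 0.5) == d$target[test])
}

# glm warns about perfect separation when the leak is present
accLeaky  <- suppressWarnings(acc(target ~ honest + leaky))
accHonest <- acc(target ~ honest)

print(accLeaky)
print(accHonest)
```

With the leaky column the model looks almost perfect; without it the accuracy falls back to roughly a coin flip, which matches the drop from 100% to about 52% described above.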

 
Dr.Trader:

I took your code, removed the 6 indicators derived from nextCandlePosition (X27), and got an accuracy of 52% instead of 100%. With slightly different row indices for training, the accuracy is sometimes below 50%. In general it is random.

With nextCandlePosition you get values that actually belong to the next bar, so this is a look into the future by 1 step.

So overall it doesn't work.

Well, it's good that everything has been cleared up, I didn't really believe in the grail anyway.

 

There is a new and very promising package, RKEEL, a gateway to KEEL.

Good luck

KEEL: Software tool. Evolutionary algorithms for Data Mining
  • www.keel.es
KEEL contains classical knowledge extraction algorithms, preprocessing techniques, Computational Intelligence based learning algorithms, evolutionary rule learning algorithms, genetic fuzzy systems, evolutionary neural networks, etc.