Machine learning in trading: theory, models, practice and algo-trading - page 286

 
Mihail Marchukajtes:
Well, now I see. In other words, first we train, say, a network without a teacher. Then we write the obtained values into the classifier's weights, and then the classifier is trained further with a teacher. So the weights obtained from the unsupervised pre-training continue to be optimized during fine-tuning? In other words, by pre-training without a teacher we set initial weights for the classifier that bring it close to the global minimum. Is that how it works?

"Deep" learning makes sense because of attenuation (sigmoid)\explosive growth (hypertangent) of derivatives during back propagation, when layers become significantly more than 4-5, this problem was circumvented by "crutch" of layer-by-layer pretraining by autoencoders or RBM, which essentially perform hierarchical nonlinear feature selection (like PCA only nonlinearly and many times), which is then easy to work with by high-level layers, then regular backprops quickly fine-tune whole system by targeting (with a teacher).

See: https://www.youtube.com/watch?v=iowsPfk7uUY

Of course, there are plenty of "experts" with articles, old hands at ZZ, who know by heart which R package should be used in every case, so I'll say this into the air, no one will believe me anyway: deep neural networks, like CNNs, once you understand how they are built, cannot be used for market forecasting. Such a deep hierarchy of features is needed only for images, video and NLP, because there a real hierarchy exists: our world is arranged that way, objects are made of objects, and so on; the same holds for NLP, because language is hierarchical. There is nothing like that in the market. You trade some particular timeframe, and what happens on the smaller ones is noise to you. Investors make decisions on one timeframe; they don't depend on what scalpers and HFTs, much less hedgers, arbitrageurs, etc., decide. And the hierarchy in a deep network implies coherence, that the small defines the large, that an avalanche is made of snowflakes.

(Embedded video: "Data Mining, Lecture 11: Deep Neural Networks", Tekhnosfera Mail.ru Group / Lomonosov MSU course "Methods for Processing Large Volumes of Data", autumn 2015.)
 
Zhenya:

It looks cool, but it is a bit expensive.

1. I would like to practice on something free, to see how it works in real time without delay; the demo has a huge lag.

2. Can you describe in a nutshell how this signal is used in trading and ML? If it's no secret: when important news is released, do you have time to trade, or does someone start hitting the market a second or half a second earlier and take everything?

1. Search the websites.

2. Buy on better than expected, sell on worse than expected.

 
toxic:

"Deep" learning makes sense because of attenuation (sigmoid)\explosive growth (hypertangent) of derivatives during back propagation, when layers become significantly more than 4-5, this problem was circumvented by "crutch" of layer-by-layer pretraining by autoencoders or RBM, which essentially perform hierarchical nonlinear feature selection (like PCA only nonlinearly and many times), which is then easy to work with by high-level layers, then regular backprops quickly fine-tune whole system by targeting (with a teacher).

See: https://www.youtube.com/watch?v=iowsPfk7uUY

Of course, there are plenty of "experts" with articles, old hands at ZZ, who know by heart which R package should be used in every case, so I'll say this into the air, no one will believe me anyway: deep neural networks, like CNNs, once you understand how they are built, cannot be used for market forecasting. Such a deep hierarchy of features is needed only for images, video and NLP, because there a real hierarchy exists: our world is arranged that way, objects are made of objects, and so on; the same holds for NLP, because language is hierarchical. There is nothing like that in the market. You trade some particular timeframe, and what happens on the smaller ones is noise to you. Investors make decisions on one timeframe; they don't depend on what scalpers and HFTs, much less hedgers, arbitrageurs, etc., decide. And the hierarchy in a deep network implies coherence, that the small defines the large, that an avalanche is made of snowflakes.

Thank you, I will definitely watch the video. But I liked Reshetov's approach better; I came across his article where he explains in detail how his predictor is structured. I'm looking forward to reworking the code for the article (the moderator is handling that). And I invite everyone to the discussion, because I have my own view on data preparation and training in general.
 
toxic:

"Deep" learning makes sense because of attenuation (sigmoid)\explosive growth (hypertangent) of derivatives during back propagation, when layers become significantly more than 4-5, this problem was circumvented by "crutch" of layer-by-layer pretraining by autoencoders or RBM, which essentially perform hierarchical nonlinear feature selection (like PCA only nonlinearly and many times), which is then easy to work with by high-level layers, then regular backprops quickly fine-tune whole system by targeting (with a teacher).

See: https://www.youtube.com/watch?v=iowsPfk7uUY

Of course, there are plenty of "experts" with articles, old hands at ZZ, who know by heart which R package should be used in every case, so I'll say this into the air, no one will believe me anyway: deep neural networks, like CNNs, once you understand how they are built, cannot be used for market forecasting. Such a deep hierarchy of features is needed only for images, video and NLP, because there a real hierarchy exists: our world is arranged that way, objects are made of objects, and so on; the same holds for NLP, because language is hierarchical. There is nothing like that in the market. You trade some particular timeframe, and what happens on the smaller ones is noise to you. Investors make decisions on one timeframe; they don't depend on what scalpers and HFTs, much less hedgers, arbitrageurs, etc., decide. And the hierarchy in a deep network implies coherence, that the small defines the large, that an avalanche is made of snowflakes.

For some reason the conversation keeps slipping into a discussion of the advantages and disadvantages of particular models.

Yet all my considerable experience tells me that the contribution of the models themselves to successful trading is extremely small.

What is decisive is the definition of the target variable and its predictors.

Using ZZ as an example, I have tried to show many times that even such an obvious, illustrative and beautiful target variable as ZZ is not what it seems, and on closer inspection it hides insurmountable obstacles.

If we talk about predictors, then to me, as someone who has worked in economics all his life, it is quite obvious that:

  • the predictor must be relevant to the target variable, i.e. have predictive ability for it
  • the predictor must lead (be ahead of) the target variable

If one concentrates on solving these two fundamental problems of economic and forex prediction, then success will come from these two factors alone. Selecting the model that best fits the target and its predictors can only marginally improve performance, though it can give some useful indication of the model's lifetime without retraining.
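(An illustration, not part of the post: a minimal C sketch of one way to check both requirements at once. It computes the Pearson correlation between the target and the predictor taken `lag` bars earlier; the function name and the idea of scanning several lags are my own, and the inputs are assumed to be aligned time series.)

#include <math.h>
#include <stddef.h>

// Pearson correlation between target[t] and predictor[t - lag], for t = lag .. n-1.
// A predictor that genuinely leads the target should show its strongest
// correlation at some lag > 0, not only at lag = 0.
double lagged_correlation(const double *predictor, const double *target, size_t n, size_t lag)
{
    if (lag >= n) return 0.0;
    size_t count = n - lag;
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (size_t t = lag; t < n; t++) {
        double x = predictor[t - lag], y = target[t];
        sx += x;   sy += y;
        sxx += x * x;   syy += y * y;   sxy += x * y;
    }
    double cov = sxy - sx * sy / count;
    double vx  = sxx - sx * sx / count;
    double vy  = syy - sy * sy / count;
    return (vx > 0 && vy > 0) ? cov / sqrt(vx * vy) : 0.0;
}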


Once again, I urge you to focus on the target variable and on justifying the predictors for that particular target variable.

PS.

I have been digging into the interrelationships of currency pairs and got results that surprised me. In particular:

  • the EURUSD and GBPUSD pairs are not related to each other, despite the widely published correlation; building VAR models on these pairs is hopeless
  • AUDUSD has nothing to do with the major currency pairs at all

 
SanSanych Fomenko:

I have been digging into the interrelationships of currency pairs and got results that surprised me. In particular:

  • the EURUSD and GBPUSD pairs are not related to each other, despite the widely published correlation; building VAR models on these pairs is hopeless
  • AUDUSD has nothing to do with the major currency pairs at all

It would probably be right to explain the reasoning behind such conclusions.
 
SanSanych Fomenko:

For some reason the conversation keeps slipping into a discussion of the advantages and disadvantages of particular models.

Yet all my considerable experience tells me that the contribution of the models themselves to successful trading is extremely small.

What is decisive is the definition of the target variable and its predictors.

Using ZZ as an example, I have tried to show many times that even such an obvious, illustrative and beautiful target variable as ZZ is not what it seems, and on closer inspection it hides insurmountable obstacles.

If we talk about predictors, then to me, as someone who has worked in economics all his life, it is quite obvious that:

  • the predictor must be relevant to the target variable, i.e. have predictive ability for it
  • the predictor must lead (be ahead of) the target variable

If one concentrates on solving these two fundamental problems of economic and forex prediction, then success will come from these two factors alone. Selecting the model that best fits the target and its predictors can only marginally improve performance, though it can give some useful indication of the model's lifetime without retraining.

Once again, I urge you to focus on the target variable and on justifying the predictors for that particular target variable.

PS.

I have been digging into the interrelationships of currency pairs and got results that surprised me. In particular:

  • the EURUSD and GBPUSD pairs are not related to each other, despite the widely published correlation; building VAR models on these pairs is hopeless
  • AUDUSD has nothing to do with the major currency pairs at all

Here I agree with you, but I want to make a correction, first of all to the first point. The predictor should not merely relate to the target, it should be the cause of it. That is, the predictor changed, so the target changed, not the other way around. Then the second point simply falls away: no anticipation is needed, it is enough that the input data is the cause of the output. And again, everyone forgets where we work. The main thing is the price on the exchange. Find the inputs that are the cause of price changes, and any TS, I assure you, absolutely any TS, will work as it should. But that is a secret!!!!! Don't tell anyone.... You guys really should read my article after all. Don't think I'm just promoting it; of course I'm worried that its only reader will be me :-) just kidding. After reading it, a lot of questions will fall away. Besides the AI itself (let's assume you have a network), you need to organize the data collection properly and be careful with indicators so that no peeking into the future occurs, etc. I think the article describes one possible approach to the market. I used to be so absorbed in the networks that trading stayed in the background; I think there are now specialists for whom trading is just a matter of experimentation.
 
Well, as for using deep neural networks in trading, there is a rationale for it, but the amount of preprocessing required is quite large: the idea is that the network works on the hourly timeframe but analyzes starting from the minutes, with the minute data summarized, then the five-minute data summarized, and so on. IMHO.
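(A rough sketch of that kind of summarizing, my reading of the idea rather than code from the post: collapse M1 closes into M5 averages so that a higher-timeframe model receives a compressed view of the lower timeframe. Averaging the close is only one of many possible ways to summarize.)

#include <stddef.h>

// Collapse minute closes into five-minute averages: m5_out[i] is the mean of
// five consecutive M1 closes. Returns how many M5 values were written.
size_t aggregate_m1_to_m5(const double *m1_close, size_t m1_count, double *m5_out)
{
    size_t m5_count = m1_count / 5;
    for (size_t i = 0; i < m5_count; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < 5; j++)
            sum += m1_close[i * 5 + j];
        m5_out[i] = sum / 5.0;
    }
    return m5_count;
}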
 

Mihail Marchukajtes:
Here I agree with you, but I want to make a correction, first of all to the first point. The predictor should not merely relate to the target, it should be the cause of it. That is, the predictor changed, so the target changed, not the other way around. Then the second point simply falls away: no anticipation is needed, it is enough that the input data is the cause of the output. And again, everyone forgets where we work. The main thing is the price on the exchange. Find the inputs that are the cause of price changes, and any TS, I assure you, absolutely any TS, will work as it should.

=================================================================================

This is absolutely the right point. I would only rephrase it: the target must be generated by the predictor(s).

 
mytarmailS:
Nevertheless, you don't have an answer to the question of how to implement this target using R tools, if I understood you correctly.

The answer is not so simple. Here is one way, for example, with a neural network, but you need to understand how a network works, what formulas it is made of, etc.

Suppose there is a simple network with four inputs, three perceptrons in the hidden layer, and one output. In mathematical language such a network would work like this:

#include <math.h>
double sigmoid(double x)
{
     return 1.0 / (1.0 + exp(-x));
}

double NeuralNetwork(double* input, double* bias, double* weight){
    double perc_output[3]; // temporary array holding the intermediate result of each hidden perceptron

    perc_output[0] = sigmoid(bias[0] + input[0] * weight[0] + input[1] * weight[1] + input[2] * weight[2]  + input[3] * weight[3]);
    perc_output[1] = sigmoid(bias[1] + input[0] * weight[4] + input[1] * weight[5] + input[2] * weight[6]  + input[3] * weight[7]);
    perc_output[2] = sigmoid(bias[2] + input[0] * weight[8] + input[1] * weight[9] + input[2] * weight[10] + input[3] * weight[11]);
    double result         = sigmoid(bias[3] + perc_output[0] * weight[12] + perc_output[1] * weight[13] + perc_output[2] * weight[14]);
    return result;
}


Now you can take a table of training examples and compute the network's result for each example:

double nn_input[4];   // array of input values

double nn_bias[4];    // array of biases; at this stage filled with random values
double nn_weight[15]; // array of weights; at this stage filled with random values

// for every training example compute the network output in turn;
// the nn_bias and nn_weight arrays must not change inside this loop
double results[trainSampleCount];
// trainSampleCount = number of training examples; trainSamples is the training table, one row per example
for(int i=0; i<trainSampleCount; i++){
  memcpy(nn_input, trainSamples[i], sizeof(nn_input)); // copy the i-th row of the training table into nn_input (memcpy needs <string.h>)
  results[i] = NeuralNetwork(nn_input, nn_bias, nn_weight);
}

Next, for example, plot the profit graph based on the predictions in the results array and evaluate it.

The code above can be fed to an optimizer. The optimizer has to find suitable values for the weights and biases in the nn_bias and nn_weight arrays:
1) change the values of nn_bias and nn_weight according to its algorithm
2) compute the results for all training examples
3) plot the trade graph
4) evaluate the trade graph and use this evaluation as the fitness value for the subsequent optimization steps
5) repeat steps 1-4 until the profit graph becomes acceptable

That's all, but there is a nuance: the optimizers I have tried cannot cope with the weights, they just find some local minimum where all the results equal 0.5 (which minimizes the average error) and get stuck in it. Some kind of trick is needed here; I have not progressed any further.
The more complex the structure of the network, the more weights there are and the harder it is for the optimization algorithm to select them; on large networks they simply stall and barely improve the initial result.
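A minimal sketch of steps 1-5 using the simplest possible optimizer, random search (my choice of algorithm, not from the post). NeuralNetwork(), nn_bias and nn_weight are the ones from the snippets above; trainSamples, trainReturns (the per-example return used as a crude trade result) and the step size of 0.1 are assumptions for illustration.

#include <stdlib.h>
#include <string.h>

// Crude fitness: go long when the network output is above 0.5, short otherwise,
// and sum the per-example returns (trainReturns is assumed to hold them).
double fitness(double nn_bias[4], double nn_weight[15],
               double trainSamples[][4], double *trainReturns, int trainSampleCount)
{
    double profit = 0.0;
    for (int i = 0; i < trainSampleCount; i++) {
        double p = NeuralNetwork(trainSamples[i], nn_bias, nn_weight); // NeuralNetwork() as defined earlier
        profit += (p > 0.5 ? 1.0 : -1.0) * trainReturns[i];
    }
    return profit;
}

// Random search over the weights: perturb them, keep the perturbation only if the fitness improves.
void optimize(double nn_bias[4], double nn_weight[15],
              double trainSamples[][4], double *trainReturns, int trainSampleCount, int iterations)
{
    double best = fitness(nn_bias, nn_weight, trainSamples, trainReturns, trainSampleCount);
    for (int it = 0; it < iterations; it++) {
        double try_bias[4], try_weight[15];
        memcpy(try_bias, nn_bias, sizeof(try_bias));
        memcpy(try_weight, nn_weight, sizeof(try_weight));
        for (int k = 0; k < 4; k++)  try_bias[k]   += 0.1 * (2.0 * rand() / RAND_MAX - 1.0);
        for (int k = 0; k < 15; k++) try_weight[k] += 0.1 * (2.0 * rand() / RAND_MAX - 1.0);
        double f = fitness(try_bias, try_weight, trainSamples, trainReturns, trainSampleCount);
        if (f > best) { // keep the step only if the trade result improved
            best = f;
            memcpy(nn_bias, try_bias, sizeof(try_bias));
            memcpy(nn_weight, try_weight, sizeof(try_weight));
        }
    }
}

Scoring the total trade result instead of the 0.5-centred average error is what keeps this particular fitness from rewarding the "all outputs equal 0.5" local minimum mentioned above, though it does nothing against overfitting.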

 
Dr.Trader:

Here you need to apply some kind of trick.

The trick is actually known, but I haven't seen software that implements it: derivatives.

The network, the balance curve and the graph evaluation are all formulas, so you can take the derivatives of the final evaluation with respect to nn_bias and nn_weight.
In some training video about ML the lecturer talked about new programming languages of the future: someone somewhere is trying to make a language with automatic calculation of derivatives for any variable inside any complex formula (analytically, not by recomputing with a small shift of the value). That is the kind of thing that would help here.

That is, usually you take one training example and for each weight analytically determine how much changing it improves the result, then increase or decrease the weight slightly accordingly. The same needs to be done here, but not for one example at a time: for all of them at once, with the derivative taken not of the individual training results but directly of the final evaluation of the graph.
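For the toy network above, those per-weight derivatives exist in closed form via the chain rule; here is a sketch for two of the weights (my illustration only, the real thing would have to differentiate the whole graph evaluation, as described above).

#include <math.h>

static double sigm(double x) { return 1.0 / (1.0 + exp(-x)); }

// Analytic derivatives of the toy network's output with respect to two of its weights,
// using sigmoid'(z) = s * (1 - s) and the chain rule.
void example_gradients(const double input[4], const double bias[4], const double weight[15],
                       double *d_weight12, double *d_weight0)
{
    // forward pass, same structure as NeuralNetwork() above
    double h0 = sigm(bias[0] + input[0]*weight[0] + input[1]*weight[1] + input[2]*weight[2]  + input[3]*weight[3]);
    double h1 = sigm(bias[1] + input[0]*weight[4] + input[1]*weight[5] + input[2]*weight[6]  + input[3]*weight[7]);
    double h2 = sigm(bias[2] + input[0]*weight[8] + input[1]*weight[9] + input[2]*weight[10] + input[3]*weight[11]);
    double out = sigm(bias[3] + h0*weight[12] + h1*weight[13] + h2*weight[14]);

    // d out / d weight[12]: output-layer weight of the first hidden unit
    *d_weight12 = out * (1.0 - out) * h0;

    // d out / d weight[0]: first input weight of the first hidden unit, chained through both sigmoids
    *d_weight0  = out * (1.0 - out) * weight[12] * h0 * (1.0 - h0) * input[0];
}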


And a small drawback: in this form none of it will help for trading. We would just fit the weights to an ideal chart, get 100% overfitting and lose money on new data. For real profit one would have to work on the structure of the network; in the end a profitable network would probably be something like a convolutional one.
