Discussion of article "Thomas DeMark's Sequential (TD SEQUENTIAL) using artificial intelligence" - page 6

 
Mihail Marchukajtes:

I built the model on 50 records; I was interested in how it would perform over the next 50, i.e. over 100% of the training interval. If you increase the number of records used to build the model without increasing the number of inputs, the generalisation ability will decrease. So, by adjusting the length of the sample, the generalisation level can be brought down to an acceptable 65%. If we agree that this is enough to earn on the market, then the training sample can be made much larger, and such a model will work much longer, although much worse than a model with a generalisation level of 90%. Applying proper money management to such models (65%), you can make a lot of money.

I have already said: you can't learn from 50 samples, the data is very noisy, and the previous minute does not contain information about all the nuances of market behaviour. Learn to train on the whole set.

I don't know how you compute your "generalisation", but even calculating accuracy is still an open question for you. The algorithms were given above; even a non-programmer can read them: count how many times the model guessed correctly and divide by the number of samples.
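To make that concrete, here is a minimal sketch of such an accuracy count in Python (the array names are illustrative, nothing from this thread):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples where the predicted class matches the label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# hypothetical 0/1 signal labels and model outputs
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 0.75: the model guessed 6 of 8 samples
```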

 
toxic:

There is something to catch, but not with data like this. On low-frequency data the price already takes everything into account; from pure market data (price, volume, delta, etc.) you can't get anything, because the price adapts to news and new information almost completely, within a few minutes, and information diffusion is the main market inefficiency. The rest, to put it simply, is just insider information: you don't know when and why a big player will buy or sell in size, creating trends, and when he will stop.

Imagine that you are in a fight. Your success depends on how well you predict the opponent's blows from their very start: seeing the stance and the beginning of the movement, you take the appropriate evasive manoeuvres, and when you see a weakness in the opponent's defence, you attack. It is exactly the same in trading (speculation). You can't just decide to react twice as slowly; you won't become twice less effective, you will lose effectiveness completely.

Now all speculation is automated. Everything based on the diffusion of information (statistical and event arbitrage, etc.) is HFT, though not necessarily ultra-HFT as in market making; it is more like "algo-scalping" (average holding time around a minute, maybe ten minutes). But we are certainly not talking about hours and days: at those horizons there is no information left in prices, that is all ancient history.

But in general, theoretically, it is possible to predict hours and even days, only not from market data alone: you would need to monitor thousands of parameters of human activity all over the world, especially around large companies. You would need the weather, the amount of traffic everywhere, people's social activity on the Internet, especially so-called leaks. For example, I've heard that factories are watched from space to see how much output is produced, what was brought in and what was taken out)))) This borders on insider trading, but if you're not caught, you're not a thief))))). And all of this has to be processed into feature form by a team of good analysts, plus a team of good fundamental forecasters, plus the collection and analysis of open forecasts. In general, even a medium-sized bank would not have enough resources to implement all of this and bring it to production quality. And on the basis of price and volume alone, at horizons of days, it is impossible to predict a statistically reliable future price; that is a fairy tale for the "bet on red and double up" crowd))))


All true, I agree completely, but the market itself still forms the preconditions. As an example, take a "Sequential" signal: suppose a buy signal was formed and the NS says the signal is "True", but after some time the market situation changes, the signal loses its relevance and the market begins to go against it; in the end it is written off as a network error. Or rather, it was not an error: at the moment of the signal the market really was going up, and only later changed its mind. So we get an error, and that's that. The task is to make as few mistakes as possible :-) So, in the end, how about optimising my data?
 
Mihail Marchukajtes:

So, in the end, how about optimising my data?

150 samples. whoa.

Okay, I'll run it tonight.

 
toxic:

I have already said: you can't learn from 50 samples, the data is very noisy, and the previous minute does not contain information about all the nuances of market behaviour. Learn to train on the whole set.

I don't know how you compute your "generalisation", but even calculating accuracy is still an open question for you. The algorithms were given above; even a non-programmer can read them: count how many times the model guessed correctly and divide by the number of samples.


By the way, yes: at the beginning, when I was putting together the indicator codes and everything else, I counted on history which signals occurred and how many, because we have 4 parameters (and it all worked out for me). But signals can only be counted properly when the number of zeros and ones is equal, because otherwise some rows get copied during the split. That is, if two ones are missing, it adds them (I mean the Separator). Now I will try to obtain the model and demonstrate how it works on the training sample, but for that you need to take an equal amount of data with zeros and ones. A little later, the machine is still computing. As for the training file, I can convert it to 11 columns and 750 rows. Would such a file be more convenient?
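If that balancing simply duplicates rows of the scarcer class until the counts match, it amounts to something like the sketch below (plain oversampling in Python/pandas; the column name "target" and the details are my assumptions, not what the Separator actually does):

```python
import pandas as pd

def balance_by_oversampling(df, target_col="target", random_state=0):
    """Duplicate random rows of the minority class until 0s and 1s are equal."""
    counts = df[target_col].value_counts()
    minority = counts.idxmin()
    deficit = int(counts.max() - counts.min())
    if deficit == 0:
        return df
    extra = df[df[target_col] == minority].sample(
        n=deficit, replace=True, random_state=random_state
    )
    return pd.concat([df, extra], ignore_index=True)

# hypothetical frame: four ones and two zeros -> two zero-rows get copied
df = pd.DataFrame({"f1": range(6), "target": [1, 1, 1, 1, 0, 0]})
print(balance_by_oversampling(df)["target"].value_counts())
```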
 
Mihail Marchukajtes:

By the way, yes: at the beginning, when I was putting together the indicator codes and everything else, I counted on history which signals occurred and how many, because we have 4 parameters (and it all worked out for me). But signals can only be counted properly when the number of zeros and ones is equal, because otherwise some rows get copied during the split. That is, if two ones are missing, it adds them (I mean the Separator). Now I will try to obtain the model and demonstrate how it works on the training sample, but for that you need to take an equal amount of data with zeros and ones. A little later, the machine is still computing. As for the training file, I can convert it to 11 columns and 750 rows. Would such a file be more convenient?


11 columns and 750 rows is certainly better; maybe with cross-validation something will come together ...

In general, post different sets with low-frequency data, their features and the target; you can send them in a private message if something is not for the public. I confess I have not dug deeply into low-frequency data, since the "experienced" convinced me right away that there is no information in it, so if you change my mind I will be grateful: it would overturn my understanding of the markets. I am ready for that, although I consider it unlikely.
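A minimal sketch of what checking such a file with cross-validation might look like (Python/scikit-learn; the random data here just stands in for the proposed 750-row, 11-column file):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# stand-in for the 750-row, 11-column training file (random numbers here)
rng = np.random.default_rng(0)
X = rng.normal(size=(750, 11))
y = rng.integers(0, 2, size=750)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(scores.mean(), scores.std())  # consistently above ~0.5 would hint at some signal
```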

 

Yes indeed, I prepared the file at once, it is the same data, but collected differently.

Well, I have one more theory, an assumption so to speak, which I want to voice and hear an opinion on, not only from you, toxic (by the way, what is your name? it is awkward to address you by nickname), but also from Wizard. After all, he has been in this business a long time, and I remember we used to talk back on NSh.

 

I would like to discuss another topic. The point is that in the process of optimisation we get a model, and after several optimisations we end up with a set of, say, 5 models (as an example). The question is how to choose the one that will work best in the future. I will post a link to a lecture below; please watch it from the 33rd minute, where he talks about the degree of the polynomial and the effect of overfitting the polynomial, the error-minimisation graph. So let's talk about that.

That is, the task of the optimiser is to build a model so that the degree of approximation is maximal at minimal polynomial dimension, if I understand him correctly. In other words, the task is to build a polynomial that maximises the degree of approximation to the output without having a large degree. Now imagine that our optimiser knows how to build such a polynomial, and that with repeated optimisation on the data we constantly land in a certain area on the border between fitting and overfitting. Suppose this is a small area, and no matter how many times we land there, we always get models that fall into the area of sufficiency rather than overfitting (I sketched it schematically as best I could), BUT these models will differ in how they perform in the future. So the whole point is choosing the very model that will keep working, in the expert's opinion. Maybe there are methods for selecting models by their future workability?

The figure shows the area where training is complete and sufficient, and, most importantly, not overfitted.
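A minimal illustration of the idea from the lecture, under my own assumptions (a toy one-dimensional fit in Python/NumPy, nothing to do with the actual optimiser discussed here): fit polynomials of increasing degree and pick the degree where the validation error bottoms out, before overfitting sets in.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 60)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=x.size)  # noisy toy target

idx = rng.permutation(x.size)                 # random train/validation split
train_idx, val_idx = idx[:40], idx[40:]

errors = {}
for degree in range(1, 12):
    coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
    pred = np.polyval(coeffs, x[val_idx])
    errors[degree] = float(np.mean((pred - y[val_idx]) ** 2))  # validation MSE

best = min(errors, key=errors.get)
print(best, errors[best])  # low degrees underfit, high degrees overfit; the minimum sits between
```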


001. Вводная лекция - К.В. Воронцов (001. Introductory lecture - K. V. Vorontsov)
  • 2014.12.22
  • www.youtube.com
The "Machine Learning" course is one of the core courses of the School, which is why it is mandatory for all ShAD students. Lecturer: Konstantin Vyacheslavovich Vorontsov...
 
What surprises me is so many views and yet no one keeping the conversation going. HELLOOO, people... Where are you? Are only toxic and Wizard in this topic? I can't believe it...
 
Mihail Marchukajtes:

Yes indeed, I prepared the file at once, it is the same data, but collected differently.

Well, I have one more theory, an assumption so to speak, which I want to voice and hear an opinion on, not only from you, toxic (by the way, what is your name? it is awkward to address you by nickname), but also from Wizard. After all, he has been in this business a long time, and I remember we used to talk back on NSh.

Well, in general, surprisingly, the first dataset does contain a little bit of alpha, about 3-4% above the random 50%, if you don't nitpick too much; theoretically, with a larger number of samples, up to 5-6% might be squeezed out, which for hours and days is in principle VERY NOT BAD, given the hefty transaction costs. Hmmm... interesting, interesting... It wouldn't hurt if someone else checked how much information there is.

This is all, of course, only if the target is correct, i.e. there is no past return or price in the target. The target should contain only a future return. For example, if your features are built from the prices p(t-n), ..., p(t-1), p(t), the target must not "see" the prices the features are built from: the target can be the sign of the next return, sign((p(t+2) - p(t+1)) / p(t+1)), but if the target is sign((p(t+1) - p(t)) / p(t)), the picture gets "smeared" and you get unrealistic model performance, a fake "grail". It is important to take this into account.
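For illustration, a small sketch of how such a leak-free target could be built (Python/pandas; the price series and names are made up, this is just the point above restated in code under my assumptions):

```python
import numpy as np
import pandas as pd

# hypothetical close-price series
prices = pd.Series(np.cumsum(np.random.default_rng(2).normal(size=500)) + 100.0)

# features built from prices up to and including p(t)
features = pd.DataFrame({
    "ret_1": prices.pct_change(1),   # uses p(t-1) and p(t)
    "ret_5": prices.pct_change(5),   # uses p(t-5) and p(t)
})

# leak-free target: sign of the return from p(t+1) to p(t+2), never touching p(t)
target = np.sign(prices.shift(-2) / prices.shift(-1) - 1.0)

# the variant warned about above: sign of (p(t+1) - p(t)) / p(t) shares p(t) with the features
leaky_target = np.sign(prices.shift(-1) / prices - 1.0)

data = pd.concat([features, target.rename("target")], axis=1).dropna()
print(data.shape)
```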


The second dataset (the longer one) is no good at all: you have stretched it strangely and shifted some features relative to others)))).

 
toxic:

Well, in general, surprisingly, the first dataset does contain a little bit of alpha, about 3-4% above the random 50%, if you don't nitpick too much; theoretically, with a larger number of samples, up to 5-6% might be squeezed out, which for hours and days is in principle VERY NOT BAD, given the hefty transaction costs. Hmmm... interesting, interesting... It wouldn't hurt if someone else checked how much information there is.

This is all, of course, only if the target is correct, i.e. there is no past return or price in the target. The target should contain only a future return. For example, if your features are built from the prices p(t-n), ..., p(t-1), p(t), the target must not "see" the prices the features are built from: the target can be the sign of the next return, sign((p(t+2) - p(t+1)) / p(t+1)), but if the target is sign((p(t+1) - p(t)) / p(t)), the picture gets "smeared" and you get unrealistic model performance, a fake "grail". It is important to take this into account.


The second dataset (the longer one) is no good at all: you have stretched it strangely and shifted some features relative to others)))).


Yes, my output looks into the future; don't worry about the cleanliness of the data collection, I am very careful about that.

That was 15-minute data.

For the second one I simply turned the columns into rows, 11 at a time, and duplicated the output; it turns out that when we get a signal, we feed 11 columns 5 times for a single signal, so you could even organise a committee at this level. I also made such a file for myself; as soon as the machine is free I will try to run it.
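If I understand that transformation correctly (each signal is 55 values split into 5 blocks of 11 columns, with the label copied to each block), it amounts to something like this sketch (Python/pandas; the layout and names are my guess, not the actual file format):

```python
import numpy as np
import pandas as pd

# hypothetical wide layout: 2 signals, 55 columns each (5 blocks of 11)
wide = pd.DataFrame(np.arange(2 * 55).reshape(2, 55))
outputs = pd.Series([1, 0])                    # one label per signal

rows = wide.to_numpy().reshape(-1, 11)         # 5 rows of 11 values per signal
long = pd.DataFrame(rows)
long["target"] = outputs.repeat(5).to_numpy()  # the label is duplicated 5 times
print(long.shape)                              # (10, 12)
```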