Cross-validation and testing for non-linear regression - General

Maxim Dmitrievsky 2017.12.29 07:53 #21

Non-linear regression is a machine learning model, and the main problem with such models is overfitting, i.e. fitting too closely to the current slice of the chart, as already mentioned above. The model is constantly re-fitted to new data, so at certain moments its effectiveness tends to zero. To prevent this, one should use cross-validation and out-of-sample testing. Anyone more or less familiar with the subject will understand at once, even without a run in the tester, that this channel will not work on real data.
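
For readers unfamiliar with out-of-sample testing on price series, here is a minimal sketch of a walk-forward split. The function name `walk_forward_splits` and its parameters are hypothetical, chosen only for illustration; the post does not say which validation scheme was meant. The point it shows is that each test block lies strictly after the data used for fitting.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in time order.

    Every test block starts right after its training block ends, so the
    model is always scored on bars it has not seen (out of sample).
    """
    if n_folds < 1 or min_train < 1 or n_samples - min_train < n_folds:
        raise ValueError("not enough samples for the requested folds")
    test_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * test_size  # training window grows each fold
        test_end = train_end + test_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Hypothetical example: 10 bars, 3 folds, at least 4 bars for the first fit.
splits = list(walk_forward_splits(n_samples=10, n_folds=3, min_train=4))
for train, test in splits:
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

A shuffled k-fold split would let the model see bars from the future of its test block, which is why a time-ordered split is the usual choice for this kind of data.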

Aleksey Panfilov 2017.12.29 08:25 #22

Maxim Dmitrievsky:

Non-linear regression is a machine learning model, and the main problem with such models is overfitting, i.e. fitting too closely to the current slice of the chart, as already mentioned above. The model is constantly re-fitted to new data, so at certain moments its effectiveness tends to zero. To prevent this, one should use cross-validation and out-of-sample testing. Anyone more or less familiar with the subject will understand at once, even without a run in the tester, that this channel will not work on real data.

Difference equations for polynomials of order N do not have this drawback. Sooner or later I will test the channel on these equations too.

Maxim Dmitrievsky 2017.12.29 08:28 #23

Aleksey Panfilov:

Difference equations for polynomials of order N do not have this drawback. Sooner or later I will test the channel on these equations too.

What are they, these difference equations? ) And why are they free of this drawback?

Aleksey Panfilov 2017.12.29 08:55 #24

Maxim Dmitrievsky:

What are they, these difference equations? ) And why are they free of this drawback?

Loosely speaking:

Take the classical EMA formula: it is a difference equation of the first degree (but of the second order; the order is determined by the number of points through which the line is laid, as with a ruler or a template), a full analogue of Archimedes' lever. This is interpolation: from the previously calculated point and the latest price value, the next point is built adjacent to the calculated one, and it is not redrawn.
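
The lever picture can be sketched as follows, under the assumption that the smoothing factor is written as 1/(n+1) for some lever arm n (the post gives no explicit EMA formula, so `ema_lever` and `n` are hypothetical names). The list is in chronological order, unlike the series indexing used in the formulas of this thread. The check at the end shows that the price sits on the straight line through two consecutive EMA points, extended n steps past the newer one.

```python
def ema_lever(prices, n, seed=None):
    """First-degree recurrence a[t] = (price[t] + n * a[t-1]) / (n + 1).

    This is the usual EMA with smoothing factor alpha = 1 / (n + 1).
    The first value is seeded with the first price unless `seed` is given.
    """
    prev = prices[0] if seed is None else seed
    line = []
    for price in prices:
        prev = (price + n * prev) / (n + 1)  # new point depends only on the past
        line.append(prev)
    return line


prices = [10.0, 12.0, 11.0, 13.0, 12.5]  # made-up prices for illustration
n = 3
line = ema_lever(prices, n)

# Lever check: extend the segment (a[t-1], a[t]) by n steps and land on the price.
for t in range(1, len(prices)):
    extended = line[t] + n * (line[t] - line[t - 1])
    assert abs(extended - prices[t]) < 1e-9
```

Because each new point uses only the previous point and the latest price, earlier points never change, which matches the "not redrawn" remark above.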

For a second-degree polynomial, two previously calculated points and the latest price (I take the open or the median of the penultimate bar) are used to build a point adjacent to the first two, and it is not redrawn either. And so on. If the formula is inverted, we can extrapolate: from three adjacent calculated points, a second-degree polynomial gives a fourth point at a given distance from the first three. That point is not redrawn either.

If necessary, you can make the construction lines visible (that is, calculate in a loop all the adjacent points up to any given point); these construction lines will be redrawn with every new price.

Example formulas.

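      // Interpolation (with division): the current point is solved from the price and four previous points.
      a1_Buffer[i]=((open[i]-Znach)+5061600*a1_Buffer[i+1]-7489800*a1_Buffer[i+2]+4926624*a1_Buffer[i+3]-1215450*a1_Buffer[i+4])/1282975;

      // Extrapolation (no division): a point outside the interval is built from three adjacent points of a1_Buffer.
      a2_Buffer[i]=3160*a1_Buffer[i]-6240*a1_Buffer[i+1]+3081*a1_Buffer[i+2];

      // The same interpolation formula, kept in a separate buffer.
      a3_Buffer[i]=((open[i]-Znach)+5061600*a3_Buffer[i+1]-7489800*a3_Buffer[i+2]+4926624*a3_Buffer[i+3]-1215450*a3_Buffer[i+4])/1282975;

      // Extrapolation from a3_Buffer with a different set of weights.
      a4_Buffer[i]=2701*a3_Buffer[i]-5328*a3_Buffer[i+1]+2628*a3_Buffer[i+2];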

Formulas with division (first and third) are interpolation (finding a point inside an interval).

Formulas without division are extrapolation (finding a point outside the initial interval).
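
The post does not say where the coefficients come from. One reading, offered here as an observation from the numbers rather than as the author's stated method, is that they coincide with Lagrange weights on equally spaced bars: a fourth-degree polynomial through five points evaluated 72 bars ahead, and second-degree polynomials through three points evaluated 78 and 72 bars ahead. The horizons 72 and 78 and all function names below are assumptions. A short sketch that reproduces the numbers exactly:

```python
from fractions import Fraction


def lagrange_weights(nodes, x):
    """Weights w_k with p(x) == sum(w_k * p(nodes[k])) for every polynomial p
    of degree below len(nodes)."""
    weights = []
    for k, xk in enumerate(nodes):
        w = Fraction(1)
        for j, xj in enumerate(nodes):
            if j != k:
                w *= Fraction(x - xj, xk - xj)
        weights.append(w)
    return weights


def integer_weights(nodes, x):
    """Same weights as integers; they are whole numbers for these inputs."""
    ws = lagrange_weights(nodes, x)
    assert all(w.denominator == 1 for w in ws)
    return [w.numerator for w in ws]


def solve_current(price, previous, weights):
    """Interpolation step: price = w0*a[i] + w1*a[i+1] + ..., solved for a[i]."""
    rest = sum(w * a for w, a in zip(weights[1:], previous))
    return Fraction(price - rest, weights[0])


# Offsets follow series indexing: 0 is a[i], 1 is a[i+1] (one bar older), and so on.
# A negative x is a point that many bars ahead of a[i].
quartic_72 = integer_weights([0, 1, 2, 3, 4], -72)
quad_78 = integer_weights([0, 1, 2], -78)
quad_72 = integer_weights([0, 1, 2], -72)
print(quartic_72)  # [1282975, -5061600, 7489800, -4926624, 1215450]
print(quad_78)     # [3160, -6240, 3081]
print(quad_72)     # [2701, -5328, 2628]

# Sanity check on an exact quartic: the step recovers p(0) from p(1..4) and p(-72).
p = lambda x: x**4 - 3 * x + 2
recovered = solve_current(p(-72), [p(1), p(2), p(3), p(4)], quartic_72)
print(recovered)   # 2
```

Under this reading, the interpolation formulas treat the incoming price as the far end of the polynomial and divide by the leading weight to get the current point, while the extrapolation formulas apply the weights directly, which is why they contain no division.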

Maxim Dmitrievsky 2017.12.29 09:12 #25

Aleksey Panfilov:

Loosely speaking:

Take the classical EMA formula: it is a difference equation of the first degree (but of the second order; the order is determined by the number of points through which the line is laid, as with a ruler or a template), a full analogue of Archimedes' lever. This is interpolation: from the previously calculated point and the latest price value, the next point is built adjacent to the calculated one, and it is not redrawn.

For a second-degree polynomial, two previously calculated points and the latest price (I take the open or the median of the penultimate bar) are used to build a point adjacent to the first two, and it is not redrawn either. And so on. If the formula is inverted, we can extrapolate: from three adjacent calculated points, a second-degree polynomial gives a fourth point at a given distance from the first three. That point is not redrawn either.

If necessary, you can make the construction lines visible; these lines will be redrawn with every new price.

Example of formulas.

Formulas with division (the first and the third ones) are interpolation (finding a point inside an interval).

Formulas without division are extrapolation (finding a point outside the initial interval).

Curious, but I'm not quite sure why this should adequately predict a non-stationary market, for example...

I'm reading this at the moment: http://blog.datadive.net/selecting-good-features-part-iv-stability-selection-rfe-and-everything-side-by-side/

Aleksey Panfilov 2017.12.29 09:22 #26

Maxim Dmitrievsky:

Curious, but I'm not quite sure why this should adequately predict a non-stationary market, for example...

I'm reading this at the moment: http://blog.datadive.net/selecting-good-features-part-iv-stability-selection-rfe-and-everything-side-by-side/

These equations don't promise to predict trading either; they only construct curves of the given polynomials or sinusoids based on the entire history.

Just as regression, in fact, doesn't promise to predict trading. :)

Maxim Dmitrievsky 2017.12.29 09:25 #27

Aleksey Panfilov:

These equations don't promise to predict trading either; they only construct curves of the given polynomials or sinusoids based on the entire history.

Just as regression, in fact, doesn't promise to predict trading. :)

So my main idea now is automatic selection of the most informative (at the current moment) features via random forests, over a specified time interval and with automatic retraining... because if you take the whole history the model gets too coarse, and if you take too little it is always overfitted... but if you vary the quality and number of features via feature importance and do cross-validation, there is a chance to periodically catch the needed patterns.

But it's such a headache that I've long regretted getting into it, yet there's no turning back now :)
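
The idea of re-selecting features on a recent window at a fixed interval can be sketched roughly as below. This is not the poster's implementation: absolute correlation is used here as a simple stand-in for random forest feature importance, and the data, the function names and the parameters `window`, `step` and `top_k` are all made up for illustration.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation; 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def rolling_top_features(features, target, window, step, top_k):
    """Every `step` bars, score each feature on the last `window` bars only
    and keep the `top_k` best. Returns a list of (bar_index, names)."""
    picks = []
    for end in range(window, len(target) + 1, step):
        lo = end - window
        scores = {name: abs_corr(col[lo:end], target[lo:end])
                  for name, col in features.items()}
        best = sorted(scores, key=scores.get, reverse=True)[:top_k]
        picks.append((end, best))
    return picks


# Synthetic data: the target follows f1 in the first half and f2 in the second.
rng = random.Random(0)
n_bars = 200
features = {name: [rng.gauss(0, 1) for _ in range(n_bars)]
            for name in ("f1", "f2", "noise")}
target = [(features["f1"][t] if t < 100 else features["f2"][t]) + 0.1 * rng.gauss(0, 1)
          for t in range(n_bars)]

picks = rolling_top_features(features, target, window=50, step=50, top_k=1)
for bar, names in picks:
    print(bar, names)
```

With a short window the selection follows the regime change; with the window set to the whole history, both f1 and f2 would look only moderately useful, which is the "too coarse" effect described above.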

Aleksey Panfilov 2017.12.29 09:32 #28

Maxim Dmitrievsky:

So my main idea now is automatic selection of the most informative (at the current moment) features via random forests, over a specified time interval and with automatic retraining... because if you take the whole history the model gets too coarse, and if you take too little it is always overfitted... but if you vary the quality and number of features via feature importance and do cross-validation, there is a chance to periodically catch the needed patterns.

But it's such a headache that I've long regretted getting into it, yet there's no turning back now :)

Here we are already getting into neural networks, and you won't see any channels there. :) Although they may be there in the code.

But a visual channel can be attached to a line built on inertia or to a sinusoidal line. )

Maxim Dmitrievsky 2017.12.29 09:36 #29

Aleksey Panfilov:

Here we are already getting into neural networks, and you won't see any channels there. :) Although they may be there in the code.

But a visual channel can be attached to a line built on inertia or to a sinusoidal line. )

Well, why not: build curves from the values at the outputs of the NN and you can have channels too... but I don't see much point in channels, because most of the different kinds of signals for the trading system (TS) are missed and you end up with only one strategy, a return to the mean.

Alexander_K2 2017.12.29 09:44 #30

Maxim Dmitrievsky:

Well, why not: build curves from the values at the outputs of the NN and you can have channels too... but I don't see much point in channels, because most of the different kinds of signals for the trading system (TS) are missed and you end up with only one strategy, a return to the mean.

Greetings Maxim! That's exactly how it turns out.

Do you know how to make canals? - page 3