Machine learning in trading: theory, models, practice and algo-trading - page 2786

 
Valeriy Yastremskiy #:

That is, when there are few factors the process is controllable, but beyond a certain number of factors collisions appear and the sum of the factors starts producing probabilistic results. Besides, factors can have, and do have, connections with each other, and there are feedbacks. A Markov process has no such links.

It describes random variables - how can it have connections (logically, there are none)... but since there is a matrix, connections can be found/described/lost/formed in it... imho it is about describing the state under the influence of events... still the same statistics, but also a step forward that depends on the current state (and it is exactly this step that sets the dynamics of the statistical series at each moment)... only the "randomness" in the whole Markovian formulation also confuses me (but that's what statistics and dim_reduction are for).

 
JeeyCi #:

it describes random variables - how can it have connections (logically, there are none)... but since there is a matrix, connections can be found/described/lost/formed in it... imho it is about describing the state under the influence of events... still the same statistics, but also a step forward that depends on the current state (and it is exactly this step that sets the dynamics of the statistical series at each moment)... only the "randomness" in the whole Markovian formulation also confuses me (but that's what statistics and dim_reduction are for).

I gave my understanding of the physics of random processes; for me there are two variants. In the market the first is when the sheer number of factors gives a probabilistic result, and the second is the result of a low-frequency process relative to a high-frequency one, with the processes not interconnected.

And philosophically I understand that if there is no connection with past values of a function or a process, the process will be random. But in the real world this is usually not the case.

In the market, if we assume that prolonged stationary states are the effect of some inertia from strong factors, or from strong prolonged factors, then they can be distinguished from the noise, and that is not a Markov state. The approach of testing whether the model differs from a random walk is quite logical. But what to do with it: logically, if the state is non-Markovian it is worth investigating, and if there are no differences from a random walk there is no point in investigating.))))))
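(For reference, a minimal sketch of what "distinguishing from a random walk" can look like in practice; the ADF unit-root test and the synthetic data here are my own choice of illustration, not anything from the posts.)

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(5)
random_walk = np.cumsum(rng.normal(size=2000))        # pure random walk
mean_reverting = np.zeros(2000)                       # AR(1) with a pull back to zero
for t in range(1, 2000):
    mean_reverting[t] = 0.9 * mean_reverting[t - 1] + rng.normal()

for name, s in (("random walk", random_walk), ("mean-reverting", mean_reverting)):
    stat, p = adfuller(s)[:2]                         # ADF test: H0 = unit root (random walk)
    print(f"{name:15s} ADF p-value = {p:.3f}")        # small p -> distinguishable from a random walk
```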

 
Maxim Dmitrievsky #:

What do you mean there is no connection - wherever you are in the transition matrix, that is where you go from.

That much is clear, but it is still a random process, because there is no connection between the current value and the previous one)))))) And yes, there are values in the matrix)))))

RNGs (random number generators) are built on the principle of minimising this dependence to almost zero).
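(A minimal sketch of the transition-matrix point, with made-up numbers: the next state is drawn only from the row of the current state, so all the "connection" sits in the matrix rather than in the deeper history.)

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows = current state, columns = next state).
# Each row sums to 1; the numbers are made up purely for illustration.
P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
])

rng = np.random.default_rng(42)

def simulate(P, start, n_steps):
    """Walk the chain: each step depends only on the current state (Markov property)."""
    state = start
    path = [state]
    for _ in range(n_steps):
        state = rng.choice(len(P), p=P[state])  # sample from the row of the current state only
        path.append(int(state))
    return path

print(simulate(P, start=0, n_steps=20))
```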

 

Vladimir Perervenko's normalisation methods are just strange - log2(x + 1) is still understandable,

but the appearance of such a beast as sin(2*pi*x) to get rid of asymmetry - it is not quite clear what it does: logically it adds some cyclic component, and the question is, why such a component? or does it remove one? (remove the cycles and we are left with noise)...

and tanh(x) altogether looks like an imitation of neural-network processing to compress the series... or just another simple warping of the series? - it is unlikely to get rid of cyclicity, and it is not clear which cycle anyway...

Anyway, of course, it is clear that timeseries = trend + cycle + noise...

... but maybe he is trying to get rid of cyclicity with such transformations (and it is not known in what sense sin(2*pi*x) is a universal way to do that??)... at first I somehow thought it was a kind of attempt to put a d/df element into the series - to remove cyclicity (by incorporating that length, the wavelength, into the features themselves?), to achieve a normal distribution, i.e. to put speed and acceleration into the feature set... ?? but still the manipulation with sin looks like an unjustified distortion of the series, with the amplitude scaled according to the feature value - I have not met such a thing in statistical processing.... why not cos? why not tanh? -- just different ways of warping? What for?

?? maybe the author can explain the essence of this particular trigonometry (the purpose of removing a skewed distribution via log is already clear) - but what are the justifications/assumptions for using sin?? why not cos? and why this curvature at all? (will it accelerate the change of the feature? - if anything it sometimes just smooths it out)
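(A minimal sketch of what these transforms actually do to a skewed series; the lognormal data and the skewness comparison are my own illustration, not taken from the articles in question.)

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.7, size=5000)   # a right-skewed series as a stand-in

transforms = {
    "raw":          x,
    "log2(x + 1)":  np.log2(x + 1),                 # monotone, pulls in the right tail
    "sin(2*pi*x)":  np.sin(2 * np.pi * x),          # bounded, oscillating warp of the values
    "tanh(x)":      np.tanh(x),                     # monotone squashing into (-1, 1)
}

for name, t in transforms.items():
    print(f"{name:14s} skew = {skew(t):+.3f}")
```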


Renat Akhtyamov #:

you were given vectors, wrote a paper, and treated like a trinket....

could someone write how seriously such transformations should/can be taken, and why? (apart from the desire to get rid of asymmetry with log - and I think ln is the most common).

 
JeeyCi #:

Vladimir Perervenko's normalisation methods are just strange - log2(x + 1) is still understandable,

but the appearance of such a beast as sin(2*pi*x) to get rid of asymmetry - it is not quite clear what it does: logically it adds some cyclic component, and the question is, why such a component? or does it remove one? (remove the cycles and we are left with noise)...

and tanh(x) altogether looks like an imitation of neural-network processing to compress the series... or just another simple warping of the series? - it is unlikely to get rid of cyclicity, and it is not clear which cycle anyway...

Anyway, of course, it is clear that timeseries = trend + cycle + noise...

... but maybe he is trying to get rid of cyclicity with such transformations (and it is not known in what sense sin(2*pi*x) is a universal way to do that??)... at first I somehow thought it was a kind of attempt to put a d/df element into the series - to remove cyclicity (by incorporating that length, the wavelength, into the features themselves?), to achieve a normal distribution, i.e. to put speed and acceleration into the feature set... ?? but still the manipulation with sin looks like an unjustified distortion of the series, with the amplitude scaled according to the feature value - I have not met such a thing in statistical processing.... why not cos? why not tanh? -- just different ways of warping? What for?

?? maybe the author can explain the essence of this particular trigonometry (the purpose of removing a skewed distribution via log is already clear) - but what are the justifications/assumptions for using sin?? why not cos? and why this curvature at all? (will it accelerate the change of the feature? - if anything it sometimes just smooths it out)


could someone write how seriously such transformations should/can be taken, and why? (apart from the desire to get rid of asymmetry with log - and I think ln is the most common).

I have never understood such transformations either; most likely it is just picking the transformation that works best out of the available ones. And there is usually no logic in the choice - it is based on tests.

The shapes of UHF filters and antennas were originally not calculated at all. And even later, the calculation was finished off in real life with a hand file)))))

 
Valeriy Yastremskiy #:

I have never understood such transformations either; most likely it is just picking the transformation that works best out of the available ones. And there is usually no logic in the choice - it is based on tests.

The shapes of UHF filters and antennas were originally not calculated at all. And even later, the calculation was finished off in real life with a hand file)))))

You can simply compare the histograms of the sample before and after the transformation. If the resulting one is closer to the target form (a normal or uniform distribution, for example), then the transformation is quite suitable). Instead of drawing histograms, you can use tests of conformity to the target (for normality or uniformity, respectively).
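(A rough sketch of that check; the log transform and the D'Agostino-Pearson normality test here are arbitrary choices for illustration.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(sigma=0.8, size=2000)      # skewed sample as a stand-in for a feature

y = np.log(x + 1)                            # candidate transformation (any other could be tried)

for name, s in (("before", x), ("after", y)):
    stat, p = stats.normaltest(s)            # D'Agostino-Pearson test of normality
    print(f"{name:6s} skew={stats.skew(s):+.2f}  normaltest p-value={p:.3g}")
# The larger the p-value (and the closer skew is to 0), the closer the sample is to normal.
```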

Aren't dishes made parabolic in shape? Exactly according to the formula)

 
JeeyCi #:

Vladimir Perervenko's normalisation methods are just strange - log2(x + 1) is still understandable,

but the appearance of such a beast as sin(2*pi*x) to get rid of asymmetry - it is not quite clear what it does: logically it adds some cyclic component, and the question is, why such a component? or does it remove one? (remove the cycles and we are left with noise)...

and tanh(x) altogether looks like an imitation of neural-network processing to compress the series... or just another simple warping of the series? - it is unlikely to get rid of cyclicity, and it is not clear which cycle anyway...

Anyway, of course, it is clear that timeseries = trend + cycle + noise...

... but maybe he is trying to get rid of cyclicity with such transformations (and it is not known in what sense sin(2*pi*x) is a universal way to do that??)... at first I somehow thought it was a kind of attempt to put a d/df element into the series - to remove cyclicity (by incorporating that length, the wavelength, into the features themselves?), to achieve a normal distribution, i.e. to put speed and acceleration into the feature set... ?? but still the manipulation with sin looks like an unjustified distortion of the series, with the amplitude scaled according to the feature value - I have not met such a thing in statistical processing.... why not cos? why not tanh? -- just different ways of warping? What for?

?? maybe the author can explain the essence of this particular trigonometry (the purpose of removing a skewed distribution via log is already clear) - but what are the justifications/assumptions for using sin?? why not cos? and why this curvature at all? (will it accelerate the change of the feature? - if anything it sometimes just smooths it out)


could someone write how seriously such transformations should/can be taken, and why? (apart from the desire to get rid of asymmetry with log - and I think it is usually ln after all).

As long as we stay at the level of reasoning about trigonometric functions or anything else at this level, there is no justification, for one simple reason - a justification is impossible to give, because the purpose of such transformations is NOT declared and the criterion for achieving that purpose is unknown.


And the goal in ML is the only one - to reduce the fitting error, or rather the prediction error of the machine learning model. And under the restriction that the prediction error must NOT change much in the future.


The main obstacle to achieving this goal is the non-stationarity of financial series.

The quoted formula timeseries = trend + cycle + noise is not quite accurate. It is made more precise and very well worked out in GARCH-type models, of which there are more than a hundred, but none of them solves the problem in its final form, i.e. how to model non-stationarity as accurately as possible.
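(For reference, a minimal sketch of fitting one such model with the Python arch package on synthetic returns; this is only an illustration of the model class, not anything from the articles discussed.)

```python
import numpy as np
from arch import arch_model   # pip install arch

rng = np.random.default_rng(7)
returns = rng.standard_t(df=5, size=1500)   # synthetic fat-tailed returns, percent scale

# GARCH(1,1): conditional variance h_t = omega + alpha*e_{t-1}^2 + beta*h_{t-1}
am = arch_model(returns, vol="GARCH", p=1, q=1, mean="Constant", dist="t")
res = am.fit(disp="off")
print(res.summary())

# One-step-ahead forecast of the conditional variance
print(res.forecast(horizon=1).variance.iloc[-1])
```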

If a goal and a criterion for achieving it are specified, then the particular methods for dealing with non-stationarity do not matter at all - it is the result that matters. In any case it is clear that the closer to stationary the initial non-stationary series can be transformed, the smaller the prediction error of the ML model will be, and, most importantly, the smaller the fluctuations of that error.
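(A rough sketch of that criterion with sklearn on made-up data: evaluate walk-forward and look not only at the mean error but at how much it fluctuates from window to window.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
X = rng.normal(size=(2000, 5))                                   # placeholder features
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)      # placeholder labels

errors = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=8).split(X):  # walk-forward splits
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    errors.append(1 - accuracy_score(y[test_idx], model.predict(X[test_idx])))

errors = np.array(errors)
print(f"mean error {errors.mean():.3f}, std across windows {errors.std():.3f}")
# Low mean AND low std across windows is what "the error must not change much" asks for.
```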

Vladimir Perervenko understands this perfectly well, but his articles are more educational than practical - he shows the problems and provides tools for solving them, quite completely and systematically, without visible gaps. And the selection of problems and of tools for solving them is all subordinated to the goal: to reduce the prediction error.

Vladimir Perervenko - Trader's profile - www.mql5.com
 

Aleksey Vyazmikin asked the author that very question in the comments there - got a link to a discussion thread - and the link is DEAD! Has the author Vladimir Perervenko gone into hiding? )

 

The idea of a local decision tree came to mind. It is something like an analogue of KNN or local regression (also potentially suitable for non-stationarity). The idea is that we keep splitting into boxes only the box containing the point of interest (until at least a given number K of points remain in it), and do not care about the rest of the boxes. It may be better than KNN or local regression if the boundaries between classes are sharp and the point lies close to such a boundary (see the sketch below).

I wonder if the approach makes sense at all.
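(A rough sketch of the idea under my own assumptions: split only the box containing the query point, using a naive median split on the most spread-out feature, until at most K points remain, then take a majority vote. The split rule is a placeholder, not an optimised criterion.)

```python
import numpy as np

def local_tree_predict(X, y, x0, k=20):
    """Refine only the box that contains the query point x0, then vote inside it."""
    idx = np.arange(len(X))
    while len(idx) > k:
        spread = X[idx].max(axis=0) - X[idx].min(axis=0)
        j = int(np.argmax(spread))                    # feature with the widest range in the box
        thr = np.median(X[idx, j])
        side = X[idx, j] <= thr if x0[j] <= thr else X[idx, j] > thr
        if side.sum() == 0 or side.sum() == len(idx):
            break                                     # degenerate split, stop refining the box
        idx = idx[side]                               # keep only the half that contains x0
    vals, counts = np.unique(y[idx], return_counts=True)
    return vals[np.argmax(counts)]                    # majority vote of labels left in the box

# toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)
print(local_tree_predict(X, y, x0=np.array([0.1, 0.0, 0.0]), k=25))
```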

 
СанСаныч Фоменко #:

As long as we stay at the level of reasoning about trigonometric functions or anything else at this level, there is no justification, for one simple reason - a justification is impossible to give, because the purpose of such transformations is NOT declared and the criterion for achieving that purpose is unknown.

And the goal in ML is the only one - to reduce the fitting error, or rather the prediction error of the machine learning model. And with the restriction: the prediction error must NOT change much in the future.

and the goal is always the same - logic and adequacy instead of a stupid greedy-algorithm search through all the rubbish and howling about the lack of computing power for it....

yes, estimates should be (consistent,) stable - you call it "the error should not change"; the prediction itself will of course change over the time series (in dynamics)...

you can't get any further than your advertising remarks about tools -- without knowing how these tools work... you have been given a sledgehammer and you are waving it around (are you Chapayev???) with reference to your IV=0.02 threshold - THAT IS A WEAK(!) relationship - so why are you waving your slogans here... and calling proposals for adequate analysis "mashkas" (moving averages - where there never were any)... open your own advertising thread.

and ML, yes - it works the same way everywhere and for the SAME PURPOSE - and in Py and other libraries it is not IV at all, but the essence does not change; you, apparently not understanding the essence of data analysis, can only shout slogans about candidates and tools and stupidly load rubbish into your "black box" - and you did not even bother to use your predictions for their intended purpose....

well, open a thread for your own promotions and shout there -- if you can't do anything but churn out analysis (without even normal conclusions) -- you look like a fucking scrap collector trying to grab other people's ideas for your scrap heap (apart from the word "tool" you haven't even understood how it works) -- and what exactly did LogisticRegression fail to do for you?

=== you don't have to answer! (your personal Information Value = 0 for me)... your interpretations of linear algebra are even lower in IV
