Machine learning in trading: theory, models, practice and algo-trading - page 2526

 
Aleksey Nikolayev #:

For example, if 1<=t1<=t2<n, then ACF(t1,t2)=sqrt(t1/t2).

I have another question. Here we compute the ACF of neighboring values in a sample of infinite size. For example, t1=1, t2=2: we get ACF = sqrt(0.5) ≈ 0.707. Now take other neighboring values, e.g. t1=10000, t2=10001: we get ACF ≈ 1 (almost exactly). It turns out that neighboring values are correlated with each other differently. Is that normal?
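As a quick check of these numbers, here is a minimal Monte Carlo sketch (my own illustration, not from the thread): it estimates the random walk's ACF across many simulated paths; the path count and length are arbitrary, and t1=100, t2=101 stands in for the "large t" case:

```python
import numpy as np

rng = np.random.default_rng(0)

# 100,000 independent random-walk paths of length 101 (N(0,1) increments);
# paths[:, k] holds X_{k+1} for each path
steps = rng.standard_normal((100_000, 101))
paths = np.cumsum(steps, axis=1)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# theory: corr(X_t1, X_t2) = sqrt(t1/t2) for 1 <= t1 <= t2
r_1_2 = corr(paths[:, 0], paths[:, 1])        # X_1 vs X_2, theory sqrt(1/2) ≈ 0.707
r_100_101 = corr(paths[:, 99], paths[:, 100])  # X_100 vs X_101, theory sqrt(100/101) ≈ 0.995

print(r_1_2, r_100_101)
```

So neighboring values far from the origin are indeed almost perfectly correlated, while the first two values are not.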

 
LenaTrap #:

To be honest, I can't understand anything at all.

p.s maybe some super smart mathematician will take pity on me and explain what's going on here?

no need for a "super smart mathematician" in this business.....

DL has 3 layers - the internal (hidden) one handles moment t from the external layers' t-1 and t+1 respectively... hence autocorrelation is possible... imho... that's how I see it

although it seems to me, for some reason, that if you take not the delta (change) of a feature over time but construct some index instead - then, perhaps, the effect of autocorrelation of these overlapping values in time can somehow be levelled out... it's debatable... because close(t)/close(t-1) also overlap and hence autocorrelate... Although at TF>15min the autocorrelation seems to disappear (is not observed) - I haven't checked it myself... and this is not yet the index I need...

It's pointless to hope for autocorrelation when modeling price moves on adequate TFs... And it makes no sense to keep re-modeling after every tick (as in deriving regularities, especially long-term ones)... This is also imho (but, more likely, probabilistically fair)...

BUT recurrent neural networks only pass the info forward (with the advent of Boltzmann machines they began to be used in multilayer probabilistic learning)... although it has already been said here:

Recurrent networks and Bayesian methods, by themselves, have demonstrated neither the ability to pull "memory" out of financial time series nor the ability to draw conclusions about the most robust model on new data.

that is why real problems use recurrent networks with error backpropagation and minimization of dy/dx (it is precisely this ability to minimize dy/dx that lets them do the fitting)

p.s.

in general, as for me, it's still the Monte Carlo method - only done by machine... I don't see anything new in finding the forward pass by using backpropagation... purely terminologically...

p.p.s

except that with Theano you can try something without loading the PC's resources too much (although TensorFlow has been touted)...

What is Y and what is X is up to the developer (either a priori or as a result of statistical analysis)... if you're good with Python - in sklearn some methods even come with 2-in-1 features and examples! - and the feature importances themselves take a couple of lines too (just as you found corrcoef in a couple of lines)
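For the record, the "couple of lines" claim holds up: here is a hedged sketch using sklearn's `feature_importances_` on made-up synthetic data (the three features and the target are purely illustrative - only feature 0 actually drives y):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# synthetic data: three candidate features, only the first one matters
X = rng.standard_normal((500, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(500)

# fitting plus importances really is a couple of lines
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.feature_importances_)  # importance of feature 0 should dominate
```

Which columns play the role of X and Y is, as the post says, the developer's choice; the mechanics are trivial.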

 
LenaTrap #:

In the real market? Personally, I hold to some such philosophy:

*but I don't really want to discuss it, because without evidence it's useless to discuss assumptions

! In real trading - don't twist the meaning...

yes, philosophy, indeed, everyone has their own... the purpose of statistics is to explain variance

and to formalize dependencies derived from independent trials

 
Doctor #:

I have another question. Consider the ACF of neighboring values in a sample of infinite size. For example, t1=1, t2=2: we get ACF = sqrt(0.5) ≈ 0.707. Now take other neighboring values, e.g. t1=10000, t2=10001: we get ACF ≈ 1 (almost exactly). It turns out that neighboring values are correlated with each other differently. Is that normal?

That's right, it is. This is the second reason to speak of the non-stationarity of an SB (random walk); the first is the growth of its variance with time. It is only in stationary processes (by their very definition) that the ACF depends only on the time difference: ACF(t1,t2)=ACF(t1-t2). That is why for stationary series the ACF is usually written as a function of the single argument t1-t2.
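The first reason (variance growing with time) is also easy to see numerically; here is a small sketch of my own (path count and time points chosen arbitrarily) contrasting the walk with its stationary increments:

```python
import numpy as np

rng = np.random.default_rng(2)

# 50,000 random-walk paths of length 200 with N(0,1) increments
steps = rng.standard_normal((50_000, 200))
walk = np.cumsum(steps, axis=1)

# random walk: Var(X_t) = t, so the variance grows with time
v10 = walk[:, 9].var()    # expect ≈ 10
v100 = walk[:, 99].var()  # expect ≈ 100

# the increments are stationary: variance is constant in time
d10 = steps[:, 9].var()   # expect ≈ 1
d100 = steps[:, 99].var() # expect ≈ 1

print(v10, v100, d10, d100)
```

The walk's cross-section variance tracks t, while the increments' variance stays flat - exactly the stationarity distinction the post describes.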

 
Doctor #:

The question, of course, should be addressed to Aleksey. But I would answer "whatever". The point, I assume, is that the SB travels a path proportional to sqrt(t).

It was referring to the famous "gambler's ruin" problem. It could be used, for example, to test the statistical significance of the effect of prices "aspiring" to certain levels.
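For reference, in the gambler's ruin setup a symmetric ±1 walk started at 0 with absorbing barriers at +a and -b reaches +a first with probability b/(a+b). A small simulation of my own (barrier values chosen arbitrarily) reproduces this:

```python
import numpy as np

rng = np.random.default_rng(3)

def hit_upper_first(a, b, n_trials=20_000, max_steps=100_000):
    """Fraction of symmetric ±1 walks from 0 that reach +a before -b."""
    wins = 0
    for _ in range(n_trials):
        x = 0
        for _ in range(max_steps):
            x += 1 if rng.random() < 0.5 else -1
            if x == a:       # upper barrier hit first
                wins += 1
                break
            if x == -b:      # lower barrier hit first
                break
    return wins / n_trials

p = hit_upper_first(a=3, b=7)
print(p)  # theory: b/(a+b) = 7/10 = 0.7
```

This kind of barrier-hitting probability is exactly the baseline one would compare against when testing whether prices "aspire" to a level more often than a random walk would.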

 
Aleksey Nikolayev #:

The well-known "gambler's ruin" problem was meant. It can be used, for example, to check the statistical significance of the effect of prices "aspiring" to some levels.

This is already much more interesting.

Maybe we should give up the idea that the market is a time series, and finally make a breakthrough in market analysis.

 
Aleksey Nikolayev #:

That's right, it is. This is the second reason to speak of the non-stationarity of an SB (random walk); the first is the growth of its variance with time. It is only for stationary processes (by their very definition) that the ACF depends only on the time difference: ACF(t1,t2)=ACF(t1-t2). That is why for stationary series the ACF is usually written as a function of the single argument t1-t2.

Okay. Let me put the question another way. Are the two situations described below different from each other?

1) We have a sample of infinite size. Consider two time moments n and (n-t), with 1 <= (n-t) <= n. Calculate ACF((n-t),n) = sqrt((n-t)/n).

2) We have a sample of length n. Calculate the ACF at lag t: ACF(t) = sqrt((n-t)/n).

 
JeeyCi #:

Although it seems to me, for some reason, that if we take not the delta (change) of a feature over time but some index instead - then, perhaps, the effect of autocorrelation of these overlapping values in time can somehow be levelled out... it's debatable... because close(t)/close(t-1) also overlap and hence autocorrelate... Although at TF>15min the autocorrelation seems to disappear (is not observed) - I haven't checked it myself... and it's not the index I need yet...


You probably really don't need it, but with any trend the data of a time series start to show autocorrelation, sometimes very high, which in theory should interfere with many analysis models / neural networks.

It is difficult to use this effect for forecasting, because nothing lasts forever: trend gives way to range, chaos to order; a randomly wandering time series may suddenly stop wandering for a long while, and vice versa. From such an estimate you cannot understand the structure of the process - it is too simple, like trading above the 200SMA.

But perhaps it is still worth checking how your neural network reacts to autocorrelations, and trying to remove them if they are there and interfere; processing and proper data preparation are the most important part of working with such systems. Neighboring elements shouldn't overlap at all (if that's what I think it is?); if you are using data like that, it would be a big miracle if the model worked.
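The "trend induces autocorrelation, so check and remove it" point is easy to demonstrate; here is a sketch of my own (slope, length and noise level are arbitrary) comparing the lag-1 autocorrelation of a trending series with that of its differences:

```python
import numpy as np

rng = np.random.default_rng(4)

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return (x[:-1] * x[1:]).sum() / (x * x).sum()

# a trending series: deterministic drift plus i.i.d. noise
t = np.arange(2000)
trend_series = 0.05 * t + rng.standard_normal(2000)

r_levels = lag1_autocorr(trend_series)          # near 1: the trend dominates
r_diff = lag1_autocorr(np.diff(trend_series))   # differencing removes the trend;
# for differenced i.i.d. noise the lag-1 autocorrelation is about -0.5
# (an MA(1) artifact of over-differencing), not near 1
print(r_levels, r_diff)
```

So a simple transform (here, differencing) takes the series from almost perfectly autocorrelated levels to autocorrelation-free-of-trend inputs - which is the kind of data-preparation check the post recommends before feeding a model.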

 
LenaTrap #:

You really probably don't need it, but with any trend the time series data

don't twist it: in trying to argue with me, you are still talking about your own thing... just about the time series... (and nobody has abolished discretization methods)...

in the long term, price does not depend on the time series; I've made my point more than once (and I won't duplicate it)... I showed you where autocorrelation can appear in DL... I also told you that what you use for X and Y, and which dependencies you model - I've already written it for the 10th time - is up to the developer to decide...

I am not the developer of your model - I don't need to prove how price behaves in time... (maybe I shouldn't have scribbled about DL at all - everyone here thinks about his own things anyway, refuting or proving something to someone -- taking one word out of every discipline)... Engineers who actually do ML (of which there are none here) will still understand the narrowness of the autocorrelation debate (kept up for the sake of arcane speeches), even in trends, even in ticks, if the model is built in a much broader aspect and on a broader horizon of the training set than the horizon where your fleas (autocorrelation) can show up... that's what Deep Learning is for (to account for everything)

... for me the issue of trading is not an issue:

Aleksey Nikolayev #:

the famous "gambler's ruin" problem.

... that's why I've long been avoiding this nonsense... it has become clear that those who have no idea about modeling are here, and those who do don't waste time on this thread... ok, there is far more useful information on DL on the net than all the blatant jargon that has already poured out here to no avail...

About the basics of statistics you really should talk to academic mathematicians; it's not me who needs to answer you... - I'm not interested in your belief that autocorrelation rules anything in DL... - for the 5th time I've written "it's a bad model" (I don't want to write it a 10th)... let your academics answer you (if my answer made you want to prove something)

 
Doctor #:

Well, all right. Let me put the question another way. Are the two situations described below different from each other?

1) We have a sample of infinite size. Consider two time moments n and (n-t), with 1 <= (n-t) <= n. Calculate ACF((n-t),n) = sqrt((n-t)/n).

2) We have a sample of length n. Calculate the sample ACF at lag t: ACF(t) = sqrt((n-t)/n).

The difference is that in the first case the ACF is considered for all possible pairs of time moments, while in the second one of the time moments is fixed at t2=n, and many pairs of time moments (for example, the pair t1=1, t2=2) fall out of consideration. In the general case the ACF is a function of two arguments. Only for stationary processes can the ACF be treated as a function of the single argument t = t1-t2 (the lag).

The sample ACF is always computed from a particular numerical sample (a realization) of a process and always turns out to be a function of one argument (the lag). This is the main reason why the sample ACF computed on a realization of an SB is not an estimate of its ACF.
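To make that last point concrete, here is a sketch of my own (series length and lag chosen arbitrarily): the usual one-argument sample ACF computed on single random-walk realizations varies from realization to realization and is not pinned to the formula sqrt((n-t)/n):

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_acf(x, lag):
    """Usual sample ACF: one realization, one argument (the lag)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

n = 5000
walk = np.cumsum(rng.standard_normal(n))   # one realization of an SB
walk2 = np.cumsum(rng.standard_normal(n))  # another, independent realization

r1 = sample_acf(walk, 1)         # very close to 1: the walk is smooth at lag 1
r100_a = sample_acf(walk, 100)   # lag-100 value on realization 1
r100_b = sample_acf(walk2, 100)  # lag-100 value on realization 2: generally different

print(r1, r100_a, r100_b, np.sqrt((n - 100) / n))
```

Two independent realizations typically give noticeably different lag-100 values, illustrating that a single-realization, single-argument statistic cannot recover the two-argument ACF of a non-stationary process.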
