Machine learning in trading: theory, models, practice and algo-trading - page 2520

 
Valeriy Yastremskiy #:

The time limit will reduce the probability.

Yes, but it's still possible to find patterns within 3-5 days with 100% probability of triggering (I checked).

Aleksey Nikolayev #:

Well, yes, half of the problem is solved, there's another half - where to put a stop loss.

This is not a trading system (TS), just an idea.

 
Aleksey Nikolayev #:

You already know how to calculate the ACF, don't you? Unlike smartlab, you can't ban me for this question here)

Hello Aleksey, may I answer your question even though it wasn't addressed to me? I have seen you ask it many times and couldn't resist, because the solution seems very simple to me.

I created a normalized series of random values (1 or -1).

Then I built a classic stock chart from it, where each point is the sum of all values up to that point.


For the normalized series, the autocorrelation tends to zero.

For the stock-chart series, the autocorrelation tends to unity.


But only if the series is long enough. With a series of 100,000 numbers I got:

0.0010599888334729966 (normalized data)

0.9999708433220806 (non-normalized)

For a series of 100 numbers (same order):

0.018773466833541926

0.9367627243658354

For a series of 10 numbers:

-0.4999999999999999999999 (this value changes at random with each new series)

-0.14285714285714285 (this value changes at random with each new series)


These are only special cases, but as you can see, for a short series the measured autocorrelation can vary at random within very wide limits.

That said, this autocorrelation is not a property of the data-generating process (which has no autocorrelation), and that makes it difficult to measure and evaluate the process in this case.

My Python code is attached below in case anyone wants to check the calculations.

import random

import numpy as np


def autocorr(x, t=1):
    """Sample (Pearson) autocorrelation of the series x at lag t."""
    x = np.asarray(x)
    return np.corrcoef(x[:-t], x[t:])[0, 1]


n = 100000  # series length

# Increments of the random walk (SB): +1 or -1 with equal probability
SB_numbers = [random.choice((-1, 1)) for _ in range(n)]

# The "stock chart": running sum of all increments up to each point
SB_time_series = np.cumsum(SB_numbers)

print('numbers autocorr:', autocorr(SB_numbers, 1))
print('time_series autocorr:', autocorr(SB_time_series, 1))
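
The dependence on series length described in this post can be checked by repeating the experiment many times for each length and looking at how widely the lag-1 estimate scatters. The following is only a sketch: the function names, the number of trials and the seed are illustrative assumptions, not part of the original code.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample (Pearson) autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    # A very short series can be constant, which makes the estimate undefined (nan)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.corrcoef(x[:-1], x[1:])[0, 1]


def estimate_spread(n, trials=500, seed=0):
    """Mean and standard deviation of the lag-1 estimate over many +1/-1 series of length n."""
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [lag1_autocorr(rng.choice([-1, 1], size=n)) for _ in range(trials)]
    )
    # nan-aware statistics skip the rare undefined estimates
    return np.nanmean(estimates), np.nanstd(estimates)


for n in (10, 100, 10_000):
    mean, std = estimate_spread(n)
    print(f"n={n:>6}: mean={mean:+.4f}, std={std:.4f}")
```

The scatter of the estimate shrinks as the series gets longer, which is why a single series of 10 values says very little about the generating process.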
 
LenaTrap #:

Hello Aleksey, may I answer your question even though it wasn't addressed to me? [...]

You are calculating the sample ACF. The question is about the ACF itself. Not so long ago in this thread, Valeriy Yastremskiy posted links to econometrics manuals that give the ACF formulas for white noise and for the stationary AR(1) process. If I'm not mistaken, this function was denoted there by the Greek letter gamma. The question is what the formula would be for the random walk (SB).
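
One way to see the difference between the two notions is to estimate the correlation between the walk's values at two fixed times j and k across many independent realizations, instead of along a single path. The sketch below assumes a walk with independent +1/-1 steps; the function name, path count and seed are illustrative. The reference value printed next to each estimate is the known result for a walk with uncorrelated, zero-mean, equal-variance steps.

```python
import numpy as np


def ensemble_corr(j, k, n_paths=5000, seed=0):
    """Estimate Corr(S_j, S_k) across independent random-walk realizations."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(n_paths, max(j, k)))
    walks = np.cumsum(steps, axis=1)  # walks[:, i - 1] holds S_i for every path
    return np.corrcoef(walks[:, j - 1], walks[:, k - 1])[0, 1]


for j, k in [(10, 10), (10, 40), (10, 250)]:
    reference = np.sqrt(min(j, k) / max(j, k))
    print(f"j={j}, k={k}: estimate={ensemble_corr(j, k):.3f}, reference={reference:.3f}")
```

Here the correlation depends on both indices rather than only on the lag k - j, which is not something a single lag-1 coefficient computed along one path can show.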

 
Why do we need formulas if we trade by sample?
 
secret #:
Why do we need formulas if we trade by sample?

We trade on prices. The assumption that prices are a sample is abstraction and theorizing.

 
Aleksey Nikolayev #:

You are calculating the sample ACF. The question is about the ACF itself. Not so long ago in this thread, Valeriy Yastremskiy posted links to econometrics manuals that give the ACF formulas for white noise and for the stationary AR(1) process. If I'm not mistaken, this function was denoted there by the Greek letter gamma. The question is what the formula would be for the random walk (SB).

I compute Pearson's correlation coefficient, which seems to be the standard way to assess whether autocorrelation is present. Unfortunately, I'm not quite sure what exactly you mean: by the very short term "ACF", do you mean the autocorrelation function? And in what way does the Pearson coefficient not suit you? In my opinion, the estimate was done correctly.

[1, 1, 1, -1, -1, -1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1]
[1, 2, 3, 2, 1, 0, 1, 0, 1, 0, 1, 2, 1, 0, 1, 0, 1, 2, 3]
-----------
[19 -2  1 -4 -1 -4  3 -4  5  0  3 -2 -1 -4 -1  0  3  2  1]
[42 28 19 12 12 10 15 14 14 12 13  8  8  6 11 14 14  8  3]

Is that what you would like to get?


Actually, this is not a sample. It is a data series generated by a process, so it is complete and not truncated: if the process ran for 10 ticks, we get a dataset of 10 elements generated entirely by this process from beginning to end.
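
The post does not say how the two arrays below the dashed line were produced. They look like raw (unnormalized) autocorrelation sums at lags 0 to 18, and the sketch below reproduces arrays of that kind with np.correlate. The use of np.correlate and the helper name raw_autocorr are assumptions, not the poster's confirmed method.

```python
import numpy as np

# The increments posted above
steps = np.array([1, 1, 1, -1, -1, -1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1])
walk = np.cumsum(steps)  # the corresponding "stock chart"


def raw_autocorr(x):
    """Unnormalized sums of x[i] * x[i + lag] for lag = 0 .. len(x) - 1."""
    full = np.correlate(x, x, mode="full")  # lags -(n - 1) .. (n - 1)
    return full[len(x) - 1:]                # keep the non-negative lags


print(walk)
print(raw_autocorr(steps))
print(raw_autocorr(walk))
```

The lag-0 entry is simply the sum of squares of the series, so these numbers are not correlation coefficients until they are centered and normalized.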
 
LenaTrap #:

I compute Pearson's correlation coefficient, which seems to be the standard way to assess whether autocorrelation is present. Unfortunately, I'm not quite sure what exactly you mean: by the very short term "ACF", do you mean the autocorrelation function? And in what way does the Pearson coefficient not suit you? In my opinion, the estimate was done correctly.

Is that what you would like to get?

You are trying to replace the ACF with its sample estimate. Start by defining the ACF, not by approximating it from an available realization (sample).

Example: let X_i be white noise. Then its ACF, Cov(X_j, X_k) / sqrt(Cov(X_j, X_j) * Cov(X_k, X_k)), is a function of the two indices j and k that equals one if j == k and zero if j != k.
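
For reference, here is the same definition written out, followed by the random walk case asked about earlier in the thread. This is a sketch under a stated assumption, since the thread does not fix a model: the walk is S_n = X_1 + ... + X_n with uncorrelated, zero-mean steps of equal variance \sigma^2.

```latex
% Definition (as in the white-noise example above)
\rho_X(j,k) = \frac{\operatorname{Cov}(X_j, X_k)}
                   {\sqrt{\operatorname{Cov}(X_j, X_j)\,\operatorname{Cov}(X_k, X_k)}}
            = \begin{cases} 1, & j = k \\ 0, & j \neq k \end{cases}

% Random walk S_n = X_1 + \dots + X_n built from the same steps
\operatorname{Cov}(S_j, S_k) = \sigma^2 \min(j,k), \qquad
\operatorname{Cov}(S_j, S_j) = \sigma^2 j

\rho_S(j,k) = \frac{\sigma^2 \min(j,k)}{\sqrt{\sigma^2 j \cdot \sigma^2 k}}
            = \frac{\min(j,k)}{\sqrt{jk}}
            = \sqrt{\frac{\min(j,k)}{\max(j,k)}}
```

Unlike the white-noise case, this depends on j and k separately and not only on their difference, so the random walk has no single stationary ACF of the lag.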
 
Aleksey Nikolayev #:

We trade on prices. The assumption that prices are a sample is an abstraction and theorizing.

Theorizing is trading by formulas.)
 
Aleksey Nikolayev #:

You are trying to replace the ACF with its sample estimate. Start with the definition of ACF, not with how to approximate it from an existing realization (sample).

Let me explain my conclusions again:

To estimate the ACF of a random walk process in general, you need to:

- Take as large a sample as possible (100,000,000 in my case)

- Use normalized data

Conclusion: Pearson's coefficient is zero, and everything else is the error of estimating the process from a sample.

That is, the random walk process has no autocorrelation.

It is equal to 0 (0.0010599888334729966), where 0 is the real autocorrelation and 0.00105 is the error.

 
secret #:
Theorizing is formula trading)

The multiplication table is also a formula. So your statement should be read as follows: trading by formulas you are familiar with is practical, and trading by unfamiliar ones is theorizing)
