Discussing the article: "Hilbert-Schmidt Independence Criterion (HSIC)" - page 2

 
Maxim Dmitrievsky #:


I've also noticed that it's often faster to fit quick ML models to detect dependence than to compute these various criteria, which are usually slower. Although it should be the other way round :)
It's computing the significance that takes the time; the statistic itself is quick to calculate.
[Deleted]  
Evgeniy Chernish #:
It's computing the significance that takes the time; the statistic itself is quick to calculate.
Right, I forgot about that.
 
Maxim Dmitrievsky #:
take a longer time lag.

There the intervals (X1, X2, Y) do not overlap.

 
Evgeniy Chernish #:
HSIC cannot be used on non-stationary series; you have to take price increments rather than prices. Pearson correlation indicates "dependence" for the same reason.

The computational complexity of HSIC (with significance checks) is many orders of magnitude higher than Pearson's, so I expected a different result.

If the increments are independent but their sums suddenly turn out to be "dependent", that is a strange result for such a resource-intensive criterion, even in theory.

[Deleted]  
fxsaber #:

There the intervals (X1, X2, Y) do not overlap.

The sample ACF of a random walk decays even more slowly, or not at all. Roughly speaking, these are meaningless calculations :)

 
Maxim Dmitrievsky #:

The sample ACF of a random walk decays even more slowly, or not at all.

I don't understand the application of this argument to the context of the discussion.

 

Assertion.

If, after transforming the series without loss of information (so that we can return to the initial state), we obtain independence, then the original series are independent.

[Deleted]  
fxsaber #:

I don't understand the application of this argument to the context of the discussion.

These methods do not show what is expected on non-stationary series. So we can take the ACF as a basis and use it as an example to explain how the correlation changes as a function of the step t. For a random walk, the autocorrelation depends on time. This is all written down; you can read about it on the internet.
Autocorrelation is the correlation of the random walk with a lagged copy of itself; it depends on the time lag.
This is the basics of time series analysis.
Read how the theoretical and sample ACF of a random walk vary with the lag.

The only difference of the method proposed in the article is that it works with non-linear dependencies.
 
fxsaber #:

Assertion.

If after transforming the series (without loss of information - we can return to the initial state) we obtain independence, then the original series are independent.

Three independent series.

if(SData == Nonlinear_dependence)
  {
   double x1[];
   MathRandomUniform(-5, 5, data_, x1);  // data_ independent uniform draws on [-5, 5]
   double x2[];
   MathRandomUniform(-5, 5, data_, x2);
   double y[];
   MathRandomUniform(-5, 5, data_, y);


we get this result.

Correlation coefficient (X1, Y) = 0.0283
Correlation coefficient (X2, Y) = -0.0097
----------------Nonlinear_dependence-------------
Execution time: 13.469 seconds
-----------------------------------
Number of observations: 1000
HSIC: 0.00028932
p-value: 0.5100
Critical value: 0.0005
Do not reject H0: the observations are independent


Now we transform them into cumulative sums (with no loss of information).

double sum1 = 0, sum2 = 0, sum = 0;

for(int i = 0; i < data_; i++)
  {
   x1[i] = (sum1 += x1[i]);  // replace each series with its running sum,
   x2[i] = (sum2 += x2[i]);  // turning the i.i.d. increments into random walks
   y[i]  = (sum  += y[i]);
  }


The result is "dependent".

Correlation coefficient (X1, Y) = 0.3930
Correlation coefficient (X2, Y) = 0.1924
----------------Nonlinear_dependence-------------
Execution time: 12.890 seconds
-----------------------------------
Number of observations: 1000
HSIC: 0.01020060
p-value: 0.0000
Critical value: 0.0009
Reject H0: the observations are dependent
[Deleted]  
fxsaber #:

Three independent series.


we get this result.


Now we transform them into cumulative sums (with no loss of information).


The result is "dependent".

The loss of information is huge: trend, seasonality, and cycles are removed. These are two different time series.