Discussing the article: "Hilbert-Schmidt Independence Criterion (HSIC)" - page 2

 
Maxim Dmitrievsky #:


I've also noticed that it's often faster to fit quick ML models to detect dependence than to compute these various criteria, which are usually slower. Although it should be the other way round :)
It's computing the significance that takes the time; the statistic itself is quick to calculate.
[Deleted]  
Evgeniy Chernish #:
It's computing the significance that takes the time; the statistic itself is quick to calculate.
Right, I forgot about that.
 
Maxim Dmitrievsky #:
take a longer time lag.

There the intervals (X1, X2, Y) do not overlap.

 
Evgeniy Chernish #:
HSIC cannot be used on non-stationary series; you have to take price increments rather than prices. Pearson correlation indicates "dependence" for the same reason.

The computational complexity of HSIC (with significance checks) is many orders of magnitude higher than Pearson's, so I expected a different result.

If the increments are independent but their sums suddenly turn out to be "dependent", that is a strange result for such a resource-intensive criterion, even in theory.

[Deleted]  
fxsaber #:

There the intervals (X1, X2, Y) do not overlap.

The sample ACF of a random walk decays even more slowly, or not at all. Roughly speaking, these are meaningless calculations :)

 
Maxim Dmitrievsky #:

The sample ACF of a random walk decays even more slowly, or not at all.

I don't understand the application of this argument to the context of the discussion.

 

Assertion.

If, after transforming the series without loss of information (so that we can return to the initial state), we obtain independence, then the original series are independent.

[Deleted]  
fxsaber #:

I don't understand the application of this argument to the context of the discussion.

These methods do not show what is expected on non-stationary series. So we can take the ACF as a basis and use it as an example to explain how the correlation changes as a function of the step t. For a random walk, the autocorrelation depends on time. This is all written down; you can read about it on the internet.
Autocorrelation is the correlation of the random walk with a lagged copy of itself; it depends on the time lag.
This is the basics of time series analysis.
Read how the theoretical and sample ACF of a random walk vary with the lag.

The only difference of the method proposed in the article is that it works with non-linear dependencies.
 
fxsaber #:

Assertion.

If after transforming the series (without loss of information - we can return to the initial state) we obtain independence, then the original series are independent.

Three independent series.

if(SData == Nonlinear_dependence)
  {
   double x1[];
   MathRandomUniform(-5, 5, data_, x1);  // data_ independent uniform draws on [-5, 5]
   double x2[];
   MathRandomUniform(-5, 5, data_, x2);
   double y[];
   MathRandomUniform(-5, 5, data_, y);


we get this result.

Correlation coefficient (X1, Y) = 0.0283
Correlation coefficient (X2, Y) = -0.0097
----------------Nonlinear_dependence-------------
Execution time: 13.469 seconds
-----------------------------------
Number of observations: 1000
HSIC: 0.00028932
p-value: 0.5100
Critical value: 0.0005
Do not reject H0: the observations are independent


Now we transform them into cumulative sums (with no loss of information).

double sum1 = 0, sum2 = 0, sum = 0;

for(int i = 0; i < data_; i++)
  {
   x1[i] = (sum1 += x1[i]);  // replace each series with its running sum,
   x2[i] = (sum2 += x2[i]);  // turning the i.i.d. increments into random walks
   y[i]  = (sum  += y[i]);
  }


The result is "dependent".

Correlation coefficient (X1, Y) = 0.3930
Correlation coefficient (X2, Y) = 0.1924
----------------Nonlinear_dependence-------------
Execution time: 12.890 seconds
-----------------------------------
Number of observations: 1000
HSIC: 0.01020060
p-value: 0.0000
Critical value: 0.0009
Reject H0: the observations are dependent
[Deleted]  
fxsaber #:

Three independent series.


we get this result.


Now we transform them into cumulative sums (with no loss of information).


The result is "dependent".

The loss of information is huge: trend, seasonality, and cycles are removed. These are two different time series.