Zero sample correlation does not necessarily mean there is no linear relationship - page 42

 

Yes... The magic word "correlation" misleads many people.

Equating correlation with probabilistic dependence is self-delusion. Look for a linear relationship.

 
C-4: What will logarithms do for you? Logarithms are only worth using when the start and end points of a series differ too much in volatility and level. That is, if you are analyzing the Dow Jones from 1900 to 2013, you cannot do without them, but in other cases they are not needed.

Again, this thread seems to have already talked about it.

Think about the definition of correlation - in simple words, it is the relationship between two sets. For sets from a linear space this relationship can be estimated via the scalar product of vectors (equivalent to Pearson's correlation coefficient), and it is logical, for example, that for orthogonal vectors such a correlation is zero. For sets that do not belong to a linear space, the relationship should be estimated differently. How? That depends on the characteristics of the space. Other correlation coefficients could serve as examples.

If the readings are on a relative scale, and for quotes they are (they show how many times one currency is "more valuable" than another), then it is incorrect to apply linear methods (the scalar product) "outright" to the raw data. Taking logarithms moves the readings from a relative scale to an interval scale, where the same correlation can be estimated using Pearson's correlation coefficient.
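A minimal sketch of the two points above, using only the Python standard library. The `factors` and `prices` values are made up for illustration and are not data from this thread. It shows Pearson's coefficient written as a scalar product of mean-centered vectors, and that on the log scale a ratio of quotes becomes a plain difference.

```python
import math

def pearson(x, y):
    """Pearson correlation as the cosine between mean-centered vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cx = [a - mx for a in x]
    cy = [b - my for b in y]
    dot = sum(a * b for a, b in zip(cx, cy))          # scalar product
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return dot / (norm_x * norm_y)

# Hypothetical quotes: each step multiplies the price by a factor (relative scale).
factors = [1.10, 0.95, 1.20, 0.90, 1.05]
prices = [100.0]
for f in factors:
    prices.append(prices[-1] * f)

# On the log scale the multiplicative steps become additive differences.
log_prices = [math.log(p) for p in prices]
log_steps = [b - a for a, b in zip(log_prices, log_prices[1:])]
ratio_logs = [math.log(f) for f in factors]
print(log_steps)
print(ratio_logs)  # same values up to rounding

# Orthogonal (already centered) vectors give zero correlation.
u = [1.0, -1.0, 1.0, -1.0]
v = [1.0, 1.0, -1.0, -1.0]
print(pearson(u, v))  # 0.0
```

This only illustrates the scale argument; it says nothing about whether the logarithm changes the estimate much on any particular pair of instruments.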

 
GaryKa:

Again, this thread seems to have already talked about it.

Think about the definition of correlation - in simple words, it is the relationship between two sets. For sets from a linear space this relationship can be estimated via the scalar product of vectors (equivalent to Pearson's correlation coefficient), and it is logical, for example, that for orthogonal vectors such a correlation is zero. For sets that do not belong to a linear space, the relationship should be estimated differently. How? That depends on the characteristics of the space. Other correlation coefficients could serve as examples.

If the readings are on a relative scale, and for quotes they are (they show how many times one currency is "more valuable" than another), then it is incorrect to apply linear methods (the scalar product) "outright" to the raw data. Taking logarithms moves the readings from a relative scale to an interval scale, where the same correlation can be estimated using Pearson's correlation coefficient.


Can you provide a specific example where taking logarithms changes the correlation coefficient in a material way? Please give me an example where the original series gives a coefficient close to zero, while its logarithms miraculously bring it to a meaningful value.

So far, let's take an example:

Pearson correlation between gold prices and Open Interest calculated on first differences without logarithm: 0.1968

Pearson correlation between gold prices and Open Interest calculated for ln(Pi/Pi-1): 0.2067

Now, because of the difference of 1% you can shout with delight and say on every corner that there is no way without logarithm.
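For readers who want to repeat this comparison on their own data, here is a rough sketch under stated assumptions. The two price series are synthetic (generated with a shared random shock, seed fixed for repeatability); they are not the gold and Open Interest data quoted above, and the names `price_a`, `price_b`, `diffs` and `log_returns` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cx = [a - mx for a in x]
    cy = [b - my for b in y]
    dot = sum(a * b for a, b in zip(cx, cy))
    return dot / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

def diffs(p):
    """First differences P_i - P_{i-1}."""
    return [b - a for a, b in zip(p, p[1:])]

def log_returns(p):
    """Log returns ln(P_i / P_{i-1})."""
    return [math.log(b / a) for a, b in zip(p, p[1:])]

# Synthetic data (assumption): two series driven by a common shock.
random.seed(0)
price_a, price_b = [100.0], [50.0]
for _ in range(500):
    common = random.gauss(0.0, 0.01)
    price_a.append(price_a[-1] * math.exp(common + random.gauss(0.0, 0.01)))
    price_b.append(price_b[-1] * math.exp(common + random.gauss(0.0, 0.01)))

r_diff = pearson(diffs(price_a), diffs(price_b))
r_log = pearson(log_returns(price_a), log_returns(price_b))
print("first differences:", round(r_diff, 4))
print("log returns:      ", round(r_log, 4))
```

When the price level does not change much over the sample, ln(Pi/Pi-1) is close to (Pi - Pi-1)/Pi-1, so the two estimates tend to be close; the gap can grow when the level drifts a lot.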

 
alsu:

The shape of the distribution of the correlation coefficient depends on the properties of both series and on the relationship between them, i.e. it does not have to be the same for all possible series... For a random walk it is one thing, for solar flares another...

That is a measure of the error. If the distribution is as C-4 has shown, the error is huge, and the probability of getting a larger deviation from the actual value hardly decreases. What is the point of such an indicator if, under real independence, one can get a correlation anywhere from -0.6 to +0.6 with equal probability?
 
C-4: Can you provide a specific example where taking logarithms changes the correlation coefficient in a material way? Please give me an example where the original series gives a coefficient close to zero, while its logarithms miraculously bring it to a meaningful value.

I'll try to do it.

C-4: While you catch an example:
  • Pearson correlation between gold prices and Open Interest calculated on first differences without logarithm: 0.1968
  • Pearson correlation between the price of gold and Open Interest, calculated for ln(Pi/Pi-1): 0.2067

Now, because of the difference of 1% you can shout with delight and say on every corner that without logarithm you cannot go anywhere.

I don't count the first differences ... tenths either )

On the data from your example:

  • Pearson correlation on the raw data is 0.767687.
  • The Pearson correlation on the logarithms of the raw data is 0.819971.

Seems to be in pretty good agreement with visual observation. The difference is more than 5%.

 
GaryKa:

I'll try to make one.

I don't count first differences ... tenths too...)

...

Let's first find out whether it is correct to use the correlation coefficient on ordinary price series at all. So far I have provided data suggesting that the correlation coefficient should not be computed on I(1) series.
 
C-4:
Let's first find out whether it is correct to use the correlation coefficient on ordinary price series at all. So far I have provided data suggesting that the correlation coefficient should not be computed on I(1) series.

Where have you ever seen a normality requirement for calculating the correlation coefficient? Once again, it is a requirement for correlation analysis, not for computing the coefficient.

What nonsense - "the correlation coefficient is only for normally distributed values"... By that logic you could not calculate the correlation coefficient between, for example, gold and silver quotes...

 
Demi:

Where have you ever seen a normality requirement for calculating the correlation coefficient? Once again, it is a requirement for correlation analysis, not for computing the coefficient.

What nonsense - "the correlation coefficient is only for normally distributed values"...

What does normality have to do with it? Again, an I(1) series is the cumulative sum of an I(0) series. The I(0) series is the ordinary increments, or returns. The distribution of the returns is not important. The important thing is that the correlation coefficient can only be calculated on returns, not on the price itself.
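The I(1) versus I(0) point can be checked with a small simulation. This is only a sketch under assumptions: the increments are independent Gaussian noise (so the true correlation is zero by construction), and the trial count, series length and seed are arbitrary choices, not values from this thread.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cx = [a - mx for a in x]
    cy = [b - my for b in y]
    dot = sum(a * b for a, b in zip(cx, cy))
    return dot / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

def cumulative_sum(increments):
    """Build an I(1) path as the running sum of I(0) increments."""
    level, path = 0.0, []
    for d in increments:
        level += d
        path.append(level)
    return path

def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)

random.seed(1)
n_trials, n_points = 200, 200      # arbitrary sizes (assumption)
level_corrs, incr_corrs = [], []
for _ in range(n_trials):
    # Two independent increment series: no real relationship between them.
    dx = [random.gauss(0.0, 1.0) for _ in range(n_points)]
    dy = [random.gauss(0.0, 1.0) for _ in range(n_points)]
    incr_corrs.append(pearson(dx, dy))
    level_corrs.append(pearson(cumulative_sum(dx), cumulative_sum(dy)))

print("mean |r| on increments (I(0)):", round(mean_abs(incr_corrs), 3))
print("mean |r| on levels (I(1)):    ", round(mean_abs(level_corrs), 3))
print("range of r on levels:", round(min(level_corrs), 3), round(max(level_corrs), 3))
```

With independent inputs, the coefficients computed on increments stay close to zero, while those computed on the cumulative sums are spread widely, which is the estimation-error argument made in this thread.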
 
C-4:
The important thing is that the correlation coefficient can only be calculated on returns, not on the price itself.

Again, why?
 
Demi:
Again, why?


Because: 1. see the picture above.

2. Read what Avals writes:

Avals:
this is a measure of the error. If the distribution is as shown by C-4, the error is huge and the probability of getting a larger deviation from the actual value hardly decreases. What is the point of such an indicator if one can get a correlation from -0.6 to +0.6 with real independence?
