Dependency statistics in price quotes (information theory, correlation and other feature selection methods) - page 11

 
HideYourRichess:

The concept of information entropy was introduced by Shannon for independent symbols. If you don't believe me, consult an academic dictionary. I will not argue with you on this subject any more. You cannot calculate the information entropy of the market, because you do not know the alphabet, you do not know the symbol frequencies, and whether the symbols are independent is unknown as well.
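(For reference, a textbook formula that is not part of the original post: the Shannon entropy of a source emitting independent symbols with probabilities $p_1, \dots, p_m$ is

$$H = -\sum_{i=1}^{m} p_i \log_2 p_i,$$

which indeed presupposes a known alphabet and known symbol frequencies.)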

The next issue, conditional entropy, is precisely the case where there are dependencies between the characters of the original alphabet. It is not the same thing as the information entropy under discussion.

I do not understand what conclusions you draw from the archiver example, but I will say this. The task of an archiver is to convert conditional entropy into information entropy, that is, to create a well-defined, limited alphabet whose characters, in the resulting sequence, are as independent of each other as possible. If you shuffle the ordered structure of a literary text at the letter level, those letter sequences are of course destroyed and compression deteriorates, to the point where a completely random set of letters can no longer be compressed at all.
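To illustrate the compression argument, here is a minimal sketch of my own (not code from the thread): compress a text, then compress the same letters after shuffling them. The shuffled version compresses noticeably worse, because only the unconditional letter frequencies are left for the compressor to exploit.

```python
import random
import zlib

# A toy text with strong letter-level structure (repeated phrases).
text = ("to be or not to be that is the question "
        "whether tis nobler in the mind to suffer ") * 200
original = text.encode("ascii")

# Same letters, same frequencies, but the ordering (and the dependencies) destroyed.
letters = list(text)
random.shuffle(letters)
shuffled = "".join(letters).encode("ascii")

print("original, compressed:", len(zlib.compress(original)), "bytes")
print("shuffled, compressed:", len(zlib.compress(shuffled)), "bytes")
```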


I find your formulation of the question paradoxical from the outset. If the mutual information calculation gives a value other than 0, then we have taken an alphabet with dependencies. If we study independent values, the mutual information will always be 0 (or very close to it).
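A small illustration of that claim (my own sketch: the helper mutual_information and the synthetic 30-symbol streams are assumptions, not anything from the thread). The plug-in estimate of mutual information comes out close to 0 for independent sequences and clearly positive for dependent ones.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two equal-length symbol sequences."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (u, v), cnt in pxy.items():
        p_uv = cnt / n
        mi += p_uv * np.log2(p_uv / ((px[u] / n) * (py[v] / n)))
    return mi

rng = np.random.default_rng(0)
a = rng.integers(0, 30, size=100_000)             # 30-letter alphabet
b = rng.integers(0, 30, size=100_000)             # independent of a
c = (a + rng.integers(0, 3, size=100_000)) % 30   # strongly dependent on a

print("independent pair:", round(mutual_information(a, b), 4), "bits")  # close to 0
print("dependent pair  :", round(mutual_information(a, c), 4), "bits")  # clearly positive
```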
 
Mutual entropy is not the same as conventional entropy and not the same as information entropy.
 
TheXpert: In what way are numbers not an alphabet?

An alphabet, yes - but not a number system.

Choice of alphabet.

OK, so be it. I have constructed the alphabet this way:

First I find the unconditional distribution of returns over the whole history (EURUSD, H1, about 10 years). The histogram is more or less familiar: a curve resembling a Gaussian bell, but with differences near zero and in the tails. I won't draw it here.

Then I choose how many quantiles to divide the distribution into - say, 30. This will be the alphabet. Here it is:

0: [-10000.000; -305.000),2166
1: [-305.000; -210.000),2167
2: [-210.000; -161.000),2166
3: [-161.000; -130.000),2166
4: [-130.000; -110.000),2166
5: [-110.000; -90.000),2167
6: [-90.000; -80.000),2166
7: [-80.000; -60.000),2166
8: [-60.000; -50.000),2166
9: [-50.000; -40.000),2167
10: [-40.000; -30.000),2166
11: [-30.000; -20.000),2166
12: [-20.000; -10.000),2166
13: [-10.000; -10.000),2167
14: [-10.000; 0.000),2166
15: [0.000; 10.000),2166
16: [10.000; 20.000),2167
17: [20.000; 24.000),2166
18: [24.000; 30.000),2166
19: [30.000; 40.000),2166
20: [40.000; 50.000),2167
21: [50.000; 62.000),2166
22: [62.000; 80.000),2166
23: [80.000; 90.000),2166
24: [90.000; 110.000),2167
25: [110.000; 136.000),2166
26: [136.000; 170.000),2166
27: [170.000; 211.000),2166
28: [211.000; 300.000),2167
29: [300.000; 10000.000),2167

Explanation: first comes the quantile number (from 0 to 29), then the half-open interval giving the quantile's boundaries in five-digit pips. For example, quantile 22 corresponds to a positive return from 62 to 80 pips. The last number is the count of values falling into that quantile (to check that the breakdown into quantiles is correct).
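A hedged Python sketch of this construction (my own code, with synthetic fat-tailed data standing in for the real EURUSD H1 returns; np.quantile and np.digitize do the binning into 30 symbols):

```python
import numpy as np

rng = np.random.default_rng(1)
# Fat-tailed stand-in for ~10 years of hourly returns, in pips.
returns = rng.standard_t(df=4, size=65_000) * 100

n_bins = 30
edges = np.quantile(returns, np.linspace(0, 1, n_bins + 1))
edges[0], edges[-1] = -10_000.0, 10_000.0    # widen the outermost bounds, as in the table

symbols = np.digitize(returns, edges[1:-1])  # maps each return to a symbol 0..29
counts = np.bincount(symbols, minlength=n_bins)

for k in range(n_bins):
    print(f"{k}: [{edges[k]:.3f}; {edges[k+1]:.3f}),{counts[k]}")
# Each bin holds roughly len(returns) / 30 observations, mirroring the
# near-equal counts (~2166) in the table above.
```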

Yes, it's not very pretty for large returns, since in reality returns can reach roughly 3000 new points. Well, those are the fat tails; nothing can be done about them...

This alphabet was convenient for me specifically when calculating the chi-square criterion: even for very serious deviations from independence, the minimum frequency of joint hits was never less than 5 (a condition for the chi-square test to be valid). Perhaps a different choice of alphabet would be better.
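A hedged sketch of such a check (the helper chi2_lag_test and the random stand-in symbol sequence are my assumptions; in practice the input would be the 30-symbol sequence built above). It builds a contingency table of the symbol at bar t versus the symbol lag bars earlier, runs Pearson's chi-square test of independence, and reports the smallest expected joint frequency, which the rule of thumb says should be at least 5.

```python
import numpy as np
from scipy.stats import chi2_contingency

def chi2_lag_test(symbols, n_bins=30, lag=1):
    """Chi-square test of independence between the symbol at bar t and at bar t - lag."""
    x, y = symbols[:-lag], symbols[lag:]
    table = np.zeros((n_bins, n_bins), dtype=int)   # contingency table of joint hits
    np.add.at(table, (x, y), 1)
    stat, p_value, dof, expected = chi2_contingency(table)
    return stat, p_value, expected.min()

# Stand-in symbol sequence; replace with the quantile-binned returns.
rng = np.random.default_rng(2)
symbols = rng.integers(0, 30, size=65_000)

stat, p, min_expected = chi2_lag_test(symbols, lag=1)
print(f"chi2 = {stat:.1f}, p-value = {p:.4f}, min expected joint frequency = {min_expected:.1f}")
```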

And in general, with, say, 50 quantiles, the inner limits of the outermost quantiles are pushed out to about 380 new points (instead of the previous 300). That's better, but still not great.

 
Mathemat:

Then I choose how many quantiles to divide the distribution into - say, 30. This will be the alphabet. Here it is:

If you don't mind, could you tell me how to analyse data using an alphabet? I'm currently struggling with a similar problem; so far I'm analysing it with a neural network (NS) in Matlab.

Is there any way to analyse data presented as an alphabet other than with a neural network?

 
Mathemat:

It's quite realistic. I haven't noticed any limits; sums and logarithms can certainly be computed in MQL4. I don't know what sergeev did, but as far as I know from other sources, the most difficult part of the calculation was the gamma function. There was no question of any information theory there.


People wrote an indicator according to the article by Y. Sultonov, "Universal Regression Model for Market Price Forecasting" - it's here in the Code Base.

Are there similar constructions used there? Or not?

 
HideYourRichess:
Mutual entropy is not the same as conventional entropy and not the same as information entropy.

You are getting away from the question. What is the point of applying mutual information statistics if we require the random values in the system to be independent? Mutual information will be zero in that case. It's written everywhere.

I will also say that introducing the concept of entropy into information theory was characteristic of the Soviet school. The Americans give the following classical formula for calculating mutual information:
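Presumably the formula meant here is the textbook definition, written directly in terms of the joint and marginal probabilities:

$$I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)}$$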

That is, there is no entropy as a concept here.

 
HideYourRichess: Shannon introduced the concept of information entropy for independent symbols. If you don't believe me, consult an academic dictionary.

Found an article on information entropy (Wiki). Quote 1 from there:

Entropy is the amount of information per elementary message of a source producing statistically independent messages.

It's entropy, regular entropy. Is that the definition you're talking about?

Yes, I'm willing to agree that the letters of the alphabet must be statistically independent, so that there is no redundancy or dependencies. This is roughly what an archiver does: it creates an alphabet that is clearly different from the alphabet the text was written in.

But that's not what we're counting! What we are counting comes next.

Further, you have already been given Quote 2 from the same place:
Conditional entropy

If the sequence of an alphabet's symbols is not independent (for example, in French "q" is almost always followed by "u", and the word "vanguard" in Soviet newspapers was usually followed by "production" or "labour"), the amount of information carried by a sequence of such symbols (and hence the entropy) is obviously smaller. Conditional entropy is used to account for such facts.
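For completeness, the textbook formula behind that quote (standard notation, not taken from the article) is

$$H(Y \mid X) = -\sum_{x,y} p(x,y)\,\log_2 p(y \mid x),$$

which is exactly what accounts for dependencies such as the "q"-"u" example.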

This is different, and you have already written about it:

HideYourRichess : The next question, conditional entropy, is exactly the case when there are dependencies between characters of the original alphabet. This thing is not the same as the information entropy in question.

The topic starter (and I as well) was talking not about information entropy but, dammit, about mutual information (Wiki again)!!!

Mutual information is a statistical function of two random variables describing the amount of information contained in one random variable relative to the other.

Mutual information is defined through the entropy and conditional entropy of two random variables as [next comes the formula for I(X, Y)].
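The formula referred to is presumably the standard identity

$$I(X;Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X,Y).$$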

Now for your final argument:

HideYourRichess: The task of the archiver is to convert conditional entropy into information entropy, that is, to create a well-defined, bounded alphabet whose characters, in the resulting sequence, are as independent as possible. If you shuffle the ordered structure of a literary text at the letter level, those letter sequences are of course destroyed and compression deteriorates, to the point where a completely random set of letters can no longer be compressed. So what? What does that have to do with the discussion?

My point is that this is not about what you call information entropy, but about mutual information. That's it. Full stop. The argument is over.

 
IgorM:

If you don't mind, could you tell me how to analyse data using an alphabet? I'm currently struggling with a similar problem; so far I'm analysing it with a neural network (NS) in Matlab.

Are there any other ways to analyse data represented as an alphabet besides a neural network?

To be honest, I don't quite understand your question. We simply assign an ordinal number to each character of the alphabet and then analyse those numbers as usual. Perhaps there is something more specific, but I am not aware of it.

Roman: People wrote an indicator according to Y. Sultonov's article "Universal Regression Model for Market Price Forecasting" - it's here in the Code Base.

Are there similar constructions used there? Or not?

There is not even a hint of probability theory/statistics or information theory there! Yusuf did post in this thread, but his post turned out to be beside the point, as it has nothing to do with the topic of discussion. Although... yes, I think there were logarithms there...
 
Mathemat:

There is not even a hint of probability theory/statistics or information theory there! Although... yes, there were logarithms, I think...

I'm just saying that the curves and squiggles here and here look a lot alike to me... :-))), including the presence of a gamma distribution, hence the approaches to the solution should be SIGNIFICANTLY similar.

Is such a thing possible, at least IN PRINCIPLE?

 

The point is that the gamma distribution function appears in the article as if out of thin air, supposedly from solving a deterministic differential equation of motion - but not as the result of any statistical or probability-theoretic analysis. Roman, so far I don't see any similarity in the approaches to the solution - even a loose one.

But if you look closely, some similarity can still be found - say, in the word "distribution", which is also found in Yusuf's article :)
