All Things Statistical - page 6

 

Stylized Facts and Statistical Issues

http://www.proba.jussieu.fr/pageperso/ramacont/papers/empirical.pdf

Summary: The paper above summarizes the empirical regularities of generic financial time series, also known as stylized facts. Note that these facts are generalizations across financial markets, so they may not hold in every specific market. Some papers have also questioned whether certain stylized facts are really present. For instance, a new wave of academic papers has questioned whether the extremely high kurtosis (fat tails) is really there.

=================================================

Sample moment versus Sample Size taken from the paper:

This part is really interesting, and the picture seems to point to a menacing fact. As the paper says, the second moment (the variance) does appear to be finite, but the sample estimate only settles down after several thousand observations. The implications are devastating. Just how large a sample is truly enough? That leaves quite some food for thought.
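For anyone who wants to see the effect without the paper's data, here is a minimal sketch in Python. It does not use real market returns: it assumes returns can be mimicked by Student-t draws with 3 degrees of freedom (a heavy-tailed distribution whose variance is finite but whose fourth moment is not). The function names `student_t` and `running_moments` are made up for illustration.

```python
import random

def student_t(df, rng):
    """One Student-t draw: standard normal divided by sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape=df/2, scale=2)
    return z / (chi2 / df) ** 0.5

def running_moments(xs, checkpoints):
    """Return (n, sample variance, sample kurtosis) of xs[:n] for each n."""
    rows = []
    for n in checkpoints:
        head = xs[:n]
        mean = sum(head) / n
        m2 = sum((x - mean) ** 2 for x in head) / n   # second central moment
        m4 = sum((x - mean) ** 4 for x in head) / n   # fourth central moment
        rows.append((n, m2 * n / (n - 1), m4 / m2 ** 2))
    return rows

rng = random.Random(42)  # fixed seed so the run is repeatable
returns = [student_t(3, rng) for _ in range(20000)]  # assumed stand-in for returns
for n, var, kurt in running_moments(returns, [100, 500, 1000, 5000, 20000]):
    print(f"n={n:6d}  variance={var:8.3f}  kurtosis={kurt:8.2f}")
```

The theoretical variance of this assumed distribution is 3, so the variance column can be compared against that as n grows. The kurtosis column has no finite target to settle on, so it should keep jumping whenever a new extreme draw arrives.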

Wintersky

Cheers

Files:
sample1.jpg  58 kb
 

This is a good video every trader should watch, and I think this is the right thread to post it in.

Alex

IIJ JPM VIDEO PLAYER

 
wintersky111:
An elementary introduction to R-Squared, which I chanced upon while reviewing the statistical tools at our disposal...

TASC article on R-Squared:

identifying_market_trends_1.pdf

A Visual, Simplified Explanation of R-Squared:

R-Squared: Sometimes, a Square is just a Square

Proper Interpretation of R-Squared:

Regression Analysis: How Do I Interpret R-squared and Assess the Goodness-of-Fit? | Minitab

Why NOT to use R-Squared for Nonlinear Regression:

Why Is There No R-Squared for Nonlinear Regression?

Enjoy

Wintersky

I like the standard error bands (confidence bands) of the regression in the article Regression Analysis: How to Interpret S, the Standard Error of the Regression | Minitab. Another goodness-of-fit measure is the Akaike information criterion: Akaike information criterion - Wikipedia, the free encyclopedia

 
nevar:
Another goodness-of-fit measure is the Akaike information criterion: Akaike information criterion - Wikipedia, the free encyclopedia

Interesting. I haven't heard of this before. So it seems that, based on information theory, the maximized likelihood and the number of parameters, there can be a gauge for overfitting. Personally, I'm still quite confused about maximum likelihood.
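A small sketch may help with the maximum likelihood part. For a least-squares fit with Gaussian errors, maximizing the likelihood is the same as minimizing the residual sum of squares (RSS), and AIC = 2k - 2 ln(L) reduces, up to a constant that is the same for every model, to n * ln(RSS / n) + 2k. The code below assumes made-up data (a quadratic plus noise) and compares polynomial fits of rising degree; `solve`, `poly_rss` and `aic_gaussian` are illustrative helpers, not from any library.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def poly_rss(xs, ys, degree):
    """Residual sum of squares of the least-squares polynomial of the given degree."""
    cols = [[x ** p for x in xs] for p in range(degree + 1)]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * y for u, y in zip(ci, ys)) for ci in cols]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * x ** p for p, b in enumerate(beta)) for x in xs]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))

def aic_gaussian(rss, n, k):
    """AIC, up to an additive constant, for a least-squares fit with Gaussian errors."""
    return n * math.log(rss / n) + 2 * k

rng = random.Random(7)
xs = [i / 30.0 - 1.0 for i in range(61)]  # 61 points on [-1, 1]
# Assumed "true" model for the demo: a quadratic plus Gaussian noise.
ys = [1.0 + 2.0 * x - 1.5 * x * x + rng.gauss(0.0, 0.3) for x in xs]

for degree in range(6):
    rss = poly_rss(xs, ys, degree)
    k = degree + 2  # polynomial coefficients plus the noise variance
    print(f"degree={degree}  RSS={rss:8.3f}  AIC={aic_gaussian(rss, len(xs), k):8.2f}")
```

RSS can only go down as the degree rises, so on its own it always favours the most complex model. The 2k term is what pushes back: the model with the lowest AIC is the preferred one among those compared.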

Wintersky

Cheers

 

Original post here:

https://www.mql5.com/en/forum/183054/page3

I've been thinking a bit about the volume aspect ever since seeing that article, but I can't wrap my head around a few things. Not sure if this is a digression from our statistical thread, but I am thinking more about the actual application of tick volume, given that the correlation has been shown.

According to age-old wisdom and current Volume Spread Analysis (VSA) knowledge, persistent price increases are accompanied by volume decreases and vice versa. However, this logic would hold for markets with "limited quantity", such as commodities and stocks, where price upside is possibly more difficult than price downside and the price evolves from a single commodity or stock. In foreign exchange, by contrast, two currencies are involved, since we are talking about currency pairs, so the above theory would not be applicable.

However, this does not rule out that volume has some tangible benefit. It is just that I have yet to learn more about it, as the volume-price relationship seems more complicated than it first appears, not to mention the applicability of volume on different timeframes. I am starting by reading Simba's thread and the VSA thread at FF:

https://www.mql5.com/en/forum/178912

vsa with Malcolm @ Forex Factory

Feedback welcome.

Wintersky

Cheers

 

Multicollinearity Part 1

Hi All,

Back after a long time away, and I have come to the view that there are lots of new issues I am seeking a clearer answer to, such as practical collinearity issues as they pertain to our FX environment...

The usual definition of Multicollinearity: "High multicollinearity results from a linear relationship between your independent variables with a high degree of correlation but aren’t completely deterministic"

OR

"Multicollinearity is simply the multiple counting of the same information"

Links Available:

What Are the Effects of Multicollinearity and When Can I Ignore Them?

When Can You Safely Ignore Multicollinearity? | Statistical Horizons

Multicollinearity Of TA Indicators Here:

Multicollinearity [ChartSchool]

muticollinearity_of_technical_indicators.pdf

Wintersky

Cheers

 
Nice. Thanks

 
 

Good Day All,

Things are going to start getting more difficult from here on, with the definition of what constitutes multicollinearity. Apparently, high pairwise correlation is NOT the same thing as multicollinearity:

"High correlation is neither necessary nor sufficient for collinearity. 1) Not necessary: Collinearity can exist among sets of variables; you can have all small correlations and yet very high collinearity. Suppose, for example, you have 10 independent variables. The first 9 all have 0 correlation among them. The 10th is the first nine, added up. No high correlation but perfect collinearity (although, in the real world, the 9 wouldn't be exactly uncorrelated and the 10th wouldn't be exactly the sum of the others, so it wouldn't be exact collinearity). 2) Not sufficient. The argument here is trickier, but it turns out that, in some cases, you can have variables that are fairly highly correlated that do not cause collinearity"

Why is multicollinearity bad in layman's terms? - Quora
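The ten-variable example in the quote is easy to try out. The sketch below assumes simulated data only: nine independent Gaussian columns, and a tenth that is their sum plus a little noise (so the collinearity is strong but not exact, as the quote says it would be in the real world). It then compares the largest pairwise correlation with the R-squared from regressing the tenth column on the other nine. The helper names are illustrative.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(y, cols):
    """R-squared from regressing y on the given columns plus an intercept."""
    design = [[1.0] * len(y)] + cols
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, design)) for i in range(len(y))]
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - rss / tss

rng = random.Random(1)
n = 500
first_nine = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(9)]
# Tenth variable: the sum of the first nine plus small noise (assumed, for the demo).
tenth = [sum(col[i] for col in first_nine) + rng.gauss(0.0, 0.1) for i in range(n)]

columns = first_nine + [tenth]
largest = max(abs(pearson(columns[i], columns[j]))
              for i in range(10) for j in range(i + 1, 10))
r2 = r_squared(tenth, first_nine)
print(f"largest pairwise |correlation| = {largest:.3f}")
print(f"R-squared of the 10th on the other 9 = {r2:.4f}")
print(f"implied VIF = {1.0 / (1.0 - r2):.0f}")
```

The point to look for is a modest largest pairwise correlation next to an R-squared close to 1. A correlation matrix only looks at variables two at a time, so it can miss redundancy that involves several variables together.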

==========================================================

And under the same link above, there's another issue popping up:

"Collinearity is what happens when you have only 1 variable's worth of information spread among 2 variables. Whatever is the effect of the 1 true variable on your outcome, you have no idea how to split that effect among your 2 proxy variables"

If it were a physics experiment, the available data variables might be temperature, particle velocity, vectors of speed and direction, particle distance travelled, etc.

If it were a biology experiment, the data variables involved might be mutation rate/cell count, heat levels, etc.

In FX, though, there isn't much to work with. There is OHLC, but the open of the current bar is essentially the close of the previous bar. As for the practical construction of indicators, there is only the close by itself, plus the highs and lows, which might be used to build volatility indicators. And in the first place, OHLC is really just one variable: price...

High Multicollinearity and Your Econometric Model - For Dummies

Wintersky

Cheers

Files:
mc2.jpg  16 kb
 

Multicollinearity Part 3

Found a very good page on 6 practical ways to solve multicollinearity issues:

6 Ways to Address Collinearity in Regression Models | Learn it daily.

1. Manual Variable Selection

2. Tree Based Automatic Selection

3. Regression Based Automatic Variance Selection

4. Variable Reduction Via PCA

5. Variable Reduction Via Partial Least Squares

6. Parameter Estimation Via Shrinkage Methods

For parameter estimation via shrinkage methods, what seems to be happening is that a user-chosen tuning number (the penalty) is used to penalize large coefficients, shrinking them toward zero.
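As a rough illustration of shrinkage, here is a minimal ridge regression sketch for the two-predictor case, where the solution of (X'X + lam * I) b = X'y can be written out by hand. Everything in it is assumed for the demo: the simulated data, the "true" coefficients of 1 and 1, the penalty values, and the function name `ridge_two_predictors`. Ridge is only one of the shrinkage methods; the lasso uses a different penalty.

```python
import random

def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for y ~ b1*x1 + b2*x2 with no intercept.

    Solves (X'X + lam*I) b = X'y for the 2x2 case in closed form.
    lam = 0 gives ordinary least squares.
    """
    a = sum(u * u for u in x1) + lam
    c = sum(v * v for v in x2) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(u * w for u, w in zip(x1, y))
    e = sum(v * w for v, w in zip(x2, y))
    det = a * c - b * b
    return ((d * c - b * e) / det, (a * e - b * d) / det)

rng = random.Random(3)
n = 200
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [u + rng.gauss(0.0, 0.05) for u in x1]  # nearly a copy of x1: strong collinearity
# Assumed data-generating model: both coefficients equal to 1, no intercept.
y = [u + v + rng.gauss(0.0, 1.0) for u, v in zip(x1, x2)]

for lam in (0.0, 1.0, 10.0):
    b1, b2 = ridge_two_predictors(x1, x2, y, lam)
    print(f"lam={lam:5.1f}  b1={b1:7.3f}  b2={b2:7.3f}")
```

With lam = 0 the data can hardly tell the two near-identical predictors apart, so the individual coefficients can land far from 1 and 1 even though their sum is well determined. As lam grows, the penalty pulls the two coefficients toward each other and toward zero, trading a little bias for much less variance. How large lam should be is a separate question, usually settled by cross-validation.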

And of course there is one more option: increasing the sample size to reduce multicollinearity issues.

https://www3.nd.edu/~rwilliam/stats2/l11.pdf

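The link below suggests checking the Variance Inflation Factor (VIF) and getting concerned when it exceeds 2.50. Since VIF = 1 / (1 - R-squared), where the R-squared comes from regressing one variable on the others, a VIF of 2.50 corresponds to an R-squared of 0.60. As a more clear-cut rule for the two-indicator case: if one indicator has an R-squared of 0.60 or more with another single indicator, it ought to be eliminated. Sounds reasonable? I don't know... It would certainly be good if someone who knows of a link giving R-squared values between different indicators could post it here.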

When Can You Safely Ignore Multicollinearity? | Statistical Horizons

============================================

But the main issue here (at this forum) is that we are (I believe) NOT using multiple variables to build a regression, but using single indicators standalone and comparing their signals...
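Even with standalone indicators, it is possible to measure how much they overlap. Below is a minimal sketch that builds three indicators from one series of closes and reports the pairwise R-squared (squared correlation) together with the matching two-variable VIF, 1 / (1 - R-squared). The closes are a simulated random walk, not real FX data, and the three indicators (momentum, close minus SMA, and a plain-average RSI rather than Wilder's smoothed version) were picked only as examples.

```python
import random

def sma(values, period):
    """Simple moving average; output has len(values) - period + 1 entries."""
    return [sum(values[i - period + 1:i + 1]) / period
            for i in range(period - 1, len(values))]

def momentum(values, period):
    """Close minus the close `period` bars earlier."""
    return [values[i] - values[i - period] for i in range(period, len(values))]

def simple_rsi(values, period):
    """RSI from plain sums of gains and losses over the last `period` changes."""
    out = []
    for i in range(period, len(values)):
        changes = [values[j] - values[j - 1] for j in range(i - period + 1, i + 1)]
        gains = sum(c for c in changes if c > 0)
        losses = -sum(c for c in changes if c < 0)
        out.append(100.0 * gains / (gains + losses) if gains + losses > 0 else 50.0)
    return out

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def vif_from_r2(r2):
    """Variance inflation factor implied by an R-squared value."""
    return 1.0 / (1.0 - r2)

rng = random.Random(11)
close = [1.3000]
for _ in range(2000):
    close.append(close[-1] + rng.gauss(0.0, 0.001))  # simulated closes, not real FX data

period = 14
indicators = {
    "momentum": momentum(close, period),
    "close - SMA": [c - s for c, s in zip(close[period - 1:], sma(close, period))],
    "RSI": simple_rsi(close, period),
}
m = min(len(v) for v in indicators.values())
aligned = {name: v[-m:] for name, v in indicators.items()}  # every series ends on the last bar

names = list(aligned)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r2 = pearson(aligned[names[i]], aligned[names[j]]) ** 2
        flag = "overlap" if r2 >= 0.60 else "ok"
        print(f"{names[i]:12s} vs {names[j]:12s}  R2={r2:.2f}  VIF={vif_from_r2(r2):.2f}  {flag}")
```

To run it on real data, replace the simulated `close` list with actual closing prices. Note that a pairwise R-squared only covers two indicators at a time; a full VIF regresses each indicator on all of the others.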

Or, mind which is the horse and which is the cart: if the structural theory is sound, multicollinearity isn't really a major issue to worry about... or is it?

"The use of four different indicators all derived from the same series of closing prices to confirm each other is a perfect example: Bollinger On Bollinger Bands"

https://lcchong.wordpress.com/2011/05/10/avoid-multicollinearity/

Wintersky

Cheers
