Taking Neural Networks to the next level - page 30

 
Chris70:

@Young Ho Seo

First of all: thank you for your comprehensive answer. You give good and interesting explanations. Just one thing: according to the title, people expect to read about neural networks; please open a new thread that only deals with fractals, Elliott Waves and/or Fibonacci numbers (or use one of the existing threads about this) if you want to discuss this in more detail (apart from that... sure, we can do that). Out of respect for your detailed answer, I nevertheless won't leave you without a response here, but let's continue elsewhere if that is your interest.

I have a good understanding of how fractals in trading are made, so there is no need for an explanation, but still: I respect your answer and it will probably be informative for many who haven't heard much about fractals.

My point about "how many fractals" was simply that you always need at least three data points (in that example: candles from a daily chart) to build a triangle, so you always end up with far fewer fractals than data points. When using a daily chart, this obviously isn't that many fractals per year, which also limits the expected accuracy of distribution curves collected from just ten years of daily candles, because it sets the upper(!) limit of how many fractals you can have. According to the hypothesis of scale-independence, the lower limit is obviously just 1, meaning you have zoomed completely out of the picture.

Just to avoid confusion: although there is no theoretical upper limit in fractal geometry and although we could "zoom in" infinitely in computer models, practice often tells us otherwise, i.e. if we zoom into a cauliflower or the bronchial branches inside a lung, we see self-repetition on many levels, but there is a limit at the latest once we reach the cellular level. It's the same in trading: fractals cannot go beyond the tick level. Of course, this isn't a problem in practice - I just wanted to clarify this when I talk about "limits".

I also understand - if we talk about something like "daily" candles - that it's not essential to use time for the x-axis scaling. In fact, I remember an interview with Mandelbrot (about finance) in which he said he was more pleased with his findings once he took the time factor out of the equation, so it's a good point that you mention Renko charts.

You see, as I'm talking about fractals myself, I obviously don't doubt their existence, not even in trading. I have worked with fractals myself (just in a different way) and think they can be useful to divide prices into meaningful portions on whatever detail level; more so, peak/trough analysis is probably the go-to method to scan for support and resistance levels. This is why I have no doubt that fractals are useful.

My point of critique is about the kind of self-repetition and predictability. Because fractals in trading (and often in nature, too) have an uncannily simple base structure - just think of Mandelbrot's famous z²+c formula - it doesn't come as a surprise that we see self-similarity on all levels/scales. We need only very few parameters to define a fractal in trading, like the amount of retracement and the momentum/steepness of the triangle edges (or anything that relates the x-axis to the y-axis). With so few parameters, it is impossible not to see similar triangles over and over again. But you never know which triangle appears when. What's missing is regularity. I know, Mandelbrot didn't like the opposing term "irregularity" and preferred "roughness". So let's take examples from nature again: the cauliflower has a measurable roughness and near-perfect self-repetition [edit: I'm talking about this "Romanesco" type stuff... sorry, no vegetable expert ;-) ]. If you know what the global shape looks like, you will find almost the same shapes on lower levels, and therefore it isn't hard to unravel their probable shape almost perfectly without ever having seen them. This is different with the mountain or coastline examples: you could computer-generate them by giving only the roughness as a single variable. But the problem is: you will generate some mountain, but not always the same mountain. Every newly generated mountain will be completely different. We have self-similarity, but also complete randomness at the same time. This doesn't mean that other traits of fractals aren't true: you will find mountain-like shapes on any scaling level, maybe even down to an atomic lower limit. But with near-constant roughness and self-similarity alone we don't automatically have self-identity and predictability, too. Back to trading: price fractality is like the mountain, not the cauliflower, if you like. [edit: you're probably referring to the same thing as "loose similarity"; that's all fine, but it's a problem for trading decisions; it's a big step from just recognizing fractals or building a fractal-based notation system to valid (repeatable) signals; with the latter I'm missing the empirical proof that this is possible with statistical significance]

I hope this was a good explanation of the limitations of fractals in trading. Even though we often observe near-perfect self-repetition in nature, there are also many differing examples, i.e. self-repeating near-chaos. The latter isn't useful for predictability at all - and this is the problem we have in trading.

I mentioned several times in this thread that I don't actually believe in perfect chaos in trading. If I did, trading would be pointless - I should walk away and never trade again. There is at least a small amount of non-randomness, mostly during extreme events (e.g. if the price has just dropped by 100 pips within a minute, who's gonna catch the knife and what's the more probable direction of the next tick?). My personal philosophy therefore is that trading works best by reacting to the exception (machine learning is just a method for automatically finding such higher-probability situations) - and by the way: even Mandelbrot, as an expert on distribution tails(!), would have agreed with me on this. Self-repetition, on the other hand, is pretty much the opposite: we react to expected regularity. This just doesn't work in trading! You "trade the mountain, not the cauliflower". This is why fractals (and Elliott Waves!) fail in trading. But to stay in the analogy: what happens if a (specific) mountain is the exception to the expected fractal geometry, i.e. if we find areas of much less chaos than the average of the rest of the mountain? Just imagine a steep cliff that has just dropped 1000 feet straight down: what's more likely... that out of pure chance you have just found exactly the point where the cliff ends, or might it continue down by another few feet? I hope you understand how this relates to trading: it's much better to trade the scenarios where fractality rules apply the least.

By the way: what seems like a good idea in order to actually get something useful out of fractals (instead of some Elliott voodoo predictions): finding a measure of fractal "roughness" in trading (or simply copying this from Mandelbrot's work), then measuring its variation and taking the exceptions, i.e. those regions with extremely high or low roughness, for generating trading signals (e.g. using roughness as trend/range separation). Something like a "roughness-standard-deviation-oscillator". Interesting, right?

I think I gave some undeniable arguments, but I'm sure I'm not going to convince you if you have been deep into the subject for many years and I'm now asking you to think outside the box again. We both have our convictions and that's okay.

(Again, just please: open another thread if you feel that a further discussion is necessary (although I don't think myself that there is much left to clarify).)


That is a good comment you wrote here.

Thanks so much, Chris.

However, I do not have time to bring the fractal theory down to the cellular or atomic level, although I like the idea.

So I will pass this one.

But this one seems like a good comment, though.


By the way: what seems like a good idea in order to actually get something useful out of fractals (instead of some Elliott voodoo predictions): finding a measure of fractal "roughness" in trading (or simply copying this from Mandelbrot's work), then measuring its variation and taking the exceptions, i.e. those regions with extremely high or low roughness, for generating trading signals (e.g. using roughness as trend/range separation). Something like a "roughness-standard-deviation-oscillator". Interesting, right?


This goes exactly back to my post here about the fractal dimension.

Fractal dimension, left to us by Benoit Mandelbrot, is one approach to measuring the roughness of a price series.

I am sure there are other approaches too, but it is better to start with the fractal dimension.

This would reveal a lot of useful insight about fractal patterns and the trading strategy itself.
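Just to make the idea concrete, here is a rough Python sketch (illustrative only, not anyone's production code): the Katz estimator is just one of several possible fractal dimension estimators (box-counting, Higuchi, etc. are alternatives), and the rolling z-score follows the "roughness-standard-deviation-oscillator" idea from the quoted post. The window sizes are arbitrary.

```python
import numpy as np

def katz_fd(x):
    """Katz fractal dimension of a 1-D series (one of several possible
    roughness estimators; box-counting, Higuchi etc. are alternatives)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - 1                               # number of steps
    dists = np.abs(np.diff(x))
    L = dists.sum()                              # total path length
    d = np.max(np.abs(x - x[0]))                 # max distance from the first point
    if L == 0 or d == 0:
        return 1.0
    return np.log10(n) / (np.log10(n) + np.log10(d / L))

def roughness_oscillator(close, window=100, z_window=500):
    """Rolling Katz dimension plus a z-score of its variation:
    a crude 'roughness-standard-deviation-oscillator'."""
    fd = np.full(len(close), np.nan)
    for i in range(window, len(close)):
        fd[i] = katz_fd(close[i - window:i])
    z = np.full(len(close), np.nan)
    for i in range(window + z_window, len(close)):
        hist = fd[i - z_window:i]
        z[i] = (fd[i] - np.mean(hist)) / (np.std(hist) + 1e-12)
    return fd, z

# toy usage on a random-walk "price" series
close = np.cumsum(np.random.normal(0, 1, 5000)) + 100
fd, z = roughness_oscillator(close)
print(fd[-1], z[-1])   # current roughness and how exceptional it is
```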

In fact, that was one of the reasons why my motivation for the price pattern study in financial trading doubled up.


One particular tool is designed to examine or to model only one particular phenomenon in our life. We can't expect one particular tool to be the magical tool that does everything in our life.

This will never happen. :)


Hence, I am not here to convince you, and neither will you convince me.

It is the same as if I were a sword master and you a master of the spear, and we each tried to say our own weapon is better than the other's.

Probably this is not necessary.

We all know neural networks are undoubtedly the beast that could catch up with humans sooner or later.

At the same time, we know that there are price action traders running tons of money with their pattern trading skills.


My groundbreaking work was motivated by the successful price action and pattern traders I personally met on the trading floor.

However, many of us using this methodology are not able to think beyond the methodology and the data itself.

Hence they are not adding much value to our mainstream knowledge.

It is like knowing that microwaved grapes can create plasma.

But we do not understand why they create plasma inside the microwave (it seems like useful information for the energy crisis).


So I stepped up to try to connect them to science, to at least try to be useful.

That is all. It is my personal interest and, I think, the dominating interest of my life now. :)


But thanks so much for sharing your comments with me.

In fact, I wanted to write something here when someone was talking about ARFIMA somewhere in your thread several weeks ago.

But only now do I have time to write something.


Please keep up that good work of yours. You are very knowledgeable in what you are doing.

As I said, people need to understand and support those who do groundbreaking work voluntarily, like you and I. :)

 

Thanks for your answer.

Well, I could even use "roughness" as an additional neural network input and we would even return to the subject ;-)

I just had a quick look on Wikipedia about fractal dimension and I see that there are many different formulas/concepts.

Is there any specific formula that you can recommend (and why?) in order to express fractal roughness in terms of numbers?

 
Chris70:

Thanks for your answer.

Well, I could even use "roughness" as an additional neural network input and we would even return to the subject ;-)

I just had a quick look on Wikipedia about fractal dimension and I see that there are many different formulas/concepts.

Is there any specific formula that you can recommend (and why?) in order to express fractal roughness in terms of numbers?

Yes, there are several fractal dimensions, including the box-counting dimension, the information dimension, the correlation dimension, etc.


https://en.wikipedia.org/wiki/Fractal_dimension


The thing is, a neural network is a black box machine.

I cannot guess how a neural network with your own architecture will react to this fractal dimension input.

I am sure that the architecture I am thinking of is also different from yours in terms of the number of hidden layers, activation functions, error feedback, etc.


So the best bet is to go through them all and pick the best one for your architecture, if you can.

I think you already know that this is data science anyway. :)

These are very unique inputs to a neural network and you can probably keep them together with your other inputs.
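For example, a rough Python sketch of that selection loop (illustrative only: X_base, y and the candidate columns below are random placeholders, not anyone's real feature pipeline; in practice the candidate columns would hold the different fractal dimension estimates computed from the price series):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000
X_base = rng.normal(size=(n, 8))                 # placeholder for existing inputs
y = (rng.random(n) > 0.5).astype(int)            # placeholder next-bar direction labels
candidates = {
    "none":         None,
    "box_counting": rng.normal(size=(n, 1)),     # placeholder columns; in practice
    "higuchi":      rng.normal(size=(n, 1)),     # computed from the price series
    "katz":         rng.normal(size=(n, 1)),
}

cv = TimeSeriesSplit(n_splits=5)                 # walk-forward splits, no shuffling
for name, extra in candidates.items():
    X = X_base if extra is None else np.hstack([X_base, extra])
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(32, 16),
                                        max_iter=300, random_state=0))
    score = cross_val_score(model, X, y, cv=cv).mean()
    print(f"{name:>13}: mean out-of-sample accuracy {score:.3f}")
```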


The image you have sent to me is also interesting, and I am glad that you made some improvement to your system with your first attempt at fractals.

As I mentioned before, calculating the fractal dimension is only the first step in fractal analysis, but it is a big step forward. :)



 
Young Ho Seo:

Yes, there are several fractal dimensions, including the box-counting dimension, the information dimension, the correlation dimension, etc.


https://en.wikipedia.org/wiki/Fractal_dimension


The thing is, a neural network is a black box machine.

I cannot guess how a neural network with your own architecture will react to this fractal dimension input.

I am sure that the architecture I am thinking of is also different from yours in terms of the number of hidden layers, activation functions, error feedback, etc.


So the best bet is to go through them all and pick the best one for your architecture, if you can.

I think you already know that this is data science anyway. :)

These are very unique inputs to a neural network and you can probably keep them together with your other inputs.


The image you have sent to me is also interesting, and I am glad that you made some improvement to your system with your first attempt at fractals.

As I mentioned before, calculating the fractal dimension is only the first step in fractal analysis, but it is a big step forward. :)



"First attempt" isn't exactly true. I've use fractals often to obtain major peaks/troughs that mark price levels that might act as support/resistance and therefore oportunities for breakout or pull-back trades. This never worked very well for the monetary aspects in Expert Advisors, though. I also don't believe in the whole extrapolation/prediction part on the basis of fractals and am hesitant to call fractals an improvement. 

Neural networks can accept anything as inputs as long as it can be put into numbers (usually combined with normalized scaling and, for time series, usually combined with a stationary transformation). Data from fractals make no difference. The question is whether they have any meaning. Because backpropagation penalizes anything that isn't relevant for the correct outputs, the network weights associated with paths coming from irrelevant features will over time just go down to near zero, so that these features are practically ignored - which is exactly what might happen here (as with many other less relevant features, too).

But: keeping irrelevant features in the data set also has some downsides, because although irrelevant, they may still make just "memorizing" input combinations easier for the network and hence contribute to curve fitting, so mindful feature selection also has its benefits.
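One illustrative way to do such a relevance check (just a sketch with scikit-learn on synthetic data, not what the Expert Advisor actually does): shuffle one input column at a time on the validation slice and see whether the score actually drops. In this toy example only column 0 carries signal, so the other columns should come out with near-zero importance.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

# hypothetical arrays: X columns = [existing features..., fractal dimension],
# y = next-bar direction; chronological train/validation split, no shuffling
rng = np.random.default_rng(1)
X = rng.normal(size=(3000, 10))
y = (X[:, 0] + 0.1 * rng.normal(size=3000) > 0).astype(int)   # only column 0 matters here
split = 2400

model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(X[:split], y[:split])

# permute one column at a time on the validation slice and measure the score drop;
# features whose shuffling barely hurts are candidates for removal
result = permutation_importance(model, X[split:], y[split:], n_repeats=20, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: importance {imp:+.4f}")
```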

I'm seeing this problem with all the other "indicators", too. Giving so many variables allows the network to draw a very detailed (recognizable --> memorizable) picture, which can be a problem. I'm just trying many neural-network-related things here, seeing what works and what doesn't, and sharing it with you guys, so that others don't need to repeat my mistakes. My experience with this indicator-based version for NN price prediction so far is that I get incredibly low training set errors (I've seen results like "predicting" the wrong direction of the next bar in as little as 2% of cases --> memorizing the training set), which confirms that the "learning" is working, but the generalization is much worse. I'm really struggling with getting the validation error down - which only proves what was to be expected and gives some insight into the question whether indicators (used as neural network inputs) help compared to pure price. What I'm seeing so far is that they clearly don't. The only part that I don't understand is that I still get really good backtest results, but most of us [ --> ;-) ] know that backtests can give a wrong idea. If I remember the numbers correctly, what had worked best so far for getting the validation error down was pure price (as a fractional stationary series) with multi-currency inputs.
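For illustration, if "fractional stationary series" is read as fixed-window fractional differencing (one common way to make price roughly stationary while keeping more memory than a full first difference), a minimal Python sketch could look like this; the order d=0.4 and the weight threshold are arbitrary example values:

```python
import numpy as np

def fracdiff_weights(d, threshold=1e-4, max_len=1000):
    """Weights of the fractional difference operator (1 - B)^d,
    truncated once they become negligible."""
    w = [1.0]
    for k in range(1, max_len):
        w_k = -w[-1] * (d - k + 1) / k
        if abs(w_k) < threshold:
            break
        w.append(w_k)
    return np.array(w)

def fracdiff(series, d=0.4, threshold=1e-4):
    """Fractionally differenced series: more stationary than raw price,
    but with more memory than a full first difference (d = 1)."""
    x = np.asarray(series, dtype=float)
    w = fracdiff_weights(d, threshold)
    out = np.full(len(x), np.nan)
    for t in range(len(w) - 1, len(x)):
        # newest observation first, matching the weight ordering w_0, w_1, ...
        out[t] = np.dot(w, x[t - len(w) + 1:t + 1][::-1])
    return out

# hypothetical usage: one fractionally differenced column per currency pair
price = np.cumsum(np.random.normal(0, 1, 3000)) + 100
stationary = fracdiff(price, d=0.4)
```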

Because you mention it: network architecture is always a bit of trial and error with a few rules of thumb or "best practices". You then just observe the effects of your changes, i.e. mainly look at the cost function graph and keep what works. I wrote my code in a way that e.g. adding layers (with any number of neurons) and assigning activation functions is done in single lines of code (just like adding layers in Python), which keeps the architecture flexible. Changing the cost function or the optimizer algorithm is done just as quickly.
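To show the Python analogy being referred to: in Keras the same "one line per layer" flexibility looks roughly like this (a generic sketch only, not the MQL5 framework discussed here; input size, layer sizes, activations, loss and optimizer are arbitrary choices):

```python
import tensorflow as tf

# each layer, the loss and the optimizer can be swapped in a single line
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64,)),           # 64 input features (arbitrary)
    tf.keras.layers.Dense(128, activation="tanh"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```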

 
Chris70:

"First attempt" isn't exactly true. I've use fractals often to obtain major peaks/troughs that mark price levels that might act as support/resistance and therefore oportunities for breakout or pull-back trades. This never worked very well for the monetary aspects in Expert Advisors, though. I also don't believe in the whole extrapolation/prediction part on the basis of fractals and am hesitant to call fractals an improvement. 

Neural networks can accept anything as inputs as long as it can be put into numbers (usually combined with normalized scaling und for time series usually combined with stationary transformation). Data from fractals make no difference. The question is if they have any meaning. Because backpropagation penalizes anything that isn't relevant for the correct outputs, the network weights associatiated with paths coming from irrelevant features over time will just go down to near zero, so that these features are practically ignored, which is exactly what might happen (like for many other less relevant features, too).

But: keeping irrelevant features in the data set also has some downsides, because although irrelevant, they may still make just "memorizing" input combinations easier for the network and hence contribute to curve fitting, so a mindful feature selection also has its benefits. 

I'm seeing this problem with all the other "indicators", too. Giving so many variables allows to draw a very detailed (recognizable --> memorizable) picture, which can be a problem. I'm just trying many neural network related things her. see what works and what doesn't and share it with you guys, so that others don't need to repeat my mistakes. My experience with this indicator based version for NN price prediction so far is, that I get incredibly low training set errors (I've seen results like "predicting" the wrong direction the next bar in as little as 2% --> =memorizing the training set, which confirms that the "learning" is working, but the generalization is much worse. I'm really struggeling with getting the validation error down - only proving what was to be expected and giving some insights on the question if indicators (used for neural networks) are helping compared to pure price.  What I'm seeing so far is that they clearly don't. The only part that I don't understand is that I get really good backtest results, but most of us [ --> ;-) ] know that backtest can give a wrong idea. If I remember the numbers correctly, what had worked best so far for getting the validation error down was pure price (as fractional stationary series) with multi-currency inputs.

Because you mention it: Network architecture always is a bit trial an error with a few rules of thumbs or "best practice". You then just observe the effects of your changes, i.e. mainly look at the cost function graph and keep what works. I wrote my code in a way that e.g. adding layers (with any amount of neurons) and associating activation functions is done in single lines of code (just like adding layers in Python), which makes the architecture flexible. Changing cost function or optimizer algorithm is done just as quickly.

But what can I say?

These are common issues for most neural networks and artificial intelligence techniques (kernel-trick-based methods might be another story).

But some people do come up with better results with neural networks, for sure.

Probably they have gone through numerous rounds of model selection and model building to find a better performing one, I guess.

I am not sure if you have an automatic pruning option built in, because playing with a neural network with a fixed architecture will not give you meaningful results.

 
Young Ho Seo:

But what can I say?

These are common issues for most neural networks and artificial intelligence techniques (kernel-trick-based methods might be another story).

But some people do come up with better results with neural networks, for sure.

Probably they have gone through numerous rounds of model selection and model building to find a better performing one, I guess.

I am not sure if you have an automatic pruning option built in, because playing with a neural network with a fixed architecture will not give you meaningful results.

Every neural network suffers from the garbage in / garbage out problem. The inputs need to contain relevant information.

Transforming price data by calculating indicators or fractals from it adds no information. It is just a different presentation. This is why I ran the recent tests only with a "can't hurt" mentality, but with low expectations.

It is usually better to just let the network do its job and find hidden relationships by itself.

Apart from that, achieving better results and really adding information could probably be done with things like, for example, sentiment data, indices, the oil price... This is why I'm also not surprised that the multi-currency approach helped.

I'm still convinced that neural networks are superior to traditional technical analysis, which makes sense once someone has understood the universal approximation theorem (I know... I've mentioned that on many occasions) and has lost the unsubstantiated prejudice about that obscure "black box", but NNs are no holy grail either. This is not because neural networks are bad - in fact they are incredibly powerful - but because the data are bad. We just have much more random movement than many people like to believe, and even the most sophisticated technique in the world won't have an answer to this.

This is different with NN applications that have a deterministic answer, like e.g. speech recognition: there is just one correct word for a given sound. But that's not how prices work. Identical patterns lead to varying consequences. The correct answers are changing all the time. I'm just looking for those very few non-random percent.

About fixed architecture and pruning: the architecture is only "fixed" at its starting point. I do have automatic pruning. With pruning we need to pay attention to what is actually meant. Usually "pruning" refers to the removal of redundant weights, whereas the term "apoptosis" (stolen from the term for programmed cell death in biology) is usually used for the automatic removal of redundant neurons. Some people also say pruning when they actually mean neurons, not weights. Whatever nomenclature is used, I work with both. The only problem with weight pruning is that it makes the learning a bit slower, because regularly scheduled rankings of e.g. 1 million weights (which isn't unusual) come with a small performance price. Of course I also use dropout (random exclusion of a defined percentage of neurons and their connected weights, without permanent deactivation), too.
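To make the three terms concrete, here is a toy NumPy sketch (illustrative only, not the actual implementation described above): a permanent magnitude-based weight mask, a permanent "apoptosis" neuron mask based on a crude relevance ranking, and a fresh temporary dropout mask every iteration. All sizes and fractions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))                 # one weight matrix of a toy layer

# --- weight pruning: permanently mask the lowest-magnitude weights ----------
prune_fraction = 0.2
threshold = np.quantile(np.abs(W), prune_fraction)
weight_mask = (np.abs(W) >= threshold).astype(float)   # 0 = pruned connection

# --- "apoptosis": permanently remove whole neurons (output columns here) ----
neuron_relevance = np.abs(W).sum(axis=0)     # crude relevance ranking per neuron
dead = np.argsort(neuron_relevance)[:2]      # kill the 2 least relevant neurons
neuron_mask = np.ones(W.shape[1])
neuron_mask[dead] = 0.0

# --- dropout: a fresh random, temporary mask every training iteration -------
def dropout_mask(n_neurons, rate=0.3):
    keep = (rng.random(n_neurons) >= rate).astype(float)
    return keep / (1.0 - rate)               # inverted dropout scaling

x = rng.normal(size=8)
for step in range(3):
    effective_W = W * weight_mask * neuron_mask          # permanent structure
    activation = np.tanh(x @ effective_W) * dropout_mask(W.shape[1])
    print(step, np.round(activation[:4], 3))
```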


Example of a neuron activation map showing permanently deactivated neurons (apoptosis / "neuron pruning", selected through a neuron relevance ranking algorithm) in black and dropout (changing with every iteration) in gray.

Apoptosis

 

I think we can simplify everything: instead of using wave theories or technical indicators, just look at the current bar and the cluster of bars next to it, and then ask whether this movement looks like it will continue or reverse. It would end up as a bit of a guessing game; however, good predictions may be possible just with natural intelligence.


Now, with artificial intelligence, maybe the system could keep readjusting its probability of whether the price will continue or reverse. Then, hopefully, based on this information, trades can be initiated, hedged and unhedged appropriately at the correct time and under the correct conditions, resulting in a net profit. (But I suppose more in a qualitative way for the brain and more in a quantitative way for the computer.)
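A purely illustrative Python sketch of that idea (the window size, the continue/reverse labelling and the logistic model are all arbitrary choices, not a recommendation): build features from the last few bars and let a classifier output a probability of continuation that updates with every new bar.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# hypothetical close prices; label = 1 if the next move has the same sign
# as the current move ("continue"), 0 otherwise ("reverse")
rng = np.random.default_rng(2)
close = np.cumsum(rng.normal(0, 1, 5000)) + 100
N = 5                                           # size of the "cluster of bars"
returns = np.diff(close)

X, y = [], []
for t in range(N, len(returns)):
    X.append(returns[t - N:t])                  # the last N bar-to-bar moves
    y.append(int(np.sign(returns[t]) == np.sign(returns[t - 1])))
X, y = np.array(X), np.array(y)

split = int(0.8 * len(X))                       # chronological split, no shuffling
clf = LogisticRegression(max_iter=1000).fit(X[:split], y[:split])
p_continue = clf.predict_proba(X[split:])[:, 1] # readjusted after every new bar
print(p_continue[:5])
```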

 

New and interesting algorithm from researchers at Google DeepMind: https://towardsdatascience.com/a-quick-introduction-to-neural-arithmetic-logic-units-288da7e259d7 and https://arxiv.org/pdf/1905.07581v1.pdf

So you’d like to learn the identity function?

That sounds like a simple enough task for a neural network, right? And it is, if our test set is in the same range as our training set. However, as soon as our test data goes outside the training range, most neural networks fail spectacularly. This failure reveals an uncomfortable truth: a multi-layer perceptron is theoretically capable of representing any continuous function, but choices of architecture, training data and learning schedule will severely bias what function the MLP ends up learning. Trask et al effectively illustrate this failure with the following graph:
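The failure is easy to reproduce in a few lines (a rough sketch only, not the authors' exact experiment): train a small MLP on y = x over a narrow range and then ask it for values far outside that range. Inside the training range the fit is fine; outside it, the prediction saturates or drifts because the squashing activations cannot extrapolate a straight line.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# train the identity function on [-5, 5], then evaluate far outside that range
rng = np.random.default_rng(0)
x_train = rng.uniform(-5, 5, size=(2000, 1))
y_train = x_train.ravel()

mlp = MLPRegressor(hidden_layer_sizes=(64, 64), activation="tanh",
                   max_iter=2000, random_state=0)
mlp.fit(x_train, y_train)

x_test = np.array([[-50.0], [-5.0], [0.0], [5.0], [50.0]])
for xi, yi in zip(x_test.ravel(), mlp.predict(x_test)):
    print(f"x = {xi:6.1f}   predicted identity = {yi:8.2f}")
```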

 
fractals are not about finding peaks and troughs, you got the "Chaos Theory" trading system wrong my friend
 
Jean Francois Le Bas:
fractals are not about finding peaks and troughs, you got the "Chaos Theory" trading system wrong my friend

In lower timeframes, the fractals of a higher TF are actually the peaks and troughs.
