Machine learning in trading: theory, models, practice and algo-trading - page 2405

 
Maxim Dmitrievsky:

The idea is generally correct, but in real life it does not require online training: that work can be done only at the initial training / retraining stage, and the result is then used as is.

Well yes, it makes sense that there shouldn't be any heavy calculations when trading.

 
High-Frequency Financial Trading on FOREX with MDFA and R: An Example with the Japanese Yen
  • 2013.02.19
  • Christian Dallas Blakely
  • imetricablog.com
In my previous article on high-frequency trading in iMetrica on the FOREX/GLOBEX, I introduced some robust signal extraction strategies in iMetrica using the multidimensional direct filter approach (MDFA) to generate high-performance signals for trading on the foreign exchange and Futures market. In this article I take a brief leave-of-absence...
 

I decided to compare proximity metrics with each other, to see which are best for recognizing market data...

The most common metric is "Euclidean"; it is used in almost 99% of cases and is something of a standard in ML...

almost all clustering algorithms work with it...

So I compared 24 metrics by how well they recognize new market data...

List of metrics with their error rates:

 [1,] "0.51"  "euclidean"  
 [2,] "0.525" "manhattan"  
 [3,] "0.51"  "minkowski"  
 [4,] "0.545" "infnorm"    
 [5,] "0.505" "ccor"       
 [6,] "0.565" "sts"        
 [7,] "0.51"  "dtw"        
 [8,] "0.52"  "edr"        
 [9,] "0.55"  "erp"        
[10,] "0.51"  "lcss"       
[11,] "0.535" "fourier"    
[12,] "0.46"  "tquest"     
[13,] "0.525" "acf"        
[14,] "0.52"  "pacf"       
[15,] "0.525" "cdm"        
[16,] "0.53"  "cid"        
[17,] "0.53"  "cor"        
[18,] "0.5"   "cort"       
[19,] "0.495" "ar.pic"     
[20,] "0.485" "int.per"    
[21,] "0.49"  "per"        
[22,] "0.52"  "mindist.sax"
[23,] "0.535" "ncd"        
[24,] "0.51"  "pdc"

As you can see, Euclidean is not the best choice for prices ))
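
A comparison of this kind can be sketched roughly as follows. This is not the code behind the table above: the data (independent Gaussian random walks), the labels (price up or down after a few steps), the 1-nearest-neighbour classifier, and the helper names `make_windows`, `one_nn_error` and `compare_metrics` are all assumptions made for illustration, and only four simple metrics are included.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def infnorm(a, b):
    # Chebyshev distance: the largest single-coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


def cor_dist(a, b):
    # 1 - Pearson correlation (one common correlation-based dissimilarity).
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0  # correlation is undefined for a flat window
    return 1.0 - cov / math.sqrt(va * vb)


METRICS = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "infnorm": infnorm,
    "cor": cor_dist,
}


def make_windows(n, length, horizon, rng):
    """Synthetic stand-in for market data: independent Gaussian random walks."""
    windows, labels = [], []
    for _ in range(n):
        path = [0.0]
        for _ in range(length + horizon - 1):
            path.append(path[-1] + rng.gauss(0.0, 1.0))
        windows.append(path[:length])
        # Label: did the price rise over the `horizon` steps after the window?
        labels.append(1 if path[-1] > path[length - 1] else 0)
    return windows, labels


def one_nn_error(train, test, dist):
    """Share of test windows whose nearest training window has a different label."""
    train_x, train_y = train
    test_x, test_y = test
    wrong = 0
    for x, y in zip(test_x, test_y):
        nearest = min(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
        wrong += train_y[nearest] != y
    return wrong / len(test_x)


def compare_metrics(seed=0):
    rng = random.Random(seed)
    train = make_windows(200, 20, 5, rng)
    test = make_windows(100, 20, 5, rng)
    return {name: one_nn_error(train, test, dist) for name, dist in METRICS.items()}


if __name__ == "__main__":
    for name, err in compare_metrics().items():
        print(f"{name:10s} {err:.3f}")
```

On a pure random walk every metric should land near an error of 0.5, and with only 100 test windows the sampling noise is about 0.05, so small gaps between metrics in a sketch like this say little by themselves.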

 
mytarmailS:

I decided to compare proximity metrics with each other, to see which are best for recognizing market data...

The most common metric is "Euclidean"; it is used in almost 99% of cases and is something of a standard in ML...

almost all clustering algorithms work with it...

So I compared 24 metrics by how well they recognize new market data...

List of metrics with their error rates:

As you can see, Euclidean is not the best choice for prices ))

That holds if prices are your only inputs. But what if you also have the time from bar 0 to the bar where you read the price, and volumes (tick/real) or something else? Euclidean, and indeed any distance between the features, would be inadequate. How do you equate 5 pips of price, 5 minute bars, 5 hour bars, 5 lots of volume? You can't.
And the clustering algorithm will treat them as equal.
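
The point about mixed units can be shown with a tiny example. The feature vectors below are made up (price move in pips, elapsed time in 5-minute bars, volume in lots), and `rescale` is a hypothetical helper that only converts units.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rescale(v, factors):
    # Multiply each feature by its own unit-conversion factor.
    return tuple(x * f for x, f in zip(v, factors))


# Hypothetical features: (price move in pips, time in 5-minute bars, volume in lots)
a = (5.0, 5.0, 5.0)
b = (6.0, 5.0, 5.0)  # one more pip than a
c = (5.0, 6.0, 5.0)  # one more bar than a

# In these units one pip and one bar count as the same distance.
d_ab = euclidean(a, b)
d_ac = euclidean(a, c)
print(d_ab, d_ac)  # 1.0 1.0

# Express the same time in minutes instead of 5-minute bars.
to_minutes = (1.0, 5.0, 1.0)
d_ab_min = euclidean(rescale(a, to_minutes), rescale(b, to_minutes))
d_ac_min = euclidean(rescale(a, to_minutes), rescale(c, to_minutes))
print(d_ab_min, d_ac_min)  # 1.0 5.0
```

Nothing about the market changed between the two prints, only the unit of time, yet the neighbour structure a clustering algorithm would see is different.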
 
elibrarius:
That holds if prices are your only inputs. But what if you also have the time from bar 0 to the bar where you read the price, and volumes (tick/real) or something else? Euclidean, and indeed any distance between the features, would be inadequate. How do you equate 5 pips of price, 5 minute bars, 5 hour bars, 5 lots of volume? You can't.
And the clustering algorithm will treat them as equal.

You can use the Mahalanobis metric or some kind of data normalization.
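
For reference, the Mahalanobis distance rescales differences by the sample covariance of the features. Below is a minimal two-feature sketch; the sample points (price move in pips, duration in hours) are invented, and `covariance_2d` and `mahalanobis_2d` are illustrative helpers, not part of any library.

```python
import math


def covariance_2d(points):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(a, b, cov):
    """sqrt((a-b)^T S^-1 (a-b)) for a 2x2 covariance S = [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # The inverse of S is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))


# Hypothetical observations: (price move in pips, duration in hours)
points = [(5.0, 1.0), (12.0, 2.5), (3.0, 0.5), (20.0, 4.0), (8.0, 2.0), (15.0, 2.0)]
d_hours = mahalanobis_2d(points[0], points[3], covariance_2d(points))

# The same observations with duration expressed in minutes.
points_min = [(p, h * 60.0) for p, h in points]
d_minutes = mahalanobis_2d(points_min[0], points_min[3], covariance_2d(points_min))

print(d_hours, d_minutes)  # the two values agree up to rounding
```

Unlike the plain Euclidean distance, the result does not depend on the units chosen for each feature. Whether that makes pips, hours and lots meaningfully comparable is a separate question.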

 
Aleksey Nikolayev:

You can use the Mahalanobis metric or some kind of data normalization.

Normalization will just change the scales. The sphere will turn into an ellipsoid if the max values don't match. You will be equating 5 pts with 2 hours and 7 lots.
Either way it's equating warm with soft. After normalization you will get warm and fuzzy ))

 
elibrarius:

Normalization will just change the scales. The sphere will turn into an ellipsoid if the max values don't match. You will be equating 5 pts with 2 hours and 7 lots.
Either way it's equating warm with soft. After normalization you will get warm and fuzzy ))

Sometimes you can normalize features using their distribution function under a random walk (SB). For example, zigzag leg lengths for a random walk are exponentially distributed, etc. If the distribution is not known exactly, it can be approximated by Monte Carlo simulation.
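
A rough sketch of that idea, reading "SB" as a random walk: simulate many walks, collect the feature of interest, and map an observed value through the resulting empirical distribution function so it lands in [0, 1]. The feature used here (high-low range of a window), the ±1 step model, and the names `simulate_ranges` and `empirical_cdf` are assumptions chosen for illustration.

```python
import bisect
import random


def simulate_ranges(n_steps, n_paths, seed=0):
    """Monte Carlo sample of the high-low range of a simple +/-1 random walk."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n_paths):
        pos = hi = lo = 0
        for _ in range(n_steps):
            pos += rng.choice((-1, 1))
            hi = max(hi, pos)
            lo = min(lo, pos)
        ranges.append(hi - lo)
    ranges.sort()
    return ranges


def empirical_cdf(sorted_sample, value):
    """Share of simulated values that are <= value."""
    return bisect.bisect_right(sorted_sample, value) / len(sorted_sample)


if __name__ == "__main__":
    sample = simulate_ranges(n_steps=50, n_paths=2000)
    observed_range = 14  # hypothetical range of a real 50-bar window, in step units
    print(empirical_cdf(sample, observed_range))
```

Each feature would get its own simulated reference distribution, so every normalized value reads as "how unusual is this compared with a random walk" on a common 0 to 1 scale.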

 
elibrarius:

Normalization will just change the scales. The sphere will turn into an ellipsoid if the max values don't match. You will be equating 5 pts with 2 hours and 7 lots.
Either way it's equating warm with soft. After normalization you will get warm and fuzzy ))

Warm and fuzzy is what you get from wanting to equate:

elibrarius:
5 hour bars and 5 lots of volume
 
Aleksey Nikolayev:

You could use the Mahalanobis metric or some kind of data normalization.

Why do all famous mathematicians have such complicated names?

 
secret:

Why do all famous mathematicians have such complicated names?

He's Indian ) They have even more complicated names )
