Machine learning in trading: theory, models, practice and algo-trading - page 297

 
anonymous:

apply(embed(pattern, length(signal)), 1, cor, y = signal, method = 'pearson')

Thanks! I wonder how long R takes to compute this. I measured the library algorithm with a signal length of 1,000,000 and a pattern of 100,000: 1 second.
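For anyone who wants to try the one-liner above, here is a minimal self-contained sketch. The names long_series and template are made up; it assumes, as the matrix size discussed later in the thread suggests, that the longer series goes into embed() and the shorter template is the y argument of cor(). One subtlety: embed() stores each window in reverse time order, so the template is reversed below to keep the comparison time-aligned.

# Minimal sketch of the apply/embed/cor approach on small made-up vectors
set.seed(1)
long_series <- cumsum(rnorm(1000))    # stands in for the long series being scanned
template    <- long_series[101:150]   # a 50-point fragment to search for

# one correlation per window of length(template); rev() compensates for
# embed() listing each window in reverse time order
r <- apply(embed(long_series, length(template)), 1, cor,
           y = rev(template), method = "pearson")

length(r)      # 1000 - 50 + 1 = 951 windows
which.max(r)   # points at 101, where the template was cut from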
 
fxsaber:
Thanks! I wonder how long R takes to compute this. I measured the library algorithm with a signal length of 1,000,000 and a pattern of 100,000: 1 second.

A million times faster! So what? Are trading systems measured the way processors are?
 
SanSanych Fomenko:

A million times faster! So what? Are trading systems measured the way processors are?
You have some kind of complex.
 
fxsaber:
You have some kind of complex.


No, not a complex.

MQL and R are completely different, non-overlapping systems. What is there to compare, anyway? And you are not the only one!

 
SanSanych Fomenko:

MQL and R are completely different, non-overlapping systems. What is there to compare, anyway? And you are not the only one!

I was only interested in how quickly a fairly common statistical task can be implemented in either language.

R is the most popular statistical language, and many people here know it. That is why the question of comparison was asked here.

What is of interest is the implementation algorithm and, consequently, its efficiency. Which language it is written in does not matter.

 
fxsaber:

I was only interested in how quickly a fairly common statistical task can be implemented in either language.

R is the most popular statistical language, and many people here know it. That is why the question of comparison was asked here.

What is of interest is the implementation algorithm and, consequently, its efficiency. Which language it is written in does not matter.

MT is a trading terminal. Accordingly, here on this site and in this thread, I discuss the development of trading systems (TS). But there are always people discussing programming tricks that have practically no effect on trading results. Your question is exactly of that type, because the correlation function only makes sense as part of other algorithms.

This function can be used in trading decision blocks (at least I use it), but the speed of its execution plays no role, because the main time for calculating the trading signal is taken by other, computationally heavy algorithms that are not available in MQL at all.

That is at the execution stage.

If we consider the TS development stage, R is fundamentally superior to MQL in speed, because it is an interpreter, which is extremely useful at the stage when the algorithm is not yet clear and many variants have to be tried, for example, comparing the correlation of currency pairs. In R, the time it takes to check a correlation is the time it takes to type a couple of lines, including a very convenient construction of the initial vectors.

That was my point: it makes no sense to compare the execution speed of these functions, or of any other functions, implemented in MQL and R.


PS.

But your library saved me from having to study MQL5, thanks for that.
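As a rough illustration of the "couple of lines" claim a few paragraphs above, with synthetic stand-in data, since the thread gives no actual quote series:

# Two synthetic series standing in for the closes of two currency pairs
set.seed(7)
eurusd <- 1.10 + cumsum(rnorm(500, sd = 0.001))
gbpusd <- 1.30 + 0.6 * (eurusd - 1.10) + cumsum(rnorm(500, sd = 0.001))

cor(diff(eurusd), diff(gbpusd))   # correlation of the two pairs' returns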

 
SanSanych Fomenko:

MT is a trading terminal. Accordingly, here on this site and in this thread, I discuss the development of trading systems (TS). But there are always people discussing programming tricks that have practically no effect on trading results. Your question is exactly of that type, because the correlation function only makes sense as part of other algorithms.

Previously, I could not verify some TS ideas because the low performance of certain algorithms got in the way. In this case, that is exactly what happened: an alternative algorithm made it possible to explore, in the optimizer, an idea as old as the hills that previously could not be computed in a reasonable time.


When one has to compute hundreds of billions of Pearson correlation coefficients on patterns several thousand elements long, the low speed of a seemingly simple task becomes an insurmountable bottleneck. One could start saying that if a problem seems too computationally heavy, it is a poorly formulated and poorly understood problem. Perhaps so. But what is done is done. And it is always interesting to see how others do it.

 

Which is better: to spend a little more time on development but then always calculate quickly, or to develop quickly and then always put up with slow calculations?

If R lets you develop quickly but calculate slowly, then where do you do the calculating? Quickly develop a supercar that drives slowly? What the hell is such a supercar needed for?

 
fxsaber:

I was only interested in how quickly a fairly common statistical task can be implemented in either language.

R is the most popular statistical language, and many people here know it. That is why the question of comparison was asked here.

What is of interest is the implementation algorithm and, consequently, its efficiency. Which language it is written in does not matter.


Well, on a signal of length 1,000,000 and a pattern of length 100,000, that implementation is unlikely to finish in reasonable time at all, because it would require creating a temporary 900001x100000 matrix :D But it took less than 30 seconds to write, and up to a certain task size it will be perfectly usable. The same thing can be implemented with fft/convolve; that takes more code, but it runs as fast as C code.
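To make the fft/convolve remark concrete, here is one possible sketch (not code from the thread): the Pearson coefficient is expanded into sums, the rolling sums come from cumsum(), and the expensive cross term sum(window * pattern) is computed with convolve(), which uses fft() internally. The name sliding_cor and the sanity check are assumptions made for the example.

# Sliding Pearson correlation of a short pattern against every window of a long signal
sliding_cor <- function(signal, pattern) {
  n <- length(signal)
  m <- length(pattern)
  stopifnot(m >= 2, n >= m)

  # rolling sums of the signal and of its squares over every window of length m
  cs  <- cumsum(c(0, signal))
  cs2 <- cumsum(c(0, signal^2))
  win_sum  <- cs[(m + 1):(n + 1)]  - cs[1:(n - m + 1)]
  win_sum2 <- cs2[(m + 1):(n + 1)] - cs2[1:(n - m + 1)]

  # cross term sum(window * pattern) for every window start; convolve() is FFT-based,
  # and per ?convolve, type = "open" without rev() gives exactly this alignment
  cross <- convolve(signal, pattern, type = "open")[m:n]

  p_sum  <- sum(pattern)
  p_sum2 <- sum(pattern^2)

  num <- cross - win_sum * p_sum / m
  den <- sqrt((win_sum2 - win_sum^2 / m) * (p_sum2 - p_sum^2 / m))
  num / den
}

# sanity check against a plain loop on small vectors
set.seed(1)
sig  <- rnorm(3000)
pat  <- rnorm(200)
fast <- sliding_cor(sig, pat)
ref  <- sapply(seq_len(length(sig) - length(pat) + 1),
               function(i) cor(sig[i:(i + length(pat) - 1)], pat))
max(abs(fast - ref))   # tiny (around 1e-12): the two agree up to numerical noise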

In R it is very convenient to prototype complex models; that is its strong side. Code performance is a matter of skill and experience:

1. Some R constructs and data types work faster than others (mutable vs. immutable types (list vs. environment), for loops vs. lapply/sapply/etc., S4 vs. R6).

2. For some problems, the ease of parallelization in R lets you get a result from slow code sooner than writing fast code in another language and then running it would take.

3. Some operations in the language are implemented generically, but inefficiently. If you implement small but computationally heavy functions in C++, you can achieve tremendous results without slowing development down as much as writing all the code in a C-like language would. For example, summing matrix elements by rows or columns can be done 4 to 15 times faster than rowSums/colSums/apply(, 1, sum)/apply(, 2, sum).
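For point 3, a hypothetical example of what such a small C++ helper might look like via Rcpp; the name row_sums_cpp is made up, and the 4 to 15 times speed-ups mentioned above are the poster's claim, not something this sketch measures:

# A tiny Rcpp helper of the kind described in point 3: row sums written in C++
library(Rcpp)

cppFunction('
NumericVector row_sums_cpp(NumericMatrix m) {
  int nr = m.nrow(), nc = m.ncol();
  NumericVector out(nr);            // starts at zero
  for (int j = 0; j < nc; ++j)      // walk columns: R matrices are column-major
    for (int i = 0; i < nr; ++i)
      out[i] += m(i, j);
  return out;
}')

m <- matrix(rnorm(1e6), nrow = 1000)
all.equal(row_sums_cpp(m), rowSums(m))   # same result as the built-in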

 
anonymous:


Well, on a signal of length 1,000,000 and a pattern of length 100,000, that implementation is unlikely to finish in reasonable time at all, because it would require creating a temporary 900001x100000 matrix :D But it took less than 30 seconds to write, and up to a certain task size it will be perfectly usable. The same thing can be implemented with fft/convolve; that takes more code, but it runs as fast as C code.

In R it is very convenient to prototype complex models; that is its strong side. Code performance is a matter of skill and experience:

1. Some R constructs and data types work faster than others (mutable vs. immutable types (list vs. environment), for loops vs. lapply/sapply/etc., S4 vs. R6).

2. For some problems, the ease of parallelization in R lets you get a result from slow code sooner than writing fast code in another language and then running it would take.

3. Some operations in the language are implemented generically, but inefficiently. If you implement small but computationally heavy functions in C++, you can achieve tremendous results without slowing development down as much as writing all the code in a C-like language would. For example, summing matrix elements by rows or columns can be done 4 to 15 times faster than rowSums/colSums/apply(, 1, sum)/apply(, 2, sum).

Thanks for the detailed answer! I always run into the same problem: my own lack of competence.