Discussion of article "Statistical Distributions in MQL5 - taking the best of R" - page 13

Read it.
Got into it.
I think the tests you have given are not quite correct. I consider it necessary to write about this, because performance comparisons are far from the least important thing.
The point is that MQL5 is qualitatively different from R, and in performance comparisons these qualitative differences should be taken into account wherever possible. R is interpreted and MQL5 is compiled. For industrial programs this qualitative difference is to MQL5's advantage.
Take a look at the source code of R (it is open source). All of its basic mathematics is written in C/C++ and compiled into the engine. Most of the packages are written in C++ too, otherwise you would never see the results of the calculations.
If R were an interpreter in basic/system functions, it would lag behind MQL5 by 200-500 times. We specifically tested the system C/C++ functions of R, rather than building manual processing in loops (where R lags behind by hundreds of times).
In R development, there is a constant search for "how can I find a package so I don't have to write a for/while/foreach loop". In fact, there is only one method of doing calculations in R, and that is to pass any more or less massive calculations to third-party packages.
But there is another qualitative difference that also matters a great deal in industrial use of programs. The tests did not take it into account, which distorted the results.
In MQL5 the elementary object is a scalar, from which more complex objects, such as vectors, are built. It is vectors that are passed as input to the distribution functions.
Look in the /include/math/stat directory for hundreds of vector functions.
In R, there is no concept of a scalar at all. The simplest object is a vector. R exploits this fact extensively, and our example comparing distribution functions clearly calls for the R-specific programming technique of "vectorisation", which is not available in MQL5. Since this technique speeds up computations in R by 10-100 times (depending on the size of the matrix), the R code should have used it. The case for vectorisation is obvious here: the tests take an input vector and run the calculation over it 100 times, so in effect the input is a matrix with 100 identical columns, and the columns could just as well be different.
There is no vectorisation and no modern features in R. The code there is just written head-on by ordinary programming juniors. Yes, they are decent mathematicians, but they are mediocre programmers.
GPUs in R remain only fairy tales and isolated attempts in the rarest packages.
To summarise: a test in R should be written in R using R's own capabilities, especially where MQL5 has no analogue for them.
You simply do not know either R or MQL5.
You haven't looked at the sources of R, you don't know the sources of MQL5. You have not built compilers for the last 15 years. But you are trying to argue with those who have done it all.
Currently, the MQL5 statistical library (excluding Alglib, Fuzzy) already has more than 461 functions: https://www.mql5.com/ru/forum/86386/page222#comment_3867386.
This already covers the basic statistical functions well.
If you have read the article before, I recommend you to read it again - yesterday they released a new version of the article with a lot of new functions.
Still haven't figured out how to send a push message to Quantum. Please add a thing that may not even be in R.
I mean a fast calculation of the mean over an interval when the interval is shifted one step to the right. The same goes for the Pearson correlation coefficient.
Pearson is fairly expensive to calculate head-on. But there are iterative methods that compute K[i] from K[i-1].
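One common way to get this kind of sliding update is to keep running sums and adjust them as the window moves, so each step costs O(1) instead of O(window). The following is a minimal Python sketch of that idea, not the poster's code and not part of any MQL5 or R library; the names `rolling_mean` and `rolling_pearson` are hypothetical. Note that running sums can lose precision on long series with large values.

```python
import math

def rolling_mean(values, window):
    """Mean of every length-`window` slice, using a running sum."""
    if window <= 0 or window > len(values):
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Shift the window right by one: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means

def rolling_pearson(x, y, window):
    """Pearson correlation of x and y over every length-`window` slice."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if window < 2 or window > len(x):
        return []
    sx = sy = sxx = syy = sxy = 0.0
    result = []
    for i in range(len(x)):
        # Add the element entering the window.
        sx += x[i]; sy += y[i]
        sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i]
        if i >= window:
            # Remove the element leaving the window.
            ox, oy = x[i - window], y[i - window]
            sx -= ox; sy -= oy
            sxx -= ox * ox; syy -= oy * oy; sxy -= ox * oy
        if i >= window - 1:
            cov = window * sxy - sx * sy
            var_x = window * sxx - sx * sx
            var_y = window * syy - sy * sy
            if var_x > 0 and var_y > 0:
                result.append(cov / math.sqrt(var_x * var_y))
            else:
                # Correlation is undefined for a constant series.
                result.append(float("nan"))
    return result
```

For example, `rolling_mean([1, 2, 3, 4, 5], 3)` walks three windows and updates the sum twice rather than re-summing each window.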
It's funny, it's the first time I've encountered a sentence in Russian with a comma after each word:
Why don't you write the necessary function yourself?
Look at the full sources of functions in /include/math/stat and write the missing ones.
There is no vectorisation and no modern features in R. The code there is just written by ordinary programming juniors. Yes, they are decent mathematicians, but they are mediocre programmers.
You simply do not know either R or MQL5.
You haven't looked at R sources, you don't know MQL5 sources. You haven't built compilers for the last 15 years. But you are trying to argue with those who have done it all.
I have very modest knowledge of programming, but not to the extent you describe.
Anyway, I understand perfectly well that the internal implementation of R in C++ you refer to has nothing to do with the problem of measuring execution speed I raised. I am writing about the technique of writing code in R itself, and what is inside is what we measure.
So, about vectorisation.
A string looks normal in R
It is always at least a vector calculation. It depends on the context - what a and b are.
Furthermore,
will give a vector c, each element of which is the square root of the corresponding element of vector a
In this case, a does not necessarily have to be a vector, it can be a more complex object, such as a matrix.
In MQL these are always loops.
Moreover, vectorisation in R implies not only the objects themselves, but:
And returning to the meaning of what I wrote in the previous post.
I don't write anything about the quality of C++ functions' implementation at all. Like you, I propose to measure them as they are. But using R language tools which are specially intended for vectorised operations.
For example.
For all your tests, build a matrix M with 100 columns (as in your tests), where each column models a quote series.
Then in R the minimum over all columns looks like
The result will be a vector that contains the minimum of each column
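The column-wise pattern can be sketched as follows. In R this is usually written with `apply` over columns, for example `apply(M, 2, min)`; the Python below is only an illustration of the same shape of computation on a list-of-rows matrix, and the name `column_minima` is hypothetical.

```python
def column_minima(matrix):
    """Return the minimum of each column of a matrix given as a list of equal-length rows."""
    if not matrix:
        return []
    # zip(*matrix) transposes rows into columns; min is then applied per column.
    return [min(column) for column in zip(*matrix)]
```

Replacing `min` with any other function of a column gives the general "apply a function to each column" pattern that the post proposes to benchmark.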
Using this pattern, we need to measure the speed of all the distribution functions wrapped in the appropriate apply. There are many apply variants and they differ from one another. MQL5 has no analogues.
At the same time, make sure that the MKL library is installed together with R.
Why don't you write the necessary function yourself?
Look at the full function sources in /include/math/stat and write the missing ones.
Interesting idea.
Maybe you can find a developer to port packages. Splines, for example. They give top-quality moving averages, the real deal.
I have very modest knowledge of programming, but not to the extent you describe.
In any case, I understand perfectly well that the internal C++ implementation of R you refer to has nothing to do with the problem of measuring execution speed I have raised. I am writing about the technique of writing code in R itself, and what is inside is what we measure.
So, about vectorisation.
In R, a string looks normal
It's always at least a vectorising computation. Depends on the context - what a and b are.
Furthermore,
will give a vector c, each element of which is the square root of the corresponding element of vector a
In this case, a does not necessarily have to be a vector, it can be a more complex object, such as a matrix.
In MQL these are always loops.
We have shown how to work faster in loops. And in pure sources in MQL5 without using C++.
And we will win on the simplest vector sqrt as well. Here are two standard functions from the library that are full analogues of R's:
bool MathSqrt(const double &array[],double &result[]) // result into a separate vector
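For readers who do not use MQL5, the shape of such an array function can be sketched in Python. This is an illustration only, not the library's source: the name `sqrt_into` is hypothetical, and the choices to return False for empty input and to store NaN for negative elements are assumptions made for the sketch.

```python
import math

def sqrt_into(source, result):
    """Write the square root of each element of `source` into `result`.

    Returns False for empty input (an assumption of this sketch), True otherwise.
    """
    if not source:
        return False
    # Overwrite the contents of `result` in place so the caller's list is updated.
    result[:] = [math.sqrt(v) if v >= 0 else float("nan") for v in source]
    return True
```

The caller passes in the output list, mirroring the "result into a separate vector" signature: `out = []; sqrt_into([4.0, 9.0], out)` leaves the roots in `out`.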
You haven't quite understood yet that these 461 functions of the standard MQL5 maths library cover a huge range of basic mathematical operations.
Moreover, vectorisation in R implies not only the objects themselves, but:
Yes, yes. Theoretically.
In practice, 99% of the operations you perform go through the simplest functions, with no chance of acceleration.
In MQL5, OpenCL is standard and you can accelerate everything without third-party libraries. And in ordinary MQL5 you can get results at the level of C++.
But in R, the only option is to look for a package to accelerate each loop. Yes, literally every loop, if it has any significant number of iterations.
And returning to the meaning of what I wrote in the previous post.
Few people realise this, but when MKL is used there is likely to be a huge overhead in moving R's input data into the regular arrays that MKL works on, and then moving the result back into R's internal data representation.
I haven't dug into this, but logically that's what it looks like. Which means serious expenses for providing MKL support.
In MQL5 there are no such losses at all, of course. Only in OpenCL you need to copy data, but there it's a simple and flat memcopy.
Why don't you write the necessary function yourself?
I wrote it once, but I didn't format it as a math function.
Look at the full sources of functions in /include/math/stat and write the missing ones.
The point is to get them into the standard library with proper scientific and programming polish, the way Quantum does it.
Most likely a performance comparison with your solution will be needed. After that, I think, it will be possible to convince you to put this homemade solution into the maths library. I haven't seen anything like it in maths packages myself (I can't speak for R).
Another little secret - why MQL5 is so fast, especially when the libraries are fully in the source code.
Our compiler is engaged in such a deep optimisation and has the ability to cut off so many checks and conditions that functions disappear completely and loops are simplified to the extreme. Of course, only for the x64 version.
Unlike the use of libraries/packages (where you can't even optimise a call) by other systems, the MQL5 compiler almost always works with the full source code and always performs global optimisation to the maximum depth. This gives amazing results.
That is why it is important for us to provide all the standard libraries as source code. We know that in the end everything will be optimised so heavily that it can beat almost anyone on speed. Even the overhead of a managed language no longer matters much.