OpenCL and the tools for it. Reviews and impressions.

 
Now that MetaQuotes has changed its opinion on the applicability of GPUs to analysis and trading (more accurately, GPU software has finally reached the level required for use in commercial programs), I suggest that we are ready to start testing OpenCL in real programming today.
In case you missed the news because you were chatting on the forum instead of trading, modelling and programming, here is what MetaQuotes has to say, for example:

.............................................................................................................................................................................

GPU experiences in financial modelling

http://habrahabr.ru/blogs/hi/131983/

MetaQuotes, November 7, 2011, 19:27

OpenCL support will soon be included in MQL5; it will increase the performance of calculations in a distributed network several times over, or even by an order of magnitude.

.............................................................................................................................................................................

So, OpenCL, what is it?

It is a software framework: a runtime with its own run-time compiler that executes the computational parts of your program in parallel on the video card, i.e. on the GPU.
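To make "in parallel" a bit more concrete, here is a minimal sketch in plain Python (not real OpenCL code) of how such a program is split up: one small function, the "kernel", is launched once per output element, and each launch gets its own index. The names `sma_kernel` and `run_kernel` and the moving-average task are my own assumptions for illustration; on a real GPU the kernel would be written in OpenCL C and the launches would run on the card's many cores. Python threads only show the decomposition here; they give no real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def sma_kernel(gid, prices, period):
    """Work done by ONE work-item: the simple moving average ending at bar `gid`."""
    window = prices[gid - period + 1 : gid + 1]
    return sum(window) / period

def run_kernel(prices, period, workers=4):
    """Launch one independent work-item per output bar (hypothetical helper)."""
    first = period - 1  # earlier bars do not have a full window yet
    ids = range(first, len(prices))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as the indices
        return list(pool.map(lambda gid: sma_kernel(gid, prices, period), ids))

prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(run_kernel(prices, period=3))  # [2.0, 3.0, 4.0, 5.0]
```

The point is that no work-item depends on the result of another, which is what lets the hardware run thousands of them at once.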

What does it give?

Firstly, it gives scalability, i.e. easy and cheap scaling of the available processing power. It is one thing to buy, install and maintain a dozen servers and quite another to buy and plug in 3 or 4 additional video cards. The resulting speed will be the same, but the cost in money and time will be dozens of times lower.

Secondly, it lets you use sophisticated mathematical methods that were previously out of reach because computers were too weak.

Thirdly, if MetaQuotes could bolt OpenCL not only onto user programs but also onto its built-in tester, it would open up two oddly opposite prospects:

(a) In half the cases CloudNetwork would not be needed at all for individual optimization of simple Expert Advisors.

(b) Using CloudNetwork would open such wide prospects for Expert Advisor optimization, modelling and analytics, and for applying complex mathematical methods, as were simply unthinkable before.

Where to start?

Here is the developer SDK (with a CPU driver) from AMD. It is declared to be for Vista and Win7 but works on WinXP as well. Without a video accelerator card, programs run successfully on the CPU, and on any SSE-capable processor, Intel as well as AMD:

http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx

Older versions of AMD-ATI SDK (works on WinXP):

http://developer.amd.com/sdks/AMDAPPSDK/downloads/pages/AMDAPPSDKDownloadArchive.aspx

Nvidia's version of the OpenCL runtime is supplied with every modern driver, and the development environment is included in the CUDA Tools-SDK package:

http://developer.nvidia.com/opencl

Intel's version of the SDK (only works on Vista-Win7):

http://software.intel.com/en-us/articles/download-intel-opencl-sdk/

Note: to work properly with MetaTrader 4 or 5 we need OpenCL version 1.1 or higher, not 1.0, because thread safety is supported only from version 1.1. In MetaTrader 4 and 5, threads are created and destroyed dynamically and separately for each Expert Advisor. Therefore, to use the GPU from Expert Advisors running on different currency pairs, you need exactly this thread safety.

In practice this means using ATI Catalyst drivers of version 10.10 or above, or Nvidia drivers above version 280.00. AMD-ATI drivers get better and faster with each version, while Nvidia drivers get worse and slower. Nvidia's OpenCL 1.1 is 30-40% slower than its own version 1.0, i.e. it ends up about 2 times slower than AMD's OpenCL 1.1 rather than faster.
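To show why thread safety matters here, below is a minimal sketch in plain Python, not MetaTrader or OpenCL code. `Device`, `expert_advisor` and `run_advisors` are hypothetical names of my own. It imitates several Expert Advisor threads sharing one compute device through a library that is not thread-safe, so every call has to be serialized with a lock by the caller. As far as I know, with OpenCL 1.1 the API calls themselves are thread-safe (clSetKernelArg on a shared kernel object being the notable exception), so this kind of external locking is largely what version 1.1 saves you from.

```python
import threading

class Device:
    """Hypothetical stand-in for a compute library that is NOT thread-safe."""
    def __init__(self):
        self.jobs_done = 0

    def run(self, data):
        self.jobs_done += 1  # shared state touched by every caller
        return sum(x * x for x in data)

def expert_advisor(symbol, data, device, lock, results):
    """One EA thread: created for its chart, does its work, then exits."""
    with lock:  # caller-side serialization, needed with a non-thread-safe library
        results[symbol] = device.run(data)

def run_advisors(jobs, device):
    """Start one thread per currency pair and wait for all of them."""
    lock = threading.Lock()
    results = {}
    threads = [
        threading.Thread(target=expert_advisor, args=(symbol, data, device, lock, results))
        for symbol, data in jobs.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

device = Device()
print(run_advisors({"EURUSD": [1, 2, 3], "GBPUSD": [4, 5]}, device))
```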

How do I check for OpenCL?

AIDA by FinalWire shows GPU parameters and OpenCL version in the Display section.
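The version can also be read programmatically. Here is a rough sketch in Python that loads the system OpenCL library through `ctypes` and asks each installed platform for its version string. `clGetPlatformIDs`, `clGetPlatformInfo` and the constant `CL_PLATFORM_VERSION` come from the OpenCL headers; the helper names `parse_opencl_version`, `meets_minimum` and `platform_versions` are my own. It assumes the library can be found under the name "OpenCL" and simply returns an empty list if it cannot.

```python
import ctypes
import ctypes.util
import re

CL_PLATFORM_VERSION = 0x0901  # value from the OpenCL headers (cl.h)

def parse_opencl_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 1.1 AMD-APP'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("not an OpenCL version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))

def meets_minimum(version_string, minimum=(1, 1)):
    """True if the platform reports at least the given OpenCL version."""
    return parse_opencl_version(version_string) >= minimum

def platform_versions():
    """Return the version strings of all installed OpenCL platforms."""
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return []
    # OpenCL uses the stdcall convention on Windows, cdecl elsewhere.
    loader = getattr(ctypes, "WinDLL", ctypes.CDLL)
    try:
        cl = loader(name)
    except OSError:
        return []
    cl.clGetPlatformIDs.argtypes = [
        ctypes.c_uint, ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(ctypes.c_uint)]
    cl.clGetPlatformInfo.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t,
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_size_t)]
    count = ctypes.c_uint(0)
    if cl.clGetPlatformIDs(0, None, ctypes.byref(count)) != 0 or count.value == 0:
        return []
    platforms = (ctypes.c_void_p * count.value)()
    if cl.clGetPlatformIDs(count.value, platforms, None) != 0:
        return []
    versions = []
    for platform in platforms:
        buf = ctypes.create_string_buffer(256)
        if cl.clGetPlatformInfo(platform, CL_PLATFORM_VERSION,
                                ctypes.sizeof(buf), buf, None) == 0:
            versions.append(buf.value.decode("ascii", "replace"))
    return versions

if __name__ == "__main__":
    for version in platform_versions():
        print(version, "-> OK for MetaTrader" if meets_minimum(version) else "-> too old")
```

On a machine without any OpenCL driver the script prints nothing, which is itself an answer to the question.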

How do I test the speed of OpenCL?

LuxMark:

http://www.luxrender.net/wiki/LuxMark

Approximate comparative results for different CPUs and GPUs:

http://www.luxrender.net/wiki/LuxMark_Results

Sometimes it is necessary to edit the render.cfg file manually in order to separate the CPU-Native test from the CPU-OpenCL test.

A top-end Phenom II X6 CPU scores around 2300; the same program on a monster rig of 8 Nvidia GTX580 cards scores 70000, i.e. about 30 times more. Even such a monster set of 8 GTX580s costs about 30 times less to buy, and draws about 40 times less power, than a set of 30 servers with equivalent floating-point performance. Moreover, it takes a lot of effort to synchronize program instances across 30 servers, whereas with OpenCL everything runs on one computer within a single instance of the program.
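The arithmetic behind "30 times" is easy to check. A small sketch, using only the two LuxMark scores quoted above and assuming, optimistically, that throughput scales linearly with the number of servers (real clusters rarely manage that):

```python
cpu_score = 2300    # LuxMark score, Phenom II X6 (figure quoted above)
gpu_score = 70000   # LuxMark score, rig of 8 GTX580 cards (figure quoted above)
gpu_count = 8

speedup = gpu_score / cpu_score    # how many such CPUs the whole rig replaces
servers_needed = round(speedup)    # assumes perfect linear scaling across servers
per_card = gpu_score / gpu_count   # average score contributed by one card

print(f"speedup: {speedup:.1f}x, equivalent CPU servers: about {servers_needed}")
print(f"one card: {per_card:.0f} points, {per_card / cpu_score:.1f}x one CPU")
```

So a single card is worth a little under four such CPUs, and it is the stacking of eight of them in one box that gives the factor of 30.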

More results from LuxMark:

http://www.xtremesystems.org/forums/showthread.php?267385-LuxMark-The-OpenCL-CPU-amp-GPU-benchmark

More software to test OpenCL:

gpcbenchmarkocl

It has been removed from its Chinese authors' website (probably because it is a strategic product), but you can find it on the web. (Some parts of the Image Processing section work only with DirectX version 10 and higher, which means they do not work on WinXP.)

There is another peculiarity of OpenCL: programming in it forces you to step away from the usual mathematical abstractions used in decent programming and to optimize the program for the hardware instead, which is not the right way to do things.

An article on OpenCL that gives a rough idea of the difficulties of implementing it:

http://habrahabr.ru/blogs/hi/125398/

An introductory video course on OpenCL from AMD:

http://developer.amd.com/documentation/videos/OpenCLTechnicalOverviewVideoSeries/Pages/default.aspx

In general, ANY computer with an SSE-capable CPU is good enough for developing and debugging OpenCL programs, as long as the AMD software suite is used. Beyond that, any Nvidia system from GTS450 to GTX580 can be recommended, since it also has CUDA, but for really fast work AMD-ATI cards and software are more suitable: they scale better and are more stable in Multi-GPU configurations.

OpenCL Multi-GPU hardware is a topic for a separate thread.

 
AlexEro:

OpenCL Multi-GPU hardware is a topic for a separate thread.

By the way, the issue has been studied quite extensively and comparative characteristics of the cards are available. It was studied by miners of the notorious Bitcoin cryptocurrency (initially) and of other cryptocurrencies (later).

The comparisons are for specific tasks, of course, but it is easy to get a rough idea of the configuration and the budget.

And yes, don't buy powerful second-hand graphics cards now :)

 
TheXpert:

By the way, the issue has been studied quite extensively and comparative characteristics of the cards are available. It was studied by miners of the notorious Bitcoin cryptocurrency (initially) and of other cryptocurrencies (later).

The comparisons are for specific tasks, of course, but it is easy to get a rough idea of the configuration and the budget.

And yes, don't buy powerful second-hand graphics cards now :)

I just didn't want to scare the reader with these behemoths and distract from what is more important, the programming. Although I agree that the end result, i.e. what real scaling can be expected and what it looks like, should be shown. The range is wide: from a rig in a beer crate to whole racks of shelving.


 

OpenCL is the future today.

Question: when approximately will support be implemented in MT5?

I am more than 100% sure that OpenCL will become the standard for programming parallel workloads.

 
Microsoft Introduced C++ AMP
Published by shapovalovts on Thu, 06/16/2011 - 10:07 am
At the AMD Fusion 11 Developer Summit, Herb Sutter announced a new technology for developing heterogeneous C++ applications, called C++ Accelerated Massive Parallelism (AMP). According to Microsoft's developers, this technology will make it possible to execute code in parallel on both the CPU and the GPU.

Key competitors to AMP will be OpenCL and CUDA. Sutter also highlighted the potential of using C++ AMP in cloud computing.

https://www.mql5.com/ru/forum/132431

 
So it will once again be the little brother of the proprietary AMP. )
 
It looks like AMP is going to have a tough time. OpenCL is already in full use.
 
A lack of ideas is hard to make up for with computing power that lets you quickly brute-force or bubble-sort the next piece of nonsense obtained by "intellectual perversion" of the quotes )))
 
artikul:
A lack of ideas is hard to make up for with computing power that lets you quickly brute-force or bubble-sort the next piece of nonsense obtained by "intellectual perversion" of the quotes )))

I guess if you load up such power with worthwhile ideas, you could make our planet revolve around the sun in the opposite direction. ))
 

artikul, you don't have to be so categorical. The world doesn't stand still. By your logic, should modern monster graphics cards also be seen as a consequence of a lack of ideas in image processing?

I myself would sometimes like a speed-up, just so that when debugging code I don't have to wait for the heavy calculations to finish; they all sit in my init() and take about 10 seconds. All the other calculations, which happen "on the fly", already run very fast, so multithreading isn't needed there.

 
artikul:
A lack of ideas is hard to make up for with computing power that lets you quickly brute-force or bubble-sort the next piece of nonsense obtained by "intellectual perversion" of the quotes )))

Very true. And the comrades here are actively warming up in anticipation of engaging in the 6th item in a particularly perverted form. =)
