Discussion of article "OpenCL: The Bridge to Parallel Worlds"

 

The new article "OpenCL: The Bridge to Parallel Worlds" has been published:

In late January 2012, the software development company behind MetaTrader 5 announced native support for OpenCL in MQL5. Using an illustrative example, the article introduces the basics of OpenCL programming in the MQL5 environment and gives a few examples of naive optimization to increase the program's speed.



Author: Sceptic Philozoff

 

Thank you very much!

I've been waiting for an article on OpenCL for a long time.

I'm going to read it now. :)

 
That's a lot of text :) But it is necessary and useful text, so I will work my way through it. Thank you!
 
joo: I have been waiting for an article on OpenCL for a long time.

I'm going to read it. :)

You probably don't need it too much. You should already know it all.

The next article will be more serious, with a lot about hardware.

 
Yes. It really is too early to enable it, but this standard certainly needs to be adapted for specialisation. Perhaps that will be possible in the next release of the sixth tester; I would like to have this functionality sooner...
 
GKS: Yeah. It's really too early to switch on

What exactly is too early to enable? Please clarify. If you mean OpenCL, it is already enabled. All the experiments were performed directly in MetaEditor 5.

P.S. The main thing is that OCL gives the coder access to things that were previously out of reach:

- (S)SSEx instructions, which can be enabled in Visual Studio but cannot be used from MQL5 without OCL (not counting DLLs).

- calculations on discrete GPUs, which further speed up what can be done on a single core in MQL5, without any DLL-style tricks.

 
Mathemat:

1. You probably don't need it too much. You should already know how to do all this stuff.

2. The next article will be more serious, with a lot about hardware.

1. I need it.

2. Great!

One thing remains unclear to me: why do you call running an OCL programme on the CPU "emulation"? The CPU is just one of the devices, alongside the GPU, that an OCL programme can run on, provided a driver for the device is installed, and all of the processor cores are loaded.

 
joo: The question remains unclear to me: Why do you call running an OCL programme on the CPU "emulation"?

Because that is what it is: emulation, and slow emulation at that. On the CPU, parallelisation is done with (S)SSEx instructions and possibly dependency analysis (thanks to Intel, whose smart compiler looks for vectorisation unless you explicitly forbid it), while GPUs offer far more possibilities through their SIMD Engines. They also have wider buses and faster memory, especially local and private memory.
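To make the vectorisation point concrete, here is a minimal C sketch (not taken from the article; the names `dot_scalar` and `dot_4lane` are made up for illustration). The second function keeps four independent accumulators, which is roughly the shape a `float4` computation has in an OpenCL kernel, and which a vectorising compiler may be able to map onto a single 128-bit SSE register. Whether it actually does so depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>

/* Plain scalar dot product: one accumulator, one multiply-add per step. */
static float dot_scalar(const float *x, const float *y, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i] * y[i];
    return acc;
}

/* Same result computed in four independent lanes (hypothetical helper).
   The lanes do not depend on each other, so a compiler is free to keep
   them in one SSE register, four floats wide. */
static float dot_4lane(const float *x, const float *y, size_t n)
{
    assert(x != NULL && y != NULL);

    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        lane[0] += x[i]     * y[i];
        lane[1] += x[i + 1] * y[i + 1];
        lane[2] += x[i + 2] * y[i + 2];
        lane[3] += x[i + 3] * y[i + 3];
    }

    /* Horizontal reduction of the four lanes. */
    float acc = (lane[0] + lane[1]) + (lane[2] + lane[3]);

    /* Tail: leftover elements when n is not a multiple of 4. */
    for (; i < n; ++i)
        acc += x[i] * y[i];

    return acc;
}
```

Note that the two functions add the terms in a different order, so with arbitrary floating-point data the results can differ in the last bits.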

About hardware: the main recommendations will apply to AMD hardware, but many of them, with slightly modified terminology, also apply to NVidia hardware.

 
Mathemat:

Because it is emulation, and slow emulation at that. On the CPU, parallelisation is done with (S)SSEx instructions and possibly dependency analysis (thanks to Intel, whose smart compiler looks for vectorisation unless you explicitly forbid it), while GPUs offer far more possibilities through their SIMD Engines. They also have wider buses and faster memory, especially local and private memory.

About hardware: the main recommendations will apply to AMD hardware, but many of them, with slightly modified terminology, also apply to NVidia hardware.

OpenCL (Open Computing Language) is an open royalty-free standard for general purpose parallel programming across CPUs, GPUs and other processors, giving software developers portable and efficient access to the power of these heterogeneous processing platforms.

See, nothing there says that OCL is meant for GPUs with an emulation mode for other devices. OpenCL is a universal programming language for organising parallel computations on any device that has more than one computing core and supports OCL. It is not CUDA or ATI Stream, which are tailored to GPUs.

Besides, in some cases parallelised OCL calculations on the CPU are even faster than on the GPU. I now make device selection a mandatory setting in my programmes, because the speed of calculation depends directly on the amount of data processed and on how heavy the calculations are (which also depends on the input settings): sometimes the GPU is faster, sometimes the CPU.
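A toy cost model in C shows why the answer can flip between devices. Everything in it is an assumption made up for illustration: the `device_model` structure, the function names and, above all, the numbers, which are not measurements of any real CPU or GPU. The estimate is simply a fixed launch cost plus the time to copy the data plus the time to do the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device description. All figures are illustrative
   assumptions, not benchmark results. */
typedef struct {
    const char *name;
    double launch_overhead_s;    /* fixed cost per kernel launch, seconds */
    double transfer_bytes_per_s; /* host <-> device copy rate */
    double ops_per_s;            /* sustained arithmetic throughput */
} device_model;

/* Made-up example devices: the CPU starts quickly and copies cheaply,
   the GPU has much higher arithmetic throughput. */
static const device_model EXAMPLE_CPU = {"CPU", 1e-5, 2e10, 5e10};
static const device_model EXAMPLE_GPU = {"GPU", 1e-4, 5e9, 1e12};

/* Estimated run time: launch cost + copy time + compute time. */
static double estimate_seconds(const device_model *d, double bytes, double ops)
{
    assert(d != NULL);
    return d->launch_overhead_s
         + bytes / d->transfer_bytes_per_s
         + ops / d->ops_per_s;
}

/* Return whichever device the model predicts to be faster
   (ties go to the first argument). */
static const device_model *pick_device(const device_model *a,
                                       const device_model *b,
                                       double bytes, double ops)
{
    return estimate_seconds(a, bytes, ops) <= estimate_seconds(b, bytes, ops)
         ? a : b;
}
```

With these assumed figures, a small job (40 KB of data, 1e5 operations) comes out faster on the example CPU, while a heavy one (4 MB, 1e11 operations) comes out faster on the example GPU. In practice it is safer to time both devices on real data than to trust a model like this.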

 
joo:

OpenCL (Open Computing Language) is an open royalty-free standard for general purpose parallel programming across CPUs, GPUs and other processors, giving software developers portable and efficient access to the power of these heterogeneous processing platforms.

See, nothing there says that OCL is meant for GPUs with an emulation mode for other devices. OpenCL is a universal programming language for organising parallel computations on any device that has more than one computing core and supports OCL. It is not CUDA or ATI Stream, which are designed for GPUs.

Perhaps you are right in some respects, since the CPU is treated as a device. But some evidence suggests that this is closer to emulation. For example, I suspect that writing a buffer into device memory with CLBufferWrite() is done purely as a formality in the CPU case, because the CPU has a single global memory. The CPU does also have a cache, but I don't know what happens to it or how it is used.

Besides, in some cases parallelisation and OCL calculations on CPU are even faster than on GPU.

Yes, there are cases where, say, the scalar product dot() is faster on the CPU. But I wouldn't speculate about which would be faster if the comparison were between a Core 2 Duo and a more powerful graphics card than the one in the link, especially once the algorithm has been optimised. The optimisation is different for CPU and GPU, whatever you say.

 
Mathemat:

What exactly is too early to enable? Please clarify. If you mean OpenCL, it is already enabled. All the experiments were performed directly in MetaEditor 5.

P.S. The main thing is that OCL gives the coder access to things that were previously out of reach:

- (S)SSEx instructions, which can be enabled in Visual Studio but cannot be used from MQL5 without OCL (not counting DLLs).

- calculations on discrete GPUs, which further speed up what can be done on a single core in MQL5, without any DLL-style tricks.

I meant adaptation for cloud computing. Imagine each processor core being assisted by many "hungry bees" in the form of graphics accelerators, networked across many computers: that would be really cool and fast.

Half of the work is done; now it's just a matter of turning it on for the cloud...