OpenCL: internal implementation tests in MQL5 - page 63

 
GKS: If AMD had not bought ATI Radeon a couple of years ago.

ATI was bought 6 years ago, in 2006.

It is only a pity that Intel's competitors have no technology similar to Hyper-Threading; hopefully AMD will come up with one.

Bulldozer is a kind of hardcore implementation of hyper-threading: 8 sub-cores with obviously scarce FPUs that also fight for resources.

Surprisingly, Bulldozer came out slightly better than the Thuban X6 on average, even in multithreaded workloads. On well-parallelized integer tasks it is fast (falling slightly short of the i7), but on everything else it loses to the i7 and ends up, on average, on par with the i5-2400. In short, a server processor for the desktop. And Trinity is unlikely to fix this situation: its cores are Bulldozer-based.

According to ixbt,

CPU performance gains of up to 29%, thanks to the new processor core code-named Piledriver

So I was wrong. I wonder which applications would be like that?

 
Mathemat: Bulldozer is a kind of hardcore implementation of hyper-threading: 8 sub-cores with obviously scarce FPUs that also fight for resources.

It is still worse in terms of energy efficiency. Consider how much more energy is spent driving each physical core than when 4 physical cores execute 8 independent threads...

/ I have corrected the post by separating the quotation from your answer. To be able to type your answer outside the quote (if you can't), click the HTML button on the left, type a couple of letters at the very end of the markup, and return to visual mode - Mathemat/

 
I want to believe that in this project...
 

https://www.mql5.com/ru/articles/405 - after reading this article I became interested in GPU computing, although I am not a programmer. In it I came across the link http://www.ixbt.com/video3/rad.shtml, which in turn led me to this article: http://www.ixbt.com/video3/rad2.shtml. I think this topic could interest the developers of this project, because it describes a way to speed up optimization in the strategy tester for complex operations. Maybe it will help in developing the project.

P.S. I have not read that article to the end.

Thanks for the tip....

OpenCL: The Bridge to Parallel Worlds
  • 2012.05.16
  • Sceptic Philozoff
  • www.mql5.com
In late January 2012, the company that develops the MetaTrader 5 terminal announced native OpenCL support in MQL5. Using a concrete example, the article explains the basics of OpenCL programming in the MQL5 environment and gives several examples of "naive" performance optimization of a program.
 

And what do people here think about the C++ AMP that Microsoft has implemented for their VS11:

http://msdn.microsoft.com/en-us/library/hh265136(v=vs.110).aspx

We announced this technology at the AMD Fusion Developer Summit in June 2011. At the same time, we announced our intent to make the specification open, and we are working with other compiler vendors so they can support it in their compilers (on any platform).

Note that MS wants this language extension to be open. I tried it in VS11 and I must say it is a cool thing. Instead of cumbersome CUDA SDK code in a separate .cu file, you need just a few lines in the same .cpp.

I wish MQL5 had a similar feature. I've never worked with OpenCL, but I've heard it is hard to program in.

C++ AMP Overview
  • msdn.microsoft.com
C++ Accelerated Massive Parallelism (C++ AMP) accelerates execution of C++ code by taking advantage of data-parallel hardware such as a graphics processing unit (GPU) on a discrete graphics card. By using C++ AMP, you can code multi-dimensional data algorithms so that execution can be accelerated by using parallelism on heterogeneous hardware...
 
gpwr: It would be nice if MQL5 had the same capabilities.

OpenMP has already been requested. They are not adding it.

 
Question to terminal developers: are there any plans to add the ability to set work group size to the OpenCL API? That would be very nice. Probably, in CLExecute(), as I understand it.
 
Mathemat:
Question to terminal developers: Are there any plans to add to OpenCL API a possibility to set work group size? It would be very nice. Probably, into CLExecute() function, as I understand it.

CLExecute(cl_krn,work_dim,offset,work) - isn't that it?

bool  CLExecute(
               int          kernel,                   // handle to the kernel of the OpenCL program
               uint         work_dim,                 // dimension of the task space
               const uint&  global_work_offset[],     // initial offset in the task space
               const uint&  global_work_size[]        // total number of tasks
               );
 
joo: CLExecute(cl_krn,work_dim,offset,work) - isn't that it?

No, that only sets the size of the global work space.

The local work-group size, however, is nowhere to be found in the terminal developers' implementation. It should be there.

In the full OpenCL API, the counterpart of CLExecute() is the clEnqueueNDRangeKernel() function. What is needed is its sixth argument, const size_t *local_work_size.
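
To make the difference between the global work size and the local work-group size concrete, here is a minimal sketch in plain C++ with no OpenCL dependency. It only illustrates the index arithmetic for a one-dimensional range with zero offset. The names WorkItem, num_groups and decompose are hypothetical and belong to neither MQL5 nor OpenCL. It assumes the OpenCL 1.x rule that the global size must be a multiple of the local size.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical illustration type; not part of OpenCL or MQL5.
struct WorkItem {
    std::size_t global_id;  // index within the whole task space
    std::size_t group_id;   // index of the work-group the item belongs to
    std::size_t local_id;   // index of the item inside its work-group
};

// Number of work-groups in a 1-D range.
// local_size plays the role of local_work_size, the sixth argument of
// clEnqueueNDRangeKernel(); global_size plays the role of global_work_size.
std::size_t num_groups(std::size_t global_size, std::size_t local_size) {
    if (local_size == 0 || global_size % local_size != 0)
        throw std::invalid_argument("global size must be a multiple of local size");
    return global_size / local_size;
}

// Split a global index into (group, local) coordinates; zero offset assumed.
WorkItem decompose(std::size_t global_id, std::size_t local_size) {
    if (local_size == 0)
        throw std::invalid_argument("local size must be positive");
    return WorkItem{global_id, global_id / local_size, global_id % local_size};
}
```

For example, a global size of 1024 with a local size of 64 gives 16 work-groups, and decompose(130, 64) places item 130 in group 2 at local index 2. With the four-argument CLExecute() quoted above, only the global size can be chosen, so the split into groups is left to the implementation.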

 
I see.