Thanks, I didn't realise you were opening a report.
The gain was even greater.
I'm amazed: this is a budget card for under $80! So, NVidia has done some serious work on the driver.
And here are the new results:
If I understand correctly: 1. pure CPU, 2. CPU with OpenCL, 3. GPU with OpenCL?
And it's still 422.
I'm amazed: this is a budget card costing under $80! So NVidia has worked very hard on the driver.
I'm just as amazed: from rags to riches. One gets the impression that NVidia reads this forum, runs similar tests, finds bugs and fixes them.
If only the tester could choose the device to run on by itself, without forcing us to write code for it, that would be very good. Still, 1 second (or 11 seconds if the video card is unsuitable or unavailable) against 7 minutes is impressive.
Modern operating systems and truly multi-core processors have largely eliminated the problem of measurement scatter with GetTickCount. My original comment was solely about the erroneous statement "the average error of GetTickCount is at least tens of ms".
In the registry it looks like this:
"nvcuda.dll"=dword:00000000
"amdocl.dll"=dword:00000000
"amdocl64.dll"=dword:00000000
"IntelOpenCL64.dll"=dword:00000000
They're about 1.5 times slower (highlighted in red) than Intel's native driver (highlighted in green).
You can delete the corresponding registry values, but export the key first just in case.
Dear Admin, I haven't been on the forum for a while, so I may have missed this point.
Will it become possible to offer video cards for the needs of the cloud?
Dear Admin, I haven't been on the forum for a while, so I may have missed this point.
Will it become possible to offer video cards for the needs of the cloud?
Almost done https://www.mql5.com/ru/forum/23/page15#comment_201948
OpenCL programs are intended for performing computations on video cards that support OpenCL 1.1 or higher. Modern video cards contain hundreds of small specialized processors that can simultaneously perform simple mathematical operations on incoming data streams. The OpenCL language undertakes the organisation of such parallel computing and offers a great speed-up for a certain class of tasks.
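The "many small processors, each doing a simple operation on its own piece of data" idea can be illustrated with a rough sketch. It is written in Python rather than OpenCL C, and it runs the work items one after another instead of in parallel; the names saxpy_kernel and run_kernel are made up for this illustration and are not part of OpenCL or MQL5.

```python
def saxpy_kernel(gid, alpha, x, y, out):
    """Work done by ONE work item: it touches only element number gid."""
    out[gid] = alpha * x[gid] + y[gid]


def run_kernel(kernel, global_size, *args):
    """Sequential stand-in for a parallel launch (hypothetical helper).

    On a real video card the iterations of this loop would be spread
    across the stream processors and executed simultaneously.
    """
    for gid in range(global_size):
        kernel(gid, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# One work item per element of the input arrays.
run_kernel(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)
```

Because no work item depends on the result of another, the order of execution does not matter, which is what makes this class of tasks suitable for a video card.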
Yes, that's right.
Would you mind running the attached script and posting the results? I'm really curious.
Don't be put off by the large number of digits. They are only there to verify that the calculations are correct.
The script also runs through all the devices. The main task is to multiply two large matrices.
The only setting, the linear size of the matrices (_size), can be changed only in the code, in this line:
#define _size 2000
Change it only if you run out of memory. A sign of that is a discrepancy in the array values when run on a discrete GPU: a difference of more than 10^(-4) is an obvious error. But you seem to have enough memory.
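To get a feel for whether a given _size fits into video memory, here is a rough estimate in Python. It assumes three square matrices (two inputs and one result) of 4-byte floats; the element type actually used by the script is not stated in this thread, so both numbers are assumptions, and matrix_bytes is a name invented for this sketch.

```python
def matrix_bytes(size, bytes_per_element=4, matrices=3):
    """Bytes needed for `matrices` square size x size matrices.

    bytes_per_element=4 assumes float; use 8 for double.
    matrices=3 assumes two inputs plus one result.
    """
    return matrices * size * size * bytes_per_element


for size in (1000, 2000, 4000):
    total = matrix_bytes(size)
    print(f"_size = {size}: {total / 1e6:.0f} MB")
```

Under these assumptions _size 2000 needs about 48 MB, and the requirement grows with the square of _size, so doubling the size quadruples the memory.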
For example, I have a Radeon 6930 graphics card with 1280 stream processors. How will it show up in the agent list: as 1 device, or as all 1280?
On its own it is many times faster than 10 processors, so the bonus should not be that of a single added device.
Would you mind running the attached script and posting the results? I'm really curious.
No, I don't mind at all. I'm curious about it myself. I haven't changed anything in the settings.
I just don't understand any of the numbers. Can you explain, at least in simple terms: is it good or not? They differ between devices, and in some lines the 5th or 6th digit after the decimal point is already different.
I think I got it: it is a test repeated many times, and the final time is the average for each device. Right?
These are just check digits. If they agree to within 0.00001, everything is OK. The indexes are chosen at random: it is a spot check to make sure that the calculations are correct. Well, we are not going to print here the results of a full check of all 4 million elements of the resulting matrix, are we?
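The idea of the check digits can be sketched like this. It is an illustration in Python, not the attached script: matmul and spot_check are names invented here, the matrices are tiny, and the "second device" is simulated by adding a small error to the reference result.

```python
import random


def matmul(a, b):
    """Naive product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def spot_check(reference, candidate, samples=10, tol=1e-5, seed=0):
    """Compare two results only at a few randomly chosen indexes."""
    rng = random.Random(seed)
    rows, cols = len(reference), len(reference[0])
    for _ in range(samples):
        i, j = rng.randrange(rows), rng.randrange(cols)
        if abs(reference[i][j] - candidate[i][j]) > tol:
            return False
    return True


n = 20  # small stand-in for the script's _size
rng = random.Random(1)
a = [[rng.random() for _ in range(n)] for _ in range(n)]
b = [[rng.random() for _ in range(n)] for _ in range(n)]
reference = matmul(a, b)

# Simulated results from another device (assumption, for illustration only):
close_enough = [[v + 1e-9 for v in row] for row in reference]  # rounding noise
broken = [[v + 1e-3 for v in row] for row in reference]        # real error

print(spot_check(reference, close_enough))  # within tolerance
print(spot_check(reference, broken))        # exceeds tolerance
```

A handful of random indexes is enough to catch a device that computes garbage, while printing and comparing every element would be impractical.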
I think I got it: it is a test repeated many times, and the final time is the average for each device. Right?
No, this is a single operation of multiplication of two large matrices.
In terms of performance figures: very good for this card. Now my results. Devices (bottom to top - initialisation order):
That is, first the Intel CPU with Intel's OpenCL engine, then my ancient HD 4870, and then the CPU again but with AMD's engine. Script: