How to use a new type of program with branching? - General

Dominik Egert 2021.07.10 20:05 #31

I've come across this project recently...

https://github.com/PlummersSoftwareLLC/Primes

Accompanied by this video:

https://www.youtube.com/watch?v=tQtFdsEcK_s&list=PLF2KJ6Gy3cZ5Er-1eF9fN1Hgw_xkoD9V1

amrali 2021.07.11 05:52 #32

    eval = (data[cnt] > 0x7F);
    left[left_ptr] = (data[cnt] * eval) + (left[left_ptr] * !(eval)); 
    left_ptr += eval;

The result of the expression is assigned to variable 'eval', so it is OK.

But,

    ((test1) || ((test2) && (MyConditionFunc() == 1)) ) && MyTestFunc();

The compiler issues the warning "result of expression not used" because the value of the expression is not assigned to any variable. The expression is still evaluated, but its result is discarded.

Look for statements vs expressions to understand the difference.

The boolean expression is evaluated according to short-circuit evaluation.

https://en.wikipedia.org/wiki/Short-circuit_evaluation
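To make the short-circuit rule concrete, here is a small C++ sketch (C++ rather than MQL5, since the two share this syntax). MyConditionFunc and MyTestFunc are hypothetical stand-ins for the functions named above, and the call counters are added only to show which calls are skipped. Returning the value from Evaluate also means the result of the expression is used.

```cpp
// Hypothetical stand-ins for the functions named in the post; each counts its calls.
static int g_condition_calls = 0;
static int g_test_calls = 0;

int  MyConditionFunc() { ++g_condition_calls; return 1; }
bool MyTestFunc()      { ++g_test_calls;      return true; }

// Same expression as in the post, but its value is returned, so the result is used.
bool Evaluate(bool test1, bool test2)
{
    // If test1 is true, the right-hand side of || is never evaluated.
    // If the whole left operand of the outer && is false, MyTestFunc() is never called.
    return (test1 || (test2 && (MyConditionFunc() == 1))) && MyTestFunc();
}
```

Calling Evaluate(true, true) runs only MyTestFunc(), Evaluate(false, false) runs neither function, and Evaluate(false, true) runs both.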

BTW: branchless code always executes faster than code with branching. It is a fact, but this is another topic.

Your code still has branching, to convert it to branchless use:

eval = (uchar) ((data[cnt] - 0x7F) >> 31);

Edit:

eval = (data[cnt] - 0x80) >> 31;
right[right_ptr] = (data[cnt] * eval) + (right[right_ptr] * !(eval));
right_ptr += eval;

left[left_ptr] = (data[cnt] * !(eval)) + (left[left_ptr] * eval);
left_ptr += !eval;

The difference found in the benchmarks is due to different conditions for branching plus some compiler optimizations.

Dominik Egert 2021.07.11 06:31 #33

Yes, very good example. Although I think you mean shift right 7 Bits, not 31 Bits.

amrali 2021.07.11 07:28 #34

The trick here is that if data[cnt] >= 128, then data[cnt] - 128 is nonnegative, otherwise it is negative. The highest bit in an int, the sign bit (bit 31), is 1 if and only if that number is negative. So shifting right by 31 makes the whole result 0 if it used to be nonnegative, and 1 if it used to be negative.
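The sign-bit trick can be sketched in C++ as follows. The thread does not show the types of data[] or eval, so this sketch assumes byte data and a 32-bit intermediate; the names IsBelow128, IsAtLeast128, SplitBranchless and SplitResult are made up for the illustration. One caveat it makes explicit: shifting a negative signed int right is an arithmetic shift on mainstream C++ compilers and gives -1, so the difference is reinterpreted as unsigned to get exactly 0 or 1. The second helper shows one way to read the 7-bit suggestion: for an unsigned byte, bit 7 alone already says whether the value is at least 128.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns 1 when value < 128 and 0 otherwise, without a comparison.
// The subtraction is done in a signed 32-bit int; the result is then viewed as
// unsigned so that >> 31 is a logical shift and isolates the sign bit.
inline uint32_t IsBelow128(uint8_t value)
{
    const int32_t diff = static_cast<int32_t>(value) - 0x80;
    return static_cast<uint32_t>(diff) >> 31;
}

// Alternative for byte data: bit 7 is set exactly when value >= 128.
inline uint32_t IsAtLeast128(uint8_t value)
{
    return static_cast<uint32_t>(value) >> 7;
}

struct SplitResult {
    std::vector<uint8_t> left;   // values >= 128
    std::vector<uint8_t> right;  // values <  128
};

// Splits data following the arithmetic-select pattern from the post.
// Only the loop condition remains as a branch; the body has no if/else.
SplitResult SplitBranchless(const std::vector<uint8_t>& data)
{
    SplitResult out;
    out.left.assign(data.size(), 0);    // zero-filled so every slot read below is defined
    out.right.assign(data.size(), 0);
    std::size_t left_ptr = 0, right_ptr = 0;

    for (std::size_t cnt = 0; cnt < data.size(); ++cnt) {
        const uint32_t eval = IsBelow128(data[cnt]);   // 1 -> right, 0 -> left
        // Overwrite the current slot only when the element belongs to that side.
        out.right[right_ptr] = static_cast<uint8_t>(data[cnt] * eval + out.right[right_ptr] * (1u - eval));
        right_ptr += eval;
        out.left[left_ptr] = static_cast<uint8_t>(data[cnt] * (1u - eval) + out.left[left_ptr] * eval);
        left_ptr += 1u - eval;
    }
    out.left.resize(left_ptr);
    out.right.resize(right_ptr);
    return out;
}
```

For the input {5, 200, 128, 127, 0, 255} this puts 200, 128, 255 in left and 5, 127, 0 in right.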

amrali 2021.07.11 07:50 #35

Not all branching is bad. In addition to compiler optimizations, modern CPUs have branch predictors, a hardware optimization for if..else statements.

Please refer to this link for a nice explanation about branch prediction:

https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-processing-an-unsorted-array

Keep in mind, by using this

eval = (data[cnt] - 0x80) >> 31;

you actually take the decision away from the compiler's optimizer and the CPU's branch predictor. Plus, you sacrifice code readability for performance. So it is better reserved for performance-critical loops.
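Since the outcome depends on the data, the compiler and the CPU, the safest approach is to measure both variants. Below is a minimal C++ sketch for doing that; CountWithBranch, CountBranchless and MicrosecondsFor are hypothetical names, and this is not the benchmark attached to this thread.

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Counts values >= 128 with an ordinary if statement. Whether this ends up as a
// real conditional jump is up to the compiler.
uint64_t CountWithBranch(const std::vector<uint8_t>& data)
{
    uint64_t n = 0;
    for (uint8_t v : data) {
        if (v >= 128) ++n;
    }
    return n;
}

// Same count without a data-dependent branch: add bit 7 of each byte.
uint64_t CountBranchless(const std::vector<uint8_t>& data)
{
    uint64_t n = 0;
    for (uint8_t v : data) n += static_cast<uint64_t>(v >> 7);
    return n;
}

// Hypothetical helper: runs f once, stores its return value in result so the work
// is not discarded, and returns the elapsed wall-clock time in microseconds.
template <typename F>
double MicrosecondsFor(F&& f, uint64_t& result)
{
    const auto start = std::chrono::steady_clock::now();
    result = f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}
```

A reasonable experiment is to time both functions on the same large array, once shuffled and once sorted, and to repeat each run several times. A single measurement on one machine says little about another.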

Alain Verleyen 2021.07.11 10:15 #36

amrali:
...

BTW: branchless code always executes faster than code with branching. It is a fact, but this is another topic.

Your code still has branching, to convert it to branchless use:

Edit:

The difference found in the benchmarks is due to different conditions for branching plus some compiler optimizations.

Your example code actually proves that your statement is false.

2021.07.11 06:03:08.015 372818 (EURUSD,M1) BINARY SHIFT Loop time: 2.4 nanosec; total time: 40982 microsec. Left = 1606418432 Right = 532676545

And the ternary operator is way faster (why?), even though it uses branching. It all depends on the compiler, but unfortunately we can't see the assembler code.

amrali 2021.07.11 10:31 #37

Alain Verleyen:

Your example code actually proves that your statement is false.

2021.07.11 06:03:08.015 372818 (EURUSD,M1) BINARY SHIFT Loop time: 2.4 nanosec; total time: 40982 microsec. Left = 1606418432 Right = 532676545

And the ternary operator is way faster (why?), even though it uses branching. It all depends on the compiler, but unfortunately we can't see the assembler code.

You know, Alain, I tested it on a laptop with a Core i3 and the branchless code was faster. This shows that it greatly depends on the L1 and L2 caches and pipeline capacity, among other factors such as compiler optimizations.

amrali 2021.07.11 10:40 #38

Anyway, these are the broad guidelines and actually there will always be some other factors in the equation. However, good algorithm design and good programming patterns are indispensable.

Alain Verleyen 2021.07.11 10:58 #39

amrali:
You know, Alain, I tested it on a laptop with a Core i3 and the branchless code was faster. This shows that it greatly depends on the L1 and L2 caches and pipeline capacity, among other factors such as compiler optimizations.

Interesting. Would you mind posting the results?

My setup :

2021.07.09 15:36:53.702 Terminal Windows 10 build 19042, Intel Core i7-9750H @ 2.60GHz, 12 / 15 Gb memory, 62 / 279 Gb disk, IE 11, UAC, GMT-5

Files:

372818.mq5 6 kb

amrali 2021.07.11 13:57 #40

Alain Verleyen:

Interesting. Would you mind posting the results?

My setup :

2021.07.09 15:36:53.702 Terminal Windows 10 build 19042, Intel Core i7-9750H @ 2.60GHz, 12 / 15 Gb memory, 62 / 279 Gb disk, IE 11, UAC, GMT-5

Hi Alain, I have checked your code. With the ternary operator, you are actually comparing apples to oranges. Your code increments the left and right pointers unconditionally, which is reflected in your shorter execution time, unlike the original code, which increments the pointers based on the condition.

Here are my results after I made some corrections.

Unconcditional Loop time: 4.9 nanosec; total time: 81974 microsec. Left = 1606418432 Right = 532676574
IF IF Loop time: 6.6 nanosec; total time: 110222 microsec. Left = 1606418432 Right = 532676574
IF ELSE IF Loop time: 6.6 nanosec; total time: 110091 microsec. Left = 1606418432 Right = 532676574
IF ELSE Loop time: 6.6 nanosec; total time: 110222 microsec. Left = 1606418432 Right = 532676574
TERNARY ? Loop time: 0.6 nanosec; total time: 9404 microsec. Left = 1606418432 Right = 532676574
BINARY SHIFT Loop time: 4.5 nanosec; total time: 76047 microsec. Left = 1606418432 Right = 532676574
BINARY SHIFT APPLES Loop time: 0.6 nanosec; total time: 10869 microsec. Left = 1606418432 Right = 532676574

Terminal MetaTrader 5 x64 build 2994 started for MetaQuotes Software Corp.
Terminal Windows 7 Service Pack 1 build 7601, Intel Core i3-2330M  @ 2.20GHz, 2 / 3 Gb memory, 9 / 29 Gb disk, IE 8, Admin, GMT+2

Files:

372818.mq5 7 kb

Possible conditional check error - page 4