Errors, bugs, questions - page 2505

 

A long-standing bug in the editor:

- save the file under a new name (for example: name_v1.2)
- place the cursor on some variable (or function call)
- press Alt+G

- the old file is opened and editing jumps to it :(

 
Vict:

In general, I didn't even expect that:

The code is a bit overcomplicated: I tried to reach the element that does not fit into a cache line and hammer on it directly, but that failed (it probably would have worked if I had persisted, but I got bored), so I did not change the code much. But this way it is even more impressive: only one access in 16 lands on an element that does not fit into a cache line, and yet the effect is noticeable.

P.S. To be more objective here, RIGHT_ALIGNED should be made by inserting two shorts rather than by removing one (that way both cases get two cache-line updates). The speedup will be more modest, but still about 1.5 times.

Excuse me, but where is the use of alignment? That's not what this example is about.

P.S. Posting your code without comments and in a raw state is disrespectful to the other participants.

 

Correctly noted: alignment was added so that MQL structure objects can be used with third-party libraries, dotnet in particular.

It was when support for dotnet libraries was added that pack alignment was introduced for the fields of structures/classes.

To make it short and simple, it works like this:

For each type (char, short, int, ...) there is a default alignment (1, 2, 4 bytes respectively).
For each field of the structure, the minimum of two alignments is used: the default one and the one set by the user (through pack).

At the same time, the size of the packed object is chosen so that the fields of every object in an array are always addressed "correctly" (by default, pack is one byte).
It is because of the latter that the false impression arises that pack aligns the size of the structure. That is not so: the field addresses are aligned, and this in turn entails aligning the size of the structure.



For example

struct A pack(8)
  {
   double d;
   char   c;
  };

void OnStart()
  {
   Print(sizeof(A));
   
  }

The result is 16, so that in an array of A the first field d is always addressed on an 8-byte boundary.
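The rule described above (field alignment = minimum of the default alignment and pack, then the size rounded up so array elements stay aligned) can be modelled in a few lines. This is only a sketch in C++, not the MQL5 compiler: the names Field, packed_size and struct_A are made up for illustration, and the default alignment of 8 for double is an assumption that is consistent with the result of 16 shown above.

```cpp
#include <cstddef>

// Hypothetical description of one field: its size and its default alignment.
struct Field { std::size_t size; std::size_t default_align; };

constexpr std::size_t min_sz(std::size_t a, std::size_t b) { return a < b ? a : b; }

// Round v up to the nearest multiple of a.
constexpr std::size_t round_up(std::size_t v, std::size_t a) { return (v + a - 1) / a * a; }

// Size of a structure under the rule from the post:
// each field is aligned to min(default alignment, pack), and the total size
// is rounded up to the largest field alignment actually used.
template <std::size_t N>
constexpr std::size_t packed_size(const Field (&fields)[N], std::size_t pack)
{
    std::size_t offset = 0;
    std::size_t struct_align = 1;
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t a = min_sz(fields[i].default_align, pack);
        offset = round_up(offset, a) + fields[i].size;
        if (a > struct_align) struct_align = a;
    }
    return round_up(offset, struct_align);
}

// struct A { double d; char c; } (double assumed to have default alignment 8)
constexpr Field struct_A[] = { {8, 8}, {1, 1} };

static_assert(packed_size(struct_A, 8) == 16, "pack(8): 9 bytes of data padded to 16");
static_assert(packed_size(struct_A, 1) == 9,  "pack(1): no padding at all");
```

With pack(8) the 9 bytes of data are padded to 16, which is what keeps d on an 8-byte boundary in every element of an array; with pack(1) the same model gives 9.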

 
fxsaber:

Runs on my machine showed no noticeable difference.

I've reworked the original idea (in the first version of the code the addresses were computed incorrectly). If you don't mind, I'd be interested to see the result on your machine.

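#define WRONG_ALIGNED        // comment this out for the RIGHT_ALIGNED variant
#define CACHE_LINE_SIZE 64   // assumed cache line size in bytes

struct Data {
#ifdef WRONG_ALIGNED
   ushort pad;               // 2-byte pad: ar[] starts 2 bytes into the structure
#else
   uint pad;                 // 4-byte pad: ar[] starts 4 bytes into the structure
#endif
   // one element more than fits into a cache line, so ar[] always reaches the next line
   uint ar[CACHE_LINE_SIZE/sizeof(int)+1];
};

// memcpy returns the destination address; copying 0 bytes changes nothing,
// so it is used here only to obtain the address of a variable
#import "msvcrt.dll"
  long memcpy(uint &, uint &, long);
#import
#define getaddr(x) memcpy(x, x, 0)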

void OnStart()
{
   Data data[32768];
   ZeroMemory(data);
   
   srand(GetTickCount());
   
   ulong start_time = GetMicrosecondCount();
   
   for(unsigned i = 0; i < 10000; ++ i) {
      int rndnum = rand();
      while (++rndnum < 32768) {
         int index = int(CACHE_LINE_SIZE - getaddr(data[rndnum].ar[0]) % CACHE_LINE_SIZE) / sizeof(int);
         ++ data[rndnum].ar[index];
         ++ data[rndnum].pad;
      }
   }
      
   Alert(GetMicrosecondCount() - start_time);
   
   Print(data[100].ar[0]);
   Print(data[100].pad);
}
/*
Elapsed time in microseconds, two runs each.

WRONG_ALIGNED:
6206397
6185472

RIGHT_ALIGNED:
4089827
4003213
*/
In essence the same thing happens with and without WRONG_ALIGNED: on each iteration of the while loop we write to two adjacent cache lines (the write to pad always goes to a correct address). The only difference is that with WRONG_ALIGNED there are cases (not always) where one of the writes to ar hits a uint that does not fit entirely into one cache line. I get a stable difference of about 1.5 times.
 
Vict:

I've reworked the original idea (in the first version of the code the addresses were computed incorrectly). If you don't mind, I'd be interested to see the result on your machine.

In essence the same thing happens with and without WRONG_ALIGNED: on each iteration of the while loop we write to two adjacent cache lines (the write to pad always goes to a correct address). The only difference is that with WRONG_ALIGNED there are cases (not always) where one of the writes to ar hits a uint that does not fit entirely into one cache line. I get a stable difference of about 1.5 times.

Please explain what you are trying to compute with this line. In the previous example it was rubbish code.

int index = int(CACHE_LINE_SIZE - getaddr(data[rndnum].ar[0]) % CACHE_LINE_SIZE) / sizeof(int);
 
Francuz:

Please explain what you are trying to get with this line? In the previous example it was rubbish code.

It finds our position in the current cache line (the one that contains pad) and picks an index into ar[] such that the element at that index lies in the next cache line (with WRONG_ALIGNED the element may span two cache lines).
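To make the arithmetic of that line concrete, here is a rough C++ sketch of the same computation on plain integer addresses. The function names and the sample addresses (0x1002, 0x1004) are hypothetical, chosen only so that ar[0] sits 2 or 4 bytes past a 64-byte boundary; 4-byte elements and a 64-byte cache line are assumed, as in the benchmark.

```cpp
#include <cstdint>

constexpr std::uint64_t CACHE_LINE_SIZE = 64;
constexpr std::uint64_t ELEM_SIZE = sizeof(std::uint32_t);  // 4-byte elements, like uint

// Same formula as in the benchmark: bytes left in the current cache line,
// divided (with truncation) by the element size.
constexpr std::uint64_t index_for_next_line(std::uint64_t addr_of_ar0)
{
    return (CACHE_LINE_SIZE - addr_of_ar0 % CACHE_LINE_SIZE) / ELEM_SIZE;
}

// Address of ar[index], given the address of ar[0].
constexpr std::uint64_t element_addr(std::uint64_t addr_of_ar0, std::uint64_t index)
{
    return addr_of_ar0 + index * ELEM_SIZE;
}

// True when the bytes [addr, addr + size) lie in two different cache lines.
constexpr bool straddles(std::uint64_t addr, std::uint64_t size)
{
    return addr / CACHE_LINE_SIZE != (addr + size - 1) / CACHE_LINE_SIZE;
}

// ar[0] two bytes past a line boundary (the WRONG_ALIGNED situation):
// index 15 starts 2 bytes before the next line and ends 2 bytes inside it.
static_assert(index_for_next_line(0x1002) == 15, "62 / 4 truncates to 15");
static_assert(straddles(element_addr(0x1002, 15), ELEM_SIZE), "spans two lines");

// ar[0] four bytes past a line boundary (the RIGHT_ALIGNED situation):
// index 15 is exactly the first element of the next line.
static_assert(index_for_next_line(0x1004) == 15, "60 / 4 is 15");
static_assert(element_addr(0x1004, 15) % CACHE_LINE_SIZE == 0, "starts the next line");
static_assert(!straddles(element_addr(0x1004, 15), ELEM_SIZE), "fits in one line");
```

The truncating division is what makes the misaligned case land on the element that crosses the boundary rather than on the first element fully inside the next line.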

 
Vict:

It finds our position in the current cache line (the one that contains pad) and picks an index into ar[] such that the element at that index lies in the next cache line (with WRONG_ALIGNED the element may span two cache lines).

Now we're talking about an offset. But what you show is purely a synthetic example, which will never occur in real life. And on real examples the speed gain will be about 1% at best. You shouldn't make a big deal out of it for such a paltry acceleration.

P.S. Also, you have calculated the register size incorrectly.
 
Francuz:

Now we're talking about displacement. But what you show is a purely synthetic example which will never be encountered in real life. In real-world examples, the speed gain is about 1% at best. You shouldn't make a big deal out of such a paltry acceleration.

No, this is quite a realistic example. Only 25% of the writes in it land in "problem" places at cache-line boundaries. Is that a lot? For example, if you have an array of long double, one cache line holds only 4 values, and if you don't take care of alignment (and the compiler doesn't do it for you), then 25% of the elements will be problematic, just as in my example. There are many more nuances that speak in favour of alignment, but I won't go into them, since I'm not well enough versed in the subject.
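The 25% estimate is easy to check by counting. The sketch below is in C++ and assumes 16-byte elements (which is what "4 values per 64-byte line" implies) and an array that starts 8 bytes past a line boundary; the function name and the chosen offsets are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t LINE = 64;  // assumed cache line size in bytes

// Counts how many of `count` consecutive elements of `elem_size` bytes,
// the first one starting at byte `first_offset`, cross a cache-line boundary.
constexpr std::size_t count_straddling(std::size_t first_offset,
                                       std::size_t elem_size,
                                       std::size_t count)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t begin = first_offset + i * elem_size;
        const std::size_t end   = begin + elem_size - 1;   // last byte of the element
        if (begin / LINE != end / LINE) ++n;
    }
    return n;
}

// Array aligned to the element size: no element crosses a boundary.
static_assert(count_straddling(0, 16, 1024) == 0, "aligned array");

// Array shifted by 8 bytes: every fourth element crosses a boundary, i.e. 25%.
static_assert(count_straddling(8, 16, 1024) == 256, "one element in four");
```

Whether those 25% translate into a measurable slowdown in a given program is a separate question that only a measurement on the target machine can answer.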

Well, it's your call.

P.S. Also, you've miscalculated the register size.
I didn't calculate it at all ))
 
Vict:

No, this is quite a realistic example. Only 25% of the writes in it land in "problem" places at cache-line boundaries. Is that a lot? For example, if you have an array of long double, one cache line holds only 4 values, and if you don't take care of alignment (and the compiler doesn't do it for you), then 25% of the elements will be problematic, just as in my example. There are many more nuances that speak in favour of alignment, but I won't go into them, since I'm not well enough versed in the subject.

Well, you are the boss.

Once again: you are confusing the register size.

 
Francuz:

Once again, you're confusing the size of the register.

Justify