ArrayBSearch bug when there are duplicate values in a sorted array?

 

Have noticed something weird with ArrayBSearch...

Code snippet here:

int OnStart()
{   
    int arr[6] = { 1,
                   2,
                   2,
                   2,
                   2, 
                   5 };
    
    const int idx1 = ArrayBsearch(arr, 5, WHOLE_ARRAY, 0, MODE_ASCEND);
    Print("idx1 is ", idx1);  // Okay this idx1 is 5 so all is well...
    
    const int idx2 = ArrayBsearch(arr, 2, WHOLE_ARRAY, 0, MODE_ASCEND);
    Print("idx2 is ", idx2);  // Hold on! Why is idx2 being 2? Surely it should be 1 right?
    
    return 0;
}

 

So idx1 is 5 which is good. BUT... why is idx2 2?!? I would expect it to be 1 because it is the first appearance of the value 2. 

According to the doc:

The function returns index of a found element. If the wanted value isn't found, the function returns the index of an element nearest in value.

 Surely the first appearance of value 2 at index 1 as good as the second appearance of 2 at index 2 especially when it is the first position the value is found?!?

 Yes? No? Maybe?  

 
dudess: Surely the first appearance of value 2 at index 1 as good as the second appearance of 2 at index 2 especially when it is the first position the value is found?!?
  1. Index 1 is as good as index 2. Therefor 2 is just as good as 1. You asked for it to find a value, it did. The only thing wrong is your exception.
  2. The C++ STL contains two functions (lower_bound, upper_bound, but no binary_search) so you must specify how to handle duplicates.
 
dudess:

 Surely the first appearance of value 2 at index 1 as good as the second appearance of 2 at index 2 especially when it is the first position the value is found?!?

 Yes? No? Maybe?  

I'd agree that it's strange for ArrayBSeach to return 2 rather than 1, particularly because MODE_ASCEND and MODE_DESCEND give identical results.

 

WHRoeder:

The C++ STL contains two functions (lower_bound, upper_bound, but no binary_search) so you must specify how to handle duplicates 

What is the relevance of the C++ STL to MQL4 programming?

 
jjc: What is the relevance of the C++ STL to MQL4 programming?
If you use a function that doesn't specify a result in the case of duplicates, don't expect a specific result.
If you want a specific result, then you need to code a function like the functions the STL has.
 

@jjc

I'd agree that it's strange for ArrayBSeach to return 2 rather than 1, particularly because MODE_ASCEND and MODE_DESCEND give identical results.

I'm glad that you agree too. Why always idx 2? Why not give me idx 3 some other random time because it's as good as the other two? There seems to be some kind of order in this "randomness" when the second appearance is deemed to be more worthy than the rest of the dupes.

 

@WHRoeder

If you use a function that doesn't specify a result in the case of duplicates, don't expect a specific result.

If you want a specific result, then you need to code a function like the functions the STL has.

Am I really asking for too much for a search function to return the first appearance of the value that I want it to look for? I have already sorted that array for heaven's sake!

I really, really, really want to ask the MetaQuote guys why idx 2. Just tell me whether it is intentional (just to keep us on our toes) or by accident (oops we did it again) then I'll shut my trap.  


P.S. Do excuse my frustration. I've been writing code for over a decade in other languages and this is the first time I've seen something like this! Basically what I gathered is that "don't assume the norm; always read between the lines with the API docs because if it doesn't say it'd return the first appearance then don't assume it would; test the APIs and if it doesn't do what you expect then write your own". Hmm... 

 
dudess:

I really, really, really want to ask the MetaQuote guys why idx 2. Just tell me whether it is intentional or by accident then I'll shut my trap.  

I'd guess it's a bug. As far as I can see, the behaviour of ArrayBSearch doesn't match the documentation (at least in its English translation). It says "the function returns the index of an element nearest in value". Therefore, I would expect the following to return idx = 2, not idx = 1.

   int arr[] = {1,2,5,6};
   int idx = ArrayBsearch(arr, 4, WHOLE_ARRAY, 0, MODE_ASCEND);

... because the value of 4 which it is searching for is nearer in value to 5 than 2. But that's not what happens.

I think what you are seeing in your example may be another aspect of the same failed logic. 

 
jjc: ArrayBSearch doesn't match the documentation (at least in its English translation). It says "the function returns the index of an element nearest in value".
Nearest here is the same nearest reported at https://www.mql5.com/en/forum/145041 where iBarShift returns earlier. In Russian documentation nearest is actually next.
 
dudess:

I'm glad that you agree too. Why always idx 2? Why not give me idx 3 some other random time because it's as good as the other two? There seems to be some kind of order in this "randomness" when the second appearance is deemed to be more worthy than the rest of the dupes.

 

Am I really asking for too much for a search function to return the first appearance of the value that I want it to look for? I have already sorted that array for heaven's sake!

I really, really, really want to ask the MetaQuote guys why idx 2. Just tell me whether it is intentional (just to keep us on our toes) or by accident (oops we did it again) then I'll shut my trap.  


P.S. Do excuse my frustration. I've been writing code for over a decade in other languages and this is the first time I've seen something like this! Basically what I gathered is that "don't assume the norm; always read between the lines with the API docs because if it doesn't say it'd return the first appearance then don't assume it would; test the APIs and if it doesn't do what you expect then write your own". Hmm... 

By definition the binary search algorithm cut the "index space" by 2, and it stops searching when a value matches. So in your case you get index=2 (int((0+5°2), as it's the first value who matched.
 
WHRoeder:
Nearest here is the same nearest reported at https://www.mql5.com/en/forum/145041 where iBarShift returns earlier. In Russian documentation nearest is actually next.

Agree, translation issue. Google translation from Russian documentation :

Returns the index of the first element. If the value is not found, returns the index of the next lower by value of the elements, which are located between the desired value.

 
WHRoeder:
Nearest here is the same nearest reported at https://www.mql5.com/en/forum/145041 where iBarShift returns earlier. In Russian documentation nearest is actually next.

After reviewing the Russian, I agree that it's a translation issue. But it doesn't look like a distinction between next/nearest in Russian (as also e.g. in German); the English translation simply seems to be missing any reference to the ближайшего меньшего in the Russian.
 

angevoyageur

By definition the binary search algorithm cut the "index space" by 2, and it stops searching when a value matches. So in your case you get index=2 (int((0+5°2), as it's the first value who matched.

Thanks so much for that! Now it makes more sense!

It would have been clearer if the doc at least mentions the behaviour in the case of duplicate. (Not really a translation thing, just for clarification...)

Reason: