Discussion of article "MQL5 Cookbook: Implementing an Associative Array or a Dictionary for Quick Data Access"

 

New article MQL5 Cookbook: Implementing an Associative Array or a Dictionary for Quick Data Access has been published:

This article describes an algorithm for accessing elements by their unique keys. Any base data type can serve as a key, for example a string or an integer. Such a data container is commonly called a dictionary or an associative array, and it often makes problems easier and more efficient to solve.

This article describes a class for convenient storage of information, namely an associative array or dictionary. The class gives access to stored information by its key.

An associative array resembles a regular array, but instead of an index it uses a unique key, for example a value of the ENUM_TIMEFRAMES enumeration or a piece of text. What the key represents does not matter; only its uniqueness does. This way of storing data significantly simplifies many programming tasks.

For example, a function that takes an error code and prints its text description could look as follows:

//+------------------------------------------------------------------+
//| Displays the error description in the terminal.                  |
//| Displays "Unknown error" if error id is unknown                  |
//+------------------------------------------------------------------+
void PrintError(int error)
 {
   Dictionary dict;
   CStringNode* node = dict.GetObjectByKey(error);
   if(node != NULL)
      printf(node.Value());
   else
      printf("Unknown error");
 }

We will look into specific features of this code later.
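
For readers who want to try the idea outside the terminal, here is a minimal sketch of the same lookup pattern in C++ rather than MQL5, using the standard `std::unordered_map` as the dictionary. The error codes and texts in the table are invented for illustration and do not come from the article; `DescribeError` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical error table: the codes and texts are made up for this sketch.
static const std::unordered_map<int, std::string> kErrorText = {
    {1, "Invalid handle"},
    {2, "Not enough memory"},
    {3, "File not found"},
};

// Returns the description stored under the code,
// or "Unknown error" when the key is not in the dictionary.
std::string DescribeError(int code)
{
    auto it = kErrorText.find(code);
    if (it == kErrorText.end())
        return "Unknown error";
    return it->second;
}
```

The key point is that the caller never searches or indexes anything by hand: the key goes in, and either the value or a "not found" result comes out.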

Before describing the internal logic of the associative array, we will look at the two main ways of storing data, namely arrays and lists. Our dictionary is built on these two data structures, so we need a good understanding of their specific features. Chapter 1 describes these data structures. Chapter 2 describes the associative array and the methods for working with it.
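
One common way to combine an array with lists is a hash table with chaining: an array of buckets, where each bucket is a linked list of key/value pairs. The sketch below shows that general technique in C++ rather than MQL5. It is an assumption about the overall shape, not the article's actual class; `TinyDict` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal chained hash table: an array of buckets, each bucket a linked list.
class TinyDict
{
public:
    // A bucket count of zero is bumped to one to avoid division by zero.
    explicit TinyDict(std::size_t bucket_count = 16)
        : buckets_(bucket_count == 0 ? 1 : bucket_count) {}

    // Inserts a new pair or overwrites the value of an existing key.
    void Set(const std::string& key, int value)
    {
        Bucket& bucket = buckets_[Index(key)];
        for (auto& kv : bucket)
            if (kv.first == key) { kv.second = value; return; }
        bucket.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* Find(const std::string& key) const
    {
        const Bucket& bucket = buckets_[Index(key)];
        for (const auto& kv : bucket)
            if (kv.first == key) return &kv.second;
        return nullptr;
    }

private:
    using Bucket = std::list<std::pair<std::string, int>>;

    // The array part: the hash of the key selects one bucket.
    std::size_t Index(const std::string& key) const
    {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<Bucket> buckets_;
};
```

The array gives direct access to a bucket by index, and the list inside the bucket absorbs keys that happen to land on the same index.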

Author: Vasiliy Sokolov

 

Cool work, kudos to the author! This is something that MQ should have included in \MQL5\Include\Arrays a long time ago; I hope it will make it into the library in one of the next releases. By the way, everything works fine in MQL4 as well; the measurements from the first test are below. Am I right that simple data types cannot be stored instead of CObject* because there are no full-fledged pointers? Or is there some way around it?

2015.03.23 13:25:54.617 TestDict EURUSD,M1: 1000000 elements. Add: 1373; Get: 218
2015.03.23 13:25:52.644 TestDict EURUSD,M1: 950000 elements. Add: 1216; Get: 219
2015.03.23 13:25:50.833 TestDict EURUSD,M1: 900000 elements. Add: 1217; Get: 218
2015.03.23 13:25:49.069 TestDict EURUSD,M1: 850000 elements. Add: 1154; Get: 187
2015.03.23 13:25:47.424 TestDict EURUSD,M1: 800000 elements. Add: 1092; Get: 187
2015.03.23 13:25:45.844 TestDict EURUSD,M1: 750000 elements. Add: 1061; Get: 171
2015.03.23 13:25:44.320 TestDict EURUSD,M1: 700000 elements. Add: 1107; Get: 156
2015.03.23 13:25:42.761 TestDict EURUSD,M1: 650000 elements. Add: 1045; Get: 140
2015.03.23 13:25:41.304 TestDict EURUSD,M1: 600000 elements. Add: 1014; Get: 156
2015.03.23 13:25:39.915 TestDict EURUSD,M1: 550000 elements. Add: 920; Get: 125
2015.03.23 13:25:38.665 TestDict EURUSD,M1: 500000 elements. Add: 702; Get: 109
2015.03.23 13:25:37.693 TestDict EURUSD,M1: 450000 elements. Add: 593; Get: 93
2015.03.23 13:25:36.836 TestDict EURUSD,M1: 400000 elements. Add: 577; Get: 78
2015.03.23 13:25:36.025 TestDict EURUSD,M1: 350000 elements. Add: 561; Get: 78
2015.03.23 13:25:35.247 TestDict EURUSD,M1: 300000 elements. Add: 515; Get: 78
2015.03.23 13:25:34.557 TestDict EURUSD,M1: 250000 elements. Add: 343; Get: 63
2015.03.23 13:25:34.063 TestDict EURUSD,M1: 200000 elements. Add: 312; Get: 47
2015.03.23 13:25:33.632 TestDict EURUSD,M1: 150000 elements. Add: 281; Get: 31
2015.03.23 13:25:33.264 TestDict EURUSD,M1: 100000 elements. Add: 171; Get: 16
2015.03.23 13:25:33.038 TestDict EURUSD,M1: 50000 elements. Add: 47; Get: 16
 
VDev wrote:

"Cool work, kudos to the author! This is something that MQ should have included in \MQL5\Include\Arrays a long time ago; I hope it will make it into the library in one of the next releases. By the way, everything works fine in MQL4 as well. Am I right that simple data types cannot be stored instead of CObject* because there are no full-fledged pointers? Or is there some way around it?"

It will work, with the help of a boxing/unboxing mechanism and templates. The idea is that each value of a base type is packed into a KeyValuePairBase container. Unpacking the value and returning it with its proper type is done by internal functions such as GetObjectByKey:

template<typename Type, typename T>
Type GetObjectByKey(T key);

It is important to emphasize that working with base types this way gives no performance advantage, but it is much more convenient.
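
As a rough illustration of boxing and unboxing, the following sketch is written in C++ rather than MQL5. `Object` stands in for a common base class such as CObject, and `Box`, `BoxedDict`, `AddValue` and `TryGetValue` are hypothetical names; none of them are the article's KeyValuePairBase implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for a common base class such as CObject.
struct Object
{
    virtual ~Object() = default;
};

// Box: wraps a plain value so it can be stored behind an Object pointer.
template <typename T>
struct Box : Object
{
    explicit Box(T v) : value(v) {}
    T value;
};

class BoxedDict
{
public:
    // Boxing: the value is wrapped before it goes into the container.
    template <typename T>
    void AddValue(const std::string& key, T value)
    {
        items_[key] = std::make_unique<Box<T>>(value);
    }

    // Unboxing: the type is taken from the output argument. Returns false
    // if the key is missing or the stored box holds a different type.
    template <typename T>
    bool TryGetValue(const std::string& key, T& out) const
    {
        auto it = items_.find(key);
        if (it == items_.end())
            return false;
        const auto* box = dynamic_cast<const Box<T>*>(it->second.get());
        if (box == nullptr)
            return false;
        out = box->value;
        return true;
    }

private:
    std::map<std::string, std::unique_ptr<Object>> items_;
};
```

Every stored value costs an extra allocation and an extra indirection, which is consistent with the remark that boxing brings convenience rather than speed.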

 

I have now tried to create a template-based CDictionaryBase that stores one of the basic MQL types instead of CObject. Unfortunately, it did not work: functions apparently cannot return a template type. It's a pity:

//+------------------------------------------------------------------+
//|                                                 TestDictBase.mq5 |
//|                                 Copyright 2015, Vasiliy Sokolov. |
//|                                              http://www.mql5.com |
//+------------------------------------------------------------------+
#property copyright "Copyright 2015, Vasiliy Sokolov."
#property link      "http://www.mql5.com"
#property version   "1.00"
#include <Dictionary.mqh>
#include <DictionaryBase.mqh>
//+------------------------------------------------------------------+
//| Script program start function                                    |
//+------------------------------------------------------------------+
void OnStart()
  {
//---
   CDictionaryBase base;
   base.AddValue("Pi", 3.14159);
   double pi = (double)base.GetValueByKey("Pi");
   printf(DoubleToString(pi, 5));
   //base.AddObject(
  }
//+------------------------------------------------------------------+
Compiler output:

could not deduce template argument #1    TestDictBase.mq5        19      29
could not deduce template argument #0    DictionaryBase.mqh      404     25
possible loss of data due to type conversion    DictionaryBase.mqh      133     10
possible loss of data due to type conversion    DictionaryBase.mqh      135     10
possible loss of data due to type conversion    DictionaryBase.mqh      137     10
...

//+------------------------------------------------------------------+
//| Returns the object by key.                                       |
//+------------------------------------------------------------------+
template<typename T, typename C>
C CDictionaryBase::GetValueByKey(T key)
  {
   if(!ContainsKey(key))
      return NULL;
   return m_current_kvp.GetValue();
  }

Too bad.

So for each base type we will either have to create a separate container, or just create wrapper classes for the base types: CDouble, CLong, CInt, etc.

 
C-4 wrote:

"Now I tried to create a CDictionaryBase storing one of the basic MQL types instead of CObject on the basis of templates. Unfortunately I failed to do it, the functions do not allow to return a template type."

They do. But the return type cannot be deduced automatically, which is exactly what the compiler is telling you.

You can use a small crutch in the form of a dummy parameter.

template<typename T, typename C>
C CDictionaryBase::GetValueByKey(T key, C)
{
   if(!ContainsKey(key))
      return NULL;
   return m_current_kvp.GetValue();
}
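
The same situation can be reproduced in C++ (used here instead of MQL5 so that the sketch can be compiled and checked). A template parameter that appears only in the return type cannot be deduced from the call arguments, so either the caller names it explicitly or a dummy argument carries the type. `ValueStore` and its methods are hypothetical names, and whether a given MQL5 build accepts the explicit form is not stated in this thread.

```cpp
#include <cassert>
#include <map>
#include <string>

class ValueStore
{
public:
    void Add(const std::string& key, double value) { values_[key] = value; }

    // C appears only in the return type, so it cannot be deduced from the
    // arguments; the caller has to name it: Get<int>("Pi").
    template <typename C>
    C Get(const std::string& key) const
    {
        auto it = values_.find(key);
        return it != values_.end() ? static_cast<C>(it->second) : C();
    }

    // Dummy-parameter variant: the second argument is never read. It exists
    // only so that the compiler can deduce C from its type.
    template <typename C>
    C Get(const std::string& key, C /*type_hint*/) const
    {
        return Get<C>(key);
    }

private:
    std::map<std::string, double> values_;
};
```

A call such as `store.Get("Pi", 0.0)` then returns a `double`, while `store.Get("Pi", 0)` would return an `int`. The value of the hint is ignored; only its type matters.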
 
TheXpert wrote:

"They do. But the return type cannot be deduced automatically, which is exactly what the compiler is telling you. You can use a small crutch in the form of a dummy parameter."

Where is the actual crutch?
 
C-4 wrote:

"Where is the actual crutch?"

A second parameter has been added.
 
TheXpert wrote:

"A second parameter has been added."

I see it now. I'll check it tomorrow.
 
Have you tried comparing performance? At what data size does the advantage over binary search in a sorted string array begin?
 
Integer wrote:

"Have you tried comparing performance? At what data size does the advantage over binary search in a sorted string array begin?"

I haven't run exact tests, but from what I have observed the speed advantage starts to appear at tens of thousands of elements. That is, in everyday tasks with 100 to 10,000 elements there is no performance gain to be had.

What matters more here is the convenience of working with the container. You don't need to write additional methods to search for elements. Many everyday tasks become many times easier to implement with a dictionary. You don't need to create an element just to search for its index, then retrieve the one you need by that index, and so on.

P.S. On reflection, though, performance should be measured as the total time for inserting elements and searching for them. Searching a sorted CArrayObj is quite fast, but insertion is a real problem: fast search requires the array to stay ordered, so new elements have to be inserted in the middle, and that slows things down significantly.
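
To make the insertion cost concrete, here is a small C++ sketch (not MQL5, and not a benchmark) that keeps a `std::vector` sorted and counts how many existing elements each insertion has to shift. `SortedInsert` and `SortedContains` are hypothetical helper names; the vector stands in for a sorted array such as CArrayObj only by analogy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Inserts key into a sorted vector, keeping it sorted, and returns how many
// existing elements had to be shifted to make room for it.
std::size_t SortedInsert(std::vector<std::string>& sorted, const std::string& key)
{
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), key);
    std::size_t shifted = static_cast<std::size_t>(sorted.end() - pos);
    sorted.insert(pos, key);
    return shifted;
}

// Lookup by binary search; valid only while the vector stays sorted.
bool SortedContains(const std::vector<std::string>& sorted, const std::string& key)
{
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

Lookup takes a logarithmic number of comparisons, but an insertion near the front shifts almost the whole array, so filling the container in unfavorable order costs on the order of n²/2 element moves in total. A hash-based dictionary avoids that shifting, which is why total insert-plus-search time is the fairer measure.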

 

Very interesting. It's clear that a dictionary is a very helpful and easy-to-use data organizer.

Thanks for sharing.