GZip Algorithm in MQL

 

Is there a GZip algorithm in MQL?

I searched the MQL4 and MQL5 sites and Google, but found no results.

 
What is GZip?
 
A compression format mainly used on Linux and in web apps. https://en.wikipedia.org/wiki/Gzip
 
 

This was eight years ago. Has anyone developed a gzip library in the meantime?

@Vladimir: ZIP is not the same thing as GZIP. A zip file contains a directory structure and file metadata, and can hold multiple files. A gzip file is merely a single compressed data stream, which makes it an efficient way to store, for example, tick data, log files, CSV files, etc.

 


I guess nobody did that, but it would be an interesting journey to implement gzip efficiently in MQL5.

Maybe a good starting point is the official source of the gzip implementation.


 

According to Wikipedia, in the simplest case a gzip file starts with a 10-byte header (whose leading bytes 0x1f 0x8b 0x08 are the 2-byte signature and the "deflate" method ID), followed by the compressed payload and ending with an 8-byte trailer (CRC-32 and the length of the uncompressed data). So you can extract the compressed payload and pass it to the function CryptDecode(CRYPT_ARCH_ZIP, ...) - that's all.
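
To make the byte offsets concrete, here is a minimal sketch of that simple case. It is written in Python rather than MQL5 so it can be run and checked easily; the function name gunzip_simple is made up for illustration, and zlib.decompress with a negative wbits (raw deflate) stands in for the decoding step. Whether CryptDecode(CRYPT_ARCH_ZIP, ...) accepts a raw deflate payload without further options is an assumption that has not been verified here.

```python
import struct
import zlib


def gunzip_simple(blob: bytes) -> bytes:
    """Decode a single gzip member that has no optional header fields.

    Hypothetical helper for illustration: assumes FLG == 0, i.e. the
    layout is exactly 10-byte header + deflate payload + 8-byte trailer.
    """
    if len(blob) < 18:
        raise ValueError("too short to be a gzip file")
    if blob[:3] != b"\x1f\x8b\x08":
        raise ValueError("missing gzip signature or not the deflate method")
    if blob[3] != 0:
        raise ValueError("optional header fields present, not the simple case")

    payload = blob[10:-8]                            # raw deflate stream
    crc32, isize = struct.unpack("<II", blob[-8:])   # little-endian trailer

    data = zlib.decompress(payload, wbits=-15)       # negative wbits = raw deflate

    # The trailer lets us verify the result: CRC-32 and length modulo 2^32.
    if zlib.crc32(data) != crc32 or (len(data) & 0xFFFFFFFF) != isize:
        raise ValueError("CRC-32 or ISIZE mismatch")
    return data
```

In an MQL5 port, the slice blob[10:-8] would correspond to copying the bytes between the header and the trailer into a separate uchar array before decoding.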

In more complex cases, there can be optional fields between the header and the payload. Their presence is indicated by the 4th byte of the header, which is a bitmask of flags. More on this can be found in the specification (here is an excerpt; the optional fields are the ones marked "if FLG.... set"):

      +---+---+---+---+---+---+---+---+---+---+
      |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
      +---+---+---+---+---+---+---+---+---+---+

      (if FLG.FEXTRA set)

      +---+---+=================================+
      | XLEN  |...XLEN bytes of "extra field"...| (more-->)
      +---+---+=================================+

      (if FLG.FNAME set)

      +=========================================+
      |...original file name, zero-terminated...| (more-->)
      +=========================================+

      (if FLG.FCOMMENT set)

      +===================================+
      |...file comment, zero-terminated...| (more-->)
      +===================================+

      (if FLG.FHCRC set)

      +---+---+
      | CRC16 |
      +---+---+

      +=======================+
      |...compressed blocks...| (more-->)
      +=======================+

        0   1   2   3   4   5   6   7
      +---+---+---+---+---+---+---+---+
      |     CRC32     |     ISIZE     |
      +---+---+---+---+---+---+---+---+

...

      ID1 (IDentification 1)
      ID2 (IDentification 2)
            These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139
            (0x8b, \213), to identify the file as being in gzip format.

      CM (Compression Method)
            This identifies the compression method used in the file.  CM
            = 0-7 are reserved.  CM = 8 denotes the "deflate"
            compression method, which is the one customarily used by
            gzip and which is documented elsewhere.

      FLG (FLaGs)
            This flag byte is divided into individual bits as follows:

               bit 0   FTEXT
               bit 1   FHCRC
               bit 2   FEXTRA
               bit 3   FNAME
               bit 4   FCOMMENT
               bit 5   reserved
               bit 6   reserved
               bit 7   reserved

...

      MTIME (Modification TIME)
            This gives the most recent modification time of the original
            file being compressed.  The time is in Unix format, i.e.,
            seconds since 00:00:00 GMT, Jan.  1, 1970.

      XFL (eXtra FLags)
            These flags are available for use by specific compression
            methods.  The "deflate" method (CM = 8) sets these flags as
            follows:

               XFL = 2 - compressor used maximum compression,
                         slowest algorithm
               XFL = 4 - compressor used fastest algorithm

      OS (Operating System)

It looks like a relatively easy task to implement in MQL5.
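
As a rough illustration of how the optional fields from the excerpt could be skipped, here is a sketch in Python (chosen only because it is easy to run; the same offset arithmetic should carry over to MQL5 byte arrays). The names gzip_payload_offset and gunzip are hypothetical. It assumes a single gzip member, does not verify the optional header CRC16, and uses zlib's raw deflate mode as the decoder.

```python
import struct
import zlib

# FLG bit masks, as listed in the RFC 1952 excerpt above
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def gzip_payload_offset(blob: bytes) -> int:
    """Return the index at which the compressed blocks start."""
    if len(blob) < 10 or blob[0] != 0x1F or blob[1] != 0x8B:
        raise ValueError("not a gzip stream")
    if blob[2] != 8:
        raise ValueError("unsupported compression method")
    flg = blob[3]
    pos = 10                                   # end of the fixed header
    if flg & FEXTRA:                           # 2-byte XLEN, then XLEN bytes
        (xlen,) = struct.unpack_from("<H", blob, pos)
        pos += 2 + xlen
    if flg & FNAME:                            # zero-terminated file name
        pos = blob.index(0, pos) + 1
    if flg & FCOMMENT:                         # zero-terminated comment
        pos = blob.index(0, pos) + 1
    if flg & FHCRC:                            # 2-byte header CRC, skipped unverified
        pos += 2
    return pos


def gunzip(blob: bytes) -> bytes:
    """Decode one gzip member, skipping any optional header fields."""
    start = gzip_payload_offset(blob)
    payload = blob[start:-8]                   # everything before the 8-byte trailer
    crc32, isize = struct.unpack("<II", blob[-8:])
    data = zlib.decompress(payload, wbits=-15) # raw deflate
    if zlib.crc32(data) != crc32 or (len(data) & 0xFFFFFFFF) != isize:
        raise ValueError("CRC-32 or ISIZE mismatch")
    return data
```

Note that the order of the checks matters: the optional fields always appear in the order FEXTRA, FNAME, FCOMMENT, FHCRC, regardless of their bit positions in FLG.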

RFC 1952: GZIP file format specification version 4.3
  • datatracker.ietf.org
This specification defines a lossless compressed data format that is compatible with the widely used GZIP utility. This memo provides information for the Internet community. This memo does not specify an Internet standard of any kind.