How to reduce the size of an array by 4-5 times - General

Slava 2018.09.10 16:38 #22831

Nikolai Semko:

Ticks are not stored in RAM. As I understand it, they are loaded piecemeal, as needed.

If you mean real ticks, then yes.

In a single pass, you can see statistics on how much memory is spent storing ticks. During optimization, no more than 320 MB is kept in memory at a time; the rest is stored on disk.

We are now considering keeping all ticks in shared memory, so that all local agents can read from it. Then there would be no disk access, and optimization would be faster.

Nikolai Semko 2018.09.10 17:37 #22832

Slava:

If you mean real ticks, then yes.

In a single pass, you can see statistics on how much memory is spent storing ticks. During optimization, no more than 320 MB is kept in memory at a time; the rest is stored on disk.

We are now considering keeping all ticks in shared memory, so that all local agents can read from it. Then there would be no disk access, and optimization would be faster.

Yes, this is extremely important. If I understand correctly, ticks and minute bars are currently stored unpacked both on disk and in memory, i.e. 60 bytes per bar (MqlRates structure) and 52 bytes per tick (MqlTick structure).
It's horrible! Something should have been done about this long ago.
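
As a quick check of the 60 and 52 byte figures, here is a minimal Python sketch that sums the field sizes of the two structures. The format strings are my reading of the documented field order, and the MqlTick layout is assumed to be the 2018 one (without any fields added later); the constant names are made up for this example.

```python
import struct

# "<" means no padding, i.e. the fields are laid out back to back.
# MqlRates: time, open, high, low, close, tick_volume, spread, real_volume
MQL_RATES_FMT = "<q4dqiq"
# MqlTick (assumed 2018 layout): time, bid, ask, last, volume, time_msc, flags
MQL_TICK_FMT = "<q3dQqI"

rates_size = struct.calcsize(MQL_RATES_FMT)
tick_size = struct.calcsize(MQL_TICK_FMT)
print(rates_size, tick_size)
```

Of the 60 bytes in a bar, 32 go to the four prices and 8 to the time, which are the fields the rest of the discussion is about.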

I understand that the main problem with compressed arrays is providing fast access to each array element.

But even if we keep only every 256th element of the array unpacked and store the other elements as increments relative to it, the array would shrink by 4-5 times. Access time per element would barely increase (maybe by 1-2 nanoseconds), while saving and reading the array to and from disk would take far less time.
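
A minimal Python sketch of the idea, under my own assumptions: prices are held as integers (in points), each block of 256 elements keeps one full 8-byte value, and every element is stored as a 2-byte offset from its block's full value. The names `pack`, `get` and `BLOCK` are illustrative and not from any terminal API.

```python
from array import array

BLOCK = 256  # one full ("unpacked") value per 256 elements, as in the post

def pack(values):
    """Split integer prices into full keyframes and small per-element deltas."""
    keys = array("q")    # 8-byte keyframe for each block
    deltas = array("h")  # 2-byte offset of every element from its block keyframe
    for i, v in enumerate(values):
        if i % BLOCK == 0:
            keys.append(v)
        deltas.append(v - keys[-1])  # raises OverflowError if the delta does not fit
    return keys, deltas

def get(keys, deltas, i):
    """Random access: one shift, two lookups, one addition."""
    return keys[i >> 8] + deltas[i]

# Synthetic prices that stay within a few hundred points of each other
prices = [113000 + (i * 7) % 300 - 150 for i in range(1000)]
keys, deltas = pack(prices)
packed_bytes = keys.itemsize * len(keys) + deltas.itemsize * len(deltas)
```

For simplicity the keyframe element also gets a (zero) delta, so the index arithmetic stays trivial. With 2-byte deltas this gives close to a 4x reduction for 8-byte values; 1-byte deltas would do better when the data allows it.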

Nikolai Semko 2018.09.10 18:29 #22833

fxsaber:
Why is the SSD constantly being accessed (the activity light flickers rapidly) during Optimization?

That's why I don't use ticks. I use a logarithmic data structure (I've described it before), which at any given moment consists of a couple of thousand ticks, then a couple of thousand minute bars, then 2000 bars each of M2, M5, M10, M30, H1, H3, H6, H12, D1, W1... and all MN1 bars.
This structure covering the full history is built at any moment in less than a millisecond and occupies only 1.5 MB in RAM (actually not even in RAM, but in the processor cache). All algorithms designed for this structure simply fly.
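
The post does not say how the structure is built, so the following is only a rough Python sketch of one way to read it: minute bars are merged into coarser timeframes and only the newest `depth` bars of each timeframe are kept. The functions `aggregate` and `log_history`, the tuple layout and the default list of periods are all assumptions made for this example; the tick layer is left out.

```python
def aggregate(bars, period):
    """Merge (time, open, high, low, close) bars into bars of `period` seconds."""
    out = []
    for t, o, h, l, c in bars:
        start = t - t % period  # start time of the coarser bar
        if out and out[-1][0] == start:
            _, po, ph, pl, _ = out[-1]
            out[-1] = (start, po, max(ph, h), min(pl, l), c)
        else:
            out.append((start, o, h, l, c))
    return out

def log_history(m1_bars, periods=(60, 120, 300, 600, 1800, 3600), depth=2000):
    """Keep only the newest `depth` bars of each timeframe."""
    return {p: aggregate(m1_bars, p)[-depth:] for p in periods}

# Ten synthetic minute bars with integer prices
m1 = [(i * 60, 10 + i, 15 + i, 5 + i, 12 + i) for i in range(10)]
m5 = aggregate(m1, 300)
layers = log_history(m1, periods=(60, 300), depth=3)
```

The size of such a structure grows with the number of timeframes rather than with the length of the history, which is consistent with the small memory footprint mentioned in the post.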

After all, our eyesight is built on the same logarithmic scale: the further we look, the less we notice small details.

When, in the not too distant future, computers have a single physical memory device in place of the hard drive, RAM and processor cache, namely a processor cache with 13 zeros in it, then I'll make the switch to ticks, too :))

...

Although maybe I'm off the mark here, since even with such a data structure the light will still flicker during optimization. After all, the ticks will still be loaded :((

fxsaber 2018.09.10 21:28 #22834

Slava:

If you mean real ticks, then yes.

In a single pass, you can see statistics on how much memory is spent storing ticks. During optimization, no more than 320 MB is kept in memory at a time; the rest is stored on disk.

We are now considering keeping all ticks in shared memory, so that all local agents can read from it. Then there would be no disk access, and optimization would be faster.

First, let's start with the Optimization log

Tester  optimization finished, total passes 714240
Statistics      optimization done in 7 hours 31 minutes 06 seconds
Statistics      local 714240 tasks (100%), remote 0 tasks (0%), cloud 0 tasks (0%)
Core 1  connection closed
Core 2  connection closed
Core 3  connection closed
Core 4  connection closed
Core 5  connection closed
Core 6  connection closed
Core 7  connection closed
Core 8  connection closed
Tester  714240 new records saved to cache file 'tester\cache\Test.FILTER_EURUSD.rann_RannForex.M1.20180226.20180909.40.2D734373DF0CAD251E2BD6535A4C6C84.opt'

During those 7.5 hours, the SSD was accessed at a very high frequency. If ticks were read on each pass, that works out to an average of 26 reads per second for 7.5 hours. Hence the frantic flickering: more than 700 thousand reads.
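
The 26 per second figure follows directly from the log above. A small Python check, assuming exactly one tick read per pass (the variable names are mine):

```python
passes = 714240                      # "total passes" from the optimization log
elapsed = 7 * 3600 + 31 * 60 + 6     # 7 hours 31 minutes 06 seconds, in seconds
rate = passes / elapsed              # average passes (and reads) per second
print(f"{rate:.1f} passes per second")
```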

Single run log

Core 1  FILTER_EURUSD.rann_RannForex,M1: 132843 ticks, 60283 bars generated. Environment synchronized in 0:00:00.140. Test passed in 0:00:00.827 (including ticks preprocessing 0:00:00.109).
Core 1  FILTER_EURUSD.rann_RannForex,M1: total time from login to stop testing 0:00:00.967 (including 0:00:00.140 for history data synchronization)
Core 1  322 Mb memory used including 36 Mb of history data, 64 Mb of tick data

As you can see, ~130K ticks and 60K bars are used (the "Entire history" mode is selected in the Tester), i.e. a very small amount of history.

The custom symbol's history in the Terminal contains the following amount of data

Saved ticks = 133331
Generated Rates = 60609

I.e. the symbol's history contains only slightly more than the Tester uses.

P.S. It's painful to watch the SSD... How much faster could the optimization be? It's strange that the OS doesn't cache this data, since it's less than 7 MB of ticks in uncompressed form.
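
The "less than 7 MB" estimate can be reproduced from the tick count above, assuming the 52-byte MqlTick size quoted earlier in the thread (variable names are mine):

```python
saved_ticks = 133331   # "Saved ticks" reported for the custom symbol
tick_size = 52         # bytes per MqlTick, as quoted earlier in the thread
tick_bytes = saved_ticks * tick_size
print(tick_bytes / 1e6, "MB")
```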

Alexey Navoykov 2018.09.11 00:45 #22835

Nikolai Semko:

But even if we keep only every 256th element of the array unpacked and store the other elements as increments relative to it, the array would shrink by 4-5 times. Access time per element would barely increase (maybe by 1-2 nanoseconds), while saving and reading the array to and from disk would take far less time.

Renat isn't here to set you straight ) How many times has it been suggested to optimize history storage! Especially since nothing needs to be spent on compression (the most resource-intensive part), because the data already arrives from the server compressed, and only the cache, which is in constant use, is kept unpacked... But that's where the lecture always starts: if you can't buy a bigger or faster hard drive, you have no business using MT. And slow VPSs are always mentioned for some reason.

Nikolai Semko 2018.09.11 03:46 #22836

Alexey Navoykov:

Renat isn't here to set you straight ) How many times has it been suggested to optimize history storage! Especially since nothing needs to be spent on compression (the most resource-intensive part), because the data already arrives from the server compressed, and only the cache, which is in constant use, is kept unpacked... But that's where the lecture always starts: if you can't buy a bigger or faster hard drive, you have no business using MT. And slow VPSs are always mentioned for some reason.

Once again, the main problem with packed arrays is providing fast access to any array element, not reading them sequentially. That's why a different compression format (or rather, storage format) is needed here, one that doesn't require unpacking and packing. Of course, the ~10x compression of zip, png, etc. is out of reach, but I think 5x compression is possible.

Really, if you think about it, MqlRates allocates 8*4=32 bytes for the four prices of a one-minute bar (and only one-minute bars are stored), although in 99% of cases these values differ from each other by less than one byte of information, i.e. 8+1+1+1=11 bytes is almost always enough, even without reference to previous bars. And the time differs from the previous value by exactly 60 in 99% of cases (i.e. in 99% of cases 1 bit of information is enough: 60 or not 60). Yet 8 bytes are allocated for it too.
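
The 8+1+1+1 accounting can be sketched in Python as follows. This is only an illustration under my own assumptions: the open price stays a full 8-byte double, high, low and close are stored as signed 1-byte offsets from the open in points (a 5-digit symbol is assumed), and bars that do not fit return `None` to signal that a wider fallback is needed. The function names are hypothetical.

```python
import struct

def pack_prices(o, h, l, c, point=0.00001):
    """Pack OHLC as an 8-byte open plus three signed 1-byte offsets in points."""
    offs = [round((p - o) / point) for p in (h, l, c)]
    if any(not -128 <= d <= 127 for d in offs):
        return None  # the rare bar that needs a wider fallback encoding
    return struct.pack("<d3b", o, *offs)

def unpack_prices(buf, point=0.00001):
    """Restore open, high, low, close from the 11-byte form."""
    o, dh, dl, dc = struct.unpack("<d3b", buf)
    return o, o + dh * point, o + dl * point, o + dc * point

def time_flags(times, step=60):
    """One bit per bar: 1 when it starts exactly `step` seconds after the previous one."""
    return [int(b - a == step) for a, b in zip(times, times[1:])]

buf = pack_prices(1.16000, 1.16025, 1.15990, 1.16010)
restored = unpack_prices(buf)
flags = time_flags([0, 60, 120, 300])
```

Restored prices match the originals only up to floating-point rounding, so a real format would more likely store integer points throughout.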

Alexey Navoykov 2018.09.11 04:17 #22837

Nikolai Semko:

Once again, the main problem with packed arrays is providing fast access to any array element, not reading them sequentially. That's why a different compression format (or rather, storage format) is needed here, one that doesn't require unpacking and packing. Of course, the ~10x compression of zip, png, etc. is out of reach, but I think 5x compression is possible.

If we're talking about storage on disk, access to a specific element is beside the point, because the file operation itself is costly, so a large chunk is read at once. For example, bar history files are split by year and tick files by month. And if you mean keeping the history in memory in packed form, unpacking each element on the fly every time, then I'm afraid that won't suit anyone.

Nikolai Semko 2018.09.11 04:23 #22838

I've just come up with a storage format that stores blocks of 256 MqlRates elements, each block taking 2900 bytes on average (the block size will vary), i.e. 2900/256 ≈ 11.3 bytes per MqlRates structure, which is about 5 times less than 60 bytes, as I expected.

Access to each element of the packed MqlRates structure is fast enough (2-3 additions, 2-3 checks, 2-3 shifts, i.e. hardly more than 1 nanosecond).
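
The actual format is not described in the post, so the following Python sketch is only one hypothetical way to get variable-size blocks of 256 elements with direct access. It handles a single integer series (not a whole MqlRates structure): each block stores an 8-byte base, a 1-byte delta width chosen per block, and then the deltas; a separate offset table records where each block starts. All names are invented for this example.

```python
import struct

BLOCK = 256
HEADER = 9  # 8-byte base value + 1-byte delta width

def pack_blocks(values):
    """Pack integers into blocks whose delta width (1, 2 or 4 bytes) varies per block."""
    data, offsets = bytearray(), []
    for start in range(0, len(values), BLOCK):
        chunk = values[start:start + BLOCK]
        base = chunk[0]
        span = max(abs(v - base) for v in chunk)
        width = 1 if span <= 127 else 2 if span <= 32767 else 4
        offsets.append(len(data))  # byte position where this block begins
        data += struct.pack("<qB", base, width)
        for v in chunk:
            data += (v - base).to_bytes(width, "little", signed=True)
    return bytes(data), offsets

def get(data, offsets, i):
    """Direct access: a shift, a mask, a multiply and an addition, no full unpacking."""
    pos = offsets[i >> 8]
    base, width = struct.unpack_from("<qB", data, pos)
    p = pos + HEADER + (i & 255) * width
    return base + int.from_bytes(data[p:p + width], "little", signed=True)

# First block fits 1-byte deltas, the following blocks need 2 bytes
values = [100000 + i % 50 for i in range(256)] + [100000 + (i * 40) % 5000 for i in range(300)]
data, offsets = pack_blocks(values)
```

The offset table is what keeps access cheap when block sizes float: locating a block is a single lookup rather than a scan.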

Nikolai Semko 2018.09.11 04:27 #22839

Alexey Navoykov:

If we're talking about storage on disk, access to a specific element is beside the point, because the file operation itself is costly, so a large chunk is read at once. For example, bar history files are split by year and tick files by month. And if you mean keeping the history in memory in packed form, unpacking each element on the fly every time, then I'm afraid that won't suit anyone.

It will be stored on disk in a "compressed" format and read into memory in the same format. There will be no conversion to the full format, only a calculation at the moment a specific element of the MqlRates structure is read. And it will be much faster overall, given that there will be five times less work with the disk.

Alexey Navoykov 2018.09.11 04:56 #22840

Nikolai Semko:

Access to each element of the packed MqlRates structure is fast enough

...

It will be stored on disk in a "compressed" format and read into memory in the same format. There will be no conversion to the full format, only a calculation at the moment a specific element of the MqlRates structure is read.

Yes, but "fast" in your case is very relative. It is one thing when the user requests an array of bars and a section of memory is simply copied, or requests a specific time series, which is also a simple copy with a constant stride equal to the size of the structure. It is quite another when every number requires additional calculations and conversions.

Although, personally, I would prefer a compressed history so as not to waste memory, because I build my own arrays for storing it anyway. So I'm willing to tolerate a small delay. But most other users would tear you to pieces for it )

P.S. Ideally, though, the terminal would have an option to choose how history is stored in memory. For example, on a system with little RAM but a fast processor, this would be very useful.
