
I recommend starting with a minimal change, so that memory is reallocated less often. Replace the two lines
m_total_rows++;
ArrayResize(m_cells,m_total_rows*m_total_columns,10000);
in bool CSVReader::AddData(string data_str,bool header) with
m_total_rows++;
if(m_total_rows*m_total_columns>ArraySize(m_cells)) ArrayResize(m_cells,2*m_total_rows*m_total_columns);
The number of memory reallocations with copying should become O(log2(n)) instead of O(n): about 20 instead of 600 thousand. Maybe that will already be enough for you.
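For illustration, here is the same doubling idea in isolation (AppendCell, cells and used are names invented for this sketch, not part of CSVReader):
// Grow the array geometrically: capacity doubles only when the next element
// no longer fits, so n appends cause only about log2(n) reallocations.
void AppendCell(string &cells[],int &used,string value)
  {
   if(used>=ArraySize(cells))
      ArrayResize(cells,used<1 ? 1 : 2*used);
   cells[used++]=value;
  }
Over the whole run each element is copied at most twice on average, which is why the total copying cost stays linear.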
Thank you! Here's what I got:
1. No change in memory: both versions ate about 10 gigabytes of RAM.
2. Speed:
2.1 Old version: 574 seconds.
2.2 New version: 138 seconds.
So it's a 4x speedup, which is pretty good! However, memory is tight, and this is far from everything that needs to be loaded...
very handy :)
So I converted the CSV to binary, minus the date column.
As it turns out, the script now takes 1 gigabyte of memory while running, which is very good compared to 10. Still a lot, though :)
As for speed: only 16 seconds! Quite good!
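The conversion itself can be a one-off script along these lines (a sketch, assuming a semicolon-separated file whose first column is the date and whose remaining five columns are numeric; the file names and column count are made up for the example):
// Read quotes.csv, skip the date column, write the other values as raw doubles.
void ConvertCsvToBin()
  {
   int src=FileOpen("quotes.csv",FILE_READ|FILE_CSV|FILE_ANSI,';');
   int dst=FileOpen("quotes.bin",FILE_WRITE|FILE_BIN);
   if(src==INVALID_HANDLE || dst==INVALID_HANDLE)
      return;
   while(!FileIsEnding(src))
     {
      string date=FileReadString(src);   // date column is read and discarded
      for(int i=0;i<5;i++)
         FileWriteDouble(dst,StringToDouble(FileReadString(src)));
     }
   FileClose(src);
   FileClose(dst);
  }
Binary files load much faster because FileReadDouble (or a single FileReadArray call) avoids string parsing entirely.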
Actually, the third parameter of ArrayResize() is specified there for a reason... It's a bad change.
Read the documentation.
And what did you manage to get from the documentation about the third parameter that is useful in this case, when the task is to load into memory a .csv created in different programs and of arbitrary size?
Feel free to suggest a better change, one that speeds up memory reallocation (by reducing the number of ArrayResize() calls) more than doubling does...
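To make the trade-off concrete (the numbers below are illustrative, not measurements): with a fixed reserve of R cells, the array is physically reallocated roughly every R appended cells, i.e. about N/R times for N cells in total, and each reallocation copies up to N elements, so the total copying work grows like N^2/R. With doubling there are only about log2(N) reallocations, and the total copying stays under 2N. For N = 6,000,000 cells and R = 10,000 that is on the order of 600 reallocations and ~1.8 billion element copies with the reserve, versus ~23 reallocations and at most 12 million copies with doubling.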
Once reading is finished, in bool CSVReader::Load(int start_line), after the line
FileClose(filehandle);
insert a memory trim:
ArrayResize(m_cells,m_total_rows*m_total_columns);
This frees the unneeded 0-50% of the memory occupied by m_cells (only m_cells itself, not the cell contents).
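In context, the end of Load() would look roughly like this (the surrounding lines stand in for whatever the real method does; only the trim line is new):
   // ... all rows have been parsed into m_cells ...
   FileClose(filehandle);
   // Doubling can leave up to ~50% unused capacity after the last resize,
   // so shrink the array to the exact number of cells actually used.
   ArrayResize(m_cells,m_total_rows*m_total_columns);
   return(true);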
Now I'm writing a small library for working with CSV quickly.
The screenshot shows a test run that completes in 7 seconds!!! Xeon processor, 3.0 GHz.
First the script sets up the data format for each column; there are 6 columns. Then 1,000,000 rows are added to the table and filled with the numbers 0 to 999999; depending on the column format the numbers can be interpreted differently. Then everything is saved to a file.
The file size is 65.4 MB. The whole structure took up 232 MB in memory.
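The library's API isn't shown here, but the same test shape can be reproduced with plain arrays to get a feel for the sizes involved (a standalone sketch, not the library itself; the file name is made up):
// 6 columns x 1,000,000 rows, filled with 0..999999 and dumped to a binary file.
void OnStart()
  {
   double cells[];
   ArrayResize(cells,6*1000000);
   for(int r=0;r<1000000;r++)
      for(int c=0;c<6;c++)
         cells[r*6+c]=r;
   int h=FileOpen("table_test.bin",FILE_WRITE|FILE_BIN);
   if(h!=INVALID_HANDLE)
     {
      FileWriteArray(h,cells);
      FileClose(h);
     }
  }
Stored as plain 8-byte doubles this payload would be 48 MB on disk, so the 65.4 MB file and the 232 MB in-memory footprint suggest the library keeps additional per-column format information.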
Well, the script itself is still rough, though.
Thanks, but once the file is closed and the script finishes, that memory is quickly freed anyway. What I need is a way to reduce consumption while it's running...
Quite interesting. Are you planning to publish your work publicly?
Can you tell me what needs fixing in the script?