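Does anyone have experience with reading and parsing an HTML page starting with
<meta charset=utf-8>
to save the table rows in a CSV file?
I try to open the already downloaded, local files with
This works in the sense that the whole file can be read at once, but e.g. "€" comes out garbled when I open the CSV file in LO Calc
:(
If I try:
it doesn't work at all, as only the first 40 chars are read:
2025.01.03 17:13:31.783 createETF-Tabelle (EURUSD,H1) 106 e:5035 open ETF-Tabelle\USA\ETF TABLE (2024-12-27 21:07:08).html Html-Size:619513 String-Size: 40 <!DOCTYPE html> <html lang=en style><!--
If I try
I get:
Instead of reading 619471 chars I only get the first 4095 :(
I am adding each table row with: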
All of that is normal. If you open the file as TXT in UTF-8 and then use FileReadString, it reads only up to the first "end" (end-of-line) character, so you can't read everything at once this way (the length parameter is ignored).
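To illustrate the TXT behaviour described here, this is a minimal sketch of reading such a file line by line and gluing the lines back together. ReadTextByLines is a made-up helper name, and the sketch assumes that FileReadString in TXT mode returns one line per call without the line break, as the MQL5 reference describes; I have not checked it against files that use bare "\n" line ends.

```mql5
// Hypothetical helper (not from the thread): read a UTF-8 text file
// line by line and concatenate the lines into one string.
bool ReadTextByLines(const string fileName, string &text)
  {
   // FILE_TXT: each FileReadString call returns one line;
   // CP_UTF8 asks the terminal to decode the bytes as UTF-8.
   int h = FileOpen(fileName, FILE_READ|FILE_TXT|FILE_ANSI, 0, CP_UTF8);
   if(h == INVALID_HANDLE)
     {
      Print("FileOpen failed for ", fileName, ", error ", GetLastError());
      return(false);
     }
   text = "";
   while(!FileIsEnding(h))
      text += FileReadString(h) + "\n";   // re-add the line break that was consumed
   FileClose(h);
   return(true);
  }
```

For a file of about 600 KB the repeated string concatenation may be slow, so reading the raw bytes into an array in one call is probably the better route.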
It seems that FileReadString on a BIN file has a limit of 4096 characters (4095 plus the terminating '\0'). I was not aware of this, but it seems understandable when reading a string from a BIN file.
Of course, as usual, the documentation is unclear or incomplete; we have to live with it.
So one way to go is to use BIN, but with a char array:
int fHdl = FileOpen(fName, FILE_READ|FILE_BIN|FILE_ANSI, 0, CP_UTF8);
char array[];
uint read = FileReadArray(fHdl, array);
I suggest you parse your HTML code in the box provided at Blogcrowds HTML Parser
Then you can combine it with your HTML code.
Well, if you use "FileOpen(fHTML,FILE_READ|FILE_BIN|FILE_ANSI|FILE_COMMON);" WITHOUT ", 0, CP_UTF8":
int fHdl = FileOpen(fHTML, FILE_READ|FILE_BIN|FILE_ANSI|FILE_COMMON);
if (fHdl == INVALID_HANDLE) { return(false); }
string wrkHtml = FileReadString(fHdl, (int)FileSize(fHdl));
debug Prt("open "+fHTML+" Html-Size:"+(string)FileSize(fHdl)+" String-Size: "+(string)StringLen(wrkHtml)+" "+StringSubstr(wrkHtml,0,60));
it reads the whole file at once:
2025.01.03 19:06:31.196 createETF-Tabelle (EURUSD,H1) 106 e:0 open ETF-Tabelle\USA\ETF TABLE (2024-12-27 21:05:15).html Html-Size:619471 String-Size: 619471 <!DOCTYPE html> <html lang=en style><!-- Page saved with Si
MQL5 is probably tripping itself up. I'll try your suggestion with char...
You can find some useful info in the following places:
- algotrading book;
- the article on parsing MQL5 source files, specifically the section FileReader;
- and the article on parsing HTML-pages;
None of these have problems reading Unicode text.
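For the parsing step itself, here is a rough sketch of how table rows could be pulled out of an already decoded HTML string with plain string functions. It is not taken from the linked articles. StripTags and ExtractRows are made-up names, and the sketch assumes lowercase tags, no nested tables, and cell text that does not contain the separator. HTML entities are left undecoded, and "<tr" would also match any other tag that starts with those letters.

```mql5
// Hypothetical helper: drop everything between '<' and '>' in a fragment.
string StripTags(const string s)
  {
   string text = "";
   int pos = 0;
   int len = StringLen(s);
   while(pos < len)
     {
      int lt = StringFind(s, "<", pos);
      if(lt < 0)
        {
         text += StringSubstr(s, pos);          // no more tags: keep the rest
         break;
        }
      if(lt > pos)
         text += StringSubstr(s, pos, lt - pos); // text in front of the tag
      int gt = StringFind(s, ">", lt);
      if(gt < 0)
         break;                                  // unterminated tag: stop
      pos = gt + 1;
     }
   return(text);
  }

// Hypothetical helper: one output line per <tr>...</tr>, cells joined by sep.
// Returns the number of rows found.
int ExtractRows(const string html, string &rows[], const string sep = ";")
  {
   int count = 0;
   int pos = 0;
   ArrayResize(rows, 0);
   while(true)
     {
      int rowStart = StringFind(html, "<tr", pos);
      if(rowStart < 0)
         break;
      int rowEnd = StringFind(html, "</tr>", rowStart);
      if(rowEnd < 0)
         break;
      string row = StringSubstr(html, rowStart, rowEnd - rowStart);
      StringReplace(row, "\r", "");              // keep each row on one line
      StringReplace(row, "\n", "");
      StringReplace(row, "</td>", sep);          // mark cell ends before stripping tags
      StringReplace(row, "</th>", sep);
      ArrayResize(rows, count + 1);
      rows[count] = StripTags(row);
      count++;
      pos = rowEnd + 5;                          // continue after "</tr>"
     }
   return(count);
  }
```

Each row ends with a trailing separator, and the default ";" is chosen only because the numbers in this kind of table contain commas.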

SOLVED! This works:
int fHdl = FileOpen(fHTML, FILE_READ|FILE_BIN|FILE_ANSI|FILE_COMMON, 0, CP_UTF8);
if (fHdl == INVALID_HANDLE) {   // nothing to close when the open failed
   Prt(fHTML+" "+(string)fHdl+" FAILED");
   return(false);
}
char cArr[];
uint read = FileReadArray(fHdl, cArr);                               // raw bytes of the whole file
string wrkHtml = CharArrayToString(cArr, 0, WHOLE_ARRAY, CP_UTF8);   // decode the bytes as UTF-8
Print(fHTML+" Html-Size:"+(string)FileSize(fHdl)+" Array-Size: "+(string)ArraySize(cArr)+", read:"+(string)read+", String-Size: "+(string)StringLen(wrkHtml));
FileClose(fHdl);
See, the sizes of the file, the array, and the string are equal:
and the "€" remains "€":
€ 11,560,000 (EU)
...
¥ 806,380,000 (China)
...
¥ 5,130,000,000 (Japan)
...
₩ 1,010,000,000 (Korea)
It's a bit like “from behind through the chest into the eye” (a German saying for overcomplicated).
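The write side can go the same way round. The sketch below converts each CSV line to UTF-8 bytes and writes them through a BIN handle. It is not the code that was used for adding the table rows. WriteUtf8Line and the file name "rows.csv" are made up for illustration. Whether LO Calc then shows "€" correctly still depends on UTF-8 being selected as the character set when the file is imported.

```mql5
// Hypothetical helper: write one line as UTF-8 bytes to a file opened with FILE_BIN.
bool WriteUtf8Line(const int handle, const string line)
  {
   uchar bytes[];
   // StringToCharArray also copies the terminating zero, so n = byte count + 1
   int n = StringToCharArray(line + "\r\n", bytes, 0, WHOLE_ARRAY, CP_UTF8);
   if(n <= 1)
      return(false);
   uint written = FileWriteArray(handle, bytes, 0, n - 1);   // skip the trailing zero
   return(written == (uint)(n - 1));
  }

void OnStart()
  {
   // "rows.csv" is a placeholder; FILE_COMMON mirrors the flags used for reading
   int h = FileOpen("rows.csv", FILE_WRITE|FILE_BIN|FILE_COMMON);
   if(h == INVALID_HANDLE)
     {
      Print("FileOpen failed, error ", GetLastError());
      return;
     }
   WriteUtf8Line(h, "€ 11,560,000;EU");
   FileClose(h);
  }
```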