Feature request: UTF-8 source file support - page 2

 
Alain Verleyen #:

What makes you think MetaEditor is using a BOM for UTF-8? I just checked (Build 2402 for MT4) and there is no BOM.

Out of curiosity, what is your use case for sed/grep on MQL source code?

Well, I have an mq4 file with a couple of non-ASCII characters: a copyright mark in a #property statement and a non-breaking space in a sinput statement (to get a completely blank line in the input grid; thanks to William Roeder for this great suggestion). When I do a Save As and select UTF-8 encoding (using MetaEditor 2403), I get an mq4 file whose first three bytes are EF BB BF. The copyright mark is entered as Alt-00A9 and appears in the mq4 file as C2 A9, and the non-breaking space is entered as Alt-00A0 and appears in the file as C2 A0.

I wonder how you got a UTF-8 file without a BOM? Do you have any Unicode characters in your file?
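
For anyone wanting to check their own files: the UTF-8 byte order mark is the three-byte sequence EF BB BF, and the two characters discussed above encode to C2 A9 and C2 A0. A minimal Python sketch of that check follows; the function name `has_utf8_bom` and the file name in the usage note are made up for illustration.

```python
import codecs

# UTF-8 encodings of the two characters discussed above.
COPYRIGHT_UTF8 = "\u00a9".encode("utf-8")   # copyright mark
NBSP_UTF8 = "\u00a0".encode("utf-8")        # non-breaking space


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex, e.g. 'C2 A9'."""
    return " ".join(f"{b:02X}" for b in data)
```

To inspect a real file, something like `has_utf8_bom(open("MyIndicator.mq4", "rb").read(3))` should do (the file name is a placeholder).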

Regarding the build process - I am helping someone with some legacy code that consists of about 10 indicators and a similar number of scripts that all work together. I am doing the recoding to fix Y2038 limits and also to fix some post-build-600 issues. I have added a version number suffix to the mq4 and mqh files and need to be able to build ex4s with this suffixed name for the alpha testers (so they can compare old and new), or as the final un-suffixed ex4 files for eventual release. All the components have to be packaged up in a zip file with appropriate paths. I use a combination of Python scripts, bat files, sed.exe and grep.exe to do all of that. If I wasn't such a dinosaur I would probably know far more modern ways to get the same outcomes. :-)

 
LawrenceIpsum #:

Well, I have an mq4 file with a couple of non-ASCII characters: a copyright mark in a #property statement and a non-breaking space in a sinput statement (to get a completely blank line in the input grid; thanks to William Roeder for this great suggestion). When I do a Save As and select UTF-8 encoding (using MetaEditor 2403), I get an mq4 file whose first three bytes are EF BB BF. The copyright mark is entered as Alt-00A9 and appears in the mq4 file as C2 A9, and the non-breaking space is entered as Alt-00A0 and appears in the file as C2 A0.

I wonder how you got a UTF-8 file without a BOM? Do you have any Unicode characters in your file?

I just took the Zigzag indicator and saved it as UTF-8, no BOM (attached).

But you are right: if I take one of my own files, where I am also using non-ASCII characters, then it's saved with a BOM. Interesting to know.

Regarding the build process - I am helping someone with some legacy code that consists of about 10 indicators and a similar number of scripts that all work together. I am doing the recoding to fix Y2038 limits and also to fix some post-build-600 issues. I have added a version number suffix to the mq4 and mqh files and need to be able to build ex4s with this suffixed name for the alpha testers (so they can compare old and new), or as the final un-suffixed ex4 files for eventual release. All the components have to be packaged up in a zip file with appropriate paths. I use a combination of Python scripts, bat files, sed.exe and grep.exe to do all of that. If I wasn't such a dinosaur I would probably know far more modern ways to get the same outcomes. :-)

I see. Thanks.

See the link for a way to deal with the BOM when using sed.

How can I remove the BOM from a UTF-8 file?
  • 2017.07.23
  • unix.stackexchange.com
I have a file in UTF-8 encoding with BOM and want to remove the BOM. Are there any linux command-line tools to remove the BOM from the file?
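
Since the build process described above already uses Python scripts, the BOM could also be removed there before the files reach sed or grep. This is only a sketch under that assumption: the helper names `strip_utf8_bom` and `strip_bom_in_place` are hypothetical, and nothing in this thread establishes how the MetaEditor compiler treats a UTF-8 file once its BOM has been removed, so it is probably safest to apply this to working copies only.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path) -> bool:
    """Rewrite the file without its BOM. Return True if the file was changed."""
    p = Path(path)
    raw = p.read_bytes()
    stripped = strip_utf8_bom(raw)
    if len(stripped) == len(raw):
        return False  # no BOM found, file left untouched
    p.write_bytes(stripped)
    return True
```

Working on raw bytes rather than decoded text leaves line endings and every other character in the file exactly as they were.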
 
Alain Verleyen #:

I just took the Zigzag indicator and saved it as UTF-8, no BOM (attached).

But you are right: if I take one of my own files, where I am also using non-ASCII characters, then it's saved with a BOM. Interesting to know.

I see. Thanks.

See the link for a way to deal with the BOM when using sed.

Thanks Alain!  
