Problem with WebRequest().

 

Hello everyone,


First, thank you for taking the time to have a look at my problem.


To keep it short: I want to parse some data from a web page so I can use it in my EA further down the line, but for some reason I can't manage to parse it.


For now I am simply following the WebRequest() documentation, with one line added (the Print(CharArrayToString(result)) call, highlighted in yellow). I thought that would be enough to read an HTML file, but I've attached below what my journal shows when I print it. I do not understand why only part of the page is printed rather than the full HTML. I have been scratching my head for a while now, and I hope someone can give me a hand with it.


//+------------------------------------------------------------------+
//|                                                     Test_API.mq4 |
//|                        Copyright 2022, MetaQuotes Software Corp. |
//|                                             https://www.mql5.com |
//+------------------------------------------------------------------+
#property copyright "Copyright 2022, MetaQuotes Software Corp."
#property link      "https://www.mql5.com"
#property version   "1.00"
#property strict

//+------------------------------------------------------------------+
//| Expert initialization function                                   |
//+------------------------------------------------------------------+
int OnInit()
  {
   string cookie=NULL,headers;
   char post[],result[];
   int res;
//--- to enable access to the server, you should add URL "https://www.google.com/finance"
//--- in the list of allowed URLs (Main Menu->Tools->Options, tab "Expert Advisors"):
   string google_url="https://www.google.com/finance";
//--- Reset the last error code
   ResetLastError();
//--- Loading a html page from Google Finance
   int timeout=5000; //--- Timeout below 1000 (1 sec.) is not enough for slow Internet connection
   res=WebRequest("GET",google_url,cookie,NULL,timeout,post,0,result,headers);
//--- Checking errors
   if(res==-1)
     {
      Print("Error in WebRequest. Error code  =",GetLastError());
      //--- Perhaps the URL is not listed, display a message about the necessity to add the address
      MessageBox("Add the address '"+google_url+"' in the list of allowed URLs on tab 'Expert Advisors'","Error",MB_ICONINFORMATION);
     }
   else
     {
      //--- Load successfully
      PrintFormat("The file has been successfully loaded, File size =%d bytes.",ArraySize(result));
      //--- Print the downloaded page to the journal (the one line added to the documentation example)
      Print(CharArrayToString(result));
      //--- Save the data to a file
      int filehandle=FileOpen("GoogleFinance.htm",FILE_WRITE|FILE_BIN);
      //--- Checking errors
      if(filehandle!=INVALID_HANDLE)
        {
         //--- Save the contents of the result[] array to a file
         FileWriteArray(filehandle,result,0,ArraySize(result));
         //--- Close the file
         FileClose(filehandle);
        }
      else
         Print("Error in FileOpen. Error code=",GetLastError());
     }
   return(INIT_SUCCEEDED);
  }
//+------------------------------------------------------------------+
//| Expert deinitialization function                                 |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
//---

  }
//+------------------------------------------------------------------+
//| Expert tick function                                             |
//+------------------------------------------------------------------+
void OnTick()
  {
//---

  }
//+------------------------------------------------------------------+

This is what is printed in my journal:

2022.10.10 18:33:31.117 Test_API AUDUSD,H1: <!DOCTYPE html><html lang="en-US" dir="ltr"><head><style nonce="laaqg-E3XCYliyDkv19f-w">
a, a:link, a:visited, a:active, a:hover {
  color: #1a73e8;
  text-decoration: none;
}
body {
  font-family: Roboto,RobotoDraft,Helvetica,Arial,sans-serif;
  text-align: center;
  -ms-text-size-adjust: 100%;
  -moz-text-size-adjust: 100%;
  -webkit-text-size-adjust: 100%;
}
.box {
  border: 1px solid #dadce0;
  box-sizing: border-box;
  border-radius: 8px;
  margin: 24px auto 5px auto;
  max-width: 

It would be great if someone could help me understand what I am doing wrong here.


Thank you,

Valentin

 
I think it's OK. What you see in the terminal is the HTML code of the page you've requested. My guess is that "Print" is limited to a certain number of characters and you've reached the limit - to avoid printing an entire novel in the terminal. Instead of using Print, try writing the outcome of the request to a file, then check the content of the file once you've successfully completed the request... and let us know! :)
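
If the journal output really is being cut off, one way to look at the whole response without leaving the terminal is to print it in pieces. The script below is only a sketch: PrintInChunks is a made-up helper name, and the chunk size of 200 is an assumption for illustration, not a documented limit of Print().

```mql4
#property strict

//--- Hypothetical helper: print a long string to the journal in fixed-size pieces.
//--- chunk_size=200 is an assumed value, not a documented Print() limit.
void PrintInChunks(const string text,const int chunk_size=200)
  {
   if(chunk_size<=0)
      return;                                    // guard against an endless loop
   int total=StringLen(text);
   for(int pos=0; pos<total; pos+=chunk_size)
      Print(StringSubstr(text,pos,chunk_size));  // the last piece may be shorter
  }

void OnStart()
  {
   //--- build a 500-character sample string
   string sample="";
   for(int i=0; i<50; i++)
      sample+="0123456789";
   //--- 500 characters in pieces of 200 gives three Print() calls
   PrintInChunks(sample,200);
  }
```

In the EA from the first post, PrintInChunks(CharArrayToString(result)) could stand in for the single Print() call. Writing to a file, as suggested above, remains the simpler check.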
 
Carlos Moreno Gonzalez #:
I think it's OK. What you see in the terminal is the HTML code of the page you've requested. My guess is that "Print" is limited to a certain number of characters and you've reached the limit - to avoid printing an entire novel in the terminal. Instead of using Print, try writing the outcome of the request to a file, then check the content of the file once you've successfully completed the request... and let us know! :)

Hi there,


Thank you for your answer.

Well, I thought about that, to be fair, because the file works perfectly fine when I open it. All the information is in the file, so yes, it definitely writes it to the file!

So I reckon, like you said, that the journal output is limited to a certain number of characters. I am going to try another URL and check my journal to see if it prints the same number of characters. If it does, that would be the definite answer.


I'll get back to you, 

Thanks again

 
I just checked, and the number of characters is slightly different each time. Could it be limited by a number of bytes instead?
 
Mr Valentin Michel Draperi #:
I just checked, and the number of characters is slightly different each time. Could it be limited by a number of bytes instead?

That makes sense, too. I couldn't find anything about limitations to "Print", but I'm certain it must have a limit, whether it is a number of characters or bytes. Keep in mind that some special characters occupy more than one byte, i.e., they count as if they were 2-3 characters together. Also, I'm not sure whether an escape character counts as one or not. To be honest, I wouldn't worry that much - if the output file looks OK and complete, then that's it. You're not gonna be parsing the HTML file with "Print".

Give this Economic Calendar WebRequest URL a go, just to be sure. It should write a complete HTML with all economic news for the week to come:

http://calendar.fxstreet.com/EventDateWidget/GetMini?culture=en-US&view=range&start=20221010&end=20221017&timezone=UTC&columns=date%2Ctime%2Ccountry%2Ccountrycurrency%2Cevent%2Cconsensus%2Cprevious%2Cvolatility%2Cactual&showcountryname=false&showcurrencyname=true&isfree=true&_=1455009216444
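
To illustrate the point above about characters that take more than one byte, here is a small sketch comparing the size of a byte buffer with the length of the string decoded from it. The five bytes are a hand-written UTF-8 sample, and decoding with CP_UTF8 assumes the data really is UTF-8; for a real response that is an assumption to verify.

```mql4
#property strict

//--- Script sketch: byte count of a buffer versus character count of the decoded string.
void OnStart()
  {
   //--- "17" followed by the euro sign: three characters encoded as five UTF-8 bytes
   uchar bytes[]={0x31,0x37,0xE2,0x82,0xAC};
   //--- decode the buffer, telling the function that it holds UTF-8
   string decoded=CharArrayToString(bytes,0,WHOLE_ARRAY,CP_UTF8);
   PrintFormat("bytes=%d, characters=%d",ArraySize(bytes),StringLen(decoded));
  }
```

The same comparison can be made on the result[] array from WebRequest(): ArraySize(result) counts bytes, while StringLen() on the decoded string counts characters, so the two numbers need not match.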
 
Carlos Moreno Gonzalez #:

That makes sense, too. I couldn't find anything about limitations to "Print", but I'm certain it must have a limit, whether it is a number of characters or bytes. Keep in mind that some special characters occupy more than one byte, i.e., they count as if they were 2-3 characters together. Also, I'm not sure whether an escape character counts as one or not. To be honest, I wouldn't worry that much - if the output file looks OK and complete, then that's it. You're not gonna be parsing the HTML file with "Print".

Give this Economic Calendar WebRequest URL a go, just to be sure. It should write a complete HTML with all economic news for the week to come:

That's what I thought, to be fair.


I tried a bit more and found a function in the Code Base for parsing HTML elements, but unfortunately there is definitely something I am doing wrong, and I don't think it's due to the Print command.

The GetHTMLElement() call should print the content of that div tag, but my journal still shows exactly the same output that I posted in my first comment. I have no clue what is going on there.

//+------------------------------------------------------------------+
//|                                                     Test_API.mq4 |
//|                        Copyright 2022, MetaQuotes Software Corp. |
//|                                             https://www.mql5.com |
//+------------------------------------------------------------------+
#property copyright "Copyright 2022, MetaQuotes Software Corp."
#property link      "https://www.mql5.com"
#property version   "1.00"
#property strict

//+------------------------------------------------------------------+
//| Expert initialization function                                   |
//+------------------------------------------------------------------+
int OnInit()
  {
   string cookie=NULL,headers;
   char post[],result[];
   int res;

//--- to enable access to the server, you should add URL "https://www.google.com/finance"
//--- in the list of allowed URLs (Main Menu->Tools->Options, tab "Expert Advisors"):
   string google_url="https://www.google.com/finance";
//--- Reset the last error code
   ResetLastError();
//--- Loading a html page from Google Finance
   int timeout=5000; //--- Timeout below 1000 (1 sec.) is not enough for slow Internet connection
   res=WebRequest("GET",google_url,cookie,NULL,timeout,post,0,result,headers);
//--- Checking errors
   if(res==-1)
     {
      Print("Error in WebRequest. Error code  =",GetLastError());
      //--- Perhaps the URL is not listed, display a message about the necessity to add the address
      MessageBox("Add the address '"+google_url+"' in the list of allowed URLs on tab 'Expert Advisors'","Error",MB_ICONINFORMATION);
     }
   else
     {
      //--- Load successfully
      PrintFormat("The file has been successfully loaded, File size =%d bytes.",ArraySize(result));
      //--- Save the data to a file
      int filehandle=FileOpen("GoogleFinance.htm",FILE_WRITE|FILE_BIN);
      //--- Checking errors
      if(filehandle!=INVALID_HANDLE)
        {
         //--- Save the contents of the result[] array to a file
         FileWriteArray(filehandle,result,0,ArraySize(result));
         //--- Close the file
         FileClose(filehandle);
        }
      else
         Print("Error in FileOpen. Error code=",GetLastError());
      string HTML = CharArrayToString(result);
      GetHTMLElement(HTML,"<div class=\"YMlKec\">","</div>");
     }
   return(INIT_SUCCEEDED);
  }
//+------------------------------------------------------------------+
//| Expert deinitialization function                                 |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
//---

  }
//+------------------------------------------------------------------+
//| Expert tick function                                             |
//+------------------------------------------------------------------+
void OnTick()
  {
//---

  }
//+------------------------------------------------------------------+
string   GetHTMLElement(string HTML, string ElementStart, string ElementEnd)
  {
   string   data = NULL;

// Find start and end position for element
   int s = StringFind(HTML, ElementStart) + StringLen(ElementStart);
   int e = StringFind(StringSubstr(HTML, s), ElementEnd);

// Return element content
   if(e != 0)
      data = StringSubstr(HTML, s, e);
   Print(data);
   return(data);
  }
//+------------------------------------------------------------------+
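
One detail in GetHTMLElement() above is worth checking: StringFind() returns -1 when the substring is not found, and the function does not test for that. If the start marker is missing, s becomes StringLen(ElementStart)-1, so the function returns text from near the top of the page instead of an empty result, which may be why CSS from the page head shows up. The variant below is a minimal sketch under that reading; ExtractBetween is a hypothetical name, not a Code Base function.

```mql4
#property strict

//--- Hypothetical variant of GetHTMLElement() that reports a missing marker.
string ExtractBetween(const string html,const string start_marker,const string end_marker)
  {
   //--- StringFind() returns -1 when the substring is not found
   int start_pos=StringFind(html,start_marker);
   if(start_pos<0)
     {
      Print("Start marker not found: ",start_marker);
      return("");
     }
   //--- the content begins right after the opening marker
   int content_pos=start_pos+StringLen(start_marker);
   //--- search for the closing marker from the content position onwards
   int end_pos=StringFind(html,end_marker,content_pos);
   if(end_pos<0)
     {
      Print("End marker not found: ",end_marker);
      return("");
     }
   return(StringSubstr(html,content_pos,end_pos-content_pos));
  }

void OnStart()
  {
   //--- sample input built from the div quoted in this thread
   string sample="<body><div class=\"YMlKec\">17 125,29</div></body>";
   //--- marker present: the text between the tags is returned
   Print(ExtractBetween(sample,"<div class=\"YMlKec\">","</div>"));
   //--- marker absent: a log line is written and an empty string is returned
   Print(ExtractBetween(sample,"<div class=\"missing\">","</div>"));
  }
```

With the sample string, the first call returns 17 125,29. If the same call on the downloaded page reports a missing start marker, the div is simply not in the data that WebRequest() received.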

This is what should be printed (well, its content). I found it by inspecting the web page:

<div class="YMlKec">17 125,29</div>


I tried the link you sent me, but for some reason it doesn't work: it keeps asking me to add it to my EA's allowed URLs, even though I have already added it.
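
Note that the code above shows the "add the address" message box whenever WebRequest() returns -1, whatever the actual cause, so the message alone does not prove that the URL is missing from the list. A minimal sketch that logs the error code instead is shown below; it reuses the Google Finance URL from this thread, and any other URL can be substituted.

```mql4
#property strict

//--- Script sketch: report the actual error code when WebRequest() fails.
void OnStart()
  {
   string url="https://www.google.com/finance";  // URL from this thread
   string cookie=NULL,headers;
   char   post[],result[];
   ResetLastError();
   int res=WebRequest("GET",url,cookie,NULL,5000,post,0,result,headers);
   if(res==-1)
     {
      //--- store the code once so the same value can be reused
      int err=GetLastError();
      PrintFormat("WebRequest failed for %s, error code=%d",url,err);
      Print("Check this code against the MQL4 error list before assuming the URL is not allowed.");
      return;
     }
   //--- on success WebRequest() returns the HTTP status code
   PrintFormat("HTTP status=%d, body size=%d bytes",res,ArraySize(result));
  }
```

The logged code can then be compared with the error codes in the MQL4 documentation to tell a configuration problem from a network or server problem.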



Is there anything that could resolve this? :/

 
Please be aware that dynamic HTML content generated by client-side JavaScript can't be captured by WebRequest.
 
Soewono Effendi #:
Please be aware that dynamic HTML content generated by client-side JavaScript can't be captured by WebRequest.

Hello there,

Thank you for your answer, I did not know that. Is there anywhere in the documentation that you could point me to? I couldn’t find anything stating that.
I still don’t understand why it would instead print what looks like random CSS code.

Thanks
 
Mr Valentin Michel Draperi #:
Thank you for your answer, I did not know that. Is there anywhere in the documentation that you could point me to? I couldn’t find anything stating that.
I still don’t understand why it would instead print what looks like random CSS code. Thanks

That is not in the documentation, because that has nothing to do with MQL. It has to do with web protocols and how browsers work.

WebRequest is not a "browser". It is simply a function for handling part of the HTTP protocol. Nothing else.

Do some research on the HTTP protocol.
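
A quick way to see what the server actually sent, as opposed to what a browser shows after running scripts, is to log the status code and headers and search the raw body for the marker. This is a hedged sketch: the URL and the class name YMlKec come from this thread, and decoding with CP_UTF8 assumes the page is served as UTF-8.

```mql4
#property strict

//--- Script sketch: check whether a marker exists in the raw HTTP response.
void OnStart()
  {
   string url="https://www.google.com/finance";  // URL from this thread
   string marker="YMlKec";                       // class name from this thread
   string cookie=NULL,headers;
   char   post[],result[];
   ResetLastError();
   int res=WebRequest("GET",url,cookie,NULL,5000,post,0,result,headers);
   if(res==-1)
     {
      Print("WebRequest failed, error code=",GetLastError());
      return;
     }
   //--- on success WebRequest() returns the HTTP status code
   PrintFormat("HTTP status=%d, body size=%d bytes",res,ArraySize(result));
   Print("Response headers: ",headers);
   //--- assumption: the body is UTF-8 encoded
   string html=CharArrayToString(result,0,WHOLE_ARRAY,CP_UTF8);
   if(StringFind(html,marker)>=0)
      Print("Marker '",marker,"' is present in the raw HTML.");
   else
      Print("Marker '",marker,"' is absent: the element may be added later by JavaScript, or the server sent a different page.");
  }
```

If the marker is absent from the raw response, no amount of string parsing in the EA will find it, because WebRequest() only receives what the server returns to that single HTTP request.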

 
Fernando Carreiro #:

That is not in the documentation, because that has nothing to do with MQL. It has to do with web protocols and how browsers work.

WebRequest is not a "browser". It is simply a function for handling part of the HTTP protocol. Nothing else.

Do some research on the HTTP protocol.

Hi Fernando,

Thank you for your answer. I’ll definitely look into the HTTP protocol.

My issue remains, though: if I save the file as an HTM document I can view it fine, but as soon as I try to extract anything from it I run into the same problem.

I’ll definitely do more research on WebRequest, though.

Thanks for your time.