Reading RSS News Feeds by Means of MQL4

vgs | 26 June, 2013


Introduction

This article shows an example of reading RSS markup by means of MQL4 using the functions from the article HTML Walkthrough Using MQL4. It is assumed that the reader has read that article or has at least a general understanding of the ideas described there.


What Is RSS and Why Do We Need It in MQL4?

RSS is an XML format for transferring various data from one source to another.

RSS is actively used by news agencies, companies, as well as various news web sites.

RSS can be aggregated (or read) by a variety of special applications (readers) and delivered to users in a convenient form. In this article, we will build a template that can later be turned into a news indicator or a simple RSS reader written in MQL4. What kind of information in RSS are we interested in? The news, of course.

As mentioned above, RSS is an XML document. So, what is XML?

XML (eXtensible Markup Language) is a text format for storing structured data. The structure can be visually represented as a tree of elements. XML elements are delimited by tags.

Below is an example of a simple XML document:

<?xml version="1.0" encoding="windows-1252"?>
<weeklyevents>
        <event>
                <title>Rightmove HPI m/m</title>
                <country>GBP</country>
                <date><![CDATA[05-15-2011]]></date>
                <time><![CDATA[23:01]]></time>
                <impact><![CDATA[Medium]]></impact>
                <forecast />
                <previous><![CDATA[1.7%]]></previous>
        </event>
</weeklyevents>


Implementation

As we can see from the above example, XML is somewhat similar to HTML. Therefore, in order not to "reinvent the wheel", we will use the code from the article HTML Walkthrough Using MQL4.

The first thing we need to do is connect the HTML walkthrough functions to our project (indicator). To do this, download the ReportHTMLtoCSV-2.mq4 file and put it in the experts/include folder. Since we are going to use the file as a function library, the start() function in it should be commented out.

I would also suggest renaming the file (for example, to HTMLTagsLib.mq4) for clarity.

The file is ready. Now, connect it to the indicator (the template file for the indicator is attached below):

#include <htmltagslib.mq4>

Now we need to import functions from wininet.dll, a standard Windows library, to work with URLs:

#include <winuser32.mqh>
#import "wininet.dll"
  int InternetAttemptConnect(int x);
  int InternetOpenA(string sAgent, int lAccessType, 
                    string sProxyName = "", string sProxyBypass = "", 
                    int lFlags = 0);
  int InternetOpenUrlA(int hInternetSession, string sUrl, 
                       string sHeaders = "", int lHeadersLength = 0,
                       int lFlags = 0, int lContext = 0);
  int InternetReadFile(int hFile, int& sBuffer[], int lNumBytesToRead, 
                       int& lNumberOfBytesRead[]);
  int InternetCloseHandle(int hInet);
#import

We will use the ReadWebResource(string url) function to read a URL. How the function works internally is outside the scope of this article, so we will not dwell on it.

We are only interested in the input and output arguments. The function receives a link to be read and returns the resource content as a string.
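For readers who want to see what such a function might look like, below is a minimal sketch, not the implementation attached to the article. It relies only on the wininet.dll imports declared above and assumes a terminal build of the article's time, where MQL4 strings are single-byte. The agent string "MQL4 RSS reader" and all local variable names are chosen for illustration.

```mql4
// Sketch of ReadWebResource(); requires the wininet.dll #import block shown above.
string ReadWebResource(string url)
{
   int    buffer[256];    // 256 ints = 1024 bytes of raw data per read
   int    bytesRead[1];   // filled by InternetReadFile()
   int    i, word;
   string chunk, result = "";

   if (!IsDllsAllowed())
   {
      Print("DLL calls must be allowed in the terminal settings");
      return("");
   }
   if (InternetAttemptConnect(0) != 0)
   {
      Print("InternetAttemptConnect() failed");
      return("");
   }
   // the agent string is an arbitrary label used by this sketch
   int hSession = InternetOpenA("MQL4 RSS reader", 0, "", "", 0);
   if (hSession == 0)
   {
      Print("InternetOpenA() failed");
      return("");
   }
   int hUrl = InternetOpenUrlA(hSession, url, "", 0, 0, 0);
   if (hUrl == 0)
   {
      Print("InternetOpenUrlA() failed");
      InternetCloseHandle(hSession);
      return("");
   }
   while (!IsStopped())
   {
      bytesRead[0] = 0;
      if (InternetReadFile(hUrl, buffer, 1024, bytesRead) == 0) break; // read error
      if (bytesRead[0] == 0) break;                                    // end of data
      chunk = "";
      // each int holds four bytes; unpack them starting from the lowest byte
      for (i = 0; i < bytesRead[0]; i++)
      {
         word  = buffer[i / 4];
         chunk = chunk + CharToStr((word >> (8 * (i % 4))) & 255);
      }
      result = result + chunk;
   }
   InternetCloseHandle(hUrl);
   InternetCloseHandle(hSession);
   return(result);
}
```

The sketch returns an empty string on any failure, so the caller can check the result before passing it on for parsing.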

In order to analyze the tags, we will use two functions from the HTMLTagsLib.mq4 file: FillTagStructure() and GetContent(). These functions are described in detail in the article HTML Walkthrough Using MQL4. Note that the input data for analysis is passed as an array. Therefore, after the data has been received using the ReadWebResource(string url) function, it should be converted into an array.

The ArrayFromString() function will help us with that:

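//+------------------------------------------------------------------+
//| Splits inputStr into webResData[] by a one-character divider.    |
//| Each element keeps its trailing divider.                         |
//| Returns the number of elements written.                          |
//+------------------------------------------------------------------+
int ArrayFromString(string & webResData[], string inputStr, string divider) 
{   
   if (inputStr == "") 
   {
      Print("Input string is not set"); 
      return(0);
   }
   if (divider == "") 
   {
      Print("Separator is not set"); 
      return(0);
   }
   int i, stringCounter = 0;
   int maxStrings = 64000;            // capacity of the temporary array
   string tmpChar, tmpString = "", tmpArr[64000];   
   int inputStringLen = StringLen(inputStr);
   for (i = 0; i < inputStringLen; i++) 
   {
      tmpChar = StringSubstr(inputStr, i, 1);
      tmpString = tmpString + tmpChar;
      if (tmpChar == divider) 
      {
         // a complete fragment ends with the divider: store it
         tmpArr[stringCounter] = tmpString;
         stringCounter++;
         tmpString = "";
         if (stringCounter >= maxStrings) 
         {
            Print("Temporary array is full, the rest of the input is skipped");
            break;
         }
      }               
   }
   // text after the last divider (if any) is not copied to the result
   if (stringCounter > 0) 
   {
      ArrayResize(webResData, stringCounter);   
      for (i = 0; i < stringCounter; i++) webResData[i] = tmpArr[i];
   }
   return (stringCounter);
}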

The function takes three arguments. The first one is a reference to the array where the result is stored, the second one is the string to be converted into an array, and the third one is the one-character separator by which the string is divided. Each element of the resulting array ends with the separator, and any text after the last separator is discarded. The function returns the number of elements in the resulting array.
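To make the splitting rule concrete, here is a small usage sketch. It assumes ArrayFromString() from the listing above is defined in the same file; the input string is a shortened fragment of the XML example, and the script body is written only for illustration.

```mql4
// Usage sketch: split a short XML fragment by the ">" character.
int start()
{
   string parts[];
   string sample = "<event><title>Rightmove HPI m/m</title></event>";
   int count = ArrayFromString(parts, sample, ">");

   // count is 4 and the elements are:
   //   parts[0] = "<event>"
   //   parts[1] = "<title>"
   //   parts[2] = "Rightmove HPI m/m</title>"
   //   parts[3] = "</event>"
   for (int i = 0; i < count; i++) Print(i, ": ", parts[i]);
   return(0);
}
```

Splitting by ">" means that every element ends at a tag boundary, so the text of an element and its closing tag land in the same array element.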

Now our data is ready for analysis.

In the next fragment, we analyze data and display the values of title and country tags in the terminal's console:

   string webRss = ReadWebResource(rssUrl);
   int i, stringsCount = ArrayFromString(webResData, webRss, ">");      
            
   string tags[];    // array for storing the tags
   int startPos[][2];// tag start coordinates
   int endPos[][2];  // tag end coordinates
   
   FillTagStructure(tags, startPos, endPos, webResData);
   int tagsNumber = ArraySize(tags);
   
   string text = "";
   string currTag;
   int start[1][2];
   int end[1][2];
  
   for (i = 0; i < tagsNumber; i++)
      {
      currTag = tags[i];     

      if (currTag == "<weeklyevents>")
         {
            Print("News block start;");
         }

      if (currTag == "<event>")
         {
            text = "";
            start[0][0] = -1;
            start[0][1] = -1;
         }

      if (currTag == "<title>")
         {// coordinates of the initial position for selecting the content between the tags
            start[0][0] = endPos[i][0];
            start[0][1] = endPos[i][1];
         }
                 
      if (currTag == "</title>")
         {// coordinates of the end position for selecting the contents between the tags
            end[0][0] = startPos[i][0];
            end[0][1] = startPos[i][1];
            text = text + GetContent(webResData, start, end) + ";";
         }

      if (currTag == "<country>")
         {// coordinates of the initial position for selecting the content between the tags
            start[0][0] = endPos[i][0];
            start[0][1] = endPos[i][1];
         }
                       
      if (currTag == "</country>")
         {// coordinates of the end position for selecting the contents between the tags
            end[0][0] = startPos[i][0];
            end[0][1] = startPos[i][1];
            text = text + GetContent(webResData, start, end) + " ;";
         }

      if (currTag == "</event>")
         {
            Print(text);
         }

      if (currTag == "</weeklyevents>")
         {
            Print("end of the news;");
         }

      }

The FillTagStructure() function gives us the list of tags together with their start and end positions, while the GetContent() function returns the content located between two positions.
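The coordinate copying repeated for each tag pair in the fragment above could be wrapped in a small helper. The sketch below is hypothetical and not part of HTMLTagsLib.mq4: the name ContentBetween() and its parameters are introduced here for illustration, and it assumes GetContent() accepts the same three arguments as in the fragment above.

```mql4
// Requires GetContent() from HTMLTagsLib.mq4, included as shown earlier.
// Hypothetical helper: returns the content between two positions.
// from0/from1 and to0/to1 are the two coordinates of each position,
// in the same order as they are stored in startPos[][2] and endPos[][2].
string ContentBetween(string & data[], int from0, int from1, int to0, int to1)
{
   int from[1][2];
   int to[1][2];
   from[0][0] = from0;
   from[0][1] = from1;
   to[0][0]   = to0;
   to[0][1]   = to1;
   return(GetContent(data, from, to));
}

// Possible use inside the loop over the tags (openIdx is an int declared
// before the loop that remembers the index of the opening tag):
//
//    if (currTag == "<title>")  openIdx = i;
//    if (currTag == "</title>")
//       text = text + ContentBetween(webResData,
//                                    endPos[openIdx][0], endPos[openIdx][1],
//                                    startPos[i][0], startPos[i][1]) + ";";
```

With such a helper, reading one more element (for example, country) only requires remembering the index of its opening tag.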

Script operation results:

Fig. 1. NewsRss script operation results

In the results, we can see the news title and the currency symbol of the country the news is related to.


Conclusions

We have examined a way of reading RSS by means of MQL4 using the functions for HTML tag analysis. The drawbacks of this method are described in detail in the article HTML Walkthrough Using MQL4. I would add one more: these functions are less convenient to use in code than standard libraries designed for reading XML.

Now that the article and the script have been completed, I am going to look into connecting an external library for working with XML. As for the advantages of the method described here, I would name implementation speed as one of them.