Discussing the article: "Statistical Arbitrage Through Cointegrated Stocks (Part 3): Database Setup"

 

Check out the new article: Statistical Arbitrage Through Cointegrated Stocks (Part 3): Database Setup.

This article presents a sample MQL5 Service implementation for updating a newly created database that serves as a data source for analysis and for trading a basket of cointegrated stocks. The rationale behind the database design is explained in detail, and the data dictionary is documented for reference. MQL5 and Python scripts are provided for database creation, schema initialization, and market data insertion.

Our Expert Advisor must know in real time whether the portfolio weights that we’ve been using still apply or have changed. If they have changed, the EA must be informed of the new portfolio weights as quickly as possible. Our EA must also know whether the model itself remains valid. If it does not, the EA should be informed which assets must be replaced, and the rotation must be applied to the active portfolio as soon as possible.

We have been using the MetaTrader 5 Python integration and the professionally developed statistical functions from the statsmodels library, but until now we have been working with real-time data only, downloading the quotes (the price data) as we need them. This approach is useful in the exploratory phase because of its simplicity. But if we are going to rotate our portfolio, update our models, or update the portfolio weights, we need to start thinking about data persistence, that is, about storing our data in a database, because it is not practical to download the data every time we need it. More than that, we may need to look for relationships among different asset classes and among symbols that were not part of our first cointegration tests.
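To make the idea of data persistence concrete, here is a minimal sketch in Python using only the standard-library sqlite3 module. The table name `quote`, its columns, and the helper functions `open_db`, `store_bars`, and `load_closes` are hypothetical names chosen for illustration; they are not the schema or the scripts described in the article. The bars are assumed to arrive as plain tuples of (time, open, high, low, close, volume), however they were downloaded.

```python
import sqlite3

# Hypothetical table layout for illustration only; NOT the article's schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS quote (
    symbol    TEXT    NOT NULL,
    timeframe TEXT    NOT NULL,
    ts        INTEGER NOT NULL,  -- bar open time, Unix seconds
    open      REAL,
    high      REAL,
    low       REAL,
    close     REAL,
    volume    INTEGER,
    PRIMARY KEY (symbol, timeframe, ts)
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_bars(conn, symbol, timeframe, bars):
    """Insert bars given as (ts, open, high, low, close, volume) tuples.

    Bars already stored for the same symbol, timeframe, and time are
    skipped, so the same download can be inserted again without duplicates.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO quote VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            [(symbol, timeframe, *bar) for bar in bars],
        )


def load_closes(conn, symbol, timeframe):
    """Return (ts, close) pairs ordered by time, read from the database."""
    cur = conn.execute(
        "SELECT ts, close FROM quote "
        "WHERE symbol = ? AND timeframe = ? ORDER BY ts",
        (symbol, timeframe),
    )
    return cur.fetchall()
```

With something like this in place, an analysis script would call `store_bars` once per download and then read closing prices back with `load_closes` whenever a model needs to be refitted, instead of requesting the same history from the terminal again. Passing a file path to `open_db` rather than the default in-memory database is what makes the data persist between runs.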

A high-quality, scalable, and metadata-rich database is the core of any serious statistical arbitrage endeavour. Database design is a highly idiosyncratic task, in the sense that a good database is one that fits the requirements of the business it serves. In this article, we will see one possible approach to building a database oriented toward statistical arbitrage.

Author: Jocimar Lopes