Discussing the article: "Statistical Arbitrage Through Cointegrated Stocks (Part 5): Screening"

 

Check out the new article: Statistical Arbitrage Through Cointegrated Stocks (Part 5): Screening.

This article proposes an asset screening process for a statistical arbitrage trading strategy based on cointegrated stocks. The process starts with the usual filtering by economic factors, such as sector and industry, and ends with a list of criteria for a scoring system. A dedicated Python class was developed for each statistical test used in the screening: Pearson correlation, Engle-Granger cointegration, Johansen cointegration, and ADF/KPSS stationarity. These Python classes are provided along with a personal note from the author about the use of AI assistants for software development.

Our strategy is a mean-reversion one, a form of statistical arbitrage that takes Nasdaq stocks cointegrated with Nvidia and buys and sells them simultaneously according to their portfolio weights, seeking market neutrality. The candidate universe is every Nasdaq-listed stock. Each candidate is evaluated against the following criteria:

  1. The cointegration strength as indicated by the Engle-Granger and the Johansen tests
  2. The stability of the portfolio weights
  3. The quality of the spread stationarity, as indicated by the ADF and the KPSS tests
  4. The asset liquidity
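One way the four criteria above could be folded into a single number is sketched below. This is only an illustration: the `CandidateMetrics` fields, the equal weighting, and the 0-to-1 normalisation are assumptions made for the example, not the scoring rules used in the article.

```python
from dataclasses import dataclass


@dataclass
class CandidateMetrics:
    """Hypothetical per-basket inputs; field names are illustrative only."""
    eg_pvalue: float         # Engle-Granger p-value (lower = stronger evidence)
    johansen_rank: int       # cointegrating vectors reported by the Johansen test
    weight_stability: float  # assumed already scaled to 0..1, higher = more stable
    adf_pvalue: float        # ADF p-value on the spread (lower = more stationary)
    kpss_stationary: bool    # True if KPSS does not reject stationarity
    liquidity: float         # assumed already scaled to 0..1, higher = more liquid


def score(m: CandidateMetrics) -> float:
    """Average the four criteria with equal weights; result lies in [0, 1]."""
    cointegration = 0.5 * (1.0 - m.eg_pvalue) + 0.5 * (1.0 if m.johansen_rank > 0 else 0.0)
    stationarity = 0.5 * (1.0 - m.adf_pvalue) + 0.5 * (1.0 if m.kpss_stationary else 0.0)
    parts = [cointegration, m.weight_stability, stationarity, m.liquidity]
    return sum(parts) / len(parts)


# Made-up numbers, purely to show that a stronger candidate ranks higher.
strong = CandidateMetrics(0.01, 1, 0.9, 0.01, True, 0.8)
weak = CandidateMetrics(0.60, 0, 0.3, 0.50, False, 0.2)
print(score(strong) > score(weak))
```

Equal weights keep the sketch simple; in practice each criterion would likely get its own weight and cut-off.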

These four criteria are enough to build a scoring system. The scoring system should give us a trading edge by identifying groups of stocks whose prices move together in a stable and predictable way over the long run. We need it because not every pair or group serves our purposes. With hundreds or thousands of stocks, the number of potential pairs and baskets explodes if we simply test every possible combination. We must not forget that we are developing this statistical arbitrage framework for the average retail trader, with a consumer laptop and ordinary network bandwidth, so exhaustive testing becomes computationally expensive from the very start. A scoring system also keeps us from risking money on pairs or groups that are correlated only in the short run, or that are not tradable because of low liquidity or high transaction costs.
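To make the combinatorial growth concrete, the short sketch below counts the baskets an exhaustive search would have to test. The universe size of 3000 is an assumed round number for illustration, not a figure from the article, and `count_baskets` is a hypothetical helper.

```python
from math import comb


def count_baskets(n_stocks: int, basket_size: int, anchored: bool = False) -> int:
    """Number of distinct baskets of `basket_size` drawn from `n_stocks`.

    If `anchored` is True, one stock (for example Nvidia) is fixed in every
    basket, so only the remaining basket_size - 1 slots are chosen freely.
    """
    if anchored:
        return comb(n_stocks - 1, basket_size - 1)
    return comb(n_stocks, basket_size)


# Assumed universe of 3000 stocks; prints free and Nvidia-anchored counts.
for size in (2, 3, 4):
    print(size, count_baskets(3000, size), count_baskets(3000, size, anchored=True))
```

Even for plain pairs, 3000 stocks give 4,498,500 combinations. Anchoring every basket on one stock shrinks the search considerably, but larger baskets still grow quickly, which is why filtering before testing matters.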

Perhaps the best analogy for a screening process with a scoring system is a funnel: we start broad and gradually eliminate unsuitable candidates, first filtering by sector and industry similarity, then scoring by correlation, cointegration, and stationarity. Finally, we eliminate those that are not tradable under our risk and money management rules. This process increases the likelihood of finding baskets that are not only statistically significant but also economically meaningful and tradable.
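The funnel can be sketched as a chain of successive filters. Everything here is assumed for illustration: the `Candidate` fields, the thresholds, and the placeholder symbols are hypothetical, and the statistical score is taken as already computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    symbol: str              # placeholder ticker, not a real listing
    sector: str
    score: float             # precomputed statistical score, assumed in 0..1
    avg_daily_volume: float  # simple liquidity proxy


def funnel(candidates, target_sector, min_score, min_volume):
    """Apply the three funnel stages in order and rank the survivors."""
    same_sector = [c for c in candidates if c.sector == target_sector]
    well_scored = [c for c in same_sector if c.score >= min_score]
    tradable = [c for c in well_scored if c.avg_daily_volume >= min_volume]
    return sorted(tradable, key=lambda c: c.score, reverse=True)


universe = [
    Candidate("AAA", "semiconductors", 0.91, 5_000_000),
    Candidate("BBB", "semiconductors", 0.40, 9_000_000),  # fails the score stage
    Candidate("CCC", "retail", 0.95, 8_000_000),          # fails the sector stage
    Candidate("DDD", "semiconductors", 0.85, 10_000),     # fails the liquidity stage
    Candidate("EEE", "semiconductors", 0.97, 2_000_000),
]
survivors = funnel(universe, "semiconductors", min_score=0.8, min_volume=1_000_000)
print([c.symbol for c in survivors])
```

Running the cheap sector filter first means the more expensive statistical tests only need to be computed for candidates that survive it.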

Author: Jocimar Lopes