Market prediction based on macroeconomic indicators - page 50

 
Дмитрий:

It's so hard to tell by eye at this scale.

I used to run a multifactor model, but its accuracy was lower than that of publicly available forecasts.

Take just one indicator like HOUST1F or PRFI and already your model is more accurate than publicly available forecasts. Add a couple of consumer indicators and the yield curve and you have a super model.

Below is a graph of GDP growth and the S&P500 since 1959. You can't deny that the S&P500 has fallen during negative GDP growth (recessions):

 

I have a lot of "super models" myself. Only, for some reason, the forward test shows worse prediction accuracy than the published forecasts.

 
Vladimir:

Take just one indicator like HOUST1F or PRFI and already your model will be more accurate than publicly available forecasts. Add a couple of consumer indicators and the yield curve and you have a super model.

Below is a graph of GDP growth and the S&P500 since 1959. You can't deny that the S&P500 fell during negative GDP growth (recessions):


You are responding selectively to my posts somehow.

Naturally, the graph you posted is like a tram. The tram moves and all the passengers move; it stops and they all stop. The only connection between the passengers is that they are on the same tram, which in our case is the same economy.

The index and GDP are derived from the economic situation in the country and there is no connection between them.

The crisis of 2008 was a real estate crisis, and the GDP figures, the indices and a lot of other numbers are derived from that crisis. The index does not follow from GDP and GDP does not follow from the index - they are in sync at best (and not always - you can see it in the chart you brought up earlier).

There are processes in the economy which determine its future movement and a bunch of indicators will reflect that movement.

What are the underlying movements in the US economy today?

Personally, I side with those who believe that the main problem with the US economy is the zero rate. The entire social sector (insurance and pension funds) profited from investing in government securities. At zero rates, these organisations do not make the profits they need. If organisations of this type start going bankrupt, it will be a different level of problem; these are not dotcoms. By the way, GDP and all the indices will then go in the same direction: vertically downwards.

 
СанСаныч Фоменко:

The index and GDP are derived from the economic situation in the country and there is no connection between them.

The crisis of 2008 was a real estate crisis, and the GDP and index figures and a lot of other figures on the same tram are derived from that crisis. The index does not follow from GDP and GDP does not follow from the index - at best (and not always - you can see it in the chart you brought up earlier) they show the same picture.

There are processes in the economy which determine its future movement and a bunch of indicators will reflect that movement.

What are the underlying movements in the US economy today?

Personally, I side with those who believe that the main problem with the US economy is the zero rate. The entire social sector (insurance and pension funds) profited from investing in government securities. At zero rates, these organisations do not make the profits they need. If organisations of this type start going bankrupt, it will be a different level of problem; these are not dotcoms. By the way, GDP and all the indices will then go in the same direction: vertically downwards.

I agree with all that has been said. I tried to find where I said the market index was falling because of a drop in GDP and couldn't find it. Both falls reflect the state of the economy, as you correctly said. A fall in the market index is difficult to predict; a fall in GDP is much easier to predict. Since the declines in the index and GDP occur in sync (you wrote it yourself, although from my observations the index starts falling 1 quarter before GDP), one can predict a decline in the index by predicting a decline in GDP, which is what I am doing here. Housing starts begin falling much earlier than the market and GDP. So as a last resort, if I fail to create a good S&P500 and GDP model, I will just watch HOUST and house prices and exit the market when they fall. So far no such fall has been observed. When HOUST reaches 1.6-1.7M, I will watch carefully. In the past, recessions started when HOUST fell below 1.2-1.3M.

https://research.stlouisfed.org/fred2/series/HOUST

As for what the problem with the economy is today, I think private debt continues to be a big problem. Banks are still lending to people who can't repay. In the US, banks are going nuts and giving discounts on purchases made with credit cards. Every brand-name shop has a credit card: Walmart, Target, Macy's, Starbucks, and hundreds of others. In China, private debt to GDP has reached even higher levels than in the USA before the recession. Perhaps China will be the cause of the next recession after all.

 
Vladimir:

As for what the problem with the economy is today, I think private debt continues to be a big problem. Banks are still lending to people who can't repay. In the US, banks are going nuts and giving discounts on purchases made with credit cards. Every brand-name shop has a credit card: Walmart, Target, Macy's, Starbucks, and hundreds of others. In China, private debt to GDP has reached even higher levels than in the USA before the recession. Perhaps China will be the cause of the next recession after all.

China's GDP is mentioned in yesterday's "Interesting and Humorous" thread. According to the "General Theory of Everything", the GDP growth of China is 2/3 proportional to time (t-t0).


 

Let's start going through the predictors step by step. First, let's transform all the data as described above by normalising the absolute increments of their mean. Then run through the entire history and see the prediction errors of the transformed GDP using linear regression. Here's a list of the first 10 predictors sorted by increasing prediction error:

'Series' 'Delay' 'Error' 'Corr Coeff' 'Mutual info' 'Description'
'A012RC1Q027SBEA' 1 0.785084491 0.521239874 0.207278508 'Private fixed investment: Residential: Structures'
'PRFI' 1 0.785370338 0.52030199 0.205244075 'Private Residential Fixed Investment'
'A756RC1Q027SBEA' 1 0.788998988 0.513150108 0.203337794 'Private fixed investment in new structures: Residential structures'
'DFDHRA3Q086SBEA' 1 0.792817832 0.509246158 0.238935402 'Real personal consumption expenditures: Durable goods: Furnishings and durable household equipment (chain-type quantity index)'
'W988RC1Q027SBEA' 1 0.792819625 0.512427741 0.209527444 'Gross private domestic investment: Households and institutions'
'A713RX1Q020SBEA' 1 0.79292839 0.511152419 0.227008161 'Real final sales to domestic purchasers'
'B713RA3Q086SBEA' 1 0.792933677 0.511052828 0.227015597 'Real final sales to domestic purchasers (chain-type quantity index)'
'W791RC1Q027SBEA' 1 0.795610445 0.509720881 0.220612324 'Net domestic investment: Private: Households and institutions'
'A943RC1Q027SBEA' 1 0.799721554 0.493581939 0.198662644 'Private fixed investment: Residential: Structures: Permanent site'
'A011RE1Q156NBEA' 1 0.802124995 0.476308607 0.198071775 'Shares of gross domestic product: Gross private domestic investment: Fixed investment: Residential'
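A ranking of this kind could be produced roughly as follows. This is only a sketch under my own assumptions, not the code behind the table: the function names are hypothetical, the error is an in-sample residual RMS relative to the RMS of the centred target (the table's error is evidently computed differently), and mutual information is estimated with a crude 2-D histogram.

```python
import numpy as np

def mutual_info_hist(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between two series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def score_predictor(y, x, delay):
    """Fit y[t] = a + b * x[t - delay] by least squares and score the fit."""
    y_t, x_lag = y[delay:], x[:len(x) - delay]
    b, a = np.polyfit(x_lag, y_t, 1)
    resid = y_t - (a + b * x_lag)
    # Assumed error measure: residual RMS relative to RMS of the centred target.
    error = np.sqrt(np.mean(resid ** 2) / np.mean((y_t - y_t.mean()) ** 2))
    corr = np.corrcoef(x_lag, y_t)[0, 1]
    return error, corr, mutual_info_hist(x_lag, y_t)

def rank_predictors(y, candidates, delays=(1, 2, 3)):
    """Rows of (name, delay, error, corr, mutual_info), sorted by increasing error."""
    rows = [(name, d, *score_predictor(y, x, d))
            for name, x in candidates.items() for d in delays]
    return sorted(rows, key=lambda r: r[2])
```

Here `candidates` is assumed to be a dict mapping a series name to an already transformed NumPy array of the same length as the transformed GDP series `y`.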

As we can see, many of the predictors come from the investment theme, especially real estate and household equipment. The predictors with the lowest prediction error also have the highest correlation coefficients with GDP and high mutual information. Either A012RC1Q027SBEA or PRFI is suitable as the first predictor of the model. For example, let us look at a graph of the dependence of transformed GDP on PRFI(1):

The colour of the points changes smoothly along the spectrum with time, so that, for example, all the blue points belong to the same time interval. As can be seen from the graph, the dependence of GDP on PRFI does not change noticeably over time. The linear dependence is no worse than a non-linear one in this case, and is preferable for its simplicity. By the way, we can have a discussion about whether non-linear neural networks give any advantage in financial models when the input data is so noisy.

Now let's look at past and future GDP predictions based on PRFI(1):

Pretty good for just one predictor, and better than the banks' predictions. There is no look into the future in these predictions: at each past point in history, only the GDP and PRFI data available up to that point were used. The only look into the future is in the choice of the predictor itself (PRFI was chosen using the whole history).
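To make the "no look into the future" point concrete, here is a minimal sketch of an expanding-window prediction, assuming a simple linear fit on one lagged predictor. The function name and the `min_train` parameter are my own illustration, not taken from the model in this thread.

```python
import numpy as np

def walk_forward_predictions(y, x, delay=1, min_train=20):
    """Predict y[t] from x[t - delay], refitting on data strictly before t."""
    preds = np.full(len(y), np.nan)
    for t in range(min_train + delay, len(y)):
        # Training pairs (x[s - delay], y[s]) for s = delay .. t-1 only.
        x_train = x[:t - delay]
        y_train = y[delay:t]
        b, a = np.polyfit(x_train, y_train, 1)
        preds[t] = a + b * x[t - delay]
    return preds
```

Changing any value of `y` at or after time `t` leaves the prediction for `t` unchanged, which is the property being claimed. The choice of `x` itself is still outside this loop, so the selection bias mentioned above remains.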

 

Let's move on. The choice of the second predictor is not so simple. I use a kind of stepwise regression. The idea is that, after finding the first predictor and building a GDP model on it, I subtract that model from GDP. The resulting residual becomes a new series to be modelled, for which I find the second predictor, and so on. Those who are familiar with mathematics know that all predictors selected this way should be orthogonal (zero correlation between the predictors), which is not the case for most economic indicators. There are a few solutions to that, which we will talk about later.
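The residual-based selection described above might be sketched like this. It is an illustration under assumptions: the candidates are taken to be already lagged and aligned with the target, each step fits a single-variable linear model, and the names are hypothetical. It does nothing about correlated predictors.

```python
import numpy as np

def stepwise_residual_fit(y, candidates, n_steps=2):
    """Greedy selection: at each step, model the current residual with the
    single candidate that leaves the smallest sum of squared errors."""
    residual = y - y.mean()
    chosen, taken = [], set()
    for _ in range(n_steps):
        best = None
        for name, x in candidates.items():
            if name in taken:
                continue
            b, a = np.polyfit(x, residual, 1)
            sse = np.sum((residual - (a + b * x)) ** 2)
            if best is None or sse < best[1]:
                best = (name, sse, a, b)
        if best is None:          # ran out of candidates
            break
        name, _, a, b = best
        residual = residual - (a + b * candidates[name])
        chosen.append((name, a, b))
        taken.add(name)
    return chosen, residual
```

The function returns the chosen predictors with their coefficients and the final residual, which can be inspected in the same way as the scatter plots in this post.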

So we have a residual (GDP minus the model based on the first predictor). We go through all the available predictors and calculate their error in predicting the residual, as well as their correlation and mutual information with the residual. We get the following table (only the first 11 predictors are shown):

'Series' 'Delay' 'Error' 'Corr Coeff' 'Mutual info'
pred2 3 0.726557236 0.284915131 0.127184886
pred3 3 0.726787378 0.315902493 0.130087104
pred4 2 0.727334208 0.277286708 0.128992973
pred5 1 0.728784473 0.308420433 0.129030595
pred6 3 0.729279452 0.292608987 0.134332245
pred7 3 0.729297628 0.283750358 0.125613004
pred8 1 0.732298245 0.314324885 0.152677285
pred9 1 0.732362897 0.301421196 0.134899474
pred10 1 0.732917749 0.290449918 0.126357606
pred11 1 0.7342473 0.307902294 0.16423315
pred12 2 0.734315072 0.327789051 0.165246136

In this case the prediction error is the combined error of the first predictor and each of the predictors in this table. The second predictor should be chosen with caution. Although pred2 gives the lowest error when combined with the first predictor (PRFI, or pred1), its correlation coefficient and mutual information are not as high. Pred12 looks more promising, so I will choose it. The graph of the dependence of the residual on pred12:

The cloud has become more fuzzy. Predictions based on pred1 and pred12:

 
Vladimir:

There is no peek into the future in these predictions, as GDP and PRFI data available up to that point was used at each past point in history. The only peek into the future exists in the choice of the predictor itself (the PRFI was chosen throughout history).

A peek into the future. Too bad it's a peek into the future.

When you have a VERY large set of input variables, you can always choose the one that best fits the chosen interval of the predicted variable, BUT IT IS NOT A GIVEN THAT A FUNCTIONAL RELATIONSHIP BETWEEN THE FACTOR AND THE PREDICTED FUNCTION EXISTS.

That is, the variable "Number of patients admitted to hospitals in Angola with food poisoning" may well be fine for the selected segment of projected US GDP, but obviously there is no functional relationship.

Once again, only the forward test is decisive, with no peeks (even in variable selection).

The trap of a large number of variables.

 
Дмитрий:

That is, the variable "Number of patients admitted to hospitals in Angola with food poisoning" may well fit perfectly for the selected segment of projected US GDP, but there is obviously no functional relationship.

I have often encountered similar problems when optimising Expert Advisors. For example, you can optimise an Expert Advisor on 10 years of history, get the best result, and then get nothing from those settings. The problem is that, with the settings found, the Expert Advisor stagnated for 9 years and only came out on top thanks to a single week in which the settings happened to fit and produced a profit. Such an accident is unlikely to happen again in the future. A good solution is to split the entire trade history by year, calculate the profit separately for each year, and take the worst yearly result.

To find the best correlation I would use the following error function: MAX(error(2000-2016), error(2000), error(2001), ..., error(2014), error(2015)). I don't guarantee anything, I haven't tried this approach for statistics.
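A minimal sketch of such a worst-case error function, assuming RMS as the error measure and one calendar year per period (both are my assumptions; the function name is hypothetical):

```python
import numpy as np

def worst_period_error(errors, years):
    """MAX(error over the whole span, error for each separate year)."""
    errors = np.asarray(errors, dtype=float)
    years = np.asarray(years)
    overall = np.sqrt(np.mean(errors ** 2))
    per_year = {int(yr): float(np.sqrt(np.mean(errors[years == yr] ** 2)))
                for yr in np.unique(years)}
    # With RMS the overall term can never exceed the worst year; it is kept
    # only to mirror the formula above and matters for other error measures.
    return max(overall, max(per_year.values())), per_year
```

A predictor would then be preferred if it minimises this worst-case value rather than the error over the whole history, which penalises fits that look good only because of one lucky period.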

 
Дмитрий:

A peek into the future. Too bad it's a peek into the future.

When you have a VERY large set of input variables, you can always choose the one that best fits the chosen interval of the predicted variable, BUT IT IS NOT A GIVEN THAT A FUNCTIONAL RELATIONSHIP BETWEEN THE FACTOR AND THE PREDICTED FUNCTION EXISTS.

That is, the variable "Number of patients admitted to hospitals in Angola with food poisoning" may well be fine for the selected segment of projected US GDP, but obviously there is no functional relationship.

Once again, only the forward test is decisive, with no peeks (even in variable selection).

The trap of a large number of variables.

I agree, and even wrote something like that myself somewhere here. Picking a predictor on the whole history and then running a forward test on part of that same history is a self-deception that everyone from traders to pundits practises. Many articles on predicting the economy start with a list of selected predictors and then report "great" results. Traders choose strategies based on, e.g., rebound or breakout because "it worked in the past" and hope it will work in the future, and they show forward tests from the past without realising that their choice of the strategy itself was based on studying ALL of history, including the part used for forward testing. For me, the forward test of my GDP and market model will be the future, which is why I opened this thread: to post predictions and see how they come true in real time. The work is not finished. There are a lot of ideas for non-linear data transformations. For example, some predictors like HOUST affect GDP growth via some threshold function.