Market prediction based on macroeconomic indicators - page 2

 
An interesting subject for research. Thinking about it, I have concluded that macroeconomic data need to be analysed in dynamics. In my opinion it is not enough to say that jobless claims have increased: the rise may be due to seasonal fluctuations and be short-lived, and the market may or may not react, depending again on the prevailing trend. So why not investigate the strength of indicators in terms of their ability to reverse a trend? For example, take a zigzag, or otherwise identify the trend reversal and correction points on the chart, look for reversals in macroeconomic indicators within the three days before each break, then gather those indicators into a group, analyse what they show, and look for patterns. Not every indicator will be the cause of a market reversal: its potential, the market trend and the indicator's values in past periods all matter, as does the totality of other economic indicators.
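
A rough sketch of how the zigzag idea could be tested, in plain Python. Everything here is an assumption made for illustration: the 5% swing threshold, the definition of an indicator "reversal" as a sign flip in the change between consecutive releases, and the made-up prices and jobless-claims numbers. Note that a zigzag pivot is only confirmed some bars after it happens, so this is a tool for studying history, not a live signal.

```python
def zigzag_pivots(prices, threshold=0.05):
    """Return [(bar_index, "high" | "low"), ...] for confirmed swing points.

    A high is confirmed once price falls `threshold` (a fraction) below the
    running high; a low is confirmed once price rises `threshold` above the
    running low. The threshold value is an arbitrary assumption.
    """
    pivots = []
    direction = 0        # 0 = unknown, +1 = rising leg, -1 = falling leg
    hi_i = lo_i = 0      # bar indices of the running high and running low
    for i in range(1, len(prices)):
        p = prices[i]
        if p > prices[hi_i]:
            hi_i = i
        if p < prices[lo_i]:
            lo_i = i
        if direction != -1 and p <= prices[hi_i] * (1 - threshold):
            pivots.append((hi_i, "high"))
            direction, lo_i = -1, i      # start tracking the new low
        elif direction != 1 and p >= prices[lo_i] * (1 + threshold):
            pivots.append((lo_i, "low"))
            direction, hi_i = 1, i       # start tracking the new high
    return pivots


def indicator_reversals(releases):
    """releases: [(day_index, value), ...] sorted by day.

    Return the days on which the change in the indicator flips sign.
    """
    days = []
    for k in range(2, len(releases)):
        prev_change = releases[k - 1][1] - releases[k - 2][1]
        change = releases[k][1] - releases[k - 1][1]
        if prev_change * change < 0:
            days.append(releases[k][0])
    return days


def reversals_before_pivots(pivots, reversal_days, window=3):
    """Map each pivot to the indicator reversals 1..window days before it."""
    return {
        (idx, kind): [d for d in reversal_days if 0 < idx - d <= window]
        for idx, kind in pivots
    }


# Made-up daily closes and made-up (day, value) releases of one indicator.
prices = [100, 101, 103, 110, 104, 100, 98, 103, 108, 112, 106]
claims = [(0, 5.0), (2, 5.2), (5, 5.1), (8, 5.3)]

pivots = zigzag_pivots(prices)
hits = reversals_before_pivots(pivots, indicator_reversals(claims))
for pivot, days in hits.items():
    print(pivot, days)
```

Running the same matching over many indicators and counting how often each one reverses shortly before a pivot would give the "group of indicators" to study for patterns.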
 
-Aleks-:
An interesting subject for research. Thinking about it, I have concluded that macroeconomic data need to be analysed in dynamics. In my opinion it is not enough to say that jobless claims have increased: the rise may be due to seasonal fluctuations and be short-lived, and the market may or may not react, depending again on the prevailing trend. So why not investigate the strength of indicators in terms of their ability to reverse a trend? For example, take a zigzag, or otherwise identify the trend reversal and correction points on the chart, look for reversals in macroeconomic indicators within the three days before each break, then gather those indicators into a group, analyse what they show, and look for patterns. Not every indicator will be the cause of a market reversal: its potential, the market trend and the indicator's values in past periods all matter, as does the totality of other economic indicators.

This is quite an interesting idea. That is, focus only on trend breaks and identify key indicators that influence those breaks.

I've been thinking a lot about outliers. Should they be ignored or, on the contrary, given more attention? Classical regression theory teaches us to ignore them. But it sometimes seems to me that small price fluctuations around a trend are noise, yet classical regression gives them the greatest weight. Sharp turns in trends (outliers) are probably a more important signal. Still, all my attempts to build a model that pays more attention to outliers (for example by choosing u>1) led to a higher root-mean-square prediction error, while smoothing the outliers led to a lower one.

 
faa1947:

Hence, you have to manually look at the whole list of input variables and decide intuitively, or based on some other consideration, that "this input variable is likely to affect and this one is likely not."

...I manually selected a list, then filtered it with an algorithm and got the final list. The value of such a list is fundamental: models built on such a set of "influencing" inputs (I used 3 different types of models) do NOT overfit, which is the main pitfall. Overfitting is the main consequence of using "noise" input data.


Did you select the predictors on the in-sample segment of history and test on the out-of-sample segment? If you select predictors on the whole segment and then calculate the out-of-sample error on part of that same segment, that is looking into the future.
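
To make the point concrete, here is a minimal sketch of selection that avoids the look-ahead: predictors are screened using in-sample rows only, and the out-of-sample rows are touched only afterwards. The correlation screen, the 0.3 cut-off, the 80/20 split and the toy series are all assumptions chosen for illustration, not the method actually used in the thread.

```python
def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def split_rows(n, in_sample_frac=0.8):
    """Chronological split: the earliest rows are in-sample, the rest are not."""
    cut = int(n * in_sample_frac)
    return list(range(cut)), list(range(cut, n))


def select_predictors(predictors, target, rows, min_abs_corr=0.3):
    """Keep predictors whose |correlation| with the target, measured ONLY
    on `rows`, reaches the cut-off."""
    chosen = []
    for name, series in predictors.items():
        r = pearson([series[i] for i in rows], [target[i] for i in rows])
        if abs(r) >= min_abs_corr:
            chosen.append(name)
    return chosen


# Toy data: one predictor that tracks the target and one that alternates.
target = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
predictors = {
    "signal": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
    "noise": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],
}

in_rows, out_rows = split_rows(len(target))
chosen = select_predictors(predictors, target, in_rows)  # no peeking at out_rows
print(chosen, "-> fit on rows", in_rows, "and score on rows", out_rows)
```

Passing all rows to `select_predictors` and then scoring on `out_rows` would be exactly the look-ahead described above.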
 
gpwr:
Did you select the predictors on the in-sample segment of history and test on the out-of-sample segment? If you select predictors on the whole segment and then calculate the out-of-sample error on part of that same segment, that is looking into the future.

Even tougher.

Following the thread with great interest.

Given -Aleks-'s post, it is not clear what you are going to predict: direction or magnitude? If direction, then these are classification models; if magnitude, then regression models, and those bring their own problems with the various ARIMA and ARCH models. Detrending by differencing does not solve the problem completely, and on top of that seasonality confuses everything in macroeconomics...

The idea of -Aleks- for selecting predictors is very interesting. In general, I would start with two preliminary steps:

1. Select, following -Aleks-, some fairly large set of independent variables.

2. Build a regression and discard all variables that have insignificant coefficients.

The last step is really not easy. Everything I wrote holds only if there is no correlation between the independent variables. In practice there is always correlation above 0.7, and the list of discarded predictors depends on the order in which they are discarded.

After that one can look at it and decide what to do next.
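
The order dependence mentioned above is easy to reproduce. The sketch below swaps in a simpler technique than regression significance: a greedy filter that drops any predictor whose absolute correlation with an already kept one exceeds 0.7. The series and their names are invented for the example.

```python
def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def prune_correlated(predictors, order, max_abs_corr=0.7):
    """Walk the predictors in `order`; keep one only if it is not too
    correlated with anything kept so far."""
    kept = []
    for name in order:
        if all(abs(pearson(predictors[name], predictors[k])) <= max_abs_corr
               for k in kept):
            kept.append(name)
    return kept


# Invented series: the first two move together, the third does not.
predictors = {
    "jobless_claims": [1, 2, 3, 4, 5, 6],
    "unemployment_rate": [2, 4, 6, 8, 10, 13],
    "cpi": [1, -1, 1, -1, 1, -1],
}

first = prune_correlated(predictors, ["jobless_claims", "unemployment_rate", "cpi"])
second = prune_correlated(predictors, ["unemployment_rate", "jobless_claims", "cpi"])
print(first)   # the predictor visited first survives
print(second)  # a different order keeps a different predictor
```

The same data gives two different "final" lists, so whichever ordering rule is used should be fixed in advance and reported along with the result.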

 
avtomat:

The stationarity requirement is very strict and completely unjustified.

.

And "non-stationary" models work just fine ;)

gpwr:
You can say so about any model, not only regression, but also neural models, ARMA and others. If there's no relation between inputs and outputs then any model will generate a prediction, only inaccurate.
Can anyone advise where to start in order to understand stationarity, ARMA and neural models? I have wanted to explore this area for a long time, but there are a lot of sources and it is hard to make sense of it all from scratch.
 
faa1947:

A large or small amount of input data is all relative.

The way I see it, you need to check the inputs one by one, identify the relevant ones and then use them in your EA. Random hits become possible only once you start optimizing the data values. This is probably why you should be especially careful with the search range in the optimization.
 
gpwr:

This is quite an interesting idea. That is, focus only on trend breaks and identify key indicators that influence those breaks.

I've been thinking a lot about outliers. Should they be ignored or, on the contrary, given more attention? Classical regression theory teaches us to ignore them. But it sometimes seems to me that small price fluctuations around a trend are noise, yet classical regression gives them the greatest weight. Sharp turns in trends (outliers) are probably a more important signal. Still, all my attempts to build a model that pays more attention to outliers (for example by choosing u>1) led to a higher root-mean-square prediction error, while smoothing the outliers led to a lower one.

To elaborate on the idea, we should group the indicators by how often they are released: probably the more rarely an indicator is released, the longer its impact on the market lasts, but this needs to be tested. Personally, I find information easier to take in visually, as change over time, so it would be convenient to bring the indicators into MT as a chart (my dream). The indicators would have to be synchronized, though: if the news comes out once a month, what do we do on the lower timeframes? One option is to fill in the bars between one release and the next with a linear function, so that the movement vector is clearly visible in MT. Knowing the vector, we can analyse it and its changes, in particular the time of its break relative to the break of the symbol's price vector given by the zigzag.
That way we can account for the lag and calculate the percentage deviation from the price break at the moment the economic indicator is released. I have some other ideas, but whether they are worth using will become clear along the way.
In general, a lot of economic data is reported relative to the previous month or year, which should also be taken into account in the graphical display...
Another idea: probably it is not the data itself that influences a trend change but its deviation from the expected value or from its past dynamics. This can also be checked by comparing the indicator's past dynamics with its latest change (a strong change along the vector or against it, at least using an SMA) and looking at the change in the price movement vector with a lag.
I'm not sure one person can do all this work: it needs a clear action plan and a methodology for analysing intermediate results. It may be the work of a lifetime, which will answer the question of how past economic indicators influenced the market... However, the methodology developed would let you look for patterns in current market movements.
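
The linear fill between releases could look roughly like this. It is written in Python rather than MQL only to keep the sketch short; the bar indices and values are invented, and the choice to hold the last value flat after the final release is an assumption. Keep in mind that each interpolated bar uses the next release, which is not known at that moment, so the filled series is suitable for looking at history on a chart but not for testing a strategy.

```python
def interpolate_releases(releases, n_bars):
    """Spread sparse releases over every bar with straight lines.

    releases: non-empty [(bar_index, value), ...] sorted by bar_index.
    Bars before the first release stay None; bars after the last release
    repeat the last value (an assumption, any other rule could be used).
    """
    out = [None] * n_bars
    for (b0, v0), (b1, v1) in zip(releases, releases[1:]):
        for i in range(b0, min(b1, n_bars)):
            # Straight line from (b0, v0) to (b1, v1).
            out[i] = v0 + (v1 - v0) * (i - b0) / (b1 - b0)
    last_bar, last_val = releases[-1]
    for i in range(last_bar, n_bars):
        out[i] = last_val
    return out


# Invented example: releases on bars 0, 4 and 6 of an 8-bar chart.
releases = [(0, 10.0), (4, 14.0), (6, 13.0)]
filled = interpolate_releases(releases, 8)
print(filled)

# Bar-to-bar slope of the filled line, i.e. the "movement vector";
# its sign changes where the indicator turns.
slopes = [b - a for a, b in zip(filled, filled[1:])]
print(slopes)
```

The bar where the slope changes sign can then be compared with the nearest zigzag break of the price to measure the lag.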
 
-Aleks-:
To elaborate on the idea, we should group the indicators by how often they are released: probably the more rarely an indicator is released, the longer its impact on the market lasts

No. The most influential ones for the US are the UR and the FOMC meeting interest rate decisions. They are monthly.

Unemployment data can be formalised, but the Fed meeting minutes cannot be formalised numerically at all.

Otherwise it would be a piece of cake...

 

For this number of variables, 65 observations is very few.

At least i*10 observations + 15-20% for a forward test.
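
As a quick check of that rule of thumb, here is a small calculator. It assumes that i is the number of input variables and that the 15-20% forward-test share is taken on top of the fitting sample; both readings are assumptions, since the rule above does not spell them out.

```python
import math


def min_observations(n_variables, per_variable=10, forward_pct=20):
    """Return (fit, forward, total) observation counts for the rule
    'i*10 observations + 15-20% for a forward test'.

    forward_pct is applied to the fitting sample (an assumption).
    """
    fit = n_variables * per_variable
    forward = math.ceil(fit * forward_pct / 100)
    return fit, forward, fit + forward


for n in (5, 10):
    print(n, "variables ->", min_observations(n, forward_pct=15),
          "to", min_observations(n, forward_pct=20))
```

Under these assumptions, 65 observations would cover at most about five variables once the forward test is included.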

 
Demi:

No. The most influential ones for the US are the UR and the FOMC meeting interest rate decisions. They are monthly.

Unemployment data can be formalised, but the Fed meeting minutes cannot be formalised numerically at all.

Otherwise it would be a piece of cake...

These minutes contain economic data, and as I understand it that data can be obtained. If not, then the meetings have to be scored on a three-valued scale, +1/-1/0, and the information for scoring can be taken from the media, for example.