Machine learning in trading: theory, models, practice and algo-trading - page 257

 

I'm kind of confused.

There is a price. I take the last 200 bars and try to train a model on them for two classes (buy/sell). I can train anything, even a forest, even a neural net, but it will all be useless, because if we imagine all the training examples as points in 200-dimensional space, both classes are evenly mixed there, and attempts to separate them with hyperplanes are not accurate enough.

Now a better option: all sorts of hedge funds create new predictors (indicators, clusters, some formulas, anything else) based on the price. They then train the same model as in the first case on these new predictors, and in this case they rake in the money.

So in the second case no new information is created or added; it is all the same points from the 200-dimensional space, just moved into a lower-dimensional one. A peculiar kind of dimensionality reduction, in other words, bringing points of the same class closer together in space. But machine learning models do this too: they also use their algorithms to reduce dimensionality and bring points of the same class closer together. What is the difference between these two methods?

Why is it that if you approximate the points in space semi-automatically, with various tricks, and then train a model on the result, it works, but if you trust the model itself to work with the original space, it fails? They are similar operations in both cases.
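To make the comparison concrete, here is a minimal sketch of the two setups being contrasted (a synthetic random-walk price and an arbitrary, illustrative feature set of my own; nothing here claims to reproduce what funds actually do):

```python
# Sketch only: synthetic price, illustrative features, no claim about real performance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
price = np.cumsum(rng.normal(size=5000)) + 1000.0   # synthetic random-walk "price"

N_BARS = 200
# target: does the bar right after each 200-bar window close above the window's last bar?
target = (price[N_BARS:] > price[N_BARS - 1:-1]).astype(int)

# Setup 1: the raw 200-bar window, i.e. a point in 200-dimensional space
X_raw = np.array([price[i:i + N_BARS] for i in range(len(target))])

# Setup 2: a few hand-made predictors derived from the same window (illustrative only)
def features(window):
    returns = np.diff(window)
    return [
        window[-1] - window.mean(),   # distance of the last price from the window mean
        returns[-14:].mean(),         # short-term drift
        returns.std(),                # volatility of the window
        (returns[-14:] > 0).mean(),   # share of up-moves, an RSI-like quantity
    ]

X_feat = np.array([features(price[i:i + N_BARS]) for i in range(len(target))])

for name, X in (("raw 200 bars", X_raw), ("derived predictors", X_feat)):
    X_tr, X_te, y_tr, y_te = train_test_split(X, target, shuffle=False, test_size=0.3)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(f"{name}: out-of-sample accuracy {model.score(X_te, y_te):.3f}")
```

Whether the second setup actually wins on real data is exactly the question being asked here; the sketch only shows where the two approaches differ mechanically.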

 
Dr.Trader:

Now a better option: all sorts of hedge funds create new predictors (indicators, clusters, some formulas, anything else) based on the price. They then train the same model as in the first case on these new predictors, and in this case they rake in the money.

...

Why is it that if you approximate the points in space semi-automatically, with various tricks, and then train a model on the result, it works, but if you trust the model itself to work with the original space, it fails? They are similar operations in both cases.

Don't you take into account that large market participants move the price?
 
Dr.Trader:

I'm kind of confused.

There is a price. I take the last 200 bars and try to train a model on them for two classes (buy/sell). I can train anything, even a forest, even a neural net, but it will all be useless, because if we imagine all the training examples as points in 200-dimensional space, both classes are evenly mixed there, and attempts to separate them with hyperplanes are not accurate enough.

Now a better option: all sorts of hedge funds create new predictors (indicators, clusters, some formulas, anything else) based on the price. They then train the same model as in the first case on these new predictors, and in this case they rake in the money.

So in the second case no new information is created or added; it is all the same points from the 200-dimensional space, just moved into a lower-dimensional one. A peculiar kind of dimensionality reduction, in other words, bringing points of the same class closer together in space. But machine learning models do this too: they also use their algorithms to reduce dimensionality and bring points of the same class closer together. What is the difference between these two methods?

Why is it that if you approximate the points in space semi-automatically, with various tricks, and then train a model on the result, it works, but if you trust the model itself to work with the original space, it fails? They are similar operations in both cases.

Why do we need data mining at all?

Why do we need all the different filters in radio engineering, and in econometrics too? Smoothing and so on and so forth...

Statistics is a very sneaky science: you can easily slip into a numbers game, at any step.

If you have decided on a target variable, then you need to pick predictors for that target variable, not just any predictors, but only those that are MEANINGFULLY related to it. Always look at a predictor and try to answer the question: "what property or trait of my target variable does this predictor reflect? And in general, what does the predictor have to do with the financial markets?"

For example, RSI: it seems to reflect an overbought/oversold market. It clearly relates to reversals. And so on.
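For reference, here is one common way such an RSI predictor is computed (a simple-moving-average variant; the classic Wilder version uses exponential smoothing instead), so the question above can be asked of a concrete formula:

```python
import numpy as np

def rsi(close, period=14):
    """Simple-average RSI: ratio of average gains to average losses over `period` bars."""
    delta = np.diff(np.asarray(close, dtype=float))
    gains = np.where(delta > 0, delta, 0.0)
    losses = np.where(delta < 0, -delta, 0.0)
    avg_gain = np.convolve(gains, np.ones(period) / period, mode="valid")
    avg_loss = np.convolve(losses, np.ones(period) / period, mode="valid")
    rs = avg_gain / np.where(avg_loss == 0, np.nan, avg_loss)
    return 100.0 - 100.0 / (1.0 + rs)   # near 100 = overbought, near 0 = oversold
```

Values near 100 are read as overbought and values near 0 as oversold, which is exactly the "reversal" property attributed to it above.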

Or let's remember Burnakoff (as I understand it, the man was driven off the site for his flooding): increments with quite large lags are a hint of periodicity.

And speaking generally, you need to formulate a general, verbal model of the financial market.

For example, Hyndman's (the forecast package). In his view, the market consists of:

  • three kinds of trend
  • three types of noise
  • cyclicality, which for him has a constant period, consistent with production data such as agricultural output.
That's probably not the only approach. But it is certainty, not noise.
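The forecast package itself is R, but the same idea (a series split into trend, a fixed-period cyclical part, and noise) can be sketched in Python with statsmodels; the synthetic series and the period value below are assumptions for illustration only:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(1)
t = np.arange(600)
# Synthetic series: trend + fixed-period cycle + noise, in the spirit of the verbal model above.
series = pd.Series(0.05 * t + 2.0 * np.sin(2 * np.pi * t / 50) + rng.normal(scale=0.5, size=t.size))

parts = seasonal_decompose(series, model="additive", period=50)  # the period is an assumption
print(parts.trend.dropna().tail(3))     # estimated trend component
print(parts.seasonal.tail(3))           # estimated fixed-period cyclical component
print(parts.resid.dropna().tail(3))     # what is left over, i.e. the noise
```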

Without such an approach you end up divining from coffee grounds or the rings of Saturn (see the astrologers' list)...

And also don't forget the scourge of financial data, called "non-stationarity".

And also don't forget that models for financial markets almost always turn out to be over-trained.

Have we beaten all of that? Then we can sit back and smoke bamboo...

 
Dr.Trader:

I'm kind of confused.

1) maybe it's simply because the funds don't do that?

2) you need to understand what the market is, even if only in your own way...

3) you need to know clearly from whom and why you are going to take money; you need to have your own specific idea

4) all the ML and so on is just a toolbox for describing your specific idea, not the idea itself, and that is exactly what almost everyone here picks, thinking that ML will invent everything by itself

No matter how pompous it sounds, I have more or less adequate market forecasting. The algorithm is quite complicated and needs about 6 minutes to calculate one candle, but some basic elements will be rewritten in C++.

And the result calculated by this complex algorithm has to be analyzed by eye, so it turns out to be not automatic but semi-automatic. In the near future I will try to replace my visual analysis with some ML for pattern recognition; by the way, ML handles recognition very well, unlike forecasting. I looked through each output by eye and said: "this I consider a buy signal, and this I don't consider a signal", so I created the target according to my own view. It was only an experiment so far, because I have not prepared much data with this target for a test... I had 100 training samples and 50 control ones, trained an ordinary random forest, and what do you think? The forest recognized 90% of the new sample.
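For what it's worth, the bare bones of that experiment (hand-labeled examples, an ordinary random forest, a small hold-out set) look roughly like this; the features and the labelling rule below are synthetic stand-ins, not the poster's actual algorithm:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Stand-in for the algorithm's output on each candle: a short feature vector per example.
# In the post these outputs were inspected by eye and labeled "buy signal" / "not a signal".
X = rng.normal(size=(150, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic stand-in for the hand-made labels

X_train, y_train = X[:100], y[:100]   # 100 labeled training examples
X_test, y_test = X[100:], y[100:]     # 50 held-out control examples

forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)
print("share recognized on the control set:", forest.score(X_test, y_test))
```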

 

Good afternoon. The task:

- There is an array of X, Y, Z values;

- Take a slice by X from 1 to 1000 at the n-th Y:

- There are some minimum and maximum points. If we slice by X, then any values > 1 are important.

Which way should I look to recreate a calculation of something like weights with respect to the axes?

That is, to start measuring the object.

If a signal arrived at cell x=55, y=163, the task is to determine the value (weight) of that point relative to the X and Y axes (possibly along the diagonal), to get a feel for the position of the point on the object.

I think I need to look in the direction of the main statistical characteristics: dispersion, median, mode, skewness.

In general, I need to start measuring the object in some way, each unit relative to the others, and also so that the value of a point on the object takes the presence of other objects into account.

Files:
eiova.jpg  382 kb
1.jpg  320 kb
 
Top2n:

Good afternoon. The task:

Could you put it more simply? It's not clear what you want.
 
SanSanych Fomenko:


Thank you, I understand some of it now.

Models essentially just divide the predictor space optimally into two subspaces, a buy class and a sell class.
If we spend a long time randomly creating new predictors, we can help the model a little by doing some of its work ourselves. But that does not necessarily give better stability or predictive power; it may just help the model do its work in fewer iterations, and there is really not as much utility in that operation as we would like.

But the operations you mentioned (noise removal, smoothing, trend extraction, etc.) are not just the creation of model-friendly predictors. They create predictors that somehow describe the internal processes of the market.
I've looked at various old working strategies; they always have some constants: if MA, then 21; if RSI, then 14; and so on. All these constants, and the indicators built with them, not only help the model classify the data but also carry some properties describing internal market processes. Plus, the various constants in the predictor formulas are new data in themselves, so we add new information to the original data.

It turns out that if we start generating new predictors thoughtlessly, they will just help the model achieve better accuracy in training, but they will not help describe processes within the market, and hence predictions based on them are unreliable. So you have to generate them with a lot of thought, I agree :)

Also, there is a new property of predictors that I find interesting: describing the internal processes of the original data.
In other words, if I have, say, a dozen predictors from which hundreds of price bars can easily be reconstructed, then they obviously contain the necessary properties of the market, and a model built on them should be better.
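That reconstruction idea can be checked directly. One crude way (my own assumption about how to test it, not something stated in the thread) is to regress the original price windows on the candidate predictors and see how much of them a linear map recovers:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def reconstruction_r2(predictors, bars):
    """How much of the original bars can a linear map from the predictors recover?

    predictors: (n_samples, n_predictors) array of candidate features
    bars:       (n_samples, n_bars) array of the raw price windows they were built from
    Returns the R^2 of the best linear reconstruction; closer to 1 means more of the
    original information is retained in the predictors.
    """
    model = LinearRegression().fit(predictors, bars)
    return model.score(predictors, bars)
```

A low score says the dozen predictors have thrown away most of what was in the bars; a high score only says the information is retained, not that it is predictive.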

 
Top2n:


I may have misunderstood what you need, but I would take some radius, say 4, and for each point find the average value within that radius.
That is, if X=BC, Y=158, Z=1, then you can find the average of all points within radius R=4. That would be the average weight of the point (BC,158,1) and its neighborhood. Do this for all points in the array and you get a new array in which the higher the number, the more signals there were in the neighborhood of the original array.
Then you can project it all onto a single axis (discard the Z coordinate and add up all the corresponding cells with the same X and Y that had different Z). Then discard Y in the same way and sum all the cells by X.
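A direct, brute-force version of that idea, assuming the data sit in a 3-D numpy array indexed by (X, Y, Z) and using a cubic window as an approximation of the spherical radius:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def neighborhood_density(volume, radius=4):
    """Average of each cell's neighborhood in a 3-D array (X, Y, Z)."""
    # uniform_filter averages over a cube of side 2*radius+1 centred on each cell,
    # which is a box approximation of the spherical radius described above.
    return uniform_filter(volume.astype(float), size=2 * radius + 1, mode="constant")

def project(volume):
    """Collapse Z, then Y, as in the description above."""
    xy = volume.sum(axis=2)   # drop Z: sum all cells sharing the same X and Y
    x = xy.sum(axis=1)        # drop Y: sum all cells by X
    return xy, x
```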

Files:
w5rtduyghjbn.png  388 kb
 
Dr.Trader:


It turns out that if we start generating new predictors thoughtlessly, they will just help the model achieve better accuracy in training, but they will not help describe processes within the market, and hence predictions based on them are unreliable. So you have to generate them with a lot of thought, I agree :)


All the same, the starting point to dance from is some verbal, intuitive description of the market.

I've been carrying around the idea that in financial markets this intuitive description is given by ZZ (ZigZag). Looking at it, you can:

  • see the trends
  • see the noise, as deviations from the straight segments
  • see the periodicity

It seems to me that all our troubles lie in this periodicity, which varies chaotically along both axes. That is what we are staring at. If we learn to cope with this non-stationarity at least somehow, the rest is easier.
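For readers who don't know the abbreviation: ZZ here is the ZigZag indicator, which keeps only the swing highs and lows after which price reversed by more than some threshold. A minimal percentage-threshold version is sketched below; the 2% threshold is an arbitrary placeholder, and real ZigZag implementations differ in details:

```python
import numpy as np

def zigzag(close, threshold=0.02):
    """Indices of ZigZag pivots: extremes after which price reversed by more than
    `threshold` (expressed as a fraction of the extreme)."""
    close = np.asarray(close, dtype=float)
    pivots = []
    trend = 1                    # assume the first leg is rising; it self-corrects quickly
    ext_idx = 0                  # running extreme of the current leg
    for i in range(1, len(close)):
        if trend > 0:
            if close[i] >= close[ext_idx]:
                ext_idx = i                                   # higher high, extend the leg
            elif close[i] <= close[ext_idx] * (1 - threshold):
                pivots.append(ext_idx)                        # swing high confirmed
                trend, ext_idx = -1, i
        else:
            if close[i] <= close[ext_idx]:
                ext_idx = i                                   # lower low, extend the leg
            elif close[i] >= close[ext_idx] * (1 + threshold):
                pivots.append(ext_idx)                        # swing low confirmed
                trend, ext_idx = 1, i
    pivots.append(ext_idx)                                    # provisional last pivot
    return np.array(pivots)
```

The straight segments between consecutive pivots are the "trends", the bar-by-bar deviation from those segments is the "noise", and the spacing of the pivots is the chaotically varying periodicity the post is talking about.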

 
SanSanych Fomenko:

All the same, the starting point to dance from is some verbal, intuitive description of the market.

I've been carrying around the idea that in financial markets this intuitive description is given by ZZ (ZigZag). Looking at it, you can:

  • see the trends
  • see the noise, as deviations from the straight segments
  • see the periodicity

It seems to me that all our troubles lie in this periodicity, which varies chaotically along both axes. That is what we are staring at. If we learn to cope with this non-stationarity at least somehow, the rest is easier.

Don't judge me harshly and don't ask what I mean, but maybe a white-noise generator can help. By the way, if anyone can, please share your experience with Fourier, Laplace and Z-transforms.
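On the Fourier part of that question, a minimal starting point (nothing market-specific, just numpy's FFT applied to log returns) for checking whether any periodicity stands out above a flat white-noise spectrum:

```python
import numpy as np

def dominant_periods(close, top=5):
    """Rough periodicity check: power spectrum of log returns, strongest bins as periods in bars."""
    returns = np.diff(np.log(np.asarray(close, dtype=float)))
    returns = returns - returns.mean()             # remove the drift component
    spectrum = np.abs(np.fft.rfft(returns)) ** 2   # power at each frequency
    freqs = np.fft.rfftfreq(len(returns), d=1.0)   # frequencies in cycles per bar
    order = np.argsort(spectrum[1:])[::-1] + 1     # rank by power, skipping the zero-frequency bin
    return [(1.0 / freqs[k], spectrum[k]) for k in order[:top]]   # (period in bars, power)
```

For genuine white noise the spectrum is flat, so a handful of bins towering over the rest is the simplest sign that some periodicity is there; non-stationarity, of course, means the picture can change from window to window.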