Machine learning in trading: theory, models, practice and algo-trading - page 2125

 
Igor Makanu:

topical reading ))))

here's the beginning of the opus:

http://www.ievbras.ru/ecostat/Kiril/Library/Book1/Content0/Content0.htm#Ref


are you really reading this?

The point is that it was written without today's computing power, with logic in the foreground, and, as noted, it works) There's a lot of filler, of course, but you can certainly sift that out yourself. And the beginning, well, those were the times; without it the book wouldn't have happened. You can take that into account too.)

 
Maxim Dmitrievsky:

http://gmdh.net/articles/theory/bookInductModel.pdf

a big plus is that linear models always converge to a local minimum. That's why the method is still relevant.

saw this book a couple of years ago

It looks... well, yes, it's fascinating, but really, what for? If the goal is writing a thesis or a PhD, then yes, it's a reference handbook

if the goal is time series, this book is about something else: about inventing the random forest at the dawn of the computer age

imho, even ensembles of neural networks have caught on poorly in practice, so how do you work with time series? Well, as an option, stack together a whole pile of NNs, but in the end you get an autoencoder? I doubt you could even derive a convolutional network from this book


The knowledge is old, Vorontsov is more relevant, and for data processing I'm working through some online courses on time series; there is something to them ;)

 
elibrarius:

If all the points from both the test and the train sets are arranged in one common list (rearranged according to some pattern), then they are mixed. That's how I understand it. The test set should not be mixed with the train set in any way.

If the points are independent (no autocorrelation), you can and should shuffle them

in fact, this is how the random forest works
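A minimal numpy sketch (my illustration, not from the thread) of the sense in which random forest shuffles on its own: bagging draws each tree's training rows with replacement, and the draw is order-agnostic, so for independent rows the original ordering carries no information.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Bagging, as used by random forest, resamples rows with replacement;
# the draw ignores row order, so it implicitly shuffles the data.
bootstrap_idx = rng.integers(0, n, size=n)

# A classic consequence: each bootstrap sample contains roughly
# 1 - 1/e ~ 63% of the unique rows; the rest are "out-of-bag".
unique_frac = len(set(bootstrap_idx)) / n
```

This is only valid when the rows really are exchangeable; for autocorrelated series the same resampling leaks neighboring information between trees and the out-of-bag estimate becomes optimistic.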

 
Igor Makanu:

saw this book a couple of years ago.

It looks... well, yes, it's fascinating, but really, what for? If the goal is writing a thesis or a PhD, then yes, it's a reference handbook

if the goal is time series, this book is about something else: about inventing the random forest at the dawn of the computer age

imho, even ensembles of neural networks have caught on poorly in practice, so how do you work with time series? Well, as an option, stack together a whole pile of NNs, but in the end you get an autoencoder? I doubt you could even derive a convolutional network from this book


The knowledge is old, Vorontsov is more relevant, and for data processing I finished the online courses on time series; there is something to them ;)

What are you talking about? Are you drunk or what?

Ask Vorontsov who Ivakhnenko is to him ...

 
Maxim Dmitrievsky:

if the points are independent (no autocorrelation), then you can and should shuffle them

Actually, this is how a random forest works

Each point in the time series has 2-3 highly correlated points on either side of it. I.e. the independence condition is not satisfied
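A quick numpy check (my illustration, not from the thread) of how far neighboring points are from independent: for a random-walk "price" series, the correlation between each point and its immediate neighbor is essentially 1.

```python
import numpy as np

rng = np.random.default_rng(1)
# Random-walk "price": cumulative sum of i.i.d. increments
price = np.cumsum(rng.normal(size=5000))

# Correlation between each point and its immediate neighbor
lag1_corr = np.corrcoef(price[:-1], price[1:])[0, 1]
# lag1_corr comes out very close to 1: adjacent points are nowhere
# near independent, so naive shuffling leaks test information into train
```

The increments themselves are independent here; it is the levels that are correlated, which is why one standard preprocessing step is to difference the series before shuffling anything.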
 
elibrarius:
Each point in the time series has 2-3 highly correlated points on either side of it. That is, the independence condition is not satisfied.

There are special splitting methods for time series; they take all of that into account
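One such scheme is walk-forward (expanding-window) splitting, which scikit-learn ships as `TimeSeriesSplit`; here is a dependency-free sketch (the function name is mine):

```python
import numpy as np

def walk_forward_splits(n, n_folds):
    """Expanding-window splits: always train on the past,
    test on the next contiguous block, never shuffle."""
    fold = n // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train = np.arange(0, k * fold)
        test = np.arange(k * fold, min((k + 1) * fold, n))
        yield train, test

splits = list(walk_forward_splits(100, 4))
# every test block lies strictly after its train block: no look-ahead
assert all(tr.max() < te.min() for tr, te in splits)
```

Stricter variants (e.g. the "purged" cross-validation described by López de Prado) additionally drop train points that sit right next to the test block, precisely because of the correlated-neighbors problem mentioned above.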

 
elibrarius:
Each point in the time series has 2-3 highly correlated points on either side of it. That is, the independence condition is not satisfied

You can remove those duplicates, and it will start working right away on new data, but it won't cover the spread

 
Maxim Dmitrievsky:

If the points are independent (no autocorrelation), then you can and should shuffle them

no

that's not what the ACF is for in time-series analysis

there may be no autocorrelation at lag = 1, but there may be at other lags

and estimating the ACF is not an assessment of lag dependence; it is just one way of identifying the process model. After deciding which process the time series belongs to, we start data preprocessing: either we use the time series itself, or we use a sample of its lags
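To illustrate the "autocorrelation at other lags" point, a small numpy sketch (the `acf` helper is mine): a moving-average-style series built with dependence only at lag 4 shows near-zero sample ACF at lag 1, so checking lag 1 alone would miss the structure entirely.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, max_lag + 1)])

rng = np.random.default_rng(2)
e = rng.normal(size=4000)
x = e[4:] + 0.8 * e[:-4]   # dependence injected at lag 4 only

rho = acf(x, 6)
# rho[1] is near zero while rho[4] is clearly nonzero, so the
# lag structure, not just lag 1, drives the model identification
```

In practice the full ACF (and PACF) profile is what suggests whether to model the raw series or a selection of its lags, which is the preprocessing decision the post describes.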

 
Igor Makanu:

no

yes

Before

after decorrelation the overfitting goes away. The seriality of the labels should also be taken into account.


 
Maxim Dmitrievsky:

you can remove those duplicates, and it will start working right away on new data, but it won't cover the spread

And what's the point if it doesn't cover the spread?