Machine learning in trading: theory, models, practice and algo-trading - page 1605

 
mytarmailS:

What you are doing (the test on the "third" sample) is called, in GMDH terms, the "predictive power criterion".

I see you are a good expert. Could you please state the essence of GMDH in a few phrases, for non-mathematicians?

 
secret:

I see that you are a good specialist. Could you explain the essence of GMDH in a few phrases, for non-mathematicians?

A regression model with enumeration of features transformed by different kernels (polynomials, splines, it doesn't matter). The simplest model with the lowest error is preferred. It does not save you from overfitting on the market.

Roughly speaking, it is brute-forcing of models, where the simplest one is chosen based on external criteria.

it's like the basics of machine learning )

 
mytarmailS:

For example, GMDH regression simply makes a mockery of the regression of the modern random forest algorithm and all sorts of boosting...

Boosting is better at everything; if you prepare the features the way you do for GMDH, it will be better.

but it doesn't matter if you don't know what to teach

 
secret:

I see that you are a good specialist. Could you explain the essence of GMDH in a few phrases, for non-mathematicians?

I'm not an expert at all )) unfortunately....

Very simply, roughly and imprecisely: the principle of GMDH is self-organization...


For example, we have a set of features

x1,x2,x3.....x20...

from these attributes we create a set of candidate models

m1,m2,m3.....m10...

from these models the best ones are selected, from the best ones new models are created, then selection again... and so on, as long as the error on new data (previously unseen by the algorithm) keeps decreasing

The algorithm changes itself, complicates itself, organizes itself... Something like a genetic algorithm
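
A minimal sketch of that selection loop in R, just to illustrate the idea (this is not a faithful GMDH implementation; the pairwise polynomial candidates, the layer size of 6 and the stopping rule are arbitrary choices here):

# toy illustration of GMDH-style self-organization:
# pairwise polynomial candidate models are built from the current inputs,
# the best ones (by error on held-out data, the "external criterion") survive
# and become inputs for the next layer, until the held-out error stops improving
set.seed(1)
n  <- 300
X  <- matrix(rnorm(n * 6), n, 6)                 # 6 raw features
y  <- X[, 1] + X[, 2] * X[, 3] + rnorm(n, 0, 0.1)
tr <- 1:200 ; vl <- 201:300                      # train / validation split

inputs   <- X
best_err <- Inf
for (layer in 1:5) {
  cand <- list() ; err <- c()
  for (i in 1:(ncol(inputs) - 1)) for (j in (i + 1):ncol(inputs)) {
    d <- data.frame(y = y, a = inputs[, i], b = inputs[, j])
    m <- lm(y ~ a + b + I(a * b) + I(a^2) + I(b^2), data = d[tr, ])
    p <- predict(m, d)
    cand[[length(cand) + 1]] <- p
    err <- c(err, mean((y[vl] - p[vl])^2))        # external criterion: validation MSE
  }
  if (min(err) >= best_err) break                 # new layer no longer improves -> stop
  best_err <- min(err)
  inputs   <- do.call(cbind, cand[order(err)[1:6]])  # best models feed the next layer
}
best_err

Real GMDH implementations (for example the GMDHreg package used further down) differ in the candidate models and in the external criteria (PRESS and so on), but the selection loop is the same idea.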

 
Maxim Dmitrievsky:

A regression model with enumeration of features transformed by different kernels (polynomials, splines, it doesn't matter). The simplest model with the lowest error is preferred. It does not save you from overfitting on the market.

Roughly speaking, it is brute-forcing of models, where the simplest one is chosen based on external criteria.

Then I see nothing new and original in this methodology.

 
mytarmailS:

from these models the best ones are selected, from the best ones new models are created, then selection again... and so on, as long as the error on new data (previously unseen by the algorithm) keeps decreasing

The algorithm changes itself, complicates itself, organizes itself... Sounds a bit like a genetic algorithm.

Then I don't see any mathematics here; it's more brainwork, plus coding. A GA is a trivial thing.

Why then does everyone run around with this GMDH, writing dissertations that are impossible to understand, if inside it is something primitive, intuitively understandable since kindergarten?

 
Maxim Dmitrievsky:

Boosting is better at everything; if you prepare the features the way you do for GMDH, it will be better.

but it doesn't matter if you don't know what to teach

I disagree...

Let's make a small test, quick, by eye )


Let's create four variables (ordinary random noise) of 1000 elements each

z1 <- rnorm(1000)
z2 <- rnorm(1000)
z3 <- rnorm(1000)
z4 <- rnorm(1000)

and create the target variable y as the sum of all four

y <- z1+z2+z3+z4


let's train boosting and GMDH, not even for prediction, but simply to have them explain (fit) the data

I split the sample into three pieces: one for training and two for testing.


green is GMDH
red is Generalized Boosted Regression Modeling (GBM)
gray is the original data

remember, the target is the elementary sum of all the predictors

http://prntscr.com/rawx14

As we can see, both algorithms handled the task very well.


Now let's make the task a bit more complicated

let's add a cumulative sum, i.e. a trend, to part of the data

z1 <- cumsum(rnorm(1000))
z2 <- cumsum(rnorm(1000))
z3 <- rnorm(1000)
z4 <- rnorm(1000)

and change the target to look like

y <- z1+z2+z3

so we sum two trending predictors and one ordinary one, while z4 turns out to be noise, because it does not take part in the target y

and so we get the following result

http://prntscr.com/rax81b

Our boosting falls apart completely, while GMDH barely notices (presumably because trees cannot extrapolate beyond the range of values they saw in training, while a polynomial regression can).


I managed to "kill" MSUA only with this wild target

y <- ((z1*z2)/3)+((z3*2)/z4)

And even then not completely. And what about boosting? )))

http://prntscr.com/raxdnz


code to play with

set.seed(123)
# two trending (cumulative-sum) predictors and two plain random ones
z1 <- cumsum(rnorm(1000))
z2 <- cumsum(rnorm(1000))
z3 <- rnorm(1000)
z4 <- rnorm(1000)

# the "wild" target; z4 only enters as a divisor
y <- ((z1*z2)/3) + ((z3*2)/z4)

x <- cbind.data.frame(z1, z2, z3, z4) ; colnames(x) <- paste0("z", 1:ncol(x))

# one training piece and two test pieces
tr  <- 1:500
ts  <- 501:800
ts2 <- 801:1000

# boosting (GBM), number of trees chosen by cross-validation
library(gbm)
rf <- gbm(y[tr] ~ ., data = x[tr, ],
          distribution = "gaussian", n.trees = 1000,
          cv.folds = 5)
best.iter.max <- gbm.perf(rf, method = "cv")
prg <- predict(rf, x[c(tr, ts, ts2), ], n.trees = best.iter.max)

# GMDH regression with the PRESS external criterion
library(GMDHreg)
gmd <- gmdh.gia(X = as.matrix(x[tr, ]), y = y[tr], prune = 5,
                criteria = "PRESS")
prh <- predict(gmd, as.matrix(x[c(tr, ts, ts2), ]))

# gray = original data, red = GBM, green = GMDH
par(mfrow = c(1, 3))
plot(head(y[tr], 30), t = "l", col = 8, lwd = 10, main = "train")
lines(head(prg[tr], 30), col = 2, lwd = 2)
lines(head(prh[tr], 30), col = 3, lwd = 2)
plot(head(y[ts], 30), t = "l", col = 8, lwd = 10, main = "test")
lines(head(prg[ts], 30), col = 2, lwd = 2)
lines(head(prh[ts], 30), col = 3, lwd = 2)
plot(head(y[ts2], 30), t = "l", col = 8, lwd = 10, main = "test2")
lines(head(prg[ts2], 30), col = 2, lwd = 2)
lines(head(prh[ts2], 30), col = 3, lwd = 2)
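
To put numbers on the "by eye" comparison, a couple of extra lines (reusing y, prg, prh and the index vectors from the code above) give the out-of-sample errors directly:

# out-of-sample mean squared error for both models (continues the code above)
mse <- function(a, b) mean((a - b)^2)
round(c(gbm_test   = mse(y[ts],  prg[ts]),  gmdh_test  = mse(y[ts],  prh[ts]),
        gbm_test2  = mse(y[ts2], prg[ts2]), gmdh_test2 = mse(y[ts2], prh[ts2])), 3)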


 
secret:

Then I don't see any mathematics here; it's more brainwork, plus coding. A GA is a trivial thing.

Why then does everyone run around with this GMDH, writing dissertations that are impossible to understand, if inside it is something primitive, intuitively understandable since kindergarten?

I don't know, but it describes the data much better; the post is written, the code is posted.

 
mytarmailS:

I disagree...

Let's make a small test, quick, by eye )

I have no desire to mess around with R (I use Python); maybe the reason is that GMDH creates fake regressors of its own, so it fits. If you do the same kind of selection for boosting, there will be no difference.

Here's a GMDH-style enumeration for the forest

https://www.mql5.com/ru/code/22915

 
Maxim Dmitrievsky:

I have no desire to mess around with R (I use Python); maybe the reason is that GMDH creates fake regressors of its own, so it fits. If you do the same kind of selection for boosting, there will be no difference.

Here's a GMDH-style enumeration for the forest

https://www.mql5.com/ru/code/22915

First, what fake regressors? What nonsense; then why does GMDH hold up when the problem gets harder?

Secondly, in my example both GMDH and boosting get exactly the same data.

Thirdly, you don't need to mess around with anything: can't you make a matrix with four random variables in Python and then take the cumulative sum of them? To check your boosting?

2 lines of code ))


I'm curious myself what will come out of it.
