Machine learning in trading: theory, models, practice and algo-trading - page 4

 
Alexey Burnakov:

NS did very well.

Random Forest could not handle a problem in which the result is determined by the interaction of a set of variables, while the individual significance of each predictor was deliberately made zero.

I see no evidence that the NS has coped with anything.

Overfitting is a universal evil in science, and in model building in particular.

That is why the error needs to be measured on three sets:

  • the training set, in the sense Rattle uses it (OOB, test, validation), will do just fine;
  • a set that lies outside the training set by dates;
  • another set that also lies outside the training set by dates.

The last two sets must come without shuffling, since that is how they arrive in the terminal: bar after bar.

The error should be about the same on all three sets. At the same time, you have to fix the set of predictors you take when training the model.
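
In R terms, the check might look like the sketch below. The data frame df, its class column output, the 60/20/20 date split, and the choice of rpart as the model are all illustrative assumptions:

    # A minimal sketch of the three-set check. Rows of `df` are assumed
    # to be ordered by date; `output` is the class being predicted.
    library(rpart)

    df$output <- as.factor(df$output)
    n  <- nrow(df)
    i1 <- floor(0.6 * n)
    i2 <- floor(0.8 * n)

    train  <- df[1:i1, ]         # training window
    check1 <- df[(i1 + 1):i2, ]  # later dates, no shuffling
    check2 <- df[(i2 + 1):n, ]   # still later dates, no shuffling

    model <- rpart(output ~ ., data = train)  # fixed predictor set

    err <- function(d) mean(predict(model, d, type = "class") != d$output)
    round(c(train = err(train), check1 = err(check1), check2 = err(check2)), 3)

If the training error is much lower than the other two, that is exactly the overfitting being warned about.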

 
Alexey Burnakov:


Random Forest could not handle a problem in which the result is determined by the interaction of a set of variables, while the individual significance of each predictor was deliberately made zero.

Your idea of accounting for the interaction between predictors is a revolution in statistics. Until now I thought interaction between predictors was an evil: not only are the predictors themselves usually non-stationary, we would also be trying to account for relationships between these non-stationary random processes.

In machine learning it is considered mandatory to get rid of interacting variables. Moreover, there are extremely efficient algorithms, such as principal component analysis, that remove the interaction and convert an interacting set of predictors into a set of independent ones.
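
For reference, a minimal R sketch of that idea; the data frame predictors is a hypothetical stand-in for the real inputs:

    # PCA: turn correlated (interacting) predictors into uncorrelated
    # principal components. `predictors` is a hypothetical numeric frame.
    pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

    summary(pca)       # variance explained by each component
    pc <- pca$x        # rotated columns, mutually uncorrelated
    round(cor(pc), 2)  # off-diagonal correlations are ~0

Worth noting, though, that PCA removes only linear correlation; a nonlinear dependence such as a parity rule would pass through it untouched.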

 
SanSanych Fomenko:

I see no evidence that the NS has coped with anything.

Overfitting is a universal evil in science, and in model building in particular.

That is why the error needs to be measured on three sets:

  • the training set, in the sense Rattle uses it (OOB, test, validation), will do just fine;
  • a set that lies outside the training set by dates;
  • another set that also lies outside the training set by dates.

The last two sets must come without shuffling, since that is how they arrive in the terminal: bar after bar.

The error should be about the same on all three sets. At the same time, you have to fix the set of predictors you take when training the model.

Let's put it this way: even though it is not part of the assignment, I am attaching a validation sample on which the trained model can be run and the prediction accuracy measured.

But again, this is not required. Note that I built the validation file, again, from the embedded pattern.

Files:
 

A pattern embedded in the data:

Count by field input_19, grouped by output
(columns: input_1, input_3, input_5, input_7, input_9, input_11, then the row count, the sum of the predictors, and whether that sum is even (TRUE) or odd (FALSE))
1 1 1 1 1 1 143 6 TRUE
1 1 1 1 1 2 100 7 FALSE
1 1 1 1 2 1 121 7 FALSE
1 1 1 1 2 2 119 8 TRUE
1 1 1 2 1 1 114 7 FALSE
1 1 1 2 1 2 124 8 TRUE
1 1 1 2 2 1 105 8 TRUE
1 1 1 2 2 2 102 9 FALSE
1 1 2 1 1 1 101 7 FALSE
1 1 2 1 1 2 131 8 TRUE
1 1 2 1 2 1 122 8 TRUE
1 1 2 1 2 2 114 9 FALSE
1 1 2 2 1 1 111 8 TRUE
1 1 2 2 1 2 98 9 FALSE
1 1 2 2 2 1 123 9 FALSE
1 1 2 2 2 2 112 10 TRUE
1 2 1 1 1 1 128 7 FALSE
1 2 1 1 1 2 114 8 TRUE
1 2 1 1 2 1 111 8 TRUE
1 2 1 1 2 2 126 9 FALSE
1 2 1 2 1 1 143 8 TRUE
1 2 1 2 1 2 95 9 FALSE
1 2 1 2 2 1 108 9 FALSE
1 2 1 2 2 2 117 10 TRUE
1 2 2 1 1 1 112 8 TRUE
1 2 2 1 1 2 132 9 FALSE
1 2 2 1 2 1 92 9 FALSE
1 2 2 1 2 2 134 10 TRUE
1 2 2 2 1 1 110 9 FALSE
1 2 2 2 1 2 114 10 TRUE
1 2 2 2 2 1 120 10 TRUE
1 2 2 2 2 2 108 11 FALSE
2 1 1 1 1 1 109 7 FALSE
2 1 1 1 1 2 133 8 TRUE
2 1 1 1 2 1 99 8 TRUE
2 1 1 1 2 2 115 9 FALSE
2 1 1 2 1 1 123 8 TRUE
2 1 1 2 1 2 116 9 FALSE
2 1 1 2 2 1 131 9 FALSE
2 1 1 2 2 2 119 10 TRUE
2 1 2 1 1 1 96 8 TRUE
2 1 2 1 1 2 120 9 FALSE
2 1 2 1 2 1 111 9 FALSE
2 1 2 1 2 2 99 10 TRUE
2 1 2 2 1 1 132 9 FALSE
2 1 2 2 1 2 110 10 TRUE
2 1 2 2 2 1 93 10 TRUE
2 1 2 2 2 2 106 11 FALSE
2 2 1 1 1 1 100 8 TRUE
2 2 1 1 1 2 127 9 FALSE
2 2 1 1 2 1 127 9 FALSE
2 2 1 1 2 2 101 10 TRUE
2 2 1 2 1 1 119 9 FALSE
2 2 1 2 1 2 120 10 TRUE
2 2 1 2 2 1 99 10 TRUE
2 2 1 2 2 2 106 11 FALSE
2 2 2 1 1 1 133 9 FALSE
2 2 2 1 1 2 97 10 TRUE
2 2 2 1 2 1 100 10 TRUE
2 2 2 1 2 2 116 11 FALSE
2 2 2 2 1 1 119 10 TRUE
2 2 2 2 1 2 118 11 FALSE
2 2 2 2 2 1 102 11 FALSE
2 2 2 2 2 2 128 12 TRUE
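
The rule embedded in the table is visible in the last two columns: the output is simply the parity of the sum of the six informative inputs. A minimal R check, assuming the file is loaded as a data frame df with the column names shown above:

    # Verify the embedded pattern: output == parity of the sum of the
    # six informative inputs. `df` and its column names are assumptions.
    inputs <- c("input_1", "input_3", "input_5",
                "input_7", "input_9", "input_11")

    sums    <- rowSums(df[, inputs])
    is_even <- sums %% 2 == 0
    table(is_even, df$output)  # each parity class maps to a single output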
 
SanSanych Fomenko:

I see no evidence that the NS has coped with anything.

The neural network did solve this problem; the log with the code from Rattle is attached. There are a couple of changes in the code that calls the network: I increased the maximum number of iterations, and I removed the connections that go from the inputs directly to the outputs, bypassing the hidden layer (the skip=TRUE option), because these two limitations spoil everything.

I ran validation on the new file; in both cases the error is almost 0% (there is one single error when validating on the second file).

But since an NS is a black box of sorts, there is no way to know the logic of its solution. You can look at the weights, compute the average absolute weight going to each input, and draw a diagram, and find out that inputs 1, 3, 5, 7, 9 and 11 are more important than the rest. Yet the other inputs are also used for some reason; there are no zero weights anywhere. That is, it turns out the other way around: first we train, and only then can we identify the important inputs.
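
Rattle builds its network with nnet, so the two changes presumably amount to something like the sketch below; size and maxit are illustrative values, not the exact ones from the attached log, and the last lines mirror the average-absolute-weight idea:

    # Modified Rattle-style nnet call: more iterations, and no skip-layer
    # connections from the inputs straight to the outputs.
    library(nnet)

    model <- nnet(as.factor(output) ~ ., data = train,
                  size    = 10,     # hidden layer size (illustrative)
                  skip    = FALSE,  # drop direct input->output links
                  maxit   = 1000,   # far more than the default 100
                  MaxNWts = 10000, trace = FALSE)

    # Average absolute input->hidden weight per input. nnet stores the
    # first-layer weights as (bias, input_1, ..., input_p) per hidden unit.
    p  <- length(model$coefnames)
    w1 <- matrix(model$wts[1:((p + 1) * model$n[2])], nrow = p + 1)
    imp <- rowMeans(abs(w1))[-1]  # drop the bias row
    sort(setNames(imp, model$coefnames), decreasing = TRUE)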

 
Dr.Trader:

The neural network did solve this problem; the log with the code from Rattle is attached. There are a couple of changes in the code that calls the network: I increased the maximum number of iterations, and I removed the connections that go from the inputs directly to the outputs, bypassing the hidden layer (the skip=TRUE option), because these two limitations spoil everything.

I ran validation on the new file; in both cases the error is almost 0% (there is one single error when validating on the second file).

But since an NS is a black box of sorts, there is no way to know the logic of its solution. You can look at the weights, compute the average absolute weight going to each input, and draw a diagram, and find out that inputs 1, 3, 5, 7, 9 and 11 are more important than the rest. Yet the other inputs are also used for some reason; there are no zero weights anywhere. That is, it turns out the other way around: first we train, and only then can we identify the important inputs.

That is so: the other inputs are noise. This is the drawback of many methods - noise variables are never completely removed.

Perhaps the network needs to be trained longer and with smaller steps.

But on the whole, bravo. The NS has solved a difficult problem.
 
SanSanych Fomenko:

I see no evidence that the NS has coped with anything.

Overfitting is a universal evil in science, and in model building in particular.

That is why the error needs to be measured on three sets:

  • the training set, in the sense Rattle uses it (OOB, test, validation), will do just fine;
  • a set that lies outside the training set by dates;
  • another set that also lies outside the training set by dates.

The last two sets must come without shuffling, since that is how they arrive in the terminal: bar after bar.

The error should be about the same on all three sets. At the same time, you have to fix the set of predictors you take when training the model.

Does the obvious need to be proven? During training, the weights of the inputs carrying contradictory data decreased; that is, we can say the contradictory inputs were blocked.

There is no overfitting problem in this case, because the trained network is not used for anything else.

How practical this method is, that is the question. Isn't it rather heavy artillery?

 
Dmitry Fedoseev:

How practical this method is, that is the question. Isn't it rather heavy artillery?

You could try another method. But it seems to me that the remedy fits the problem.
 
Alexey Burnakov:
You could try another method. But it seems to me that the remedy fits the problem.
It copes with the problem, and does it well. But I always wonder whether there is something more effective and simpler.
 
Dmitry Fedoseev:
It copes with the problem, and does it well. But I always wonder whether there is something more effective and simpler.
Until you try it, you won't know. The usual stepwise inclusion and exclusion of variables won't work here. What else is there?