neural network and inputs - page 41

 
nikelodeon:
I have prepared a sample with 11 input columns + 1 output column. Question: when starting the Predictor, what number of columns should I specify? Only the inputs (11), or the inputs together with the output (12)?

That is, if the source spreadsheet (csv file) has N > 46 columns and M rows, then the computation time is proportional to: 2 * (N - 2) + M - 2

If the number of columns in the spreadsheet N < 13, the time spent on calculations is proportional to 2 * (N - 2)^2 + M - 2

That is, if the number of columns in the spreadsheet is N = 12 (10 inputs), the computing time on the same computer will be the same as for N = 1025 (1023 inputs), because the MSUA kernel transforms are activated when the number of inputs is less than 11.

 
OK, we've got the timing sorted out. But there's another thing I've noticed. If you optimize the same file several times, you get different results each time, sometimes very different ones. What causes this, Yuri? I thought that optimizing the same file should always arrive at the same result, but it turns out the result is different. :-(
 
nikelodeon:
OK, we've sorted out the timing. But here's another thing I have noticed. If you optimize one and the same file several times, you get different results each time, sometimes very different ones. What causes this, Yuri? I thought that optimizing the same file should always arrive at the same result, but it turns out the result is different. :-(

It has to do with randomness. jPrediction splits the general sample into two subsamples, a training one and a control one, and it makes 100 attempts at such a split.

At each attempt, a model is built on the training subsample and then checked on the control subsample. The results obtained on the control subsample (generalization ability) are displayed. The results on the training subsample are of no use at all, because they are just a fit, so they are not displayed anywhere.

If the best generalization results differ greatly between runs on the same sample, it means the sample is unrepresentative: there is too much rubbish in the inputs. That is, the predictors have low significance.

If the sample is representative, the same best model can be built more than once in 100 attempts, i.e. the result does not depend much on which examples end up in the training subsample and which in the control subsample.
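
For anyone who wants to see this effect on their own data, here is a minimal Python sketch of the idea described above: many random training/control splits, with the control-sample result recorded for each one. It is not jPrediction's actual code. The one-input threshold "model", the 50/50 split and all function names are assumptions made up for illustration, and the toy model assumes that class 1 tends to have the larger input value.

```python
import random


def split_once(rows, rng, train_fraction=0.5):
    """Shuffle a copy of rows and cut it into (training, control) parts."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """Toy model (assumption): midpoint between the two class means.

    Returns None if the training part lacks one of the classes.
    """
    ones = [x for x, y in train if y == 1]
    zeros = [x for x, y in train if y == 0]
    if not ones or not zeros:
        return None
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2


def control_accuracy(threshold, control):
    """Share of control rows where 'x > threshold' agrees with label 1."""
    hits = sum((x > threshold) == (y == 1) for x, y in control)
    return hits / len(control)


def repeated_split_scores(rows, attempts=100, seed=None):
    """Control-sample accuracy for each random split attempt."""
    rng = random.Random(seed)
    scores = []
    for _ in range(attempts):
        train, control = split_once(rows, rng)
        threshold = fit_threshold(train)
        if threshold is not None:  # skip degenerate one-class splits
            scores.append(control_accuracy(threshold, control))
    return scores


def spread(scores):
    """Gap between the best and the worst control-sample result."""
    return max(scores) - min(scores)
```

With `seed=None` every run draws different splits, which is why two runs on the same file need not agree. A large `spread` across attempts would be a hint, in the sense of the post above, that the inputs carry little information; a small one suggests the result does not depend much on how the sample was split.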

 
Reshetov:


When forecasting time series with a neural network, partitioning the sample with a PRNG is of no practical use. It is complete nonsense that shows nothing.

Only a deliberate (non-random) partitioning makes sense, with the control sample placed at the end of the time series.
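
A minimal Python sketch of that kind of split, for comparison with the random one: the series keeps its time order and the control sample is simply the tail. The function name and the 20% default are assumptions chosen for illustration, not something taken from any particular package.

```python
def chronological_split(series, control_fraction=0.2):
    """Split an ordered series into (training, control) without shuffling.

    The control part is the last `control_fraction` of the observations,
    so every control point lies later in time than every training point.
    """
    if not 0 < control_fraction < 1:
        raise ValueError("control_fraction must be between 0 and 1")
    control_size = max(1, int(round(len(series) * control_fraction)))
    cut = len(series) - control_size
    if cut < 1:
        raise ValueError("series is too short to split")
    return series[:cut], series[cut:]


# Usage: ten observations in time order, the last two held out for control.
train, control = chronological_split(list(range(10)), control_fraction=0.2)
```

Because nothing is shuffled, the model never sees data from after the control period, which is the point of the objection above.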

 
Good evening. Is there an example of an EA that uses a neural network with, say, a moving average or some other indicator? Or, even simpler, is there an EA built into MT that uses a neural network on a moving average?