Interesting thing:
Minimum-value pruning is an easy-to-use algorithm in which, at each step, the weights with the smallest absolute value are disconnected. This algorithm requires retraining the network at almost every step and gives suboptimal results.
Am I understanding the order of operations of this function correctly?
1) Fully train the original 12-8-5-1 network
2) Find the connection with the minimum weight and remove the corresponding input
3) Re-train the 11-8-5-1 network without the removed input
And so on, for several dozen retraining cycles, until only a 6-2-1-1 network is left.
It seems to me that the time spent on such an elimination of insignificant weights, inputs and hidden neurons will be far longer than a single full training run (which we did in step 1).
What are the advantages of this approach?
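The loop described in the question can be sketched roughly as follows. This is a hypothetical Python illustration of iterative magnitude pruning, not the package's actual code: the retraining pass is left as a no-op placeholder, and the layer sizes simply follow the 12-8-5-1 example from the thread.

```python
import numpy as np

def prune_to_n_inputs(W, n_keep):
    """Repeatedly cut the single smallest-magnitude connection until only
    n_keep inputs still have at least one nonzero outgoing weight."""
    W = W.copy()

    def active(M):
        # an input counts as active while any of its outgoing weights is nonzero
        return int(np.sum(np.abs(M).sum(axis=1) > 0))

    while active(W) > n_keep:
        # mask out already-cut weights so they are never selected again
        mags = np.where(W != 0, np.abs(W), np.inf)
        W[np.unravel_index(np.argmin(mags), W.shape)] = 0.0
        # a real implementation would fully retrain the reduced network here
    return W

rng = np.random.default_rng(0)
W = rng.normal(size=(12, 8))          # input-to-first-hidden weights, 12-8-5-1 net
W_small = prune_to_n_inputs(W, 6)     # 6 active inputs remain
```

Because an input is only switched off once all of its connections have been cut, the loop runs many retraining cycles before the network shrinks, which matches the time concern raised above.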
1. The algorithm works exactly like this, with one exception: neurons in all hidden layers are eliminated as well.
2. A minimal set of inputs and a minimal structure is found that gives the same result as the full set.
Advantages? We remove everything unnecessary that generates false classification. At least, that is what the developers claim.
It is just one way of selecting important predictors.
Good luck
1) If an input no longer has any connections to the internal neurons, the input itself can be switched off.
2) I am confused by how many times more time this takes than simply training the full model as in point 1. If the result is the same, why spend so much time?
I can assume that the eliminated predictors will be ignored during future retraining, and that is where the time saving comes in. But the importance of predictors can also change over time.
I was interested in this trick and started to do it too, but gave up after realising how much time it takes.
Perhaps the pruning iterations can tolerate a larger error and fewer training epochs than the final training.
I wonder what logic is used to screen out hidden neurons? Each neuron has many input connections. By the minimum sum of input weights? Or the minimum sum of output weights? Or the total sum?
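For illustration, the three candidate criteria mentioned in the question could be computed like this. This is a hypothetical Python sketch; the rule the package actually uses may differ.

```python
import numpy as np

def neuron_scores(W_in, W_out):
    """Three candidate importance scores for hidden-layer neurons.

    W_in:  (n_prev, n_hidden) weights coming into the hidden layer
    W_out: (n_hidden, n_next) weights leaving the hidden layer
    """
    s_in = np.abs(W_in).sum(axis=0)    # sum of absolute input weights
    s_out = np.abs(W_out).sum(axis=1)  # sum of absolute output weights
    return s_in, s_out, s_in + s_out   # plus the total of both

rng = np.random.default_rng(1)
W1 = rng.normal(size=(12, 8))          # 12 inputs -> 8 hidden neurons
W2 = rng.normal(size=(8, 5))           # 8 hidden -> 5 next-layer neurons
s_in, s_out, s_tot = neuron_scores(W1, W2)
weakest = int(np.argmin(s_tot))        # candidate neuron to eliminate
```

Whichever score is used, the neuron with the smallest value would be the first candidate for elimination.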
Look at the package and the function description. I haven't looked into it in depth, but several models (e.g. H2O) determine the importance of predictors in this way. I checked it myself and didn't find it reliable.
Of course, the importance of predictors changes over time. But if you have read my articles, you should have noticed that I strongly recommend retraining the model regularly, whenever its quality drops below a pre-defined limit.
This is the only correct way. IMHO
Good luck
Wouldn't it be better to feed the hour and day data into the NN not as a single predictor, but as separate predictors for the hour number and the day number?
If it is a single one, the weight/value of Monday (1) and Tuesday (2) will differ by 100%, while Thursday (4) and Friday (5) differ by only 20%. With hours 1, 2 and 22, 23 the difference is even stronger. And going from 5 back to 1, or from 23 to 1, would be a huge jump in value altogether.
That is, the significance of days and hours will be distorted if they are represented by a single predictor.
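One common workaround for the jump from 23 back to 1 described above is a cyclic (sine/cosine) encoding. This is a general technique sketched here in Python, not something the article itself uses.

```python
import numpy as np

def cyclic_encode(values, period):
    """Map a cyclic variable (hour 0-23, weekday 1-5, ...) onto the unit
    circle, so that the last value sits next to the first."""
    angle = 2 * np.pi * np.asarray(values, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])

hours = cyclic_encode([23, 0], 24)
# hour 23 and hour 0 are now close in feature space,
# instead of 23 units apart as in the raw numeric encoding
gap = float(np.linalg.norm(hours[0] - hours[1]))
```

Two features per variable are enough to remove the artificial discontinuity, without expanding to a full one-hot column per hour.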
Hour of day and day (of week, month, year) are nominal variables, not numeric. We can only talk about whether they are ordered or not. So thanks for the suggestion, but it is not accepted.
Use these variables as numeric? You can experiment, but I'm not looking in that direction. If you get any results, please share.
Good luck
Doesn't the R package funModeling have the bayesian_plot() function?
New article Deep Neural Networks (Part II). Working out and selecting predictors has been published:
The second article of the series about deep neural networks will consider the transformation and choice of predictors during the process of preparing data for training a model.
Now, we want to see the distribution of NA in variables after the outliers have been removed.
require(VIM)
evalq(a <- aggr(x.sin.out), env)

Fig.6. Distribution of NA in the data set
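For readers without R at hand, the per-variable share of missing values that VIM::aggr visualizes can be approximated in a few lines of Python. The column names below are invented for illustration, not taken from the article's data set.

```python
import math

# toy data with some missing (NaN) entries per column
data = {
    "x": [1.0, float("nan"), 3.0, 4.0],
    "y": [float("nan"), float("nan"), 1.0, 2.0],
}

# fraction of missing values in each variable
na_share = {name: sum(math.isnan(v) for v in values) / len(values)
            for name, values in data.items()}
print(na_share)  # {'x': 0.25, 'y': 0.5}
```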
Author: Vladimir Perervenko