Discussing the article: "Neural networks made easy (Part 57): Stochastic Marginal Actor-Critic (SMAC)" - page 2
Regards, Vladimir.
Every pass of the Test EA generates drastically different results, as if the model were different from all previous ones. It is obvious that the model evolves with every single pass of Test, but the behaviour of this EA hardly looks like evolution, so what is behind it?
Here are some pictures:
Buy and sell transactions seem to be insufficiently controlled in the Test and possibly Research scripts. Here are some messages:
2024.04.27 13:40:29.423 Core 01 2024.04.22 18:30:00 current account state: Balance: 9892.14, Credit: 0.00, Commission: 0.00, Accumulated: 0.00, Assets: 0.00, Liabilities: 0.00, Equity 9892.14, Margin: 0.00, FreeMargin: 9892.14
2024.04.27 13:40:29.423 Core 01 2024.04.22 18:30:00 failed market buy 0.96 EURUSD.pro sl: 1.06306 tp: 1.08465 [No money]
Unless margin overruns are intended, simple limits placed on buy_lot after line 275 and on sell_lot after line 296 would eliminate this behaviour of the Test script.
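To illustrate the kind of limit meant here, below is a minimal sketch in Python rather than MQL5. The function name clamp_lot, its parameters, and the margin-per-lot figure are hypothetical and are not taken from the article's code; in the EA itself the free margin and the required margin would have to come from the terminal.

```python
import math

def clamp_lot(requested_lot, free_margin, margin_per_lot,
              min_lot=0.01, lot_step=0.01):
    """Return the largest volume not above requested_lot that free_margin
    can cover, or 0.0 if even min_lot is unaffordable.

    All parameter names are hypothetical; margin_per_lot is assumed to be
    the margin required to open exactly 1.0 lot.
    """
    if margin_per_lot <= 0 or free_margin <= 0:
        return 0.0
    affordable = free_margin / margin_per_lot
    lot = min(requested_lot, affordable)
    # Round down to the volume step so the margin limit is never exceeded.
    steps = math.floor(lot / lot_step + 1e-9)
    lot = round(steps * lot_step, 8)
    return lot if lot >= min_lot else 0.0

# Free margin from the log above; 21000.0 per lot is an assumed figure.
print(clamp_lot(0.96, 9892.14, 21000.0))
```

With these assumed numbers the requested 0.96 lots would be cut down to a volume the account can actually cover, and a result of 0.0 means the order should be skipped.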
This model uses a stochastic Actor policy, so at the beginning of training we see random deals on every pass. We collect these passes and restart training of the model, and we repeat this process several times until the Actor finds a good action policy.
Let's put the question another way. Having collected samples (Research) and processed them (Study), we run the Test script. In several consecutive runs, without any Research or Study in between, the results obtained are completely different.
The Test script loads a trained model in the OnInit subroutine (line 99). Here we feed the EA with a model which should not change during Test processing. It should be stable, as far as I understand, so the final results should not change.
In the meantime, we do not conduct any model training. The Test only collects more samples.
Randomness would rather be expected in the Research module, and possibly in Study while a policy is being optimized.
The Actor is invoked in line 240 to calculate the feed-forward results. If it isn't randomly initialized at creation, and I believe it isn't, it should not behave randomly.
Do you find any misconception in the reasoning above?
The Actor uses a stochastic policy. We implement it with a VAE.
The CNeuronVAEOCL layer uses the output of the previous layer as the mean and STD of a Gaussian distribution and samples an action from this distribution. At the start, the model is initialized with random weights, so it generates random means and STDs. As a result, we get random actions on every test pass of the model. During training, the model finds a mean for every state, and the STD tends to zero.
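The effect described above can be reproduced with a small Python sketch. It is not the CNeuronVAEOCL code; the names sample_action and run_pass, and the fixed mean and STD values, are assumptions made for illustration. It samples an action as mean + STD * noise, so the same fixed "weights" still give different actions on each pass unless the STD is zero.

```python
import random

def sample_action(mean, std, rng):
    """Draw one action from N(mean, std^2) as mean + std * eps."""
    eps = rng.gauss(0.0, 1.0)  # standard normal noise
    return mean + std * eps

def run_pass(mean, std, seed, steps=5):
    """Simulate one test pass: the model output (mean, std) is fixed,
    only the random noise differs between passes (hypothetical helper)."""
    rng = random.Random(seed)
    return [sample_action(mean, std, rng) for _ in range(steps)]

# Same mean and STD, different noise: the actions differ between passes.
print(run_pass(0.3, 0.5, seed=1))
print(run_pass(0.3, 0.5, seed=2))

# STD of zero: every pass yields the mean, so passes become identical.
print(run_pass(0.3, 0.0, seed=1))
print(run_pass(0.3, 0.0, seed=2))
```

In this sketch, loading the same model corresponds to keeping mean and STD fixed, and the differences between passes come only from the sampling step.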