Discussing the article: "Neural networks made easy (Part 39): Go-Explore, a different approach to exploration"

 

Check out the new article: Neural networks made easy (Part 39): Go-Explore, a different approach to exploration.

We continue studying environment exploration in reinforcement learning models. In this article we look at another algorithm, Go-Explore, which lets you explore the environment effectively at the model training stage.

The main idea of Go-Explore is to remember and return to promising states. This is fundamental for effective operation when the number of rewards is limited. This idea is so flexible and broad that it can be implemented in a variety of ways. 


Unlike most reinforcement learning algorithms, Go-Explore does not focus on directly solving the target problem, but rather on finding relevant states and actions in the state space that can lead to achieving the target state. To achieve this, the algorithm has two main phases: search and reuse.

The first phase is to go through all the states in the state space and record each state visited in a state "map". After this, the algorithm begins to study each visited state in more detail and to collect information about actions that can lead to other interesting states.

The second phase is to reuse previously learned states and actions to find new solutions. The algorithm stores the most successful trajectories and uses them to generate new states that can lead to even more successful solutions.
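To make the "remember and return" idea concrete, here is a minimal Python sketch of the two phases on a toy problem. It is not the article's MQL5 implementation: the one-dimensional environment, the state-selection rule, and all names (`step`, `go_explore`, `replay`, `GOAL`) are assumptions chosen for illustration only.

```python
import random

# Hypothetical toy environment: the agent walks along cells 0..GOAL and the
# only reward sits at the far end, so rewards are sparse.
GOAL = 20
ACTIONS = (-1, +1)

def step(state, action):
    """Apply an action, clamp to the line, return (next_state, reward)."""
    nxt = max(0, min(GOAL, state + action))
    return nxt, (1.0 if nxt == GOAL else 0.0)

def replay(actions, start=0):
    """Phase 2 (reuse): replay a stored trajectory from the start state."""
    state, total = start, 0.0
    for action in actions:
        state, reward = step(state, action)
        total += reward
    return state, total

def go_explore(iterations=2000, explore_steps=5, seed=0):
    """Phase 1 (search): build an archive of visited states."""
    rng = random.Random(seed)
    archive = {0: []}   # state -> action sequence that reaches it
    visits = {0: 0}     # state -> how many times it was selected
    for _ in range(iterations):
        # Select a promising state. Assumed rule: least selected so far,
        # ties broken in favour of the state farthest from the start.
        target = min(archive, key=lambda s: (visits[s], -s))
        visits[target] += 1
        # Return to it by replaying the remembered actions.
        trajectory = list(archive[target])
        state, _ = replay(trajectory)
        # Explore from there with random actions.
        for _ in range(explore_steps):
            action = rng.choice(ACTIONS)
            state, reward = step(state, action)
            trajectory.append(action)
            # Remember new states, or a shorter route to a known state.
            if state not in archive or len(trajectory) < len(archive[state]):
                archive[state] = list(trajectory)
                visits.setdefault(state, 0)
            if reward > 0:
                return archive
    return archive

archive = go_explore()
if GOAL in archive:
    print("steps to goal:", len(archive[GOAL]))
```

Returning by replay works here only because the toy environment is deterministic. In a trading setting the "state" would have to be something like a point in historical data plus the account state, which is a separate design decision.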

Author: Dmitriy Gizlyk

 
Hi. Faza1 ran in the tester and created one empty file, GoExploer.bd, in the shared folder. Faza2 will not attach to the chart.
 
On the second attempt, the process started. The start date had been set too far back; I set it to one month, as you have.
 
star-ik #:
On the second attempt, the process started. The start date had been set too far back; I set it to one month, as you have.
What's the result?
 
More or less. But the drawdowns are big. It opens a position and waits a long time for a favourable moment to close it. It often adds to the position. It very rarely sells and almost always buys. There are arrows on every bar. As soon as the market opens, I will try it on a demo account.
 
On the demo account it is firmly in the red. It adds to the position at the opening of every new bar. I don't understand how it made a profit in the tester.
 
star-ik #:
On the demo account it is firmly in the red. It adds to the position at the opening of every new bar. I don't understand how it made a profit in the tester.

thanks

 
star-ik #:
On the demo account it is firmly in the red. It adds to the position at the opening of every new bar. I don't understand how it made a profit in the tester.

Ahahahah))))

Groundhog Day.


I sympathise with you.

Try pressing the "Start" button in the strategy tester several times. You'll be surprised.

 

Good afternoon, Dmitriy. Thank you for such a wonderful series of articles. I have tried all your Expert Advisors, but I have a problem with the latest ones.

The Expert Advisor from article 36 (the one with the largest neural network) completes the run in the tester, but the GPU is not loaded during the test and the Expert Advisor does not try to trade. The balance graph does not change. There are no errors in the tester log. 1 KB files appear in the Common\Files folder.

The Expert Advisors from articles 37 and 38 cannot be tested at all. The test starts, but makes no progress, while the GPU is loaded at 100%. This continues until the MT5 process is killed in Task Manager. No files are created in Common\Files. There are no errors in the tester log.

The Faza1 Expert Advisor from this article completes the test without errors in the log, but the GoExploer.bd file it creates is also only 1 KB.

Can you please tell me where to dig? The other Expert Advisors from this series (the ones that are attached to a chart) work normally and run their calculations on the GPU. The video card is an RTX 3060 with 12 GB.

 
Viktor Kudriavtsev #:

The Faza1 Expert Advisor from this article completes the test without errors in the log, but the GoExploer.bd file it creates is also only 1 KB.


The Faza1 Expert Advisor adds data to the database only for passes that end with a positive profit. If all passes were unprofitable, it will not save anything. Try running it several times in optimisation mode.
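The filtering rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Expert Advisor's actual MQL5 code; `TestPass` and `select_for_database` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class TestPass:
    """One tester pass: final profit and the trajectory it produced (hypothetical)."""
    profit: float
    trajectory: list = field(default_factory=list)

def select_for_database(passes):
    """Keep only the passes that ended with a positive profit."""
    return [p for p in passes if p.profit > 0]

# If every pass loses money, nothing is selected and nothing gets saved.
losing_run = [TestPass(-12.5), TestPass(-0.3)]
print(len(select_for_database(losing_run)))

# Several passes (for example, from optimisation mode) raise the chance
# that at least one of them is profitable and gets stored.
mixed_run = [TestPass(-4.0), TestPass(7.2), TestPass(0.0)]
print(len(select_for_database(mixed_run)))
```

Under this rule, a database file that stays at about 1 KB after a single run would simply mean that no pass ended in profit.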
 
star-ik #:
On the demo he's specifically minus. It fills up at every opening of a new bar. I don't understand how he made a plus in the tester.

What was the training period? A short training period only shows whether the model is able to learn. Such limited experience is not enough to generalise to future states of the system.