Discussing the article: "Neural networks made easy (Part 44): Learning skills with dynamics in mind"

 

Check out the new article: Neural networks made easy (Part 44): Learning skills with dynamics in mind.

In the previous article, we introduced the DIAYN method, which offers an algorithm for learning a variety of skills. The acquired skills can be used for various tasks, but they can be quite unpredictable, which makes them difficult to use. In this article, we will look at an algorithm for learning predictable skills.

Learning multiple distinct behaviors and the corresponding environmental changes allows model predictive control to be used for planning in behavior space rather than action space. In this regard, the main question is how we can obtain such behaviors, given that they can be random and unpredictable. The Dynamics-Aware Discovery of Skills (DADS) method proposes an unsupervised reinforcement learning system for learning low-level skills with the explicit goal of facilitating model-based control.
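
As a rough illustration of planning in behavior space, the sketch below picks a whole skill rather than a single action. It assumes a hypothetical, already learned dynamics model in which each skill shifts a one-dimensional state by a fixed amount per step; `skill_effects`, `plan_skill` and the numbers are made up for the example and do not come from the article.

```python
def plan_skill(state, goal, skill_effects, horizon):
    """Return the index of the skill whose predicted rollout ends closest to the goal."""
    best_skill, best_dist = None, float("inf")
    for skill, delta in enumerate(skill_effects):
        predicted = state
        # Roll the (assumed) skill dynamics model forward for `horizon` steps.
        for _ in range(horizon):
            predicted = predicted + delta
        dist = abs(goal - predicted)
        if dist < best_dist:
            best_skill, best_dist = skill, dist
    return best_skill


# Hypothetical per-step state change predicted for each of four skills.
skill_effects = [-0.15, -0.05, 0.05, 0.15]
chosen = plan_skill(state=0.0, goal=1.0, skill_effects=skill_effects, horizon=5)
```

The planner searches over a handful of skills instead of every possible action sequence, which is only workable if the effect of each skill is predictable.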

The skills learned using DADS are directly optimized for predictability, which provides a better basis for learning predictive models. A key feature of these skills is that they are acquired entirely through autonomous exploration. This means that the skill set and its predictive model are learned before the task and reward function are designed. Thus, with a sufficient number of skills, the agent can explore the environment quite thoroughly and develop ways to behave in it.

As in the DIAYN method, the DADS algorithm uses two models: a skill model (agent) and a discriminator (skill dynamics model).


Models are trained sequentially and iteratively. First, the discriminator is trained to predict a future state based on the current state and the skill being used. To do this, the current state and the one-hot skill identification vector are fed to the input of the agent model. The agent generates an action that is executed in the environment. As a result of the action, the agent moves to a new state of the environment.
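
The data flow described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: `toy_agent`, `toy_env_step` and the one-dimensional state are hypothetical stand-ins for the agent model and the trading environment, and the "dynamics model" is just the mean state change per skill, not the neural network used in the article.

```python
import random

NUM_SKILLS = 4  # assumed number of skills


def one_hot(skill, num_skills=NUM_SKILLS):
    """Encode a skill index as a one-hot identification vector."""
    return [1.0 if i == skill else 0.0 for i in range(num_skills)]


def toy_agent(state, skill_vec):
    """Hypothetical agent: maps (state, skill vector) to an action."""
    skill = skill_vec.index(1.0)
    return float(skill) - (NUM_SKILLS - 1) / 2.0  # each skill pushes its own way


def toy_env_step(state, action, rng):
    """Hypothetical 1-D environment: the state moves by the action plus noise."""
    return state + 0.1 * action + rng.gauss(0.0, 0.01)


def collect_transitions(episodes, steps, seed=0):
    """Run the agent and record (state, skill_vec, next_state) samples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(episodes):
        skill_vec = one_hot(rng.randrange(NUM_SKILLS))  # one skill per episode
        state = 0.0
        for _ in range(steps):
            action = toy_agent(state, skill_vec)
            next_state = toy_env_step(state, action, rng)
            samples.append((state, skill_vec, next_state))
            state = next_state
    return samples


def fit_skill_dynamics(samples):
    """Fit a trivial dynamics model: the mean state change for each skill."""
    sums = [0.0] * NUM_SKILLS
    counts = [0] * NUM_SKILLS
    for state, skill_vec, next_state in samples:
        k = skill_vec.index(1.0)
        sums[k] += next_state - state
        counts[k] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


samples = collect_transitions(episodes=40, steps=10)
mean_delta = fit_skill_dynamics(samples)
```

Each sample pairs the current state and the skill vector with the state that followed, which is exactly the input and target a skill dynamics model needs.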

Author: Dmitriy Gizlyk

 
With all the previous articles, I get this error:

2024.01.13 00:07:45.142 tester stopped because OnInit returns non-zero code 1

when starting the Strategy Tester.

I have searched a lot. Do I need to create the file myself? If so, where should I do that?
 
Dirar Alzoubi #:
With all the previous articles, I get this error:

2024.01.13 00:07:45.142 tester stopped because OnInit returns non-zero code 1

when starting the Strategy Tester.

I have searched a lot. Do I need to create the file myself? If so, where should I do that?

Hi, what error does the EA return?
First, you must run Research.mq5 in the Strategy Tester, and then run Study.mq5 in real-time mode.
