Dmitriy Gizlyk
4.4 (49)
  • Information
10+ years experience
0 products
0 demo versions
134 jobs
0 signals
0 subscribers
Professional development of programs of any complexity for MT4, MT5, and C#.
Dmitriy Gizlyk
Published article Neural networks made easy (Part 53): Reward decomposition
Neural networks made easy (Part 53): Reward decomposition

We have repeatedly discussed the importance of choosing the reward function correctly, since we use it to encourage the desired behavior of the Agent by adding rewards or penalties for individual actions. But how the Agent interprets our signals remains an open question. In this article, we will talk about reward decomposition as a way of transmitting individual signals to the trained Agent.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 52): Research with optimism and distribution correction
Neural networks made easy (Part 52): Research with optimism and distribution correction

As the model is trained on the experience replay buffer, the current Actor policy moves further and further away from the stored examples, which reduces the overall efficiency of training. In this article, we will look at an algorithm for improving sample efficiency in reinforcement learning.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 51): Behavior-Guided Actor-Critic (BAC)
Neural networks made easy (Part 51): Behavior-Guided Actor-Critic (BAC)

The last two articles considered the Soft Actor-Critic algorithm, which incorporates entropy regularization into the reward function. This approach balances environmental exploration and model exploitation, but it is only applicable to stochastic models. The current article proposes an alternative approach that is applicable to both stochastic and deterministic models.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 50): Soft Actor-Critic (model optimization)
Neural networks made easy (Part 50): Soft Actor-Critic (model optimization)

In the previous article, we implemented the Soft Actor-Critic algorithm, but were unable to train a profitable model. Here we will optimize the previously created model to obtain the desired results.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 49): Soft Actor-Critic
Neural networks made easy (Part 49): Soft Actor-Critic

We continue our discussion of reinforcement learning algorithms for solving continuous action space problems. In this article, I will present the Soft Actor-Critic (SAC) algorithm. The main advantage of SAC is the ability to find optimal policies that not only maximize the expected reward, but also have maximum entropy (diversity) of actions.

JimReaper
JimReaper 2023.07.14
Enjoy!
Shah Yahya
Shah Yahya 2023.07.21
Thanks so much Dmitry! Really appreciate this.
Dmitriy Gizlyk
Published article Neural networks made easy (Part 48): Methods for reducing overestimation of Q-function values
Neural networks made easy (Part 48): Methods for reducing overestimation of Q-function values

In the previous article, we introduced the DDPG method, which allows training models in a continuous action space. However, like other Q-learning methods, DDPG is prone to overestimating Q-function values. This problem often results in training an agent with a suboptimal strategy. In this article, we will look at some approaches to overcome the mentioned issue.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 47): Continuous action space
Neural networks made easy (Part 47): Continuous action space

In this article, we expand the range of tasks of our agent. The training process will include some aspects of money and risk management, which are an integral part of any trading strategy.

Tanaka Black
Tanaka Black 2023.06.29
hie Dimitriy, i have a job for you please check your message inbox
Dmitriy Gizlyk
Published article Neural networks made easy (Part 46): Goal-conditioned reinforcement learning (GCRL)
Neural networks made easy (Part 46): Goal-conditioned reinforcement learning (GCRL)

In this article, we will have a look at yet another reinforcement learning approach. It is called goal-conditioned reinforcement learning (GCRL). In this approach, an agent is trained to achieve different goals in specific scenarios.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 45): Training state exploration skills
Neural networks made easy (Part 45): Training state exploration skills

Training useful skills without an explicit reward function is one of the main challenges in hierarchical reinforcement learning. We have already looked at two algorithms for solving this problem, but the question of how completely the environment is explored remains open. This article demonstrates a different approach to skill training, in which the skill used depends directly on the current state of the system.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 44): Learning skills with dynamics in mind
Neural networks made easy (Part 44): Learning skills with dynamics in mind

In the previous article, we introduced the DIAYN method, which offers an algorithm for learning a variety of skills. The acquired skills can be used for various tasks, but they can also be quite unpredictable, which makes them difficult to use. In this article, we will look at an algorithm for learning predictable skills.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 43): Mastering skills without the reward function
Neural networks made easy (Part 43): Mastering skills without the reward function

One of the difficulties of reinforcement learning is the need to define a reward function, which can be complex or difficult to formalize. To address this problem, activity-based and environment-based approaches are being explored to learn skills without an explicit reward function.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 42): Model procrastination, reasons and solutions
Neural networks made easy (Part 42): Model procrastination, reasons and solutions

In the context of reinforcement learning, model procrastination can have several causes. The article considers some of them, along with methods for overcoming them.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 41): Hierarchical models
Neural networks made easy (Part 41): Hierarchical models

The article describes hierarchical training models that offer an effective approach to solving complex machine learning problems. Hierarchical models consist of several levels, each of which is responsible for different aspects of the task.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 40): Using Go-Explore on large amounts of data
Neural networks made easy (Part 40): Using Go-Explore on large amounts of data

This article discusses the use of the Go-Explore algorithm over a long training period, since the random action selection strategy may not lead to a profitable pass as training time increases.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 39): Go-Explore, a different approach to exploration
Neural networks made easy (Part 39): Go-Explore, a different approach to exploration

We continue studying environment exploration in reinforcement learning models. In this article, we will look at another algorithm, Go-Explore, which allows you to explore the environment effectively at the model training stage.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 38): Self-Supervised Exploration via Disagreement
Neural networks made easy (Part 38): Self-Supervised Exploration via Disagreement

One of the key problems in reinforcement learning is environment exploration. We have already seen an exploration method based on Intrinsic Curiosity. Today I propose to look at another algorithm: Exploration via Disagreement.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 37): Sparse Attention
Neural networks made easy (Part 37): Sparse Attention

In the previous article, we discussed relational models which use attention mechanisms in their architecture. One of the specific features of these models is the intensive utilization of computing resources. In this article, we will consider one of the mechanisms for reducing the number of computational operations inside the Self-Attention block. This will increase the general performance of the model.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 36): Relational Reinforcement Learning
Neural networks made easy (Part 36): Relational Reinforcement Learning

In the reinforcement learning models we discussed in previous articles, we used various variants of convolutional networks that are able to identify various objects in the original data. The main advantage of convolutional networks is the ability to identify objects regardless of their location. At the same time, convolutional networks do not always perform well when objects are deformed or the data is noisy. These are the issues that the relational model can solve.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 35): Intrinsic Curiosity Module
Neural networks made easy (Part 35): Intrinsic Curiosity Module

We continue to study reinforcement learning algorithms. All the algorithms we have considered so far required the creation of a reward policy to enable the agent to evaluate each of its actions at each transition from one system state to another. However, this approach is rather artificial. In practice, there is some time lag between an action and a reward. In this article, we will get acquainted with a model training algorithm which can work with various time delays from the action to the reward.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 34): Fully Parameterized Quantile Function
Neural networks made easy (Part 34): Fully Parameterized Quantile Function

We continue studying distributional Q-learning algorithms. In previous articles, we considered distributional and quantile Q-learning algorithms. In the first algorithm, we trained the probabilities of given ranges of values. In the second algorithm, we trained ranges with a given probability. In both of them, we used a priori knowledge of one distribution and trained the other. In this article, we will consider an algorithm which allows the model to learn both distributions.