Dmitriy Gizlyk
Rating: 4.4 (49)
10+ years experience · 0 products · 0 demo versions · 134 jobs · 0 signals · 0 subscribers
Professional development of programs of any complexity for MT4, MT5, and C#.
Dmitriy Gizlyk
Published article Neural networks made easy (Part 71): Predicting future states taking goals into account (GCPC)

In previous works, we got acquainted with the Decision Transformer method and several algorithms derived from it. We experimented with various goal-setting techniques, but the model's study of the already traversed trajectory always remained outside our attention. In this article, I want to introduce you to a method that fills this gap.
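To give a flavor of the idea, here is a minimal Python sketch of pretraining a trajectory encoder to predict masked future states given a goal; `encoder` and `decoder` are hypothetical modules used only for illustration, not the article's actual implementation.

```python
import torch
import torch.nn.functional as F

def masked_future_loss(encoder, decoder, traj, goal, mask_ratio=0.5):
    # traj: (batch, steps, state_dim); hide the tail of each trajectory.
    steps = traj.size(1)
    split = int(steps * (1 - mask_ratio))
    visible, hidden_target = traj[:, :split], traj[:, split:]
    latent = encoder(visible, goal)                       # compact trajectory representation
    prediction = decoder(latent, hidden_target.size(1))   # predict the masked future states
    return F.mse_loss(prediction, hidden_target)
```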

Dmitriy Gizlyk
Published article Neural networks made easy (Part 70): Policy improvement using closed-form operators (CFPI)

In this article, we will get acquainted with an algorithm that uses closed-form policy improvement operators to optimize the Agent's actions in offline mode.
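As a rough illustration of what "closed form" means here, a first-order variant shifts the behavior action along the critic's gradient in a single analytic step instead of running an inner optimization loop. A minimal sketch, assuming hypothetical torch modules `q_net(state, action)` and a behavior-policy mean `mu(state)`:

```python
import torch

def cfpi_action(q_net, mu, state, step_size=0.1):
    """One analytic policy-improvement step around the behavior action."""
    a = mu(state).detach().requires_grad_(True)   # behavior action, shape (batch, act_dim)
    q = q_net(state, a).sum()                     # scalar so we can take d q / d a
    grad_a, = torch.autograd.grad(q, a)
    # Closed-form improvement: move the action along the critic's gradient.
    return (a + step_size * grad_a).detach()
```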

Dmitriy Gizlyk
Published article Neural networks made easy (Part 69): Density-based support constraint for the behavioral policy (SPOT)

In offline learning, we use a fixed dataset, which limits the coverage of environmental diversity. During the learning process, our Agent can generate actions beyond this dataset. If there is no feedback from the environment, how can we be sure that the assessments of such actions are correct? Maintaining the Agent's policy within the training dataset becomes an important aspect to ensure the reliability of training. This is what we will talk about in this article.
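The gist of a density-based support constraint can be sketched as follows: the actor is pushed towards high Q-values but penalized when the estimated behavior density of its action drops too low. `q_net`, `actor` and `vae_logp` (a VAE-based lower bound on the behavior log-density) are hypothetical placeholders, not the article's exact code.

```python
import torch

def support_constrained_actor_loss(q_net, actor, vae_logp, state, lam=1.0, eps=-5.0):
    action = actor(state)
    q_value = q_net(state, action)
    log_density = vae_logp(state, action)        # estimated log p_beta(a|s)
    # Penalize only actions whose estimated density falls below the threshold eps.
    support_penalty = torch.clamp(eps - log_density, min=0.0)
    return (-q_value + lam * support_penalty).mean()
```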

JimReaper 2023.12.22
Hi Dmitriy, seems like the article is incomplete.
Dmitriy Gizlyk
Published article Neural networks made easy (Part 68): Offline Preference-guided Policy Optimization

Since the first articles devoted to reinforcement learning, we have in one way or another touched upon two problems: exploring the environment and determining the reward function. Recent articles have been devoted to the problem of exploration in offline learning. In this article, I would like to introduce you to an algorithm whose authors completely eliminated the reward function.
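One common way to learn without a hand-crafted reward is to score trajectories from pairwise preferences (a Bradley-Terry style objective). A minimal sketch, where `score_net` is a hypothetical model mapping a trajectory tensor to a scalar preference score:

```python
import torch.nn.functional as F

def preference_loss(score_net, traj_preferred, traj_other):
    s_pos = score_net(traj_preferred)   # higher score should mean "preferred"
    s_neg = score_net(traj_other)
    # Maximize the probability that the preferred trajectory wins.
    return -F.logsigmoid(s_pos - s_neg).mean()
```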

Dmitriy Gizlyk
Published article Neural networks made easy (Part 67): Using past experience to solve new tasks

In this article, we continue discussing methods for collecting data into a training set. Obviously, the learning process requires constant interaction with the environment. However, situations differ, and direct interaction is not always available, so we look at how previously collected experience can be reused for new tasks.

JimReaper 2023.12.09
THIS IS GENIUS WORK Dmitriy! I Love this!
Dmitriy Gizlyk
Published article Neural networks made easy (Part 66): Exploration problems in offline learning

Models are trained offline using data from a prepared training dataset. While this provides certain advantages, it also means that information about the environment is greatly compressed to the size of the training dataset, which in turn limits the possibilities of exploration. In this article, we will consider a method that enables filling a training dataset with the most diverse data possible.

JimReaper 2023.12.05
You are the best! Thank you so much for your research. Beautifully done!
Dmitriy Gizlyk
Published article Neural networks made easy (Part 65): Distance Weighted Supervised Learning (DWSL)

In this article, we will get acquainted with an interesting algorithm that is built at the intersection of supervised and reinforcement learning methods.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 64): ConserWeightive Behavioral Cloning (CWBC) method

As a result of tests performed in previous articles, we came to the conclusion that the optimality of the trained strategy largely depends on the training set used. In this article, we will get acquainted with a fairly simple yet effective method for selecting trajectories to train models.
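As a rough illustration of return-weighted trajectory selection for behavioral cloning (the exact weighting scheme in the article may differ), trajectories with higher returns receive larger sampling weights, so the cloned policy leans towards the better parts of the dataset:

```python
import numpy as np

def trajectory_weights(returns, temperature=1.0):
    returns = np.asarray(returns, dtype=np.float64)
    z = (returns - returns.mean()) / (returns.std() + 1e-8)
    w = np.exp(z / temperature)          # exponential preference for high returns
    return w / w.sum()                   # normalized sampling probabilities

# Usage: sample trajectory indices for the next behavioral-cloning batch.
probs = trajectory_weights([120.0, -30.0, 45.0, 210.0])
batch_idx = np.random.choice(len(probs), size=2, p=probs)
```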

Dmitriy Gizlyk
Published article Neural networks made easy (Part 63): Unsupervised Pretraining for Decision Transformer (PDT)

We continue to discuss the family of Decision Transformer methods. From the previous article, we have already seen that training the transformer underlying these methods is a rather complex task that requires a large labeled dataset. In this article, we will look at an algorithm for using unlabeled trajectories for preliminary model training.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 62): Using Decision Transformer in hierarchical models

In recent articles, we have seen several options for using the Decision Transformer method. The method allows analyzing not only the current state, but also the trajectory of previous states and actions performed in them. In this article, we will focus on using this method in hierarchical models.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 61): Optimism issue in offline reinforcement learning

During offline learning, we optimize the Agent's policy based on the training sample data. The resulting strategy gives the Agent confidence in its actions. However, such optimism is not always justified and can cause increased risks during model operation. Today, we will look at one of the methods for reducing these risks.
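One widely used way to curb over-optimistic value estimates offline is to train an ensemble of critics and back up the most pessimistic one; this is only a generic illustration of the idea, not necessarily the article's concrete method. `critics` is a hypothetical list of torch Q-networks:

```python
import torch

def pessimistic_target(critics, next_state, next_action, reward, gamma=0.99):
    q_values = torch.stack([c(next_state, next_action) for c in critics], dim=0)
    q_min = q_values.min(dim=0).values        # take the most conservative estimate
    return reward + gamma * q_min
```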

Dmitriy Gizlyk
Published article Neural networks made easy (Part 60): Online Decision Transformer (ODT)

The last two articles were devoted to the Decision Transformer method, which models action sequences in the context of an autoregressive model of desired rewards. In this article, we will look at another optimization algorithm for this method.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 59): Dichotomy of Control (DoC)

In the previous article, we got acquainted with the Decision Transformer. But the complex stochastic environment of the foreign exchange market did not allow us to fully realize the potential of the presented method. In this article, I will introduce an algorithm aimed at improving the performance of algorithms in stochastic environments.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 58): Decision Transformer (DT)

We continue to explore reinforcement learning methods. In this article, I will focus on a slightly different algorithm that considers the Agent’s policy in the paradigm of constructing a sequence of actions.
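To show what "policy as a sequence of actions" looks like in practice, here is a hedged sketch of how a Decision Transformer arranges its input: interleaved (return-to-go, state, action) tokens, with the action tokens as prediction targets. Shapes and names are illustrative only:

```python
import numpy as np

def build_dt_sequence(states, actions, rewards):
    # Return-to-go: how much reward is still ahead at each step.
    rtg = np.cumsum(rewards[::-1])[::-1]
    sequence = []
    for g, s, a in zip(rtg, states, actions):
        sequence.extend([("rtg", g), ("state", s), ("action", a)])
    return sequence   # fed to an autoregressive transformer; action tokens are the targets
```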

Yao Wei Lai 2023.10.11
I greatly admire your article series "Neural Networks Make It Easy", but after reading it for a long time, I still don't understand how to generate models. Could you please send me the models used in each article? I would like to replicate your test to further learn relevant knowledge. Thank you!
Dmitriy Gizlyk
Published article Neural networks made easy (Part 57): Stochastic Marginal Actor-Critic (SMAC)

Here I will consider the fairly new Stochastic Marginal Actor-Critic (SMAC) algorithm, which allows building latent variable policies within the framework of entropy maximization.

Dmitriy Gizlyk
Published article Neural networks made easy (Part 56): Using nuclear norm to drive exploration

Exploration of the environment in reinforcement learning is a pressing problem. We have already looked at some approaches to it previously. In this article, we will have a look at yet another method based on maximizing the nuclear norm. It allows agents to identify environmental states with a high degree of novelty and diversity.
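A minimal sketch of the underlying intrinsic reward, purely for illustration: stack the embeddings of recently visited states into a matrix and use its nuclear norm (the sum of singular values) as an exploration bonus, which grows when the visited states are both novel and diverse:

```python
import numpy as np

def nuclear_norm_bonus(state_embeddings, scale=0.1):
    m = np.asarray(state_embeddings)              # shape (n_states, embed_dim)
    singular_values = np.linalg.svd(m, compute_uv=False)
    return scale * singular_values.sum()          # the nuclear norm of the matrix
```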

Dmitriy Gizlyk
Published article Neural networks made easy (Part 55): Contrastive intrinsic control (CIC)

Contrastive learning is an unsupervised method of representation learning. Its goal is to train a model to highlight similarities and differences in data. In this article, we will talk about using contrastive learning approaches to explore different Actor skills.
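The contrastive objective behind such skill discovery is typically InfoNCE-style: embeddings of a state transition and of the skill that produced it are pulled together, while mismatched pairs are pushed apart. A hedged sketch in which the encoders producing the embeddings and the temperature `tau` are assumptions:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(transition_emb, skill_emb, tau=0.5):
    z1 = F.normalize(transition_emb, dim=-1)   # (batch, dim)
    z2 = F.normalize(skill_emb, dim=-1)        # (batch, dim)
    logits = z1 @ z2.t() / tau                 # similarity of every pair in the batch
    labels = torch.arange(z1.size(0))          # the matching pair sits on the diagonal
    return F.cross_entropy(logits, labels)
```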

Dmitriy Gizlyk
Published article Neural networks made easy (Part 54): Using random encoder for efficient exploration (RE3)

Whenever we consider reinforcement learning methods, we are faced with the issue of efficiently exploring the environment. Solving this issue often complicates the algorithm and requires training additional models. In this article, we will look at an alternative approach to solving this problem.
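The key trick can be sketched in a few lines: embed states with a fixed, randomly initialized encoder that is never trained, and reward the agent by the distance to the k-th nearest neighbour in that embedding space, a cheap entropy proxy that needs no extra trained models. Names here are illustrative:

```python
import torch

def random_encoder_bonus(random_encoder, state, memory_embeddings, k=3):
    with torch.no_grad():
        z = random_encoder(state)                           # embedding of the current state
        dists = torch.norm(memory_embeddings - z, dim=-1)   # distances to stored states
        knn_dist = dists.topk(k, largest=False).values[-1]  # k-th nearest neighbour
    return torch.log(knn_dist + 1.0)                        # intrinsic reward
```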

Dmitriy Gizlyk
Published article Neural networks made easy (Part 53): Reward decomposition

We have already talked more than once about the importance of correctly selecting the reward function, which we use to stimulate the desired behavior of the Agent by adding rewards or penalties for individual actions. But the question of how the Agent deciphers our signals remains open. In this article, we will talk about reward decomposition in terms of transmitting individual signals to the trained Agent.
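A hedged sketch of the idea: instead of a single scalar, the environment returns a vector of reward components that the critic learns to predict separately, and the scalar used for the TD target is their weighted sum. The component names below are illustrative, not those used in the article:

```python
import numpy as np

REWARD_COMPONENTS = ("profit", "drawdown_penalty", "holding_cost")
WEIGHTS = np.array([1.0, 0.5, 0.1])

def scalarize(decomposed_reward):
    """Combine per-component rewards into the scalar used for the TD target."""
    r = np.array([decomposed_reward[name] for name in REWARD_COMPONENTS])
    return float(WEIGHTS @ r)

# The critic predicts each component, so we can see which signal drives its value estimate.
print(scalarize({"profit": 2.3, "drawdown_penalty": -0.4, "holding_cost": -0.1}))
```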

Dmitriy Gizlyk
Published article Neural networks made easy (Part 52): Exploration with optimism and distribution correction

As the model is trained on the experience replay buffer, the current Actor policy moves further and further away from the stored examples, which reduces the efficiency of training the model as a whole. In this article, we will look at an algorithm for improving the efficiency of sample use in reinforcement learning.
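A simplified illustration of distribution correction with importance weights: each sample from the replay buffer is re-weighted by the ratio of the current and behavior action probabilities (clipped for stability), so stale samples contribute proportionally to how likely they still are under the current policy. All names below are assumptions for the sketch:

```python
import torch

def corrected_td_loss(q_net, target, state, action,
                      logp_current, logp_behavior, clip=10.0):
    with torch.no_grad():
        # How much more (or less) likely the action is under the current policy.
        weight = torch.exp(logp_current - logp_behavior).clamp(max=clip)
    td_error = q_net(state, action) - target
    return (weight * td_error.pow(2)).mean()
```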