Discussion of the article "Random Decision Forest in Reinforcement Learning" - page 7

 
FxTrader562:

Can you please provide an example of an indicator without fuzzy logic, and show where to put the indicator in the current implementation of the code?

I can't right now, I'll try this evening.

 
mov:

I can't right now, I'll try this evening.

OK, thank you. I will wait.

Basically, I just want to know how to feed other indicators like MACD, SAR, MA, etc. into the policy matrix, updating the policy and the reward on every profit and loss. It should be without fuzzy logic.

 

FxTrader562:

Basically, I just want to know how to feed other indicators like MACD, SAR, MA, etc. into the policy matrix, updating the policy and the reward on every profit and loss. It should be without fuzzy logic.

I looked at my code, and it is a terrible jumble of different algorithms under test. For simplicity, I added the parts needed to work without fuzzy logic directly into the article's source code; I hope the author won't mind. I checked it, and it seems to work, and I don't think I left out anything important. The number of indicators is set by nIndicat.
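To illustrate the general idea (this is just a sketch, not the attached file; the indicator handles, the nIndicat layout, and the FillFeatures name are assumptions), raw indicator values can be fed in as features like this:

// Sketch: plain indicator values (no fuzzy logic) as features for the
// policy matrix. Assumes the article's layout: columns 0..nIndicat-1
// hold the features, the last two columns hold the buy/sell labels.
int hMACD,hSAR,hMA;

int OnInit()
  {
   hMACD=iMACD(_Symbol,_Period,12,26,9,PRICE_CLOSE);
   hSAR =iSAR(_Symbol,_Period,0.02,0.2);
   hMA  =iMA(_Symbol,_Period,50,0,MODE_SMA,PRICE_CLOSE);
   if(hMACD==INVALID_HANDLE || hSAR==INVALID_HANDLE || hMA==INVALID_HANDLE)
      return(INIT_FAILED);
   return(INIT_SUCCEEDED);
  }

// Copy the last closed bar's values into one feature row
// (the caller must size feat[] to nIndicat beforehand)
void FillFeatures(double &feat[])
  {
   double macd[1],sar[1],ma[1];
   CopyBuffer(hMACD,MAIN_LINE,1,1,macd); // MACD main line
   CopyBuffer(hSAR,0,1,1,sar);
   CopyBuffer(hMA,0,1,1,ma);
   feat[0]=macd[0];
   feat[1]=sar[0];
   feat[2]=ma[0];
  }

At the moment a new sample is recorded, each feat[i] would go into the corresponding feature column of RDFpolicyMatrix, in the same place where the article's code stores the fuzzy outputs.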

 
mov:

I looked at my code, and it is a terrible jumble of different algorithms under test. For simplicity, I added the parts needed to work without fuzzy logic directly into the article's source code; I hope the author won't mind. I checked it, and it seems to work, and I don't think I left out anything important. The number of indicators is set by nIndicat.

Thank you for the code. I will look into it.

By the way, one more thing: if you have tried to automate the optimisation process for iterative learning, kindly let me know. I mean, if you have any solution for running the optimiser automatically, so that the EA calls it on every loss, please share it.

The author has told me that he will add an auto-optimisation feature in future articles, but if someone else already has the code, that would be great. Since the EA automatically maintains the optimal policy in the text files, it is only necessary to run the optimiser automatically at regular intervals, which I think is easy to implement. But I don't know how to do it.

 
FxTrader562:

If you have tried to automate the optimisation process for iterative learning, then kindly let me know.

I tried, but my version is much less efficient. So, like you, I am waiting for the author's new article.
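For reference, one possible mechanism (a rough sketch only, untested here; the file paths, the optimize.ini contents, and the whole approach are assumptions, and "Allow DLL imports" must be enabled) is to start a second terminal instance in optimizer mode:

// Sketch: launch a second terminal instance that runs the optimizer,
// using a prepared start-up configuration file. Paths are examples only.
#import "shell32.dll"
int ShellExecuteW(int hwnd,string operation,string file,string parameters,
                  string directory,int showCmd);
#import

void LaunchOptimizer()
  {
   // optimize.ini would contain a [Tester] section (Expert, Symbol,
   // Period, Optimization=1, FromDate/ToDate) per the terminal help
   string terminal="C:\\Program Files\\MetaTrader 5\\terminal64.exe";
   string config  ="/config:C:\\configs\\optimize.ini";
   ShellExecuteW(0,"open",terminal,config,"",1); // 1 = SW_SHOWNORMAL
  }

Calling LaunchOptimizer() after a losing trade, or on a timer, would re-run the optimiser against the policy files the EA keeps on disk.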

 
mov:

I tried, but my version is much less efficient. So, like you, I am waiting for the author's new article.

Anyway, thank you. I am trying myself as well, while waiting for the update from the author.

The code which you provided seems to work fine. I will try with various combinations and may update you again.

Many thanks again.

 
Hello people,
First of all, I would like to congratulate Maxim Dmitrievsky on his article.
Secondly, I want to say that I am keeping an eye on this topic, because the subject is very interesting.
Thirdly, I have a question, since I am not able to understand how the reward is currently applied in the classification EA. Could anyone describe it?

What I understood is that when the EA closes a position with a negative result, it changes two indexes of the vector (3 and 4).
Can anyone confirm this? And how do I know this reward is good? I would like to increase the reward when the trade is positive and gains a certain number of points.
//+------------------------------------------------------------------+
//|                                                                  |
//+------------------------------------------------------------------+
void updateReward()
  {
   if(MQLInfoInteger(MQL_OPTIMIZATION)==true) // adjust labels only during optimization
     {
      int unierr;
      if(getLAstProfit()<0) // the last closed trade was a loss
        {
         // relabel the last recorded sample with a random guess, so the
         // forest re-learns this state instead of repeating the losing action
         double likelyhood=MathRandomUniform(0,1,unierr);
         RDFpolicyMatrix[numberOfsamples-1].Set(3,likelyhood);   // HERE
         RDFpolicyMatrix[numberOfsamples-1].Set(4,1-likelyhood); // AND HERE
        }
     }
  }


Thank you.
PS: I used Google Translate, sorry if it is hard to understand.
 
rogivilela:
What I understood is that when the EA closes a position with a negative result, it changes two indexes of the vector (3 and 4).
Can anyone confirm this? And how do I know this reward is good? I would like to increase the reward when the trade is positive and gains a certain number of points.

You are quoting code straight from the article. Look at the posts above: the best reward is discussed there constantly, and there are proposals for a more effective one.

 
rogivilela:

 How do I know this reward is good?

If there is a loss, the algorithm should try either not trading at all or trading in the opposite direction. We do not know which is correct, so a random value is used. There is no other meaning in those lines.
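If you also want to reward profitable trades, as asked above, here is a minimal sketch. The column layout (3 = buy label, 4 = sell label) follows the article's code, while RewardStep and the sharpening rule are my assumptions, untested:

input double RewardStep=0.2; // hypothetical reinforcement step

void updateReward()
  {
   if(MQLInfoInteger(MQL_OPTIMIZATION)==true)
     {
      int unierr;
      double profit=getLAstProfit();
      if(profit<0)
        {
         // loss: relabel with a random guess, exactly as in the article
         double likelyhood=MathRandomUniform(0,1,unierr);
         RDFpolicyMatrix[numberOfsamples-1].Set(3,likelyhood);
         RDFpolicyMatrix[numberOfsamples-1].Set(4,1-likelyhood);
        }
      else if(profit>0)
        {
         // win: push the stored label toward certainty so the forest
         // learns this state/action pair more strongly
         double p=RDFpolicyMatrix[numberOfsamples-1][3];
         double q=(p>=0.5) ? MathMin(1.0,p+RewardStep)
                           : MathMax(0.0,p-RewardStep);
         RDFpolicyMatrix[numberOfsamples-1].Set(3,q);
         RDFpolicyMatrix[numberOfsamples-1].Set(4,1.0-q);
        }
     }
  }

RewardStep could also be scaled by the number of points the trade gained, which is what rogivilela was asking about.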

 
mov:

If there is a loss, the algorithm should try either not trading at all or trading in the opposite direction. We do not know which is correct, so a random value is used. There is no other meaning in those lines.

The article itself and the algorithm it presents are introductory in nature. To get results, and not only in the tester, you need to prepare the input data. Lately I have been watching a lot of YouTube videos on this topic; here is a very instructive example, and the channel as a whole is worth watching.


To start with, I am thinking of training by hour of day, that is, 24 separately prepared networks, since volatility differs at different times of day. Then we will see.
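A minimal sketch of that hour-bucketing idea (assuming the alglib CDecisionForest used in the article; the array and function names are illustrative):

// Sketch: one forest per hour of day, picked by the current server time
#include <Math\Alglib\dataanalysis.mqh>

CDecisionForest hourlyRDF[24]; // 24 separately trained models

int CurrentModelIndex()
  {
   MqlDateTime dt;
   TimeToStruct(TimeCurrent(),dt);
   return(dt.hour); // 0..23, so each hour gets its own model
  }

Training and inference would then index hourlyRDF[CurrentModelIndex()] instead of using a single forest.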