Discussion of the article "Random Decision Forest in reinforcement learning" - page 6

 
FxTrader562:

Initially I tried to increase the number of trees to 500 and 1000. But I noticed that adding more trees did not improve the results. One odd thing I see is that above 500 trees the optimisation constantly crashes and does not create the Mtrees text files.

Also, I tested increasing the number from 50 to 100 and noticed that the best results come within 20 to 25 iterations; anything beyond that makes no sense.


I am just curious what would happen to the results if we feed 20 to 30 indicator values as inputs to the agent and let the agent train automatically.

There are experimental results on the net: 100 trees gives the best recognition, 20-50 if prediction is needed. I tried 100 and the predictive power got worse.

I tried 15-19 indicators as inputs, expecting that when the market conditions change, the forest would pick the best ones during training. Already at 10 and above the results stop improving. Note that when the forest is built, only half of the inputs are used for each tree (in this implementation of the forest). Theoretically, for classification tasks (as they claim), the square root of the number of inputs is better than half, but I have not tried that myself.
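The two feature-subsampling heuristics mentioned here can be compared with a small sketch (the function names are illustrative, not taken from the EA's source):

```cpp
#include <cmath>

// Candidate features drawn per tree under the two heuristics discussed
// above: half of the inputs (this forest implementation) versus the
// square root of the inputs (the textbook rule for classification).
int featuresHalf(int nInputs) { return nInputs / 2; }

int featuresSqrt(int nInputs) {
    return (int)std::lround(std::sqrt((double)nInputs));
}
```

For 16 inputs the half rule gives 8 candidate features per split and the square-root rule gives 4, so the square-root rule decorrelates the trees more aggressively.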

 
mov:

There are experimental results on the net: 100 trees gives the best recognition, 20-50 if prediction is needed. I tried 100 and the predictive power got worse.

I tried 15-19 indicators as inputs, expecting that when the market conditions change, the forest would pick the best ones during training. Already at 10 and above the results stop improving. Note that when the forest is built, only half of the inputs are used for each tree. Theoretically, for classification tasks (as they claim), the square root of the number of inputs is better than half, but I have not tried that myself.

Thank you for your reply.

However, I want to know what happens if we feed all the indicators (15 to 20) at once, NOT just a few of them, to the agent. Then we train the agent on the past 1 year of data so that it can develop the best policy using all the indicators. I mean that we should determine the agent's current state at every candle close using more indicator values.

Till now, what I have noticed is that a single loss wipes out a series of small profits due to the lack of proper exit conditions. So both the entry and exit conditions need to be fine-tuned.

Can you please provide one example of indicator code without fuzzy logic, and show where to put the indicator in the current implementation of the code?

I tried to add the indicators inside the OnTick() function, but it did not help much. I am looking for a complete sample of the current version of the EA without fuzzy logic.

 
FxTrader562:

Can you please provide one example of indicator code without fuzzy logic, and show where to put the indicator in the current implementation of the code?

I can't right now; I'll try this evening.

 
mov:

I can't right now; I'll try this evening.

OK, thank you. I will wait.

Basically, I just want to know how to feed other indicators like MACD, SAR, MA, etc. into the policy matrix, so that the policy is updated and the reward is updated on every profit and loss. It should be without fuzzy logic.

 

FxTrader562:

Basically, I just want to know how to feed other indicators like MACD, SAR, MA, etc. into the policy matrix, so that the policy is updated and the reward is updated on every profit and loss. It should be without fuzzy logic.

I looked through my code; it is a terrible jumble of different algorithms under test. For simplicity, I added the parts needed for working without fuzzy logic into the article's source code. I hope the author will not be offended. I checked it: it seems to work, and nothing essential was forgotten. The number of indicators is set by nIndicat.
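As a rough illustration of the idea (not mov's actual code; every name here is hypothetical), collecting several indicator readings into one state row at candle close could look like this:

```cpp
#include <vector>

// Hypothetical sketch of the state/feature row built at each candle
// close: one value per indicator, plus a reward slot that is filled
// in once the trade outcome is known. Names are illustrative only.
struct PolicyRow {
    std::vector<double> features; // one value per indicator (MACD, SAR, MA, ...)
    double reward;                // updated on every profit or loss
};

PolicyRow makeRow(const std::vector<double>& indicatorValues) {
    PolicyRow row;
    row.features = indicatorValues; // e.g. nIndicat values read at candle close
    row.reward = 0.5;               // neutral until the trade closes
    return row;
}
```

The point is simply that all indicator values go into a single row of the policy matrix, and only the reward slot is touched when the position is closed.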

 
mov:

I looked through my code; it is a terrible jumble of different algorithms under test. For simplicity, I added the parts needed for working without fuzzy logic into the article's source code. I hope the author will not be offended. I checked it: it seems to work. The number of indicators is set by nIndicat.

Thank you for the code. I will look into it.

By the way, one more thing: if you have tried to automate the optimisation process for iterative learning, kindly let me know. I mean, if you have any solution to run the optimiser automatically, so that the EA calls the optimiser after every loss, please let me know.

The author has told me that he will add the auto-optimisation feature in future articles. But if someone else already has the code, that would be great. Since the EA automatically maintains the optimal policy in the text files, it is only necessary to run the optimiser automatically at regular intervals, which I think is easy to implement. But I don't know how to do it.

 
FxTrader562:

If you have tried to automate the optimisation process for iterative learning, then kindly let me know.

I tried, but my version's efficiency is much lower. So you had better wait for the author's new article.

 
mov:

I tried, but my version's efficiency is much lower. So you had better wait for the author's new article.

Anyway, thank you. I am trying as well, while waiting for the update from the author.

The code which you provided seems to work fine. I will try with various combinations and may update you again.

Many thanks again.

 
Hello people,
First of all, I would like to congratulate Maxim Dmitrievsky on his article.
Secondly, I want to say that I am keeping an eye on this topic, because the subject is very interesting.
Thirdly, I would like to clear up a doubt, because I am not able to understand how the reward is currently applied in the classification EA. Could anyone describe it?

What I understood is that when the EA closes a position with a negative value, it changes two indexes of the vector (3 and 4).
Can anyone confirm this? How do I know this reward is good? I would like to increase the reward when the trade is positive and gains a certain number of points.
//+------------------------------------------------------------------+
//| Reward update: after a losing trade, overwrite the class         |
//| probabilities stored at indices 3 and 4 of the last sample with  |
//| random complementary values, steering the forest away from the   |
//| losing decision on the next retraining.                          |
//+------------------------------------------------------------------+
void updateReward()
  {
   if(MQLInfoInteger(MQL_OPTIMIZATION)==true) // only during optimisation
     {
      int unierr;
      if(getLAstProfit()<0) // last closed trade was a loss
        {
         // draw a uniform random probability in [0,1]
         double likelyhood=MathRandomUniform(0,1,unierr);
         RDFpolicyMatrix[numberOfsamples-1].Set(3,likelyhood);   // HERE
         RDFpolicyMatrix[numberOfsamples-1].Set(4,1-likelyhood); // AND HERE
        }
     }
  }
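One way to realize the point-scaled reward asked about above is to replace the uniform random draw with a value derived from the realized profit. This is only a hedged sketch under the assumption that profit is measured in points; `shapedReward` and `maxPoints` are hypothetical names, not part of the EA:

```cpp
#include <algorithm>

// Hypothetical reward shaping: map the trade's profit in points to a
// value in [0,1], so bigger wins push the stored class probability
// further toward 1 and losses push it toward 0. maxPoints caps the
// scaling and must be positive.
double shapedReward(double profitPoints, double maxPoints) {
    double r = 0.5 + 0.5 * (profitPoints / maxPoints);
    return std::clamp(r, 0.0, 1.0); // requires C++17
}
```

The pair written into indices 3 and 4 would then be `shapedReward(...)` and `1 - shapedReward(...)` instead of the random `likelyhood`; a break-even trade maps to the neutral value 0.5.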


Thank you.
P.S. I used Google Translate; sorry if it is hard to understand.
 
rogivilela:
What I understood is that when the EA closes a position with a negative value, it changes two indexes of the vector (3 and 4).
Can anyone confirm this? How do I know this reward is good? I would like to increase the reward when the trade is positive and gains a certain number of points.

You are taking that code straight from the article. Look at the posts above: the best reward function is constantly being discussed there, and proposals for a more effective reward have already been made.
