Machine learning in trading: theory, models, practice and algo-trading - page 1269

 

Fresh from the creators of AlphaGo Zero, enjoy watching :)


 
Maxim Dmitrievsky:

Fresh from the creators of AlphaGo Zero, enjoy watching :)


Are there any detailed instructions on how to create/train/connect models for StarCraft?

 
Maxim Dmitrievsky:

I guess I don't play, I just watch the matches.

Judging by the replays, an "AlphaTrader", if such a thing were ever made, would trade better than any bag of dice.

I think building such a bot would let you pick up new ML skills, and it's just interesting. I've played StarCraft 2 myself a couple of times since the new chapters came out (there's a storyline broken into several parts). When playing against the built-in AI, it often wins not through the logic of its actions but through unit control: a human physically can't control the entire map and every unit at once.

 
Vladimir Perervenko:

I don't monitor my own, and I don't know anyone else's. The article quoted above doesn't contain enough information to reproduce it, and the code is too complicated. I think everything can be implemented with standard package layers, without using R6.

Good luck

I don't understand what good luck you wished.

Please at least make a demo.

If the results of an Expert Advisor with ML elements are acceptable, then I will reread the whole thread from beginning to end.

 
Maxim Dmitrievsky:

It's not like that here; it's deliberately close to what a human sees and does: a limited field of vision, and an average APM lower than a pro player's. In other words, it's a battle of intelligences, i.e. of strategies, rather than of speed (where the machine would naturally always win).

And the built-in AI in SC is just a set of scripted, uninteresting opponents. This one, on the contrary, plays like a human. I wouldn't be able to tell a pro player apart from this AI, i.e. the Turing test is passed )))

They even visualized the cloud of active neurons in the electronic brain.


You can't draw unambiguous conclusions from the screen alone. Yes, controlling only what fits on the screen is no problem: you assign hotkeys to a unit/structure or a group of units, so you don't have to watch them visually at that moment, and you can orient yourself by the minimap, which is also always visible on screen. The trick is in micro-control. I personally watched a couple of the videos and didn't see anything particularly clever in terms of strategy, but the potential of individual units was exploited to the full.

That is, the emphasis there is on assessing a potential threat and the options for countering it: depending on the enemy's development branch, it develops its own, plus there's a separate module for the economy. I saw various methods, and unit control too, and I suspect the first two modules (development branch and economy) are hard-coded or use fuzzy logic, something relatively rigid for stability, while unit control is situational, and that's exactly where the AI does its real work. By the way, it's unclear how information about objects is passed in, or how to generalize it for decision-making while everything is constantly moving; the features and targets are unclear.

 
By the way, I sometimes play Warcraft III on the Blizzard network, and there I'm often accused of being an AI. I wonder if it could be used there too, and if it is, how many times I've already played against such an AI.
 
Maxim Dmitrievsky:

By the way, originally the whole map fit on the bot's screen; then they made it see like a player, and the bot started to stumble and the human won (at the end of the video). Or maybe it was just filmed badly for that situation. On the other hand, how else do you evaluate effectiveness? If certain strategies lead to success, then the bot chose them.

I think that if you make the control constraints commensurate with a human's, the bots will beat the average user, because crowd behavior is similar and therefore the most frequent. By the way, when my son and I tried playing on the network as one clan against bots in Warcraft 3, at first it was just as difficult (and before that I had plenty of experience in clan games on Battle.net against people), but then you get used to the bot's behavior and gradually start outplaying it with non-standard solutions (for example, cutting a passage to the mine through the trees, thereby protecting units from ground attacks).

So I wonder what weight can be given to non-standard strategies so that ML takes them into account, i.e. you somehow need to separate standard behavior from non-standard and treat each differently, without letting them interfere with each other. It's like trend and flat: it's very hard to train one model on both at once, at least I don't know how.
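To make the trend/flat point concrete, here is a minimal sketch (my own illustration, not anything proposed in the thread) of splitting data by regime so that each regime could get its own model. The window size and slope threshold are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def label_regime(window, slope_threshold=0.05):
    """Crude regime label: 'trend' if the fitted slope is steep, else 'flat'."""
    x = np.arange(len(window))
    slope = np.polyfit(x, window, 1)[0]
    return "trend" if abs(slope) > slope_threshold else "flat"

# Synthetic series: a trending leg followed by a flat, noisy leg.
trend_leg = np.linspace(0, 10, 100) + rng.normal(0, 0.2, 100)
flat_leg = 10 + rng.normal(0, 0.2, 100)
series = np.concatenate([trend_leg, flat_leg])

# Route each window to its own bucket; each bucket would then train its own model.
buckets = {"trend": [], "flat": []}
for start in range(0, len(series) - 50 + 1, 50):
    window = series[start:start + 50]
    buckets[label_regime(window)].append(window)

print({k: len(v) for k, v in buckets.items()})
```

The gate here is deliberately crude; the point is only that two separate models never see each other's regime, which sidesteps training one model on both at once.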

 
Maxim Dmitrievsky:

I don't think there is any weight. If situations are rare, the bot will simply ignore such options. If a person adapts to the bot's strategies, then the bot needs to keep retraining against that person's strategies, otherwise the situation won't be equal.)

I don't know; then it turns out a lot depends on the sample. If the samples differ, the bots will behave differently in a fight against each other, i.e. there's not only training here but also a luck factor (who was trained on what).

I.e. it's not always possible to see (correctly assess) the result of training, since there's no validation sample for comparing results.

 
Maxim Dmitrievsky:

Yeah, that's roughly how learning happens there: through adversarial self-play, something like that. The AI plays against the AI thousands of times, reproducing many different strategies, and eventually the network works out the optimal ones. If the number of games exceeds what a professional player has played (they say it's equivalent to 200 years of play), the statistical advantage will be on the bot's side, since it has considered more combinations. But the chance of finding a unique winning strategy still, of course, remains with the human.
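As a toy illustration of self-play (my own sketch, not AlphaStar's actual method), here is regret matching in rock-paper-scissors: two agents repeatedly respond to each other's play, and their average strategies drift toward the equilibrium. All names and parameters below are my own assumptions:

```python
import numpy as np

# Rock-paper-scissors payoff for player A: rows = A's move, cols = B's move.
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]], dtype=float)

def strategy_from_regrets(regrets):
    """Regret matching: play in proportion to positive cumulative regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(3, 1.0 / 3.0)

def self_play(iterations=100_000):
    regrets_a = np.array([1.0, 0.0, 0.0])  # small asymmetric seed to break symmetry
    regrets_b = np.zeros(3)
    avg_a = np.zeros(3)
    avg_b = np.zeros(3)
    for _ in range(iterations):
        sa = strategy_from_regrets(regrets_a)
        sb = strategy_from_regrets(regrets_b)
        avg_a += sa
        avg_b += sb
        # Expected payoff of each pure move against the opponent's current mix.
        util_a = PAYOFF @ sb
        util_b = PAYOFF @ sa  # works for B too because PAYOFF is antisymmetric
        regrets_a += util_a - sa @ util_a
        regrets_b += util_b - sb @ util_b
    return avg_a / iterations, avg_b / iterations

strat_a, strat_b = self_play()
print(strat_a, strat_b)  # both drift toward the uniform equilibrium [1/3, 1/3, 1/3]
```

Each agent only ever sees its own payoffs against the other, yet the pair "works out" the unexploitable strategy, which is the core idea behind training through self-play.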

The topic is interesting, but shrouded in mystery :) Trading differs in that we can't influence the market with our behavior, and we have no way to correct our mistakes, except maybe with position averaging...

 
Maxim Dmitrievsky:

If you break the chart into thousands or millions of chunks and make the bot play against it that many times, maybe it will learn to beat it all the time; then again, it depends on the features.
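The chunking idea could be sketched like this (a hypothetical helper of my own; the window and stride sizes are arbitrary, and the price series here is a synthetic random walk):

```python
import numpy as np

def chunk_series(prices, window=500, stride=100):
    """Split a price series into overlapping episodes for repeated training runs."""
    return np.array([prices[i:i + window]
                     for i in range(0, len(prices) - window + 1, stride)])

# Synthetic stand-in for a real chart: a seeded random walk.
prices = np.cumsum(np.random.default_rng(0).normal(size=10_000))
episodes = chunk_series(prices)
print(episodes.shape)  # (96, 500): 96 overlapping 500-bar episodes
```

Each episode could then serve as one "game" for the bot to replay, which is how a single chart becomes thousands of training rounds.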

I see it slightly differently. In a game there is, conventionally, a mathematical evaluation of each side, made up of many factors: the number of units and their potential, property, money. The opponent's goal is to reduce your evaluation index while keeping his own above yours, i.e. to spend less resource for the result. You get a mutually influencing system: if it's clear that by sacrificing a unit you will reduce the opponent's estimated asset value by more than that unit's own estimated value, then it's the right decision, and if not, it's the wrong one. In trading we have no guarantees, only probabilities, while in the game there are mathematical guarantees that can be calculated.

We can't influence the situation, while in the game we can, including creating profitable situations ourselves.
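The evaluation-and-sacrifice logic described above could be sketched like so (the factor list follows the post; the weighting, field names, and numbers are my own assumptions, not taken from any game):

```python
from dataclasses import dataclass

@dataclass
class SideState:
    units: int
    unit_potential: float  # average combat potential per unit (hypothetical scale)
    property_value: float
    money: float

def evaluate(side: SideState) -> float:
    """Hypothetical side evaluation: a weighted sum of the factors in the post."""
    return side.units * side.unit_potential + side.property_value + side.money

def sacrifice_is_worth_it(own_unit_value: float, expected_enemy_loss: float) -> bool:
    """Sacrifice a unit only if it removes more value from the enemy than it costs us."""
    return expected_enemy_loss > own_unit_value

side = SideState(units=10, unit_potential=5.0, property_value=200.0, money=100.0)
print(evaluate(side))                 # 10*5 + 200 + 100 = 350.0
print(sacrifice_is_worth_it(50, 80))  # losing 50 to remove 80 is the right trade
```

In the game this exchange arithmetic is exact; in trading, as the post says, the "enemy loss" side of the inequality is only a probability.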