One armed bandit
Machine learning is now mainstream in statistics and time series forecasting. Moving from primitive rule-based expert systems to self-learning, intelligent systems is a nontrivial task. "One armed bandit" takes this approach; it has been developed over a year and keeps evolving.
Financial markets contain a large number of random events that affect rate changes in hidden ways. Markets are very risky and constantly changing, so it is difficult to create a universal system that generates income consistently and with low risk.
"One armed bandit" identifies non-random components and uses the patterns it finds in its trade decision logic. Since these patterns drift, and sometimes change drastically, the expert needs to be retrained periodically.
The core of the robot is based on reinforcement learning, and its name comes from the mathematical multi-armed bandit problem. Initially the system knows nothing about market patterns; by interacting with a financial asset, it develops an optimal trading strategy. It always trades a single position with a stop loss.
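For readers unfamiliar with the multi-armed bandit problem mentioned above, here is a minimal Python sketch of a textbook epsilon-greedy agent. It is not the adviser's actual algorithm, which is not described here; the class `EpsilonGreedyBandit`, the `simulate` helper, and the Gaussian toy rewards are all illustrative assumptions.

```python
import random


class EpsilonGreedyBandit:
    """Textbook epsilon-greedy agent (illustrative, not the adviser's code)."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean: new = old + (reward - old) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run the agent against arms that pay Gaussian rewards (toy environment)."""
    agent = EpsilonGreedyBandit(len(true_means), epsilon, seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        arm = agent.select_arm()
        reward = env_rng.gauss(true_means[arm], 1.0)
        agent.update(arm, reward)
    return agent
```

With enough steps the agent's estimates tend to single out the arm with the highest average reward, while the occasional random pull keeps it checking the others. A self-learning trading system faces the same explore/exploit trade-off when market patterns drift.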
Features of the trading system:
- Can be trained and traded on any financial instrument and any timeframe (FOREX, stocks, futures, commodities, cryptocurrencies)
- You can run any number of copies of the adviser on different timeframes and symbols by giving each copy a different Order magic
- Before trading, the system must be trained in the optimizer
- Trades on every new bar; not sensitive to spread
- Can be used on netting and hedging accounts
Input parameters:
- Order magic - a unique magic number that identifies this copy's positions
- Opt counter - the iteration counter for the optimizer; this is the only parameter that needs to be optimized, and 5-50 passes are recommended
- Trades frequency - the frequency of trades during training, from 0 to 1; the higher the value, the more trades; 0.2 - 0.5 is recommended
- Regularization - the higher the value, the higher the learning accuracy, but the worse the trading on new data; 0.2 - 0.5 is recommended
- Decision boundary - the threshold for taking a trade, in the range from 0 to 0.5; the higher the value, the greater the accuracy, but the fewer the trades
- Maximum risk - a progressive lot management system; the higher the value, the greater the position volume
- Custom lot - fixed-volume trading; if 0, the progressive lot system (Maximum risk) is used
- Stop loss - protective stop loss for each position
- Breakeven - moves the position to break-even after the specified number of points
Training the system:
- Set the "Trades frequency" parameter according to how many trades you want. This parameter is used only during optimization, to select strategies, and plays no part in live trading.
- Set the "Regularization" parameter; you can usually leave the default of 0.2. It is also used only during optimization and does not need to be adjusted in live trading.
Once the optimization has finished, run the robot in the tester with a longer test period to see how the adviser trades on new data. Adjust the "Decision boundary" parameter to filter out overly frequent trades and improve system performance. Set Custom lot or Maximum risk to a value acceptable to you and check again in the tester. After that, you can save the set file and attach the system to a chart, either applying the saved set or entering the settings manually.
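One way to picture the effect of "Decision boundary" is as a dead zone around a 50/50 signal. The sketch below assumes that the model outputs a probability that price will rise on the next bar, and that a trade is opened only when this probability is further than the boundary from 0.5. Both the probability output and the function names `decide` and `count_trades` are assumptions made for illustration, not the adviser's documented internals.

```python
def decide(prob_up, decision_boundary):
    """Map a hypothetical 'probability of rise' to an action.

    decision_boundary is expected in [0, 0.5]; larger values demand a more
    confident signal before a trade is taken.
    """
    if not 0.0 <= decision_boundary <= 0.5:
        raise ValueError("decision_boundary must be between 0 and 0.5")
    if prob_up > 0.5 + decision_boundary:
        return "buy"
    if prob_up < 0.5 - decision_boundary:
        return "sell"
    return "hold"  # signal falls inside the dead zone: no trade


def count_trades(probs, decision_boundary):
    """Count how many signals would actually result in a trade."""
    return sum(decide(p, decision_boundary) != "hold" for p in probs)
```

For a hypothetical list of signals `[0.1, 0.45, 0.55, 0.7, 0.95]`, a boundary of 0 trades on all five, while a boundary of 0.3 trades only on the two most confident ones.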
Do not train the expert on overly long sections of history: this can take a long time. Focus on testing and evaluating trading on new data.
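The advice above amounts to keeping the training window short and judging the result on bars the optimizer has not seen. Below is a minimal sketch of such a chronological split; `split_history` and the 75/25 proportion are hypothetical choices, not settings of the adviser.

```python
def split_history(bars, train_fraction=0.75):
    """Split chronologically ordered bars into a training part and a later,
    unseen part used only for evaluation (no shuffling, to avoid look-ahead)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(bars) * train_fraction)
    return bars[:cut], bars[cut:]
```

The order of the bars is preserved on purpose: every bar in the evaluation part comes after every bar in the training part, which mirrors extending the test period in the tester.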