Discussing the article: "Neural Networks in Trading: Enhancing Transformer Efficiency by Reducing Sharpness (Final Part)"

Check out the new article: Neural Networks in Trading: Enhancing Transformer Efficiency by Reducing Sharpness (Final Part).
Training for all three models was conducted simultaneously. The results of testing the trained Actor policy are presented below. The testing was performed on real historical data for January 2024, with all other training parameters unchanged.
Before examining the results, I would like to mention several points regarding model training. First, SAM optimization inherently smooths the loss landscape. This, in turn, allows us to consider higher learning rates. While in earlier works we primarily used a learning rate of 3.0e-04, in this case we increased it to 1.0e-03.
Second, using only a single attention layer reduced the total number of trainable parameters, which helps offset the computational overhead of the additional forward and backward pass required by SAM optimization.
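For readers unfamiliar with Sharpness-Aware Minimization, the sketch below illustrates where that extra pass comes from: the gradient at the current weights is first used to perturb them toward the "sharpest" nearby point, and the actual update is then computed from a second forward/backward pass at that perturbed point. This is a minimal Python/PyTorch sketch for illustration only, not the article's implementation; the toy model, the `rho` radius, and the Adam base optimizer are assumptions, with only the 1.0e-03 learning rate taken from the text above.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup standing in for the trained model (assumption).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
loss_fn = nn.MSELoss()
rho = 0.05                                                   # perturbation radius (assumed)
base_opt = torch.optim.Adam(model.parameters(), lr=1.0e-3)   # higher LR, as in the article

def sam_step(x, y):
    """One SAM update: two forward/backward passes per optimizer step."""
    base_opt.zero_grad()

    # First pass: gradient g = dL/dw at the current weights w.
    loss_fn(model(x), y).backward()

    # Ascent step: eps = rho * g / ||g||, applied in place to the weights.
    with torch.no_grad():
        params = [p for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([p.grad.norm(p=2) for p in params]), p=2)
        scale = rho / (grad_norm + 1e-12)
        eps = [p.grad * scale for p in params]
        for p, e in zip(params, eps):
            p.add_(e)            # move to the worst-case nearby point w + eps

    base_opt.zero_grad()

    # Second pass: gradient at the perturbed weights -- the extra pass whose
    # cost the reduced parameter count helps offset.
    loss_fn(model(x), y).backward()

    # Restore the original weights, then update them with the perturbed gradient.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    base_opt.step()

# Usage with dummy data:
x, y = torch.randn(8, 16), torch.randn(8, 4)
sam_step(x, y)
```

Because the second gradient is evaluated at a nearby worst-case point, the resulting update favors flat minima, which is the smoothing effect that makes the larger learning rate tolerable.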
Author: Dmitriy Gizlyk