Discussing the article: "Neural Networks in Trading: Reducing Memory Consumption with Adam-mini Optimization"

 

Check out the new article: Neural Networks in Trading: Reducing Memory Consumption with Adam-mini Optimization.

One way to improve the efficiency of model training and convergence is to refine the optimization method. Adam-mini is an adaptive optimization method designed to improve on the basic Adam algorithm.

When we first started studying neural networks, we discussed various approaches to optimizing model parameters, and we use several of them in our work. Most often I use the Adam method, which adaptively adjusts the learning rate of each model parameter individually. However, this adaptability comes at a price: Adam maintains first- and second-moment estimates for every parameter, each requiring as much memory as the model itself. This memory consumption becomes a significant issue when training large-scale models. In practice, such high memory demands often force computations to be offloaded to the CPU, increasing latency and slowing down training. Given these challenges, the search for new optimization methods and improvements to existing techniques has become increasingly relevant.
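To make the memory overhead concrete, here is a minimal NumPy sketch of a standard Adam step (an illustration only, not the article's MQL5/OpenCL code; the function name and signature are hypothetical). The optimizer keeps two state arrays, m and v, each with the same shape as the parameters, so the optimizer state alone takes roughly twice the memory of the model:

import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # One standard Adam update (t starts at 1). The state arrays m and v
    # each have the same shape as w, so optimizer state alone is ~2x the
    # memory of the model parameters.
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2        # second-moment estimate
    m_hat = m / (1 - beta1**t)                   # bias correction
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return w, m, v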

A promising solution was proposed in the paper "Adam-mini: Use Fewer Learning Rates To Gain More", published in July 2024. The authors introduced a modification of the Adam optimizer that maintains its performance while reducing memory consumption. The new optimizer, called Adam-mini, segments model parameters into blocks and assigns a single learning rate to each block (a simplified sketch of the block-wise update follows the list), which gives it the following advantages:

  • Lightweight: Adam-mini significantly reduces the number of learning rates used in Adam, cutting memory consumption by 45-50%.
  • Efficiency: Despite lower resource usage, Adam-mini achieves performance comparable to or even better than standard Adam.
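As an illustration of the block-wise idea, here is a simplified NumPy sketch (my own approximation, not the authors' reference implementation or the article's MQL5 code). Adam-mini keeps the full per-parameter first moment, but replaces the per-parameter second moment with a single scalar per block, computed from the mean of the squared gradients in that block. Treating each parameter tensor as one block is an assumption made here for brevity; the paper partitions parameters following the model's Hessian structure.

import numpy as np

def adam_mini_step(blocks, grads, m, v_block, t, lr=1e-3,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    # blocks, grads, m: lists of arrays (one entry per parameter block);
    # v_block: list of scalars, one shared second moment per block.
    for i, (w, g) in enumerate(zip(blocks, grads)):
        m[i] = beta1 * m[i] + (1 - beta1) * g    # per-parameter, as in Adam
        # One shared second moment per block: mean of squared gradients.
        v_block[i] = beta2 * v_block[i] + (1 - beta2) * np.mean(g**2)
        m_hat = m[i] / (1 - beta1**t)            # bias correction
        v_hat = v_block[i] / (1 - beta2**t)
        # A single step-size scale applied to the entire block.
        blocks[i] = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return blocks, m, v_block

Because the second-moment state shrinks from one value per parameter to one per block, the optimizer state drops from roughly twice the model size to just over the model size, which is where the reported 45-50% memory saving comes from.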


Author: Dmitriy Gizlyk

 
Hello, I wanted to ask: when I run Study, I get "Error of execution kernel UpdateWeightsAdamMini: 5109". What is the reason, and how can I solve it? Thank you very much in advance.
 
ezequiel moya #:
Hello, I wanted to ask: when I run Study, I get "Error of execution kernel UpdateWeightsAdamMini: 5109". What is the reason, and how can I solve it? Thank you very much in advance.

Good afternoon, can you post the execution log and architecture of the model you are using?

 

 
Dmitry Gizlyk #:

Good afternoon, can you post the execution log and architecture of the model you are using?

Hello, I am sending you the StudyEncoder and Study logs. Regarding the architecture, it is almost the same as the one you presented, except that the number of candles analyzed is 12 and each candle contains 11 data values. Also, in the output layer I have only 4 parameters.
Files:
20240804.log  23 kb