Discussing the article: "Neural networks made easy (Part 76): Exploring diverse interaction patterns with Multi-future Transformer"

Check out the new article: Neural networks made easy (Part 76): Exploring diverse interaction patterns with Multi-future Transformer.

This article continues the topic of predicting upcoming price movement. I invite you to get acquainted with the Multi-future Transformer (MFT) architecture. Its main idea is to decompose the multimodal distribution of the future into several unimodal distributions, which makes it possible to effectively model the diverse patterns of interaction between agents in a scene.
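To make the decomposition idea concrete, here is a minimal, self-contained Python sketch (not from the article; the weights, means, and deviations are illustrative toy values). A multimodal distribution over future outcomes, e.g. "price rises" versus "price falls", can always be written as a weighted sum of unimodal components, and each component can then be modeled by its own path:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a single unimodal (Gaussian) component."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Two unimodal components standing in for two possible futures;
# the weights play the role of the scene-mode confidence scores.
# (weight, mean, std) values are arbitrary toy numbers.
modes = [(0.6, -1.0, 0.5), (0.4, 1.5, 0.7)]

def multimodal_pdf(x):
    """The full (bimodal) distribution is the weighted sum of the modes."""
    return sum(w * normal_pdf(x, mu, s) for w, mu, s in modes)

# The mixture is multimodal, yet every term is unimodal, so each term
# can be predicted independently, which is the premise of MFT.
print(multimodal_pdf(0.0))
```

The point of the sketch is only the identity itself: modeling each unimodal term separately and summing with confidence weights recovers the full multimodal picture.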

The core of the MFT model is the parallel interaction module, which consists of several interaction blocks arranged in a parallel structure and learns the future motion features of the agents for each mode. The three prediction heads are:

  • Motion decoder,
  • Agent score decoder,
  • Scene score decoder.

They are responsible for decoding the future trajectories of each agent and for estimating confidence scores for each predicted trajectory and each scene mode. In this architecture, the paths along which the feed-forward and backpropagation signals of each mode travel are independent of one another, and each path contains a unique interaction block that enables information exchange between signals of the same mode. The interaction blocks can therefore capture the distinct interaction patterns of the different modes simultaneously. The encoders and prediction heads, however, are shared across all modes, while the interaction blocks are parameterized as separate objects. As a result, each unimodal distribution, which in theory has its own parameters, can be modeled in a more parameter-efficient way. The original visualization of the method is shown below.
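The data flow described above can be sketched as a toy NumPy forward pass. This is a simplified illustration under my own assumptions, not the article's implementation: the "interaction blocks" are reduced to per-mode mixing matrices, the shared heads to single linear maps, and all dimensions and names (`W_enc`, `W_int`, `mft_forward`, etc.) are hypothetical. What it does show faithfully is the parameter layout: one shared encoder, K independent per-mode interaction blocks, and three heads shared across modes.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, D, K = 4, 8, 3   # agents, feature dim, number of scene modes
T_FUT = 5                  # predicted future steps

# Shared encoder: one weight matrix used by every mode.
W_enc = rng.standard_normal((D, D)) * 0.1

# Parallel interaction module: each mode owns its OWN interaction
# block, so the K unimodal futures get independent parameters.
W_int = [rng.standard_normal((D, D)) * 0.1 for _ in range(K)]

# Three prediction heads, shared across all modes.
W_motion = rng.standard_normal((D, T_FUT * 2)) * 0.1   # (x, y) per step
W_agent  = rng.standard_normal((D, 1)) * 0.1           # per-trajectory score
W_scene  = rng.standard_normal((D, 1)) * 0.1           # per-scene-mode score

def mft_forward(agent_feats):
    """agent_feats: (N_AGENTS, D) encoded scene context."""
    h = np.tanh(agent_feats @ W_enc)          # shared encoding
    trajs, agent_scores, scene_scores = [], [], []
    for k in range(K):                        # independent per-mode paths
        hk = np.tanh(h @ W_int[k])            # mode-specific interaction
        trajs.append((hk @ W_motion).reshape(N_AGENTS, T_FUT, 2))
        agent_scores.append((hk @ W_agent).ravel())
        scene_scores.append(hk.mean(axis=0) @ W_scene)
    return np.stack(trajs), np.stack(agent_scores), np.array(scene_scores)

x = rng.standard_normal((N_AGENTS, D))
trajs, a_scores, s_scores = mft_forward(x)
print(trajs.shape, a_scores.shape, s_scores.shape)
```

Because only the K small interaction matrices differ between modes while the encoder and heads are reused, adding a mode costs one extra block rather than a full copy of the network, which is the parameter-efficiency argument made above.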

Author: Dmitriy Gizlyk