Discussing the article: "Neural Networks in Trading: A Parameter-Efficient Transformer with Segmented Attention (PSformer)"


Check out the new article: Neural Networks in Trading: A Parameter-Efficient Transformer with Segmented Attention (PSformer).

This article introduces the new PSformer framework, which adapts the vanilla Transformer architecture to multivariate time series forecasting. The framework rests on two key innovations: the Parameter Sharing (PS) mechanism and Segment Attention (SegAtt).

The authors of "PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting" propose an innovative Transformer-based model for multivariate time series forecasting that incorporates parameter sharing principles.

They introduce a Transformer encoder with a two-level segment-based attention mechanism, where each encoder layer includes a shared-parameter block. This block contains three fully connected layers with residual connections, enabling a low overall parameter count while maintaining effective information exchange across model components. To focus attention within segments, they apply a patching method that splits variable sequences into separate patches. Patches occupying the same position across different variables are then grouped into segments. Each segment becomes a spatial extension of a single-variable patch, effectively dividing the multivariate time series into multiple segments.
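To make the segment construction concrete, here is a minimal MQL5 sketch of regrouping same-position patches across variables into segments. It is not taken from the article's code: the function name BuildSegments, the [variables x time] data layout, and the patch_len parameter are illustrative assumptions, and the series length is assumed divisible by patch_len.

//--- illustrative sketch, not from the article's codebase
matrix BuildSegments(const matrix &series, const int patch_len)
  {
   int vars    = (int)series.Rows();          // M variables
   int steps   = (int)series.Cols();          // T time steps
   int patches = steps / patch_len;           // N patches per variable
//--- each segment stacks the patch at one position from every variable,
//--- giving N segments of length M * patch_len
   matrix segments = matrix::Zeros(patches, vars * patch_len);
   for(int s = 0; s < patches; s++)
      for(int v = 0; v < vars; v++)
         for(int t = 0; t < patch_len; t++)
            segments[s][v * patch_len + t] = series[v][s * patch_len + t];
   return segments;
  }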

Within each segment, attention mechanisms enhance the capture of local spatio-temporal relationships, while cross-segment information integration improves overall forecasting accuracy. The authors also incorporate SAM (Sharpness-Aware Minimization) to further reduce overfitting without degrading learning performance. Extensive experiments on long-term time series forecasting datasets show that PSformer delivers strong results, outperforming state-of-the-art models on 6 of 8 key forecasting benchmarks.
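For reference, one SAM update is a two-step procedure: compute the gradient, climb to the worst-case point inside a small rho-ball around the current weights, recompute the gradient there, and apply that gradient to the original weights. Below is a hedged sketch on a plain weight vector; the Gradient() stub is a hypothetical stand-in for a real backward pass, not part of the article's classes.

//--- hypothetical stand-in for a real backward pass returning dLoss/dWeights
vector Gradient(const vector &weights)
  {
   return vector::Zeros(weights.Size());     // placeholder body only
  }
//--- one SAM step: gradient taken at the worst case inside a rho-ball,
//--- applied to the original weights
void SAMStep(vector &weights, const double rho, const double lr)
  {
   vector grad = Gradient(weights);          // first forward/backward pass
   double norm = MathSqrt(grad.Dot(grad));   // L2 norm of the gradient
   if(norm <= 0.0)
      return;
   vector perturbed = weights + grad * (rho / norm);  // ascent step
   vector sam_grad  = Gradient(perturbed);   // second forward/backward pass
   weights = weights - sam_grad * lr;        // descend at the original point
  }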


Author: Dmitriy Gizlyk


I observed that the second parameter 'SecondInput' is unused, as CNeuronBaseOCL's feedForward method with two parameters internally calls the single-parameter version. Can you verify if this is a bug?

class CNeuronBaseOCL : public CObject
  {
   ...
   virtual bool      feedForward(CNeuronBaseOCL *NeuronOCL);
   //--- the two-parameter overload discards SecondInput and falls back to
   //--- the single-input version
   virtual bool      feedForward(CNeuronBaseOCL *NeuronOCL, CBufferFloat *SecondInput)
     { return feedForward(NeuronOCL); }
   ...
  };

Actor.feedForward((CBufferFloat*)GetPointer(bAccount), 1, false, GetPointer(Encoder), LatentLayer);   // ??
Encoder.feedForward((CBufferFloat*)GetPointer(bState), 1, false, GetPointer(bAccount));   // ???
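For what it's worth, the pattern appears to be that the base class's two-parameter overload is a deliberate default for single-input layers, and layers that actually consume a second data stream override it. A purely illustrative sketch of such an override (the class name CNeuronTwoInputExample is hypothetical, not from the library):

//--- illustrative only: a derived layer that uses the second input stream
//--- overrides the two-parameter overload instead of inheriting the base
//--- default that silently drops SecondInput
class CNeuronTwoInputExample : public CNeuronBaseOCL
  {
public:
   virtual bool      feedForward(CNeuronBaseOCL *NeuronOCL, CBufferFloat *SecondInput)
     {
      if(!NeuronOCL || !SecondInput)
         return false;
      //--- ...combine NeuronOCL's output with SecondInput here...
      return true;
     }
  };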