Abstract
In recent years, Transformer-based models have dominated the field of long-term time series forecasting. However, the quadratic complexity of attention mechanisms makes both training and inference computationally expensive. The SOFTS model has emerged as an efficient alternative, replacing attention mechanisms with the STAR module to achieve linear complexity while delivering performance comparable to or better than competing approaches. SOFTS builds on the iTransformer architecture, which marked a significant advance in long-term time series forecasting. Although neither iTransformer nor SOFTS incorporates positional embeddings, our analysis reveals a clear opportunity to improve forecasting accuracy by introducing them. However, the straightforward inclusion of positional embeddings leads to convergence and generalization issues. To address this, we propose a simple yet effective technique: during training, positional embeddings are randomly omitted in certain forward passes, which reduces instability and helps the model generalize better. We call this technique Learnable Stochastic Positional Embedding. Additionally, we incorporate multiple dropout layers to mitigate overfitting and improve accuracy. These modifications result in SOFTS++, a fast and accurate model that achieves the best performance on at least 10 out of 12 standard benchmark datasets. By maintaining linear complexity and requiring minimal computational resources, SOFTS++ stands out as a capable and resource-efficient method for multivariate long-term forecasting tasks.
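For illustration only, the sketch below shows one way the described stochastic omission of a learnable positional embedding could be realized in PyTorch; the class name, the `skip_prob` hyperparameter, and the tensor layout are our assumptions and not the authors' implementation.

```python
import torch
import torch.nn as nn


class StochasticPositionalEmbedding(nn.Module):
    """Learnable positional embedding that is randomly skipped during training.

    Minimal sketch of the idea described in the abstract; names and the
    skip probability are illustrative assumptions.
    """

    def __init__(self, num_positions: int, d_model: int, skip_prob: float = 0.5):
        super().__init__()
        # One learnable embedding vector per position (e.g., per series token).
        self.pos_embed = nn.Parameter(torch.zeros(1, num_positions, d_model))
        nn.init.normal_(self.pos_embed, std=0.02)
        self.skip_prob = skip_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_positions, d_model)
        if self.training and torch.rand(()) < self.skip_prob:
            # Randomly omit the positional embedding in this forward pass.
            return x
        # Otherwise add the learned positional embedding as usual.
        return x + self.pos_embed
```

At evaluation time the embedding is always applied, so the stochastic omission acts only as a training-time regularizer, analogous in spirit to dropout.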
