Abstract
The worldwide surge in AI adoption across enterprise applications has elevated computational demands. AI workloads processed by GPUs typically exhibit large volume, high dynamicity with sudden spikes and dips, and non-linearity. Forecasting the resource demand of AI workloads upfront would facilitate efficient resource management at the cloud service provider's end. Classic models such as ARIMA, RNNs, and LSTM variants that have been employed to forecast cloud workloads suffer from several shortcomings, such as the inability to capture non-linearity in the data, short-term memory, lack of parallel processing, and high computational complexity. Many state-of-the-art time series forecasting solutions employ Transformer neural networks, originally designed for natural language processing, because of their ability to uncover long-range dependencies in sequential data. However, the Transformer architecture requires certain adaptations to capture the temporal relationships among data points precisely. In this paper, a novel Transformer-based fusion model is proposed that enhances the temporal encoding of a multi-head attention (MA) Transformer with gated recurrent units (GRUs). The proposed model also integrates multi-resolution analysis, implemented through the fast wavelet transform (FWT), and a novel data pre-processing pipeline to ensure rich data representation and reduced computational complexity. The FWT-enhanced temporal encoding is then fused with the MA Transformer to forecast GPU workloads in the cloud environment precisely. For experimentation and validation, cluster traces from Alibaba's Platform for Artificial Intelligence are used. Experimental results demonstrate that the proposed fusion model improves on the pure MA Transformer by 67% and on the hybrid LSTM-MA Transformer by 43%. The superior performance of the proposed fusion model is further confirmed by a Pearson correlation coefficient close to 1.
For the primary GPU utilization target, TransGRU achieved a Normalized Root Mean Squared Error (NRMSE) of 1.91%, indicating near-perfect predictive accuracy.
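To make two of the building blocks named above concrete, the following is a minimal NumPy sketch, not the paper's actual implementation: a single level of a Haar fast wavelet transform (the simplest instance of the multi-resolution decomposition the abstract refers to) and an NRMSE computation normalized by the observed range, which is one common convention for the reported metric. Function names and the normalization choice are assumptions for illustration.

```python
import numpy as np

def haar_fwt_level(x):
    """One level of the fast wavelet transform with the Haar wavelet:
    split a signal into a coarse approximation and a detail component.
    Illustrative stand-in for the paper's multi-resolution analysis."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad to even length by repeating the last sample
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-frequency trend
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-frequency fluctuations
    return approx, detail

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the observations, as a percentage
    (one common NRMSE convention; the paper's exact normalizer may differ)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (y_true.max() - y_true.min())
```

Applying `haar_fwt_level` recursively to the approximation coefficients yields the multi-resolution pyramid; the detail coefficients at each level capture the sudden spikes and dips that make GPU workload traces hard for purely recurrent models.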
