Abstract
The worldwide surge in AI adoption across enterprise applications has elevated computational demands. AI workloads processed by GPUs typically exhibit large volume, high dynamicity with sudden spikes and dips, and non-linearity. Forecasting the resource demand of AI workloads upfront would facilitate efficient resource management at the cloud service provider's end. Classic models such as ARIMA, RNNs, and LSTM variants that have been employed to forecast cloud workloads suffer from several shortcomings, such as the inability to capture non-linearity in the data, short-term memory, lack of parallel processing, and high computational complexity. Many state-of-the-art time series forecasting solutions employ Transformer neural networks, originally designed for natural language processing, because of their ability to uncover long-range dependencies in sequential data. However, the Transformer architecture requires certain adaptations to capture the temporal relationships among data points precisely. In this paper, a novel Transformer-based fusion model is proposed that enhances the temporal encoding of a multi-head attention (MA) Transformer with gated recurrent units (GRUs). The proposed model also integrates multi-resolution analysis, implemented through the fast wavelet transform (FWT), and a novel data pre-processing pipeline to ensure rich data representation and reduced computational complexity. The FWT-enhanced temporal encoding is then fused with the MA Transformer to forecast GPU workloads in the cloud environment precisely. For experimentation and validation, cluster traces from Alibaba's Platform for Artificial Intelligence are used. Experimental results demonstrate that the proposed fusion model improves on the pure MA Transformer by 67% and on the hybrid LSTM-MA Transformer by 43%. The superior performance of the proposed fusion model is further confirmed by a Pearson correlation coefficient close to 1.
For the primary GPU utilization target, TransGRU achieved a Normalized Root Mean Squared Error (NRMSE) of 1.91%, indicating near-perfect predictive accuracy.
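To make two of the building blocks named above concrete, the following is a minimal NumPy sketch, not the paper's actual implementation: a single level of a Haar fast wavelet transform (the simplest instance of the multi-resolution decomposition the abstract refers to) and an NRMSE computation normalized by the observed range, which is one common convention for the reported metric. Function names and the normalization choice are assumptions for illustration.

```python
import numpy as np

def haar_fwt_level(x):
    """One level of the fast wavelet transform with the Haar wavelet:
    split a signal into a coarse approximation and a detail component.
    Illustrative stand-in for the paper's multi-resolution analysis."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                      # pad to even length by repeating the last sample
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-frequency trend
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-frequency fluctuations
    return approx, detail

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the observations, as a percentage
    (one common NRMSE convention; the paper's exact normalizer may differ)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (y_true.max() - y_true.min())
```

Applying `haar_fwt_level` recursively to the approximation coefficients yields the multi-resolution pyramid; the detail coefficients at each level capture the sudden spikes and dips that make GPU workload traces hard for purely recurrent models.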
