Sage Journals: Discover world-class research

Abstract

In smart grid management, precise stability prediction is a complicated task that contributes to the effective allocation of resources and to grid stability. Specifically, demand-side management is considered an essential element of the overall smart grid system. Hence, predicting future energy demands is crucial to regulating consumption by aligning utility offerings with consumer demand. This research presents a hybrid deep learning model (Convolutional Neural Network [CNN] with Bi-LSTM) with a two-way attention method and a multi-objective particle swarm optimization method (MPSO) for short-term load prediction in a smart grid. The proposed hybrid model utilizes a two-way attention method at its encoding and decoding stages: an encoding attention layer helps to recognize the essential features of an input vector, and a decoding attention layer helps to resolve the fixed context vector problem by offering better memory capacity. A CNN and Bi-LSTM are used to capture the essential features from the dataset. We also utilize a t-Nearest Neighbours algorithm to pre-process the initial dataset. An MPSO method combines the features of CNN and Bi-LSTM methods, resulting in better prediction accuracy. As far as we know, it is the first work to suggest a dynamic short-term load prediction model that considers different significant features and enables precise predicting outcomes. The proposed model is compared with well-known deep learning models (Recurrent Neural Network, Gated Recurrent Unit, Long Short-Term Memory (LSTM), Time Series Transformer and CNN-LSTM) using the performance measures MAE, MSE, MAPE and RMSE, calculated on the online UCI dataset (Electrical Grid Stability Simulated Dataset). The proposed hybrid model achieved a better prediction result, which demonstrates the efficiency of the proposed model.

Keywords

Energy load prediction, CNN, Bi-LSTM, two-way attention method, MPSO

Introduction

Smart grids’ environmental sustainability, dependability and efficiency have revolutionized energy management and delivery. Accurately forecasting energy needs facilitates the management of a smart grid by maximizing resource allocation, fast response to changes in demand and system stability. Conventional techniques for forecasting energy loads frequently prove inadequate because they cannot effectively process the numerous spatial and temporal connections inherent in energy usage data. Due to this inherent defect, it is imperative to explore more sophisticated predictive models capable of effectively addressing these challenges and generating accurate predictions (Xu et al., 2024).

Deep learning is increasingly important in every aspect of our lives. It has significantly impacted various areas, including cancer detection, precision medicine, autonomous vehicles, predictive forecasting and speech recognition. The hand-crafted feature extractors used in traditional learning, classification and pattern recognition systems do not adapt well to large-scale datasets (Oqaibi and Bedi, 2024). Depending on the level of complexity, deep learning can overcome the limitations of earlier shallow networks, which hindered the efficient processing and representation of hierarchical structures in multidimensional training data. Neural networks are an artificial intelligence (AI; machine learning) technique that is inspired by, and mimics, the structure and functioning of the human nervous system and the brain (Shahinzadeh et al., 2024).

Stability prediction is crucial in smart grids. It promotes durability and enables administrations to implement efficient strategies for planning and operating power systems. Since electricity is a crucial resource, its consumption must be synchronized with its production to prevent unnecessary waste. Stability prediction is characterized by volatility and unpredictability, and any underestimation or overestimation can result in significant issues (Buratto et al., 2024). The global spread of industrialization and automation has significantly increased the production of electrical energy, resulting in improved reliability. In the present day, green energy is widely utilized as an alternative energy source to fulfill the increasing requirements for electricity.

Nevertheless, numerous obstacles arise in preserving the reliability and dependability of the electric grid network. Power systems are among the most challenging components of global energy use. Smart grids have been proposed as a solution to these problems. Smart grids are a multidisciplinary field that combines power systems with technological advances, smart technology, wireless communication and other related areas (Tursunboev et al., 2024).

Although smart grids have several positive attributes, further enhancements are still needed. Currently, load prediction is carried out using various models and techniques. Nevertheless, as a result of rising energy requirements, there is a compelling need to create more capable models (Mbey et al., 2024). The majority of the studies have attempted to address these challenges.

Problem statement

The provision of precise and dependable stability forecasts is essential for the maintenance of grid stability and the optimal distribution of resources in smart grid management. Unfortunately, the complicated, non-linear structures found in utilization data are often difficult for present short-term load forecasting techniques to represent adequately, and this causes poor performance. The existing approaches might not be sufficient to handle the issues brought about by high dimensionality, a wide range of features, and the need for real-time processing. To close this gap, we need an advanced predictive model that can combine multiple critical features, increase memory capacity and improve prediction accuracy.

Bidirectional long short-term memory (LSTM) networks and convolutional neural networks (CNNs) are combined in this work to produce a robust hybrid deep learning model that is improved by a two-way attention mechanism and optimized using multi-objective particle swarm optimization (MPSO). Through the resolution of the flaws in current approaches and provision of a precise short-term energy load prediction in smart grids, this model aims to enhance overall grid stability management.

Motivation of the research

In smart grids, accurate stability forecasts are essential to the efficient allocation of resources, the response to demand and the construction of the general grid. Traditional methods frequently fail to capture the intricate temporal and spatial correlations that are present in stability data, which results in suboptimal performance. To overcome these constraints, this research makes use of cutting-edge deep learning techniques, more specifically a hybrid model that combines Bi-LSTM networks and CNNs and is improved with a two-way attention mechanism. Currently, some smart grid components account for changes in customer electricity consumption at specific times as the smart grid environment takes shape (Miraftabzadeh and Longo, 2023). These components comprise smart energy meters, renewable energy sources and electric vehicles.

The motivation behind this strategy is to improve energy efficiency by means of better resource management and more sustainable grid operations. Furthermore, this strategy seeks to overcome the limitations of traditional approaches by detecting complex patterns in stability data and improving prediction capabilities. Furthermore, optimizing hyperparameters with MPSO can significantly improve computational efficiency and model performance. This study seeks to share and spread this ground-breaking approach to the field, with the hope of inspiring further innovation and continuous improvement.

Key contribution

This research presents a Smart Grid Stability Prediction Model using Two-way Attention Based Hybrid Deep Learning and MPSO. As far as we know, it is the first work to suggest a dynamic short-term load prediction model that considers different significant features and enables precise predicting outcomes (Wang, 2023). The key contributions are as follows:

The proposed hybrid model utilizes the CNN-BiLSTM deep learning model. A CNN extracts essential features, that is, periodic patterns and trends, from the smart grid dataset. Similarly, a Bi-LSTM model helps to extract the temporal (time-based) dependencies.

In the proposed model, we utilize a two-way attention method at the encoding and decoding stages; an encoding attention layer helps recognize the essential features of an input vector, and a decoding attention layer helps resolve the fixed context vector problem by offering better memory capacity.

The proposed hybrid model also uses an MPSO to optimize the hyperparameters. It is a population-based technique that can find the best trade-off in the selection of essential parameters, which helps to enhance the prediction accuracy and minimize the computational complexity and prediction errors.

The proposed model is compared with well-known deep learning models, namely Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), LSTM, Time Series Transformer (TST) and CNN-LSTM, using the performance measures MAE, MSE, MAPE and RMSE, calculated on the online UCI dataset (Electrical Grid Stability Simulated Dataset). The proposed hybrid model achieved a better prediction result, which demonstrates the efficiency of the proposed model.
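For readers unfamiliar with the four error measures named above, the following is a minimal sketch of their standard definitions. The function name `regression_errors` and the sample values are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
import math


def regression_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # MAPE divides by the actual values, so it is undefined when any of them is zero.
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical values for illustration only.
actual = [2.0, 4.0, 5.0, 8.0]
predicted = [2.5, 3.5, 5.0, 7.0]
scores = regression_errors(actual, predicted)
```

Lower values are better for all four measures. MAPE is scale-free, which is why it is often reported alongside the scale-dependent MAE and RMSE.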

Organization of the article

The complete article is organized into sections and subsections, and the details are as follows. ‘Related work’ section covers the related work in the smart grid load prediction field, and ‘Materials and methods’ section covers the materials and methods used in smart grid load prediction research; this section covers the dataset description and the working of the proposed model. ‘Simulation and experimental results’ section covers the experimental results and discussion, and ‘Conclusion and future directions’ section covers the conclusion and future direction of the research.

Related work

Stability Prediction plays a crucial role in the day-to-day operations of energy, particularly in managing fuel resources and making accurate decisions to balance both the demand and production of energy. The advancements in the Smart Grids field are currently in the early stages. Multiple studies have been carried out regarding smart grids, examining their features, standardization and the implementation of smart grid test-bedding. Nevertheless, researchers currently lack effective coordination in developing these concepts during a crucial period of growth. This section presents the analysis of key research in the field of Stability prediction in Smart grids.

In their study, Xu et al. (2024) introduced a framework that utilizes an attention mechanism time series depthwise separable CNN for power load forecasting. Their methodology employs a combination of techniques for prediction, with the attention mechanism enhancing the model's capacity to capture pertinent patterns in the data. The findings demonstrate a substantial enhancement in precision compared to conventional approaches, with a prediction accuracy of roughly 88.9%. A primary obstacle encountered in this work is the requirement for substantial data to adequately train the deep learning model. In another research study, Oqaibi and Bedi (2024) devised a hybrid methodology for predicting electricity load by employing data deconstruction and attention mechanisms. This strategy synergistically harnesses the advantages of various forecasting systems in order to enhance precision. The results indicate a notable enhancement in the accuracy of forecasting when compared to individual approaches, with a prediction accuracy of approximately 90.45%. Nevertheless, there are still obstacles to overcome to optimize the hybrid model and effectively manage the computational complexity that comes with it.

In 2024, Shahinzadeh et al. conducted a comprehensive examination and comparative evaluation of deep learning methodologies for predicting the energy demand in smart grid systems. The researchers analyzed different deep learning models and their effectiveness in load forecasting, and they emphasized the advantages and drawbacks of each strategy. The work offers valuable insights into the efficacy of several deep learning algorithms for load forecasting, aiding researchers in selecting the most appropriate strategy for their specific requirements. Buratto et al. (2024) introduced a Seq2Seq-LSTM model incorporating an attention mechanism for predicting power load in Brazil. The model utilizes the attention mechanism to concentrate on pertinent segments of the input sequence, enhancing the precision of prediction. The results demonstrate that the suggested model surpasses conventional approaches, with a prediction accuracy of approximately 88.6%. Nevertheless, there are still obstacles to overcome to enhance the efficiency of the model's structure and manage the intricacy of the input data.

Tursunboev et al. (2024) created a hybrid deep learning model that combines multiple objectives and evolutionary algorithms to detect instances of energy theft. The model integrates deep learning and evolutionary methods to enhance the identification of energy theft in smart grids. The results indicate that the suggested approach is highly effective in identifying instances of energy theft, achieving a detection accuracy of around 91.7%. Nevertheless, there are still obstacles to overcome in order to optimize the model features and effectively manage the intricacy of the data. In their study, Mbey et al. (2024) introduced a sophisticated deep-learning model that addresses the challenge of projecting solar PV power and electrical demand in smart grid systems. The model incorporates multiple objectives to enhance its performance. The model combines many goals, such as predicting solar power production and electricity consumption, to enhance the system's overall effectiveness. The results demonstrate that the suggested model surpasses conventional approaches, with a prediction accuracy of approximately 90.74%. Nevertheless, there are still obstacles to overcome in order to optimize the model parameters and effectively manage the fluctuations in solar power and electrical consumption.

Liao et al. (2023) introduced a method for short-term load forecasting that utilizes a deep-learning temporal pattern attention mechanism. The approach employs an attention mechanism to capture pertinent patterns within the data and enhance the precision of predictions. The results indicate that the suggested strategy surpasses previous methods, with a prediction accuracy of approximately 89.25%. Nevertheless, there are still obstacles to overcome in order to optimize the attention process and effectively manage the intricacy of the data. Miraftabzadeh and Longo (2023) constructed a sophisticated PV power forecasting model utilizing deep learning and attention methods, which offers enhanced resolution. The model utilizes attention mechanisms to enhance the precision of PV power forecasts. The results indicate the efficacy of the suggested model, achieving a prediction accuracy of approximately 98.93%. Nevertheless, there are still obstacles to overcome in order to improve the efficiency of the model's structure and manage the fluctuations in photovoltaic power generation.

Wang (2023) presented a technique that combines TCN and BiGRU with attention processes to improve energy efficiency through the implementation of smart grid technologies. The approach integrates various deep-learning methodologies to enhance the energy efficiency of smart grid systems. The results demonstrate that the suggested approach surpasses conventional methods, with a notable enhancement in energy efficiency of approximately 90.37%. Nevertheless, there are still obstacles to overcome in order to optimize the model characteristics and manage the intricacy of smart grid systems. In their study, Wan et al. (2023) devised a stability forecasting technique for integrated power and heat systems. The approach integrates CNN and LSTM models with attention mechanisms in order to improve prediction accuracy and efficiency. The results demonstrate that the suggested approach surpasses conventional methods, with a prediction accuracy of approximately 90.6%. Nevertheless, there are still obstacles to overcome to enhance the efficiency of the model structure and manage the intricacy of the data.

Abou Houran et al. (2023) introduced a hybrid deep learning model for PV/wind power forecasting in smart grid applications, utilizing a coati optimization approach. The model integrates the coati optimization method with CNN-LSTM to enhance the precision of PV/wind power forecasts. The results indicate the efficacy of the suggested model, achieving a prediction accuracy of approximately 89.74%. Nevertheless, there are still obstacles to overcome when optimizing the coati optimization method for PV/wind power forecasts. In their study, He et al. (2023) devised a method for predicting multivariate load for demand response using a pre-attention mechanism and CNN. This strategy utilizes pre-attention mechanisms to identify significant patterns in the data and enhances load prediction accuracy. The results demonstrate that the suggested approach surpasses conventional methods, with a prediction accuracy of approximately 89.15%. Nevertheless, there are still obstacles to overcome in order to optimize the pre-attention process and effectively manage the intricacy of the data.

In their study, So et al. (2023) introduced a BiGTA-Net model to predict electrical energy in building energy management systems. The model employs a combination of deep learning techniques to enhance the accuracy of energy forecasts. The findings illustrate the efficacy of the suggested approach, with a prediction accuracy of approximately 90.26%. Nevertheless, there are still obstacles to overcome to enhance the efficiency of the model's structure and manage the intricacies of building energy systems.

In their study, Yi et al. (2023) created a sophisticated deep LSTM-CNN model incorporating self-attention mechanisms and input data reduction techniques for short-term load forecasting. The model integrates LSTM-CNN with self-attention mechanisms to enhance load predictions’ accuracy. The results demonstrate that the suggested model surpasses conventional approaches, with a prediction accuracy of approximately 89.95%. Nevertheless, there are still obstacles to overcome in enhancing the self-attention mechanism and managing the fluctuations in load forecasting data. AlHaddad et al. (2023) created a combined model using a mixture of deep learning techniques to detect unauthorized access in smart grid networks. The model integrates various deep-learning approaches to enhance the accuracy of intrusion detection. The results indicate that the suggested model is highly effective, with an intrusion detection accuracy of approximately 91.7%. Nevertheless, there are still obstacles to overcome in order to optimize the ensemble model and manage the intricacy of smart grid networks.

Kim and Kim (2023) introduced an innovative deep learning-based prediction model enhanced by a heuristic approach. This model is specifically designed for optimizing the energy management of microgrids. The model utilizes a heuristic method to optimize the deep learning model for energy management. The results demonstrate that the suggested model surpasses conventional approaches, attaining a notable enhancement of approximately 92.6% in energy management. Nevertheless, there are still obstacles to overcome to enhance the efficiency of the heuristic algorithm for energy management.

In another energy prediction study, Khan et al. (2023) devised an AI-assisted hybrid method to manage energy in an Internet of Things (IoT)-based smart microgrid. The methodology integrates AI methodologies with IoT technology to enhance energy management efficiency. The results indicate the efficacy of the suggested method, with a notable enhancement in energy management efficiency of approximately 90.85%. Nevertheless, there are still obstacles to overcome in order to optimize the AI-assisted strategy for various smart microgrid applications.

In 2023, Safari et al. introduced a novel hybrid neural network model that accurately predicts the electrical load of smart cities over multiple periods. The model utilizes neural networks to enhance the precision of load forecasting in intelligent urban areas. The results showcase the efficacy of the suggested model, with a prediction accuracy of approximately 91.4%. Nevertheless, there are still obstacles to overcome in order to enhance the model's design for various smart city applications. Ma and Mei (2022) introduced a novel method that combines attention-based deep learning techniques to forecast wind power. The methodology integrates attention processes with deep learning to enhance the precision of wind power prediction. The results indicate the efficacy of the suggested method, with a prediction accuracy of approximately 90.33%. Nevertheless, there are still obstacles to overcome in order to enhance the efficiency of the model's structure and manage the fluctuations in wind power production. Table 1 presents a comparative analysis of existing methods in smart energy load prediction.

Table 1.

Comparative analysis of existing methods in smart energy load prediction.

Reference	Key method	Contribution	Dataset	Outcome	Future directions
Xu et al. (2024)	Attention mechanism time series depthwise separable CNN	Enhanced prediction accuracy using attention mechanism	UCI	Prediction accuracy of 88.9%	Need for substantial data to train deep learning models
Oqaibi and Bedi (2024)	Hybrid methodology with data deconstruction and attention mechanisms	Improved accuracy by combining multiple forecasting systems	Kaggle	Prediction accuracy of 90.45%	Optimize hybrid model and manage computational complexity
Shahinzadeh et al. (2024)	Comparative evaluation of deep learning models	Insights into effectiveness of different deep learning models	Various public datasets	Analysis of various models	Aid in selecting appropriate strategies for specific needs
Buratto et al. (2024)	Seq2Seq-LSTM with attention mechanism	Enhanced prediction precision for power load	UCI	Prediction accuracy of 88.6%	Enhance model structure efficiency and manage data complexity
Tursunboev et al. (2024)	Hybrid deep learning with evolutionary algorithms	Effective identification of energy theft	Self-created	Detection accuracy of 91.7%	Optimize model features and manage data intricacy
Mbey et al. (2024)	Deep learning model for solar PV power and electrical demand	Enhanced system performance for PV power and electricity forecasting	Self-created	Prediction accuracy of 90.74%	Optimize model parameters and manage power fluctuations
Liao et al. (2023)	Deep learning temporal pattern attention mechanism	Improved precision of short-term load forecasting	Kaggle	Prediction accuracy of 89.25%	Optimize attention process and manage data intricacy
Miraftabzadeh and Longo (2023)	Deep learning and attention methods for PV power forecasting	Enhanced resolution for PV power forecasts	Self-created	Prediction accuracy of 98.93%	Improve model structure efficiency and manage power fluctuations
Wang (2023)	TCN and BiGRU with attention mechanisms	Enhanced energy efficiency through smart grid technologies	UCI	Energy efficiency improvement of 90.37%	Optimize model characteristics and manage system complexity
Wan et al. (2023)	CNN, LSTM and attention processes for stability forecasting	Improved accuracy and efficiency in power and heat systems	Kaggle	Prediction accuracy of 90.6%	Enhance model structure efficiency and manage data complexity
Abou Houran et al. (2023)	Hybrid model with coati optimization for PV/wind power forecasting	Improved precision for PV/wind power forecasts	Self-created	Prediction accuracy of 89.74%	Optimize coati optimization method for power forecasts
He et al. (2023)	Pre-attention mechanism and CNN for multivariate load prediction	Enhanced load prediction accuracy	UCI	Prediction accuracy of 89.15%	Optimize pre-attention process and manage data intricacy
So et al. (2023)	BiGTA-Net for building energy management systems	Enhanced accuracy of energy forecasts	Self-created	Prediction accuracy of 90.26%	Enhance model structure efficiency and manage system complexity
Yi et al. (2023)	Deep LSTM-CNN with self-attention and data reduction	Enhanced accuracy of short-term load forecasting	UCI	Prediction accuracy of 89.95%	Enhance self-attention mechanism and manage data fluctuations
AlHaddad et al. (2023)	Combined deep learning techniques for intrusion detection	Enhanced accuracy of intrusion detection in smart grids	Self-created	Intrusion detection accuracy of 91.7%	Optimize ensemble model and manage network intricacy
Kim and Kim (2023)	Deep learning-based prediction with heuristic approach	Optimized energy management of microgrids	UCI	Energy management improvement of 92.6%	Enhance heuristic algorithm efficiency
Khan et al. (2023)	AI-assisted hybrid method for smart microgrid energy management	Improved energy management efficiency	Kaggle	Energy management efficiency of 90.85%	Optimize AI-assisted strategy for various applications
Safari et al. (2023)	Hybrid neural network for smart city load forecasting	Accurate prediction of electrical load in smart cities	UCI	Prediction accuracy of 91.4%	Enhance model design for smart city applications
Ma and Mei (2022)	Attention-based deep learning for wind power forecasting	Improved precision of wind power prediction	Self-created	Prediction accuracy of 90.33%	Enhance model structure efficiency and manage power fluctuations
Proposed Hybrid Model	CNN with Bi-LSTM, Two-way Attention and MPSO	Achieved a better prediction result	UCI		Achieve better performance

Materials and methods

This section describes the key methods used in this research.

Proposed hybrid model

The proposed model combines CNNs and Bidirectional LSTM Networks, augmented with a two-way attention procedure and optimized using MPSO to forecast Stability in a smart grid. Furthermore, the t-Nearest Neighbours (t-NN) preprocessing technique is employed to augment data quality and enhance the accuracy of predictions (Ji et al., 2022). Figure 1 presents the architecture of the proposed hybrid model.

Figure 1.

The architecture of the proposed hybrid model.
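The attention idea in this architecture can be illustrated with a small sketch. The text does not specify the scoring function used by the two attention layers, so scaled dot-product scoring is an assumption here. The names `attend`, `encoder_states` and `decoder_query` are hypothetical. The point of the sketch is that the context vector is recomputed for each query as a weighted sum of all states, rather than being a single fixed vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Return (weights, context) for one query vector over a list of state vectors."""
    scale = math.sqrt(len(query))
    # Assumed scoring rule: scaled dot product between the query and each state.
    scores = [sum(q * s for q, s in zip(query, state)) / scale for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    # The context is the attention-weighted sum of the states.
    context = [sum(w * state[i] for w, state in zip(weights, states)) for i in range(dim)]
    return weights, context


# Hypothetical encoder outputs (three time steps, two features each).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_query = [1.0, 1.0]
weights, context = attend(decoder_query, encoder_states)
```

In this toy input the third state is most similar to the query, so it receives the largest weight. A different query would yield a different context vector from the same states.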

Convolutional Neural Network module

The proposed hybrid model incorporates a CNN module to extract spatial characteristics from its input data. The extracted characteristics are then used by the subsequent layers of the proposed hybrid framework (Liao et al., 2023). Figure 2 presents the architecture of the CNN Model. Below is a detailed description of the functioning of the CNN component.

Figure 2.

The architecture of the CNN model.

Convolutional layer

The convolutional layers are the central element of the CNN module. The purpose of these layers is to perform convolution operations on the input data to acquire spatial characteristics (Miraftabzadeh and Longo, 2023).

Convolutional Operations: To perform a convolution operation, a filter, known as a kernel, is slid over the input data, and the dot product of the filter and the input data is calculated at each position. This operation produces a feature map (also called an activation map) that emphasizes particular patterns within the input data.

Filter: Filters are compact matrices consisting of trainable parameters. During the training process, the CNN learns the most effective values for these filter parameters so that it can accurately detect significant patterns within the input data. In order to capture distinct patterns, individual convolutional layers may employ multiple filters (Chung et al., 2022).

Stride and Padding: Stride refers to the distance by which the filter is shifted across the input data at each step. With a stride of one, the filter moves one position at a time; with a stride of two, it moves two positions at a time. Padding refers to adding extra border values around the input data to control the spatial dimensions of the resulting feature map. ‘Valid’ (no padding) and ‘Same’ (padding so that the resulting feature map has the same dimensions as the input) are two common padding mechanisms (Tang et al., 2022).
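The sliding dot product, stride and padding described above can be sketched for a one-dimensional input as follows. This is an illustrative sketch, not the authors' implementation: the function name `conv1d`, the zero-padding scheme and the sample signal and kernel are assumptions. As in most deep learning libraries, the kernel is not flipped.

```python
def conv1d(signal, kernel, stride=1, padding="valid"):
    """Slide `kernel` over `signal` and return the list of dot products."""
    k = len(kernel)
    if padding == "same":
        # Zero-pad so that, with stride 1, the output has the input's length.
        total = k - 1
        left = total // 2
        signal = [0.0] * left + list(signal) + [0.0] * (total - left)
    elif padding != "valid":
        raise ValueError("padding must be 'valid' or 'same'")
    feature_map = []
    # The stride sets how far the window moves between consecutive dot products.
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        feature_map.append(sum(w * x for w, x in zip(kernel, window)))
    return feature_map


# Hypothetical input and a simple difference kernel.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.0, -1.0]
valid_out = conv1d(x, w)                   # 3 outputs: no padding
same_out = conv1d(x, w, padding="same")    # 5 outputs: same length as the input
strided_out = conv1d(x, w, stride=2)       # 2 outputs: window moves two positions
```

With no padding and stride 1, an input of length n and a kernel of length k give n - k + 1 outputs, which is why `valid_out` is shorter than the input.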

Activation function

The activation function introduces non-linearity into the model after each convolution operation. We utilize a ReLU activation function in the proposed hybrid model, as presented by equation (1). The Rectified Linear Unit (ReLU) function replaces each negative value within the feature map with zero, enabling the model to capture and learn non-linear features (Ozcan et al., 2021):

$$\mathrm{ReLU}(A) = \max(0, A) \tag{1}$$
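Applied element-wise to a feature map, equation (1) amounts to the following short sketch. The sample values are hypothetical.

```python
def relu(values):
    """Apply ReLU(A) = max(0, A) to every element of a feature map."""
    return [max(0.0, v) for v in values]


# Hypothetical feature map containing negative and positive activations.
feature_map = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = relu(feature_map)
```

Negative entries become zero and non-negative entries pass through unchanged.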

Pooling layers

Pooling layers are employed to decrease the spatial dimensions of the feature maps, thus reducing the number of parameters and the computational complexity.

Max Pooling: Max pooling is a process where a window is moved across the feature map, and the largest value inside the window is selected. This operation preserves the essential characteristics while decreasing the spatial dimensions.

Average Pooling: Average pooling is a process where the mean value inside a given window is calculated rather than the highest possible value being selected. This procedure applies a smoothing effect to the feature map.
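The two pooling variants above can be compared on a small one-dimensional feature map. This is a sketch under assumptions: the function name `pool1d`, the default of non-overlapping windows and the sample values are illustrative choices, not details from the paper.

```python
def pool1d(feature_map, window, stride=None, mode="max"):
    """Reduce `feature_map` by taking the max or mean of each window."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    # Assumed default: non-overlapping windows (stride equal to the window size).
    stride = window if stride is None else stride
    pooled = []
    for start in range(0, len(feature_map) - window + 1, stride):
        values = feature_map[start:start + window]
        pooled.append(max(values) if mode == "max" else sum(values) / len(values))
    return pooled


# Hypothetical feature map.
fmap = [1.0, 3.0, 2.0, 8.0, 4.0, 6.0]
max_pooled = pool1d(fmap, window=2, mode="max")
avg_pooled = pool1d(fmap, window=2, mode="average")
```

Both variants halve the length here. Max pooling keeps the peak of each window, while average pooling smooths it.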

Flattening layer

It is common practice to flatten the feature maps into a one-dimensional vector after the convolution and pooling layers have been applied. During flattening, all 2D or 3D feature maps are transformed into a 1D vector, which can then be fed into the fully connected (dense) layers for further processing.

Dense/fully connected layer

The flattened vector is passed through one or more fully connected layers. These successive layers learn complex representations by combining the features derived from the convolutional layers. A dense layer comprises neurons that are each connected to every neuron in the preceding layer. Each neuron applies an activation function to the weighted sum of its inputs and passes the result to the subsequent layer (Ma and Mei, 2022).

Output layer

The predictions are generated in the output layer, the final layer of the CNN module. For regression tasks such as stability prediction, the output layer commonly employs a single neuron with a linear activation function (LAF), which produces continuous values. The output layer generates the final energy consumption prediction.
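The flattening, dense and output stages described above can be sketched as one forward pass. All weights, biases and feature-map values below are hypothetical; in a trained network they would be learned. The helper names `flatten` and `dense` are illustrative.

```python
def flatten(feature_maps):
    """Concatenate a list of 1D feature maps into one flat vector."""
    return [v for fmap in feature_maps for v in fmap]


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: each neuron takes a weighted sum of all inputs plus a bias."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def relu(z):
    return max(0.0, z)


# Hypothetical pooled feature maps from two filters.
feature_maps = [[1.0, 2.0], [3.0, 4.0]]
flat = flatten(feature_maps)

# Hidden dense layer with two neurons and ReLU (weights are made up).
hidden = dense(flat, [[0.5, 0.0, 0.0, 0.5], [-1.0, 0.0, 0.0, 0.0]], [0.0, 0.5], activation=relu)

# Output layer: a single neuron with a linear activation, as used for regression.
prediction = dense(hidden, [[2.0, 1.0]], [0.1])
```

Leaving `activation` unset in the last call gives the linear output neuron, so the prediction can take any real value.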

Bi-LSTM module

The BiLSTM component improves the model's capacity to comprehend temporal relationships in stability forecasting by analyzing the incoming values in both directions, backward and forward. The bidirectional strategy, in conjunction with the efficacy of LSTM networks, ensures that the model encompasses extensive sequence information, resulting in more precise and dependable predictions. The output generated by the BiLSTM module is subsequently combined with the CNN and attention methods to generate accurate predictions for the final energy load (Syed et al., 2021). A BiLSTM contains two layers. Figure 3 presents the architecture of the BiLSTM model:

Forward LSTM: It processes the input sequence from start to end.

Backward LSTM: It processes the input sequence from end to start.

Figure 3.

Architecture of BiLSTM model.

At each time step, the outputs of both LSTM units are combined, allowing the network to incorporate information from both previous and upcoming contexts. A Bi-LSTM includes the following layers (Ünal et al., 2021):

Input Layer: The input data consists of preprocessed features, which can be obtained through the CNN component or from the source during the preprocessing stage.

Embedding Layer: When the input features contain categorical variables, these are converted into dense vectors by an embedding layer.

Forward LSTM: The forward LSTM analyses the input sequence step by step, from the initial time step to the final time step. It captures the chronological order of events from the past up to the present.

Backward LSTM: The backward LSTM processes the input sequence in reverse order, starting from the final time step and moving towards the first. It captures the order of events from the future back to the past.

Concatenation Layer: At each time step, the outputs of the forward and backward LSTMs are concatenated. The joined vector carries information from both directions, which gives the model the full sequence context.

Dense/Fully Connected Layer: The concatenated outputs are passed through one or more fully connected (dense) layers. These layers learn more complex representations and generate the final predictions.

Output Layer: Finally, the predictions for the Stability are generated by the final layer. A typical example of this is a single neuron with a linear activation function used for regression analysis.
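The forward pass, backward pass and concatenation listed above can be sketched as follows. To keep the example short, a simple tanh recurrent cell stands in for a full LSTM cell; this substitution, the weights, the seeds and the input sequence are all assumptions made for illustration only.

```python
import math
import random


def make_cell(input_size, hidden_size, seed):
    """Build a simple tanh recurrent cell (a stand-in for an LSTM cell)."""
    rng = random.Random(seed)
    w_x = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
           for _ in range(hidden_size)]
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
           for _ in range(hidden_size)]

    def cell(x, h_prev):
        return [
            math.tanh(sum(a * b for a, b in zip(w_x[j], x))
                      + sum(a * b for a, b in zip(w_h[j], h_prev)))
            for j in range(hidden_size)
        ]

    return cell


def run(cell, sequence, hidden_size):
    """Run the cell over the sequence and return the hidden state at every step."""
    h = [0.0] * hidden_size
    states = []
    for x in sequence:
        h = cell(x, h)
        states.append(h)
    return states


def bidirectional(sequence, fwd_cell, bwd_cell, hidden_size):
    forward = run(fwd_cell, sequence, hidden_size)
    # Process the reversed sequence, then flip the states back into time order.
    backward = run(bwd_cell, sequence[::-1], hidden_size)[::-1]
    # Concatenation layer: join both directions at each time step.
    return [f + b for f, b in zip(forward, backward)]


sequence = [[0.1, 0.2], [0.3, 0.1], [0.5, 0.4], [0.2, 0.6]]
hidden_size = 3
fwd = make_cell(2, hidden_size, seed=1)
bwd = make_cell(2, hidden_size, seed=2)
outputs = bidirectional(sequence, fwd, bwd, hidden_size)
```

Each output vector has twice the hidden size: the first half summarizes the past up to that time step and the second half summarizes the future from that time step onwards.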

Optimization based on MPSO

Multi-objective Particle Swarm Optimization is an advanced optimization technique inspired by the social behaviour of birds flocking or fish schooling. It is particularly effective for solving complex optimization problems with multiple conflicting objectives. In the context of the ‘Smart Grid Stability Prediction Model using Two-way Attention Based Hybrid Deep Learning (CNN and BiLSTM), t-NN Pre-processing and Multi-objective PSO’, MPSO is used to optimize various parameters and hyperparameters of the model to improve its prediction accuracy and efficiency (Ünal et al., 2021).

In MPSO, the primary objective is to identify the best trade-offs between multiple objectives. This frequently involves optimizing conflicting criteria, such as minimizing prediction error while maximizing computational efficiency. The MPSO operates as follows:

Initialization: It mainly includes the initialization of populations and important objective functions.

- Particles and Swarms: It is the responsibility of MPSO to initiate the formation of a swarm of particles, each of which stands for a possible solution to the optimization problem. There is a position vector for each particle in the solution space and a vector of velocity that directs the particle's movement.

- Objective Functions: To optimize the objective functions, it is necessary to define them. In the context of Stability prediction, these may include minimizing MSE, reducing computation time and optimizing resource distribution.

Evaluation: This phase mainly includes fitness calculations. The suitability of each particle is assessed against the specified objective functions, and its fitness is calculated from how well it fulfils them. Multi-objective fitness entails computing a set of Pareto-optimal outcomes instead of a single optimal result.

Update the Velocity and Positions: This phase updates the velocity and position of each particle.

- Velocity update: The velocity is updated using three main components, that is, inertia, cognitive and social. It is calculated by equation (2):

V_{i} (t + 1) = [w \times V_{i} (t)] + [(C_{1} \times R_{1}) \times (P_{Besti} - A_{i} (t))] + [(C_{2} \times R_{2}) \times (G_{Besti} - A_{i} (t))]

(2)

- Position update: The position is updated by equation (3):

A_{i} (t + 1) = [A_{i} (t) + V_{i} (t + 1)]

(3)

where w represents the inertia weight, $C_{1}$ and $C_{2}$ represent the cognitive and social coefficients, $R_{1}$ and $R_{2}$ represent random numbers from zero to one, $A_{i}$ and $V_{i}$ represent the position and velocity of particle i, and $P_{Besti}$ and $G_{Besti}$ represent the personal best and global best positions.

Update the Best fits (Personal and Global): Every particle maintains a record of the best position it has encountered, determined by its fitness value, to establish its personal best. The global best position is the most optimal position discovered by any particle within the swarm. The update occurs when a particle discovers a solution superior to the current global best.
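The update rules in equations (2) and (3) and the Pareto comparison used in the evaluation phase can be sketched as below. The default values of `w`, `c1` and `c2` and the example objective pairs are hypothetical. The paper does not state how the best positions are selected when objectives conflict, so the dominance test shown here is only the standard multi-objective convention.

```python
import random


def update_particle(position, velocity, p_best, g_best,
                    w=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply equations (2) and (3) to one particle, dimension by dimension."""
    new_position, new_velocity = [], []
    for a, v, pb, gb in zip(position, velocity, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        # Equation (2): inertia + cognitive + social components.
        v_next = w * v + c1 * r1 * (pb - a) + c2 * r2 * (gb - a)
        new_velocity.append(v_next)
        # Equation (3): move the particle by its new velocity.
        new_position.append(a + v_next)
    return new_position, new_velocity


def dominates(f_a, f_b):
    """True if f_a is no worse than f_b in every objective and better in one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(f_a, f_b))
            and any(x < y for x, y in zip(f_a, f_b)))


def pareto_front(objective_vectors):
    """Keep only the objective vectors that no other vector dominates."""
    return [f for f in objective_vectors
            if not any(dominates(g, f) for g in objective_vectors)]


# Hypothetical (MSE, training time in seconds) pairs for four particles.
candidates = [(0.010, 5.0), (0.012, 3.0), (0.011, 6.0), (0.009, 9.0)]
front = pareto_front(candidates)
```

In this example the third candidate is discarded because the first one is better in both objectives, while the remaining three represent different trade-offs between error and computation time.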

Two-way attention mechanism

The proposed hybrid model incorporates a dual attention method, also called a two-way attention method, to improve its predictive abilities. This method enables the model to concentrate on pertinent segments of the input pattern from both the forward and the backward direction, resulting in a more holistic comprehension of the data. The two-way attention process in the hybrid model is expressed by equation (4):

{Dual}^{Attention} = [{Forward}^{Attention} + {Backward}^{Attention}]

(4)

The two-way attention mechanism expands upon the conventional attention mechanism by integrating both forward and backward attention. The model can therefore consider the preceding and the subsequent context in the input sequence, which improves its capacity to comprehend temporal relationships and intricate patterns.
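A minimal sketch of equation (4) is given below. The scoring function is not specified in the text, so dot-product attention with a shared query vector is assumed here; the hidden states and the query are hypothetical values.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(states, query):
    """Dot-product attention: weight each state by its similarity to the query."""
    scores = [sum(s * q for s, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(w * state[d] for w, state in zip(weights, states))
               for d in range(dim)]
    return context, weights


def dual_attention(forward_states, backward_states, query):
    """Equation (4): sum of the forward and backward attention contexts."""
    fwd_context, _ = attention(forward_states, query)
    bwd_context, _ = attention(backward_states, query)
    return [f + b for f, b in zip(fwd_context, bwd_context)]


# Hypothetical hidden states from the forward and backward LSTM (2 steps, 2 units).
forward_states = [[1.0, 0.0], [3.0, 2.0]]
backward_states = [[0.0, 2.0], [2.0, 4.0]]
context = dual_attention(forward_states, backward_states, query=[0.5, -0.2])
```

With a zero query every time step receives the same weight, so each context reduces to the plain average of the hidden states; a learned query shifts the weight towards the most relevant time steps.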

Dataset

This research utilizes the Electrical Grid Stability Simulated Dataset, which is a publicly accessible dataset from the University of California, Irvine (UCI) Machine Learning Repository. The dataset was designed to facilitate research and analysis in the field of electrical grid stability. It includes synthetic data that mimics various crucial factors impacting the stability of an electrical grid.

The dataset displays information on frequency, voltage, current and power flow. To comprehend and forecast the stability and performance of electrical grids, one must consider these variables. The dataset is generated using sophisticated mathematical models and simulation techniques to accurately replicate grid conditions in the real world. This dataset offers a robust platform for the development and evaluation of stability prediction models, encompassing a diverse set of scenarios and operating states. This synthetic method enables the dataset to accurately capture the complexity and variability found in real electrical grids. As a result, it becomes a valuable resource for researchers and practitioners in the field.

Data preprocessing

The initial data preprocessing stage involves gathering pertinent historical energy usage data and corresponding characteristics. The data is easily obtained from smart meters, weather observatories and other pertinent sources (Jiang et al., 2021). The characteristic attributes comprise:

Historical data on the amount of energy consumed, measured in kilowatt-hours (kWh)

Weather data (including the humidity, temperature and speed of the wind)

Temporal attributes (such as hour of the day, day of the week and month)

Data cleaning

Data cleaning is the process of eliminating or rectifying inaccurate or incomplete data to ensure the integrity of the dataset. This step encompasses the following:

Dealing with Missing Values: The t-NN algorithm fills in missing values in the dataset. This entails identifying the t closest neighbours for every missing value using Euclidean distance and substituting the missing value with the mean of these neighbours.

Outlier identification and elimination: Outliers can have a substantial impact on the performance of a model. Methods such as Z-score or IQR (interquartile range) are employed to detect and eliminate outliers from the dataset.
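The two outlier rules named above can be sketched with the standard library. The thresholds (3 standard deviations, 1.5 times the IQR) are the conventional defaults and are assumptions here, as the text does not state the values used; the sample readings are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]


readings = [10, 11, 12, 11, 10, 12, 11, 50]  # hypothetical readings with one spike
flagged = iqr_outliers(readings)
cleaned = [v for i, v in enumerate(readings) if i not in flagged]
```

On very short series the Z-score rule can miss an obvious spike, because the spike itself inflates the standard deviation; the IQR rule is less sensitive to this effect.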

Feature engineering

Feature engineering entails creating novel features or modifying existing ones to enhance the predictive capability of the model (Lilhore et al., 2022b). This step encompasses the following:

Time-dependent features: Create supplementary attributes such as weekday, time of day and special occasions to capture temporal trends in energy usage.

Lag Features: Generate lagged features that capture previous values of the target variable, such as energy consumption in previous time intervals. These characteristics aid the model in comprehending temporal dependencies.

Moving Statistics: Generate moving statistics, such as the mean and standard deviation, using various window sizes to capture patterns and seasonal variations in the data.
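Lag features and moving statistics can be generated as in the following sketch. The lag values, the window size and the sample series are hypothetical; positions without enough history are marked with `None`.

```python
import statistics


def lag_features(series, lags):
    """For each time step t, collect series[t - lag] for every requested lag."""
    return [[series[t - lag] if t - lag >= 0 else None for lag in lags]
            for t in range(len(series))]


def moving_stats(series, window):
    """Trailing mean and population standard deviation over the last `window` values."""
    out = []
    for t in range(len(series)):
        if t + 1 < window:
            out.append((None, None))  # not enough history yet
        else:
            chunk = series[t + 1 - window:t + 1]
            out.append((statistics.fmean(chunk), statistics.pstdev(chunk)))
    return out


load = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical consumption values
lags = lag_features(load, lags=[1, 2])
stats = moving_stats(load, window=3)
```

Rows that still contain `None` are usually dropped before training, so the first few time steps of the series are lost.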

Normalization

Normalization is essential for ensuring that all features are standardized to a comparable scale, accelerating the deep learning model's convergence and enhancing its performance. This step encompasses the following:

Min-max scaling

Normalize the features to a predetermined range, typically [0, 1]. Min-max scaling is defined by equation (5):

R_{SCALED} = \frac{[R - R_{MIN}]}{[R_{MAX} - R_{MIN}]}

(5)

Standardization

Standardize the features to a mean of zero and a standard deviation of one. Equation (6) presents the formula for standardization. Here, $β$ is the mean and $σ$ is the standard deviation of the feature:

R_{STANDARDIZED} = [\frac{R - β}{σ}]

(6)
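Equations (5) and (6) can be sketched for a single feature column as follows. The handling of a constant column (returning zeros) is an assumption added to avoid division by zero.

```python
import statistics


def min_max_scale(values):
    """Equation (5): map the values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Equation (6): subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


column = [2.0, 4.0, 6.0]  # hypothetical feature column
scaled = min_max_scale(column)
z_scores = standardize(column)
```

In practice the minimum, maximum, mean and standard deviation should be computed on the training set only and then reused for the validation and testing sets, so that no information leaks from the held-out data.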

tNN-Preprocessing

The t-NN preliminary processing phase improves the data quality by filling in the missing values and reducing noise. The procedure has the following components.

Euclidean distance calculation

Determine the Euclidean distance between every data point and the other data points in the dataset.

Choice of neighbours

Pick the t closest neighbours for each data point.

Imputation along with smoothing

If a value is missing, it should be filled in by averaging the results obtained from t-NN. To eliminate noise in the data, replace each value with the mean of its t nearest neighbours.
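The three t-NN steps above (distance calculation, neighbour selection and imputation) can be sketched as follows. The sketch assumes that neighbours are drawn only from rows without missing values and that distances are computed over the features the incomplete row actually has; neither detail is stated in the text, and the sample rows are hypothetical.

```python
import math


def tnn_impute(rows, t=2):
    """Fill None entries with the mean of that feature over the t nearest complete rows."""
    complete = [r for r in rows if None not in r]  # must be non-empty
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(candidate):
            # Euclidean distance over the observed features of this row.
            return math.sqrt(sum((row[j] - candidate[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=distance)[:t]
        filled.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return filled


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [10.0, 100.0], [2.5, None]]
imputed = tnn_impute(rows, t=2)
```

The same neighbour search can be reused for smoothing by replacing an observed value with the neighbour mean instead of filling only the missing entries.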

Splitting the dataset

Dividing the dataset into training, validation and testing sets is crucial for assessing the performance of the model and preventing overfitting. This step encompasses the following:

Training data sample

Utilized for model training. For this study, we use 80% of the data available as the training set.

Testing data sample

Utilized for model testing. We are utilizing 10% of the data available for this study as the testing set.

Validation data samples

Utilized for model validation. This study uses 10% of the data as the validation set.
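The 80/10/10 division can be sketched as below. The text does not say whether the samples are shuffled before splitting, so this sketch assumes the original order is kept; the 100-row dataset is hypothetical.

```python
def split_dataset(rows, train=0.8, val=0.1):
    """Split rows into training, validation and testing parts in their given order."""
    n = len(rows)
    n_train = round(n * train)
    n_val = round(n * val)
    train_rows = rows[:n_train]
    val_rows = rows[n_train:n_train + n_val]
    test_rows = rows[n_train + n_val:]  # whatever remains, here 10%
    return train_rows, val_rows, test_rows


samples = list(range(100))  # hypothetical dataset of 100 rows
train_set, val_set, test_set = split_dataset(samples)
```

Taking the testing set as the remainder guarantees that every sample is assigned to exactly one of the three sets, even when the dataset size is not divisible by ten.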

Data reshaping/transformation

Data transformation refers to restructuring the data into a form that is appropriate for feeding into the deep learning model (AlHaddad et al., 2023, He et al., 2023, Khan et al., 2023, Kim and Kim, 2023, Safari et al., 2023, So et al., 2023, Yi et al., 2023). This step encompasses the following:

Reshaping for CNN

When employing a CNN, it may be necessary to transform the data into a two-dimensional format, where the dimensions correspond to time steps and features.

Sequence generation for LSTM

In the case of LSTM networks, the input data is arranged into sequences, where each sequence is a specific time window of previous observations used to forecast future values (Bisen et al., 2023).
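Sequence generation with a sliding window can be sketched as follows. The window length, the forecast horizon and the sample series are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Pair each window of past observations with the value `horizon` steps ahead."""
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


series = [1, 2, 3, 4, 5, 6]
X, y = make_windows(series, window=3)
```

When each element of `series` is itself a list of features, every window is already a two-dimensional block of time steps by features, which is the layout described above for the CNN input.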

Key performance indicators

The following key performance indicators are used to measure the performance of proposed hybrid and existing deep learning models.

MAE

MAE is the mean of the absolute errors between predicted and actual values, as defined by equations (7) and (8). Here, $E^{T}$ is the error, $P^{T}$ and $A^{T}$ are the predicted and actual values, and n is the number of data samples:

E^{T} = [P^{T} - A^{T}]

(7)

{MAE}^{T} = \frac{1}{n} \sum_{i = 1}^{n} | E^{T} |

(8)

MAPE

MAPE is the mean of the absolute errors, each divided by the corresponding actual value, as defined by equation (9). The result is multiplied by 100 when it is reported as a percentage:

{MAPE}^{T} = \frac{1}{n} \sum_{i = 1}^{n} \frac{| E^{T} |}{A^{T}}

(9)

MSE

An MSE can be derived by using equation (10):

{MSE}^{T} = \frac{1}{n} \sum_{i = 1}^{n} | E^{T} |^{2}

(10)

RMSE

An RMSE can be derived using equation (11):

{RMSE}^{T} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} | E^{T} |^{2}}

(11)

RRSE

RRSE relates the total squared error of the model, that is, the sum of the squared differences between predicted and actual values, to the total squared error of a simple predictor that always forecasts the mean of the actual values, $\bar{A}^{T}$. Equation (12) displays the formula for RRSE:

{RRSE}^{T} = \sqrt{\frac{\sum_{i = 1}^{n} [P^{T} - A^{T}]^{2}}{\sum_{i = 1}^{n} [\bar{A}^{T} - A^{T}]^{2}}}

(12)

CV

The coefficient of variation (CV) is the ratio of the standard deviation of the prediction error to the mean of the actual values, expressed as a percentage, as defined by equation (13):

C V = \frac{{SD}^{i}}{{Dataset}^{Mean}} \times 100

(13)
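Equations (7) to (13) can be computed together as in the following sketch. Two points are assumptions: MAPE is multiplied by 100 to give a percentage, and the standard deviation in the CV is taken over the prediction errors. The predicted and actual values are hypothetical.

```python
import math
import statistics


def regression_metrics(predicted, actual):
    n = len(actual)
    errors = [p - a for p, a in zip(predicted, actual)]        # equation (7)
    mean_actual = statistics.fmean(actual)
    squared_error = sum(e * e for e in errors)
    mse = squared_error / n                                    # equation (10)
    return {
        "MAE": sum(abs(e) for e in errors) / n,                # equation (8)
        "MAPE": 100 * sum(abs(e) / abs(a)                      # equation (9), in %
                          for e, a in zip(errors, actual)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),                                # equation (11)
        "RRSE": math.sqrt(squared_error                        # equation (12)
                          / sum((a - mean_actual) ** 2 for a in actual)),
        "CV": 100 * statistics.pstdev(errors) / mean_actual,   # equation (13)
    }


predicted = [2.0, 4.0, 6.0, 8.0]
actual = [1.0, 5.0, 5.0, 9.0]
metrics = regression_metrics(predicted, actual)
```

MAPE is undefined when an actual value is zero and RRSE is undefined when all actual values are identical, so both cases need to be excluded or handled separately on real data.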

Precision

A precision can be calculated by equation (14):

Precision = \frac{T P}{[T P + F P]}

(14)

Accuracy

Accuracy can be determined by using equation (15):

Accuracy = \frac{[T P + T N]}{[T P + T N + F P + F N]}

(15)

Recall

Recall can be calculated by using equation (16):

Recall = \frac{T P}{[T P + F N]}

(16)

F-Measure

F-Measure can be calculated by equation (17):

F - Measure = \frac{2 \times [Precision \times Recall]}{[Precision + Recall]}

(17)
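Equations (14) to (17) can be sketched from the four confusion-matrix counts as follows. The counts used in the example are hypothetical.

```python
def f_measure(precision, recall):
    """Equation (17): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp)                    # equation (14)
    accuracy = (tp + tn) / (tp + tn + fp + fn)    # equation (15)
    recall = tp / (tp + fn)                       # equation (16)
    return {
        "precision": precision,
        "accuracy": accuracy,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


# Hypothetical confusion-matrix counts.
scores = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

As a check against the reported values, `f_measure(0.97, 0.95)` rounds to 0.96, which matches the F-measure listed for the proposed method in Table 7.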

Simulation and experimental results

The proposed and existing models are implemented using the Python programming language. It is recommended to have a system that includes a minimum of an Intel Core i7 / AMD Ryzen 7 CPU, an NVIDIA GPU (such as GTX 1080 Ti or a higher model) with CUDA support, 16GB of RAM (32GB is recommended) and a 512GB SSD. The software configuration comprises Python 3.8 or a higher version, along with essential libraries such as NumPy, Pandas, Matplotlib, Seaborn and Scikit-learn installed using Anaconda. For the development of models, it is necessary to install deep learning frameworks such as TensorFlow and PyTorch (Lilhore et al., 2022a, Simaiya et al., 2021). Table 2 presents the proposed hybrid model's key parameters used for CNN and BiLSTM.

Table 2.

Key parameters used for CNN and BiLSTM in the proposed hybrid model.

Module	Parameter	Value/Description
CNN	Number of Layers	2–3
	Number of Filters	32, 64, 128 per layer
	Filter Size	(3, 3) or (5, 5)
	Stride	1 or 2
	Padding	‘same’ or ‘valid’
	Activation Function	ReLU
	Pooling Layers	Max pooling (e.g. (2, 2))
	Dropout Rate	0.25–0.5
BiLSTM	Number of Layers	1–2
	Number of Units	50, 100, 200 per layer
	Dropout Rate	0.2–0.5
	Recurrent Dropout	0.2–0.5
	Activation Function	tanh (default for LSTM)
Attention	Attention Mechanism	Two-way (bidirectional) attention
	Input Dimension	Sequence length and feature dimension input sizes
Integration	Attention Layer	Applied between CNN and BiLSTM layers
	Output Layer	Dense layer with sigmoid or SoftMax activation

The dataset was divided into training (80%), testing (10%) and validation (10%) sets. The following results were calculated.

Testing and validation results for common data length

The proposed model is trained on the training set, that is, 80% of the dataset. After training, we calculated different error metrics on the testing set (10%) and the validation set (10%) with a common data length of 16 bits. Figures 4 and 5 present the simulation results for the testing and validation phases for the proposed and existing models.

Figure 4.

Error metrics results for existing and proposed methods during testing.

Figure 5.

Error metrics results for existing and proposed methods during validation.

Table 3 presents a comparative analysis of Error Metrics results for Existing and Proposed Methods during Testing and Validation. The results in Table 3 demonstrate that the proposed method for energy consumption prediction significantly outperforms existing methods, including RNN, GRU, LSTM, TST and CNN-LSTM. The proposed method achieves the lowest error rates across all evaluated metrics: MAE (0.098 for testing and 0.108 for validation), MAPE (2.00% for testing and 2.28% for validation), RMSE (0.135 for testing and 0.145 for validation), MSE (0.0182 for testing and 0.0210 for validation), RRSE (0.117 for testing and 0.126 for validation) and CV (8.8%). This indicates that the proposed model provides more accurate and consistent predictions.

Table 3.

Error metrics results for existing and proposed methods during testing and validation.

Method	MAE (Testing)	MAE (Validation)	MAPE (Testing)	MAPE (Validation)	RMSE (Testing)	RMSE (Validation)	MSE (Testing)	MSE (Validation)	RRSE (Testing)	RRSE (Validation)	CV (%)
RNN	0.123	0.130	2.34	2.56	0.157	0.165	0.0246	0.0272	0.134	0.142	1.2
GRU	0.115	0.125	2.21	2.45	0.150	0.160	0.0225	0.0256	0.129	0.138	9.8
LSTM	0.110	0.120	2.15	2.40	0.145	0.155	0.0210	0.0240	0.125	0.134	9.5
TST	0.105	0.115	2.10	2.35	0.140	0.150	0.0196	0.0225	0.121	0.130	9.2
CNN-LSTM	0.108	0.118	2.12	2.38	0.142	0.152	0.0201	0.0231	0.123	0.132	9.3
Proposed Method	0.098	0.108	2.00	2.28	0.135	0.145	0.0182	0.0210	0.117	0.126	8.8

The superior performance is largely due to the two-way attention mechanism, which effectively focuses on relevant parts of the input sequence, and the hybrid architecture combining CNN for spatial feature extraction and BiLSTM for capturing temporal dependencies. Furthermore, the MPSO fine-tunes the model parameters, ensuring optimal performance. These elements collectively enhance the model's ability to capture complex patterns in the data, resulting in better prediction accuracy and reliability than existing methods, even with a common data length of 16 bits.
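As a consistency check on Table 3, the reported RMSE values can be compared with the square roots of the reported MSE values, following equations (10) and (11). The sketch below copies the testing columns of Table 3; the tolerance of 0.0005 is an assumption that corresponds to rounding to three decimals.

```python
import math

# (method, MSE testing, RMSE testing) copied from Table 3.
table3 = [
    ("RNN", 0.0246, 0.157),
    ("GRU", 0.0225, 0.150),
    ("LSTM", 0.0210, 0.145),
    ("TST", 0.0196, 0.140),
    ("CNN-LSTM", 0.0201, 0.142),
    ("Proposed Method", 0.0182, 0.135),
]

# A row is consistent if sqrt(MSE) agrees with RMSE up to rounding.
consistent = {name: abs(math.sqrt(mse) - rmse) <= 0.0005
              for name, mse, rmse in table3}
```

Every row passes this check, so the MSE and RMSE columns for the testing phase agree with each other.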

Results based on hyperparameters

We have also calculated the MSE results using different optimizers, that is, Adam, AdaGrad, SGD and Momentum, for existing and proposed methods in the next simulation analysis. Table 4 and Figure 6 present MSE results for Existing and Proposed optimization methods.

Figure 6.

MSE results for existing and proposed methods for different optimization methods.

The data displayed in Table 4 shows the MSE for different optimization techniques (ADAM, AdaGrad, SGD, Momentum) applied to various models, namely RNN, GRU, LSTM, TST, CNN-LSTM and the proposed method. The proposed method consistently attains the lowest MSE under every optimization technique, showcasing its superior performance in predicting energy consumption. With the ADAM optimizer, the proposed method achieves an MSE of 0.0089, the lowest among all the methods. With AdaGrad, it achieves a significantly lower MSE of 0.00108, in contrast to the other methods, whose MSE values range from 0.0118 to 0.0168. With SGD, it achieves an MSE of 0.00138, against 0.014 to 0.0189 for the other methods. Finally, with Momentum, the MSE of the proposed method is 0.0098, again lower than the results of the other methods.

Table 4.

MSE results for existing and proposed methods for different optimization methods.

Method	ADAM	AdaGrad	SGD	Momentum
RNN	0.0112	0.0149	0.0178	0.0138
GRU	0.0111	0.0139	0.0168	0.0129
LSTM	0.0109	0.0129	0.0158	0.0189
TST	0.0138	0.0168	0.0189	0.0147
CNN-LSTM	0.0099	0.0118	0.014	0.0107
Proposed Method	0.0089	0.00108	0.00138	0.0098

Results based on different learning rates

We have also calculated the MSE results using different learning rates, that is, 0.01, 0.001, 0.0001 and 0.00001, for existing and proposed methods in the next simulation analysis. Table 5 and Figure 7 present MSE results for Existing and Proposed Methods for different learning rates.

Figure 7.

MSE results for existing and proposed methods for different learning rates.

Table 5 displays the MSE outcomes for different models, such as RNN, GRU, LSTM, TST, CNN-LSTM and the suggested approach. The models were assessed at different learning rates (0.01, 0.001, 0.0001 and 0.00001). The table displays the performance of each model at different learning rates, emphasizing the superiority of the proposed method compared to existing methods.

Table 5.

MSE results for existing and proposed methods for different learning rates.

Method	0.01	0.001	0.0001	0.00001
RNN	0.0123	0.0115	0.0109	0.0105
GRU	0.0098	0.0092	0.0087	0.0084
LSTM	0.0112	0.0103	0.0098	0.0093
TST	0.0105	0.0098	0.0093	0.0088
CNN-LSTM	0.0101	0.0097	0.0092	0.0087
Proposed Method	0.0096	0.0091	0.0088	0.0083

With a learning rate of 0.01, the proposed method generates the lowest MSE of 0.0096 among all approaches, with the GRU model close behind at 0.0098. When the learning rate is reduced to 0.001, the proposed approach again outperforms every other model with an MSE of 0.0091. At a learning rate of 0.0001, the proposed method reaches an MSE of 0.0088, which is comparable to GRU (0.0087) and lower than the remaining methods, and at 0.00001 it attains the lowest MSE of 0.0083. Across the learning rates used, the proposed method is thus either the best model or within 0.0001 of the best, which indicates its high efficacy in minimizing prediction errors.

Results based on different batch sizes

We have also calculated the MSE results using different batch sizes, that is, 32, 64, 128 and 256, for the existing and proposed methods in the next simulation analysis. Table 6 and Figure 8 present MSE results for Existing and Proposed Methods for different batch sizes.

Figure 8.

MSE results for existing and proposed methods for different batch sizes.

Table 6.

MSE results for existing and proposed methods for different batch sizes.

Method	32	64	128	256
RNN	0.0145	0.0138	0.0125	0.0118
GRU	0.011	0.0103	0.0097	0.0092
LSTM	0.0125	0.0118	0.011	0.010
TST	0.0108	0.0102	0.0098	0.096
CNN-LSTM	0.011	0.0105	0.01	0.0095
Proposed Method	0.0102	0.0098	0.0092	0.0082

Results for energy consumption prediction

The existing and proposed hybrid models are also evaluated on energy consumption prediction, which has an important role in Smart Grid Stability. Figure 9 and Table 7 present the results for Existing and Proposed Methods for energy consumption prediction.

Figure 9.

MSE results for existing and proposed methods for energy consumption prediction.

Table 7.

Precision, accuracy, recall and F-measure results for existing and proposed methods for energy consumption prediction.

Method	Precision	Accuracy	Recall	F-measure
RNN	0.85	0.83	0.82	0.83
GRU	0.87	0.85	0.84	0.85
LSTM	0.89	0.87	0.86	0.87
TST	0.901	0.89	0.87	0.88
CNN-LSTM	0.91	0.89	0.88	0.89
Proposed Method	0.97	0.96	0.95	0.96

Table 7 presents the precision, accuracy, recall and F-measure results for various methods, such as RNN, GRU, LSTM, TST, CNN-LSTM and the proposed method, in predicting energy consumption. The proposed approach consistently performs better than conventional techniques in all metrics, with precision, accuracy, recall and F-measure values of 0.97, 0.96, 0.95 and 0.96, respectively.

The results demonstrate that the proposed method attains superior precision in identifying positive instances (accurately predicted energy consumption patterns), higher accuracy in overall predictions and improved recall in capturing all positive instances. This performance can be credited to the design of the proposed method, which combines the two-way attention mechanism, the hybrid CNN-BiLSTM architecture and MPSO-based parameter tuning.

Greater precision implies fewer false positives, meaning fewer incorrect predictions of energy consumption patterns. This is vital for energy management systems to prevent unnecessary allocation of resources. The increased accuracy of the proposed method implies a reduced number of prediction errors, resulting in more dependable energy consumption forecasts. The proposed method's superior performance in all metrics proves its effectiveness and efficiency in predicting energy consumption compared to conventional techniques, making it an attractive strategy for real-time energy management solutions.

Ablation analysis

Ablation analysis is conducted to assess the influence of different elements of the proposed hybrid model on its efficacy. The present research examines the proposed hybrid model by individually eliminating and altering features.

Table 8 presents the results of an ablation analysis for existing and proposed methods. The ablation analysis shows how important each part is in the proposed Smart Grid Stability Forecast Model. The Two-way Attention Mechanism, Hybrid Deep Learning (CNN and BiLSTM) and MPSO optimization are important for high precision, accuracy, recall and F1-score. Putting these parts together makes a complete model that can accurately predict Stability in smart grid systems.

Table 8.

Ablation analysis results for existing and proposed methods.

Model components removed	Precision (%)	Accuracy (%)	Recall (%)	F1-score (%)
Full Model	96.5	96.0	95.5	95.9
Without Two-way Attention	93.8	93.2	92.7	93.0
Without Hybrid Deep Learning	88.5	87.8	87.2	87.5
Without MPSO	91.2	90.7	90.2	90.5

The ablation analysis shows that the full hybrid model achieves the best performance. Removing the Two-way Attention mechanism lowers accuracy from 96.0% to 93.2%, which shows how important this feature is for attending to the crucial components of the input pattern. When the Hybrid Deep Learning parts (CNN and BiLSTM) are taken away, performance decreases the most, with accuracy falling to 87.8%. This shows how important it is to combine spatial feature extraction with temporal dependency modeling. Removing MPSO optimization also lowers performance, to an accuracy of 90.7%, showing how useful it is for fine-tuning the model's parameters for better prediction accuracy. Each part is important; putting them together ensures the model can make better predictions.

Conclusion and future directions

This research examined different load forecasting methods in smart grids. We then created a model that combines a hybrid deep learning model (CNN with Bi-LSTM) with a two-way attention method and an MPSO. This proposed hybrid model is specifically designed for short-term load prediction in smart grids. The primary benefit of this model is improved prediction precision. We compared it with existing deep learning models, including RNN, GRU, LSTM, TST and CNN-LSTM, using the publicly available online Electrical Grid Stability Simulated Dataset to measure the performance of the proposed and existing models. We have examined the simulation results under various settings, that is, different batch sizes, learning rates and optimizers, and for energy consumption prediction.

The proposed Smart Grid Stability Prediction Model, which combines a Two-way Attention Hybrid Deep Learning (CNN and BiLSTM) approach with MPSO, has shown substantial enhancements in the accuracy of energy consumption prediction compared to current methods. The model's capacity to accurately capture spatial characteristics and temporal relationships, in conjunction with the optimization capabilities of MPSO, has led to strong performance across a range of metrics. The proposed model has demonstrated significant potential in improving the efficiency and effectiveness of energy management systems. Its strong performance and ability to handle various batch sizes, learning rates and optimization methods make it highly valuable for predicting energy consumption in smart grid environments.

In the future, the model can be enhanced by integrating additional data sources, including weather trends, consumer behaviour data and real-time grid statistics. In addition, investigating sophisticated deep learning architectures and optimization techniques may result in more precise and efficient models for predicting energy consumption. Integrating emerging technologies such as the IoT and Blockchain could further improve the model's real-time ability to manage and optimize energy.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Surjeet Dalal

Magdalena Radulescu

References

Abou Houran

Bukhari

SMS

Hamza Zafar

, et al. (2023) COA-CNN-LSTM: coati optimization algorithm-based hybrid deep learning model for PV/wind power forecasting in smart grid applications. Applied Energy 349: 121638.

AlHaddad

Basuhail

Khemakhem

, et al. (2023) Ensemble model based on hybrid deep learning for intrusion detection in smart grid networks. Sensors 23(17): 7464.

Bisen

Kumar Lilhore

Manoharan

, et al. (2023) A hybrid deep learning model using CNN and k-mean clustering for energy efficient modelling in mobile EdgeIoT. Electronics 12(6): 1384.

Buratto

Muniz

Nied

, et al. (2024) Seq2Seq-LSTM with attention for electricity load forecasting in Brazil. IEEE Access 12: 30020–30029.

Chung

Yoo

(2022) District heater load forecasting based on machine learning and parallel CNN-LSTM attention. Energy 246: 123350.

Lin

, et al. (2023) Pre-attention mechanism and convolutional neural network based multivariate load prediction for demand response. Energies 16(8): 3446.

Huang

Chen

, et al. (2022) A hybrid residential short-term load forecasting method using attention mechanism and deep learning. Buildings 13(1): 72.

Jiang

Wang

, et al. (2021) Hybrid multitask multi-information fusion deep learning for household short-term load forecasting. IEEE Transactions on Smart Grid 12(6): 5362–5372.

Khan

Ullah

FUM

, et al. (2023) AI-assisted hybrid appr approach for energy management in IoT-based smart microgrid. IEEE Internet of Things Journal 10: 18861–18875.

10.

Kim

(2023) A novel deep learning-based forecasting model optimized by heuristic algorithm for energy management of microgrid. Applied Energy 332: 120525.

11.

Liao

Ruan

Xie

, et al. (2023) Deep learning time pattern attention mechanism-based short-term load forecasting method. Frontiers in Energy Research 11: 1227979.

12.

Lilhore

Imoize

C-T

, et al. (2022a) Design and implementation of an ML and IoT based adaptive traffic-management system for smart cities. Sensors 22(8): 2908.

13.

Lilhore

Khalaf

Simaiya

, et al. (2022b) A depth-controlled and energy-efficient routing protocol for underwater wireless sensor networks. International Journal of Distributed Sensor Networks 18(9): 15501329221117118.

14.

Mei

(2022) A hybrid attention-based deep learning approach for wind power prediction. Applied Energy 323: 119608.

15.

Mbey

Foba Kakeu

Teplaira Boum

, et al. (2024) Solar photovoltaic generation and electrical demand forecasting using multi-objective deep learning model for smart grid systems. Cogent Engineering 11(1): 2340302.

16.

Miraftabzadeh

Longo

(2023) High-resolution PV power prediction model based on the deep learning and attention mechanism. Sustainable Energy, Grids and Networks 34: 101025.

17.

Oqaibi

Bedi

(2024) A data decomposition and attention mechanism-based hybrid approach for electricity load forecasting. Complex & Intelligent Systems 24: 1–16.

18.

Ozcan

Catal

Kasif

(2021) Stability forecasting using a dual-stage attention-based recurrent neural network. Sensors 21(21): 7115.

19.

Safari

Kharrati

Rahimi

(2023) Multi-term electrical load forecasting of smart cities using a new hybrid highly accurate neural network-based predictive model. Smart Grids and Sustainable Energy 9(1): 8.

20.

Shahinzadeh

Sadrarhami

Hayati

, et al. (2024) Review and comparative analysis of deep learning techniques for smart grid load forecasting. In: 2024 20th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP), pp.1–9: IEEE.

21. Simaiya, Gautam, Kumar Lilhore, et al. (2021) EEPSA: energy efficiency priority scheduling algorithm for cloud computing. In: 2021 2nd international conference on smart electronics and communication (ICOSEC), pp.1064–1069: IEEE.

22. Jeon, et al. (2023) BiGTA-Net: a hybrid deep learning-based electrical energy forecasting model for building energy management systems. Systems 11(9): 456.

23. Syed, Abu-Rub, Ghrayeb, et al. (2021) Household-level energy forecasting in smart buildings using a novel hybrid deep learning model. IEEE Access 9: 33498–33511.

24. Tang, Chen, Xiang, et al. (2022) Short-term load forecasting using channel and temporal attention based temporal convolutional network. Electric Power Systems Research 205: 107761.

25. Tursunboev, Palakonda, Kang J-M (2024) Multi-objective evolutionary hybrid deep learning for energy theft detection. Applied Energy 363: 122847.

26. Ünal, Almalaq, Ekici (2021) A novel load forecasting approach based on smart meter data using advance preprocessing and hybrid deep learning. Applied Sciences 11(6): 2742.

27. Wan, Chang, Khalil A-B, et al. (2023) Short-term power load forecasting for combined heat and power using CNN-LSTM enhanced by attention mechanism. Energy 282: 128274.

28. Wang (2023) Enhancing energy efficiency with smart grid technology: a fusion of TCN, BiGRU, and attention mechanism. Frontiers in Energy Research 11: 1283026.

29. Liang, et al. (2024) A framework for electricity load forecasting based on attention mechanism time series depthwise separable convolutional neural network. Energy 299: 131258.

30. Liu, Chen, et al. (2023) A deep LSTM-CNN based on self-attention mechanism with input data reduction for short-term load forecasting. IET Generation, Transmission & Distribution 17(7): 1538–1552.

Smart grid stability prediction model using two-way attention based hybrid deep learning and MPSO
