Abstract
Accurate forecasting of electricity imbalances is critical for maintaining power system stability and improving market efficiency, particularly in systems with high renewable energy penetration. This study proposes an advanced short-term forecasting methodology for the Ukrainian integrated power system using optimized long short-term memory (LSTM) neural networks. The novelty of the approach lies in the integration of automatic hyperparameter tuning and ensemble learning within LSTM architectures, specifically tailored to handle the high non-stationarity and extreme variability in real-world Ukrainian imbalance data. This hybrid structure enables robust performance under volatile market conditions. A comprehensive statistical analysis confirms significant skewness and volatility in hourly imbalance data from March 2022 to September 2024, obtained from NPC “Ukrenergo,” with variation coefficients of 100.14% and 77.37% for positive and negative imbalances, respectively. The proposed LSTM model with a 24-h input window achieved the best standalone accuracy, reducing mean absolute percentage error (MAPE) to 47.49% for positive and 35.96% for negative imbalances. Ensemble configurations (e.g. 4 + 24, 12 + 24) further improved stability, with correlation coefficients R(f, p) reaching 91.9% for positive and 77.83% for negative forecasts. In contrast, benchmark auto-regressive integrated moving average and seasonal auto-regressive integrated moving average models yielded significantly higher errors, with MAPE exceeding 92% and root mean square error values up to 522.15 MW·h. Statistical tests, including the Diebold–Mariano and Durbin–Watson tests, validated the superior accuracy and residual independence of the LSTM ensembles. 
The proposed framework demonstrates substantial improvements in both accuracy and forecast stability, making it a scalable solution for real-time imbalance management in power systems undergoing structural transitions and renewable energy integration. This approach offers practical value for transmission system operators and balancing market participants seeking to enhance predictive performance and reduce imbalance-related penalties.
Introduction
The integrated power system (IPS) of Ukraine currently operates in synchronous mode with the European Network of Transmission System Operators for Electricity (ENTSO-E) (Kyrylenko et al., 2022), aligning its operational standards with those of the Continental European power grid. This synchronization has introduced new technical and economic challenges, particularly in the context of the country's liberalized electricity market (Law of Ukraine, 2017; NEURC, 2018). One of the most pressing challenges is the accurate short-term forecasting of electricity imbalances, which has become increasingly crucial with the implementation of the balancing market (BM) (Blinov et al., 2017). The BM plays a pivotal role in maintaining real-time equilibrium between electricity supply and demand, and its efficiency hinges on precise forecasting methodologies. Moreover, Ukraine's potential integration with European electricity markets further amplifies the need for robust forecasting mechanisms, as cross-border electricity trading introduces additional complexity in imbalance management.
Improving imbalance prediction accuracy is more than a technical necessity (Wang et al., 2024); it is a fundamental requirement for ensuring the stability, reliability, and cost-effectiveness of the electricity market (Sychova, 2022). The economic viability of market participants, including grid operators, power producers, and consumers, is directly influenced by forecasting accuracy, as it determines their ability to anticipate fluctuations, optimize resource allocation, and minimize financial risks associated with imbalance settlements.
This study contributes to the field by proposing an advanced short-term electricity imbalance forecasting approach based on deep learning models. Unlike traditional statistical models such as auto-regressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) (Blinov, 2017), this research introduces an optimized long short-term memory (LSTM) model that incorporates automatic hyperparameter tuning and ensemble learning to improve forecast stability and accuracy. Additionally, a comparative analysis is conducted to evaluate the advantages of deep learning models over statistical approaches in handling high-variability, non-stationary electricity market data.
The novelty of this work lies in its practical application to the Ukrainian BM, where operational challenges due to forced emergency outages require more adaptive and robust forecasting methodologies. This research builds upon prior studies in electricity imbalance forecasting (Blinov, 2023), addressing the specific challenges faced by a power system undergoing significant transformation.
This study leverages real-world hourly electricity imbalance data from March 2022 to September 2024, which was obtained from “Ukrenergo.” The methodology consists of: data preprocessing; model training, where LSTM networks are optimized using automatic hyperparameter selection; performance evaluation, utilizing metrics such as mean absolute percentage error (MAPE), root mean square error (RMSE), and correlation coefficient R(f, p); comparative analysis, where results from LSTM are benchmarked against ARIMA and SARIMA models (Blinov et al., 2017); and ensemble modeling, designed to improve forecast stability and accuracy.
This research aims to investigate the effectiveness of LSTM-based models in electricity imbalance forecasting compared to traditional statistical approaches such as ARIMA and SARIMA. Additionally, it examines the impact of ensemble learning on forecasting accuracy and stability, assessing whether combining multiple models can enhance predictive performance. Furthermore, the study analyzes the role of hyperparameter optimization in improving the accuracy and robustness of LSTM networks, ensuring their adaptability to high-variability electricity market conditions.
Literature review
Extensive research has been conducted on the development of models for forecasting electricity imbalances, with numerous studies focusing on improving predictive accuracy and system stability. One such study by Goodarzi et al. (2019) investigates the impact of forecasting errors in wind and solar power generation on imbalance volumes in the German electricity market. The authors employ regression-based techniques, including least squares regression, quantile regression, and auto-regressive models, to analyze the correlation between forecasting inaccuracies and imbalance fluctuations. Their findings demonstrate that errors in wind power generation forecasts significantly contribute to the increase in imbalances. This highlights the need for more robust forecasting models to improve grid management strategies in systems with a high share of renewable energy sources (RESs).
Similarly, Garcia et al. (2006) propose an innovative methodology that integrates classical statistical forecasting approaches with data mining techniques to predict electricity imbalances in England and Wales. Their approach leverages the strengths of both traditional time-series models and advanced data-driven algorithms to improve forecasting performance. The results indicate that combining historical imbalance patterns with machine learning techniques can significantly enhance forecasting precision, making their method a promising tool for energy market operators seeking to mitigate imbalance-related risks and optimize resource allocation.
Recent advancements in renewable energy forecasting have demonstrated the critical role of machine learning and hybrid deep learning models in addressing the challenges of non-stationary time series and high variability in power systems. Guermoui et al. (2024) conducted an extensive analysis of multi-scale fusion techniques to improve photovoltaic power forecasting, showing that combining data across different temporal and spatial resolutions can significantly enhance prediction accuracy in environments with pronounced fluctuations. Mfetoum et al. (2024) developed a multi-layer perceptron neural network incorporating meteorological insights to optimize solar irradiance forecasting in Central Africa, demonstrating that integrating exogenous weather variables with machine learning models improves reliability under rapidly changing conditions. Singh et al. (2024) applied machine learning-based forecasting and energy management strategies in grid-connected microgrids with multiple distributed energy sources, revealing that advanced predictive approaches are effective for enhancing operational stability and managing uncertainty in renewable-dominated systems. Mouloud et al. (2024) assessed hybrid bidirectional deep learning configurations, including long short-term memory (LSTM) architectures, confirming their suitability for capturing non-linear temporal dependencies in short-term global horizontal irradiance prediction. Molu et al. (2024) implemented a hybrid deep learning approach enhanced by Bayesian optimization to improve forecast accuracy, highlighting the value of automated hyperparameter tuning for achieving consistent performance in volatile renewable generation scenarios. Louiza et al. (2024) demonstrated that combining convolutional neural networks with bidirectional gated recurrent unit architectures enables seasonal forecasting of global horizontal irradiance by effectively modeling both spatial and temporal dynamics, supporting more robust predictions in grids with substantial renewable integration. Khelifi et al. (2023) proposed a hybrid forecasting strategy that integrates time-varying filtering, empirical mode decomposition, and extreme learning machines to address non-stationarity in photovoltaic power time series, illustrating that decomposing complex signals into more predictable components can further improve forecast precision when dealing with abrupt variations and uncertainty.
In their article, Bâra and Oprea (2024) present a method for forecasting the volumes of imbalances in the electricity market of Romania using a combined approach. First, machine learning algorithms (eXtreme Gradient Boosting (XGB), Light Gradient Boosting (LGBM), random forest (RF), multilayer perceptron (MLP)) are used to classify the sign of the imbalance (positive or negative), and then an LSTM neural network model forecasts the total imbalance volume. The essence of this model is to integrate the preceding classification result into the forecast of the imbalance volume, which helps increase the accuracy and reliability of forecasts in the complex dynamics of the energy market.
In his study, Narajewski (2022) compares different forecasting models on data from the energy system of Germany. However, none of the considered models exceeded the accuracy of the naive forecasting model, which indicates the limitations of the existing approaches and the need for their improvement.
In their work, di Persio et al. (2017) analyzed the application of exponential smoothing models and auto-regressive models, namely auto-regressive moving average (ARMA), auto-regressive integrated moving average (ARIMA), auto-regressive integrated moving average with exogenous inputs (ARIMAX), and generalized auto-regressive conditional heteroskedasticity (GARCH), for forecasting imbalances 1 day ahead. The study focuses on determining the sign of the imbalance as an important component for stable forecasting.
Lisi and Edoli (2018) reviewed the specifications of models for forecasting imbalances in Italy, with a focus on macro-areas, using parametric and semi-parametric models.
Salem et al. (2019) devote their work to intra-hourly forecasting of imbalances using the quantile regression forest (QRF) model. This model not only improved the forecasting accuracy, but also provided the possibility to assess the reliability of forecasts through confidence intervals.
In their study, Kratochvíl and Bejbl (2015) presented a model for forecasting the intra-hourly trend of imbalances in the electricity market of the Czech Republic with an accuracy of 81.8%.
In their work, Toubeau et al. (2021) developed a global model for forecasting imbalances in the Belgian energy system at a 15-min resolution. The accuracy metrics used include the Winkler score and the continuous ranked probability score (CRPS). The proposed model demonstrated the best results compared to naive models and the popular ARIMA and QRF algorithms.
Also for Belgium, Urdiales (2023) combines linear and non-linear machine learning modeling methods into a comprehensive methodology for forecasting electricity imbalances.
In their work, Carnevale et al. (2024) present an improved artificial intelligence methodology for forecasting the signs of energy imbalances in the day-ahead electricity market.
In their work, Balázs et al. (2024) propose a multi-step version of the distributed lag auto-regressive model for short-term forecasting of system imbalances. The model is based on the assumption that the system imbalance is correlated with the measured values of system variables, as well as with forecasts of exogenous variables.
Overall, the achievements of the scientific community demonstrate significant progress in the development of models for forecasting electricity imbalances. Most studies apply machine learning and regression models to improve the accuracy of forecasts. However, the generality of the developed models is limited, since they are tailored to the data structure of a specific energy system.
One of the promising forecasting methods is artificial neural networks (ANNs), which allow modeling complex dependencies between data, analyzing large volumes of information and revealing hidden regularities. This mathematical apparatus is widely used for forecasting problems in the electric power industry.
For example, LSTM networks have been used for short-term forecasting of electric load (Bouktif et al., 2018; Kong et al., 2019), for forecasting electricity prices (Cantillo-Luna et al., 2023; Gülmez, 2023; Tschora et al., 2022), and for forecasting wind and solar power generation (Campos et al., 2024; Cui et al., 2023; Khan et al., 2024).
Makri et al. (2021) propose a model based on a deep neural network for forecasting imbalances in the UK market. Their results demonstrate superiority over those obtained using the ARIMA model.
Demir (2008) presents an extensive comparative analysis of different types of neural networks, namely back propagation networks, recurrent ANNs, and radial basis ANNs, for solving the problem of short-term forecasting of electricity imbalances.
In their work, Plakas et al. (2023) proposed a few models for forecasting imbalances in the energy system of Greece. The authors use ensemble machine learning methods such as RF, LSTM, and the linear regression method. The highest accuracy of forecasts was achieved using the RF method.
The accurate forecasting of wind and solar energy generation plays a crucial role in reducing electricity imbalances and improving the efficiency of BM operations. Since RESs exhibit high variability due to meteorological conditions, their integration into the power grid introduces significant uncertainty in supply–demand equilibrium. Developing advanced forecasting techniques for RES generation can mitigate these challenges by enhancing the predictability of energy flows and reducing imbalance settlements. Recent studies have demonstrated the effectiveness of machine learning and hybrid models in improving wind and solar energy forecasting accuracy (Yang et al., 2024). Additionally, optimization-based forecasting approaches have been proposed to enhance the reliability of power system operations under high renewable penetration (Li et al., 2023). Moreover, ensemble forecasting techniques have been shown to provide greater robustness in predicting renewable energy output, thereby supporting grid stability and minimizing market volatility (Yang et al., 2024). By improving RES generation forecasting, power system operators and market participants can better anticipate potential electricity imbalances, optimize reserve capacity planning, and reduce the financial risks associated with imbalance settlements.
Recent studies have advanced probabilistic approaches to improve forecasting reliability in renewable energy systems. For example, Li et al. (2025) developed a spatio-temporal probabilistic forecasting framework tailored for regional wind power prediction, demonstrating enhanced accuracy through spatial correlation modeling. This is particularly relevant for systems with high renewable penetration, where spatial variability can significantly affect imbalance dynamics. Furthermore, stacking-based deep learning architectures have shown promise in multi-step forecasting tasks. Takara et al. (2024) proposed an integrated approach combining advanced deep neural networks with probabilistic stacking techniques to improve forecast accuracy and uncertainty quantification. Their method illustrates the benefits of ensemble learning for capturing complex temporal patterns in renewable energy outputs, which closely aligns with the ensemble-based LSTM modeling approach presented in this study.
Despite significant advancements in electricity imbalance forecasting, existing models still exhibit key limitations, particularly in their adaptability to rapidly changing market conditions and high system variability. Most conventional statistical approaches, such as ARIMA and SARIMA, struggle to capture the complex, non-linear dependencies in imbalance fluctuations, leading to reduced accuracy under volatile conditions. While deep learning models, including LSTM networks, have demonstrated superior predictive capabilities, many existing studies primarily focus on single-model architectures without incorporating ensemble learning or automated hyperparameter tuning. Additionally, a considerable gap remains in the ability of forecasting frameworks to dynamically adjust to real-time grid disturbances, extreme fluctuations in renewable energy generation, and structural changes in power market regulations.
The Ukrainian power system is currently undergoing significant transformations, including synchronization with ENTSO-E and increasing integration of RESs, which amplifies the challenges of imbalance forecasting. These changes introduce higher uncertainty and increased frequency of supply–demand mismatches, requiring a forecasting approach that can effectively handle non-stationarity, extreme deviations, and evolving system dynamics. To address this gap, there is a need for a more adaptable, computationally efficient model that integrates ensemble learning, automated hyperparameter optimization, and self-adjusting mechanisms to enhance both short-term accuracy and long-term stability. Such a model would enable more resilient and proactive imbalance management, improving market efficiency and ensuring grid reliability under uncertain operating conditions.
Table 1 summarizes key studies on electricity imbalance forecasting, highlighting their methodologies, findings, and limitations. While machine learning and hybrid models have demonstrated improved predictive accuracy, challenges remain in generalizing these models across different energy systems. The increasing share of RESs in power grids further necessitates advanced forecasting techniques to mitigate imbalance risks.
Summary of the literature review.
RES: renewable energy source; LSTM: long short-term memory; ARMA: auto-regressive moving average; ARIMA: auto-regressive integrated moving average; ANN: artificial neural network; RF: random forest; RNN: Recurrent Neural Network.
This study aims to develop and compare predictive models for short-term forecasting of total electricity imbalances, using the IPS of Ukraine as a case study. To achieve this objective, the following key contributions have been made:
Comprehensive statistical analysis of positive and negative electricity imbalances was conducted to assess data distribution, variability, and stationarity. Based on these insights, auto-regressive models (ARIMA, SARIMA) and LSTM-based ANNs were identified as suitable forecasting approaches.
Model development and enhancement were undertaken, incorporating hyperparameter optimization and ensemble learning to improve the adaptability and accuracy of LSTM models.
Extensive performance evaluation and comparative analysis of the proposed models with conventional statistical approaches were conducted using key forecasting metrics, including MAPE, RMSE, and correlation coefficient R(f, p).
The findings of this study provide quantitative insights into the effectiveness of different forecasting techniques and demonstrate the superiority of LSTM-based models in handling the high volatility and non-stationary nature of electricity imbalances. By improving forecasting accuracy, this research contributes to the optimization of power system operations, enhanced market efficiency, and increased grid stability within the IPS of Ukraine.
The article is structured as follows. The “Introduction” section provides an overview of the IPS of Ukraine, its synchronization with ENTSO-E, and the critical role of electricity imbalance forecasting, followed by a detailed literature review identifying research gaps. The “Proposed method” section outlines the Ukrainian electricity market structure, the BM's significance, and the statistical analysis of imbalance data, leading to the selection and development of forecasting models, including ARIMA, SARIMA, and optimized LSTM with hyperparameter tuning and ensemble learning. The “Results” section presents the model evaluation, comparing forecasting accuracy using key metrics such as MAPE, RMSE, and correlation coefficients. The “Discussion” section interprets the findings, highlighting the advantages of LSTM-based models over traditional statistical methods and addressing the implications for grid stability and energy market operations. The “Conclusion” section summarizes key contributions, emphasizing the superiority of LSTM for forecasting electricity imbalances in the IPS of Ukraine and suggesting future research directions for further enhancing forecasting accuracy and model adaptability.
Proposed method
General information on the electricity market of Ukraine
Until 2019, the Ukrainian electricity market operated in a monopolistic system with centralized pricing, which led to inefficiencies. A new market model was introduced to increase transparency and competition. Its fundamental feature is the BM, which operates in near real time. The BM ensures a balance between electricity demand and supply, while eliminating financial and physical imbalances. Its main functions include buying and selling electricity both to balance daily fluctuations in demand and supply and to settle imbalances between market participants.
The key task of the BM is to manage electricity imbalances covered by balancing service providers. The Settlement Administrator processes data from the Commercial Account Administrator and the Transmission System Operator (TSO) to determine payments, imbalance prices, and cost volumes. Accurate imbalance forecasting is crucial at the nomination stage, where requests for balancing service are submitted. In cases of positive imbalances, excess electricity is sold at the BM price during downward regulation, while negative imbalances require the purchase of additional electricity during upward regulation. These transactions depend on the day-ahead market price and the imbalance market settlement price set by the TSO.
Statistical analysis of electricity imbalance samples
This study uses hourly data samples of positive and negative electricity imbalances of the IPS of Ukraine from March 2022 to September 2024. The data samples do not contain gaps. This subsection provides a detailed analysis of the imbalance samples. The data was obtained from the official website of the National Power Company “Ukrenergo” (NPC Ukrenergo, 2024). Descriptive statistics of these samples are provided in Table 2.
Descriptive statistics of the studied samples.
For both samples, the median is much lower than the average (A), indicating high skewness. A very high maximum value compared to the mean and median indicates the presence of extreme outliers, which may be associated with emergency situations. The variation coefficient (Kv) of 100.14% for the sample of positive imbalances indicates high variability of the data. For the sample of negative imbalances, Kv equals 77.37%, which is considerably lower but still indicates high variability. The auto-correlation coefficients (Ra) of more than 90% for both samples indicate a strong dependence between consecutive values.
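The indicators discussed above can be reproduced with a few lines of NumPy. The following is a minimal sketch, not the authors' code: the function name `describe_imbalance`, the use of the sample standard deviation for Kv, and the choice of lag 1 for Ra are assumptions, and the input values are purely illustrative.

```python
import numpy as np

def describe_imbalance(x):
    """Summary indicators for one imbalance sample (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)                         # sample standard deviation (assumed)
    kv = 100.0 * std / mean                     # variation coefficient Kv, %
    skew = np.mean((x - mean) ** 3) / x.std(ddof=0) ** 3   # moment-based skewness
    ra = 100.0 * np.corrcoef(x[:-1], x[1:])[0, 1]          # lag-1 auto-correlation, % (assumed lag)
    return {"A": mean, "median": float(np.median(x)), "Kv": kv, "skew": skew, "Ra": ra}

# Illustrative values only: one large spike pulls the mean above the median
sample = [120.0, 80.0, 95.0, 60.0, 900.0, 110.0, 70.0, 85.0]
stats = describe_imbalance(sample)
print(stats)
```

A median well below the mean together with positive skewness is the pattern described for both imbalance samples.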
The observed higher variability of positive imbalances compared to negative ones can be attributed to the unpredictability of surplus electricity generation, particularly from RESs. Wind and solar power output is inherently variable due to fluctuating meteorological conditions, leading to periods of overgeneration that are difficult to anticipate accurately. On the other hand, negative imbalances are more constrained by structural demand patterns and reserve capacity planning, resulting in relatively lower fluctuations. Additionally, market regulations and grid stability measures tend to prioritize avoiding supply shortages, which may contribute to more controlled variations in negative imbalances.
For a more thorough analysis of the samples, Figure 1 shows a box plot for both the samples; additionally, Figures 2 and 3 show the moving average and standard deviation graphs.

Boxplot of positive and negative electricity imbalances.

Moving average and standard deviation graphs for positive electricity imbalance samples.

Moving average and standard deviation graphs for negative electricity imbalance samples.
Both distributions have a pronounced right-sided skewness, since most of the data is concentrated closer to zero and the frequency of larger values decreases significantly. The largest proportion of values for both samples is concentrated in the range of 0–2000. This suggests that imbalance values above this range are atypical or anomalous.
The box plots confirm the strong skewness of the data distribution of both samples: the median of each sample lies closer to the lower limit. The whiskers are very short, which means that the main part of the data is located in a very narrow range compared to the full range of values. A large number of outliers above the upper whisker indicates the presence of significant anomalies or non-standard values in the data, which may correspond to extreme situations in the system.
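The whiskers and outliers of a box plot are usually derived from the interquartile range. The sketch below assumes the common 1.5 × IQR convention, which the article does not state explicitly; `boxplot_summary` is a hypothetical helper and the data are illustrative.

```python
import numpy as np

def boxplot_summary(x, k=1.5):
    """Quartiles, whisker ends, and outlier counts under the k * IQR rule (assumed convention)."""
    x = np.asarray(x, dtype=float)
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    return {
        "q1": q1, "median": med, "q3": q3,
        "lower_whisker": x[x >= lo_fence].min(),   # smallest point inside the fences
        "upper_whisker": x[x <= hi_fence].max(),   # largest point inside the fences
        "n_low_outliers": int((x < lo_fence).sum()),
        "n_high_outliers": int((x > hi_fence).sum()),
    }

# Illustrative data: nine ordinary values and one extreme value
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

In this toy case the single extreme value falls above the upper fence, mirroring the outliers above the upper whisker noted for the imbalance samples.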
The moving average (orange line) and standard deviation (green line) show time trends for positive and negative imbalances. There are clear periods of high volatility (spikes in the graph). The rolling mean changes unevenly, particularly for positive imbalances, where sharp jumps are visible. This may indicate instability in energy generation or consumption in certain periods. The standard deviation also has periods of increase, indicating significant fluctuations in the values of imbalances. For negative imbalances, such sharp changes are less pronounced but still present. Based on these graphs, both time series can be hypothesized to be non-stationary, since both the rolling mean and the rolling standard deviation are unstable over time.
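Rolling statistics of this kind can be computed as follows. This is a sketch under assumptions: the article does not state the window length used for the figures, so the one-week window (168 h) and the synthetic series below are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_mean_std(x, window):
    """Rolling mean and sample standard deviation over full windows only."""
    x = np.asarray(x, dtype=float)
    windows = sliding_window_view(x, window)    # shape: (len(x) - window + 1, window)
    return windows.mean(axis=1), windows.std(axis=1, ddof=1)

# Synthetic hourly series whose level and spread both grow over time
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
series = 200 + 0.2 * hours + rng.normal(scale=20 + 0.05 * hours)
roll_mean, roll_std = rolling_mean_std(series, window=168)   # 168 h = one week (assumed)
print(roll_mean[:3], roll_std[:3])
```

A rolling mean or rolling standard deviation that drifts over time, as in this synthetic example, is the visual cue that motivates a formal stationarity test.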
To confirm or refute the hypothesis of non-stationarity of the studied time series, the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test has been performed, the results of which are presented in Table 3.
Results of the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for imbalances samples.
The KPSS test evaluates the null hypothesis that a time series is stationary. If the value of the test statistic exceeds the critical value, the null hypothesis is rejected, which means that the series is non-stationary (it has a trend or a time-varying variance). The obtained values of the KPSS statistic (5.6549 and 7.8626) significantly exceed the critical values at all considered significance levels (1%, 2.5%, 5%, and 10%). The null hypothesis of stationarity is therefore rejected for both series (positive and negative imbalances): both time series are non-stationary.
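For illustration, the level-stationarity variant of the KPSS statistic can be computed directly with NumPy. This is a sketch rather than the authors' implementation: the article does not state which variant or lag truncation was used, so the Bartlett-kernel long-run variance and the rule-of-thumb lag below are assumptions. The critical values are the standard tabulated ones for the level-stationarity case.

```python
import numpy as np

# Standard critical values for the level-stationarity KPSS test
KPSS_LEVEL_CRIT = {"10%": 0.347, "5%": 0.463, "2.5%": 0.574, "1%": 0.739}

def kpss_level_stat(x, lags=None):
    """KPSS statistic for the null of stationarity around a constant level."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if lags is None:
        lags = int(12 * (n / 100.0) ** 0.25)     # rule-of-thumb truncation lag (assumed)
    e = x - x.mean()                             # residuals from a constant-only regression
    partial_sums = np.cumsum(e)
    eta = np.sum(partial_sums ** 2) / n ** 2
    lrv = np.sum(e ** 2) / n                     # long-run variance, Bartlett kernel
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)
        lrv += 2.0 * weight * np.sum(e[k:] * e[:-k]) / n
    return eta / lrv

rng = np.random.default_rng(42)
random_walk = np.cumsum(rng.normal(size=2000))   # non-stationary by construction
white_noise = rng.normal(size=2000)              # stationary by construction
for name, series in [("random walk", random_walk), ("white noise", white_noise)]:
    stat = kpss_level_stat(series)
    print(f"{name}: KPSS = {stat:.3f}, exceeds 1% critical value: {stat > KPSS_LEVEL_CRIT['1%']}")
```

A statistic above the critical value leads to rejecting stationarity, which is the decision rule applied to the values 5.6549 and 7.8626 above.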
Additionally, the periodicity analysis of the studied time series has been performed. For both the samples, the characteristic periods are 24, 12, and 4. As shown in our previous study (Sychova, 2022), the hourly correlation coefficients between electricity imbalances and the volume of renewable generation range from 1% to 28% for positive imbalances and from 1% to 38% for negative imbalances. This indicates a moderate but noticeable relationship between forecasting errors in RES output and resulting imbalance magnitudes.
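The article does not specify how the periodicity analysis was performed. One common option, shown here purely as an assumed sketch, is to read the dominant periods from the periodogram; `dominant_periods` is a hypothetical helper and the series is synthetic.

```python
import numpy as np

def dominant_periods(x, top=3):
    """Periods (in samples) of the `top` strongest periodogram peaks."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the zero-frequency component
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0)        # cycles per sample (per hour for hourly data)
    order = np.argsort(power[1:])[::-1][:top] + 1 # strongest bins, skipping frequency 0
    return [1.0 / freqs[i] for i in order]

# Synthetic hourly series with 24 h, 12 h, and 4 h cycles of decreasing amplitude
t = np.arange(24 * 100)
synthetic = (3.0 * np.sin(2 * np.pi * t / 24)
             + 2.0 * np.sin(2 * np.pi * t / 12)
             + 1.0 * np.sin(2 * np.pi * t / 4))
periods = dominant_periods(synthetic)
print([round(p) for p in periods])
```

On real imbalance data the peaks are broader and noisier, so the result should be cross-checked against the auto-correlation function before fixing window lengths.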
Description of models
The selection of forecasting models in this study is based on their established relevance in electricity imbalance prediction. ARMA, ARIMA, and SARIMA (Blinov et al., 2022; Box and George 2015) are widely used statistical models for time-series forecasting, particularly in structured and stationary datasets. However, these models struggle with capturing long-term dependencies and non-linear patterns, which are common in electricity imbalances. In contrast, LSTM (Hochreiter and Schmidhuber, 1997) networks are well-suited for handling sequential dependencies and high-variability data due to their ability to retain long-term temporal information. Therefore, this study employs ARIMA and SARIMA as baseline statistical models and compares their performance with LSTM, a deep learning model specifically designed to capture complex temporal patterns and non-stationary behavior in energy markets.
To enhance the flexibility and accuracy of the LSTM model for short-term electricity imbalance forecasting, the model has been extended with automatic selection of optimal hyperparameters. Previous studies have shown that it is most appropriate to use a sample of half a year with the following division: 90% for training, 5% for validation, and 5% for testing (220 points).
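The 90/5/5 division can be sketched as below. The split is assumed to be chronological (no shuffling), which is the usual choice for time series but is not stated explicitly in the text. The 4400-point series is hypothetical, chosen so that 5% corresponds to the 220 test points mentioned above.

```python
import numpy as np

def chronological_split(x, train=0.90, val=0.05):
    """Split a series into train/validation/test parts in time order."""
    x = np.asarray(x)
    n_train = round(x.size * train)
    n_val = round(x.size * val)
    return x[:n_train], x[n_train:n_train + n_val], x[n_train + n_val:]

series = np.arange(4400)            # hypothetical stand-in for about half a year of hourly data
train_part, val_part, test_part = chronological_split(series)
print(train_part.size, val_part.size, test_part.size)
```

Keeping the test block at the end of the series ensures that the model is always evaluated on data later than anything it saw during training or hyperparameter selection.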
The hyperparameters that were optimized, together with the search ranges that previous studies indicate are advisable, are as follows:
number of layers, num layers: [1; 2]; dimension of the hidden state of each layer, hidden dim: [16; 32; 64; 128]; learning rate, lr: [0.001; 0.01; 0.1].
Experiments were carried out for window lengths of 4, 12, and 24, which correspond to the characteristic periods of the samples.
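A window length determines how many past hourly values form one input sample for the network. The sketch below shows one way to build such samples; `make_windows` is a hypothetical helper, and the one-step-ahead target (`horizon=1`) is an assumption, since the text does not state how the 24-point forecast is generated from the model.

```python
import numpy as np

def make_windows(x, window, horizon=1):
    """Build (inputs, targets): `window` past values -> next `horizon` values."""
    x = np.asarray(x, dtype=float)
    n = x.size - window - horizon + 1
    inputs = np.stack([x[i:i + window] for i in range(n)])
    targets = np.stack([x[i + window:i + window + horizon] for i in range(n)])
    return inputs, targets

series = np.arange(30.0)                       # illustrative series
for window in (4, 12, 24):                     # window lengths considered in the study
    inputs, targets = make_windows(series, window)
    print(window, inputs.shape, targets.shape)
```

Shorter windows yield more training samples from the same history, while the 24-h window lets each sample cover a full daily cycle.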
Each model variant, defined by one combination of hyperparameters, is trained on the training sample for 1000 epochs. Each trained variant then produces a forecast on the validation sample, and the variant with the minimum validation error is selected as the model with optimal hyperparameters. The final forecast is produced on the test sample.
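The selection procedure can be sketched as an exhaustive search over the grid given earlier. Whether the authors used an exhaustive search or another strategy is not stated, so this is an assumption. `toy_validation_error` is a made-up stand-in for "train for 1000 epochs, then measure the validation error"; it does not train a network.

```python
from itertools import product

GRID = {
    "num_layers": [1, 2],
    "hidden_dim": [16, 32, 64, 128],
    "lr": [0.001, 0.01, 0.1],
}

def grid_search(grid, train_and_validate):
    """Return the parameter combination with the lowest validation error."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = train_and_validate(**params)      # in practice: train, then score on validation data
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

def toy_validation_error(num_layers, hidden_dim, lr):
    # Hypothetical error surface used only to make the sketch runnable
    return abs(hidden_dim - 64) / 64 + abs(lr - 0.01) * 10 + 0.05 * num_layers

best_params, best_error = grid_search(GRID, toy_validation_error)
print(best_params, best_error)
```

With this grid there are 2 × 4 × 3 = 24 candidate configurations per window length, each requiring a full training run.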
To increase the stability of the forecasts, ensembles of ANNs are tested. The ensembles are built from the possible combinations of the described models, which differ in window length (“4 + 12”; “4 + 24”; “12 + 24”; “4 + 12 + 24”). The ensemble forecast is the mean of the forecasts of its member models. The scheme of operation of LSTM models and their ensembles is shown in Figure 4.
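Averaging the member forecasts can be expressed in a few lines. The forecast values below are invented for illustration, and `ensemble_forecast` is a hypothetical helper name.

```python
import numpy as np

def ensemble_forecast(forecasts, members):
    """Element-wise mean of the forecasts of the selected window-length models."""
    return np.mean([forecasts[m] for m in members], axis=0)

# Illustrative forecasts (two hours ahead) keyed by window length
forecasts = {
    4: np.array([100.0, 200.0]),
    12: np.array([110.0, 220.0]),
    24: np.array([90.0, 210.0]),
}
for members in [(4, 12), (4, 24), (12, 24), (4, 12, 24)]:
    print(" + ".join(map(str, members)), ensemble_forecast(forecasts, members))
```

An unweighted mean keeps the ensemble free of additional tunable parameters, at the cost of giving a weaker member the same influence as a stronger one.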

Figure 4. Scheme of operation of long short-term memory (LSTM) models and their ensembles.
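The ensemble rule described above, taking the mean of the member forecasts, can be written in a few lines. The forecast values below are invented placeholders and the helper name `ensemble_forecast` is hypothetical.

```python
import numpy as np


def ensemble_forecast(forecasts_by_window, members):
    """Average, point by point, the forecasts of the models named in `members`."""
    stacked = np.stack(
        [np.asarray(forecasts_by_window[w], dtype=float) for w in members]
    )
    return stacked.mean(axis=0)


# Placeholder forecasts of three models, keyed by window length (not real data).
forecasts = {
    4: [100.0, 120.0, 90.0],
    12: [110.0, 100.0, 95.0],
    24: [105.0, 110.0, 100.0],
}

ens_4_24 = ensemble_forecast(forecasts, (4, 24))
ens_all = ensemble_forecast(forecasts, (4, 12, 24))
print(ens_4_24, ens_all)
```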
Therefore, for the study, the following models are built:
ARIMA, for which the optimal values of the coefficients are searched over p = [1, 2, 3, 4, 5]; d = [0, 1]; q = [1, 2, 3]. Since the d coefficient may take the value 0 or 1, the ARMA model is included in this search and does not need to be built as a separate model. Forecasting is carried out on the hourly profile, and experiments were conducted with histories of 30, 60, and 300 days;
SARIMA, for which the optimal values of the coefficients are searched over p = d = q = P = D = Q = [1, 2]; m = 24. The history is 100 days;
LSTM: three LSTM models with window lengths of 4, 12, and 24 were built with selection of optimal hyperparameters from lr = [0.001; 0.01; 0.1]; hidden dim = [16; 32; 64; 128]; num layers = [1; 2]; their ensembles are 4 + 12, 4 + 24, 12 + 24, and 4 + 12 + 24. The history is 648 days.
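The size of the auto-regressive search spaces listed above can be checked by enumerating the candidate orders. The sketch only builds the candidate lists; fitting each candidate (for instance with a statistics library) and the criterion used to pick the best one are not specified in the text and are left out.

```python
from itertools import product

# ARIMA candidates: p in 1..5, d in {0, 1}, q in 1..3 (d = 0 covers ARMA).
arima_orders = list(product([1, 2, 3, 4, 5], [0, 1], [1, 2, 3]))

# SARIMA candidates: each of p, d, q, P, D, Q in {1, 2}, seasonal period m = 24.
sarima_orders = [
    ((p, d, q), (P, D, Q, 24))
    for p, d, q, P, D, Q in product([1, 2], repeat=6)
]

print(len(arima_orders), len(sarima_orders))  # 30 64
```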
Results
The models were tested on the period from September 1 to September 30, 2024; the forecasting horizon is 24 points. Forecast quality was assessed using MAPE, RMSE, and the correlation coefficient between factual and forecast values (R(f, p)). The forecasting errors of the auto-regressive models are given in Table 4, and those of the LSTM models in Table 5.
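For reference, the three quality metrics can be computed as in the sketch below. It assumes the usual definitions of MAPE and RMSE and takes R(f, p) to be the Pearson correlation coefficient expressed in percent; the text does not give the formulas, so these are assumptions, and the numbers are placeholders rather than study data.

```python
import numpy as np


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(100.0 * np.mean(np.abs((actual - forecast) / actual)))


def rmse(actual, forecast):
    """Root mean square error, in the units of the series."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


def correlation_percent(actual, forecast):
    """Pearson correlation between factual and forecast values, in percent (assumed R(f, p))."""
    return float(100.0 * np.corrcoef(actual, forecast)[0, 1])


actual = [100.0, 200.0, 400.0]    # placeholder factual values
forecast = [110.0, 180.0, 400.0]  # placeholder forecast values
print(mape(actual, forecast), rmse(actual, forecast), correlation_percent(actual, forecast))
```

Because MAPE divides by the factual value, hours with imbalances close to zero can inflate it sharply, which is one reason to report RMSE and the correlation alongside it.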
Table 4. Values of forecasting errors of ARIMA and SARIMA models.
ARIMA: auto-regressive integrated moving average; SARIMA: seasonal auto-regressive integrated moving average; MAPE: mean absolute percentage error; RMSE: root mean square error. The best result of each row is highlighted in bold.
Table 5. Values of forecasting errors of LSTM models and their ensembles.
LSTM: long short-term memory; MAPE: mean absolute percentage error; RMSE: root mean square error. The best result of each row is highlighted in bold.
According to the imbalance forecasting results in Tables 4 and 5, the LSTM models achieve the highest accuracy. By MAPE, the most accurate model is the one with a window length of 24; by RMSE, it is the ensemble of models with window lengths of 4 and 24 for positive imbalances and the ensemble of models with window lengths of 12 and 24 for negative imbalances. The difference between these models and their ensembles is insignificant.
The forecasts of the ARIMA model are of lower quality, although accuracy increases with the length of the history. The results of the SARIMA model are moderate. It should be noted that the forecasts of negative imbalances are noticeably more accurate than those of positive imbalances, which can be explained by the greater variability of the sample of positive imbalances. For ARIMA models, R(f, p) lies in the range of 41.14–43.08% for positive imbalances and 23.71–28.58% for negative ones, indicating only a weak relation between factual and forecast values. This means that the model has limited accuracy.
For SARIMA, the R(f, p) values are negative for positive imbalances (−3.31%) and close to zero for negative imbalances (0.65%). This indicates that the model is not suitable for forecasting imbalances, since it almost completely fails to capture their trend.
For all variants of the LSTM model, when forecasting positive imbalances, the R(f, p) values are stable and high (91.05–91.9%), indicating a strong relation between factual and forecast values. For the negative imbalance forecast results, R(f, p) ranges between 77.2 and 77.83%, indicating a moderate level of accuracy. Therefore, LSTM models are much more effective in forecasting imbalances compared to ARIMA and SARIMA.
It should be noted that the forecasting errors for positive imbalances are higher than for negative ones, while the R(f, p) values are also higher for positive imbalances. This can be explained by the fact that positive imbalances have greater variability than negative imbalances. As a result, the model cannot accurately predict exact values, which increases the MAPE and RMSE errors. The correlation may nevertheless remain high, since the model identifies the general trend of positive imbalances. In addition, isolated extreme values may significantly increase the overall forecasting errors for positive imbalances. This is analyzed in detail using heat maps of daily MAPE and RMSE values, which are shown in Figures 5 and 6.

Figure 5. Heat maps of daily mean absolute percentage error (MAPE) of forecasting positive and negative electricity imbalances.

Figure 6. Heat maps of daily root mean square error (RMSE) of forecasting positive and negative electricity imbalances.
Analysis of the daily MAPE and RMSE values shows that ARIMA is more prone to extreme error values, which negatively affects the overall result of the experiment. LSTM demonstrates the most accurate and most stable results among the considered models.
For a more complete analysis of the obtained results, their comparison was carried out using additional metrics for assessing the accuracy of forecasting, namely mean absolute error (MAE), mean squared error (MSE), mean absolute relative error (MARE), mean squared relative error (MSRE), root squared mean percentage error (RSMPE), root mean squared relative error (RMSRE). Their values are given in Tables 6 and 7.
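The text names the additional metrics but does not give their formulas. The sketch below uses commonly used definitions for five of them, which is an assumption: relative errors are taken with respect to the factual value. RSMPE is omitted because its exact definition cannot be inferred from the text. The input values are placeholders.

```python
import numpy as np


def additional_metrics(actual, forecast):
    """Return MAE, MSE, MARE, MSRE, and RMSRE under commonly used definitions."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    error = actual - forecast
    relative = error / actual  # assumes non-zero factual values
    msre = float(np.mean(relative ** 2))
    return {
        "MAE": float(np.mean(np.abs(error))),
        "MSE": float(np.mean(error ** 2)),
        "MARE": float(np.mean(np.abs(relative))),
        "MSRE": msre,
        "RMSRE": float(np.sqrt(msre)),
    }


metrics = additional_metrics([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(metrics)
```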
Table 6. Values of additional forecasting errors of ARIMA and SARIMA models.
ARIMA: auto-regressive integrated moving average; SARIMA: seasonal auto-regressive integrated moving average; MAE: mean absolute error; MSE: mean squared error; MARE: mean absolute relative error; MSRE: mean squared relative error; RSMPE: root squared mean percentage error; RMSRE: root mean squared relative error. The best result of each row is highlighted in bold.
Table 7. Values of additional forecasting errors of LSTM models and their ensembles.
LSTM: long short-term memory; MAE: mean absolute error; MSE: mean squared error; MARE: mean absolute relative error; MSRE: mean squared relative error; RSMPE: root squared mean percentage error; RMSRE: root mean squared relative error. The best result of each row is highlighted in bold.
According to the values presented in Tables 6 and 7, MAE and MSE for positive imbalances are significantly higher than for negative ones. This indicates that the models perform worse in predicting positive imbalances. Among the auto-regressive models (Table 6), the lowest MAE and MSE are achieved with ARIMA(300), while SARIMA shows the worst results. Among the LSTM models and their ensembles (Table 7), the best MAE and MSE values are obtained using 4 + 24 and 12 + 24, confirming previous findings. The values of MARE, MSRE, RMSRE, and RSMPE for positive imbalances are significantly higher than for negative ones. This confirms the trend of better forecasting performance for negative imbalances due to their lower variability and better predictability. The most accurate results were obtained using ARIMA(300) among auto-regressive models and the ensembles 4 + 24 and 12 + 24 among LSTM models. SARIMA and ARIMA(30, 60) are less effective, especially for positive imbalances.
Two statistical tests were performed to assess the quality of the forecasting models: the Diebold–Mariano (DM) test (Iftikhar et al., 2024; Qureshi et al., 2024) and the Durbin–Watson (DW) test. The DM test was used to compare the accuracy of the models’ forecasts based on the squared error or the absolute error. The DW test was used to detect autocorrelation of the residuals. The DW statistic always takes a value between 0 and 4. A value of 2.0 indicates that no autocorrelation is detected in the sample. Values from 0 to less than 2 point to positive autocorrelation, and values above 2 up to 4 indicate negative autocorrelation. The comparison was performed between the model that demonstrated the best forecasting accuracy (LSTM 4 + 24 for positive imbalances, LSTM 12 + 24 for negative ones) and the other models. The results are shown in Table 8.
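Both tests can be sketched with NumPy alone. The DW statistic follows its standard definition. The DM statistic below is a simplified one-step-ahead version with a normal approximation and no correction for autocorrelation in the loss differential, which is an assumption; the exact variant used in the study is not described. The error values are placeholders.

```python
import math
import numpy as np


def durbin_watson(residuals):
    """DW statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def diebold_mariano(errors_a, errors_b, loss="squared"):
    """Simplified DM test; a negative statistic means model A has the lower loss."""
    e_a = np.asarray(errors_a, dtype=float)
    e_b = np.asarray(errors_b, dtype=float)
    if loss == "squared":
        d = e_a ** 2 - e_b ** 2
    elif loss == "absolute":
        d = np.abs(e_a) - np.abs(e_b)
    else:
        raise ValueError("loss must be 'squared' or 'absolute'")
    # Assumes the loss differential is not constant (non-zero variance).
    dm = d.mean() / math.sqrt(d.var() / len(d))
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))  # two-sided, normal approximation
    return float(dm), p_value


dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # sign flips: negative autocorrelation
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # identical residuals: positive autocorrelation
dm_stat, dm_p = diebold_mariano([1.0, 1.0, 1.0, 1.0], [2.0, 3.0, 2.0, 3.0])
print(dw_alternating, dw_constant, dm_stat, dm_p)
```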
Table 8. DM and DW test results.
DM: Diebold–Mariano; DW: Durbin–Watson; LSTM: long short-term memory; ARIMA: auto-regressive integrated moving average; SARIMA: seasonal auto-regressive integrated moving average.
ARIMA and SARIMA models have strong residual autocorrelation and the worst accuracy in most cases. The ensembles of LSTM models (4 + 12, 4 + 12 + 24) have the best balance between accuracy and autocorrelation for positive imbalances. For negative imbalances, LSTM models 4, 12, and 24 are the best, but they have low DW statistics, indicating potential correlation of errors. The models closest to optimal are: 4 + 12 + 24 for positive imbalances and 4 + 24 for negative imbalances.
For example, Figure 7 shows daily graphs of the factual values of electricity imbalances and their forecasts obtained using the model that showed the best accuracy according to RMSE.

Figure 7. Daily graphs of factual and forecast values of positive (a) and negative (b) electricity imbalances.
Discussion
The forecasting results indicate that LSTM-based models achieve the highest accuracy for both positive and negative imbalances compared to the auto-regressive ARIMA and SARIMA models. The latter, in contrast to the conclusions of previous works (Balázs et al., 2024; di Persio et al., 2017), turned out to be less accurate and ineffective for forecasting imbalances on the selected retrospective data of the IPS of Ukraine. This can be explained by the fact that these models are more difficult to adapt to high data variability. In addition, the LSTM models provide not only higher accuracy but also greater forecast stability. At the same time, a previous work (Plakas et al., 2023) concluded that the RF model has an advantage over the LSTM. The same study shows ways to improve the performance of this model, which may increase interest in its application for forecasting electricity imbalances. The described algorithm for controlling variations of the LSTM network, along with the selection of optimal hyperparameters and their ensembles, ensures greater stability of forecasting results. Additionally, it reduces the effort and time required for model tuning. Similar conclusions regarding the advantages of ANN models for forecasting imbalance time series, owing to their potential for extension and modification, underlie the works (Bâra and Oprea, 2024; Carnevale et al., 2024). The forecasting accuracy achieved in this study is somewhat lower than the values reported in previous works (Plakas et al., 2023; Toubeau et al., 2021). This may be explained by the fact that the electricity imbalances considered here are an order of magnitude larger and their samples have a more complex structure, characterized by high variability and by anomalous extreme values arising from emergency situations caused by unpredictable external factors.
Thus, the approach to forecasting electricity imbalances presented in this work, based on LSTM models and their ensembles, demonstrates competitive advantages over existing ones and opens up prospects for further improvement.
The findings of this study have several important implications. First, they confirm that deep learning models, particularly LSTM-based ensembles, are highly effective in handling non-stationary, high-variability electricity imbalance data in real-world conditions. This is especially critical for systems undergoing structural transitions, such as the Ukrainian IPS. Second, the combination of automatic hyperparameter optimization and model ensemble strategies contributes to improved stability and accuracy without the need for extensive manual tuning—a key benefit in operational forecasting environments.
The proposed methodology is designed with adaptability in mind. The LSTM ensemble framework, based on modular architectures and data-driven tuning, allows for relatively straightforward transfer to other power systems, provided that minimum data requirements are met. Thus, while this work focuses on the Ukrainian context, its core methodology is generalizable and can be replicated across different markets with suitable adjustments.
Compared to recent state-of-the-art hybrid models and transformer-based approaches, our LSTM-based method provides a balance between accuracy and computational efficiency. While transformers may offer marginally higher accuracy, their computational cost is significantly higher, making them less suitable for real-time forecasting applications.
The purpose of the study was to develop and compare various models for forecasting electricity imbalances, and the obtained results confirm that this purpose has been achieved. The developed model using an LSTM ANN may be effectively used to forecast total imbalances in the IPS of Ukraine. While this study focuses on the Ukrainian electricity market, the proposed forecasting approach is applicable to other power systems with high renewable energy integration. Markets with significant RES penetration face similar challenges in imbalance management, making deep learning-based forecasting models highly relevant. The adaptability of LSTM-based models allows for easy customization to different grid structures and regulatory frameworks, ensuring their effectiveness in diverse market environments.
The proposed forecasting framework has direct implications for real-world electricity market operations. By improving imbalance prediction accuracy, grid operators can enhance market stability, optimize reserve allocation, and reduce financial penalties associated with imbalance settlements. Additionally, the model's adaptability allows it to be integrated into automated energy management systems, enabling more efficient real-time decision-making. The implementation of such forecasting techniques can lead to cost savings for both energy providers and consumers, while also contributing to the reliability of power system operations under fluctuating renewable energy penetration.
Challenges, limitations, and future works
Despite the strong performance of the proposed LSTM-based approach, several challenges remain. One key limitation is the need for a large historical dataset for model training, as deep learning methods rely on extensive past data to capture complex patterns. While the model effectively handles non-stationarity and extreme fluctuations, its accuracy may decrease when operating in markets with limited historical records or insufficient data granularity.
Another challenge is the computational cost of training and deploying deep learning models. Although the optimized hyperparameter tuning and ensemble learning improve efficiency, real-time forecasting in high-frequency markets may require further optimization of processing power and resource allocation. Future work should focus on developing more computationally efficient LSTM implementations or exploring alternative architectures, such as transformers, to balance accuracy and speed.
Extreme values in power imbalance data, such as sudden spikes due to unexpected power outages or rapid fluctuations in renewable energy generation, pose a significant challenge for forecasting models. Statistical models such as ARIMA and SARIMA struggle with such anomalies because they assume relatively stable patterns in the data. In contrast, LSTM models demonstrate better adaptability to extreme values due to their ability to capture complex temporal relationships. However, they can still be affected by extreme deviations, especially if such events are underrepresented in the training dataset. To minimize such effects, it is recommended to apply data preprocessing techniques.
Additionally, while the proposed methodology is adaptable to different power systems and market structures, further validation across diverse energy systems with varying regulatory frameworks and renewable energy shares is necessary. Future research should also investigate hybrid models that incorporate the influence of external factors, such as weather conditions and economic indicators, to further enhance forecasting robustness and reliability.
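The text does not name a specific preprocessing technique. As one possible illustration only, and not a method taken from the study, the sketch below clips training values to percentile bounds (a form of winsorization). The function name `clip_extremes`, the percentile levels, and the data are all hypothetical.

```python
import numpy as np


def clip_extremes(train, lower_pct=1.0, upper_pct=99.0):
    """Clip values outside the given percentiles of the training sample.

    Returns the clipped series and the bounds, so that the same bounds
    can be reused on later data instead of being re-estimated.
    """
    train = np.asarray(train, dtype=float)
    lower, upper = np.percentile(train, [lower_pct, upper_pct])
    return np.clip(train, lower, upper), (float(lower), float(upper))


# Placeholder series: 99 ordinary hours and one extreme spike.
sample = np.concatenate([np.full(99, 10.0), [1000.0]])
clipped, bounds = clip_extremes(sample)
print(bounds, clipped.max())
```

Clipping limits the influence of isolated spikes on training, at the cost of discarding information about the true size of those events, so whether it helps depends on how the forecasts are used.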
Conclusion
The findings of this study confirm that traditional statistical models are limited in their ability to provide accurate short-term forecasts of electricity imbalances under conditions of high data variability and non-stationarity. In contrast, deep learning methods based on LSTM architectures demonstrate a significantly higher capacity to model complex temporal dependencies and provide more stable and reliable predictions.
The results support the hypothesis that electricity imbalances in modern power systems, especially those with increasing shares of renewable energy, require forecasting approaches that are flexible, data-driven, and able to adapt to irregular patterns. The ensemble configuration of LSTM models, combined with automated hyperparameter selection, further enhances this adaptability and contributes to improved performance stability.
Overall, this study highlights the relevance of advanced machine learning techniques for addressing the operational challenges of imbalance forecasting in power systems undergoing structural and regulatory transformations. The proposed approach demonstrates the ability to reduce forecasting error and better capture imbalance dynamics, thereby contributing to more informed and effective system-level decision-making.
Footnotes
Acknowledgements
The authors would like to express their sincere gratitude to Stanislav Misak for his exceptional supervision, project administration, and overall guidance throughout the course of this project. His expertise and support were instrumental to its success.
Author contributions
Ihor Blinov, Ievgen Zaitsev, Mohit Bajaj, and Volodymyr Miroshnyk: conceptualization, methodology, software, visualization, investigation, and writing—original draft preparation. Viktoriia Sychova, Pavlo Shymaniuk, Vojtech Blazek, and Lukas Prokop: project administration, supervision, resources, writing—review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is funded by European Union under the REFRESH—Research Excellence For Region Sustainability and High-Tech Industries Project via the Operational Programme Just Transition under grant CZ.10.03.01/00/22_003/0000048; in part by the National Centre for Energy II and ExPEDite Project a Research and Innovation Action to Support the Implementation of the Climate Neutral and Smart Cities Mission Project TN02000025; and in part by ExPEDite through European Union's Horizon Mission Programme under grant 101139527.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
