Abstract
Rising demand for energy-efficient buildings calls for advanced predictive models that can evaluate heating and cooling load requirements. This research presents a unified strategy that combines Long Short-Term Memory (LSTM) networks and Gradient Boosting Machines (GBM) to improve the precision and reliability of building energy load estimates. The process begins with collecting and preparing data on energy usage, weather conditions, occupancy trends, and building features. LSTM networks capture sequential relationships and produce initial load projections, which are then used as input attributes for the GBM model. Combining LSTM with GBM takes advantage of each model's strengths: LSTM's handling of sequential data and GBM's ability to capture complex nonlinear relationships. Performance measures such as root mean squared error (RMSE) and mean absolute error (MAE) are used to evaluate the hybrid model. Compared with the individual models, the integrated LSTM-GBM method improves prediction accuracy. This higher predictive capacity supports real-time energy management systems, improving building operations and reducing energy use. Implementing the integrated model in Building Management Systems (BMS) shows its practicality for achieving sustainable building energy efficiency.
Keywords
Introduction
Recent considerations of environmental sustainability and climate change have made energy efficiency a priority in building design and management. Building energy efficiency is how well a building uses energy for heating, cooling, lighting, and appliances. Assessing energy efficiency involves analysing building features that significantly affect energy use. The energy efficiency of a structure depends on its envelope—walls, roof, windows, and doors. Insulation, materials, and building procedures affect heat transfer between indoor and outdoor settings. Wall and roof insulation, high-performance windows, and airtight construction can reduce heating and cooling needs, enhancing energy efficiency.
Energy use can be affected by a structure's orientation relative to the sun. Buildings that make use of natural light and passive solar heating can reduce their reliance on artificial lighting and heating. Rooms and spaces can be designed to maximise natural airflow and light, reducing energy consumption even further (Ge et al., 2014; Karakus et al., 2013; Mathieu et al., 2011).
Building energy efficiency depends on HVAC system performance. The type, size, and efficiency of heating and cooling equipment, together with the architecture of the distribution system, affect energy usage. Modern HVAC systems with variable-speed drives, heat recovery ventilation, and smart thermostats can save a substantial amount of energy.
Lighting also accounts for a large share of building energy use. Using natural sunlight and energy-efficient lighting such as LED fixtures can cut energy use. Lighting controls such as occupancy sensors and daylight-responsive dimming systems improve efficiency further by using lights only when needed.
Advanced building automation and control systems regulate energy use precisely. These systems analyse occupancy patterns and ambient factors to control HVAC, lighting, and security. Optimising building systems through automation can save a substantial amount of energy.
A structure's energy efficiency also depends on construction materials and methods. Sustainable, high-thermal-mass materials store and release heat, stabilising indoor temperatures. Airtight construction and the reduction of thermal bridging lower energy use.
Energy efficiency is crucial to environmental protection and sustainable growth. It is the ability to maintain a given service or output while minimising energy use. Examining energy efficiency is vital for reducing energy use, costs, and environmental damage, and for strengthening energy security (Acerbi et al., 2018; Pinto et al., 2019; Rabie and Adebisi, 2017; Tan et al., 2018).
Methods and metrics for assessing energy efficiency
Each method of measuring energy efficiency is developed for a particular context and level of study. These methods include:
Energy Audits: Energy audits methodically evaluate the energy use of a building, business, or organisation. They identify inefficient energy use and recommend improvements. Comprehensive audits include walkthroughs as well as data collection, monitoring, and modelling.
Energy Performance Indicators (EPIs): EPIs measure the energy efficiency of a system or process numerically. Common EPIs are energy consumption per unit of output, such as kilowatt-hours per tonne of product, and energy usage intensity (EUI), such as kWh per square metre.
Benchmarking: Benchmarking compares the energy efficiency of a facility or process to industry standards or best practices. This comparison helps identify inefficiencies and set performance goals.
Energy Management Systems (EnMS): Energy Management Systems, such as those based on ISO 50001, improve energy performance through continual improvement. The steps are setting energy performance goals, monitoring energy consumption, and conserving energy.
Life Cycle Assessment (LCA): LCA evaluates a product's energy use and environmental impact from manufacturing to disposal. This thorough approach identifies energy-saving opportunities throughout the life cycle.
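The two indicators named under EPIs above are simple ratios. The following minimal sketch shows how they could be computed. The function names and all numeric inputs are hypothetical and chosen only for illustration; they are not values from this study.

```python
def energy_use_intensity(annual_kwh: float, floor_area_m2: float) -> float:
    """Energy usage intensity (EUI) in kWh per square metre."""
    if floor_area_m2 <= 0:
        raise ValueError("floor area must be positive")
    return annual_kwh / floor_area_m2


def energy_per_unit_output(energy_kwh: float, output_tonnes: float) -> float:
    """Energy consumption per unit of output in kWh per tonne of product."""
    if output_tonnes <= 0:
        raise ValueError("output must be positive")
    return energy_kwh / output_tonnes


# Hypothetical inputs, for illustration only.
eui = energy_use_intensity(annual_kwh=180_000, floor_area_m2=1_200)
kwh_per_tonne = energy_per_unit_output(energy_kwh=52_000, output_tonnes=400)

print(f"EUI: {eui:.1f} kWh/m2")
print(f"Specific consumption: {kwh_per_tonne:.1f} kWh/tonne")
```

Normalising by floor area or output is what makes such indicators comparable across facilities of different sizes, which is also the basis of benchmarking.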
Energy efficiency lowers energy costs for households, businesses, and governments. This is significant in energy-intensive industries, where energy bills account for a large share of operational costs. Lower energy consumption reduces greenhouse gas emissions and other energy-related pollution, which mitigates climate change and improves air quality. Energy efficiency also conserves precious resources such as fossil fuels for future generations (Kirillov et al., 2020; Ramokone et al., 2020; Zhang et al., 2020). Improving energy efficiency reduces dependence on imported energy, which improves energy security and reduces exposure to price volatility and supply disruptions. Energy optimisation may cut costs and boost sustainability, giving companies a competitive edge. It may also create jobs in the energy efficiency industry (Garifi et al., 2020).
Challenges in assessing energy efficiency
Despite its importance, assessing energy efficiency poses several challenges:
Data Availability and Quality: For evaluation purposes, energy use data has to be precise and comprehensive. Lack of completeness, consistency, or accessibility of data is a common problem in complex or decentralised systems.
Standardization: It is difficult to evaluate performance across entities and industries due to the varied methodology and criteria used to assess energy efficiency.
Technological Barriers: In developing countries, access to affordable energy may be a challenge due to a lack of resources and expertise.
Behavioral Factors: Organisational culture and human conduct impact energy use. To encourage energy-saving practices, we need to educate people, incentivize them, and shift cultural norms.
Regulatory and Policy Frameworks: Regulatory and policy frameworks that are conducive to energy efficiency assessment and improvement are essential. Possible roadblocks include insufficient or poorly enforced regulations.
Different situations call for different methods and metrics when evaluating energy efficiency. Reducing costs, protecting the environment, conserving resources, ensuring energy security, and being economically competitive are all dependent on it. Addressing data availability, standardisation, technological limits, behavioural concerns, and legal frameworks is crucial for increasing energy efficiency evaluation and improvement (Lala et al., 2022; Lat et al., 2022; Saragih et al., 2022; Senarathne et al., 2022; Tien et al., 2022). A sustainable future will be shaped by robust and all-encompassing energy efficiency research as the world grapples with energy consumption and environmental sustainability.
The research contributions of this paper are as follows:
This paper demonstrates that integrating LSTM and GBM models significantly improves the accuracy of heating and cooling load predictions.
This paper uses LSTM networks to capture and incorporate temporal dependencies in building energy data.
The authors develop a novel framework that combines the strengths of LSTM and GBM for robust energy load forecasting.
This paper facilitates real-time predictions for more efficient building energy management systems.
This paper contributes to the advancement of sustainable building operations through improved energy load assessments and reduced energy consumption.
The paper is organised as follows. The introduction stresses the importance of building energy efficiency and of dependable models for projecting heating and cooling loads. The literature review highlights the benefits of LSTM networks and GBM and the techniques built on them. The methodology covers data collection, preprocessing, and the construction and training of the LSTM and GBM models. The integration technique uses LSTM predictions as features for the GBM model. The results demonstrate that the integrated model outperforms the standalone models in terms of RMSE and MAE. After examining practical implications, including real-time deployment in Building Management Systems (BMS), the conclusion outlines key contributions and future research.
Review of literature
Mathieu et al. (2011) provide a detailed analysis of electric demand in commercial and industrial buildings, focusing on a 15-min time frame. Their techniques enable building managers to understand and evaluate electricity consumption and to identify opportunities in demand response, energy efficiency, waste reduction, and peak load control. Their main focus is on demand response. Graphs are employed to evaluate electric load statistics. A regression-based model uses a time-of-week indicator variable and a piecewise linear and continuous dependence on outdoor air temperature to predict electricity load. The electrical loads and demand response behaviour of a facility are then characterised by a set of distinct features. These methods have the potential to become user-friendly tools for facilities managers.
Pinto et al. (2019) propose a recommender system for intelligent building energy management based on case-based reasoning (CBR). The technique uses data from similar past cases to determine the optimal amount of energy reduction for a building at any particular moment. A support vector machines approach optimises the weighting of each case's factors, while the k-nearest neighbour clustering algorithm identifies the most similar previous cases. An expert system with specific rules ensures that the response is appropriate and relevant to the current situation. The CBR technique is embedded in an individual software agent that operates within a community of multi-agent systems for analysing energy systems. The results indicate that the proposed approach offers appropriate recommendations for reducing energy consumption, as shown by comparing its outcomes with those obtained from a particle swarm optimisation method and with previous reductions. The outcomes of the method are also applied in a household energy resources management system to evaluate its feasibility.
Ramokone et al. (2020) employed an artificial neural network to model patterns in residential energy usage, considering factors such as household activities, income, and occupancy. Including these factors helps the model cope with unforeseen changes in the data and with intricate relationships, which allows for accurate energy estimation and prediction. The models trained with the Levenberg-Marquardt (LM) algorithm produce satisfactory outcomes, with coefficients of determination ranging from 0.87 to 0.91.
Zhang et al. (2020) present research on building design and the deployment of wireless networks. The work introduces interference gain (IG) as a natural indicator of a building's wireless performance in blocking interference signals. It develops analytical models to calculate IG and a novel approach to find the transmit power that maximises IG. The IG is obtained by mathematically transforming the probability density function (PDF) of the distance between a probe user equipment (UE) and a wall, where the position of the UE is random and its direction is uniformly distributed. For rectangular rooms, the PDF of this random distance is derived analytically, which simplifies the calculation of the IG for a building under design (BUD) made up of rectangular rooms and a corridor. For irregularly shaped rooms in the BUD, random shooting analysis (RSA) is used to compute the distance PDF. The closed-form expressions are compared with and validated against the RSA results. The numerical findings demonstrate the accuracy of both the IG model and the method used to determine the optimal transmit power for a building. The findings assist architects and radio engineers in optimising the wireless performance of buildings and in determining the optimal number of wireless access points.
Alammar and Jabi (2022) employed an Artificial Neural Network (ANN) model to estimate the hourly cooling requirements of adaptive facades (AF) faster than building performance simulation (BPS). They used generative parametric modelling to create a model of office towers with AF shading and investigated its energy consumption. The ANN model was trained on data produced with Grasshopper's Honeybee add-on, which is integrated with EnergyPlus. The resulting model was shown to calculate cooling requirements rapidly and with high precision.
Ceccarelli et al. (2022) investigate the potential of machine learning to replace numerical simulations and accelerate building design, while also examining the impact of architectural features. The case study used a set of simulations of an agricultural building to train and evaluate supervised regression models. The tree-based Extreme Gradient Boosting method achieved the highest performance. Model explainability was examined using SHAP and feature importance. This work helps scholars and practitioners improve the methods used in designing new constructions and in retrofitting.
According to Lat et al. (2022), the use of machine learning (ML) has increased as more sectors embrace Industry 4.0. To assess the influence of several elements on the LEED and BERDE green building rating (GBR) systems, they applied Garson's Algorithm (GA) to an artificial neural network (ANN). The investigation examined the categories listed by LEED and BERDE, which include Location and Transportation (LT), Water Efficiency (WE), Energy and Atmosphere (EA), Materials and Resources (MR), and Indoor Environmental Quality (Ind). The findings showed that the overall attributes and the aspects related to energy efficiency and conservation had a substantial impact on the LEED and BERDE rating systems. Neural networks were thus shown to be a powerful tool for sensitivity analysis (SA) that evaluates and measures the influence of multiple factors.
Saragih et al. (2022) prioritise the analysis of individual findings rather than the overall outcomes. Previous research had shown good performance for Multivariate Random Forest. This study instead employed Multivariate Extreme Gradient Boosting, which follows the same principle as Extreme Gradient Boosting but is designed for multivariate data. The best results were a mean squared error (MSE) of 0.537471 and a root mean squared error (RMSE) of 0.733124. They were obtained with a gamma value of 0.1, an n-estimator value of 200, and a 90:10 split between training and testing data. The data was not normalised during testing. The results indicate that Multivariate Extreme Gradient Boosting surpasses the earlier studies in terms of performance.
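Because RMSE is by definition the square root of MSE, the two figures quoted above can be checked against each other. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
import math

# Values as quoted in the text above.
reported_mse = 0.537471
reported_rmse = 0.733124

# RMSE is the square root of MSE.
derived_rmse = math.sqrt(reported_mse)

# The reported figures agree if they match to the quoted precision.
consistent = math.isclose(derived_rmse, reported_rmse, abs_tol=1e-6)
print(f"derived RMSE = {derived_rmse:.6f}, consistent = {consistent}")
```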
Senarathne et al. (2022) investigated the impact of several design input characteristics, namely relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, and glazing area distribution, on heating and cooling load. The dataset of 768 buildings came from the UCI Machine Learning Repository. The coefficients of logistic and linear regression models, evaluated with 10-fold cross validation, were used to assess the influence of each characteristic. A word-cloud display aided interpretation of the observations. The linear regression model yielded root mean square errors of 2.82 and 2.13, as well as mean absolute errors of 1.97 and 2.13. The logistic regression model for heating load (HL) achieved an accuracy of 76.30%, while the model for cooling load (CL) achieved an accuracy of 73.17%. The findings indicated that reducing building height, glazing area, and relative compactness can improve energy efficiency. These findings can be applied to the construction of energy-efficient structures.
Campodonico Avendano et al. (2023) investigate how the performance of the baseline load forecast pipeline affects the accuracy of a grid operator's flexibility estimation for various building types. The study compared several machine learning algorithms, as well as sliding-window and offline training approaches, for estimating baseline load one hour ahead. Smart meter data is used to determine the optimal training window sizes and the most suitable pipeline for each building category. Subsequently, the energy consumption patterns of five buildings from each category are simulated, covering both their baseline load and their flexibility. Finally, the identified pipelines are used to predict the baseline loads and to measure the resulting discrepancy in flexibility. The most promising prediction pipeline uses the extra trees approach with a sliding window of 5 weeks. This pipeline performs better than offline training, with an average F2 score of 0.91 compared to 0.87. These pipelines allow for precise assessment of flexibility, with a mean relative error in the flexibility index ranging from −2.45% to +2.79%. This enables the grid operator to compensate buildings fairly for their capacity to adapt.
Chen et al. (2023) conducted a comprehensive review of research on building energy management, focusing on the use of interpretable machine learning techniques. The studies are evaluated with respect to how they enhance model interpretability. The research is first categorised into ante-hoc and post-hoc approaches to interpretable machine learning. It is then examined by specific methodology and subjected to critical comparison. The analysis revealed that interpretable machine learning is extensively employed in the development of energy management systems, but that several challenges remain. Firstly, the use of diverse terminologies to describe model interpretability can lead to confusion. Secondly, comparing the performance of interpretable machine learning models across different tasks is difficult. Thirdly, the interpretability provided by SHAP and LIME techniques is limited in scope. Finally, the authors address the future research and development needed to improve the understanding of opaque models, which could accelerate the use of machine learning in building energy management.
Jhamb and Ahmed (2023) assess eight characteristics of energy-efficient buildings. The factors are relative compactness, surface area, wall area, roof area, height, orientation, glazing area, and glazing area distribution. The authors use the data provided by Tsanas and Xifara. Using the 768 residential building models in this dataset, heating and cooling requirements were forecast with a minimal mean squared error. Machine learning algorithms can use these attributes to estimate a structure's heating and cooling requirements accurately, thereby lowering energy consumption and enhancing quality of life. The study applied several statistical methods to examine the correlation between the input factors and the two output variables, identifying the input variable with the highest association with each output load. The Decision Tree Regressor exhibits the highest level of certainty. The 8-variable model provided an accurate prediction of the heating and cooling requirements for residential buildings.
Wang et al. (2024) developed a carbon evaluation model based on a process-oriented approach. Using Building Information Modelling (BIM), carbon emissions were quantified for the construction of building foundations in China, specifically in cases without basements. Correlation analysis was also employed to identify emission factors. The study found that building foundations emit between 100 and 2000 kilogrammes of carbon dioxide equivalent (CO2e) per square metre of foundation floor area. This implies the need to maximise carbon emission reduction during building design. Materials account for 78% to 97% of carbon emissions. BIM technology shows significant promise for mitigating carbon emissions during the design stage.
Zini and Carcasci (2024) utilise machine learning algorithms to develop an accurate monitoring technique that does not necessitate any user expertise. The proposed methodology was employed to evaluate the electricity consumption of HVAC (Heating, Ventilation, and Air Conditioning) components at an Italian hospital. The efficacy of the building energy monitoring technique is evaluated in terms of its ability to detect subtle variations in energy consumption patterns using the obtained models. This study assesses the advantages of implementing this technology on specific system components, even if it requires additional technical and budgetary resources for data collection. The proposed technology provides a viable alternative for enhancing intelligent building energy management, thanks to its high repeatability and seamless integration with centralised building energy management systems.
Zoubir, Er-retby, et al. (2024) and Zoubir, Es-sakali, et al. (2024) propose aligning energy consumption patterns with precise estimates of production peaks in order to improve energy utilisation. This strategy maximises energy efficiency by matching energy use to periods of high and low production. The study used meteorological and temporal factors as predictors, and the results were presented in a structured manner. LightGBM outperformed the other models, confirming its effectiveness in projecting PV energy.
Research gaps
Assessing energy efficiency using machine learning (ML) offers promising capabilities, but it also has several limitations that researchers must consider. Here are the key challenges and limitations in this area:
Precise machine learning models need large amounts of good data. Complete building statistics, including insulation levels, HVAC efficiency, occupancy patterns, and energy usage, are sometimes unavailable. Energy usage data may lack temporal or spatial resolution, whereas high-frequency data is needed to capture and analyse short-term variations in energy usage accurately. Raw data often needs considerable preprocessing to handle missing values, noise, and outliers before analysis. If not done carefully, this preprocessing may introduce biases or errors.
Building energy systems involve complex, non-linear interactions between weather, human activity, and building materials, and modelling these interactions correctly is difficult. Energy consumption patterns also change over time because of occupancy schedules, maintenance, and retrofits, and static models may not capture this dynamic behaviour. When trained on restricted or unrepresentative datasets that do not cover many building types and circumstances, machine learning models may overfit. Models trained on certain buildings or places may not generalise well because of differences in climate, construction codes, or tenant behaviour.
Many ML models, especially deep learning models, are “black boxes”, so understanding how input parameters affect energy efficiency is difficult. This lack of transparency can hinder confidence in, and acceptance of, the models. Decision-makers nevertheless need a clear rationale for predictions, even when the underlying algorithms are complex.
Advanced machine learning models require computational resources and time that not all researchers have. Applying machine learning models to large datasets or customising them for energy management can be difficult, as can integrating them with building management systems and procedures. In real-time energy management systems, machine learning models require robust and efficient algorithms that can operate within real-time processing constraints.
By addressing these limitations, the integration of ML in assessing building energy efficiency can become more robust, reliable, and widely applicable.
Material and method
Dataset
This dataset comes from the UCI Machine Learning Repository. It supports energy analysis based on 12 building types simulated in Ecotect. The buildings differ in glazing area, glazing area distribution, orientation, and other factors. Simulating scenarios over these attributes generates 768 building shapes. The dataset therefore has 768 samples and 8 features, which are used to predict two real-valued responses. Multi-class classification is also possible if each response is rounded to the nearest integer.
Relative compactness - measures compactness of the closure or building
Surface area - total surface area of the building
Wall area
Roof Area
Overall Height
Orientation
Glazing Area
Glazing Area Distribution
Heating load
Cooling load
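The attribute list above splits into eight input features and two responses, and each response can be rounded to the nearest integer to obtain class labels. The sketch below illustrates that split and rounding. The snake_case column names are hypothetical labels for the listed attributes, and the sample row contains illustrative values rather than a record guaranteed to appear in the dataset.

```python
import math

# Hypothetical column names for the eight features and two responses.
FEATURES = [
    "relative_compactness", "surface_area", "wall_area", "roof_area",
    "overall_height", "orientation", "glazing_area",
    "glazing_area_distribution",
]
TARGETS = ["heating_load", "cooling_load"]


def split_row(row: dict) -> tuple[list[float], list[float]]:
    """Separate one record into its feature vector and its two responses."""
    return [row[name] for name in FEATURES], [row[name] for name in TARGETS]


def to_class_label(load: float) -> int:
    """Round a real-valued load to the nearest integer (halves round up).

    Python's built-in round() rounds halves to the nearest even integer,
    so floor(x + 0.5) is used here to make the rule explicit.
    """
    return math.floor(load + 0.5)


# Illustrative record only.
sample = {
    "relative_compactness": 0.98, "surface_area": 514.5, "wall_area": 294.0,
    "roof_area": 110.25, "overall_height": 7.0, "orientation": 2,
    "glazing_area": 0.0, "glazing_area_distribution": 0,
    "heating_load": 15.55, "cooling_load": 21.33,
}

x, y = split_row(sample)
labels = [to_class_label(value) for value in y]
print(len(x), y, labels)
```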
Figure 1 demonstrates the distribution of the features.

Figure 1. Distribution of all the features.
A heatmap makes it possible to identify the features with the strongest relationships to heating and cooling load. This analysis highlights the main factors behind building energy usage, information that can improve predictive models and energy efficiency assessments (Alammar and Jabi, 2022; Amarkhil, 2023; Ceccarelli et al., 2022; Ibne Bashir, 2022). The heatmap of these features is shown in Figure 2.

Figure 2. Heatmap of all the features.
The heatmap shows how each component affects the integrated model's predictive abilities, underlining the benefits of a hybrid approach to evaluating heating and cooling load demands in building energy efficiency (Bebortta et al., 2023; Suguna et al., 2023). Figure 3 depicts the correlation plot of the features.

Figure 3. Correlation plot of all the features.
Figure 3 helps identify variables that strongly affect heating and cooling loads. This technique identifies the most important model characteristics, improving prediction accuracy. The plot emphasises highly related features to reduce dimensionality. Omitting features with little or no relationship to the target variable simplifies the model and reduces computing cost (Abedinia et al., 2023; Albayati et al., 2023; Reji et al., 2023; Tang et al., 2023).
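The correlation-based screening described above can be sketched as follows, assuming Pearson correlation as the measure of association and an arbitrary cut-off of 0.3. Both choices, the function names, and the toy values are assumptions made for illustration and are not taken from this study. Pearson correlation captures only linear association, so a low score does not rule out a non-linear effect.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(columns: dict, target: list[float], threshold: float = 0.3):
    """Score every feature against the target and keep those with |r| >= threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return scores, kept


# Toy values for illustration only.
columns = {
    "overall_height": [3.5, 3.5, 7.0, 7.0],
    "orientation": [2.0, 3.0, 2.0, 3.0],
}
heating_load = [10.0, 12.0, 28.0, 30.0]

scores, kept = select_features(columns, heating_load)
for name, r in scores.items():
    print(f"{name}: r = {r:+.3f}")
print("kept:", kept)
```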
Methods
Sustainable living and energy efficiency require accurate forecasts of building heating and cooling loads. Advanced machine learning approaches such as LSTM networks have changed this field. LSTM networks, a type of Recurrent Neural Network (RNN), are well suited to modelling time-dependent events, which makes them a natural choice for anticipating building energy demands from various parameters. Traditional RNNs struggle to capture long-term dependencies in sequential data, and LSTM networks were created to overcome this (Akhtar et al., 2023; Bhandarkar et al., 2023; Campodonico Avendano et al., 2023; Chen et al., 2023; Pronichev and Shishkov, 2023). LSTM networks use memory cells to store information for longer than traditional RNNs, which suffer from vanishing and exploding gradients. Because of this architecture, LSTM models can learn and remember long sequences, which makes them well suited to building energy load estimation. Building characteristics that affect energy usage must be considered when calculating heating and cooling load needs. The criteria include:
Building Geometry: Structure shape, size, and direction greatly affect energy efficiency. Larger buildings with more exposed surface area may need more heating or cooling.
Insulation and Materials: Heat transfer depends on construction materials’ thermal properties, such as insulation. Well-insulated buildings use less heating and cooling.
Window-to-Wall Ratio: The number and size of windows and their thermal efficiency affect heat gain or loss.
HVAC Systems: Energy loads depend on HVAC efficiency.
Occupancy Patterns: Human activity in the structure raises internal temperatures, affecting heating and cooling needs.
Weather Conditions: Temperature, humidity, and solar radiation outside a building affect how much energy it uses.
LSTM networks can capture temporal connections, making them ideal for predicting time-varying energy usage trends. Long Short-Term Memory (LSTM) models may adapt to changes in building occupancy, weather, and other variables by learning from new data (Boutahri and Tilioua, 2024; Chen et al., 2024; Jhamb and Ahmed, 2023; Kini et al., 2023; Riva et al., 2024). LSTM models integrate various building elements to provide accurate forecasts, improving energy management and optimisation. LSTM networks can be scaled to analyse large datasets, making them suitable for complex structures and collections (Karatzas et al., 2024; Silvestri et al., 2024; Zheng et al., 2024).
Enhancing the energy efficiency of buildings is a crucial component of sustainable development, since it directly supports the preservation of the environment and leads to significant economic savings (Izonin et al., 2024; Rätz et al., 2024; Si et al., 2024). Accurately evaluating heating and cooling load demands is crucial for maximising energy efficiency in buildings. The internal configuration of the model is defined in Table 1.
Table 1. Internal configuration.
Total params: 26,114
Although beneficial, LSTM networks are difficult to use for energy load prediction. The challenges include the need for ample, high-quality data, significant computational resources for training, and the difficulty of interpreting complex models. Future research could focus on hybrid models that combine LSTM with other machine learning methods to improve interpretability and resilience. LSTM networks can assess building heating and cooling load demands from numerous parameters. By anticipating energy loads accurately, these models help optimise energy use, reduce operational costs, and promote sustainable living. LSTM networks will become more useful in building energy management as they are integrated with other technologies and as data quality improves. Figure 4 depicts the basic structure of an LSTM network (Ali et al., 2024; Di Giovanni et al., 2024; Zoubir, Er-retby, et al., 2024).

Figure 4. Basics of an LSTM network.
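To make the role of the memory cell concrete, the sketch below implements one step of a standard LSTM cell with a single hidden unit and a scalar input. It uses the textbook gate equations and is not the configuration used in this study. The parameter names and values are hypothetical, and a practical model would use vector-valued states and learned weights.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_step(x: float, h_prev: float, c_prev: float, p: dict):
    """One time step of a scalar LSTM cell.

    p holds an input weight (w*), recurrent weight (u*), and bias (b*)
    for the forget (f), input (i), and output (o) gates and the candidate (g).
    """
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # what to keep
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # what to write
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # what to expose
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # memory cell: the additive update carries long-term state
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c


# Hypothetical, untrained parameters for illustration only.
params = {
    "wf": 0.5, "uf": 0.1, "bf": 1.0,
    "wi": 0.4, "ui": 0.2, "bi": 0.0,
    "wo": 0.3, "uo": 0.1, "bo": 0.0,
    "wg": 0.8, "ug": 0.3, "bg": 0.0,
}

h, c = 0.0, 0.0
for load in [0.2, 0.5, 0.4, 0.7]:  # a short, scaled toy sequence
    h, c = lstm_cell_step(load, h, c, params)
print(f"final hidden state = {h:.4f}, final cell state = {c:.4f}")
```

The forget gate scales the previous cell state instead of overwriting it, which is the mechanism that lets information persist across many time steps.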
Figure 5 presents a systematic methodology for using LSTM networks to forecast heating and cooling load demands, a critical factor in enhancing building energy efficiency. Conventional approaches frequently prove inadequate because the influencing factors, such as meteorological conditions, occupancy patterns, and construction materials, are intricate and fluctuating. Progress in machine learning offers encouraging solutions. The combination of Long Short-Term Memory (LSTM) networks with Gradient Boosting Machines (GBM) is a potent method for enhancing the precision and resilience of energy efficiency forecasts.

Figure 5. LSTM networks for predicting heating and cooling load requirements.
LSTM networks, a variant of recurrent neural networks (RNN), are specifically engineered to process sequential input and capture intricate relationships over long periods of time. These characteristics make them well-suited to time series forecasting tasks, specifically projecting energy consumption patterns that vary over time (Álvarez-Sanz et al., 2024; Ismail et al., 2024; Nur-E-Alam et al., 2024; Park et al., 2024; Wang et al., 2024; Zini and Carcasci, 2024; Zoubir, Es-sakali, et al., 2024). LSTM networks can learn from past data to detect patterns and periodic variations, which are essential for estimating the heating and cooling demands of buildings. An LSTM network can be trained on historical weather data, occupancy patterns, and past energy usage to forecast future energy demands. By learning how these factors affect energy consumption over time, LSTM models can offer precise predictions that aid the efficient planning and management of energy resources.
GBMs are an ensemble learning approach that constructs several decision trees sequentially. Each subsequent tree corrects the errors of its predecessors, yielding a highly accurate predictive model. GBM is a powerful tool for representing intricate, non-linear relationships between input features, which makes it well-suited to evaluating the various elements that affect building energy efficiency. GBM models can include a diverse set of features, ranging from fixed building attributes such as insulation levels and window types to dynamic aspects like real-time occupancy and weather conditions. By learning from this wide range of inputs, GBM can offer comprehensive insight into how many parameters affect heating and cooling loads.
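The sequential error-correction idea can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes a single numeric feature, squared-error loss, and one-split trees (stumps), and the names fit_stump, fit_boosting, and predict_boosting are hypothetical. Libraries such as XGBoost or LightGBM add regularisation, deeper trees, and many other refinements.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Find the one-threshold split on a 1-D feature that minimises squared error."""
    candidates = sorted(set(xs))[:-1]  # splitting at the maximum would leave the right side empty
    if not candidates:
        raise ValueError("need at least two distinct feature values")
    best = None
    for threshold in candidates:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors left by earlier stumps
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def predict_boosting(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

With squared-error loss, each round cannot increase the training error, because every stump predicts the mean residual on each side of its split. This is why the learning rate and the number of rounds are usually tuned together.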
The combination of LSTM and GBM capitalises on the advantages of each model, leading to a holistic strategy for forecasting building energy demands. The integration method commonly entails utilising the LSTM network to predict time-dependent factors, such as forthcoming weather conditions and occupancy patterns. Subsequently, these predictions are employed as supplementary attributes in the GBM model, which integrates them with other pertinent inputs to anticipate heating and cooling demands. The initial phase of this integration process entails gathering and preparing data. Data on historical energy use, weather conditions, occupancy patterns, and building attributes are collected and processed to remove any errors or inconsistencies. Feature engineering is essential, as it involves the creation of new features that capture significant patterns and trends. In order to utilise LSTM, the data needs to be structured as a time series, whereas GBM necessitates a comprehensive collection of features that encompasses both static and dynamic factors.
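Structuring data as a time series for an LSTM usually means cutting the history into fixed-length input windows, each paired with a later target value. The sketch below shows one common way to do this. The function name make_windows and the parameters lookback and horizon are illustrative assumptions, not details taken from this study.

```python
def make_windows(series, lookback, horizon=1):
    """Turn a 1-D series into (input window, target) pairs for supervised training.

    lookback: number of past steps in each input window (assumed value, to be tuned)
    horizon:  how many steps ahead the target lies (1 = next step)
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        end = start + lookback
        inputs.append(list(series[start:end]))    # the past `lookback` observations
        targets.append(series[end + horizon - 1])  # the value to predict
    return inputs, targets
```

For example, a series of six hourly load readings with lookback=3 yields three training pairs, and a series shorter than lookback + horizon yields none.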
An LSTM network is designed and trained on the temporal data. The network's architecture, including the number of layers and neurons, is tuned to capture temporal dependencies effectively. Once trained, the LSTM model produces predictions for the time-varying factors, giving a view of the future conditions that influence energy consumption. These LSTM forecasts are incorporated into the GBM model as supplementary features. The GBM model is then trained on the enhanced feature set, using the combined data to forecast heating and cooling demands. The iterative training of decision trees in GBM helps the model capture intricate relationships in the data, resulting in more accurate and dependable predictions. Assessing the integrated model is crucial to verify its precision and dependability. Cross-validation is used to evaluate model performance, and the hyperparameters of both LSTM and GBM are optimised using approaches such as grid search or random search. Evaluation measures such as RMSE and MAE quantify the accuracy of the model's predictions. The fusion of LSTM and GBM offers several advantages in evaluating the energy efficiency of buildings:
Temporal Accuracy: The LSTM model efficiently captures sequential relationships, improving its ability to forecast energy consumption patterns that vary over time.
Feature Enrichment: The GBM model uses the forecasts generated by the LSTM as additional features, improving overall predictive accuracy by capturing non-linear correlations.
Robustness: Combining the models reduces the limitations of each individual model, resulting in forecasts that are more resilient and dependable.
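The feature-enrichment step can be sketched as a simple data-flow operation: each row of tabular building features receives one extra column holding the first-stage forecast. In the sketch below, naive_forecast is a deliberately simple stand-in (a window mean) for the trained LSTM, and enrich_features is a hypothetical helper. Neither is taken from this study, and the feature values in any usage are placeholders.

```python
def naive_forecast(window):
    """Stand-in for a trained LSTM (assumption): predicts the mean of the recent window."""
    return sum(window) / len(window)

def enrich_features(static_rows, recent_windows, forecaster=naive_forecast):
    """Append one first-stage forecast to each row of tabular features.

    static_rows:    one list of building features per sample
    recent_windows: one list of recent observations per sample
    forecaster:     any callable mapping a window to a single forecast value
    """
    if len(static_rows) != len(recent_windows):
        raise ValueError("each feature row needs exactly one history window")
    return [list(row) + [forecaster(window)]
            for row, window in zip(static_rows, recent_windows)]
```

The enriched rows would then be passed to the gradient boosting model in place of the original feature table. Swapping the forecaster argument for a wrapper around a real LSTM's prediction call leaves the rest of the flow unchanged.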
Combining LSTM networks with Gradient Boosting Machines improves the evaluation of building energy efficiency through heating and cooling load demand. By combining the benefits of both models, the hybrid model forecasts energy demand and supports better building management. Because the integrated technique predicts energy demands accurately, it advances sustainable energy management and energy-efficient buildings, benefiting both the environment and the economy. Building energy efficiency and sustainability will continue to benefit from integrated strategies as machine learning technology progresses.
Results and analysis
Experimental setup
To implement the proposed model on the given dataset, a comprehensive experimental setup is required, which includes both hardware and software components.
Minimum hardware requirements
Central Processing Unit: A multi-core central processing unit (CPU), such as an Intel Core i7 or AMD Ryzen 7 series processor, is necessary to handle the computational demands of training deep learning and gradient boosting models.
Memory: To effectively manage extensive datasets and fulfil the memory requirements of the models, a minimum of 16GB of RAM is essential.
Storage: Sufficient storage capacity to accommodate datasets, model checkpoints, and experimental results.
GPU: A dedicated graphics processing unit (GPU) can significantly accelerate the training of deep learning models such as LSTM. Examples include NVIDIA GeForce GTX/RTX cards, which support CUDA, and AMD Radeon series cards.
Software prerequisites
Python programming language: The programming language used for building the models and conducting experiments.
Data analysis and manipulation: The NumPy, Pandas, and scikit-learn libraries are used for data preprocessing and feature engineering.
Deep Learning Framework: TensorFlow or PyTorch for implementing LSTM networks.
Gradient Boosting Framework: XGBoost, LightGBM, and CatBoost are all viable choices for implementing the Gradient Boosting Machine (GBM) model.
Integrated Development Environment (IDE): Jupyter Notebook, Google Colab.
Initially, the authors define the feature importance for predicting the heating load in Tables 2 and 3.
Heating load predicting feature importance.
Cooling load predicting feature importance.
Building heating load prediction requires identifying the model's most important features. Understanding feature significance improves model interpretation, optimisation, and decision-making. To assess the importance of these features, the authors use machine learning models such as Gradient Boosting Machines (GBM), which provide built-in feature importance scores.
In Figure 6, high-ranking features substantially affect heating load projections, so they must be monitored and measured precisely. Low-significance features have little effect on predictions, and removing them may simplify the model without affecting performance. Knowing which features matter most for heating load estimation helps in constructing efficient building energy management systems. Focusing on the most important features simplifies the models and improves their performance and projections, which in turn supports energy efficiency and sustainability in building operations. Next, the authors define the feature importance for predicting the cooling load in Tables 2 and 3.

Heating load predicting feature importance.
The figure indicates that outdoor temperature, humidity, and solar radiation are relevant features. These factors account for most of the cooling load. Weather outweighs interior heat gains, occupancy, and HVAC system efficiency. The cooling load estimate is also affected by the building's age, construction materials, and form; however, these are less important.
Heating and cooling loads are crucial to energy-efficient building design and operation. The distribution of glazing area in a building greatly affects these loads. As shown in Figures 7 and 8, the glazing area (all windows in the building envelope) affects solar gain, heat loss, and thermal performance. A larger glazing area increases cooling loads because of solar heat gains. In bright, cold locations, additional glazing may reduce heating loads through passive solar heating, as shown in Figure 9.

Cooling load predicting feature importance.

Heating load & cooling load by glazing area distribution.

Heating load & cooling load coefficients.
Due to heat loss, heating demands may rise under darker or colder conditions. An optimal glazing area balances heating and cooling loads, reducing energy use while maintaining thermal comfort. The size and placement of glazed areas affect building heating and cooling needs, as shown in Table 4.
Model for heating load prediction.
By evaluating the environment, using advanced glazing technologies, and optimising glazed areas with shading devices, designers can improve building energy efficiency. Additional research and simulations tailored to diverse building types and locales can refine these techniques, supporting sustainable and energy-efficient construction. Figure 10 depicts the performance of various models for heating load and cooling load.

Heating load & cooling load performance.
As shown in Figure 11, LSTM networks and GBM provide a resilient framework for building heating and cooling load forecasting. The reduced RMSE values show that this hybrid method captures temporal dependencies and complicated nonlinear interactions to improve prediction accuracy. This improved predictive capacity improves energy management, optimising heating and cooling operations, reducing energy use, and saving money.

Proposed model performance in terms of RMSE.
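For reference, the two error measures used throughout this evaluation can be computed as follows. This is a minimal sketch using the standard definitions. The function names are illustrative, and the inputs are assumed to be equal-length, non-empty sequences of numbers.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean absolute error: the average size of the error, in the units of the load."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)
```

RMSE is never smaller than MAE on the same predictions, and a large gap between the two indicates that a few large errors dominate.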
Various machine learning techniques can predict building cooling loads, as shown in Table 5. Linear models like LR are simple but may not capture complex patterns effectively. Advanced models such as long short-term memory (LSTM) networks and gradient boosting machines (GBM) make more accurate predictions but demand more computational resources and add complexity. Choosing a model means weighing accuracy, interpretability, and processing efficiency against the application's needs and restrictions, as shown in Figure 12.

Heating load & cooling load model in scenario 1 - using all features.
Model for cooling load prediction.
Ablation study
The ablation study shows that both the LSTM and GBM components improve the integrated model's predictive ability. The LSTM network captures temporal dependencies well, while the GBM model captures complex nonlinear interactions to boost prediction accuracy. Table 6 presents Scenario 1, which uses all features.
Scenario 1 - using all features.
The merged LSTM-GBM model achieves the lowest RMSE and MAE, indicating synergy between its components. Removing temporal variables from the LSTM model, or LSTM predictions from the GBM model, increases errors, confirming that both components contribute to the hybrid technique, as shown in Figure 12.
The model trained with Surface_Area as an additional feature [Scenario 2] performs worse than the model with the Glazing_Area and Wall_Area features [Scenario 3]. Performance decreases significantly for the MLP model.
The integrated model can forecast heating and cooling needs in real time, enabling proactive adjustments that could save energy and improve comfort. The combined LSTM-GBM model optimises energy use for sustainable building operations, and the resulting energy conservation cuts costs. Integrating local temperature data and building-specific characteristics allows the system to be tailored to different structures and geographic areas, so the approach can be employed in many building applications. To improve data quality and prediction, the LSTM-GBM model could be enhanced with IoT sensors and intelligent devices, which could lead to more accurate and adaptable energy management systems. Automating hyperparameter tuning can speed up model training and improve performance. The model could also be extended to cover solar panels and wind turbines, which would further support building sustainability. Table 7 presents Scenario 2, which uses ‘Surface_Area’ and ‘Wall_Area’ as additional features.
Scenario 2 - using features ‘Surface_Area’ & ‘Wall_Area’ as additional features.
LSTM networks and Gradient Boosting Machines forecast building heating and cooling loads effectively, as shown in Figure 13.

Heating load & cooling load model adjusted R2.
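Adjusted R2 corrects the ordinary R2 for the number of input features, which matters when scenarios with different feature counts are compared. The sketch below uses the standard formula, 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. The function names are illustrative, and the study's own computation may differ in detail.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(actual, predicted, n_features):
    """R2 penalised for the number of features used by the model."""
    n = len(actual)
    if n - n_features - 1 <= 0:
        raise ValueError("need more samples than features plus one")
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
```

For an imperfect fit, adding features that do not reduce the residual error lowers the adjusted value. This makes it a fairer basis than plain R2 for comparing a model that uses all features with one that uses a reduced set.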
By combining both models, this strategy improves prediction accuracy, energy control, and building sustainability. Although challenging to implement, the integrated strategy is valuable for energy-efficient buildings because of its benefits and prospective improvements. Table 8 presents Scenario 3, which uses ‘Wall_Area’ and ‘Glazing_Area’ as additional features.
Scenario 3 - using features ‘Wall_Area’ & ‘Glazing_Area’ as additional features.
Meanwhile, there is little difference between using all features [Scenario 1] and using only the important features plus Glazing_Area and Wall_Area [Scenario 3]. The authors observed the following during the ablation study:
MLP/SVR/RF/GBoost - no significant difference
KNN - significant improvement using important features only
XGBoost - lower performance using important features only
The authors use only five of the eight features to reduce computation time.
Discussion
The integration of LSTM networks and Gradient Boosting Machines (GBM) is a notable advance in predictive modelling for building energy efficiency. This hybrid method uses both LSTM networks and GBMs to estimate heating and cooling needs, which enhances forecast accuracy and reliability. Because they capture temporal relationships and long-term patterns in sequential data, LSTM networks are well suited to modelling building energy use, which is affected by weather and occupancy patterns. GBMs are excellent at modelling complex non-linear feature relationships, and integrating LSTM-generated features allows the GBM to use temporal insights to improve prediction accuracy. Empirical evidence shows that the integrated strategy outperforms the separate models on accuracy measures such as RMSE and MAE. The approach also has costs. Using two advanced models requires careful data preprocessing, including normalisation, missing-value handling, and arranging the data separately for the LSTM and GBM models; temporal alignment for the LSTM and tabular organisation for the GBM can be difficult. Training and optimising both models demands significant computational resources and expertise, and the hyperparameter tuning involved can be time-consuming and complicated. Deploying the integrated model in a live Building Management System (BMS) requires smooth integration and real-time data stream management, and keeping real-time forecasts robust and dependable adds further complexity.
Conclusion and future scope
This research shows that combining LSTM networks and Gradient Boosting Machines (GBM) can accurately estimate building heating and cooling load demands, supporting energy efficiency. The hybrid model uses LSTM networks, which are good at capturing temporal relationships in time series data, and GBM, which is good at modelling complex nonlinear interactions. Performance measures such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) show that the integrated method improves prediction accuracy over standalone models. The LSTM model's initial projections, together with the GBM's ability to refine them using additional attributes, make energy load forecasting more resilient. More precise load forecasts improve energy management by optimising building heating and cooling, reducing energy use, and saving money. The integrated model can be used in Building Management Systems (BMS) to make real-time adjustments and optimise building performance.
Though the results are promising, several areas remain for research and improvement:
Integrating more data, such as real-time IoT sensor data, could increase model accuracy and responsiveness.
Adding building-specific attributes, tenant behaviour patterns, and external environmental factors could improve the model's forecasts.
Fine-tuning the hyperparameters of the LSTM and GBM models with advanced optimisation methods could improve performance.
Combining other machine learning and deep learning models, such as LSTM with Random Forests, may reveal new insights and improvements.
Assessing the model's ability to handle different building types and sizes and to adapt to different climates would support wider adoption.
Creating frameworks that seamlessly integrate and handle real-time data could improve BMS operational efficiency.
Investigating predictive models for energy storage and renewable energy sources could improve energy use and sustainability.
Improving Building Management System (BMS) user interfaces to provide practical insights and feedback from model predictions could simplify and improve energy management.
These future directions can improve LSTM network-Gradient Boosting Machine integration for sustainable and energy-efficient constructions. This ongoing research will help create smarter, greener building systems that combat climate change and promote sustainability.
Footnotes
Author contributions
The authors confirm their contribution to the paper as follows: study conception and design: RB, SA and SR; data collection: KR, MMS and MAS; analysis and interpretation of results: AS, SR and RB; draft manuscript preparation: RB, and AS. All authors reviewed the results and approved the final version of the manuscript.
Availability of data and materials
Publicly available datasets were analysed in this study.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
