Abstract
This paper deals with short-term load forecasting for energy management systems in small and middle-sized buildings. Unlike existing studies that focus on forecasting accuracy, this study examined candidate load forecasting methods with regard to convenience and cost-efficiency. Three-year energy use patterns of office buildings were identified according to calendar data and weather data. Simple forecasting equations were derived from regression analyses using linear, seasonal linear, and quadratic models. The quadratic model was found to be the most appropriate for Korea’s climate, which has four distinct seasons. The forecasting equation derived from the energy consumption of 2017 was verified by applying the weather data of 2018 to the equation and comparing the resulting forecast with the actual energy consumption of 2018. The results will be used in our simple load forecasting system, which requires sensors only in some representative buildings of similar shape rather than in all the target buildings, and which calculates energy consumption forecasts for each target building using the least possible data.
Introduction
An energy management system (EMS) facilitates efficient energy use by monitoring and controlling the energy supply and demand. Generally, the EMS is utilized to operate a nationwide power grid that encompasses generation, transmission, and distribution. Traditionally, power generation was concentrated in large plants and the energy thus produced was consumed in remote cities or plants. A nationwide EMS was appropriate for this purpose. However, renewable energy systems are installed near users; therefore, an EMS is required for small-scale consumers such as cities, building complexes, and plants. In power plant construction, the conventional large-scale EMS has been used to reflect the load forecasts and stably operate power transmission and distribution systems (Amjady and Daraeepour, 2010; Cerjan et al., 2019; Hyndman and Fan, 2009; Zhang et al., 2019). Recently, small-scale EMSs have been used to predict the energy consumption in a building or a plant in order to determine the optimal size of the renewable energy source and calculate the installed capacity of the energy storage system (ESS). Such an EMS is also used to determine the charge and discharge capacities and cycles of the ESS, which are valuable data for reducing the energy cost (Wen et al., 2019). Accordingly, the importance of load forecasting technologies for EMS is recognized not only by operators of traditional power grids but also by energy managers of plants and buildings (Ferrari et al., 2019; Finck et al., 2019).
Load forecasting methods are mainly classified into engineering methods, statistical methods, and artificial intelligence-based methods (Fallah et al., 2019). Engineering methods predict the energy consumption based on the detailed data of the building characteristics, HVAC system, and weather conditions, which influence the energy performance. DOE-2, EnergyPlus, BLAST, ESP-r, and many other software programs have been developed for this purpose (Glavan et al., 2019; Zhang and Wen, 2019; Zhao and Magoulès, 2012). Several mechanical and architectural engineers have actively employed engineering methods. Engineering methods are accurate but demand extensive detailed information on building and environmental parameters.
Statistical methods are the most widely implemented for load forecasting. These methods utilize past energy consumption data, calendar data, and weather data (Fan and Hyndman, 2011). Regression analysis is mainly used to calculate a forecasting equation by deriving a correlation between the factors influencing energy consumption and the actual energy consumption (Tso and Yau, 2007). Stochastic methods, exponential smoothing methods (Taylor, 2011), and the ARIMA method (Huang and Shih, 2003; Taylor and McSharry, 2007) are examples of such methods. However, statistical methods can hardly respond to unprecedented changes such as the addition of equipment or the extension of a building.
Artificial neural network (ANN) and support vector machine (SVM) are the most representative artificial intelligence-based methods (Barman and Choudhury, 2019; Bashir and EL-Hawary, 2009; Chen et al., 2009; Hong and Fan, 2019; Lee et al., 2019; Taylor and Buizza, 2002). Since 2000, many studies have been actively conducted for large-scale load forecasting from the perspective of a power grid operator. Unlike statistical methods, ANN-based and SVM-based methods can extract a non-linear correlation between various types of data; therefore, the accuracy of these methods is higher than that of the other methods (Ahmad et al., 2019). Among the studies on building load forecasting, Ekici and Aksoy (2009) analyzed the exterior types and characteristics of a building and used transparency (%), insulation thickness (cm), and direction as the input data to calculate the heating energy load. However, these methods require high CPU usage that can process a complex algorithm in real time (Bouktif et al., 2019; Gao et al., 2019).
Based on the above three types of load forecasting methods, many studies have proposed various modified forecasting models, analyzed their applications, and compared their performances. However, except for the study by Rueda et al. (2019), all the studies focused on the accuracy of the forecasting models. In contrast, this study is mainly interested in a simple and cost-efficient load forecast that can be easily installed for the EMS in a building. Since 2017, the South Korean government has mandated the installation of an EMS in new buildings with an area of 10,000 m² or more. For the efficient management of energy in smaller buildings, this paper presents simple and low-cost forecasting models for buildings below 10,000 m² that do not need to install an EMS. Owners of small and middle-sized buildings want to reduce energy costs to a certain degree without large investment in equipment or constant maintenance. Accordingly, this study employs statistical methods that utilize only easily accessible data, rather than artificial intelligence-based methods, which require extensive data and constant CPU operation, or engineering methods, which require information on major energy-use equipment and building characteristics.
Among statistical methods, regression models, semi-parametric models, exponential smoothing models, and autoregressive moving average models are well known (Hong and Fan, 2016). Geng et al. (2018) describe linear regression models with inflection points according to outdoor temperature for use in building energy performance diagnosis. Various regression models, including multiple linear models, polynomial models, and hybrid artificial intelligence models, have been verified in a variety of applications: homes (Fumo and Biswas, 2015; Tso and Yau, 2007), holidays (Wi et al., 2011), and natural gas consumption (Soldo et al., 2014).
Semi-parametric models are based on regression models but add non-linear components. The model proposed by Fan and Hyndman (2011) adds lagged actual demand as the default input for weather effects. Goude et al. (2014) proposed an optimum semi-parametric model that uses calendar information and temperature as its primary inputs, adds holiday and daylight saving time effects, and was applied to 2200 substations in France. It showed the ability to predict energy consumption at a very large number of points on the French distribution grid.
Exponential smoothing models reduce the weight of historical energy data exponentially over time. Taylor and McSharry (2007) suggested exponential smoothing models with good accuracy, but these are suitable for stable weather conditions over large areas, such as the national scale (Hong et al., 2014). Autoregressive moving average (ARMA) models combine two polynomials: an autoregressive part and a moving-average part. Although Pappas et al. (2010) and Lee and Ko (2011) used these models to produce meaningful results, the class of models is rich, model selection is subjective, and a lot of data is required. This is sometimes an obstacle to application (Dudek, 2016).
As mentioned above, there are various statistical models, each with its own advantages. As is well known, energy consumption patterns vary with circumstances such as season, weather, economy, and area. This study targets a simple and low-cost load forecasting method for office buildings in South Korea, which has four distinct seasons.
Background
Objective
As mentioned in the Introduction, the main interest of this study is not high accuracy but the realization of a simple and cost-efficient load forecasting method that can be used in office buildings. Figure 1 shows our ultimate objective. It is assumed that buildings of the same type, such as homes, offices, or stores, have similar energy consumption patterns. Energy usage sensors will be installed in only some buildings. Where a sensor is installed, the energy consumption is predicted by the simple regression models proposed in this paper. Where no sensor is installed, data from neighboring buildings with sensors and easily available information such as electricity bills will be used. Our ultimate objective is to implement a load forecasting system that requires sensors only in some sample buildings of similar shape rather than in all the target buildings, as depicted in Figure 1, and that calculates load forecasts for each target building using the least possible data (Chen et al., 2009; Fan et al., 2019).

Figure 1. Our project’s ultimate objective and an overview of this research.
Weather in Korea
The Republic of Korea lies in the temperate zone and experiences four distinct seasons. Geographically, it is located in the middle latitudes of the Northern Hemisphere. Each season has different wind directions and rainfall characteristics. The winter weather is cold and dry owing to the influence of the Siberian high-pressure system, whereas the summer weather is hot and humid because of the North Pacific anticyclone. The rainy season, or monsoon, starts in the middle of June in the southern region and continues for approximately 30 days. Accordingly, the summer rainfall accounts for as much as 50–60% of the annual total rainfall. In most parts of the nation, the annual average humidity ranges from 60% to 75%. In July and August, the average humidity reaches 70–85%. Spring and autumn are influenced by migratory anticyclones and are associated with clear and dry days. The average humidity in March and April, which are spring months, is 50–70%. Figure 2 shows the daily average temperature, humidity, wind speed, and solar radiation of a particular region in Korea from 2016 to 2018. The temperatures are high during summer, that is, from June to August, and fall below zero in winter, that is, from November to February. Thus, the temperature variation reflects the four distinct seasons of Korea (Min et al., 2011).

Figure 2. Trends of energy consumption in the sample buildings from 2016 to 2018 and trends of daily average temperature, humidity, wind speed, and solar radiation for the same period in Korea. The same pattern, with the highest temperature in summer and the lowest temperature in winter, repeats at one-year intervals.
Sample building
A group of six buildings with a gross area of 53,828 m² was selected as the sample. In total, 111 knowledge enterprises have offices in these buildings. As of 2017, the total energy consumption was 5,789 MWh, which was mostly supplied by the power grid. This study used load data collected at 15 min intervals by the telemetering system of Korea Electric Power Corporation (KEPCO), which operates the power grid of Korea. This study also utilized hourly and daily weather data provided by the Korea Meteorological Administration (KMA). The weather data, namely, temperature, wind speed, solar radiation, and humidity, were measured at an observatory located 16 km from the sample buildings. The humidity measurements were obtained eight times a day (03:00, 06:00, 09:00, 12:00, 15:00, 18:00, 21:00, 24:00) (Table 1).
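The regression analyses later in the paper use daily energy consumption, whereas the meter data arrive at 15 min intervals. A minimal sketch of that aggregation step follows, assuming each reading is a (timestamp, kWh) pair. The function names and sample values are hypothetical, and public holidays are not handled.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(readings):
    """Sum interval readings given as (timestamp, kWh) into per-day totals."""
    totals = defaultdict(float)
    for timestamp, kwh in readings:
        totals[timestamp.date()] += kwh
    return dict(totals)


def is_weekend(day):
    """Saturday (5) and Sunday (6) are treated as weekend days."""
    return day.weekday() >= 5


# Hypothetical sample: four readings on a Monday, two on a Saturday.
readings = [
    (datetime(2017, 1, 2, 0, 0), 40.0),
    (datetime(2017, 1, 2, 0, 15), 42.0),
    (datetime(2017, 1, 2, 0, 30), 41.0),
    (datetime(2017, 1, 2, 0, 45), 39.0),
    (datetime(2017, 1, 7, 0, 0), 20.0),
    (datetime(2017, 1, 7, 0, 15), 21.0),
]
totals = daily_totals(readings)
for day in sorted(totals):
    label = "weekend" if is_weekend(day) else "weekday"
    print(day, label, totals[day])
```

Splitting the daily totals by day type at this stage matches the separate weekday and weekend regressions used in the analysis.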
Table 1. Sample buildings information.
Table 2. Simple linear model of weather and energy consumption from 2016 to 2018.
MAPE: mean absolute percentage error.
Simple load forecasting method
Correlation between weather and building energy consumption
Figure 3 illustrates a correlation between energy consumption of the sample building and temperature, wind speed, humidity, and solar radiation. As mentioned previously, the goal of this study was not to achieve high accuracy but to identify a simple and cost-efficient load forecasting method. In this regard, wind speed, humidity, and solar radiation show little correlation with energy consumption. On the other hand, temperature displays a clear hockey stick-shaped correlation with the energy consumption (Moral-Carcedo and Pérez-García, 2019; Xie et al., 2016). In Figure 3(a), the energy consumption according to temperature is distributed into the upper and lower parts, which correspond to weekdays and weekends, respectively. Beyond the temperature of approximately 13°C, the energy consumption increased with rising temperature. At temperatures below this, the energy consumption increased as the temperature decreased. It can be inferred that the air-conditioning system was operated at high temperatures, whereas the heating system was operated at low temperatures, both of which increased the energy consumption.
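A hockey stick-shaped relation means that a single correlation coefficient over the whole temperature range understates the dependence, because the heating and cooling branches have opposite slopes. The sketch below illustrates this on synthetic V-shaped data, not the measured values. The 13°C threshold is taken from the text, while the function names and the data are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_regime(temps, loads, threshold=13.0):
    """Correlation below and above an assumed turning temperature (deg C)."""
    cold = [(t, e) for t, e in zip(temps, loads) if t < threshold]
    warm = [(t, e) for t, e in zip(temps, loads) if t >= threshold]
    return pearson(*zip(*cold)), pearson(*zip(*warm))


# Synthetic V-shaped data for illustration only.
temps = [-5, 0, 5, 10, 15, 20, 25, 30]
loads = [16000 + 300 * abs(t - 13) for t in temps]

r_all = pearson(temps, loads)
r_cold, r_warm = correlation_by_regime(temps, loads)
print(f"all: {r_all:.2f}, below 13: {r_cold:.2f}, above 13: {r_warm:.2f}")
```

On such data the overall coefficient is close to zero while each branch is almost perfectly linear, which is why the models that follow treat the heating and cooling regimes separately or use a curved function.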

Figure 3. Correlation between the energy consumption of a sample building and (a) temperature, (b) wind speed, (c) humidity, and (d) solar radiation in the year 2017. Temperature and energy consumption show a hockey stick-shaped correlation.
Regression model
As illustrated in Figure 4, there are three types of regression analysis models. Figure 4(a) is a simple linear model suitable for regions with temperate climate. This model was utilized by Fumo and Biswas (2015).

Figure 4. (a) Simple linear, (b) seasonal linear, and (c) quadratic models.
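The simple linear model is written as
$$E = \alpha_1 t + \beta_1 \tag{1}$$
where E is the energy consumption, t is the temperature, α1 is the sensitivity of the energy consumption to temperature, and β1 denotes the base load. This model does not seem to be suitable for Korea, which has four distinct seasons including winter.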
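The simple linear model can be fitted by ordinary least squares. The sketch below assumes the form E = α1·t + β1, with t the daily average temperature and E the daily energy consumption. The function name and the synthetic data are illustrative and are not taken from the study.

```python
def fit_linear(temps, loads):
    """Ordinary least squares for E = alpha1 * t + beta1."""
    n = len(temps)
    mean_t = sum(temps) / n
    mean_e = sum(loads) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (e - mean_e) for t, e in zip(temps, loads))
    alpha1 = sxy / sxx               # sensitivity to temperature
    beta1 = mean_e - alpha1 * mean_t  # base load (intercept)
    return alpha1, beta1


# Synthetic, noise-free data: load rises by 200 per degree from a base of 14000.
temps = [20.0, 24.0, 28.0, 32.0]
loads = [14000.0 + 200.0 * t for t in temps]

alpha1, beta1 = fit_linear(temps, loads)
print(f"alpha1 = {alpha1:.1f}, beta1 = {beta1:.1f}")
```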
Geng et al. (2018) distinguished between the heating degree-day (HDD) section with a linear decrease and cooling degree-day (CDD) section with a linear increase by using a correlation between building energy consumption and temperatures of a particular region in China. The results of applying this distinction to the four distinct seasons of Korea are illustrated in Figure 4(b).
The HDD section corresponds to winter, the CDD section corresponds to summer, and the no degree-day (NDD) section, which is less sensitive to temperatures, is allocated to spring and autumn, as expressed in the following equation.
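Consistent with this description, the seasonal linear model can be written piecewise, with one slope and one intercept per section. The subscripted symbols below are notation assumed here for illustration; E is the daily energy consumption and t is the daily average temperature.

```latex
E(t) =
\begin{cases}
\alpha_{\mathrm{HDD}}\, t + \beta_{\mathrm{HDD}}, & \text{winter (HDD section)} \\
\alpha_{\mathrm{NDD}}\, t + \beta_{\mathrm{NDD}}, & \text{spring and autumn (NDD section)} \\
\alpha_{\mathrm{CDD}}\, t + \beta_{\mathrm{CDD}}, & \text{summer (CDD section)}
\end{cases}
```

Given the linear decrease in the HDD section and the linear increase in the CDD section, the HDD slope is expected to be negative and the CDD slope positive.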
Valor et al. (2001) conducted a regression analysis of the energy consumption and temperatures in Spain by using quadratic equations (Wi et al., 2011).
The first coefficient (α3) indicates the efficiency of the cooling and heating systems at high or low temperatures. The second coefficient (β3) denotes the inflection temperatures of cooling and heating, and the third constant (γ) is the base load that is constant irrespective of temperature.
As the increase in energy consumption exceeds the increment or decrement in temperature, the energy consumptions of the cooling and heating systems can be described more faithfully.
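One form consistent with the coefficient descriptions above is the vertex form of a parabola, in which β3 is the temperature of minimum consumption and γ is the consumption at that temperature. This is an assumed reading, not necessarily the exact expression used in the regression. Expanding it gives the standard polynomial form that a least-squares fit returns.

```latex
E(t) = \alpha_3 (t - \beta_3)^2 + \gamma
     = \alpha_3 t^2 - 2\alpha_3\beta_3\, t + \left(\alpha_3\beta_3^2 + \gamma\right)
```

Under this reading, a fitted polynomial $a t^2 + b t + c$ corresponds to $\alpha_3 = a$, $\beta_3 = -b/(2a)$, and $\gamma = c - b^2/(4a)$.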
The regression analysis applied the daily average temperature, wind speed, humidity, and solar radiation as the independent variables and the daily energy consumption as the dependent variable. The accuracy of the regression analysis model was expressed by the mean absolute percentage error (MAPE), which is the average of the absolute differences between the actual and estimated data, each expressed as a percentage of the actual value.
$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{A_i - P_i}{A_i} \right| \tag{4}$$
Here, N denotes the number of datasets, Ai is the actual load, and Pi indicates the load forecast.
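The MAPE can be computed directly from paired actual and forecast loads. A minimal sketch follows, using the standard definition with Ai as the actual load and Pi as the forecast; the function name and sample numbers are illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the actual load.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values: both forecasts are off by 10% of the actual load.
error = mape([100.0, 200.0], [90.0, 220.0])
print(f"MAPE = {error:.2f}%")
```

Because each term divides by the actual load, the measure is undefined for days with zero consumption, which is not a concern for occupied office buildings with a non-zero base load.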
Short-term load forecasting in sample buildings
Single regression analysis
Tables 2 to 4 present the regression analysis results by applying the simple linear model, the seasonal linear model, and the quadratic model discussed in Section 3 to the daily energy consumption of the sample buildings mentioned in Section 2. Weekdays and weekends were distinguished in the analysis.
The MAPE in the case of the regression analysis using the simple linear model was mostly over 20%. In other words, the linear model could not express the energy consumption accurately. As mentioned previously, this model was not appropriate for regions that experience both winter and summer.
The seasonal linear model may produce different results depending on how the inflection points of CDD, NDD, and HDD are designated. As this study focused on economy and user convenience, the four seasons were defined as they are by KEPCO, which bills energy consumption in Korea. In other words, CDD and HDD were allocated to summer (June to August) and winter (November to February), respectively. NDD was allocated to both spring (March to May) and autumn (September and October). The regression analysis results obtained under this assumption are provided in Table 3. |α| in equation (2) indicates the sensitivity of energy consumption to temperature and is higher on weekdays than on weekends. This result is expected because the sample buildings contain offices. Moreover, spring showed a negative energy sensitivity (α), as did winter, and autumn showed a positive energy sensitivity (α), as did summer.
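The season assignment described above can be expressed as a lookup by calendar month, followed by one linear fit per season. The sketch below uses the month ranges given in the text. The function names, the record layout (month, temperature, load), and the sample values are assumptions made for illustration.

```python
def season_of(month):
    """Map a calendar month (1-12) to the season definition used in the text."""
    if month in (6, 7, 8):
        return "summer"   # CDD section
    if month in (11, 12, 1, 2):
        return "winter"   # HDD section
    if month in (3, 4, 5):
        return "spring"   # NDD section
    return "autumn"       # NDD section (September and October)


def fit_linear(temps, loads):
    """Ordinary least squares for E = alpha * t + beta."""
    n = len(temps)
    mean_t, mean_e = sum(temps) / n, sum(loads) / n
    sxx = sum((t - mean_t) ** 2 for t in temps)
    sxy = sum((t - mean_t) * (e - mean_e) for t, e in zip(temps, loads))
    alpha = sxy / sxx
    return alpha, mean_e - alpha * mean_t


def fit_seasonal(records):
    """Fit one line per season from (month, temperature, load) records."""
    groups = {}
    for month, temp, load in records:
        groups.setdefault(season_of(month), []).append((temp, load))
    return {s: fit_linear(*zip(*pairs)) for s, pairs in groups.items()}


# Synthetic records for illustration only.
records = [
    (1, -5.0, 24000.0), (1, 0.0, 21500.0), (2, 5.0, 19000.0),
    (7, 25.0, 20000.0), (8, 30.0, 24500.0),
]
models = fit_seasonal(records)
for season, (alpha, beta) in models.items():
    print(f"{season}: alpha = {alpha:.1f}, beta = {beta:.1f}")
```

In practice the records would first be separated into weekdays and weekends, so that each day type gets its own set of seasonal lines.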
Table 3. Seasonal linear model of weather and energy consumption from 2016 to 2018.
Table 4 presents the results of the quadratic regression analysis. As was expected, temperature had a higher correlation with energy consumption than wind speed, humidity, and solar radiation. In most cases, the sensitivity to temperature (α) and the base load (γ) were higher on weekdays than on weekends. In addition, as α is higher in summer than in winter, the cooling system caused a larger increase in energy consumption according to temperature than the heating system.
Table 4. Quadratic model of weather and energy consumption from 2016 to 2018.
Among the three models, the quadratic model had the lowest MAPE. Although the MAPE of the quadratic model was only approximately 0.2% lower than that of the seasonal linear model, it was lower on both weekdays and weekends from 2016 to 2018. Consequently, the quadratic model was selected as the most appropriate for the weather in Korea.
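A quadratic fit of energy consumption against temperature can be obtained from the least-squares normal equations. The sketch below is a plain-Python illustration, not the procedure used in the study. It generates noise-free synthetic loads from the weekday winter/spring equation reported in the verification section and refits them. The function names and the conversion to a turning temperature and minimum load are assumptions.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def fit_quadratic(temps, loads):
    """Least-squares coefficients (a, b, c) of E = a*t^2 + b*t + c."""
    rows = [[t * t, t, 1.0] for t in temps]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * e for r, e in zip(rows, loads)) for i in range(3)]
    return tuple(solve(xtx, xty))


def to_vertex(a, b, c):
    """Turning temperature and minimum load of the fitted parabola."""
    return -b / (2 * a), c - b * b / (4 * a)


# Synthetic loads generated from the reported weekday winter/spring equation.
temps = [-5.0, 0.0, 5.0, 10.0, 15.0, 20.0]
loads = [18.1 * t * t - 653.2 * t + 21163.6 for t in temps]

a, b, c = fit_quadratic(temps, loads)
turning_temp, minimum_load = to_vertex(a, b, c)
print(f"a = {a:.1f}, b = {b:.1f}, c = {c:.1f}, turning point = {turning_temp:.1f}")
```

Reading off the turning temperature gives a quick plausibility check on a fitted equation, since it should fall near the temperature at which neither heating nor cooling dominates.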
Multiple regression analysis
The results of the multiple regression analysis using the quadratic model expressed in equation (5) are presented in Table 5. In equation (5), t is temperature and k denotes wind speed, humidity, or solar radiation. As shown in Figure 2, wind speed, humidity, and solar radiation follow sinusoidal annual patterns, as temperature does; therefore, a quadratic function was also used for them.
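Judging from the forecasting equations reported in the verification section, the multiple regression has quadratic and linear terms in both t and k plus a constant. The sketch below fits that assumed form, E = a·t² + b·t + c·k² + d·k + e, by least squares on synthetic noise-free data. The coefficient names, function names, and data are illustrative only.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def fit_multiple(temps, factors, loads):
    """Least-squares fit of E = a*t^2 + b*t + c*k^2 + d*k + e."""
    rows = [[t * t, t, k * k, k, 1.0] for t, k in zip(temps, factors)]
    m = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * y for r, y in zip(rows, loads)) for i in range(m)]
    return solve(xtx, xty)


def predict(coeffs, t, k):
    """Evaluate the fitted equation at temperature t and second factor k."""
    a, b, c, d, e = coeffs
    return a * t * t + b * t + c * k * k + d * k + e


# Synthetic, noise-free data on a small grid (illustration only).
true_coeffs = [2.0, -3.0, 0.5, -4.0, 100.0]
grid = [(t, k) for t in (0.0, 1.0, 2.0, 3.0) for k in (0.0, 1.0, 2.0, 3.0)]
temps = [t for t, _ in grid]
factors = [k for _, k in grid]
loads = [predict(true_coeffs, t, k) for t, k in grid]

coeffs = fit_multiple(temps, factors, loads)
print([round(v, 3) for v in coeffs])
```

With real temperature and humidity ranges the normal equations become poorly conditioned, so a library least-squares routine or centered inputs would be a safer choice than this hand-written solver.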
Table 5. Multiple regression analysis for temperature, additional weather factor, and energy consumption.
Figure 5 shows the total MAPE of the seasonal linear, quadratic, and multiple regression models on weekdays and weekends for each year, based on Tables 3 to 5. The multiple regression models use humidity, which gave the lowest MAPE, as the second variable (k). The differences between the MAPEs are small. In most cases, however, the quadratic model is better than the seasonal linear model. As the number of variables increases, the correlation coefficient becomes higher owing to the structure of the regression analysis. Accordingly, the multiple regression analysis showed the best MAPE values.

Figure 5. Total MAPE of the seasonal linear models, quadratic models, and multiple regression models on weekdays and weekends in 2016, 2017, and 2018.
Hourly load forecasting
Figure 6 schematically illustrates the correlations between energy consumption and temperature, wind speed, humidity, and solar radiation at specific hours of the day (03:00, 09:00, 13:00, and 17:00) in 2017. As solar radiation exists only during the day, there is no solar radiation graph for the early morning and night. Although the correlation between energy consumption and temperature in the early morning shows no difference between weekdays and weekends, the graphs are otherwise not significantly different from that of the daily average energy consumption in Figure 3. The forecasting equation for hourly energy consumption was calculated by regression analysis using the energy consumption at each given hour, rather than the daily average energy consumption, together with the weather information. The result was similar to that for the daily average.

Figure 6. Correlation between energy consumption and weather at specific hours of the day for the year 2017. The correlation analysis between energy consumption and temperature shows a certain pattern. The correlation analyses for the remaining weather factors do not reveal such a pattern.
Verification of forecasting model
Each forecasting equation was derived from the energy consumption and weather data of 2017. The equations thus obtained were applied to the weather data of 2018, and the results were compared with the actual data.
The equations of the seasonal linear model, where x is the temperature, were −217.6x + 19,101 for weekdays in spring, 916.2x − 1954.7 for weekdays in summer, 303.1x + 10,033 for weekdays in autumn, −535.8x + 21,634 for weekdays in winter, −62.7x + 10,112 for weekends in spring, 206.0x + 5377.6 for weekends in summer, 221.8x + 4961.8 for weekends in autumn, and −372.4x + 11,535 for weekends in winter.
The equations of the quadratic model were 18.1x² − 653.2x + 21,163.6 for weekdays in winter/spring, 45.3x² − 1079.8x + 19,972.6 for weekdays in summer/autumn, 11.9x² − 343.1x + 11,375.0 for weekends in winter/spring, and 1.931.1x² + 138.0x + 5844.6 for weekends in summer/autumn.
The equations of the multiple regression model using temperature and humidity as independent variables were 18.873t² − 669.064t + 0.684k² − 67.516k + 22,639.998 for weekdays in winter/spring, 45.144t² − 1129.212t + 0.873k² − 71.952k + 21,523.981 for weekdays in summer/autumn, 13.437t² − 368.329t − 0.275k² + 47.702k + 9520.7401 for weekends in winter/spring, and 2.263t² + 124.522t + 0.5985k² − 88.7k + 9128.1879 for weekends in summer/autumn. Here, t is temperature and k denotes humidity.
Figure 7 compares the energy consumption forecasts against the actual values in 2018. The MAPEs of the seasonal linear model for weekdays and weekends were 11.47% and 7.45%, respectively, whereas those of the quadratic model were 11.21% and 7.58%, respectively, and those of the multiple regression analysis were 11.09% and 7.5%, respectively.
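Applying a forecasting equation amounts to substituting the forecast temperature into the equation for the matching day type and season. The sketch below does this for the weekday coefficients reported above. The dictionary layout, the function names, and the example temperature are assumptions, and the output is in whatever unit the regression data used.

```python
# Weekday coefficients as reported in the text: (slope, intercept).
SEASONAL_LINEAR_WEEKDAY = {
    "spring": (-217.6, 19101.0),
    "summer": (916.2, -1954.7),
    "autumn": (303.1, 10033.0),
    "winter": (-535.8, 21634.0),
}

# Weekday coefficients as reported in the text: (a, b, c) of a*x^2 + b*x + c.
QUADRATIC_WEEKDAY = {
    "winter/spring": (18.1, -653.2, 21163.6),
    "summer/autumn": (45.3, -1079.8, 19972.6),
}


def forecast_linear(season, x):
    """Seasonal linear forecast for a weekday at temperature x (deg C)."""
    slope, intercept = SEASONAL_LINEAR_WEEKDAY[season]
    return slope * x + intercept


def forecast_quadratic(group, x):
    """Quadratic forecast for a weekday at temperature x (deg C)."""
    a, b, c = QUADRATIC_WEEKDAY[group]
    return a * x * x + b * x + c


# Example: a winter weekday with an assumed daily average of -5 deg C.
x = -5.0
print(f"seasonal linear: {forecast_linear('winter', x):.1f}")
print(f"quadratic:       {forecast_quadratic('winter/spring', x):.1f}")
```

Because only a temperature forecast and a calendar lookup are needed, this step can run on very modest hardware, which is the point of the simple method.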

Figure 7. Actual energy consumption of 2018 and energy consumption forecasts of the seasonal linear model, quadratic model, and multiple regression model.
Figure 8 shows the results of applying the hourly forecasts to specific dates in 2018, which were chosen as business days representing the four seasons: 3 April 2018 for spring, 14 August 2018 for summer, 2 October 2018 for autumn, and 2 January 2018 for winter. Energy consumption forecasts were calculated by applying the data for each date to the forecasting equations. The graphs in Figure 8 show the actual energy consumption and the forecasts obtained by the seasonal linear model, the quadratic model, and the multiple regression analysis. The forecasting accuracies of the seasonal linear model were 89.33% for spring, 82.33% for summer, 93.67% for autumn, and 93.62% for winter, whereas those of the quadratic model were 89.85%, 89.08%, 92.69%, and 93.06%, respectively, and those of the multiple regression analysis were 90.25%, 88.55%, 90.74%, and 93.09%, respectively.

Figure 8. Comparison between actual values and forecasts obtained by the forecasting equations for specific dates: (a) winter (2 January 2018), (b) spring (3 April 2018), (c) summer (14 August 2018), (d) autumn (2 October 2018).
Conclusion
This study examined load forecasting functions and their performance from the perspective of simplicity and cost efficiency for application to the EMSs of small and middle-sized buildings. The functions demand only a limited amount of data and a small investment in equipment. A statistical method was employed to produce load forecasts by using only weather data and past energy consumption. The energy consumption of an actual building was compared with the weather data to identify a correlation. The correlations between energy consumption data and weather data such as temperature, wind speed, humidity, and solar radiation were examined by using linear, seasonal linear, and quadratic models, and through multiple regression analysis. Only temperature showed a significant correlation with the energy consumption. The quadratic model was the most appropriate for the weather of Korea, which has four distinct seasons. Accordingly, quadratic equations were derived for each season, that is, according to the temperature. When the temperature increased in summer, the cooling system was operated more frequently, resulting in increased energy consumption. When the temperature decreased in winter, the heating system was operated more frequently, again resulting in increased energy consumption. Forecasting equations derived from past data were applied to the weather data of 2018 to predict the energy consumption.
Comparisons of the forecasts obtained by the seasonal linear model and the seasonal quadratic simple regression analysis with the actual daily energy consumption of 2018 showed forecast accuracies of 90.85% and 91.24%, respectively. The forecast of the multiple regression analysis using temperature and humidity was 91.54% accurate. Forecasts of the hourly energy consumption on specific dates with the seasonal linear model, the quadratic model, and the multiple regression analysis had accuracies of 82.33%–93.67%, 89.08%–93.06%, and 88.55%–93.09%, respectively. This study did not aim for high forecasting accuracy but examined the load forecasting methods to assess their simplicity and cost efficiency. Further study will seek to improve the forecasting accuracy of such simple and cost-efficient methods by combining the measurements of a sample building and the approximate energy consumption of a target building.
Footnotes
Acknowledgements
The author would like to thank Miss Hye-jin Lee of Yeungnam University College for collecting the weather data for this paper.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Yeungnam University College Research grants in 2018.
