Abstract
Palestine lacks conventional energy sources sufficient to meet the daily needs of its people and consequently relies heavily on neighboring countries for its energy supply. Wind energy is recognized as an abundant, effective, and eco-friendly power source, but it is difficult to harness because of the inherent variability of wind. The main objective of this study is to examine the wind energy landscape in Palestine and to assess the feasibility of wind speed forecasting for sustainable energy solutions, with a special focus on ARIMA, a widely used statistical method for time series forecasting. Specifically, it explores the potential of ARIMA models to forecast wind speed using data captured at a meteorological station in east Jerusalem, Palestine, over 2 years (January 1, 2021 to December 31, 2022). To find the optimal values of the ARIMA parameters (p, d, q) for the study site, a set of experiments was conducted and each model's forecasting accuracy was evaluated using three metrics: RMSE, MAE, and the coefficient of determination (R2). The results show that ARIMA (2, 1, 2) is the most accurate structure, with a low RMSE (1.74), the lowest MAE (1.58), and the highest R2 (0.76). This means that the best estimation is achieved when the autoregressive component is based on the two previous observations of the first-differenced series and the moving average component models the dependency between an observation and the residual errors of the two preceding time steps. These findings give insight into the feasibility and precision of wind speed forecasting models for sustainable energy solutions and underline the potential for harnessing wind energy in the region.
Introduction
In the current information age, investment in renewable energy sources is attracting global attention. Many countries are now investing huge budgets in renewable energy projects to meet their current and future energy needs (Elsaraiti and Merabet, 2021). Solar and wind energy are among the most important sources of energy sustainability in Palestine. Solar energy has been studied in depth, but wind energy lacks studies at the national level, even though it offers a clean and practical way to generate electricity using small turbines, similar to home solar projects (Hanifi et al., 2020). Wind speed is the main factor that determines the amount of energy produced, and monitoring stations are scarce because of their high construction and operating costs. Researchers are therefore developing digital models that analyze and predict the highly variable wind speed values at a specific study site in one region and generalize them to other areas (Chodakowska et al., 2023). The chaotic nature of wind is a main challenge for energy production, so accurate wind speed prediction at a specific site must be given special attention when planning the installation of wind farms. Thus, before wind power applications are developed and installed at a site, the main issue that requires attention throughout the feasibility study is the wind speed profile at that site (Chen et al., 2022). In this regard, an accurate and thorough analysis of wind profiles must be carried out to achieve higher prediction accuracy of wind energy at a meteorological station and to obtain the greatest benefit from the wind (Shang et al., 2022).
To meet their energy demands, the Palestinian people depend on neighboring countries. They lack energy self-sufficiency because of several challenges, such as the difficult political situation they live in, the economic problems induced by lockout and limited income, and other environmental and social issues (Alsamamra et al., 2022). For this reason, most areas of Palestine suffer from a severe shortage of energy supplies throughout the year, with no fundamental solution on the horizon (Juaidi et al., 2022b). Currently, the Palestinian Authority treats renewable energy sources as one of the top governmental priorities in its sustainable development goals (SDGs) plan for the coming years, and it encourages and supports all development projects that target this goal.
The geography of Palestine favors installing wind turbines in some areas with good wind speed characteristics for generating electricity (Salah et al., 2022). The present study is important because studies that explore long-term wind speed profiles in east Jerusalem for wind prediction are very limited. Although similar contributions have been made worldwide, the research community has emphasized that the nature and characteristics of wind speed differ from location to location (they are site-dependent), so an optimal solution for one location is not necessarily valid for others. Given the lack of similar contributions in Palestine, this study's output can serve as a reference point for those planning to invest in wind energy projects in the region.
Palestine suffers from a scarcity of traditional energy sources, high energy prices, and the control of the occupation authority over the quantities supplied to the Palestinian lands. As a result, Palestinians increasingly seek to rely on alternative energy sources, in line with the growing global trend toward exploiting them. In this regard, the Palestinian Authority continuously supports individual and collective initiatives that produce electric energy from alternative sources (Ibrik, 2019). The Palestinian government has also enacted regulatory legislation that encourages investment in renewable energy and regulates the relationship between the parties involved in any investment (PCBS, 2024). In addition, the Ministry of Higher Education (MOHE) has encouraged the national universities to develop academic programs in renewable energy and to support researchers in this field. It recognizes the importance of the energy sector in general and of alternative energy in particular, in the belief that this sector contributes to energy independence, which is one of the main factors in the independence of countries and in sustainable development. The ministry also believes that sustainable energy projects will create new job opportunities in society, since this sector is considered one of the most in-demand professions in the world (Abdallah and Camur, 2022).
Compared to other nearby countries, the energy sector in Palestine has special considerations for several reasons: unstable political and economic conditions, a scarcity of natural resources, high population density, full dependence on nearby countries, and severe financial problems (Hamed et al., 2012). Because of its weak economy, Palestine desperately needs all kinds of energy, and the Palestinian government has recently adopted renewable energy policies and implemented many projects aimed at increasing investment in the renewable energy sector (Juaidi et al., 2022a). Globally, in the next 30 years, the consumption of renewable energy will keep growing, by a factor of 147% (Sun and Jin, 2022). Moreover, investment projects in sustainable and clean energy have attracted global attention: for every 1 USD spent on fossil fuels, 1.7 USD is now spent on clean energy, compared to a 1:1 ratio 5 years ago (Li et al., 2023a). In 2021, wind-based energy grew by up to 17%, a growth rate 45% higher than in 2020 and the highest among all renewable energy technologies (GWEC, 2021). In its report, the Global Wind Energy Council (GWEC) stated that electricity generation capacity based on wind power systems increased internationally by 94 GW between 2021 and 2022 (GWEC, 2022). However, precise knowledge of wind speed conditions at a wind farm is crucial to the success of a project during its assessment and operational phases. Furthermore, with the growth of grid-connected wind power and the nonstationarity of wind speed, maintaining the stability of the power system is a challenge (Kitaneh et al., 2012). Therefore, designing accurate wind speed forecasting models will help in building efficient wind power systems that make the best use of this natural energy source by minimizing the side effects of wind uncertainty on system stability (Kosanoglu, 2022).
The remaining parts of the article are structured as follows. A review of the related work is presented in the second section. The technical description of the ARIMA model and its components is given in the third section. The fourth section presents the dataset and discusses the research methodology. The fifth section reviews the common performance metrics used to evaluate the proposed model. The results and their discussion are presented in the sixth section. The last section summarizes the main findings and outlines directions for further research.
Related work
Internationally, participation in intraday markets is becoming more important as the share of intermittent renewable energy production grows. Balancing positions on the network closer to delivery time helps both power systems and market participants by reducing the need for reserves and their associated costs. Additionally, the intraday market is a vital instrument for market participants to handle unforeseen shifts in energy consumption and outages (Shoaib et al., 2019). Wind speed is strongly intermittent and random, and the continuous growth in the scale of wind power systems might frequently disturb the stability of the power system and cause regular fluctuations in its frequency. A precise estimate of wind speed can have a direct impact on controlling wind turbines, both at a single meteorological station and in large-scale wind power integration (Hur, 2021). Consequently, building efficient wind speed prediction models is an essential step toward improving the operational control of wind turbines (Hu and Wang, 2015). Efficient wind speed prediction also helps to provide accurate estimates of the expected power of a wind turbine over short-, medium-, and long-term intervals, which improves operating costs, investment profits, and production (Liu et al., 2019). Therefore, system operators pay special attention to short- and long-term wind power production in order to guarantee a consistent electricity supply and to run the system steadily in terms of auxiliary services, quality, and system stability (Mi et al., 2018).
Despite the chaotic nature of the wind, several predictive methods exist for forecasting it, and researchers have made great efforts to increase the accuracy of wind speed prediction models. These models include basic preprocessing methods, predictive models, and hybrid strategies, all with the main aim of minimizing the prediction error on wind speed time series data (Hong and Satriani, 2020). Among basic predictive models, different methods have been presented. Physical models commonly employ physical attributes such as temperature and pressure to forecast wind speed (Zeng et al., 2020); numerical weather prediction methods are an example. However, because of the weak correlation between physical parameters and short-term wind speed, this approach is better suited to medium- and long-term wind speed forecasting (Demolli et al., 2019). It also requires long run times and a huge volume of computational resources. Statistical models, in contrast, learn from the observed data and do not need to be fine-tuned a priori against measurements; they can adapt to discrepancies between the historical data and online measurements (Adebiyi et al., 2014). Short-term wind speed is usually predicted by examining the underlying regularities of chronological wind speed data (Salman et al., 2018). Lately, researchers have focused more on artificial intelligence methods that model the human brain. Nowadays, a combined approach of physical and statistical methods is widely employed for wind power estimation (Sahoo et al., 2019). The accuracy of time series forecasting varies with the prediction model, as time series data can have linear and nonlinear features and are subject to disturbances and randomness (Manwell et al., 2010). This indicates that relying on traditional methods alone will be inefficient, which has encouraged the scientific community to consider novel techniques for predicting wind speeds and their future levels (Chandra et al., 2021).
Wind speed forecasting problems have been addressed since the early 1980s, beginning with Brown et al. (1984), who proposed providing data on the uncertainty of wind speed forecasts. Currently, the literature on sustainable energy, wind speed forecasting, and wind power is advancing rapidly. A major track in this advancement is devoted to building time series models such as AR, ARMA, and ARIMA. Torres et al. (2005) applied ARMA models to predict hourly wind speed in Navarre, Spain, and compared them with persistence models; they concluded that the ARMA model outperforms the persistence model, with lower prediction errors. Cadenas and Rivera (2007) conducted an experiment to forecast wind speed for three regions in Mexico using a hybrid ARIMA–ANN model. The ARIMA models were used to forecast the wind speed time series, and the prediction errors were used to build the ANN so that it could describe the nonlinear behavior that the ARIMA model could not characterize. The experimental results showed that the hybrid models outperform both ARIMA and ANN, with higher accuracy in wind speed predictions at the three sites. ARIMA models have been widely employed in recent years to model wind speed variations over large time lags. They are simple to implement and can be interpreted as discretized forms of differential equations. To describe fluctuations over short periods of about 10 min, ARIMA models can be adopted with a large number of autoregressive terms and higher-order moving average terms to construct frequency decompositions (Contreras et al., 2003). Moreover, ARIMA models are reasonable in terms of computational cost, since they are constructed by applying lightweight iterative procedures (Dimri et al., 2020).
Liu et al. (2012) suggested the hybrid methods ARIMA–Kalman and ARIMA–ANN to forecast hourly wind speed. Their results showed that both methods produce good outputs and are suitable for forecasting in dynamic wind power systems. Du et al. (2019) used ARIMA techniques to forecast loads in power systems with improved accuracy. Rejesh et al. applied an ARIMA model to forecast wind speed and improved accuracy by 42% compared with the persistence technique. Tyass et al. (2022) conducted experiments at a meteorological station in Casablanca, Morocco, using a seasonal ARIMA model to forecast short-term wind speed; the experimental results indicated that the model achieved excellent forecasting accuracy. The ARIMA model can provide better accuracy for nonstationary time series by converting them into an associated stationary time series through differencing, while keeping the basic statistical features the same (Mantalos et al., 2010).
When it comes to wind speed data analysis, two issues need careful inspection: the applicability of ARIMA models and their usefulness. Wind speed can be described by statistical distributions such as the Weibull distribution, whereas others such as the Gaussian distribution might not describe the wind accurately (Hoolohan et al., 2018); the same holds for wind speed increments, whose distribution clearly shows fat tails (Dumitru and Gligor, 2017). Adopting any ARIMA model requires checking that the dataset is stationary and that the invertibility conditions are met. Models of the form (0, d, q) are stationary after differencing, but for the model to be identified correctly they must also satisfy the invertibility condition. Conversely, models of the form (p, d, 0) are always invertible but, depending on the parameter values, may not be stationary. Hence, a series may admit several model representations, and it is practical to look for the simplest description that estimates wind speed (Hussin et al., 2021). Grigonytė and Butkevičiūtė (2016) conducted an experiment to find the best ARIMA structure (p, d, q) for wind speed forecasting in the Baltic region and obtained the model (3, 1, 1). They concluded that this structure can be used in summer, winter, and autumn, whereas for spring the model should be changed because of unexpected changes in the wind speed data.
Recently, the wind speed distribution at east Jerusalem was studied by Alsamamra et al. (2022). The authors explored the distribution of the wind to find the optimal estimates of the two Weibull parameters, applying five numerical methods to 10 years of daily data. The experimental results showed that, of the five estimation methods, the method of moments and the empirical method were the best at determining the Weibull shape and scale parameters that correctly describe the wind speed distribution. Furthermore, Salah et al. (2022) compared several machine-learning algorithms for predicting wind speed using a ground wind speed dataset collected at a meteorological station in east Jerusalem. The wind speed data were modeled with six machine-learning techniques: support vector regression, random forest, multiple linear regression, ridge regression, lasso regression, and long short-term memory. The results showed that random forest, followed by long short-term memory, provides better prediction accuracy than the other techniques for the study site.
In summary, researchers have proposed various models to address the technical issues of wind speed prediction. They are normally categorized into three groups: statistical models, machine learning models, and deep learning models (Li et al., 2023b). Statistical models such as ARIMA are widely employed at different climatological sites because they can be easily applied to short-term forecasting scenarios and require minimal resources and simple calculations, which makes them an economical solution for time series analysis. ARIMA handles variations in time series data effectively and is particularly well suited to scenarios with relatively stable demand and moderate variation. However, to achieve good accuracy with ARIMA models, it is very important to determine the model's parameters accurately. In the literature, these parameters take different values depending on the length of the time series, its stationarity, and the region. Therefore, this study aims to propose an optimal ARIMA model to forecast hourly wind speed for a time series gathered at a meteorological station installed in the east Jerusalem region from January 1, 2021 to December 31, 2022. The accuracy of the model was validated using MAE, RMSE, and R-squared, and the results were compared with relevant contributions from the literature. The importance of this work stems from the fact that Palestine lacks conventional energy sources sufficient to meet its people's daily needs; a high percentage of its energy needs is provided by nearby countries. To the best of the authors' knowledge, it is the first study to tackle the problem of modeling wind speed in east Jerusalem. Although similar studies are available worldwide, the characteristics of wind differ from one location to another, so any forecasting model is site-dependent; that is, an optimal forecasting model for one location might not be optimal for others.
ARIMA model
The ARIMA model is the most broadly applied approach to time series analysis. It was first introduced by Box and Jenkins (1970). It is useful because it can characterize various kinds of time series, including pure autoregressive processes, pure moving average processes, and combinations of the two. ARIMA (p, d, q) is the general model, where p, d, and q are the autoregressive order, the number of differencing operations, and the moving average order, respectively.
AR
An AR model is used to forecast a time series from its own past values. AR(1) denotes the first-order autoregressive model, in which Yt is regressed on Yt−1, and AR(p) denotes the autoregressive model of pth order. In a multiple regression model, the variable of interest is predicted using a linear combination of a set of predictors; in an autoregression model, the predictors are past values of the variable itself. The term autoregression thus indicates a regression of a variable against itself. Hence, an autoregressive model of pth order expresses Yt as a constant plus a weighted sum of its p previous values and an error term.
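In standard textbook notation, which is assumed here rather than taken from the authors' typesetting, the AR(p) model reads as follows, where c is a constant, φ1, ..., φp are the autoregressive coefficients, and εt is a white-noise error term (the symbol names are assumptions):

```latex
\begin{equation}
Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t
\tag{1}
\end{equation}
```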
MA
In a moving average process, the dependent variable is estimated from a constant and a moving average of error terms; that is, it is also a regression, but on the current and lagged error terms. With a single lagged error term, the process is a first-order moving average, denoted by MA(1). Instead of using past values of the forecast variable, a moving average model uses past forecast errors in a regression-like model. When q lagged error terms are included, the model is a qth-order moving average process, denoted by MA(q), in which Yt is a constant plus the current error and a weighted sum of the q previous errors.
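In standard textbook notation (assumed here, with hypothetical symbol names), the MA(q) model can be written as below, where c is a constant, θ1, ..., θq are the moving average coefficients, and εt, εt−1, ... are the current and lagged white-noise error terms:

```latex
\begin{equation}
Y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}
\tag{2}
\end{equation}
```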
ARIMA
Differencing can be combined with the autoregression and moving average models to obtain a non-seasonal ARIMA model, in which the differenced series is modeled as a constant plus a weighted sum of its own p previous values, a weighted sum of the q previous errors, and the current error.
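A common way to write the non-seasonal ARIMA (p, d, q) model is sketched below in standard textbook notation, which is an assumption here. Y′t denotes the series after differencing d times, B is the backshift operator (B Yt = Yt−1), c is a constant, φi and θj are the autoregressive and moving average coefficients, and εt is a white-noise error term:

```latex
\begin{equation}
Y'_t = c + \phi_1 Y'_{t-1} + \cdots + \phi_p Y'_{t-p}
       + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t,
\qquad Y'_t = (1 - B)^d \, Y_t
\tag{3}
\end{equation}
```

With d = 0 this reduces to an ARMA (p, q) model on the original series; with q = 0 or p = 0 it reduces to a pure AR or pure MA model on the differenced series.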
A nonstationary time series that becomes stationary after being differenced d times is said to be integrated of order d, denoted by I(d). Following the definitions above, ARIMA (p, d, q) indicates that the AR component is of pth order, the time series is differenced d times, and the MA component is of qth order. For example, if both the AR and MA components are of first order and the time series becomes stationary after first-order differencing, the model is denoted ARIMA (1, 1, 1). It is worth noting that an ARIMA model is an atheoretic model, that is, it is not derived from any economic theory; it predicts a given time series using only its own past values. It can be applied to any non-seasonal time series that exhibits patterns and is not a sequence of random occurrences. A key requirement is that the time series data be collected at constant, regular intervals.
Dataset and methodology
In this work, we explore the potential of ARIMA models to forecast wind speed using data captured at a meteorological station in Palestine over 2 years (January 1, 2021 to December 31, 2022). The wind data were logged continuously at an altitude of 20 m by a rotating cup generator anemometer installed in the village of Jabal Al-Mukabber near east Jerusalem. The anemometer has an accuracy of 3%, and its calibration by linear regression carries an uncertainty of 0.2% to 5.0%. As shown in Table 1, the dataset contains seven variables, recorded every 10 min. Some preprocessing steps were applied: the original data were thoroughly reviewed for errors, and the few null and missing values were filled with the average of the five previous values. A statistical analysis was also used to summarize the main attributes of the time series, as shown in Table 2; each attribute had 104,951 readings. The mean relative humidity was 56.82%, the mean temperature 18.11°C, the mean maximum temperature 18.26°C, the mean minimum temperature 17.97°C, and the mean rainfall 0.01 mm. The mean wind speed was 2.96 m/s, with wind direction values ranging from 0 to 360 degrees; the mean wind direction indicated that the overall wind direction was southwest. The standard deviation of the time series was also computed to help check the stationarity of the dataset. In addition, 75% of the wind speed values were less than or equal to 3.8 m/s, a value suitable for installing small wind turbines; about 75% of the air temperature values were below 24°C; and the 75th percentile of wind direction corresponded to the northwest and the median nearly to the south. The peak wind speed was 11.9 m/s, and the peak air temperature was around 40°C. Figure 1 plots the first 5 days of January 2021 to visualize the wind speed values and their variation.
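The gap-filling rule used in preprocessing (replace a missing reading with the average of the five previous values) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, missing readings are assumed to be represented by `None`, and previously filled values are assumed to count toward the five-value window, a detail the text does not specify.

```python
from typing import List, Optional


def fill_with_previous_mean(values: List[Optional[float]], window: int = 5) -> List[float]:
    """Replace each None with the mean of up to `window` preceding values.

    Assumption: values that were themselves filled earlier are included
    in the window of preceding values.
    """
    filled: List[float] = []
    for v in values:
        if v is None:
            history = filled[-window:]
            if not history:
                raise ValueError("cannot fill a gap at the start of the series")
            v = sum(history) / len(history)
        filled.append(v)
    return filled


# Illustrative 10-min wind speeds in m/s (made-up numbers, one gap).
speeds = [2.0, 3.0, 4.0, 5.0, 6.0, None, 3.0]
print(fill_with_previous_mean(speeds))
```

Because the replacement depends only on earlier readings, the rule can be applied in a single pass over the record and never uses future values.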

Figure 1. An example of five-day time series wind speed data.
Table 1. A sample record of raw data and their attributes.
Table 2. Descriptive statistics of the dataset showing the central tendency, dispersion, and shape of the dataset distribution.
Following the Box–Jenkins methodology for building an ARIMA model, the stationarity and seasonality of the data must be checked. Regarding seasonality, the wind speed time series showed no peaks at seasonal frequencies and thus no seasonal trend or irregular seasonal variation, as illustrated in Figures 1 and 2. Figure 2 also shows that the trend and residual components are consistent with a stationary time series. In addition, Table 3 gives the results of the Dickey–Fuller test, a widely used unit root test for evaluating the stationarity of a time series. The null hypothesis of a unit root was rejected at the 95% confidence level (Dickey–Fuller statistic: −3.903, p < 0.05), which indicates that the wind speed data have no unit root and can be characterized as a stationary time series.

Figure 2. Decomposition example of the time series into trend, seasonal, and residual components.
Table 3. Results of the Dickey–Fuller test.
In this regard, the autocorrelation function (ACF) and partial autocorrelation function (PACF) are two measures typically used to quantify the association between current and past values of a series and to indicate which past values are most useful for predicting future ones. With these measures, researchers can determine the order of the processes in the ARIMA model. The autocorrelation of stationary data drops to zero relatively quickly, whereas for a nonstationary time series it remains significantly different from zero for many lags. Samples of the ACF and PACF of the observed time series are presented in Figure 3. As can be observed, the ACF decays quickly to zero and there are no clear patterns in either plot. Consequently, the time series appears to be stationary.
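To make the ACF concrete, the sketch below computes the usual sample autocorrelation at lags 0 to `max_lag`. It is an illustration under stated assumptions, not the procedure used to produce Figure 3: the function name is hypothetical, and the estimator shown is the common biased one that divides by the lag-0 sum of squares.

```python
from typing import List, Sequence


def sample_acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation r_k for k = 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("a constant series has undefined autocorrelation")
    acf = []
    for k in range(max_lag + 1):
        # Sum of products of values that are k steps apart.
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf


# Tiny made-up series; real use would pass the 10-min wind speed record.
print(sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2))
```

For a stationary series these values fall toward zero within a few lags, which is the behavior the text describes for Figure 3.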

Figure 3. Plots of ACF and PACF.
Working with the stationary time series, a set of ARIMA (p, d, q) models was constructed by varying p and q over the integers 0 to 4. The maximum order of 4 was chosen with parsimony in mind. Among the candidate models, the optimal one was identified after an adequacy check. The analysis of the ACF and PACF did not yield exact values of p and q. As argued in the literature, the d parameter usually takes the value 0 for a stationary series, although in some scenarios it may take the value 1 to guard against weak stationarity. To find the optimal model, p and q were therefore assigned each combination of values, and a new model was constructed for each pair.
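The search over candidate orders can be sketched as a small grid search. The code below is a hedged illustration, not the authors' implementation: `candidate_orders` and `select_order` are hypothetical helpers, and `score` stands for any user-supplied function that fits an ARIMA model of the given order and returns a validation error such as RMSE. The `tolerance` argument is an assumption that encodes the parsimony argument: among orders whose error is within `tolerance` of the best, the one with the smallest p + q is preferred.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Order = Tuple[int, int, int]


def candidate_orders(max_order: int = 4, d_values: Iterable[int] = (0, 1)) -> List[Order]:
    """All (p, d, q) with p, q in 0..max_order and d taken from d_values."""
    return [(p, d, q) for p, d, q in product(range(max_order + 1), d_values, range(max_order + 1))]


def select_order(orders: Iterable[Order], score: Callable[[Order], float],
                 tolerance: float = 0.0) -> Order:
    """Return the simplest order whose score is within `tolerance` of the best."""
    scored = [(score(order), order) for order in orders]
    best = min(s for s, _ in scored)
    near_best = [order for s, order in scored if s <= best + tolerance]
    # Prefer fewer AR and MA terms; break remaining ties by tuple order.
    return min(near_best, key=lambda o: (o[0] + o[2], o))


print(len(candidate_orders()))  # 5 values of p x 2 values of d x 5 values of q

# Illustrative error table only; these pairings are not results from Table 4.
toy_rmse = {(2, 1, 2): 1.74, (4, 1, 4): 1.73}
print(select_order(toy_rmse, toy_rmse.get, tolerance=0.02))
```

In practice `score` would wrap a model-fitting routine from a time series library, and the tolerance would be chosen by the analyst.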
Figure 4 illustrates the flowchart of the ARIMA modeling process. The time series is first checked to determine whether it is stationary or nonstationary. This check can be made with several methods, such as data visualization, summary statistics, and statistical tests. If the data are stationary, the ACF and PACF are plotted to find patterns in the data and to identify the presence of AR and MA components. If the dataset is nonstationary, an additional step is needed to make the series stationary by applying differencing. Differencing a time series means computing the change between consecutive data points by subtracting the previous value (or a lagged value) from the current value. Next, the parameters of the ARIMA model are estimated to identify the best-fitting model, and its accuracy is checked using several performance metrics. Finally, the model is built and validated by comparing actual and predicted values in the validation sample.
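The differencing step can be illustrated with a few lines of code. This is a minimal sketch with a hypothetical function name; each pass replaces the series by the differences between consecutive values, so a series of length n has length n − d after d passes.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply first-order differencing d times: y'_t = y_t - y_(t-1)."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Made-up series with a quadratic trend.
print(difference([1, 4, 9, 16]))       # first differences
print(difference([1, 4, 9, 16], d=2))  # second differences
```

In the example, the first differences still trend upward, while the second differences are constant, which shows how repeated differencing removes a polynomial trend.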

Figure 4. Flowchart of the ARIMA time series forecasting model (Mani and Volety, 2021).
Performance metrics of wind speed forecasting
RMSE
As given in equation (4), the RMSE is one of the most widely applied performance metrics. It estimates the typical magnitude of the prediction error as the square root of the mean squared difference between predicted and observed values.
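In the usual notation (assumed here), with y_i the observed value, ŷ_i the predicted value, and n the number of observations, the RMSE is:

```latex
\begin{equation}
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
\tag{4}
\end{equation}
```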
MAE
As indicated by equation (5), the MAE represents the mean of absolute differences between actual and predicted values; it corresponds to the estimated level of absolute error.
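Using the same assumed notation (y_i observed, ŷ_i predicted, n observations), the MAE is:

```latex
\begin{equation}
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
\tag{5}
\end{equation}
```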
R-square
The coefficient of determination, or R-square (R2) (equation (6)), indicates how closely the predicted values follow the actual values; it can be described as the proportion of the variance of the dependent variable that is predicted from the set of independent variables. It is mainly used in time series datasets having large amplitudes.
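In the common definition, assumed here, with y_i the observed value, ŷ_i the predicted value, and ȳ the mean of the observed values, R2 is one minus the ratio of the residual sum of squares to the total sum of squares:

```latex
\begin{equation}
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
\tag{6}
\end{equation}
```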
Results and discussions
To find the optimal values of the ARIMA parameters for the study site, several experiments were conducted by varying p and q over the range 0 to 4; for each pair, a new model was constructed and validated. As can be observed from Figure 2, the time series exhibits repetitive behavior with visible, regular recurring cycles. This periodic behavior suggests that the underlying process is regular, and the frequency of the oscillations helps to characterize it. The series mainly shows two types of fluctuation: sinusoidal waves with bottom and top peaks, and a slower component that appears to repeat periodically. Typically, nonstationary data cannot be reliably modeled or predicted. Results extracted from a nonstationary time series can be inconsistent or unreliable, because they may suggest a relationship between two variables where none exists. To determine the orders p and q, the plots of both the ACF and the partial autocorrelation function (PACF) were examined, as illustrated in Figure 3. These plots clearly show that the values decay quickly to zero without clear patterns; this agrees with the Dickey–Fuller test results, which confirm the stationarity of the dataset. The PACF plot suggests that an AR(2) model is suitable for the observed data because of the cut-off at lag 2.
After the stationarity test of the time series, the next step is to check whether AR or MA terms are needed to correct any remaining autocorrelation when fitting the ARIMA model. The analysis of the autocorrelation and partial autocorrelation plots alone did not yield conclusive values for the parameters p and q. However, according to the literature, the d parameter should take either 0 or 1 for stationary time series. The models without differencing (d = 0) are also presented in the results (Table 4) to confirm that they give higher prediction errors than those with d = 1. Consequently, the best model was found by assigning different values in [0–4] to both p and q. A new model was built for each pair of values, and statistical performance metrics were used to compare the assigned AR and MA orders (the p and q parameters). The best wind speed forecasting model was found to be ARIMA (2, 1, 2), which can be read as follows: the autoregressive AR(2) process bases the current observation on the previous two lagged observations; the weak stationarity of the wind speed time series is handled by first-order differencing of the raw observations, that is, subtracting the previous time step value from the current value; and the moving average MA(2) process models the dependency between an observation and the residual errors of a second-order moving average applied to the lagged observations.
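The differencing step and the search over candidate orders can be sketched as follows. This is an illustrative outline only: `fit_and_score` is a hypothetical user-supplied callable standing in for whatever ARIMA fitting and validation routine is used, and the function names are assumptions rather than the study's code.

```python
from itertools import product


def difference(series, d=1):
    """Apply d rounds of first-order differencing: y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def candidate_orders(max_p=4, max_q=4, d_values=(0, 1)):
    """Enumerate every (p, d, q) with p, q in [0, 4] and d in {0, 1}."""
    return [
        (p, d, q)
        for d in d_values
        for p, q in product(range(max_p + 1), range(max_q + 1))
    ]


def grid_search(series, fit_and_score, orders):
    """Score each candidate order.

    fit_and_score(series, order) is a hypothetical callable that fits an
    ARIMA model of the given order and returns its validation score.
    """
    return {order: fit_and_score(series, order) for order in orders}
```

With the default arguments, `candidate_orders()` yields 50 combinations (25 pairs of p and q for each of the two values of d), which corresponds to the search space described in the text.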
Experimental results of various error metrics to identify the optimal parameters for the ARIMA (p, d, q) model.
Table 4 shows the experimental results of various error metrics used to identify the optimal parameters for the ARIMA model. The highest value considered for both p and q is 4. The model should give high prediction accuracy with the lowest errors while remaining as simple as possible; when p and q take larger values, the model becomes more complex to calculate and analyze. The best forecasting model was selected based on accuracy and performance. To evaluate the models and select the one that describes the wind speed data most accurately, several accuracy metrics were employed. The results presented in Table 4 show that ARIMA (2, 1, 2) gives the most accurate estimation with minimal RMSE (1.74), minimal MAE (1.58), and higher R2 (0.76) values. It is achieved when the autoregressive process is based on the previous two lagged observations, and the moving average process relates the dependency between the observations and the residual errors using a second-order moving average applied to the lagged observations. According to Table 4, the MAE values for all models range from 1.58 to 2.03, the RMSE values range from 1.73 to 2.19, and the R2 values range from 0 to 0.76. Furthermore, ARIMA (4, 1, 4) also gives good wind speed prediction with low errors. However, ARIMA (2, 1, 2) has a simpler structure, and the difference in errors is not large enough to justify choosing a more complex model. A comparison of the actual wind speed data with the best ARIMA (2, 1, 2) model over three days is presented in Figure 5. Both curves match closely, and the ARIMA (2, 1, 2) model represents the observed wind speed pattern well. The model's predictions on the training dataset describe the actual training wind speed values well, with an overestimation for values less than 3 m/s and an underestimation for values greater than 3 m/s, while the ARIMA model failed to describe the test wind speed dataset.
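The trade-off between accuracy and simplicity described above can be expressed as a simple selection rule. The sketch below is one possible formalization and is not taken from the study: the `tolerance` threshold and the use of p + q as a complexity measure are assumptions, since the text does not state how large an error difference would justify a more complex model.

```python
def select_parsimonious(rmse_by_order, tolerance):
    """Pick the simplest order whose RMSE is within `tolerance` of the best.

    rmse_by_order maps (p, d, q) tuples to RMSE values.
    Complexity is measured as p + q (an assumed, illustrative choice);
    ties in complexity are broken by the lower RMSE.
    """
    best = min(rmse_by_order.values())
    eligible = [
        order for order, err in rmse_by_order.items() if err - best <= tolerance
    ]
    return min(eligible, key=lambda o: (o[0] + o[2], rmse_by_order[o]))
```

Under such a rule, a model with slightly higher error but fewer AR and MA terms is preferred whenever the gap to the lowest error falls within the chosen tolerance.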

Wind speed forecasting of ARIMA (2, 1, 2) versus actual data.
To sum up, the performance of the ARIMA models on the wind speed dataset was estimated using various evaluation metrics. RMSE, MAE, and R2 were used to assess the models, and the results show that the ARIMA model achieves acceptable forecasting accuracy for wind speed in east Jerusalem compared with similar studies from the literature.
As emphasized in the literature, the analysis of wind speed is site-dependent; that is, it depends on the intrinsic nature of the wind and its characteristics. Nevertheless, to study how the proposed model behaves compared to similar studies, Table 5 summarizes the results along with some of those presented in the literature. El-Kashty et al. (2023) modeled the wind speed for the short and long term in Ras-Gharib, Egypt using several time-series forecasting methods. The authors used daily wind speed data from January 2017 to December 2021. Among the compared methods, ARIMA (3, 0, 2) provided the best statistical performance, with an RMSE of 1.47 and an MAE of 1.18. Elsaraiti and Merabet (2021) studied the effectiveness of predicting wind speed for the Halifax region, Canada. They used a short wind speed time series from 1 May 2021 to 20 June 2021, and the results showed that ARIMA (2, 1, 2) provided the best statistical performance, with an RMSE of 3.423 and an MAE of 2.772. In the Helmand Basin, Iran, Dargahian and Doostkamian (2021) studied wind speed spatial pattern assessment and forecasting models using daily data from 1990 to 2018 collected from seven stations in the Helmand region. The authors concluded that the ARIMA (0, 1, 1) model was more accurate for the Zahak station than for the others, with values of 1.69, 1.21, and 0.76 for RMSE, MAE, and R2, respectively. As shown in the table, the statistical performance of the ARIMA (2, 1, 2) model presented in this work is reasonable when compared with that reported in the literature.
Comparative results between the proposed model and other models.
These results give a baseline for establishing the optimal ARIMA structure and the length of the input period at the study site. The study revealed that identifying the optimal ARIMA model can give a significant improvement in accuracy compared to arbitrarily chosen structures for a three-days-ahead forecast. As indicated in the literature, the ARIMA model can produce better results when applied to small datasets. However, for large datasets, better forecasts are more likely to be obtained by coupling ARIMA models with other effective methods as hybrid models, or by using models based on deep learning paradigms.
Conclusion
In this work, we developed and evaluated a set of ARIMA models for wind speed forecasting using data captured from a meteorological station located in east Jerusalem, Palestine over a period of 2 years, from January 1, 2021 to December 31, 2022. To find the optimal values of the ARIMA parameters (p, d, q), a set of experiments was conducted and the models' forecasting accuracy was evaluated using three metrics: RMSE, MAE, and the coefficient of determination (R2). The results have shown that ARIMA (2, 1, 2) emerges as the most accurate ARIMA structure for the considered input period, demonstrating superior estimation with minimal RMSE (1.74), minimal MAE (1.58), and higher R2 (0.76) values.
The output of this work raises a number of points to consider: (a) with the tremendous growth of wind-based applications, power utilities in Palestine have to integrate a variety of renewable energy systems, and an accurate estimate of wind speed plays a vital role in the socioeconomic benefits resulting from appropriate power grid supervision in the Palestinian territories; (b) this study can help ease the management of wind farm programs and operations, since the obtained ARIMA models not only achieve good accuracy when compared with existing models but are also simple to calculate; and (c) the findings suggest that ARIMA models remain effective for short-term wind speed prediction and could help decision-makers in Palestine to discover the local wind potential and offer some insights into the feasibility of wind speed forecasting for sustainable energy solutions.
In future work, we will conduct a deeper analysis to explore the behavior of the ARIMA model per season, and explore hybrid models that combine ARIMA with other forecasting techniques to further improve the performance under different conditions. Another line of research is to obtain other wind speed datasets from different measurement sites in Palestine and carry out a comparative analysis.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
