Abstract

Combination forecasting is an effective tool for improving forecasting accuracy by combining single forecasting methods. This paper applies a new combination forecasting model to predicting the BRT crude oil price, based on the dispersion degree of two triangular fuzzy numbers defined through the circumcenter distance and the radius of the circumcircle. First, a dispersion degree of two triangular fuzzy numbers is proposed that measures the difference between triangular fuzzy numbers using the circumcenter distance and the circumradius; it can be used to predict fluctuating trends and is suitable for crude oil futures prices. Second, three single prediction methods (ARIMA, LSSVR and GRNN) are combined so that traditional statistical time series prediction and recent machine learning prediction methods reinforce each other's strengths and offset each other's weaknesses. Finally, a practical example of forecasting London Brent crude futures prices is used to illustrate the validity of the proposed method. The experimental results show that the proposed method produces much better forecasting performance than some existing triangular fuzzy models: the prediction error is reduced from 3–5 to 2.7 in oil price combination forecasting, and from 1 to 0.0135 in another comparison experiment. The proposed combination forecasting method, which fully capitalizes on time series forecasting models and intelligent algorithms, makes triangular fuzzy prediction more accurate than before and is effectively applicable.

Keywords

Oil price forecasting; dispersion degree of two triangular fuzzy numbers; ARIMA; LSSVR; GRNN

1 Introduction

Crude oil is a natural liquid fossil fuel found in geological formations beneath the earth’s surface. It has mostly been extracted by oil drilling, which comes after the studies of structural geology, sedimentary basin analysis, and reservoir characterization [1]. Crude oil is one of the most important energy resources on earth. So far, it remains the world’s leading fuel, with nearly one-third of global energy consumption.

There are three major reference crude oil prices in the international crude oil market: WTI Crude Oil Futures of NYMEX in America, Brent Crude Oil Futures of Royal Exchange, and Dubai Reference Oil in the Middle East. Among these, Brent Reference Oil has the strongest influence in the international market, and about 70% of spot crude oil trading is priced on the Brent scale [2], so this paper focuses on predicting the Brent oil price.

Falling oil prices affect the world's environment. As oil prices drop, fuel bills are lowered. As a result, consumers are very likely to use more oil and thus increase carbon emissions. In addition, there is less incentive to develop renewable and clean energy resources. On the other hand, sustained low oil prices could lead to a drop in global oil and gas exploration and exploitation activities [3].

Fluctuating oil prices also play an important role in the global economy [4]. A fall in oil prices gives a modest boost to global economic activity, although owners in the oil sector suffer income losses. Owing to COVID-19, a considerable drop has been observed in international crude oil markets. Indeed, energy prices have shown a declining tendency owing to the fall in energy demand following containment [5, 6]. To sum up, oil price forecasting is of great political, economic and environmental significance, so further research on it is needed to benefit all parties. Many methods have been developed for oil price prediction. Crude oil price forecasting approaches can be divided into three categories: (1) heuristic approaches; (2) statistical models; and (3) machine learning techniques.

Heuristic approaches for oil price prediction include professional and survey forecasts, which are mainly based on professional knowledge, judgments, opinion and intuition. Another heuristic approach, the so-called no-change forecast, uses the current price of oil as the best prediction of future oil prices. Despite its simplicity, the no-change forecast appears to be a good baseline approach for oil price prediction and is better than other heuristic judgmental approaches [7].

Statistical models are the most widely used approaches for oil price prediction. They include autoregressive moving average (ARMA) models and vector autoregressive (VAR) models, possibly with different input variables [8]. These statistical models provide more accurate predictions than the no-change model, at least at some horizons. Recently, Baumeister and Kilian [9] proposed a combination forecasting approach that combines 6 different oil price prediction models, including both statistical models (such as the VAR model) and the no-change model. It should be noted that most statistical models are linear and are not able to capture the nonlinearity of oil prices. In recent years, deep learning methods have attracted extensive attention owing to their excellent prediction accuracy and stability [6, 10–13], and several machine learning techniques have been proposed for oil price prediction, such as artificial neural networks (ANN) [14] and support vector machines (SVM) [15]. These nonlinear models may produce more accurate predictions if the oil price data are strongly nonlinear [16]. However, like other traditional machine learning techniques, they rely on a fixed set of training data to train a model and then apply the model to a test set. Such an approach works well if the training data and the test data are generated from a stationary process, but may not be effective for non-stationary time series such as oil price data. More recently, Theerthagiri and Ruby [17] proposed seasonal learning based on the ARIMA algorithm for predicting Brent oil trends; Kim and Jang [18] integrated a convolutional neural network and a recurrent neural network with skip connections to predict petroleum prices; and Lazcano et al. [19] combined the characteristics of a graph convolutional network (GCN) and a bidirectional long short-term memory (BiLSTM) network to forecast the price of oil. These statistical and machine learning methods still have the problems mentioned above.

In recent years, decomposition-reconstruction methods have been used to predict oil prices. Guliyev and Mustafayev [20] proposed some statistical ensemble methods such as adaptive boosting (Adaboost) to predict the oil price. Zhang et al. [21] proposed a hybrid GRU neural network based on decomposition-reconstruction methods to predict the oil price, and Zhao et al. [22] proposed a hybrid wavelet decomposer and ARDL-SVR ensemble model to forecast oil inventory changes with Google trends. These decomposition-reconstruction methods decompose the time series before forecasting, which makes the original complex data easier to forecast. However, these methods forecast a real number, whereas the oil price fluctuates within an interval every day, so a real-valued forecast carries insufficient information for the futures market.

Figure 1 shows the volatility of BRT oil prices over 10 years. Owing to this high volatility [23], oil price forecasting remains one of the most challenging forecasting problems, and previous studies conducted on real numbers are not well suited to these data. In a fuzzy environment, triangular fuzzy numbers are a common way of expressing uncertain information, and they compensate for the shortcomings of real numbers and interval numbers. Thus, it is natural to develop a triangular fuzzy series forecasting method for crude oil prices. Zeng et al. [24] proposed a triangular fuzzy series forecasting method based on the grey model and a neural network. However, the essence of the grey model is to fit the raw series with an exponential-type curve; its prediction curve is smooth and expresses the development trend of the series, so the grey model cannot predict the fluctuating trend of the series effectively. Recently, Zhang and Chen [25] used the IOWA operator to build a multi-objective combination forecasting model based on the correlation coefficients of the corresponding area sequence and gravity center sequence of triangular fuzzy numbers, and transformed the model into a single-objective programming model by introducing an importance parameter. However, this method can only deal with symmetric triangular fuzzy numbers, which are a special case, and cannot be adapted to all triangular fuzzy numbers, especially in crude oil price prediction. One cannot simply take the midpoint of the two endpoints as the middle point of a triangular fuzzy number, because this clearly fails to explain the membership degrees of the triangular fuzzy number. Thus, an interesting and important open issue is how to establish a model suitable for arbitrary triangular fuzzy numbers. Up to now, there has been no research on this issue.

Fig. 1

The high volatility of oil prices.

Combination forecasting was first proposed by Bates and Granger in 1969 [26]. It takes an appropriately weighted average of single forecasting methods, where the weighting factors are the optimal solution of an optimization problem formed under a chosen criterion. In this way, the effective information of the various single prediction methods can be fully exploited, improving prediction accuracy and providing a basis for scientific analysis. Combination forecasting can overcome the limitations of a single method and is widely used [27]. In the field of aviation, Guo et al. [28] used a double-level combination approach for demand forecasting of repairable airplane spare parts. In the energy field, Perera et al. [29] carried out multi-horizon distributed solar PV power forecasting with a combination forecasting approach. In environmental protection, Wang et al. [30] used a combination forecasting model to forecast port pollutant discharge. The key question in combination forecasting is how to find the weighted average coefficients and increase forecasting precision more effectively [31]. Zhou et al. [32] proposed a model based on CMBCF to determine weights dynamically, and Valle Dos Santos et al. [33] used horizon-optimized weights, i.e., weights that may vary over the forecasting horizon. In recent years, time series decomposition has been incorporated into combination forecasting to study decomposition integration [34]. As combination forecasting has evolved, it has developed from the real-number type to the interval type, but the triangular fuzzy type is rarely mentioned. This paper also hopes to further popularize triangular fuzzy combination forecasting.

The literature review leads to three observations. First, to the best of our knowledge, no study compares the difference between two triangular fuzzy numbers and builds an effective application model on it. Second, most combination forecasting models are only applicable to point-valued or interval-valued time series [34], and triangular fuzzy combination forecasting models need to be further developed. Third, existing oil price forecasting models suggest that using a single model is no longer appropriate: different models suit different data characteristics, so more appropriate model assumptions must be adopted when forecasting. Therefore, developing a triangular fuzzy combination method based on the difference between two triangular fuzzy numbers and proposing a new triangular-valued oil price combination forecasting model are the main tasks of this research.

In this paper, we will propose a BRT oil price combination forecasting approach based on the dispersion degree of two triangular fuzzy numbers. The dispersion degree of two triangular fuzzy numbers based on the center distance and radius of the circle model will be developed, which can predict the fluctuating trend of the triangular fuzzy sets effectively. Moreover, in order to improve the accuracy of oil price forecasting, we will use three existing efficient individual oil price forecasting methods mentioned above to make a combination forecast. Finally, an illustrated example for forecasting oil price will be used to verify the effectiveness of our proposed method. The proposed method provides a new triangular fuzzy combination forecasting framework to enrich the oil price forecasting. The contributions of this paper and the novelty of the proposed approach are summarized as follows:

The distance measure between two triangular fuzzy numbers is improved, which is no longer limited to symmetric triangular fuzzy numbers, but extended to all triangular fuzzy numbers.

The circumradius of triangular fuzzy numbers is introduced to construct dispersion degree which can measure the difference between two triangular fuzzy numbers better.

A triangular fuzzy combination forecasting model is proposed to predict triangular-valued oil prices accurately and stably. The model is suitable for all triangular oil price data including asymmetric ones which solves the problem that triangular fuzzy numbers are difficult to explain in reality and the problem of triangular fuzzy number combination forecasting with geometry. At the same time, it provides a data processing technique for triangular-valued time series.

Considering the linearity and non-linearity characteristics of the triangular-valued oil prices series, both statistical and artificial intelligence methods are involved in oil price forecasting, which can improve the performance of triangular fuzzy forecasting.

The rest of the paper is organized as follows. Section 2 introduces the basic concepts of triangular fuzzy numbers and information aggregation operators, together with three single prediction methods. Section 3 defines the dispersion degree of triangular fuzzy numbers. Section 4 establishes the prediction model of the dispersion degree for triangular fuzzy numbers based on the circumcenter and circumradius. Section 5 presents two practical examples: power load prediction and Brent oil price prediction. Finally, conclusions are drawn in Section 6.

2 Preliminaries

2.1 Fuzzy sets and distance measure of fuzzy sets

The concept of a fuzzy set (FS) was first proposed by Zadeh [35] and is widely used in uncertain environments.

Definition 1. A FS δ on the domain ξ is defined as

$δ = \{ \langle x, μ_{δ} (x) \rangle \mid x \in ξ \},$ (1)

where μ_δ (x) : ξ → [0, 1] is the membership function of the fuzzy set δ and μ_δ (x) ∈ [0, 1] is the membership of x ∈ ξ in δ.

The distance measures of two FSs usually used are defined as follows [36]:

Definition 2. Let δ and σ be two FSs with membership functions μ_δ (x_i) and μ_σ (x_i), i = 1, 2, ⋯ , n. Their most widely used distances are:

The Hamming distance d_h (δ, σ):

$d_{h} (δ, σ) = \sum_{i = 1}^{n} | μ_{δ} (x_{i}) - μ_{σ} (x_{i}) | .$ (2)

The normalized Hamming distance l (δ, σ):

$l (δ, σ) = \frac{1}{n} \sum_{i = 1}^{n} | μ_{δ} (x_{i}) - μ_{σ} (x_{i}) | .$ (3)

The Euclidean distance d_e (δ, σ):

$d_{e} (δ, σ) = \sqrt{\sum_{i = 1}^{n} {(μ_{δ} (x_{i}) - μ_{σ} (x_{i}))}^{2}} .$ (4)

The normalized Euclidean distance q (δ, σ):

$q (δ, σ) = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(μ_{δ} (x_{i}) - μ_{σ} (x_{i}))}^{2}} .$ (5)
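The four distances of Definition 2 can be sketched in a few lines of Python. This is only an illustration: the function names and the membership values below are made up for the example and do not come from the paper.

```python
import math

def hamming(mu_a, mu_b):
    """Hamming distance, Eq. (2): sum of absolute membership differences."""
    return sum(abs(x - y) for x, y in zip(mu_a, mu_b))

def normalized_hamming(mu_a, mu_b):
    """Normalized Hamming distance, Eq. (3)."""
    return hamming(mu_a, mu_b) / len(mu_a)

def euclidean(mu_a, mu_b):
    """Euclidean distance, Eq. (4)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(mu_a, mu_b)))

def normalized_euclidean(mu_a, mu_b):
    """Normalized Euclidean distance, Eq. (5)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(mu_a, mu_b)) / len(mu_a))

# Hypothetical membership values of two fuzzy sets on the same four points.
mu_delta = [0.2, 0.5, 1.0, 0.4]
mu_sigma = [0.1, 0.7, 0.8, 0.4]

print(hamming(mu_delta, mu_sigma))
print(normalized_hamming(mu_delta, mu_sigma))
print(euclidean(mu_delta, mu_sigma))
print(normalized_euclidean(mu_delta, mu_sigma))
```

Both sets must be sampled on the same points x_1, ⋯ , x_n for these sums to be meaningful.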

Let δ, σ, ζ be three FSs and d (·, ·) be a distance measure; then d (·, ·) satisfies the following properties [36]:

d (δ, σ) ≥0.

d (δ, σ) = d (σ, δ).

d (δ, δ) =0.

d (δ, σ) + d (σ, ζ) ≥ d (δ, ζ).

2.2 Triangular fuzzy numbers and distance measure

A triangular fuzzy number (TF) is a special case of FS, defined as follows:

Definition 3. Let a = (a_l, a_m, a_u) be a fuzzy number on the real domain R satisfying a_l ≤ a_m ≤ a_u, with membership function u (x):

$u (x) = \begin{cases} (x - a_{l}) / (a_{m} - a_{l}), & a_{l} \le x \le a_{m} \\ (a_{u} - x) / (a_{u} - a_{m}), & a_{m} < x \le a_{u} \\ 0, & \text{otherwise} \end{cases}$ (6)

Then a is called a triangular fuzzy number. If the left endpoint a_l is greater than 0, then a is called a positive triangular fuzzy number. If a_m - a_l = a_u - a_m, then a is called a symmetric triangular fuzzy number; otherwise a is called an asymmetric triangular fuzzy number.

Especially when a_l = a_m = a_u, a degenerates into a real number.

For any two triangular fuzzy numbers a = (a_l, a_m, a_u) and b = (b_l, b_m, b_u), they have operational rules as follow:

a = b ⇔ a_l = b_l, a_m = b_m, a_u = b_u.

a ± b = (a_l ± b_l, a_m ± b_m, a_u ± b_u).

ka = (ka_l, ka_m, ka_u) , k ≥ 0.
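As a minimal sketch, the membership function of Definition 3 and the operational rules above can be written as follows. The `TFN` type and the function names are hypothetical helpers introduced here for illustration; the subtraction rule is the componentwise one stated above.

```python
import math
from typing import NamedTuple

class TFN(NamedTuple):
    """Triangular fuzzy number (a_l, a_m, a_u) with a_l <= a_m <= a_u."""
    l: float
    m: float
    u: float

def membership(a: TFN, x: float) -> float:
    """Membership degree u(x) of Eq. (6); the apex a_m has membership 1."""
    if x == a.m:
        return 1.0
    if a.l <= x < a.m:
        return (x - a.l) / (a.m - a.l)
    if a.m < x <= a.u:
        return (a.u - x) / (a.u - a.m)
    return 0.0

def tfn_add(a: TFN, b: TFN) -> TFN:
    return TFN(a.l + b.l, a.m + b.m, a.u + b.u)

def tfn_sub(a: TFN, b: TFN) -> TFN:
    # Componentwise rule as stated in the text.
    return TFN(a.l - b.l, a.m - b.m, a.u - b.u)

def tfn_scale(k: float, a: TFN) -> TFN:
    if k < 0:
        raise ValueError("the scaling rule is stated only for k >= 0")
    return TFN(k * a.l, k * a.m, k * a.u)

def is_symmetric(a: TFN) -> bool:
    return math.isclose(a.m - a.l, a.u - a.m)

a = TFN(1.0, 2.0, 4.0)   # asymmetric example
b = TFN(0.0, 1.0, 1.0)
print(membership(a, 1.5), membership(a, 3.0))
print(tfn_add(a, b), tfn_scale(2.0, a), is_symmetric(a))
```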

Harish and Rani [37] extended the fuzzy distance into the triangular fuzzy set and designed a new distance measure for symmetric FSs. The formula of the circumcenter point of the symmetric triangular fuzzy set is firstly given as follows:

Definition 4. Let a = (a_l, a_m, a_u) be a triangular fuzzy number and u_a be the membership of a:

$H_{c} (a) = \left(\frac{1 + a_{l} - u_{a} (a_{l})}{2}, \frac{3 (1 - a_{l}) + 5 u_{a} (a_{l})}{8}\right),$ (7)

then H_c (a) is called the circumcenter point of a.

Definition 5. Let a = (a_l, a_m, a_u) and b = (b_l, b_m, b_u) be two triangular fuzzy numbers and u_a, u_b be the memberships of a and b, respectively. The distance measure d_cc (a, b) is constructed based on the circumcenter as follows:

$d_{cc} (a, b) = \left(\frac{| (a_{l} - b_{l}) - (u_{a} (a_{l}) - u_{b} (b_{l})) |}{2}\right) \cdot \left(1 - \frac{(a_{u} - a_{l}) + (b_{u} - b_{l})}{2}\right) + \left(\frac{| 5 (u_{a} (a_{l}) - u_{b} (b_{l})) - 3 (a_{l} - b_{l}) |}{8}\right) \cdot \left(\frac{(a_{u} - a_{l}) + (b_{u} - b_{l})}{2}\right),$ (8)

then d_cc (a, b) is called the distance of a and b based on circumcenter points.

As can be seen from Equation (8), the distance measure of a and b based on circumcenter points uses the Hamming distance, but it is only applicable to symmetric triangular fuzzy sets, so the distance measure needs to be improved to apply to general triangular fuzzy sets. The improved distance measure, which uses the characteristics of triangular fuzzy numbers, is described in detail in Section 3.

2.3 Single prediction methods

(1) Autoregressive Integrated Moving Average model

The Autoregressive Integrated Moving Average model, ARIMA (p,d,q), has the following form:

$Φ (β) \nabla^{d} Y_{t} = Θ (β) e_{t},$ (9)

where β is the backshift operator, ∇^d = (1 - β) ^d, Φ (β) = 1 - τ₁ β - τ₂ β² - ⋯ - τ_p β^p is the autoregressive polynomial and Θ (β) = 1 - θ₁ β - θ₂ β² - ⋯ - θ_q β^q is the moving average polynomial of a stationary and invertible ARMA (p,q) model.

The original time series must be tested for stationarity. If the series does not meet the stationarity condition, differencing can be applied until it does, and the number of differences taken gives the value of d in the model. Therefore, the stationarity of the sequence is tested first with a unit root test (this paper mainly uses the ADF test).
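The differencing step ∇^d can be illustrated with a short Python sketch. The function name and the toy series are assumptions made for this example; the ADF test itself is not reproduced here and would normally come from a statistics library.

```python
def difference(series, d=1):
    """Apply the difference operator (1 - B)^d, i.e. d rounds of y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

# Hypothetical series with a quadratic trend: one difference leaves a linear
# trend, a second difference leaves a constant, so d = 2 would be selected.
y = [1, 4, 9, 16, 25]
print(difference(y, d=1))  # [3, 5, 7, 9]
print(difference(y, d=2))  # [2, 2, 2]
```

Each round of differencing shortens the series by one observation, which matters when the sample is small.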

(2) Least square support vector regression model

Least square support vector regression (LSSVR) is Suykens and Vandewalle's improvement on support vector regression [38, 39]. It introduces least squares estimation into the standard SVR algorithm. By converting the inequality constraints of the original algorithm into equality constraints, the quadratic programming problem is replaced by a system of linear equations, which greatly reduces the amount of calculation and improves efficiency.

There are many choices of kernel function; in this paper, the radial basis kernel function is used.

(3) Generalized Regression Neural Network model

GRNN, the Generalized Regression Neural Network, is a kind of Radial Basis Function (RBF) neural network. GRNN has strong nonlinear mapping ability, a flexible network structure, and high fault tolerance and robustness, which makes it suitable for solving nonlinear problems. GRNN has stronger advantages than the RBF network in terms of approximation ability and learning speed. The network finally converges to the optimized regression surface as the sample size grows, and the prediction effect is also good when the sample data are small. In addition, the network can handle unstable data [40].
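To make the idea concrete, the following sketch uses the standard GRNN prediction rule, a Gaussian-kernel weighted average of the training targets. It is not the paper's implementation: the function name, the smoothing parameter `sigma` and the toy data are assumptions for illustration.

```python
import math

def grnn_predict(x_train, y_train, x_new, sigma=1.0):
    """Predict y at x_new as a Gaussian-kernel weighted mean of y_train.

    x_train: list of feature vectors, y_train: list of targets,
    sigma: smoothing parameter (the only parameter to tune).
    """
    weights = [
        math.exp(-sum((p - q) ** 2 for p, q in zip(x, x_new)) / (2 * sigma ** 2))
        for x in x_train
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed: fall back to the plain mean of the targets.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Hypothetical one-dimensional training set.
x_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 2.0]
print(grnn_predict(x_train, y_train, [1.0], sigma=0.5))
```

A small `sigma` makes the prediction follow the nearest training point, while a large `sigma` smooths it toward the overall mean of the targets.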

The single prediction methods can be replaced by others, such as Adaboost, Bagging and SVR. The specific application of the above three methods is discussed in detail, and a comparison with other single methods is carried out, in Section 5.

2.4 Information aggregation operator

Information aggregation operators can be used in prediction models; two commonly used operators were first suggested by Yager [41].

Definition 6. The n-dimensional function OWA: Rⁿ → R satisfies:

$OWA (a_{1}, a_{2}, \dots, a_{n}) = \sum_{j = 1}^{n} w_{j} b_{j},$ (10)

where the weighting vector w = (w₁, w₂, ⋯ , w_n) ^T satisfies w_j ∈ [0, 1] and $\sum_{j = 1}^{n} w_{j} = 1$, and b_j is the jth largest of the a_i. Then OWA is called the ordered weighted averaging operator.

Definition 7. Let 〈v₁, a₁〉 , 〈 v₂, a₂ 〉 , ⋯ , 〈 v_n, a_n 〉 be n two-dimensional arrays, then

$IOWGA (\langle v_{1}, a_{1} \rangle, \langle v_{2}, a_{2} \rangle, \dots, \langle v_{n}, a_{n} \rangle) = \prod_{i = 1}^{n} b_{v\text{-}index (i)}^{w_{i}}$ (11)

is called the induced ordered weighted geometric averaging (IOWGA) operator, where w = (w₁, w₂, ⋯ , w_n) is the weighting vector of the IOWGA operator, which satisfies nonnegativity and normalization, v-index (i) is the subscript corresponding to the ith element in v₁, v₂, ⋯ , v_n, and v₁, v₂, ⋯ , v_n are called the induction variables.
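The two operators can be sketched as follows. The sketch assumes the usual induced ordering, in which the pairs are sorted by decreasing induction variable v before the weights are applied; that ordering, the function names and the numbers are assumptions of this example.

```python
import math

def owa(values, weights):
    """OWA, Eq. (10): weights are applied to the values sorted in decreasing order."""
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))

def iowga(pairs, weights):
    """IOWGA, Eq. (11): pairs are (v_i, a_i); a_i are reordered by decreasing v_i
    (assumed ordering) and combined as a weighted geometric mean."""
    ordered = sorted(pairs, key=lambda p: p[0], reverse=True)
    return math.prod(a ** w for (_, a), w in zip(ordered, weights))

# Hypothetical inputs: weights are nonnegative and sum to 1.
print(owa([3.0, 1.0, 2.0], [0.5, 0.3, 0.2]))
print(iowga([(0.5, 16.0), (0.9, 4.0)], [0.5, 0.5]))
```

In a combination forecast the induction variable v would typically be an accuracy indicator of each single method, so that more accurate forecasts receive the leading weights.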

For triangular fuzzy numbers, the predicted triangular fuzzy numbers should be closer to the original data, so the difference between the two triangular fuzzy numbers should be minimized. Although Harish and Rani [37] proposed the definition of triangular fuzzy number distance to measure the difference, it is also mentioned above that this definition only applies to symmetric triangular fuzzy numbers. Therefore, this paper is inspired to define a new index to measure the difference of two triangular fuzzy numbers based on the circumcenter of triangular fuzzy numbers to improve the triangular fuzzy combination forecasting. In the prediction, the circumcenter point of the triangular fuzzy numbers represents the center of the triangle fuzzy number, and the radius represents the range, which can be interpreted as the expansion of the middle point radius of the interval combination prediction in the triangle fuzzy numbers. Next, the specific definition and rationality are elaborated.

3 Dispersion degree of two triangular fuzzy numbers based on circumcenter point and circumradius

In geometry, a triangle has one and only one circumcircle. In other words, as Fig. 2 illustrates, a circumcircle contains all the information of a triangle.

Fig. 2

Triangular fuzzy numbers and circumcircle.

Definition 8. Let a = (a_l, a_m, a_u) be a triangular fuzzy number and u be the membership of a, then the triangle obtained by connecting three points (a_l, u (a_l)), (a_m, u (a_m)), (a_u, u (a_u)) is called the triangle corresponding to a.

So, the information of a can be transformed into the circumcenter and radius of the circumcircle by the properties of the circumcircle which is constructed by the triangle corresponding to a.

Theorem 1. For any triangular fuzzy number a = (a_l, a_m, a_u), we can obtain (Cx_TF, Cy_TF) and r_TF as follows:

$\begin{matrix} (C x_{TF}, C y_{TF}) = \\ (\frac{a_{l}^{2} - a_{u}^{2}}{2 (a_{l} - a_{u})}, \frac{a_{l} - a_{u} + (a_{l} - a_{m}) (a_{m} - a_{u}) (a_{u} - a_{l})}{2 (a_{l} - a_{u})}) . \end{matrix}$ (12)

$r_{TF} = \frac{\sqrt{{(a_{m} - a_{l})}^{2} + 1} \cdot \sqrt{{(a_{u} - a_{m})}^{2} + 1}}{2} .$ (13)

Then (Cx_TF, Cy_TF) is called the circumcenter point of a and r_TF is the circumradius of a.

Proof.

The circumcenter of a circumcircle in the plane is equidistant from the three vertices of the triangle, so it is found by solving the following system: $\begin{cases} {(a_{l} - C x_{TF})}^{2} + {(u (a_{l}) - C y_{TF})}^{2} = {(a_{m} - C x_{TF})}^{2} + {(u (a_{m}) - C y_{TF})}^{2} \\ {(a_{m} - C x_{TF})}^{2} + {(u (a_{m}) - C y_{TF})}^{2} = {(a_{u} - C x_{TF})}^{2} + {(u (a_{u}) - C y_{TF})}^{2} . \end{cases}$

The circumradius is obtained from the following equation: $r_{TF} = \frac{Δ_{a_{l} a_{m}} \cdot Δ_{a_{m} a_{u}} \cdot Δ_{a_{l} a_{u}}}{4 S_{a}},$ where $\begin{matrix} Δ_{a_{l} a_{m}} = \sqrt{{(a_{l} - a_{m})}^{2} + {(u (a_{l}) - u (a_{m}))}^{2}}, \\ Δ_{a_{m} a_{u}} = \sqrt{{(a_{m} - a_{u})}^{2} + {(u (a_{m}) - u (a_{u}))}^{2}}, \\ Δ_{a_{l} a_{u}} = \sqrt{{(a_{l} - a_{u})}^{2} + {(u (a_{l}) - u (a_{u}))}^{2}}, \\ S_{a} = \frac{u (a_{m}) \cdot Δ_{a_{l} a_{u}}}{2} . \end{matrix}$

The solution to the system is obtained, and the theorem holds.
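Theorem 1 can be checked numerically with a small sketch. It uses the vertices (a_l, 0), (a_m, 1), (a_u, 0) and the simplified forms of Eq. (12), which are algebraically equal to the stated ones when a_l ≠ a_u; the function names and the sample number are assumptions for illustration.

```python
import math

def circumcenter(a_l, a_m, a_u):
    """Circumcenter of the triangle (a_l, 0), (a_m, 1), (a_u, 0), Eq. (12) simplified."""
    cx = (a_l + a_u) / 2
    cy = (1 - (a_m - a_l) * (a_u - a_m)) / 2
    return cx, cy

def circumradius(a_l, a_m, a_u):
    """Circumradius, Eq. (13)."""
    return math.sqrt((a_m - a_l) ** 2 + 1) * math.sqrt((a_u - a_m) ** 2 + 1) / 2

# Hypothetical asymmetric triangular fuzzy number.
a = (1.0, 2.0, 4.0)
cx, cy = circumcenter(*a)
r = circumradius(*a)

# Every vertex should lie at distance r from the circumcenter.
vertices = [(a[0], 0.0), (a[1], 1.0), (a[2], 0.0)]
distances = [math.hypot(x - cx, y - cy) for x, y in vertices]
print((cx, cy), r, distances)
```

For this example the circumcenter lies below the X-axis, which happens whenever (a_m - a_l)(a_u - a_m) > 1.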

Based on Theorem 1, a new distance measure of triangular fuzzy numbers can be defined as follows:

Definition 9. Let a = (a_l, a_m, a_u), b = (b_l, b_m, b_u) be two triangular fuzzy numbers and let (Cx_a, Cy_a), (Cx_b, Cy_b) be the circumcenter points of a and b. Then

$D (a, b) = \sqrt{{(C x_{a} - C x_{b})}^{2} + {(C y_{a} - C y_{b})}^{2}},$ (14)

is called the Euclidean distance between a and b.

Theorem 2. Let a = (a_l, a_m, a_u), b = (b_l, b_m, b_u) and c = (c_l, c_m, c_u) be three triangular fuzzy numbers and D (·, ·) be the distance measure of triangular fuzzy numbers; then:

D (a, b) ≥0,

D (a, b) = D (b, a),

D (a, a) =0,

D (a, b) + D (b, c) ≥ D (a, c).

Definition 10. Let a₁, a₂, ⋯ , a_n be n triangular fuzzy numbers, then A = (a₁, a₂, ⋯ , a_n) is called a triangular fuzzy vector.

Definition 11. Let A = (a₁, a₂, ⋯ , a_n), B = (b₁, b₂, ⋯ , b_n) be two triangular fuzzy vectors, where $a_{i} = (a_{i}^{l}, a_{i}^{m}, a_{i}^{u}), i = 1 \dots n$ and $b_{i} = (b_{i}^{l}, b_{i}^{m}, b_{i}^{u}), i = 1 \dots n$ are triangular fuzzy numbers, then:

$Q (A, B) = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \left[ {(C x_{a_{i}} - C x_{b_{i}})}^{2} + {(C y_{a_{i}} - C y_{b_{i}})}^{2} \right]},$ (15)

is called the distance of A and B, where $(C x_{a_{i}}, C y_{a_{i}})$ is the circumcenter point of a_i and $(C x_{b_{i}}, C y_{b_{i}})$ is the circumcenter point of b_i.

Theorem 3. Let A = (a₁, a₂, ⋯ , a_n), B = (b₁, b₂, ⋯ , b_n), C = (c₁, c₂, ⋯ , c_n) be three triangular fuzzy vectors, where $a_{i} = (a_{i}^{l}, a_{i}^{m}, a_{i}^{u})$ , $b_{i} = (b_{i}^{l}, b_{i}^{m}, b_{i}^{u})$ , $c_{i} = (c_{i}^{l}, c_{i}^{m}, c_{i}^{u})$ , i = 1, ⋯ , n are triangular fuzzy numbers, and Q (·, ·) be the distance measure of triangular fuzzy vectors; then:

Q (A, B) ≥0,

Q (A, B) = Q (B, A),

Q (A, A) =0,

Q (A, B) + Q (B, C) ≥ Q (A, C).

In prediction, the center and radius of the circumcircle alone suffice for the definition, because a triangular fuzzy number can be determined from the intersections of the circumcircle with the X-axis and with the line y = 1, and no information is lost in this process. If the inscribed circle is used instead, Fig. 3 shows that countless triangular fuzzy numbers can be constructed with the same circle center and radius, which causes information overlap and leads to situations where the dispersion degree is 0 but the triangular fuzzy numbers are not equal. So, the dispersion degree of two triangular fuzzy numbers proposed in this paper is based on the circumcircle of triangular fuzzy numbers.

However, Fig. 4 shows that a definite triangular fuzzy number cannot be obtained from the coordinates of the circumcenter point alone. This means that measuring the difference between two triangular fuzzy numbers only by the distance between their circumcenter points introduces a deviation. Therefore, the radius of the circumcircle is introduced to construct a new index to quantify the difference between triangular fuzzy numbers.

Fig. 3

Inscribed circle of a triangular fuzzy number.

Fig. 4

Two triangular fuzzy numbers with the same circumcenter point.

Definition 12. Let a = (a_l, a_m, a_u), b = (b_l, b_m, b_u) be two triangular fuzzy numbers, let (Cx_a, Cy_a), (Cx_b, Cy_b) be the circumcenter points of a, b, and let r_a, r_b be the circumradii of a, b. Then

$DD (a, b) = α \sqrt{{(C x_{a} - C x_{b})}^{2} + {(C y_{a} - C y_{b})}^{2}} + (1 - α) \sqrt{{(r_{a} - r_{b})}^{2}},$ (16)

is called the dispersion degree of a and b, where α ∈ (0, 1) is an attitude parameter.

In Equation (16), the attitude parameter α determines the relative importance of the triangular fuzzy distance (the distance between circumcenters) and the circumradius. If α approaches 1, the triangular fuzzy distance is more important for the dispersion degree, while if α approaches 0, the circumradius is more important.
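A minimal sketch of Eq. (16) follows. It recomputes the circumcenter and circumradius from Eqs. (12) and (13) for vertices (a_l, 0), (a_m, 1), (a_u, 0); the function names, the default α = 0.5 and the sample numbers are assumptions for illustration only.

```python
import math

def circumcenter(a_l, a_m, a_u):
    # Eq. (12), simplified (valid for a_l != a_u).
    return (a_l + a_u) / 2, (1 - (a_m - a_l) * (a_u - a_m)) / 2

def circumradius(a_l, a_m, a_u):
    # Eq. (13).
    return math.sqrt((a_m - a_l) ** 2 + 1) * math.sqrt((a_u - a_m) ** 2 + 1) / 2

def dispersion_degree(a, b, alpha=0.5):
    """DD(a, b) of Eq. (16): alpha weighs the circumcenter distance,
    (1 - alpha) weighs the circumradius difference."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    cxa, cya = circumcenter(*a)
    cxb, cyb = circumcenter(*b)
    center_term = math.hypot(cxa - cxb, cya - cyb)
    radius_term = abs(circumradius(*a) - circumradius(*b))
    return alpha * center_term + (1 - alpha) * radius_term

# Hypothetical triangular fuzzy numbers.
a, b, c = (1.0, 2.0, 4.0), (2.0, 3.0, 4.0), (0.0, 1.0, 5.0)
for alpha in (0.1, 0.5, 0.9):
    print(alpha, dispersion_degree(a, b, alpha))
```

Running the loop shows how the same pair of numbers receives a different dispersion degree as α shifts weight between the center term and the radius term.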

Theorem 4. Let a = (a_l, a_m, a_u), b = (b_l, b_m, b_u) and c = (c_l, c_m, c_u) be three triangular fuzzy numbers and DD (·, ·) be the dispersion degree of two triangular fuzzy numbers; then:

DD (a, b) ≥0,

DD (a, b) = DD (b, a),

DD (a, a) =0,

DD (a, b) + DD (b, c) ≥ DD (a, c).

Proof.

(1) Since α ∈ (0, 1) and the dispersion degree is a weighted sum of two nonnegative square roots, the dispersion degree is clearly nonnegative.

(2) Interchangeability. $\begin{matrix} \begin{matrix} DD (a, b) = α \sqrt{{(C x_{a} - C x_{b})}^{2} + {(C y_{a} - C y_{b})}^{2}} \\ + (1 - α) \sqrt{{(r_{a} - r_{b})}^{2}} \\ = α \sqrt{{(C x_{b} - C x_{a})}^{2} + {(C y_{b} - C y_{a})}^{2}} \\ + (1 - α) \sqrt{{(r_{b} - r_{a})}^{2}} \\ = DD (b, a) . \end{matrix} \end{matrix}$

(3) $DD (a, a) = α \sqrt{{(C x_{a} - C x_{a})}^{2} + {(C y_{a} - C y_{a})}^{2}} + (1 - α) \sqrt{{(r_{a} - r_{a})}^{2}} = 0$ .

(4) Triangle inequality.

By the triangle inequality for the Euclidean distance and for the absolute value, we get $α \sqrt{{(C x_{a} - C x_{b})}^{2} + {(C y_{a} - C y_{b})}^{2}} + α \sqrt{{(C x_{b} - C x_{c})}^{2} + {(C y_{b} - C y_{c})}^{2}} \geq α \sqrt{{(C x_{a} - C x_{c})}^{2} + {(C y_{a} - C y_{c})}^{2}}$ and $(1 - α) \sqrt{{(r_{a} - r_{b})}^{2}} + (1 - α) \sqrt{{(r_{b} - r_{c})}^{2}} \geq (1 - α) \sqrt{{(r_{a} - r_{c})}^{2}} .$

Adding the two inequalities gives $α \sqrt{{(C x_{a} - C x_{b})}^{2} + {(C y_{a} - C y_{b})}^{2}} + (1 - α) \sqrt{{(r_{a} - r_{b})}^{2}} + α \sqrt{{(C x_{b} - C x_{c})}^{2} + {(C y_{b} - C y_{c})}^{2}} + (1 - α) \sqrt{{(r_{b} - r_{c})}^{2}} \geq α \sqrt{{(C x_{a} - C x_{c})}^{2} + {(C y_{a} - C y_{c})}^{2}} + (1 - α) \sqrt{{(r_{a} - r_{c})}^{2}} .$

Thus DD (a, b) + DD (b, c) ≥ DD (a, c).

Similarly, the dispersion degree of two triangular vectors (DDTV) can be defined:

Definition 13. Let A = (a₁, a₂, ⋯ , a_n), B = (b₁, b₂, ⋯ , b_n) be two triangular fuzzy vectors, where $a_{i} = (a_{i}^{l}, a_{i}^{m}, a_{i}^{u}), i = 1 \dots n$ and $b_{i} = (b_{i}^{l}, b_{i}^{m}, b_{i}^{u}), i = 1 \dots n$ are triangular fuzzy numbers, then:

$DDTV (A, B) = α \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \left[ {(C x_{a_{i}} - C x_{b_{i}})}^{2} + {(C y_{a_{i}} - C y_{b_{i}})}^{2} \right]} + (1 - α) \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(r_{a_{i}} - r_{b_{i}})}^{2}},$ (17)

is called the dispersion degree of A and B, where $(C x_{a_{i}}, C y_{a_{i}})$ and $r_{a_{i}}$ are the circumcenter point and circumradius of a_i, and $(C x_{b_{i}}, C y_{b_{i}})$ and $r_{b_{i}}$ are the circumcenter point and circumradius of b_i.

Theorem 5. Let A = (a₁, a₂, ⋯ , a_n), B = (b₁, b₂, ⋯ , b_n), C = (c₁, c₂, ⋯ , c_n) be three triangular fuzzy vectors, where $a_{i} = (a_{i}^{l}, a_{i}^{m}, a_{i}^{u})$ , $b_{i} = (b_{i}^{l}, b_{i}^{m}, b_{i}^{u})$ , $c_{i} = (c_{i}^{l}, c_{i}^{m}, c_{i}^{u})$ i = 1 ⋯ n are triangular fuzzy numbers, DDTV (· , ·) be the dispersion degree of triangular fuzzy vectors, then:

DDTV (A, B) ≥0,

DDTV (A, B) = DDTV (B, A),

DDTV (A, A) =0,

DDTV (A, B) + DDTV (B, C) ≥ DDTV (A, C).

The proof is analogous to that of Theorem 4.
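The vector version in Eq. (17) can be sketched in the same way, for instance to compare an observed triangular fuzzy series with a forecast. The function names, the default α and the sample triples below are hypothetical and are not data from the paper.

```python
import math

def circumcenter(a_l, a_m, a_u):
    # Eq. (12), simplified (valid for a_l != a_u).
    return (a_l + a_u) / 2, (1 - (a_m - a_l) * (a_u - a_m)) / 2

def circumradius(a_l, a_m, a_u):
    # Eq. (13).
    return math.sqrt((a_m - a_l) ** 2 + 1) * math.sqrt((a_u - a_m) ** 2 + 1) / 2

def ddtv(A, B, alpha=0.5):
    """DDTV(A, B) of Eq. (17) for two equally long lists of (l, m, u) triples."""
    if len(A) != len(B) or not A:
        raise ValueError("A and B must be non-empty and of equal length")
    n = len(A)
    center_sq = 0.0
    radius_sq = 0.0
    for a, b in zip(A, B):
        cxa, cya = circumcenter(*a)
        cxb, cyb = circumcenter(*b)
        center_sq += (cxa - cxb) ** 2 + (cya - cyb) ** 2
        radius_sq += (circumradius(*a) - circumradius(*b)) ** 2
    return alpha * math.sqrt(center_sq / n) + (1 - alpha) * math.sqrt(radius_sq / n)

# Hypothetical observed and predicted triangular fuzzy series.
observed = [(60.0, 61.0, 63.0), (61.0, 62.5, 63.0), (59.5, 60.0, 62.0)]
predicted = [(60.2, 61.1, 62.8), (60.8, 62.0, 63.4), (59.0, 60.3, 61.7)]
print(ddtv(observed, predicted, alpha=0.5))
```

A smaller value indicates that the predicted series is closer to the observed one in both circumcenter position and circumradius.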

4 Combination forecasting model based on IOWGA and DDTV

4.1 Transformation of the triangular fuzzy series

Combination forecasting is usually carried out in order to combine the advantages of several individual forecasting methods.

Definition 14. Let ${\tilde{Y}}_{t}$ be the fuzzy observation value (output value) of a fuzzy prediction problem at time t (t = 1, 2, ⋯ , n), and let ${\tilde{Y}}_{it}$ be the predicted value of the ith prediction method at time t. Then the combination forecasting of k fuzzy prediction methods is:

${\tilde{Y}}_{t} = f ({\tilde{Y}}_{1 t}, {\tilde{Y}}_{2 t}, \dots, {\tilde{Y}}_{kt}) ,$ (18) where f represents an information aggregation operator, a notion first put forward in the field of decision-making.

Definition 15. Let actual values of the index sequence of a complex system be triangular fuzzy time sequences {x_t}, where $x_{t} = (x_{t}^{L}, x_{t}^{M}, x_{t}^{U})$ , t = 1, 2, ⋯ N are triangular fuzzy numbers, then

$(C x_{t}, C y_{t}) = (\frac{{(x_{t}^{L})}^{2} - {(x_{t}^{U})}^{2}}{2 (x_{t}^{L} - x_{t}^{U})}, \frac{x_{t}^{L} - x_{t}^{U} + (x_{t}^{L} - x_{t}^{M}) (x_{t}^{M} - x_{t}^{U}) (x_{t}^{U} - x_{t}^{L})}{2 (x_{t}^{L} - x_{t}^{U})}),$ (19) is called the actual circumcenter of the triangular fuzzy number time sequences at time t,

$r_{t} = \frac{\sqrt{{(x_{t}^{M} - x_{t}^{L})}^{2} + 1} \cdot \sqrt{{(x_{t}^{U} - x_{t}^{M})}^{2} + 1}}{2},$ (20) is called the actual circumradius of the triangular fuzzy number time sequences at time t.
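Equations (19) and (20) can be checked numerically. The sketch below writes both formulas term by term and verifies that the resulting point is equidistant from the three vertices of the triangle. It assumes, as a reading of the triangular membership function, that the vertices are (x^L, 0), (x^M, 1) and (x^U, 0); the function name `circumcircle` is hypothetical, and the sample number is the first original observation in Table 1.

```python
import math

def circumcircle(x_l, x_m, x_u):
    """Circumcenter (cx, cy) and circumradius r, following Eqs. (19)-(20)."""
    cx = (x_l ** 2 - x_u ** 2) / (2 * (x_l - x_u))
    cy = (x_l - x_u + (x_l - x_m) * (x_m - x_u) * (x_u - x_l)) / (2 * (x_l - x_u))
    r = math.sqrt((x_m - x_l) ** 2 + 1) * math.sqrt((x_u - x_m) ** 2 + 1) / 2
    return cx, cy, r

cx, cy, r = circumcircle(3.7, 4.8, 6.0)

# Assumed vertices of the membership triangle: memberships 0, 1, 0.
vertices = [(3.7, 0.0), (4.8, 1.0), (6.0, 0.0)]
distances = [math.hypot(vx - cx, vy - cy) for vx, vy in vertices]
```

All three entries of `distances` agree with `r` up to rounding, which is what a circumcenter requires. Note that both formulas need x^L ≠ x^U, since the denominators vanish otherwise.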

Definition 16. If there are m single prediction methods to predict the system, then {x_it} is called triangular fuzzy prediction time sequences of the ith single prediction method at time t, where $x_{it} = (x_{it}^{L}, x_{it}^{M}, x_{it}^{U})$ , t = 1, 2, ⋯ N, i = 1, 2, ⋯ , m are triangular fuzzy numbers.

Definition 17. Let {x_t} be the triangular fuzzy time sequences, where $x_{t} = (x_{t}^{L}, x_{t}^{M}, x_{t}^{U})$ , t = 1, 2, ⋯ N are triangular fuzzy numbers, {x_it} be the triangular fuzzy prediction time sequences, where $x_{it} = (x_{it}^{L}, x_{it}^{M}, x_{it}^{U})$ , t = 1, 2, ⋯ N, i = 1, 2, ⋯ , m are triangular fuzzy numbers, then

$ɛ_{it} = \begin{cases} 1 - | (x_{t}^{L} - x_{it}^{L}) / x_{t}^{L} |, & | (x_{t}^{L} - x_{it}^{L}) / x_{t}^{L} | < 1 \\ 0, & | (x_{t}^{L} - x_{it}^{L}) / x_{t}^{L} | \geq 1 \end{cases},$ (21) is called the prediction accuracy of the ith single prediction method for lower point at time t,

$φ_{it} = \begin{cases} 1 - | (x_{t}^{M} - x_{it}^{M}) / x_{t}^{M} |, & | (x_{t}^{M} - x_{it}^{M}) / x_{t}^{M} | < 1 \\ 0, & | (x_{t}^{M} - x_{it}^{M}) / x_{t}^{M} | \geq 1 \end{cases},$ (22) is called the prediction accuracy of the ith single prediction method for medium point at time t,

$φ_{it} = \begin{cases} 1 - | (x_{t}^{U} - x_{it}^{U}) / x_{t}^{U} |, & | (x_{t}^{U} - x_{it}^{U}) / x_{t}^{U} | < 1 \\ 0, & | (x_{t}^{U} - x_{it}^{U}) / x_{t}^{U} | \geq 1 \end{cases},$ (23) is called the prediction accuracy of the ith single prediction method for upper point at time t.
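The prediction accuracy of Definition 17 can be sketched in a few lines of Python. The function name `accuracy` is hypothetical. The relative error is taken here with respect to the actual value x_t; this is an assumption, chosen because it reproduces the first-row entries of Table 2 (0.7865, 0.8667, 0.8583 for TFGM (1,1)), and the case of a relative error exactly equal to 1 is mapped to 0.

```python
def accuracy(actual, predicted):
    """Prediction accuracy of one point: 1 - relative error, floored at 0.

    Assumption: the relative error is measured against the actual value.
    """
    rel_err = abs((actual - predicted) / actual)
    return 1.0 - rel_err if rel_err < 1.0 else 0.0

# Period t = 1 of Table 1: original data and the TFGM (1,1) forecast.
actual = (3.7, 4.8, 6.0)
tfgm = (4.49, 5.44, 6.85)

# Accuracies of the lower, medium and upper points.
acc = tuple(accuracy(a, p) for a, p in zip(actual, tfgm))
```

Rounded to four decimals, `acc` matches the three TFGM (1,1) accuracies listed for t = 1 in Table 2.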

Definition 18. Let {x_it} be the triangular fuzzy prediction time sequences, where $x_{it} = (x_{it}^{L}, x_{it}^{M}, x_{it}^{U})$ , t = 1, 2, ⋯ N, i = 1, 2, ⋯ , m are triangular fuzzy numbers, then ${{\hat{x}}_{t}}$ is called the combination forecasting sequences at time t based on IOWGA, where ɛ_it, φ_it, φ_it are induced values defined in Definition 17, and ${\hat{x}}_{t} = ({\hat{x}}_{t}^{L}, {\hat{x}}_{t}^{M}, {\hat{x}}_{t}^{U})$ , t = 1, 2, ⋯ N are triangular fuzzy numbers:

$\begin{matrix} {\hat{x}}_{t}^{L} = IOWG A_{L} (〈 ɛ_{1 t}, x_{1 t}^{L} 〉, 〈 ɛ_{2 t}, x_{2 t}^{L} 〉, \\ \dots, 〈 ɛ_{mt}, x_{mt}^{L} 〉) = \prod_{i = 1}^{m} x_{\bar{u} - index (it)}^{w_{i}}, \end{matrix}$ (24) ${\hat{x}}_{t}^{L}$ is lower combination forecasting point at time t,

$\begin{matrix} {\hat{x}}_{t}^{M} = IOWG A_{M} (〈 φ_{1 t}, x_{1 t}^{M} 〉, 〈 φ_{2 t}, x_{2 t}^{M} 〉, \\ \dots, 〈 φ_{mt}, x_{mt}^{M} 〉) = \prod_{i = 1}^{m} x_{\bar{u} - index (it)}^{w_{i}}, \end{matrix}$ (25) ${\hat{x}}_{t}^{M}$ is medium combination forecasting point at time t,

$\begin{matrix} {\hat{x}}_{t}^{U} = IOWG A_{U} (〈 φ_{1 t}, x_{1 t}^{U} 〉, 〈 φ_{2 t}, x_{2 t}^{U} 〉, \\ \dots, 〈 φ_{mt}, x_{mt}^{U} 〉) = \prod_{i = 1}^{m} x_{\bar{u} - index (it)}^{w_{i}}, \end{matrix}$ (26)

${\hat{x}}_{t}^{U}$ is upper combination forecasting point at time t.
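The aggregation in Equations (24)–(26) can be illustrated with a short Python sketch: the forecasts are reordered by decreasing induced value (prediction accuracy), and the weighted geometric mean is then taken with position weights w_i. The function name `iowga` is hypothetical, ties between induced values are resolved by input order (an assumption, since the text does not specify a tie rule), and all forecast values must be positive. The data are the t = 1 forecasts of Table 1, the accuracies of Table 2, and the α = 1 weights of Table 3.

```python
import math

def iowga(pairs, weights):
    """Induced ordered weighted geometric average.

    pairs   : list of (induced_value, forecast_value), forecast_value > 0
    weights : position weights, non-negative and summing to 1
    """
    # Sort by induced value, largest first; Python's sort keeps ties in input order.
    ordered = sorted(pairs, key=lambda p: p[0], reverse=True)
    log_sum = sum(w * math.log(x) for w, (_, x) in zip(weights, ordered))
    return math.exp(log_sum)

weights = (0.9974, 0.0026, 0.0)  # alpha = 1 row of Table 3

# (accuracy, forecast) for TFGM, NNTFGM, SVMTFGM at t = 1.
lower = iowga([(0.7865, 4.49), (0.9973, 3.71), (0.9919, 3.73)], weights)
medium = iowga([(0.8667, 5.44), (0.9958, 4.82), (0.9979, 4.81)], weights)
upper = iowga([(0.8583, 6.85), (0.9967, 5.98), (0.9983, 6.01)], weights)
```

With these inputs the combined triple is close to the α = 1 entry for t = 1 in Table 4, (3.7101, 4.8100, 6.0099); small differences are expected because the tabulated inputs are rounded.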

Similarly, substituting the combined forecast into Equations (12) and (13), the circumcenter $(C {\hat{x}}_{t}, C {\hat{y}}_{t})$ and the circumradius ${\hat{r}}_{t}$ are obtained from Theorem 1:

$(C {\hat{x}}_{t}, C {\hat{y}}_{t}) = (\frac{{({\hat{x}}_{t}^{L})}^{2} - {({\hat{x}}_{t}^{U})}^{2}}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}, \frac{{\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U} + ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{M}) ({\hat{x}}_{t}^{M} - {\hat{x}}_{t}^{U}) ({\hat{x}}_{t}^{U} - {\hat{x}}_{t}^{L})}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}) ,$ (27)

${\hat{r}}_{t} = \frac{\sqrt{{({\hat{x}}_{t}^{M} - {\hat{x}}_{t}^{L})}^{2} + 1} \cdot \sqrt{{({\hat{x}}_{t}^{U} - {\hat{x}}_{t}^{M})}^{2} + 1}}{2} ,$ (28)

where $(C {\hat{x}}_{t}, C {\hat{y}}_{t})$ is the circumcenter and ${\hat{r}}_{t}$ is the circumradius of the combination forecast of the triangular fuzzy time sequence at time t.

4.2 Triangular fuzzy series combination forecasting based on IOWGA and DDTV

Most existing triangular fuzzy combination forecasting methods solve for the weights using the original triangular fuzzy sets, and most of them are based on symmetric triangular fuzzy numbers. However, symmetric triangular fuzzy numbers are only a special case in practice, so it is important to find a relationship applicable to all triangular fuzzy numbers. Therefore, the DDTV proposed in this paper is used to solve for the weights: it not only links the three points of a triangular fuzzy number closely, but also applies to every triangular fuzzy number. Moreover, since every triangle has a circumcircle, which covers all the information of the triangular fuzzy number, the combined predictive value can be obtained more accurately. To this end, the combination forecasting model is constructed as follows. Let the vector X = (x₁, x₂, ⋯ , x_N) be made up of the sequence $\{x_{t} | x_{t} = (x_{t}^{L}, x_{t}^{M}, x_{t}^{U})\}$ , t = 1, 2, ⋯ , N, from Definition 15, and let the vector $\hat{X} = ({\hat{x}}_{1}, {\hat{x}}_{2}, \dots, {\hat{x}}_{N})$ be made up of the sequence $\{{\hat{x}}_{t} | {\hat{x}}_{t} = ({\hat{x}}_{t}^{L}, {\hat{x}}_{t}^{M}, {\hat{x}}_{t}^{U})\}$ , t = 1, 2, ⋯ , N, from Definition 18. To solve for the weights of the combination forecasting model, $DDTV (X, \hat{X})$ is minimized, which gives the following optimization model: $\begin{matrix} min F_{w} \\ = α \sqrt{\frac{1}{N} \sum_{t = 1}^{N} [{(C {\hat{x}}_{t} - C x_{t})}^{2} + {(C {\hat{y}}_{t} - C y_{t})}^{2}]} \\ + (1 - α) \sqrt{\frac{1}{N} \sum_{t = 1}^{N} {({\hat{r}}_{t} - r_{t})}^{2}} \end{matrix}$

$s . t . \begin{cases} \sum_{i = 1}^{m} w_{i} = 1, w_{i} \geq 0, i = 1, 2, \dots, m \\ (C {\hat{x}}_{t}, C {\hat{y}}_{t}) = (\frac{{({\hat{x}}_{t}^{L})}^{2} - {({\hat{x}}_{t}^{U})}^{2}}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}, \frac{{\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U} + ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{M}) ({\hat{x}}_{t}^{M} - {\hat{x}}_{t}^{U}) ({\hat{x}}_{t}^{U} - {\hat{x}}_{t}^{L})}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}) \\ {\hat{r}}_{t} = \frac{\sqrt{{({\hat{x}}_{t}^{M} - {\hat{x}}_{t}^{L})}^{2} + 1} \cdot \sqrt{{({\hat{x}}_{t}^{U} - {\hat{x}}_{t}^{M})}^{2} + 1}}{2} \\ (C x_{t}, C y_{t}) = (\frac{{(x_{t}^{L})}^{2} - {(x_{t}^{U})}^{2}}{2 (x_{t}^{L} - x_{t}^{U})}, \frac{x_{t}^{L} - x_{t}^{U} + (x_{t}^{L} - x_{t}^{M}) (x_{t}^{M} - x_{t}^{U}) (x_{t}^{U} - x_{t}^{L})}{2 (x_{t}^{L} - x_{t}^{U})}) \\ r_{t} = \frac{\sqrt{{(x_{t}^{M} - x_{t}^{L})}^{2} + 1} \cdot \sqrt{{(x_{t}^{U} - x_{t}^{M})}^{2} + 1}}{2} . \end{cases}$ (29)

As α approaches 1, the triangular fuzzy distance (the distance between circumcenters) becomes more important for the dispersion degree, while as α approaches 0, the circumradius becomes more important. Under symmetric conditions the radius term can be dropped, because the radius is determined by Cy through r = 1 - Cy. Therefore, under symmetric conditions the combination forecasting model can be simplified to one based on $DD (X, \hat{X})$ , and minimizing $DD (X, \hat{X})$ gives the following optimization model: $\begin{matrix} min F_{s - w} = \sqrt{\frac{1}{N} \sum_{t = 1}^{N} [{(C {\hat{x}}_{t} - C x_{t})}^{2} + {(C {\hat{y}}_{t} - C y_{t})}^{2}]} \end{matrix}$

$s . t . \begin{cases} \sum_{i = 1}^{m} w_{i} = 1, w_{i} \geq 0, i = 1, 2, \dots, m \\ (C {\hat{x}}_{t}, C {\hat{y}}_{t}) = (\frac{{({\hat{x}}_{t}^{L})}^{2} - {({\hat{x}}_{t}^{U})}^{2}}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}, \frac{{\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U} + ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{M}) ({\hat{x}}_{t}^{M} - {\hat{x}}_{t}^{U}) ({\hat{x}}_{t}^{U} - {\hat{x}}_{t}^{L})}{2 ({\hat{x}}_{t}^{L} - {\hat{x}}_{t}^{U})}) \\ (C x_{t}, C y_{t}) = (\frac{{(x_{t}^{L})}^{2} - {(x_{t}^{U})}^{2}}{2 (x_{t}^{L} - x_{t}^{U})}, \frac{x_{t}^{L} - x_{t}^{U} + (x_{t}^{L} - x_{t}^{M}) (x_{t}^{M} - x_{t}^{U}) (x_{t}^{U} - x_{t}^{L})}{2 (x_{t}^{L} - x_{t}^{U})}) . \end{cases}$ (30)
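The relation r = 1 - Cy invoked for the symmetric case follows directly from Equations (19) and (20). As a short check, write d for the half-width of a symmetric triangular fuzzy number (d is a symbol introduced here only for this derivation), so that $x_{t}^{M} - x_{t}^{L} = x_{t}^{U} - x_{t}^{M} = d$. Cancelling the common factor $(x_{t}^{L} - x_{t}^{U})$ in the second component of Equation (19) gives

$C y_{t} = \frac{1}{2} - \frac{(x_{t}^{L} - x_{t}^{M}) (x_{t}^{M} - x_{t}^{U})}{2} = \frac{1 - d^{2}}{2} , \quad r_{t} = \frac{\sqrt{d^{2} + 1} \cdot \sqrt{d^{2} + 1}}{2} = \frac{1 + d^{2}}{2} ,$

and therefore $r_{t} + C y_{t} = 1$. In the symmetric case the circumradius carries no information beyond Cy, which is why only the circumcenter term remains in the simplified objective.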

In this article, the optimal weights are obtained mainly with the Ipopt solver called from Python. This completes the modeling process of the combination forecasting model, and Fig. 5 shows the overall procedure.
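To make the structure of model (29) concrete, the following self-contained sketch evaluates the objective F_w and minimizes it for m = 3 methods. It is not the authors' Ipopt setup: a coarse grid search over the weight simplex is swapped in so that the example needs only the standard library. All function names are hypothetical, the accuracy is computed relative to the actual value (an assumption), and the circumcenter uses an algebraically simplified form of Equation (19). The data are the first three periods of Table 1.

```python
import math

def circum(tfn):
    """Circumcenter (cx, cy) and circumradius r of a triangular fuzzy number."""
    L, M, U = tfn
    cx = (L + U) / 2.0
    cy = (1.0 - (M - L) * (U - M)) / 2.0
    r = math.sqrt((M - L) ** 2 + 1.0) * math.sqrt((U - M) ** 2 + 1.0) / 2.0
    return cx, cy, r

def accuracy(actual, predicted):
    rel = abs((actual - predicted) / actual)  # assumption: relative to the actual value
    return 1.0 - rel if rel < 1.0 else 0.0

def iowga(values, induced, w):
    """Weighted geometric mean of values ordered by decreasing induced value."""
    order = sorted(range(len(values)), key=lambda i: induced[i], reverse=True)
    return math.exp(sum(wk * math.log(values[i]) for wk, i in zip(w, order)))

def combine(actual, forecasts, w):
    """Combined triangular fuzzy number at one time step (lower, medium, upper)."""
    return tuple(
        iowga([f[k] for f in forecasts],
              [accuracy(actual[k], f[k]) for f in forecasts], w)
        for k in range(3))

def objective(w, actuals, forecasts, alpha):
    """F_w: alpha-weighted circumcenter term plus (1 - alpha)-weighted radius term."""
    n = len(actuals)
    centre = radius = 0.0
    for x, fs in zip(actuals, forecasts):
        cx, cy, r = circum(x)
        hx, hy, hr = circum(combine(x, fs, w))
        centre += (hx - cx) ** 2 + (hy - cy) ** 2
        radius += (hr - r) ** 2
    return alpha * math.sqrt(centre / n) + (1 - alpha) * math.sqrt(radius / n)

def grid_search(actuals, forecasts, alpha, steps=20):
    """Exhaustive search over weights (w1, w2, w3) >= 0 with w1 + w2 + w3 = 1."""
    best_w, best_f = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            f = objective(w, actuals, forecasts, alpha)
            if f < best_f:
                best_w, best_f = w, f
    return best_w, best_f

actuals = [(3.7, 4.8, 6.0), (5.4, 6.5, 8.3), (7.2, 7.7, 8.8)]
forecasts = [  # per period: TFGM (1,1), NNTFGM (1,1), SVMTFGM (1,1)
    [(4.49, 5.44, 6.85), (3.71, 4.82, 5.98), (3.73, 4.81, 6.01)],
    [(4.47, 5.41, 6.81), (5.37, 6.46, 8.28), (5.38, 6.47, 8.3)],
    [(4.45, 5.38, 6.78), (7.2, 7.7, 8.8), (7.26, 7.57, 8.84)],
]
best_w, best_f = grid_search(actuals, forecasts, alpha=0.5)
```

A grid search scales poorly with the number of methods and only resolves weights to 1/`steps`, so a nonlinear programming solver remains the natural choice for the full problem. The sketch is meant only to show how the IOWGA combination, the circumcircle quantities and the constraints fit together.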

Root mean square error (RMSE) is commonly used in recent publications in the field to score point-forecast performance, so the following RMSE-based evaluation metrics are defined to reflect the effectiveness of the model directly:

Fig. 5

The flowchart of combination forecasting model.

Table 1

The raw data and prediction results of power load

t	Original Data	TFGM (1,1)	NNTFGM (1,1)	SVMTFGM (1,1)
1	(3.7,4.8,6)	(4.49,5.44,6.85)	(3.71,4.82,5.98)	(3.73,4.81,6.01)
2	(5.4,6.5,8.3)	(4.47,5.41,6.81)	(5.37,6.46,8.28)	(5.38,6.47,8.3)
3	(7.2,7.7,8.8)	(4.45,5.38,6.78)	(7.2,7.7,8.8)	(7.26,7.57,8.84)
4	(4.6,5.2,6.2)	(4.42,5.35,6.74)	(4.6,5.24,6.2)	(4.62,5.22,6.22)
5	(4.7,5.1,5.3)	(4.40,5.33,6.71)	(4.69,5.09,5.33)	(4.67,5.09,5.36)
6	(4.9,5.4,6.6)	(4.38,5.30,6.67)	(4.9,5.44,6.6)	(4.87,5.44,6.59)
7	(4.8,6.5,7.4)	(4.35,5.27,6.64)	(4.8,6.5,7.4)	(4.8,6.49,7.38)
8	(3,3.5,4.2)	(4.33,5.24,6.6)	(3,3.48,4.2)	(3,3.49,4.22)
9	(3.5,4.4,5.6)	(4.31,5.22,6.57)	(3.5,4.35,5.61)	(3.52,4.36,5.6)
10	(4.3,5.1,6.3)	(4.28,5.19,6.53)	(4.33,5.09,6.3)	(4.32,5.08,6.3)
11	(5.1,5.9,6.3)	(4.26,5.16,6.50)	(5.1,5.85,6.3)	(5.07,5.86,6.29)
12	(3.2,3.6,4.2)	(4.24,5.13,6.47)	(3.06,3.8,4.92)	(3.47,3.72,5.16)
13	(3.3,4.1,5)	(4.22,5.11,6.43)	(3.21,4.42,4.95)	(3.63,4.53,5.2)
14	(4.1,4.6,5.5)	(4.20,5.08,6.40)	(4.3,5,5.97)	(3.99,4.58,5.19)
15	(4.8,6.3,7.3)	(4.17,5.05,6.37)	(5.27,5.51,6.53)	(5.14,5.86,8.32)

Definition 19. Let $\{X_{t} | X_{t} = (X_{t}^{L}, X_{t}^{M}, X_{t}^{U})\}$ , t = 1, 2, ⋯ , N, be the actual triangular fuzzy time sequence, and let $\{{\hat{X}}_{t} | {\hat{X}}_{t} = ({\hat{X}}_{t}^{L}, {\hat{X}}_{t}^{M}, {\hat{X}}_{t}^{U})\}$ , t = 1, 2, ⋯ , N, be the triangular fuzzy prediction sequence. Then:

$RMSEL = \frac{1}{N} \sqrt{\sum_{t = 1}^{N} {(X_{t}^{L} - {\hat{X}}_{t}^{L})}^{2}},$ (31) is called the prediction root mean square error based on the lower points,

$RMSEM = \frac{1}{N} \sqrt{\sum_{t = 1}^{N} {(X_{t}^{M} - {\hat{X}}_{t}^{M})}^{2}},$ (32) is called the prediction root mean square error based on the medium points,

$RMSEU = \frac{1}{N} \sqrt{\sum_{t = 1}^{N} {(X_{t}^{U} - {\hat{X}}_{t}^{U})}^{2}},$ (33) is called the prediction root mean square error based on the upper points,

$\begin{matrix} RMSET = \frac{1}{N} [\sqrt{\sum_{t = 1}^{N} {(X_{t}^{L} - {\hat{X}}_{t}^{L})}^{2}} \\ + \sqrt{\sum_{t = 1}^{N} {(X_{t}^{M} - {\hat{X}}_{t}^{M})}^{2}} + \sqrt{\sum_{t = 1}^{N} {(X_{t}^{U} - {\hat{X}}_{t}^{U})}^{2}}], \end{matrix}$ (34) is called the total prediction root mean square error of two triangular fuzzy sequences.

RMSEL, RMSEM and RMSEU are evaluation metrics reflecting the accuracy at each of the three points, and RMSET is the overall evaluation metric for triangular fuzzy number prediction. The lower these values are (the closer to zero), the more accurate the prediction is.
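The four indicators of Definition 19 can be computed as in the following sketch, which keeps the factor 1/N outside the square root exactly as Equations (31)–(34) are written. The function name `rmse_indicators` is hypothetical. The inputs are the first ten periods of Table 1 (original data and TFGM (1,1)), i.e. the in-sample data of the power load example.

```python
import math

def rmse_indicators(actual, predicted):
    """Return (RMSEL, RMSEM, RMSEU, RMSET) following Eqs. (31)-(34)."""
    n = len(actual)
    per_point = []
    for k in range(3):  # 0 = lower, 1 = medium, 2 = upper
        squared = sum((a[k] - p[k]) ** 2 for a, p in zip(actual, predicted))
        per_point.append(math.sqrt(squared) / n)  # 1/N stays outside the root
    rmsel, rmsem, rmseu = per_point
    return rmsel, rmsem, rmseu, rmsel + rmsem + rmseu

actual = [(3.7, 4.8, 6.0), (5.4, 6.5, 8.3), (7.2, 7.7, 8.8), (4.6, 5.2, 6.2),
          (4.7, 5.1, 5.3), (4.9, 5.4, 6.6), (4.8, 6.5, 7.4), (3.0, 3.5, 4.2),
          (3.5, 4.4, 5.6), (4.3, 5.1, 6.3)]
tfgm = [(4.49, 5.44, 6.85), (4.47, 5.41, 6.81), (4.45, 5.38, 6.78),
        (4.42, 5.35, 6.74), (4.40, 5.33, 6.71), (4.38, 5.30, 6.67),
        (4.35, 5.27, 6.64), (4.33, 5.24, 6.6), (4.31, 5.22, 6.57),
        (4.28, 5.19, 6.53)]

rmsel, rmsem, rmseu, rmset = rmse_indicators(actual, tfgm)
```

On these rounded inputs the results are close to the TFGM (1,1) row of Table 5 (0.3474, 0.3504, 0.4073, 1.1051); agreement to the last digit is not expected because the tabulated forecasts are rounded to two decimals.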

5 Practical example

5.1 The application in power load prediction

Before proceeding to a new empirical application, it is necessary to verify through an experiment whether the combination forecasting model proposed in this paper is effective. The data from Zeng [24], who used the power load data of one district of Guilin, China from September 2 to 5, 2014, are therefore used to construct three triangular fuzzy series forecasting models. These three single models are combined with DDTV to verify the validity of our model. The raw data and the single forecasting model data are listed in Table 1.

The above 15 periods of data need to be divided into in-sample and out-of-sample data to verify the effectiveness of the model. For the out-of-sample test, 20%–40% of the samples need to be reserved in advance for prediction after the model is established. Therefore, the first 10 periods are selected as in-sample data and the last five periods as out-of-sample data.

First comes the in-sample test. According to Equations (21)–(23) in Definition 17, the prediction accuracy sequence of the three points is obtained as shown in Table 2.

Table 2
The prediction accuracy sequence of three points for power load

t	ɛ_1t	ɛ_2t	ɛ_3t	φ _1t	φ _2t	φ _3t	φ _1t	φ _2t	φ _3t
1	0.7865	0.9973	0.9919	0.8667	0.9958	0.9979	0.8583	0.9967	0.9983
2	0.8278	0.9944	0.9963	0.8323	0.9938	0.9954	0.8205	0.9976	1.0000
3	0.6181	1.0000	0.9917	0.6987	1.0000	0.9831	0.7705	1.0000	0.9955
4	0.9609	1.0000	0.9957	0.9712	0.9923	0.9962	0.9129	1.0000	0.9968
5	0.9362	0.9979	0.9936	0.9549	0.9980	0.9980	0.7340	0.9943	0.9887
6	0.8939	1.0000	0.9939	0.9815	0.9926	0.9926	0.9894	1.0000	0.9985
7	0.9063	1.0000	1.0000	0.8108	1.0000	0.9985	0.8973	1.0000	0.9973
8	0.5567	1.0000	1.0000	0.5029	0.9943	0.9971	0.4286	1.0000	0.9952
9	0.7686	1.0000	0.9943	0.8136	0.9886	0.9909	0.8268	0.9982	1.0000
10	0.9953	0.9930	0.9953	0.9824	0.9980	0.9961	0.9635	1.0000	1.0000

Table 3

Optimal weights with different parameter values for power load

	w ₁	w ₂	w ₃
α = 0	0.8717	0.1238	0.0044
α = 0.5	0.9636	0.0364	0.0000
α = 1	0.9974	0.0026	0.0000

Finally, the combination forecasting values are given in Table 4.

Table 4

The triangular fuzzy combination forecasting values for power load (In-sample)

t	α = 0	α = 0.5	α = 1
1	(3.7107,4.8104,6.0089)	(3.7156,4.8139,6.0098)	(3.7101,4.8100,6.0099)
2	(5.3796,6.4696,8.2993)	(5.3744,6.4636,8.2903)	(5.3800,6.4700,8.2999)
3	(7.2022,7.6952,8.8015)	(7.1921,7.6716,8.7948)	(7.2002,7.6997,8.8001)
4	(4.6007,5.2207,6.2007)	(4.6017,5.2230,6.2048)	(4.6001,5.2201,6.2001)
5	(4.6893,5.0900,5.3311)	(4.6862,5.0910,5.3391)	(4.6899,5.0900,5.3301)
6	(4.8989,5.4400,6.5996)	(4.8938,5.4394,6.5991)	(4.8999,5.4400,6.6000)
7	(4.8000,6.4996,7.3993)	(4.7979,6.4927,7.3940)	(4.8000,6.5000,7.3999)
8	(3.0000,3.4896,4.2007)	(3.0049,3.4950,4.2109)	(3.0000,3.4900,4.2001)
9	(3.5007,4.3596,5.6004)	(3.5057,4.3622,5.6052)	(3.5001,4.3600,5.6000)
10	(4.2814,5.0896,6.3000)	(4.2852,5.0892,6.3010)	(4.2801,5.0900,6.3000)

In order to verify the effectiveness of the combination forecasting model in this paper, the method of [25] is selected for comparison. The errors for each group are calculated by Definition 19. To facilitate comparison with the single prediction methods, the prediction error indicators of each prediction method, for several special values of α, are shown in Table 5.

Table 5

The triangular error indicators for power load (In-sample)

Model	RMSEL	RMSEM	RMSEU	RMSET
TFGM (1,1) [24]	0.3474	0.3504	0.4073	1.1051
NNTFGM (1,1) [24]	0.0033	0.0091	0.0042	0.0166
SVMTFGM (1,1) [24]	0.0087	0.0148	0.0081	0.0315
The method in literature [25]	1.5134	0.0524	1.4854	3.0511
IOWGA-DDTV (α = 0)	0.0038	0.0078	0.0044	0.0161
IOWGA-DDTV (α = 0.5)	0.0032	0.0071	0.0032	0.0135
IOWGA-DDTV (α = 1)	0.0032	0.0070	0.0032	0.0133

It can be found that, for the model proposed in this paper, the RMSET for every α is smaller than that of the other methods, and as α becomes larger, the error at each individual point also falls below that of the other methods. By contrast, the method of [25] performs worse not only than the model proposed in this paper but also than the single models. This is because symmetric triangular fuzzy numbers are required when converting triangular fuzzy numbers into area and center of mass; otherwise serious information loss occurs, which eventually degrades the model. It can be concluded that the proposed combination forecasting model performs well in the in-sample test.

Next is the out-of-sample test, which first requires the out-of-sample prediction accuracy. For the data of the last five periods, the moving average method is used to calculate the accuracy. Because the original data are hourly and the sample is small, a long step length is not suitable, so a three-step moving average is chosen. Taking the weights under α = 0.5 as an example, the prediction values in Table 6 are obtained.
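The text does not spell out how the three-step moving average is applied, so the following is only one possible reading, stated as an assumption: the induced value of a method for the first out-of-sample period is taken to be the mean of its three most recent in-sample accuracies. The function name `moving_average_accuracy` is hypothetical, and the inputs are the lower-point accuracies for t = 8, 9, 10 from Table 2.

```python
def moving_average_accuracy(history, window=3):
    """Mean of the last `window` accuracies (assumed reading of the three-step rule)."""
    recent = history[-window:]
    return sum(recent) / len(recent)

# Lower-point accuracies at t = 8, 9, 10 (Table 2).
nntfgm_history = [1.0000, 1.0000, 0.9930]
svmtfgm_history = [1.0000, 0.9943, 0.9953]

nntfgm_induced = moving_average_accuracy(nntfgm_history)
svmtfgm_induced = moving_average_accuracy(svmtfgm_history)
```

Under this reading NNTFGM (1,1) receives the larger induced value for the lower point of the first out-of-sample period, so its forecast would take the first position in the IOWGA ordering. How later out-of-sample periods are handled (for example, whether averaged values are fed back into the window) is not stated in the text and is left open here.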

Table 6

The triangular fuzzy combination forecasting values for power load (Out-sample)

t	Original Data	TFGM (1,1)	NNTFGM (1,1)	SVMTFGM (1,1)	IOWGA-DDTV
1	(5.1,5.9,6.3)	(4.26,5.16,6.5)	(5.10,5.85,6.30)	(5.07,5.86,6.29)	(5.10,5.86,6.30)
2	(3.2,3.6,4.2)	(4.24,5.13,6.47)	(3.06,3.80,4.92)	(3.47,3.72,5.16)	(3.07,3.72,5.15)
3	(3.3,4.1,5)	(4.22,5.11,6.43)	(3.21,4.42,4.95)	(3.63,4.53,5.20)	(3.22,4.53,4.96)
4	(4.1,4.6,5.5)	(4.20,5.08,6.40)	(4.30,5.00,5.97)	(3.99,4.58,5.19)	(4.29,4.98,5.94)
5	(4.8,6.3,7.3)	(4.17,5.05,6.37)	(5.27,5.51,6.53)	(5.14,5.86,8.32)	(5.27,5.85,6.59)

Then the errors are calculated to conduct the out of sample test, and the final result is shown in Table 7.

Table 7

The triangular error indicators for power load (Out-sample)

Model	RMSEL	RMSEM	RMSEU	RMSET
TFGM (1,1) [24]	0.3487	0.4776	0.5971	1.4234
NNTFGM (1,1) [24]	0.1074	0.1928	0.2311	0.5313
SVMTFGM (1,1) [24]	0.1114	0.1257	0.2897	0.5268
IOWGA-DDTV (α = 0.5)	0.1046	0.1485	0.2535	0.5066

Finally, it can be found that over the five out-of-sample periods the combination forecast attains the lowest RMSET (0.5066) of the four models in Table 7, although it is not the best at every individual point. This supports the effectiveness of the combination forecasting. In the future, the method of determining the accuracy will be studied further to obtain a better prediction effect.

After conducting the in-sample and out-of-sample tests, it can be found that the combination forecasting method proposed in this paper performed well in both tests and can effectively predict triangular fuzzy data, so its application to oil price forecasting is studied next.

5.2 The application in Brent oil price prediction

In this section, Brent oil price data from March 28, 2012 to April 27, 2022, obtained from investing.com, are used as an example. Brent is a light, sweet crude oil from the Brent and Ninian fields in the North Sea. It is widely traded in the futures, over-the-counter swaps, forward and spot markets. More than 65% of the world’s physical crude is now pegged to the Brent system. From the ten years of daily data, the closing price of each day is taken as the medium point of the triangular fuzzy number, and the daily minimum and maximum are taken as the lower and upper points respectively. As mentioned above, such market data fit the definition of triangular fuzzy numbers well. The lower and upper points need little explanation: they are the lowest and highest prices of the day and do not occur again on that day, so their memberships are equal to 0. The choice of the medium point is what makes this kind of data fit the definition of triangular fuzzy sets. Other articles usually choose the mean value as the medium point, but they cannot explain why the membership of the mean value equals 1. In this paper the closing price is selected as the medium point because it lies between the highest and lowest prices and is the final price of the day, which no longer changes; the probability of the closing price is therefore 1, so its membership is equal to 1. The triangular fuzzy sets can thus be constructed by processing the obtained data. It is worth mentioning that for general time series data the upper and lower points can be constructed from the maximum and minimum values, and the medium points from the mean values and the invariant values.

5.2.1 The predicted results of the single forecasting model

ARIMA, GRNN and LSSVR are then used to predict the Brent oil price, represented by triangular fuzzy numbers. For the ARIMA model, the data of the last year are used to make predictions for the last 10 periods. For the machine learning methods, the data of the first 9 years are used as the feature set, the first 90% of the last year is used as the training set and the last 10% as the test set; the statistical characteristics of the selected samples are summarized in Table 8. The predictions for the last 10 days are then selected to correspond to the ARIMA model. For the model parameters, the random state is set to 8 and the original parameters are kept to ensure consistency. The prediction results of the three single prediction methods are shown in Table 9.

In order to facilitate comparison, the data are also processed into a sequence of symmetric triangular fuzzy data, shown in Table 10.
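The medium points in Table 10 coincide with the midpoints of the lower and upper points in Table 9, for example (107.13 + 113.91) / 2 = 110.52 for the first original observation. Under that reading, the symmetrization step can be sketched as follows; the function name `symmetrize` is hypothetical.

```python
def symmetrize(tfn):
    """Replace the medium point by the midpoint of the lower and upper points."""
    lower, _, upper = tfn
    return (lower, (lower + upper) / 2.0, upper)

# First two original observations and the first ARIMA forecast from Table 9.
sym_original_1 = symmetrize((107.13, 112.67, 113.91))
sym_original_2 = symmetrize((97.75, 106.64, 107.5))
sym_arima_1 = symmetrize((99.1534, 104.3587, 108.9118))
```

The lower and upper points are left unchanged, so only the position of the peak of the membership function moves.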

Table 8
The statistical characteristics of the selected samples

	Lowest	Closing	Highest
Mean	71.22	72.18	73.09
Min	20.07	21.33	23.22
Max	121.70	125.43	139.13

Table 9

The prediction results from three single prediction methods

t	Original Data	ARIMA	LSSVR	GRNN
1	(107.13,112.67,113.91)	(99.1534,104.3587,108.9118)	(110.4200,113.9951,122.8481)	(107.5706,112.6774,114.0192)
2	(97.75,106.64,107.5)	(99.6881,105.8851,109.5327)	(99.3148,101.5748,106.9248)	(99.1735,102.3920,107.4887)
3	(114.45,121.6,122.34)	(99.6387,106.1662,110.1536)	(108.1885,112.2952,114.3449)	(109.1023,113.1015,117.2337)
4	(104.84,110.23,114.83)	(100.4501,107.0345,110.7745)	(112.0219,115.4239,118.8437)	(110.8157,114.1886,119.1447)
5	(102.89,107.53,108.68)	(100.9505,107.2812,111.3954)	(105.8648,107.9339,110.2965)	(106.6142,108.2770,110.0355)
6	(99.65,102.78,103.3)	(100.7667,107.6498,112.0163)	(101.0788,102.7615,106.9123)	(100.8235,103.0844,107.6304)
7	(106.63,111.7,112.39)	(100.8241,107.3010,112.6372)	(100.6537,104.4044,106.1040)	(101.8006,106.6760,107.5159)
8	(106.76,108.33,109.8)	(101.1735,107.4528,113.2581)	(106.7080,108.4758,111.5789)	(107.4445,108.6926,111.2352)
9	(100.93,104.61,105.76)	(101.5603,108.0366,113.8790)	(102.7296,104.4166,105.9922)	(102.8402,105.1002,106.9014)
10	(106.36,106.94,107.4)	(102.0135,108.3137,114.4999)	(104.4796,107.2503,107.9715)	(105.9547,108.1252,108.9364)

Table 10

The symmetric prediction results from three single prediction methods

t	Symmetric Data	ARIMA	LSSVR	GRNN
1	(107.13,110.52,113.91)	(99.1534,104.0326,108.9118)	(110.4200,116.6340,122.8481)	(107.5706,110.7949,114.0192)
2	(97.75,102.625,107.5)	(99.6881,104.6104,109.5327)	(99.3148,103.1198,106.9248)	(99.1735,103.3311,107.4887)
3	(114.45,118.395,122.34)	(99.6387,104.8962,110.1536)	(108.1885,111.2667,114.3449)	(109.1023,113.1680,117.2337)
4	(104.84,109.835,114.83)	(100.4501,105.6123,110.7745)	(112.0219,115.4328,118.8437)	(110.8157,114.9802,119.1447)
5	(102.89,105.785,108.68)	(100.9505,106.1729,111.3954)	(105.8648,108.0807,110.2965)	(106.6142,108.3249,110.0355)
6	(99.65,101.475,103.3)	(100.7667,106.3915,112.0163)	(101.0788,103.9956,106.9123)	(100.8235,104.2270,107.6304)
7	(106.63,109.51,112.39)	(100.8241,106.7307,112.6372)	(100.6537,103.3789,106.1040)	(101.8006,104.6583,107.5159)
8	(106.76,108.28,109.8)	(101.1735,107.2158,113.2581)	(106.7080,109.1434,111.5789)	(107.4445,109.3399,111.2352)
9	(100.93,103.345,105.76)	(101.5603,107.7196,113.8790)	(102.7296,104.3609,105.9922)	(102.8402,104.8708,106.9014)
10	(106.36,106.88,107.4)	(102.0135,108.2567,114.4999)	(104.4796,106.2255,107.9715)	(105.9547,107.4455,108.9364)

Fig. 6

ACF of the original sequence.

Fig. 7

PACF of the original sequence.

(1) Validity test of ARIMA:

The ARIMA model requires the time series to be stationary. If the time series is not stationary, differencing is usually applied to achieve stationarity, so the ACF and PACF of the original sequence are drawn first (Figs. 6 and 7).

Then the unit root test is used to further confirm the non-stationarity; the ADF test results are given in Table 11.

Table 11

ADF test of the original sequence

	lowest	closing	highest
ADF Test Statistic	–0.178563	0.568847	0.457980
p-value	0.941029	0.986822	0.983538

It can be found that the original time series is not stationary, because the p-values of the ADF tests are all greater than 0.05. A first-order difference is therefore taken, and the ACF and PACF graphs of the first-order difference are drawn (Figs. 8 and 9).

The unit root test is again used to confirm stationarity; the ADF test results for the first-order difference sequences are given in Table 12.

After first-order differencing, the p-values of the ADF tests are all less than 0.05, which shows that the differenced sequences are stationary and can be modeled.

The AIC and BIC information criteria are used to select the final ARIMA model, which is then used to predict the data of 10 periods. Finally, a Lagrange multiplier test is applied to the residual terms, giving Table 13. All p-values in Table 13 exceed 0.05, so the test does not reject the null hypothesis for the residuals, and the prediction result is considered effective.

Table 12

ADF test for the first-order difference sequences

	Lowest	Closing	Highest
ADF Test Statistic	–4.821756	–11.847134	–3.751175
p-value	0.000049	0	0.003448

Table 13

Diagnostic test results

	Lowest	Closing	Highest
AIC	483.207	492.184	463.82
BIC	490.992	499.97	469.011
F	0.718	0.423	0.994
P-value of F	0.706	0.931	0.456
T·R²	7.497	4.583	10.06
P-value of T·R²	0.678	0.917	0.435

(2) The fitting effect of LSSVR and GRNN:

The LSSVR and GRNN models also need to be tested. For machine learning methods, the fitted curve is usually used to assess model performance, so the fitting renderings of the two prediction methods are drawn: Fig. 10 at the lowest point, Fig. 11 at the closing point, and Fig. 12 at the highest point. It can be seen from Figs. 10–12 that the forecasts of LSSVR and GRNN are basically consistent with the trend of the oil price data. The goodness of fit of the two models is then calculated to test the model performance more precisely.

Fig. 8

ACF of the first-order difference.

Fig. 9

PACF of the first-order difference.

Fig. 10

Fitting renderings of two prediction methods at the lowest point.

Fig. 11

Fitting renderings of two prediction methods at the closing point.

Fig. 12

Fitting renderings of two prediction methods at the highest point.

The R² of the two prediction methods is given in Table 14; R² under different sample sizes is also calculated to check the stability of the models.

Table 14

The R² fitted by the two prediction methods

R ²	Lowest	Closing	Highest
LSSVR-10%	0.9640	0.9669	0.9440
LSSVR-20%	0.9899	0.9898	0.9844
LSSVR-30%	0.9830	0.9854	0.9805
GRNN-10%	0.9688	0.9726	0.9605
GRNN-20%	0.9913	0.9915	0.9888
GRNN-30%	0.9862	0.9890	0.9859

It can be seen from Table 14 that the R² values of LSSVR and GRNN lie between 0.94 and 0.99 across the different sample sizes, indicating that the fit is good and the prediction results are effective. Meanwhile, to keep the sample size consistent, all proportions in this paper are set at 10%.

5.2.2 Combination forecasting model based on IOWGA and DDTV

According to Equations (21)–(23) in Definition 17, the prediction accuracy sequences of the three points are obtained as follows (Table 15 for the asymmetric case and Table 16 for the symmetric case).

The IOWGA operator is then used to combine the predicted triangular fuzzy sequences according to Definition 18, and the circumcenter and circumradius of the combined triangular fuzzy sequence are calculated with Equations (27) and (28). Substituting the symmetric triangular fuzzy prediction sequence into Equation (30) gives the optimal weights (w₁, w₂, w₃) = (0.5105, 0.3511, 0.1384). Substituting the asymmetric triangular fuzzy prediction sequence into Equation (29) gives the optimal weights for different parameter values shown in Table 17.

Table 15
The asymmetric prediction accuracy sequence of three points

t	ɛ_1t	ɛ_2t	ɛ_3t	φ _1t	φ _2t	φ _3t	φ _1t	φ _2t	φ _3t
1	0.9255	0.9693	0.9959	0.9262	0.9882	0.9999	0.9561	0.9215	0.9990
2	0.9802	0.9840	0.9854	0.9929	0.9525	0.9602	0.9811	0.9946	0.9999
3	0.8706	0.9453	0.9533	0.8731	0.9235	0.9301	0.9004	0.9346	0.9583
4	0.9581	0.9315	0.9430	0.9710	0.9529	0.9641	0.9647	0.9650	0.9624
5	0.9811	0.9711	0.9638	0.9977	0.9962	0.9931	0.9750	0.9851	0.9875
6	0.9888	0.9857	0.9882	0.9526	0.9998	0.9970	0.9156	0.9650	0.9581
7	0.9456	0.9440	0.9547	0.9606	0.9347	0.9550	0.9978	0.9441	0.9566
8	0.9477	0.9995	0.9936	0.9919	0.9987	0.9967	0.9685	0.9838	0.9869
9	0.9938	0.9822	0.9811	0.9672	0.9982	0.9953	0.9232	0.9978	0.9892
10	0.9591	0.9823	0.9962	0.9872	0.9971	0.9889	0.9339	0.9947	0.9857

Table 16

The symmetric prediction accuracy sequence of three points

t	ɛ_1t	ɛ_2t	ɛ_3t	φ _1t	φ _2t	φ _3t	φ _1t	φ _2t	φ _3t
1	0.9255	0.9693	0.9959	0.9413	0.9447	0.9975	0.9561	0.9215	0.9990
2	0.9802	0.9840	0.9854	0.9807	0.9952	0.9931	0.9811	0.9946	0.9999
3	0.8706	0.9453	0.9533	0.8860	0.9398	0.9559	0.9004	0.9346	0.9583
4	0.9581	0.9315	0.9430	0.9616	0.9490	0.9532	0.9647	0.9650	0.9624
5	0.9811	0.9711	0.9638	0.9963	0.9783	0.9760	0.9750	0.9851	0.9875
6	0.9888	0.9857	0.9882	0.9515	0.9752	0.9729	0.9156	0.9650	0.9581
7	0.9456	0.9440	0.9547	0.9746	0.9440	0.9557	0.9978	0.9441	0.9566
8	0.9477	0.9995	0.9936	0.9902	0.9920	0.9902	0.9685	0.9838	0.9869
9	0.9938	0.9822	0.9811	0.9577	0.9902	0.9852	0.9232	0.9978	0.9892
10	0.9591	0.9823	0.9962	0.9871	0.9939	0.9947	0.9339	0.9947	0.9857

Table 17

Optimal weights with different parameter values

	w ₁	w ₂	w ₃
α = 0	0.4982	0.5018	0.0000
α = 0.5	0.4989	0.4720	0.0291
α = 1	0.4966	0.4239	0.0795

Finally, the combination forecasting values are shown in Tables 18 and 19: Table 18 gives the symmetric triangular fuzzy combination forecasting values, and Table 19 gives the asymmetric triangular fuzzy combination forecasting values.

Table 18

The symmetric triangular fuzzy combination forecasting values

t	${\hat{x}}_{t}^{L}$	${\hat{x}}_{t}^{M}$	${\hat{x}}_{t}^{U}$
1	107.3454	111.8322	113.3632
2	99.2942	103.3991	107.5704
3	107.4232	111.3213	115.2137
4	105.5550	110.1584	115.9862
5	103.4278	107.1360	110.3144
6	100.8298	104.4053	107.8577
7	101.2977	105.5313	109.8993
8	106.1806	108.9434	111.6339
9	102.1462	104.9989	107.3717
10	104.8828	107.1270	109.1928

Table 19

The asymmetric triangular fuzzy combination forecasting values

t	α = 0	α = 0.5	α = 1
1	(108.9911,111.4271,113.3367)	(108.6357,111.8360,113.0339)	(108.0672,112.5459,112.4904)
2	(99.2444,104.1176,107.2053)	(99.2556,104.0944,107.2837)	(99.2742,104.0458,107.4100)
3	(108.6428,112.6962,115.7751)	(108.3735,112.5057,115.6462)	(107.9328,112.1933,115.4274)
4	(105.5243,110.5665,114.7237)	(105.5529,110.5984,114.9810)	(105.6316,110.6725,115.3769)
5	(103.3873,107.6082,110.1664)	(103.4065,107.6182,110.1992)	(103.4531,107.6364,110.2537)
6	(100.7952,102.9234,107.2720)	(100.8029,103.0581,107.4012)	(100.8156,103.2791,107.6143)
7	(101.3094,106.9869,110.0376)	(101.3048,106.9177,109.9965)	(101.2943,106.8030,109.9139)
8	(107.0770,108.5845,111.4076)	(106.8822,108.5468,111.4576)	(106.5675,108.4859,111.5404)
9	(102.1454,104.7591,106.4475)	(102.1480,104.8460,106.6505)	(102.1560,104.9899,106.9852)
10	(105.2119,107.6884,108.4546)	(105.1369,107.6936,108.6174)	(105.0100,107.7047,108.8863)
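The aggregation step behind Tables 18 and 19 can be illustrated with a minimal sketch of the IOWA operator of Yager and Filev [41] and its geometric counterpart IOWGA. The weights are the α = 0.5 row of Table 17; the forecast values are hypothetical, and using a prediction accuracy sequence as the inducing variable for a single point of the triangular fuzzy number is an assumption made here for illustration, not a reproduction of the paper's full procedure.

```python
import math

def induced_order(accuracies, forecasts):
    """Reorder forecasts by descending value of the inducing variable."""
    pairs = sorted(zip(accuracies, forecasts), key=lambda p: p[0], reverse=True)
    return [forecast for _, forecast in pairs]

def iowa(weights, accuracies, forecasts):
    """Induced ordered weighted (arithmetic) average."""
    ordered = induced_order(accuracies, forecasts)
    return sum(w * f for w, f in zip(weights, ordered))

def iowga(weights, accuracies, forecasts):
    """Induced ordered weighted geometric average (forecasts must be positive)."""
    ordered = induced_order(accuracies, forecasts)
    return math.exp(sum(w * math.log(f) for w, f in zip(weights, ordered)))

weights = [0.4989, 0.4720, 0.0291]     # Table 17, alpha = 0.5
accuracies = [0.9255, 0.9693, 0.9959]  # illustrative inducing values
forecasts = [110.0, 112.0, 111.0]      # hypothetical single-model forecasts

combined_iowa = iowa(weights, accuracies, forecasts)
combined_iowga = iowga(weights, accuracies, forecasts)
print(round(combined_iowa, 4), round(combined_iowga, 4))
```

Because the weights are attached to positions in the accuracy ranking rather than to fixed models, the most accurate single forecast at each time point receives w₁. By the weighted AM-GM inequality, the IOWGA value never exceeds the IOWA value for the same inputs.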

To verify the effectiveness of the combination forecasting model in this paper, it is compared with the three single methods and with the method in literature [25]. The errors of each group are calculated by Definition 19. To facilitate the comparison, the prediction error indicators of each single prediction method are also given in Tables 20 and 21. Table 20 shows the error indicators for the symmetric triangular fuzzy case, and Table 21 shows those for the asymmetric case at several values of α.

Table 20

The symmetric triangular error indicators

Model	RMSEL	RMSEM	RMSEU	RMSET
ARIMA [2]	1.9400	1.7329	1.8851	5.5580
LSSVR [42]	1.2548	1.3084	1.4798	4.0430
GRNN [40]	1.0451	0.9782	0.9735	2.9968
The method in literature [25]	0.9767	0.9711	0.9973	2.9452
IOWGA-DD	0.9297	0.9052	0.9552	2.7901

Table 21

The asymmetric triangular error indicators

Model	RMSEL	RMSEM	RMSEU	RMSET
ARIMA [2]	1.9400	1.9380	1.8851	5.7631
LSSVR [42]	1.2548	1.3947	1.4798	4.1292
GRNN [40]	1.0451	1.1559	0.9735	3.1745
IOWA-DDTV	0.9252	0.9019	0.9520	2.7791
IOWGA-DDTV (α = 0)	0.8524	1.0444	0.8778	2.7745
IOWGA-DDTV (α = 0.16)	0.8508	1.0434	0.8785	2.7695
IOWGA-DDTV (α = 0.5)	0.8655	1.0633	0.8891	2.8181
IOWGA-DDTV (α = 1)	0.8934	1.0974	0.9169	2.9076

First, among the single prediction methods, ARIMA has the lowest accuracy, GRNN has the highest accuracy, and LSSVR lies between them, at a roughly even distance from both. Second, the accuracy of each method differs across the three points: ARIMA and GRNN have higher accuracy at the upper points, whereas LSSVR has higher accuracy at the lower points. Finally, it can be seen from Tables 20 and 21 that the values of RMSEL, RMSEM, RMSEU and RMSET of the model in this paper are all smaller than those of all single prediction methods and of the triangular fuzzy combination forecasting model in literature [25]. Therefore, the combination forecasting model based on IOWGA and DDTV proposed in this paper effectively improves the prediction accuracy of triangular fuzzy numbers.
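The comparison in Table 20 can be checked with a short sketch. The numbers are copied from Table 20; reading RMSET as the sum of RMSEL, RMSEM and RMSEU is an observation from the table values (they agree up to rounding), not a restatement of Definition 19, which is not reproduced here.

```python
# Rows of Table 20: (RMSEL, RMSEM, RMSEU, RMSET).
table20 = {
    "ARIMA": (1.9400, 1.7329, 1.8851, 5.5580),
    "LSSVR": (1.2548, 1.3084, 1.4798, 4.0430),
    "GRNN": (1.0451, 0.9782, 0.9735, 2.9968),
    "Literature [25]": (0.9767, 0.9711, 0.9973, 2.9452),
    "IOWGA-DD": (0.9297, 0.9052, 0.9552, 2.7901),
}

def total_gap(row):
    """Absolute difference between RMSET and the sum of the three point errors."""
    rmse_l, rmse_m, rmse_u, rmse_t = row
    return abs(rmse_l + rmse_m + rmse_u - rmse_t)

# Largest deviation from the "RMSET = RMSEL + RMSEM + RMSEU" reading.
max_gap = max(total_gap(row) for row in table20.values())

# Models ordered from smallest to largest total error.
ranking = sorted(table20, key=lambda name: table20[name][3])

# Relative reduction of RMSET against the best single method (GRNN).
reduction = 1 - table20["IOWGA-DD"][3] / table20["GRNN"][3]

print(ranking)
print(f"max gap: {max_gap:.4f}, reduction vs GRNN: {reduction:.2%}")
```

The ranking places the combination model first and ARIMA last, in line with the discussion above.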

5.2.3 Sensitivity analysis of model parameter α

Next, the parameter α of the model is analyzed. As α approaches 1, the triangular fuzzy distance becomes more important than the circumradius; as α approaches 0, the circumradius becomes more important than the triangular fuzzy distance. It can be seen from Equation (29) that the optimal weights change with α. To verify whether the established model is feasible for different values of α, we take α ∈ [0, 1] and calculate the corresponding optimal weights and prediction error indexes. The results are shown in Figs. 13 and 14.

It can be seen from Fig. 13 that w₁ remains basically unchanged, w₂ decreases as α increases, and w₃ stays close to 0 for α ≤ 0.16 and then increases with α. As can be seen from Fig. 14, all prediction error indexes decrease monotonically as α goes from 0 to 0.16, reach their minimum at α = 0.16, and increase monotonically as α goes from 0.16 to 1, which indicates that both the triangular fuzzy distance and the circumradius need to be taken into account.

Fig. 13

The relationship between α and the optimal weights.

Fig. 14

The relationship between α and prediction error indexes.
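The trends described for Figs. 13 and 14 can be cross-checked against the tabulated values. The sketch below uses only the α values listed in Tables 17 and 21, so it is a coarse check on a few grid points rather than a reproduction of the full curves; the variable names are illustrative.

```python
# Optimal weights (w1, w2, w3) by alpha, copied from Table 17.
weights_by_alpha = {
    0.0: (0.4982, 0.5018, 0.0000),
    0.5: (0.4989, 0.4720, 0.0291),
    1.0: (0.4966, 0.4239, 0.0795),
}

# RMSET of IOWGA-DDTV by alpha, copied from Table 21.
rmset_by_alpha = {0.0: 2.7745, 0.16: 2.7695, 0.5: 2.8181, 1.0: 2.9076}

alphas = sorted(weights_by_alpha)
consecutive = list(zip(alphas, alphas[1:]))

# Each weight vector should sum to one.
weights_sum_to_one = all(
    abs(sum(w) - 1.0) < 1e-9 for w in weights_by_alpha.values()
)

# w2 should fall and w3 should rise as alpha grows.
w2_decreasing = all(
    weights_by_alpha[a][1] > weights_by_alpha[b][1] for a, b in consecutive
)
w3_increasing = all(
    weights_by_alpha[a][2] < weights_by_alpha[b][2] for a, b in consecutive
)

# Alpha with the smallest total error among the tabulated values.
best_alpha = min(rmset_by_alpha, key=rmset_by_alpha.get)

print(weights_sum_to_one, w2_decreasing, w3_increasing, best_alpha)
```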

5.3 Comparative analysis and results discussion

For the purpose of comparison, we conduct two experiments involving seven different methods. The first experiment is power load prediction. The combination forecasting method proposed in this paper is compared with the three methods proposed by Zeng [24] and with the method proposed by Zhang and Chen [25], who convert a triangular fuzzy number (x^L, x^M, x^U) into its area and gravity center, $S = \frac{x^{U} - x^{L}}{2}$ , $G = \frac{x^{L} + x^{M} + x^{U}}{3}$ , and then use the correlation coefficient to calculate the weights. The correlation coefficient model need not be introduced here, because the area and gravity center themselves are only effective when the data are symmetric triangular fuzzy numbers. For asymmetric triangular fuzzy numbers, these two indicators cannot cover all the information. For example, the two triangular fuzzy numbers (3,9,12) and (4,7,13) have the same area and gravity center but are different triangular fuzzy numbers. Therefore, this transformation causes information loss in the asymmetric case. The second experiment is oil price prediction, in which the proposed combination forecasting method is compared with the method proposed by Zhang and Chen [25]. Figures 15 and 16 show that, in both experiments, the errors of the proposed combination forecasting method are smaller than those of the comparison methods. Therefore, the proposed method improves the accuracy of triangular fuzzy prediction. Moreover, since the area and gravity center transformation is not suitable for asymmetric triangular fuzzy numbers, it can be concluded that the proposed model is competitive with the compared models in overall effectiveness and efficiency.
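The information loss in the asymmetric case can be verified numerically. The following minimal sketch evaluates the S and G formulas quoted above for the two triangular fuzzy numbers from the example; the function name is illustrative.

```python
def area_and_gravity(tfn):
    """Area S and gravity center G of a triangular fuzzy number (x_L, x_M, x_U)."""
    x_l, x_m, x_u = tfn
    area = (x_u - x_l) / 2
    gravity = (x_l + x_m + x_u) / 3
    return area, gravity

first = (3, 9, 12)
second = (4, 7, 13)

indicators_first = area_and_gravity(first)
indicators_second = area_and_gravity(second)

# Two different fuzzy numbers collapse to the same (S, G) pair.
print(indicators_first, indicators_second)
print(first != second and indicators_first == indicators_second)
```

Both inputs map to S = 4.5 and G = 8, so the pair (S, G) cannot distinguish them.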

Fig. 15

Error comparison for power load prediction.

As mentioned above, the single forecasting methods can be replaced, so another three single forecasting methods are selected for further analysis: the ensemble learning approaches Adaboost and Bagging, and the support vector machine approach SVR. The results are shown in Table 22.

Fig. 16

Error comparison for oil price prediction.

Table 22

The error indicators for another three single forecasting methods

Model	RMSEL	RMSEM	RMSEU	RMSET
SVR [43]	1.5424	2.0784	1.7717	5.3925
Adaboost [20]	0.4975	0.7611	0.7788	2.0374
Bagging [20]	0.2516	0.7227	0.6242	1.5985
IOWGA-DDTV (α = 0.6)	0.3657	0.7407	0.4883	1.5946
IOWGA-DDTV (α = 1)	0.3538	0.7291	0.4633	1.5463

The overall accuracy of the combination forecasting method is still higher than that of each single prediction method. Although the SVR-Adaboost-Bagging combination has higher accuracy, when the accuracy gap among the selected single prediction methods is too large, the accuracy of the combination model does not improve as much as expected, and part of the accuracy of the best single method may even have to be sacrificed to balance the whole. Therefore, the ARIMA-LSSVR-GRNN combination better combines the advantages of the single models. This also shows that the choice of single forecasting models is very important; how to choose more suitable single forecasting models will be studied in future research.

From what has been discussed above, the proposed method has the following characteristics:

A prediction criterion for arbitrary triangular fuzzy numbers is proposed for the first time. It complements the theory of asymmetric triangular fuzzy prediction. In some practical applications, because the original data are not exact numbers, it is more reasonable to use fuzzy numbers to represent these data, so the prediction framework developed has a broad application prospect.

The prediction model proposed in this paper has a good prediction effect at the inflection point and can reflect the fluctuation trend of triangular fuzzy sets well.

The IOWA and IOWGA operators are each used to integrate the prediction information, and the results are compared. The prediction errors of the IOWGA operator are found to be smaller than those of the IOWA operator. This suggests that choosing an operator suited to the prediction scenario at hand makes the prediction results more effective.

The accuracy of the combination forecasting model in this paper is further verified by replacing the single forecasting methods. The advantage of combination forecasting is that the single methods complement each other and offset one another's shortcomings.

6 Conclusions

In this paper, a combination forecasting model based on IOWGA and the dispersion degree of triangular numbers (DDTV) has been proposed, with ARIMA, LSSVR and GRNN as the single prediction methods. First, we have introduced the three single prediction methods, namely ARIMA, LSSVR and GRNN, which combine traditional statistical time series prediction with recent machine learning prediction methods so that they complement each other. In addition, we have proposed the distance of triangular fuzzy numbers and the dispersion degree of triangular fuzzy numbers based on the circumcenter and circumradius to improve the prediction accuracy and broaden the range of application. Finally, an example of BRT oil futures prediction has been given to show that the proposed prediction method is reasonable and effective. Under the same experimental conditions, the prediction performance of the proposed model is better than that of the single prediction methods and of the triangular fuzzy prediction based on area and gravity center. The analysis of two examples shows that the proposed model can be applied not only to the economic field but also to other fields such as engineering. The model works best for forecasting futures, securities, stocks, and other economic data that are characterized by daily highest, lowest, and closing prices, which are the three pieces of information that people are most interested in and want to predict accurately. For other types of time series data, such as engineering data used to predict power load, carbon emission data for environmental protection, and drug consumption data in the medical and health field, the maximum and minimum values over a period of time can be used to construct the upper and lower points. Although the selection of the medium point in such series is debatable, the proposed model remains useful for any index selected as the medium point, because it applies to arbitrary triangular fuzzy numbers rather than being restricted to median points. In future work, we will introduce other factors affecting oil futures into the model and study more suitable operators for triangular fuzzy combination forecasting to improve the prediction accuracy.
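The construction of triangular fuzzy numbers described above can be sketched as follows. The function names and sample values are hypothetical, and the window mean is used as the medium point only as one possible choice, since the text leaves that choice open.

```python
from statistics import mean

def tfn_from_prices(low, close, high):
    """Triangular fuzzy number from daily lowest, closing and highest prices."""
    if not low <= close <= high:
        raise ValueError("expected low <= close <= high")
    return (low, close, high)

def tfn_from_window(values, medium=mean):
    """Triangular fuzzy number from a window of a generic series.

    The lower and upper points are the window minimum and maximum; the
    medium point is given by `medium` (the mean here, by assumption).
    """
    return (min(values), medium(values), max(values))

def windowed_tfns(series, size, medium=mean):
    """Split a series into non-overlapping windows and build one TFN per window."""
    return [
        tfn_from_window(series[i:i + size], medium)
        for i in range(0, len(series) - size + 1, size)
    ]

# Hypothetical readings, e.g. a load series grouped three at a time.
load = [10.0, 12.0, 11.0, 15.0, 13.0, 14.0]
tfns = windowed_tfns(load, size=3)
daily = tfn_from_prices(101.5, 103.0, 104.2)  # hypothetical daily prices
print(tfns, daily)
```

Passing a different callable as `medium`, such as `statistics.median`, changes only the medium point and leaves the lower and upper points unchanged.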

Footnotes

Acknowledgments

The work was supported by National Natural Science Foundation of China (Nos. 72171002, 71771001, U22A20366, 72271002, 71901001, 71901088, 72071001, 72001001, 72201004), Natural Science Foundation for Distinguished Young Scholars of Anhui Province (No. 1908085J03), Research Funding Project of Academic and technical leaders and reserve candidates in Anhui Province (No. 2018H179), Top Talent Academic Foundation for University Discipline of Anhui Province (No. gxbjZD2020056), Anhui Provincial Natural Science Foundation (Nos. 1808085QG211, 2108085QG290), College Excellent Youth Talent Support Program (gxyq2019236), Key Research Project of Humanities and Social Sciences in Colleges and Universities of Anhui Province (SK2019A0013), Statistics and Science Research Foundation of China (No. 2017LZ11), College Student Innovation and Entrepreneurship Training Program (Nos. 202210357011, 202210357240, 202210357278, S202210357015, S202210357016).

The authors would like to thank the reviewers and editors for their meticulous suggestions.

References

1. Guerriero, Mazzoli, Iannace, Vitale, Carravetta, Strauss, A permeability model for naturally fractured carbonate reservoirs, Marine and Petroleum Geology 40 (2013), 115–134.

2. Xiang, Zhuang X.H., Application of ARIMA model in short-term prediction of international crude oil price, Advanced Materials Research (2013), 979–982.

3. Gao, Lei, A new approach for crude oil price prediction based on stream learning, Geoscience Frontiers 8 (2017), 183–187.

4. Arezki, Medas P.A., Sommer, Breuer, Haksar, Helbling, Husain A.M., Global implications of lower oil prices, IMF Staff Discussion Notes (2015). https://ideas.repec.org/p/imf/imfsdn/2015-015.html (accessed September 28, 2022).

5. Amri Amamou and S. Aguir Bargaoui, Energy markets responds to Covid-19 pandemic, Resources Policy 76 (2022), 102551.

6. Abedin M.Z., Moon M.H., Hassan M.K., Hajek, Deep learning-based exchange rate prediction during the COVID-19 pandemic, Annals of Operations Research (2021), 1–52.

7. Alquist, Kilian, Vigfusson R.J., Forecasting the price of oil, Handbook of Economic Forecasting 2 (2013), 427–507.

8. Pindyck R.S., The Long-Run evolution of energy prices, The Energy Journal 20 (1999), 1–27.

9. Baumeister, Kilian, Forecasting the real price of oil in a changing world: a forecast combination approach, Journal of Business & Economic Statistics 33 (2015), 338–351.

10. Wang, Athanasopoulos, Hyndman R.J., Wang, Crude oil price forecasting based on internet concern using an extreme learning machine, International Journal of Forecasting 34 (2018), 665–677.

11. Wang, Wang, Zeng, Lu, A novel ensemble probabilistic forecasting system for uncertainty in wind speed, Applied Energy 313 (2022), 118796.

12. …, Wang, Niu, Lu, A newly combination model based on data denoising strategy and advanced optimization algorithm for short-term wind speed prediction, Journal of Ambient Intelligence and Humanized Computing (2022). doi:10.1007/s12652-021-03595-x.

13. Abedin M.Z., Chi, Uddin M.M., Satu M.S., Khan M.I., Hajek, Tax default prediction using feature transformation-based machine learning, IEEE Access 9 (2021), 19864–19881.

14. …, Wang, Lai K.K., Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm, Energy Economics 30 (2008), 2623–2635.

15. Xie, Yu, Xu, Wang, A new method for crude oil price forecasting based on support vector machines, Lecture Notes in Computer Science LNCS-IV (2006), 444–451.

16. Bashiri Behmiri, Pires Manso J.R., Crude oil price forecasting techniques: a comprehensive review of literature, SSRN Electronic Journal (2013).

17. Theerthagiri, Ruby A.U., Seasonal learning based ARIMA algorithm for prediction of Brent oil Price trends, Multimedia Tools and Applications (2023). doi:10.1007/s11042-023-14819-x.

18. Kim II, Jang, Petroleum price prediction with CNN-LSTM and CNN-GRU using skip-connection, Mathematics 11 (2023). doi:10.3390/math11030547.

19. Lazcano, Herrera P.J., Monge, A combined model based on recurrent neural networks and graph convolutional networks for financial time series forecasting, Mathematics 11 (2023). doi:10.3390/math1101022.

20. Guliyev, Mustafayev, Predicting the changes in the WTI crude oil price dynamics using machine learning models, Resources Policy 77 (2022), 102664.

21. Zhang, Luo, Wang, Liu, Oil price forecasting: A hybrid GRU neural network based on decomposition-reconstruction methods, Expert Systems with Applications 218 (2023). doi:10.1016/j.eswa.2023.119617.

22. Zhao L.T., Zheng Z.Y., Wei Y.M., Forecasting oil inventory changes with Google trends: A hybrid wavelet decomposer and ARDL-SVR ensemble model, Energy Economics 120 (2023). doi:10.1016/j.eneco.2023.106603.

23. Regnier, Oil and energy price volatility, Energy Economics 29 (2007), 405–427.

24. Zeng X.Y., Shu, Huang G.M., Jiang, Triangular fuzzy series forecasting based on grey model and neural network, Applied Mathematical Modelling 40 (2016), 1717–1727.

25. Zhang, Cheng, Fuzzy combination forecasting model with variable weights of IOWA operator based on prediction accuracy of area and gravity center, Statistics & Decision 34 (2018), 24–28.

26. Bates J.M., Granger C.W.J., The Combination of Forecasts, OR 20 (1969), 451–468.

27. Tak, Forecast combination with meta possibilistic fuzzy functions, Information Sciences 560 (2021), 168–182.

28. Guo, Diao, Zhao, Wang, Sun, A double-level combination approach for demand forecasting of repairable airplane spare parts based on turnover data, Computers & Industrial Engineering 110 (2017), 92–108.

29. Perera, de Hoog, Bandara, Halgamuge, Multi-resolution, multi-horizon distributed solar PV power forecasting with forecast combinations, Expert Systems with Applications 205 (2022), 117690.

30. Wang, He, Prediction of port air pollutant emission based on gray combination model – taking Dalian port as an example, ICIC Express Letters 16 (2022), 731–739.

31. Valle Dos Santos R.D.O., Vellasco M.M.B.R., Neural Expert Weighting: A NEW framework for dynamic forecast combination, Expert Systems with Applications 42 (2015), 8625–8636.

32. Zhou, Si, Zheng, Xu, Qu, Zhang, CMBCF: A cloud model based hybrid method for combining forecast, Applied Soft Computing Journal 85 (2019), 105766.

33. …, Valle Dos Santos, Araujo C.F.F., Accioly R.M.S., Oliveira F.L.C., Horizon-Optimized weights for forecast combination with cross-learning, Pesquisa Operacional 41 (2021), e245564.

34. Liu, Wang, Chen, Zhu, A combination forecasting model based on hybrid interval multi-scale decomposition: Application to interval-valued carbon price forecasting, Expert Systems with Applications 191 (2022), 116267.

35. Zadeh L.A., Fuzzy sets, Information and Control 8 (1965), 338–353.

36. Szmidt, Kacprzyk, Distances between intuitionistic fuzzy sets, Fuzzy Sets and Systems 114 (2000), 505–518.

37. Garg, Rani, Novel distance measures for intuitionistic fuzzy sets based on various triangle centers of isosceles triangular fuzzy numbers and their applications, Expert Systems with Applications 191 (2022), 116228.

38. Suykens J.A.K., Vandewalle, de Moor, Optimal control by least squares support vector machines, Neural Networks 14 (2001), 23–35.

39. Suykens J.A.K., Vandewalle, Recurrent least squares support vector machines, IEEE Transactions on Circuits and Systems 47 (2000), 1109–1114.

40. Specht D.F., A general regression neural network, IEEE Transactions on Neural Networks 2 (1991), 568–576.

41. Yager R.R., Filev D.P., Induced ordered weighted averaging operators, IEEE Transactions on Systems, Man, and Cybernetics 29 (1999), 141–150.

42. …, Xu, Tang, LSSVR ensemble learning with uncertain parameters for crude oil price forecasting, Applied Soft Computing Journal 56 (2017), 692–701.

43. Gholami, Ansari H.R., Hosseini, Prediction of crude oil refractive index through optimized support vector regression: a competition between optimization techniques, Journal of Petroleum Exploration and Production Technology 7 (2017), 195–204.

BRT oil price combination forecasting based on the dispersion degree of triangular fuzzy numbers

