Hybrid short-term traffic flow prediction model of intersections based on improved complete ensemble empirical mode decomposition with adaptive noise

Abstract

A new hybrid short-term traffic flow prediction model for signalized intersections (ICEEMDAN-KOSELM-ARIMA) is proposed. It combines the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) algorithm and the kernel online sequential extreme learning machine (KOSELM) algorithm, and uses both current and historical traffic flow data. First, the historical traffic flow time series is decomposed by the ICEEMDAN algorithm into several intrinsic mode functions in order to improve prediction accuracy. Then, the permutation entropy algorithm is employed to analyze the randomness of each intrinsic mode function component, so that different prediction models can be built for components with different randomness. KOSELM prediction models are established for the components with high randomness, and auto-regressive integrated moving average (ARIMA) prediction models are built for the components with low randomness. Finally, an actual signalized intersection is selected to verify the performance of the proposed hybrid model. The results show that, compared with the other models considered, the proposed hybrid model achieves the lowest prediction errors and the best fit to the actual values.

Keywords

Short-term traffic flow prediction, hybrid prediction model, improved empirical mode decomposition, kernel extreme learning machine

Introduction

It is well known that, with rapid economic development, the number of vehicles has increased dramatically, leading to increasingly serious traffic congestion. As one of the effective means of solving traffic problems, intelligent transportation systems (ITS) have gradually become a hot research topic. The traffic control system and the traffic guidance system are two important subsystems of ITS, and both rely on real-time, accurate prediction of the traffic state. Traffic flow prediction is in turn the basis of traffic state prediction. Therefore, reliable traffic flow prediction is becoming more and more important for ITS.

Generally speaking, the traffic signal control cycle is usually within 3 min and the guidance cycle is usually 5 min. Although traffic flow acquisition equipment has become more complete, the information collected in real time still lags behind what efficient traffic control and guidance systems require. Accurate short-term traffic flow prediction can provide forward-looking traffic flow information for formulating and optimizing intelligent traffic signal control and guidance schemes and can meet the real-time demand for traffic data. The prediction accuracy of short-term traffic flow is directly related to the effect of traffic control and guidance. However, the inherent resilience, uncertainty, and nonlinearity of traffic flow are difficulties that high-precision short-term traffic flow prediction needs to overcome. In addition, the rapid development of ITS places higher requirements on traffic flow prediction. Consequently, research on efficient and accurate short-term (usually 5 min) traffic flow prediction methods is of great significance to ITS as a whole.

Short-term traffic flow prediction usually takes real-time traffic flow information of the intersection and road section to forecast the traffic flow in the future period progressively with proper methods and technologies, in order to achieve the goal of dynamic traffic flow prediction. The predicted traffic flow can provide basis for selecting optimal path, balancing road network traffic flow, optimizing management schemes and improving signal control strategies.

In recent years, there has been much research in the field of short-term traffic flow prediction, and a variety of prediction models have been developed. They can generally be divided into four types:

Statistical theory prediction models, which mainly use traditional mathematical statistics to forecast traffic parameters such as traffic flow, speed, and travel time. These models assume that the predicted data share the same statistical characteristics as the historical data. Typical statistical models have been developed and applied for years, such as the multiple regression model,¹ auto-regressive moving average (ARMA) model,² auto-regressive integrated moving average (ARIMA) model,³ Kalman model,⁴,⁵ and nonparametric regression model.⁶

Nonlinear prediction models, which are mainly based on nonlinear theories such as chaos theory, self-organization theory, coordination theory, and dissipative structure theory. The models are established by introducing the concepts of chaotic attractors and fractals and by employing phase-space reconstruction and digital ecological simulation methods. Typical models include the chaos theory model⁷ and the wavelet theory model.⁸

Intelligent prediction models, which are built by training modern intelligent algorithms on historical data. They include the artificial neural network (ANN) model,⁹⁻¹¹ fuzzy logic model,¹² echo state network (ESN) model,¹³ support vector machine (SVM) model,¹⁴ and deep belief network (DBN) model.¹⁵

Hybrid prediction models, which can improve the overall accuracy through integrating several single prediction models. Some hybrid models have been proposed, represented by SSA-ELM model,¹⁶ EMD-PSO-SVR model,¹⁷ WNN-MAR-VOA model,¹⁸ and GFMNB-K models.¹⁹

A brief summary of the four types of models is shown in Table 1.

Table 1.

A brief summary of the four types of models.

Types	Advantages	Disadvantages
Statistical theory models (multiple regression model, ARMA model, ARIMA model, Kalman model, nonparametric regression model)	Static statistical models; simple and convenient to compute; rarely require data on other factors in application	Cannot reflect the uncertainty and nonlinearity of traffic flow; cannot avoid the interference of random factors
Nonlinear prediction models (chaos theory model, wavelet theory model)	Use chaos theory and dissipative structure theory to establish prediction models; can reflect the nonlinear characteristics of traffic flow	Require a large amount of data and computation
Intelligent models (ANN model, fuzzy logic model, ESN model, SVM model, DBN model)	Do not need specific model structures and formulas to be built; have strong adaptability	Require a training procedure; learn slowly; more vulnerable to local optima
Hybrid prediction models (SSA-ELM model, EMD-PSO-SVR model, WNN-MAR-VOA model, GFMNB-K models)	Integrate multiple single prediction models; use the advantages of each method effectively; more accurate	Appropriate single prediction models are difficult to select; computational speed needs to be improved

Among the numerous models above, chaos theory, neural networks, and support vector machines all have obvious advantages in the analysis of complex nonlinear and uncertain systems. However, most traditional neural network models use the gradient descent algorithm, which makes them vulnerable to local optima and slow to learn.

The extreme learning machine (ELM) algorithm put forward by Huang et al.²⁰ can solve this problem effectively. As a new kind of single hidden-layer feedforward neural network algorithm, ELM has fast learning speed and good generalization performance, and it has therefore been widely applied to engineering problems. On the basis of the ELM algorithm, Liang et al.²¹ proposed an improved algorithm referred to as the online sequential extreme learning machine (OSELM). By training on historical data in batches, it improves generalization ability, and it can also handle a varying number of samples. Because missing and erroneous data are inevitable in the actual collection and transmission of traffic flow, partial training has to be carried out according to the data actually acquired in real time, and the OSELM algorithm meets this robustness need of practical applications well. On this basis, Scardapane et al.²² introduced a kernel function into the OSELM algorithm and put forward the kernel online sequential extreme learning machine (KOSELM) algorithm, in which the approximate linear dependence (ALD) method from kernel adaptive filtering (KAF) is applied to screen training data online. The KOSELM algorithm can therefore be used to predict traffic flow while satisfying the robustness requirement.

In addition, it is well known that traffic flow data at intersections are non-stationary time series affected by many factors, so a priori analysis of the fluctuation characteristics of traffic flow can help to improve prediction accuracy. As an adaptive decomposition method for analyzing nonlinear and non-stationary signals, empirical mode decomposition (EMD)²³ has the multi-resolution advantages of the wavelet transform and avoids the need to select a wavelet and a decomposition scale. Its disadvantage is that it is prone to mode mixing. To overcome this shortcoming, Wu and Huang²⁴ put forward the ensemble empirical mode decomposition (EEMD) method. However, its results still include a certain amount of residual noise. The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)²⁵ method improves on EEMD: it solves the problem of adding different signal noises to different intrinsic mode function (IMF) components, and its reconstruction errors are almost zero. This technique has been applied in several research fields such as biomedical engineering,²⁶ seismology,²⁷,²⁸ and building energy consumption.²⁹ Even so, the CEEMDAN algorithm leaves room for improvement: its modes still contain residual noise, and spurious modes can appear in the early stages of the decomposition. To overcome these shortcomings, Colominas et al.³⁰ improved the CEEMDAN algorithm and proposed the improved complete ensemble EMD (ICEEMDAN) algorithm, which yields modes with less noise and makes sequence decomposition more efficient. Thus, there are reasons to believe that the ICEEMDAN algorithm can be applied effectively to the a priori analysis of the fluctuation characteristics of traffic flow.

The main objective of this article is to propose a new hybrid short-term traffic flow prediction model, named ICEEMDAN-KOSELM-ARIMA, that employs the ICEEMDAN and KOSELM algorithms in combination with the ARIMA algorithm to improve the prediction accuracy of short-term traffic flow. The KOSELM-ARIMA model can achieve better prediction performance because the original traffic flow data are first decomposed and their random properties analyzed by the ICEEMDAN-PE algorithm. The novelty of the proposed model is highlighted in the following aspects:

The ICEEMDAN algorithm as a relatively novel adaptive decomposition method is first employed to decompose traffic flow time series data and obtain multiple IMF components.

The permutation entropy (PE) algorithm as an effective randomness and dynamic mutation characteristic analysis method is applied to analyze the randomness of IMF components.

According to different random properties of IMF components, hybrid short-term traffic flow prediction process and model are established.

KOSELM prediction models are built for the IMF components with high randomness, and ARIMA prediction models are built for the IMF components with low randomness.

Several different prediction models are constructed, respectively, to compare and analyze the prediction performance for a real signalized intersection.

The rest of this article is organized as follows. Section “Traffic flow decomposition and random property analysis based on ICEEMDAN-PE algorithm” formulates traffic flow decomposition and random property analysis based on ICEEMDAN-PE algorithm, including traffic flow decomposition by ICEEMDAN algorithm presented in section “Traffic flow decomposition by ICEEMDAN algorithm” and random property analysis by PE algorithm presented in section “Random property analysis of decomposed traffic flow sequences by PE algorithm.” In section “Hybrid traffic flow prediction model based on ICEEMDAN-KOSELM-ARIMA,” hybrid traffic flow prediction model is established based on ICEEMDAN-KOSELM-ARIMA, including KOSELM prediction model building presented in section “KOSELM prediction model building” and hybrid prediction model building presented in section “Hybrid prediction model building.” In section “Model verification,” the model proposed in this article is verified by choosing a typical intersection, including traffic flow sequence decomposing by ICEEMDAN algorithm and randomness analysis in section “Traffic flow sequence decomposing by ICEEMDAN algorithm and randomness analysis by PE algorithm” and prediction model building and results analysis presented in section “Prediction model building and results analysis.” And prediction results are compared with other traditional models. Section “Conclusion” reveals the conclusions and recommendations for future research.

Traffic flow decomposition and random property analysis based on ICEEMDAN-PE algorithm

Traffic flow time series of intersection is decomposed by ICEEMDAN algorithm first. In order to analyze the random characteristics of traffic flow IMF components, PE algorithm is employed to calculate permutation entropies according to the diversity of different IMF components, preparing to establish prediction models.

Traffic flow decomposition by ICEEMDAN algorithm

The EMD algorithm fundamentally removes the constraints of the Fourier transform and gives a good interpretation of instantaneous frequency. Unlike the wavelet transform, EMD does not need an a priori basis; signal time sequences are adaptively broken down into a number of IMF components. The IMF components have different scales and have little impact on each other. As the signal sequences change, the IMF components obtained by EMD change accordingly, so the interference or coupling between items of system feature information can be simplified.²³ In short, EMD decomposes the original sequence according to its fluctuations at different scales and obtains IMF components with different amplitudes. The CEEMDAN algorithm adds adaptive white noise at each stage of the decomposition and computes a unique residue to obtain each IMF component.²³ The ICEEMDAN algorithm,³⁰ in contrast, estimates the local mean of each signal-plus-noise realization and defines the true mode as the difference between the current residue and the average of these local means.

The original traffic flow sequence is represented by $s (n)$ . The ith realization ( $i = 1, 2, \dots, I$ ) of the added white noise sequence, which follows the standard normal distribution, is expressed as $v^{i} (n)$ . $M (\cdot)$ denotes the operator that produces the local mean of a signal, and the symbol 〈·〉 denotes averaging over the I realizations. The ith noisy sequence to be decomposed can then be represented by $s^{i} (n) = s (n) + v^{i} (n)$ . Define $E_{k} (\cdot)$ as the operator that produces the kth IMF component obtained by the EMD algorithm and $\widetilde{IMF}_{k}$ as the kth IMF component obtained by the ICEEMDAN algorithm. The specific steps of the ICEEMDAN method are as follows:

Step 1. As in the EMD method,²³ compute the local means of the I realizations $s^{i} (n) = s (n) + β_{0} E_{1} (v^{i} (n))$ , where the coefficient $β_{0}$ scales the added noise, and average them to obtain the first residue sequence as formula (1)

r_{1} (n) = 〈 M (s^{i} (n)) 〉

(1)

Step 2. The first IMF component can be calculated in the first stage ( $k = 1$ ) as formula (2)

\widetilde{IMF}_{1} (n) = s (n) - r_{1} (n)

(2)

Step 3. Calculate the second traffic flow residue sequence as the average of the local means of the sequences $r_{1} (n) + β_{1} E_{2} (v^{i} (n))$ . The second IMF component is then obtained as formula (3)

\begin{aligned} \widetilde{IMF}_{2} (n) &= r_{1} (n) - r_{2} (n) \\ &= r_{1} (n) - 〈 M (r_{1} (n) + β_{1} E_{2} (v^{i} (n))) 〉 \end{aligned}

(3)

Step 4. Similarly, the kth residue sequence ( $k = 2, 3, \dots, K$ ) in the other stages can be calculated as formula (4)

r_{k} (n) = 〈 M (r_{k - 1} (n) + β_{k - 1} E_{k} (v^{i} (n))) 〉

(4)

Step 5. Calculate the kth modal component as formula (5)

\widetilde{IMF}_{k} (n) = r_{k - 1} (n) - r_{k} (n)

(5)

Step 6. Return to Step 4 for the next value of k. The algorithm ends when the obtained residue sequence can no longer be decomposed.

From the steps above, it is not hard to find that the decomposition process of ICEEMDAN algorithm is complete and the original traffic flow sequence can be decomposed precisely.
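The residue and mode bookkeeping of formulas (1) to (5) can be illustrated with a minimal Python sketch. It is not an implementation of ICEEMDAN: a centered moving average (`local_mean`) stands in for the envelope-based local mean $M (\cdot)$, `emd_mode` is a crude stand-in for $E_{k} (\cdot)$, and a single constant `beta` replaces the stage-dependent coefficients $β_{k}$. All of these names and simplifications are assumptions made for illustration only. What the sketch does show is why the decomposition is complete: the modes telescope, so the IMF components plus the final residue reproduce the original sequence.

```python
import random


def local_mean(x, half_width=1):
    """Stand-in for M(.): centered moving average (NOT an envelope mean)."""
    n = len(x)
    out = []
    for t in range(n):
        lo, hi = max(0, t - half_width), min(n, t + half_width + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def emd_mode(x, k):
    """Stand-in for E_k(.): the k-th detail left by repeated smoothing."""
    r = list(x)
    for _ in range(k - 1):
        r = local_mean(r)
    m = local_mean(r)
    return [a - b for a, b in zip(r, m)]


def iceemdan_sketch(s, n_modes=3, n_real=20, beta=0.2, seed=0):
    """Return (imfs, final_residue) following the structure of formulas (1)-(5)."""
    rng = random.Random(seed)
    # I realizations of standard normal white noise v^i(n)
    noises = [[rng.gauss(0.0, 1.0) for _ in s] for _ in range(n_real)]
    imfs, residue = [], list(s)  # residue starts as r_0 = s
    for k in range(1, n_modes + 1):
        # formulas (1)/(4): r_k = < M(r_{k-1} + beta * E_k(v^i)) >
        r_k = [0.0] * len(s)
        for v in noises:
            e_k = emd_mode(v, k)
            noisy = [r + beta * e for r, e in zip(residue, e_k)]
            for t, val in enumerate(local_mean(noisy)):
                r_k[t] += val / n_real
        # formulas (2)/(5): IMF_k = r_{k-1} - r_k
        imfs.append([a - b for a, b in zip(residue, r_k)])
        residue = r_k
    return imfs, residue
```

For any input sequence, summing the returned modes and the final residue recovers the input up to floating-point rounding, which is the completeness property referred to above.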

Random property analysis of decomposed traffic flow sequences by PE algorithm

It is known that traffic flow time series is usually non-stationary, leading to several IMF components decomposed by ICEEMDAN algorithm. In order to reduce computing scale, PE algorithm is employed to analyze the random properties of traffic flow IMF components. The IMF components with different randomness are put into different prediction models and prediction results would be added together to obtain final prediction values.

The PE algorithm was proposed by Bandt and Pompe³¹ to measure the complexity of a system. Like the Lyapunov exponent, the Kaplan–Yorke dimension, and the correlation dimension, permutation entropy is used to characterize the randomness and dynamic mutation characteristics of time series. The concept is simple and the computation is fast. It also has strong anti-interference ability and is especially suitable for nonlinear data. Because of its high sensitivity to changes in a time sequence and its good robustness, the PE algorithm has been widely applied to many kinds of time series.³² Traffic flow at a signalized intersection is a typical example of data that are both random and non-stationary, so the PE algorithm is suitable for analyzing the randomness of the decomposed traffic flow sequences. The specific calculation steps are as follows:

Step 1. Phase-space reconstruction theory is applied to process each IMF sequence $\{X (i), i = 1, 2, \dots, N\}$ obtained by the ICEEMDAN algorithm above, whose elements are written $x (i)$ below. The phase-space matrix Y is obtained as shown in formula (6)

Y = \begin{bmatrix} x (1) & x (1 + τ) & \dots & x (1 + (m - 1) τ) \\ \vdots & \vdots & & \vdots \\ x (j) & x (j + τ) & \dots & x (j + (m - 1) τ) \\ \vdots & \vdots & & \vdots \\ x (K) & x (K + τ) & \dots & x (K + (m - 1) τ) \end{bmatrix}, \quad j = 1, 2, \dots, K

(6)

where m and $τ$ represent the embedding dimension and delay time, respectively.

Step 2. Take each row of matrix Y as a component, giving K components in total, where $K = N - (m - 1) τ$ by phase-space reconstruction theory. Sort the elements of each row in ascending order of value. Taking the row that starts at $x (i)$ as an example, formula (7) is obtained

\begin{aligned} x (i + (j_{1} - 1) τ) &\leq x (i + (j_{2} - 1) τ) \\ &\leq \dots \leq x (i + (j_{m} - 1) τ) \end{aligned}

(7)

where $j_{1}, j_{2}, \dots, j_{m}$ denote the column indices of the row elements in sorted order. If two elements are equal, that is, $x (i + (j_{1} - 1) τ) = x (i + (j_{2} - 1) τ)$ , they are ordered according to the values of $j_{1}$ and $j_{2}$ . Thus, for the phase-space reconstruction matrix Y of any given time sequence $X (i)$ , every row vector yields a symbol sequence $S (l)$ as shown in formula (8)

S (l) = \{j_{1}, j_{2}, \dots, j_{m}\}

(8)

where $l = 1, 2, \dots, k$ and $k \leq m!$ , since an m-dimensional phase space admits a total of $m!$ different symbol sequences.

Step 3. Calculate the probability of occurrence of each symbol sequence $S (l)$ , denoted by $P_{1}, P_{2}, \dots, P_{k}$ . Following Shannon’s entropy, the permutation entropy $H_{p} (m)$ of the time series $X (i)$ over the k distinct symbol sequences is defined as formula (9)

H_{p} (m) = - \sum_{j = 1}^{k} P_{j} \ln P_{j}

(9)

From the formula above, it can be concluded that when $P_{j} = 1 / m!$ , $H_{p} (m)$ reaches the maximum value $\ln (m!)$ . In order to calculate conveniently, permutation entropy $H_{p} (m)$ is standardized as formula (10)

H_{PE} (m) = \frac{H_{p} (m)}{\ln (m!)}

(10)

where $H_{PE} (m)$ satisfies $0 \leq H_{PE} (m) \leq 1$ and its value represents the degree of randomness of the time series. The higher the value, the stronger the randomness of the time sequence; conversely, the lower the value, the weaker the randomness. The embedding dimension m and the delay time $τ$ are two important parameters of the PE algorithm. Research has shown that when the sample size is small, the embedding dimension m is usually small. For time series with more than 1024 samples, a stable permutation entropy can be obtained when $m = 6$ .³³ The delay time $τ$ has little effect on the permutation entropy.

Through the steps above, the permutation entropy of each IMF component can be calculated. Considering the decomposition results of ICEEMDAN algorithm comprehensively, the randomness of IMF components can be analyzed which provides the basis for establishing hybrid prediction models.
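The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the MATLAB code used in this article: the function name `permutation_entropy` and its default arguments are hypothetical, indices are zero-based, and ties are broken by column index as described in Step 2.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3, tau=1):
    """Return (H_p, H_PE) for sequence x, following formulas (6)-(10)."""
    K = len(x) - (m - 1) * tau  # number of rows of the phase-space matrix Y
    if K <= 0:
        raise ValueError("sequence too short for the chosen m and tau")
    patterns = Counter()
    for j in range(K):
        # row j of Y: x(j), x(j + tau), ..., x(j + (m - 1) tau)
        row = [x[j + c * tau] for c in range(m)]
        # column indices in ascending order of value; ties broken by index
        symbol = tuple(sorted(range(m), key=lambda c: (row[c], c)))
        patterns[symbol] += 1
    # formula (9): Shannon entropy of the symbol-sequence probabilities
    h_p = -sum((n / K) * math.log(n / K) for n in patterns.values())
    # formula (10): normalization by the maximum value ln(m!)
    return h_p, h_p / math.log(math.factorial(m))
```

A strictly increasing sequence produces a single symbol sequence and therefore an entropy of zero, whereas a sequence in which all $m!$ orderings are equally likely gives a standardized value of one.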

Hybrid traffic flow prediction model based on ICEEMDAN-KOSELM-ARIMA

According to the randomness of the IMF components analyzed by the PE algorithm, the IMF components with high randomness are put into KOSELM prediction models and the other IMF components, with low randomness, are put into ARIMA models. The final predicted traffic flow is obtained by adding the component prediction results together.

KOSELM prediction model building

The original kernel extreme learning machine (KELM) algorithm is an offline algorithm, which needs all the training data before it starts. When the dataset is too large, however, the matrix inversion in the learning algorithm becomes expensive and can exhaust the available storage. Furthermore, in practical problems the samples of a time series often arrive one at a time or in groups, which requires real-time online learning. The OSELM algorithm is an improvement of the ELM algorithm in which only the current batch of data is input during training; network weights are updated without rescanning historical data, and the algorithm has strong generalization ability. Combining the two algorithms, a kernel function is introduced into the OSELM algorithm to give the KOSELM algorithm, which can screen training data online.²²

The input parameters of the algorithm are the kernel function $k (x, y)$ and the regularization parameter C. The output is the weight vector $β$ obtained after inputting the last sample $x_{N}$ . For each new sample $(x_{i}, y_{i})$ , the algorithm stores a new unit whose center is $x_{i}$ and whose coefficient is $r_{i}^{- 1} e_{i}$ , where $e_{i}$ denotes the filter error before updating. At the same time, the algorithm corrects all the previous coefficients by the term $- z_{i} r_{i}^{- 1} e_{i}$ . It is important to note that, unlike OSELM, the KOSELM model is redefined for each input, because a new center has to be added for each input sample. The steps of the KOSELM algorithm are as follows:

Step 1. Initialize matrix Q using the first sample as formula (11)

Q_{1} = {(C^{- 1} + k (x_{1}, x_{1}))}^{- 1}

(11)

Step 2. Calculate the weight vector $β_{1}$ of the first sample $(x_{1}, y_{1})$ as formula (12)

β_{1} = Q_{1} y_{1}

(12)

Step 3. Calculate the kernel vector $k_{i}$ of each sample, from the second sample ( $i = 2$ ) to the Nth sample ( $i = N$ ), as formula (13)

k_{i} = {[k (x_{i}, x_{1}), \dots, k (x_{i}, x_{i - 1})]}^{T}

(13)

Step 4. Calculate the intermediate vector $z_{i}$ as formula (14)

z_{i} = Q_{i - 1} k_{i}

(14)

Step 5. Calculate the coefficient $r_{i}$ of each sample as formula (15)

r_{i} = C^{- 1} + k (x_{i}, x_{i}) - z_{i}^{T} k_{i}

(15)

Step 6. Calculate and update matrix $Q_{i}$ as formula (16)

Q_{i} = r_{i}^{- 1} \begin{bmatrix} Q_{i - 1} r_{i} + z_{i} z_{i}^{T} & - z_{i} \\ - z_{i}^{T} & 1 \end{bmatrix}

(16)

Step 7. Calculate filter error $e_{i}$ as formula (17)

e_{i} = y_{i} - k_{i}^{T} β_{i - 1}

(17)

Step 8. Update the weight vector $β_{i}$ after the input sample $x_{i}$ as formula (18)

β_{i} = \begin{bmatrix} β_{i - 1} - z_{i} r_{i}^{- 1} e_{i} \\ r_{i}^{- 1} e_{i} \end{bmatrix}

(18)

Step 9. Return to Step 3 for the next sample. The algorithm ends after the Nth sample $x_{N}$ has been processed and returns the weight vector $β_{N}$ .

In the steps above, $Q_{i}$ , $k_{i}$ , $z_{i}$ , and $e_{i}$ are all intermediate variables used to calculate the weight vector $β_{i}$ . From a computational point of view, the time complexity of the OSELM algorithm is between $\mathcal{O} (P^{2})$ and $\mathcal{O} (P^{3})$ , where P is related to the size of the mini-batch. In contrast, the time complexity of the KOSELM algorithm at iteration i is approximately $\mathcal{O} (i^{2})$ .
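The recursion of formulas (11) to (18) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this article: the Gaussian kernel `rbf`, its `gamma` parameter, and the function names `koselm_fit` and `koselm_predict` are hypothetical choices, every sample is kept as a center (no ALD screening), and the prediction is assumed to be the kernel expansion over the stored centers weighted by $β$.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian kernel k(x, y); an assumed choice of kernel function."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def koselm_fit(X, y, C=100.0, kernel=rbf):
    """Process samples one by one and return the final weight vector beta_N."""
    # formulas (11) and (12): initialization with the first sample
    Q = [[1.0 / (1.0 / C + kernel(X[0], X[0]))]]
    beta = [Q[0][0] * y[0]]
    for i in range(1, len(X)):
        # formula (13): kernel vector against all previous centers
        k = [kernel(X[i], X[j]) for j in range(i)]
        # formula (14): z_i = Q_{i-1} k_i
        z = [sum(Q[a][b] * k[b] for b in range(i)) for a in range(i)]
        # formula (15): r_i = 1/C + k(x_i, x_i) - z_i^T k_i
        r = 1.0 / C + kernel(X[i], X[i]) - sum(z[a] * k[a] for a in range(i))
        # formula (17): filter error before updating
        e = y[i] - sum(k[a] * beta[a] for a in range(i))
        # formula (16): grow Q by one row and one column
        new_Q = [[Q[a][b] + z[a] * z[b] / r for b in range(i)] + [-z[a] / r]
                 for a in range(i)]
        new_Q.append([-z[b] / r for b in range(i)] + [1.0 / r])
        Q = new_Q
        # formula (18): correct old coefficients and append the new one
        beta = [beta[a] - z[a] * e / r for a in range(i)] + [e / r]
    return beta


def koselm_predict(X, beta, x, kernel=rbf):
    """Assumed prediction rule: kernel expansion over the stored centers."""
    return sum(b * kernel(x, xj) for b, xj in zip(beta, X))
```

Because formula (16) is the block-inverse update of the regularized kernel matrix, the weight vector returned after the last sample should satisfy the batch system formed by the kernel matrix plus $C^{-1}$ times the identity, which offers a simple way to check an implementation.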

Hybrid prediction model building

Based on the analysis above, the hybrid short-term traffic flow prediction model for intersections, ICEEMDAN-KOSELM-ARIMA, can be built; its specific process is shown in Figure 1.

Figure 1.

Process of hybrid short-term traffic flow prediction model (ICEEMDAN-KOSELM-ARIMA).

First, ICEEMDAN algorithm is applied to decompose the original intersection traffic flow time series as shown in section “Traffic flow decomposition by ICEEMDAN algorithm” and several IMF components are obtained. In order to calculate the permutation entropy of each IMF component, PE algorithm is employed as shown in section “Random property analysis of decomposed traffic flow sequences by PE algorithm.” According to the difference of permutation entropies for IMF components, the randomness of IMF components can be analyzed.

In order to improve prediction accuracy, IMF component sequences are normalized as formula (19) before building prediction models

X' (i) = \frac{X (i) - \min \{X (i)\}}{\max \{X (i)\} - \min \{X (i)\}}

(19)

where $X' (i)$ denotes the normalized data and $X (i)$ denotes traffic flow IMF component sequence data.

After normalization, the IMF component sequences with high randomness are put into KOSELM prediction models and those with low randomness are put into ARIMA prediction models, giving one prediction result per IMF component sequence.

Use formula (20) to inverse-normalize the predicted IMF components

X (i) = X' (i) \cdot (\max \{X (i)\} - \min \{X (i)\}) + \min \{X (i)\}

(20)

Add the results of each IMF component sequence together to get the final traffic flow prediction results.
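The normalization of formula (19), its inverse in formula (20), and the final summation can be sketched as follows. The helper names (`minmax_fit`, `normalize`, `denormalize`, `combine`) are hypothetical, and the sketch assumes that the minimum and maximum of each IMF component are taken from its own data and are not equal.

```python
def minmax_fit(x):
    """Return (min, max) of one IMF component sequence."""
    return min(x), max(x)


def normalize(x, lo, hi):
    """Formula (19): map the sequence onto [0, 1]."""
    return [(v - lo) / (hi - lo) for v in x]


def denormalize(xn, lo, hi):
    """Formula (20): map normalized values back to the original scale."""
    return [v * (hi - lo) + lo for v in xn]


def combine(component_predictions):
    """Add the denormalized predictions of all IMF components point by point."""
    return [sum(vals) for vals in zip(*component_predictions)]
```

In use, each IMF component would be normalized with its own `lo` and `hi`, predicted by the model assigned to it, denormalized with the same pair of values, and only then passed to `combine`.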

Model verification

A typical intersection is selected to validate the proposed short-term traffic flow prediction model. The geometry of the intersection, which has four entrance approaches in total, is shown in Figure 2. The intersection adopts a pre-timed signal control scheme with two phases. The first phase serves the straight traffic flow on the eastern and western approaches. The second phase serves the straight and left-turn traffic flow on the southern and northern approaches.

Figure 2.

Geometry information of the intersection.

Under the influence of various factors such as signal timing, intersection traffic flow often exhibits nonlinearity and uncertainty. In addition, plotting the collected traffic flow over a certain time shows that it exhibits similar periodic characteristics. Moreover, the traffic flow and its pattern of change are similar across working days (Monday to Friday), and likewise similar across weekend days (Saturday and Sunday). One of the main applications of intersection traffic flow data is the optimization of signal timing parameters, for which traffic flow is usually distinguished by driving direction. Therefore, traffic flow data for five consecutive working days at 5 min intervals are collected and analyzed, taking the straight flow on the eastern approach as an example.

The original traffic flow data collected by vehicle detectors are preprocessed first. Threshold value method is employed for detecting and eliminating abnormal data, and the missing data are repaired based on historical data. Through data processing, a total of 1440 traffic flow data points can be obtained. Among the data, the 1152 traffic flow data points of the first four working days are used to establish the prediction model and the 288 traffic flow data points of the fifth working day are applied to validate the prediction performance of the proposed model. The intersection traffic flow distribution characteristic of the straight flow in eastern approach at the interval of 5 min is shown in Figure 3.

Figure 3.

Traffic flow distribution at the interval of 5 min.

Traffic flow sequence decomposing by ICEEMDAN algorithm and randomness analysis by PE algorithm

Through MATLAB programming, ICEEMDAN algorithm is employed to decompose the original intersection traffic flow sequence. The input data of the algorithm are 5 min traffic flow time series and output is IMF components. The decomposition results are shown in Figure 4, in which symbols IMF1 to IMF9 indicate nine IMF components of intersection traffic flow decomposed by ICEEMDAN algorithm. In the decomposition process of intersection original traffic flow time series, 500 groups of white noise signal are added, of which the standard deviation is 0.2. From Figure 4, it can be concluded easily that the original traffic flow sequence can be decomposed into nine IMF components of which the randomness properties are different, providing basis for the construction of subsequent hybrid prediction model.

Figure 4.

Intersection traffic flow data decomposition results by ICEEMDAN.

Relative percentage errors of the traffic flow sequences decomposed by the ICEEMDAN algorithm are shown in Figure 5, and Figure 6 shows the iteration boxplot of the IMF components. Figure 5 shows that the relative percentage error of the decomposition of the original traffic flow time series is very small, on the order of 10⁻¹². Figure 6 shows that the number of iterations of the IMF components decreases as the decomposition proceeds until it reaches zero. This illustrates that the traffic flow sequence is decomposed completely, which supports the effectiveness of the model.

Figure 5.

Relative percentage errors of ICEEMDAN algorithm.

Figure 6.

Iteration boxplot of IMF components.

As stated earlier, the randomness of the IMF components needs to be analyzed in order to reduce computing scale and improve running efficiency, so the permutation entropies of the IMF components are calculated with the PE algorithm. First, phase-space reconstruction is applied to each IMF component. To improve computing efficiency, the embedding dimension is set to $m = 6$ and the delay time to $τ = 3$ according to the above analysis. The permutation entropies of each IMF component are then obtained using MATLAB programming and are shown in Table 2. In the table, $H_{p} (m)$ represents the permutation entropy of each IMF component and $H_{PE} (m)$ represents the standardized permutation entropy calculated by formula (10). $Δ H_{PE} (m)$ is the difference between the standardized permutation entropies of adjacent IMF components. The distribution of the standardized permutation entropy values of the IMF components is shown in Figure 7.

Table 2.

Permutation entropy values of IMF components.

IMFm	$H_{p} (m)$	$H_{PE} (m)$	$Δ H_{PE} (m)$
IMF1	5.479	0.833	0.051
IMF2	5.141	0.781	0.115
IMF3	4.382	0.666	0.180
IMF4	3.198	0.486	0.155
IMF5	2.179	0.331	0.066
IMF6	1.744	0.265	0.028
IMF7	1.557	0.237	0.039
IMF8	1.298	0.197	0.092
IMF9	0.693	0.105	–

Figure 7.

Standardized permutation entropy values of IMF components.
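The permutation entropy calculation can be illustrated with a short Python sketch. It follows the standard ordinal-pattern definition with embedding dimension 6 and delay 3. Normalization by ln(m!) is an assumption about what formula (10) specifies, although it matches the ratio of the two entropy columns in Table 2 (for example, 5.479 / ln 720 ≈ 0.833). The function name and the test series are hypothetical.

```python
import math
from collections import Counter


def permutation_entropy(series, m=6, tau=3):
    """Return (H_p, H_PE): raw and normalized permutation entropy."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and tau")
    patterns = Counter()
    for start in range(n_vectors):
        # Delay-embedded vector of length m.
        window = [series[start + k * tau] for k in range(m)]
        # Ordinal pattern: the indices that sort the window in ascending order.
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        patterns[pattern] += 1
    h_p = -sum((c / n_vectors) * math.log(c / n_vectors)
               for c in patterns.values())
    # Assumed normalization: divide by the maximum entropy ln(m!).
    h_pe = h_p / math.log(math.factorial(m))
    return h_p, h_pe


# A strictly increasing series has a single ordinal pattern,
# so both entropy values are zero.
h_p, h_pe = permutation_entropy([float(t) for t in range(50)])
```

A more irregular component spreads its windows over more ordinal patterns and therefore scores closer to 1 on the normalized scale.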

The random characteristics of the IMF components are analyzed using the decomposition results together with the permutation entropy values. Table 2 and Figure 7 show that IMF1 has the largest permutation entropy and that the values decrease as the IMF index increases, which means that the randomness of the IMF sequences weakens gradually. Components IMF1 to IMF4 have the strongest randomness and therefore affect the prediction results most, so each of them is fed into its own KOSELM prediction model. The permutation entropy values of IMF5 to IMF9 are smaller, indicating weaker randomness, so each of these components is fed into its own ARIMA prediction model. In this way, two types of prediction models are combined to realize the hybrid prediction of traffic flow at the signalized intersection.
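The assignment of components to model families can be written down as a simple threshold rule. The sketch below uses the standardized entropies from Table 2. The cut-off of 0.4 is a hypothetical value that is not stated in the article; any value between 0.331 and 0.486 reproduces the grouping reported here.

```python
# Standardized permutation entropies H_PE from Table 2, keyed by IMF index.
H_PE = {1: 0.833, 2: 0.781, 3: 0.666, 4: 0.486, 5: 0.331,
        6: 0.265, 7: 0.237, 8: 0.197, 9: 0.105}

# Hypothetical cut-off, NOT given in the article.
THRESHOLD = 0.4


def assign_models(entropies, threshold):
    """Map each IMF index to the model family used to predict it."""
    return {
        index: "KOSELM" if value > threshold else "ARIMA"
        for index, value in entropies.items()
    }


assignment = assign_models(H_PE, THRESHOLD)
koselm_imfs = sorted(i for i, name in assignment.items() if name == "KOSELM")
arima_imfs = sorted(i for i, name in assignment.items() if name == "ARIMA")
```

With this cut-off, IMF1 to IMF4 go to KOSELM and IMF5 to IMF9 go to ARIMA, as in the text.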

Prediction model building and results analysis

The intersection traffic flow prediction model uses single-step rolling prediction, in which historical data are used to predict the traffic flow at the next time step. The model input is the traffic flow sequence over the 2 h before the predicted point (24 values at 5-min intervals), and the model output is the predicted traffic flow. Under these rules, the 1152 data points from the first four working days yield 1128 input–output pairs, which make up the training set. The data of the fifth working day form the test set, so the ratio of training data to test data is 4:1.
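The construction of the training pairs can be sketched as a sliding window. The window length of 24 follows from 2 h of 5-min counts, and 288 intervals per day follows from 1152 points over four days. The function name and the placeholder series are hypothetical.

```python
def make_windows(series, window=24):
    """Build (input, target) pairs for single-step rolling prediction."""
    inputs, targets = [], []
    for end in range(window, len(series)):
        inputs.append(series[end - window:end])  # previous 2 h of counts
        targets.append(series[end])              # count in the next 5-min interval
    return inputs, targets


# Placeholder series with the same length as the four training days
# (4 days x 288 five-minute intervals); the values are not real data.
training_series = [float(i % 288) for i in range(4 * 288)]
X_train, y_train = make_windows(training_series)
```

The first 24 points have no complete history, which is why 1152 points give 1152 − 24 = 1128 pairs.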

To verify the effectiveness of the proposed model, several other prediction models are constructed for comparison: the traditional ARIMA model, the BP model, the ELM model, and the KOSELM model. In addition, for completeness of the validation, the ICEEMDAN-ELM and ICEEMDAN-KOSELM models are built to demonstrate the contribution of the ICEEMDAN algorithm. The ARIMA model is suited to stationary time series, so a non-stationary series must first be differenced to make it stationary. Apart from the ARIMA model, all of the models are intelligent learning models.

The original traffic flow sequence is used to build the ARIMA, BP, ELM, and KOSELM prediction models. The subsequences obtained by decomposing the intersection traffic flow with the ICEEMDAN algorithm are used to build the ICEEMDAN-ELM, ICEEMDAN-KOSELM, and ICEEMDAN-KOSELM-ARIMA prediction models. The ARIMA prediction model is established in SPSS, while the other prediction models are built in MATLAB.

The ARIMA prediction model requires a stationary sequence, so the data must first be checked for stationarity. Traffic flow sequences are usually non-stationary, which is confirmed here by the autocorrelation and partial autocorrelation plots. After first-order differencing, however, such sequences are generally stationary, and the experiment confirms that the first-difference sequence is stationary. Therefore, an ARIMA (p, 1, q) model can be established for the original sequence. After repeated tests, the prediction model is determined to be ARIMA (1, 1, 1). For the BP prediction model, a typical neural network with a single hidden layer of 10 neurons is selected. The network is trained for 1000 epochs with a training error goal of 0.001, a display frequency of 10, and a learning rate of 0.1.
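The role of the middle index in ARIMA (p, 1, q) can be illustrated with first-order differencing and its inverse. This is a minimal sketch with made-up counts. It does not fit an ARIMA model, and the helper names are hypothetical.

```python
def difference(series):
    """First-order difference: d[t] = x[t + 1] - x[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Invert first-order differencing by cumulative summation."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values


flows = [112, 118, 131, 127, 140]   # made-up 5-min counts
diffs = difference(flows)           # series an ARMA(p, q) model would be fitted to
restored = undifference(flows[0], diffs)
```

Forecasts made on the differenced scale are mapped back to traffic counts by the same cumulative sum.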

The ELM algorithm randomly generates the connection weights between the input layer and the hidden layer, as well as the thresholds of the hidden layer neurons, and these are not adjusted during training. Once the number of hidden layer neurons is set, a unique optimal solution is obtained. Therefore, the only parameter of the ELM prediction model that must be determined in advance is the number of hidden layer neurons, denoted by L. Cross-validation is used to select the value of L with the minimum error, and the experiments give an optimal value of L = 30 for the ELM model. Since short-term traffic flow prediction is a regression and fitting problem, the application parameter TYPE is set to its default value 0. The sigmoid function is chosen as the activation function.
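The training procedure described above can be sketched in a few lines of NumPy. This is a generic basic ELM regressor, not the MATLAB implementation used in the article. The uniform range of the random weights, the function names, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def elm_fit(X, y, n_hidden=30, seed=0):
    """Fit a basic ELM regressor; returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    # Random input weights and hidden thresholds, fixed after generation.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden-layer outputs
    beta = np.linalg.pinv(H) @ y             # least-squares output weights
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer, then the learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


# Synthetic regression problem (not traffic data).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] - 0.2 * X[:, 2]
W, b, beta = elm_fit(X, y)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

Only `beta` is learned, and it is obtained in one step from the pseudo-inverse, which is why the hidden layer size is the only setting left to tune.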

To build the KOSELM model and the proposed hybrid prediction model, “rbf” is selected as the kernel function type. The kernel parameter is searched uniformly over $\{0.05, 0.1, \dots, 0.8\}$, and its final value is 0.15. The regularization factor C is searched over $\{2^{-8}, 2^{-4}, \dots, 2^{13}\}$ and is finally set to 1. The pruning criterion is set to “fb.” The pruning parameter is searched uniformly over $\{50, 100, \dots, 400\}$, and the optimal value obtained is 400.
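To show where the kernel parameter and the regularization factor C enter, the sketch below uses the batch kernel ELM solution, which is a simplification and not the online sequential KOSELM with pruning used in the article. The Gaussian form exp(−d² / (2σ²)) is an assumption, since the article does not state its kernel convention, and all names and data are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix; the exp(-d^2 / (2 sigma^2)) form is assumed."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * (A @ B.T)
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))


def kernel_elm_fit(X, y, sigma, C):
    """Batch solution alpha = (K + I / C)^-1 y (no online update, no pruning)."""
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + np.eye(len(X)) / C, y)


def kernel_elm_predict(X_new, X_train, alpha, sigma):
    """Prediction is a kernel-weighted sum over the training points."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha


# Synthetic data (not traffic data), with the reported settings 0.15 and C = 1.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 2))
y = np.sin(2.0 * np.pi * X[:, 0]) + X[:, 1]
alpha = kernel_elm_fit(X, y, sigma=0.15, C=1.0)
fitted = kernel_elm_predict(X, X, alpha, sigma=0.15)
```

A larger C weakens the regularization term I / C and lets the model follow the training data more closely, while the kernel parameter controls how local each training point's influence is.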

Four evaluation indexes are chosen to compare the prediction performance of the models: mean absolute error (MAE), mean absolute percentage error (MAPE), mean square error (MSE), and equal coefficient (EC). They are calculated by formulas (21)–(24)

$$MAE = \frac{1}{n} \sum_{i = 1}^{n} \left| y_{i} - \hat{y}_{i} \right| \tag{21}$$

$$MAPE = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right| \tag{22}$$

$$MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i})}^{2} \tag{23}$$

$$EC = 1 - \frac{\sqrt{\sum_{i = 1}^{n} {(y_{i} - \hat{y}_{i})}^{2}}}{\sqrt{\sum_{i = 1}^{n} {(y_{i})}^{2}} + \sqrt{\sum_{i = 1}^{n} {(\hat{y}_{i})}^{2}}} \tag{24}$$

where $\hat{y}_{i}$ and $y_{i}$ are the predicted and real traffic flow values at time i, respectively, and n is the sample size. Smaller values of MAE, MAPE, and MSE indicate smaller prediction errors and hence better prediction performance. The best value of EC is 1: the closer EC is to 1, the better the fit between the predicted and real traffic flow values, and the more effectively the model predicts traffic flow.
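Formulas (21)–(24) translate directly into code. The sketch below is a plain Python transcription. The function names and the sample values are made up for illustration, and MAPE is returned as a fraction, so it must be multiplied by 100 to compare with the percentages in Table 3.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, formula (21)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction, formula (22)."""
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean square error, formula (23)."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def equal_coefficient(y_true, y_pred):
    """Equal coefficient, formula (24); equals 1 for a perfect prediction."""
    num = math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)))
    den = (math.sqrt(sum(y ** 2 for y in y_true))
           + math.sqrt(sum(p ** 2 for p in y_pred)))
    return 1.0 - num / den


actual = [100.0, 120.0, 80.0, 150.0]      # made-up counts
predicted = [110.0, 114.0, 84.0, 150.0]   # made-up predictions
scores = {
    "MAE": mae(actual, predicted),
    "MAPE": mape(actual, predicted),
    "MSE": mse(actual, predicted),
    "EC": equal_coefficient(actual, predicted),
}
```

Because MAPE divides by the real value, it is undefined for intervals with zero observed flow, which should be checked before applying it to sparse night-time counts.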

The traffic flow prediction curves of the different models are shown in Figure 8, and the predicted values are compared with the real values in Figure 9. In Figure 8, the X-axis represents the 24 hours of the fifth day, for which the traffic volume is predicted, and the Y-axis represents the traffic volume per 5-min interval. In Figure 9, the X-axis represents the real traffic flow value and the Y-axis represents the value predicted by each model. Table 3 compares the prediction performance indexes of the models. Figure 8 shows that the proposed hybrid ICEEMDAN-KOSELM-ARIMA model gives the best fit between predicted and real traffic flow values. Figure 8(a)–(d) indicates that the KOSELM model outperforms the ARIMA, BP, and ELM models. Figure 8(c)–(f) shows that, for the same base model, adding the ICEEMDAN decomposition step significantly improves prediction accuracy.

Figure 8.

Prediction results of different models.

Figure 9.

Comparison between predicted traffic flow values and real values of different models.

Table 3.

Comparisons of prediction performance indexes of different prediction models.

Model	MAE	MAPE (%)	MSE	EC
ARIMA	22.09	17.13	872.59	0.898
BP	20.39	15.50	723.37	0.907
ELM	17.75	13.88	576.71	0.917
KOSELM	14.00	10.84	380.07	0.932
ICEEMDAN-ELM	12.97	9.38	353.00	0.935
ICEEMDAN-KOSELM	8.99	9.19	278.88	0.942
ICEEMDAN-KOSELM-ARIMA	9.42	6.92	158.07	0.957

MAE: mean absolute error; MAPE: mean absolute percentage error; MSE: mean square error; EC: equal coefficient; ARIMA: auto-regressive integrated moving average; BP: backpropagation; ELM: extreme learning machine; KOSELM: kernel online sequential extreme learning machine; ICEEMDAN: improved complete ensemble empirical mode decomposition with adaptive noise.

Figure 9 shows that the ARIMA model fits the real values least well and has the largest prediction deviation. The fit improves progressively from the BP model to the ELM model to the KOSELM model, but their deviations remain large compared with those of the ICEEMDAN-ELM and ICEEMDAN-KOSELM models, which include the sequence decomposition step. The proposed hybrid ICEEMDAN-KOSELM-ARIMA model performs better than both the ICEEMDAN-ELM and ICEEMDAN-KOSELM models, with the smallest prediction deviation. This suggests that the hybrid model exploits the strength of the KOSELM model on sequences with strong randomness and that of the ARIMA model on sequences with weak randomness. Overall, the ICEEMDAN-KOSELM-ARIMA model deviates least from the real values, which supports the validity of the model.

In addition, Table 3 shows that the MAPE (6.92%) and MSE (158.07) of the proposed ICEEMDAN-KOSELM-ARIMA model are lower than those of all other models, indicating the highest prediction accuracy on these two measures. Its MAE (9.42) is slightly higher than that of the ICEEMDAN-KOSELM model (8.99), but it is better on every other index. Its EC value is 0.957, considerably higher than that of the traditional ARIMA model (0.898) and the closest to 1 among all models, indicating the best prediction performance and better stability. For the same base model, the variant that uses the ICEEMDAN algorithm has better prediction performance.

In conclusion, the hybrid ICEEMDAN-KOSELM-ARIMA model for short-term intersection traffic flow prediction has good prediction performance and can meet prediction requirements.

Conclusion

Focusing on historical intersection traffic flow time series, this article puts forward a new hybrid short-term traffic flow prediction model based on ICEEMDAN-KOSELM-ARIMA. Considering the nonlinear and stochastic characteristics of intersection traffic flow, the original flow sequence is first decomposed by the ICEEMDAN algorithm. The PE algorithm is then used to calculate the permutation entropies of the IMF components in order to analyze their randomness, and different prediction models are established accordingly: KOSELM models for the IMF components with strong randomness and ARIMA models for those with weak randomness. The component predictions are summed to obtain the final predicted traffic flow. Finally, traffic flow data from five consecutive working days at a typical intersection are used to validate the proposed hybrid model, and several other prediction models are built for comparison. The results show that, compared with the other prediction models, the proposed model has the lowest MAPE and MSE and the best fit to the actual values. Its absolute error range is also smaller than that of the other models, so prediction accuracy is effectively improved. These results demonstrate the effectiveness of the proposed algorithm.

In this article, many parameters of the prediction models are set according to the existing literature. In addition, the traffic dataset is limited: the verification data cover only traffic flow at one intersection, which has obvious periodic characteristics, and, owing to the experimental conditions, the proposed model has not been verified on freeway traffic flow. In future research, it is therefore advisable to incorporate parameter optimization when building the prediction models and to collect more field data. Model verification will also be carried out on traffic flow data with different periodic characteristics and on freeway traffic flow data.

Footnotes

Acknowledgements

The authors thank the editor and reviewers for their valuable comments.

Handling Editor: Crinela Pislaru

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has been supported by the National Key Technology Research and Development Plan Project (Grant No. 2014BAG03B03).

ORCID iDs

Xiujuan Tian

Shiguang Wang
