Sage Journals: Discover world-class research

Abstract

Signal phasing and timing can be adaptive and actuated in practice, which makes it challenging to know what the cycle length and phase duration of the next few cycles will be. Many innovative applications, such as dilemma zone warning and efficient route planning, can be designed based on knowledge of future signal timing states. This work proposes a long short-term memory model capable of predicting both cycle length and phase duration up to six cycles in the future. GPS information from several vehicles is merged with signal timing information from eight intersections. Several key features, such as waiting time, approach speed and acceleration, and departing speed and acceleration, are calculated based on the geolocation of individual journeys. The results show that cycle length prediction can reach a mean absolute error (MAE) of about 7 s, while the phase prediction MAE is about 9 s.

Keywords

data and data science, artificial intelligence, machine learning (artificial intelligence), probe vehicle data

Recent advancements in technology have changed the transportation domain significantly since the 2010s. The availability of several new data sources (e.g., sensor technology or vehicle technology) allows for data-driven methodologies that can be incorporated into well-established traffic management systems. In this work, we focus on using a deep learning architecture to model signal timing parameters from probe vehicle data. Each approach level volume and occupancy was aggregated at the cycle level from high-resolution detector data. The data were then fed into several machine learning models to compare their performance. Various input-output window sizes were analyzed to determine which one performs best. Different data preparation methods were also used to calculate the traffic parameters and signal information. Of the five models tested, long short-term memory (LSTM) gave the best results for mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The proposed model is validated on eight different intersections that use actuated signal controllers. The model can accurately predict the cycle length and phase duration of the next six cycles, which can be instrumental in infrastructure-to-vehicle (I2V) applications. The input data consist not only of traffic features of the target intersection but also of features related to the probe vehicle data upstream and downstream of the intersections.

Traditionally, the problem of signal control can be decomposed into a two-stage solution ( 1 ). The first stage is to develop traffic flow models that help to estimate macroscopic traffic parameters at intersections. The next stage is to develop appropriate signal plans based on the estimated traffic flow parameters. The methodology proposed in this paper would aid in reducing the dependency on the optimization algorithms while simultaneously simplifying the traditional two-stage traffic signal control problem into a one-stage solution. A model trained on one intersection can be easily transferred to another intersection, thereby reducing the dependency on signal control optimization. Additionally, most real-time systems that deliver signal phasing and timing (SPaT) messages to vehicles have a system delay ( 2 ). This is because the signal states need to travel through several servers to reach the cloud, and data from the cloud are only then sent to the vehicles. In many cases this delay can be so large that information about the next cycle is not readily available. Some of the delay is caused by transmission latency, but more of it results from the way adaptive and actuated signals are programmed. Such signal controllers can extend or reduce the cycle length or phase length based on the traffic flow. As such, cycle length and phase length are very dynamic in nature, which necessitates some form of prediction algorithm. In the dataset used in this paper, this delay was between 30 and 60 s. A predictive SPaT model can help to bridge this gap in latency. Moreover, given the time drift of fixed signals and the ever-changing traffic conditions of adaptive and actuated signals, accurate SPaT information is seldom available ( 3 ). The proposed model can also find applications in connected vehicles, where predictive SPaT information can aid in finding optimal routes and in fuel saving.

Literature Review

The literature review is divided into several broad categories to highlight the specific contributions of the current work: review of SPaT prediction, assessment of traffic-related prediction using deep learning, and traffic-related prediction using probe vehicle data. Each of the next paragraphs details the notable work in these three areas.

SPaT Prediction

Many studies have focused on the optimization of signal control ( 4 – 16 ). Some studies have utilized SPaT to find optimal fuel economy ( 17 ) or to guide a vehicle through a signal ( 18 ). Vehicle planning schemes for energy efficiency have also been studied for arterials that predict SPaT based on historical data ( 3 ). SPaT predictions based on GPS data from several buses have been studied by Fayazi et al. ( 19 ) for fixed signal timings. Floating vehicle data from other sources have also been investigated to estimate fixed timing signal parameters by using speed measurements ( 20 – 22 ). A smartphone camera has also been used to detect signal states ( 18 ). Ibrahim et al. ( 23 ) used historical SPaT from a single intersection to calculate future times. Goodall et al. ( 24 ) proposed a microscopic simulation algorithm that utilizes connected vehicle data to obtain future states, which can optimize for delays afterward. That study also mentions that it cannot be implemented in real time because of the computational requirements. A predictive SPaT model is proposed based on vehicles’ arrival time in a connected environment ( 25 ). This method has been only implemented at an isolated intersection, and it was unable to predict beyond the current cycle. As such, it is evident that SPaT predictions at a corridor level using both adaptive and actuated signal control have not yet been studied. SPaT predictions have been explored by Islam et al. ( 26 ) using detector data but not with connected vehicle data.

Assessment of Traffic-Related Prediction Using Deep Learning

Before the advent of deep learning, traffic prediction was conducted by the use of parametric models such as autoregressive models and Kalman filter ( 27 , 28 ) and non-parametric models such as k-nearest neighbor, support vector machine, neural network, and so forth ( 27 , 29 ). In the deep learning era, autoencoders (AE) were first used to show how models can effectively learn existing patterns in data ( 30 , 31 ). Recurrent neural networks (RNN) were also used extensively for traffic parameters like speed, volume, and travel time prediction ( 32 ). LSTM RNNs gathered popularity thereafter because of their memory cells that can decide when to remember past information. Accuracy of prediction also improved by the use of LSTMs even compared with AE and multilayer perceptron (MLP) ( 33 – 35 ). The addition of stacked LSTM layers also further improved the accuracy ( 36 ). Convolutional neural networks (CNN) have also been used to understand traffic flow patterns. Both temporal and spatial features have been used to generate two-dimensional feature sets that can be exploited with CNN ( 37 – 39 ). Several extensions of CNN were also reported to have superior accuracy to traditional CNNs, such as eRCNN ( 40 ) and GraphCNN ( 41 , 42 ). Combinations of several methods have also been used. For example, k-nearest neighbor together with LSTM ( 43 ), AE and LSTM ( 44 ), or CNN and LSTM ( 45 , 46 ). From these studies, it can be concluded that models such as CNN and LSTM are widely used in traffic-related prediction.

Traffic-Related Prediction Using Detector Data

Detector data from various sources such as loop detectors, magnetic and pneumatic tube sensors, radar sensors, infrared sensors, and microwave/radar detectors have also been used in several studies to make traffic-related predictions. Travel time predictions and travel time trends have been studied extensively ( 47 – 50 ). Traffic flow prediction and volume estimation have also been studied, mostly for short-term ( 51 – 53 ) and long-term ( 54 – 56 ) horizons. Data from camera and loop detectors have also been used to predict cycle volumes ( 57 , 58 ). Advanced detectors can often be used to predict speed profiles as well ( 36 , 37 , 39 , 59 , 60 ). Data from smartphones have been used to predict vehicle maneuvers ( 61 ) and for activity recognition ( 62 ). In the safety field, several crash risk prediction models have also been proposed and validated using detector data ( 63 – 65 ). By comparison, SPaT prediction has been sparsely studied.

The previous studies discussed have not used real-time detector information for SPaT prediction which provides accurate granular traffic flow for all phases. Additionally, such methods fail to capture the real-time vehicle demand fluctuations in the field and would usually suggest an average solution. The computational requirements of the optimization methods also may not support real-time applications ( 24 ). On the other hand, the LSTM model can be used to make real-time predictions once trained. Additionally, most studies concentrate on one intersection where the reproducibility of the results cannot be understood.

Traffic-Related Prediction Using Probe Vehicle Data

Several studies have been conducted to validate the accuracy and reliability of probe source data ( 66 – 68 ). These found that the quality of data from probe vehicles improved significantly. With high-resolution probe vehicle data, this method has the potential to provide real-time accurate signal timing predictions, based on the variables related to driving behavior, compared to the previous studies. Based on our literature review, there have been limited studies which used probe vehicle data to predict SPaT. However, studies have shown other uses of probe vehicle data. Travel time and speed predictions have been studied extensively ( 69 ). Other studies have also shown the potential for traffic density estimation with probe vehicle data ( 70 ). Each of the studies mentioned above shows the potential that probe vehicle data has to predict many parameters, which are closely related to SPaT prediction.

The proposed work in this paper addresses this research gap and also elaborates on data aggregation at the cycle level from high-resolution probe vehicle data to obtain counts, speeds, and acceleration rates. It also obtains data from traffic signal controllers to calculate cycle length and phase duration. The process of windowing the data and sampling data from the previous day and previous week can also be highlighted as one of the contributions of the paper.

This paper is organized as follows: the next section details the data preparation steps to obtain count, speed, acceleration, and signal parameters. The methodology and model description are presented in the fourth section followed by the results. Finally, discussions about possible applications using this methodology and concluding remarks are made.

Data Preparation

A corridor in Orlando, Florida (SR 434) that operates actuated signal control was selected, as shown in Figure 1. It has a total of eight intersections. All eight intersections are equipped with automated traffic signal performance measures (ATSPM), which store high-resolution detector data from each intersection along with signal timing parameters such as cycle length, green time, red time, and so forth. Data from November 11, 2019, to November 17, 2019, were used in the study. Of the eight intersections, two have protected movements for all phases (Intersections 1295 and 1130). Intersection 1280 does not have protected left turns for the minor road but instead uses dummy phases to balance the ring. Four intersections do not have protected left turns (1285, 1290, 1300, 1305), while Intersection 1275 is a T-intersection. Figure 1 also labels each intersection with its SignalID. ATSPM was used to extract signal and phasing information.

Figure 1.

Study area.

Table 1 shows the phases that are related to each intersection identity, as well as the direction each phase is tied to, as defined in the ATSPM signal controller database. It also shows the ring diagram that is generally followed for each of the intersections. It should be noted that slightly different ring diagrams are also implemented within the same barrier, depending on the demand at each approach. For example, the through movements may be served only if there is no left-turn demand.

Table 1.

Ring Diagram for the Selected Intersections in the Study Area

Intersection	Ring diagram							Notes	WB-T	EB-T	WB-L	EB-L	SB-T	NB-T	SB-L	NB-L
1275	Ring 1	1	2	8				T-intersection	2	6	-	1	8	-	-	-
1275	Ring 2		6					T-intersection	2	6	-	1	8	-	-	-
1280	Ring 1	1	2	3	4			-	2	6	5	1	8	4	3	7
1280	Ring 2	5	6	8	7			-	2	6	5	1	8	4	3	7
1285	Ring 1	1	2	3	8	7	4	-	6	2	1	5	4	8	-	-
1285	Ring 2	5	6					-	6	2	1	5	4	8	-	-
1290	Ring 1	1	2	3	8	7	4	-	2	6	5	1	8	4	-	-
1290	Ring 2	6	5					-	2	6	5	1	8	4	-	-
1295	Ring 1	1	2	4	3			-	6	2	1	5	4	8	7	3
1295	Ring 2	6	5	7	8			-	6	2	1	5	4	8	7	3
1300	Ring 1	1	2	4				No protected left for minor road	6	2	1	5	4	8	-	-
1300	Ring 2	5	6	8				No protected left for minor road	6	2	1	5	4	8	-	-
1305	Ring 1	1	2	4				No protected left for minor road	6	2	1	5	4	8	-	-
1305	Ring 2	5	6	8				No protected left for minor road	6	2	1	5	4	8	-	-
1130	Ring 1	1	2	3	4			-	6	2	1	5	4	8	7	3
1130	Ring 2	6	5	7	8			-	6	2	1	5	4	8	7	3

Note: EB = eastbound; NB = northbound; SB = southbound; WB = westbound; L = left; T = through; “-” = not applicable/no data.

The connected vehicle (CV) data used in this paper were provided by the CV data service, Wejo. The dataset contains vehicle-specific data from several manufacturers. It mainly has non-commercial fleet data, which better represents the vehicles on the roadways. Instantaneous data are sent from the vehicle to the cloud in near real time. The dataset consists of GPS location, heading, speed, postal code, journeyId and dataPointId. The sampling interval of the dataset was limited to 3 s. One week of data was used in this study, the week of November 11–17, 2019. The entire dataset had a total of 100 million GPS points. The CV data were processed according to the pipeline shown in Figure 2. The CV data points were spatially filtered to obtain the trajectories within 330 ft of the selected eight intersections. The arrival time was taken as the time a vehicle entered the 330 ft buffer, and the departure time as the time the vehicle left the buffer. Based on the arrival and departure, approaching acceleration and speed as well as departing acceleration and speed were computed for each journey. Whether a trajectory waited at the intersection was also estimated from the speed profile. The approaching and departing directions were also computed since they indicate the particular phase the vehicle used.
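The buffer step described above can be illustrated with a minimal sketch. It assumes each journey is a time-sorted list of (timestamp, latitude, longitude) samples and that distance is measured with the haversine formula; the function names, the tuple layout, and the example coordinates are hypothetical and are not taken from the Wejo schema or the authors' pipeline.

```python
import math

EARTH_RADIUS_FT = 20_902_231  # mean Earth radius (about 6,371 km) in feet
BUFFER_FT = 330.0             # buffer radius stated in the text


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def buffer_passage(points, center, radius_ft=BUFFER_FT):
    """Return (arrival_time, departure_time) for one journey, or None.

    points: list of (timestamp_s, lat, lon) tuples sorted by time.
    Arrival is the first sample inside the buffer and departure is the
    last one (an assumed simplification for a single pass through).
    """
    inside = [t for t, lat, lon in points
              if haversine_ft(lat, lon, center[0], center[1]) <= radius_ft]
    if not inside:
        return None
    return inside[0], inside[-1]


# Hypothetical journey sampled every 3 s, driving south through an intersection.
journey = [(0, 28.6020, -81.3), (3, 28.6005, -81.3), (6, 28.6000, -81.3),
           (9, 28.5995, -81.3), (12, 28.5980, -81.3)]
print(buffer_passage(journey, (28.6000, -81.3)))
```

With a 3 s sampling interval, the arrival and departure times found this way are only accurate to within one sample, which is one reason the waiting status is estimated from the speed profile rather than from the buffer crossing alone.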

Figure 2.

Data processing pipeline.

Meanwhile, the ATSPM data were processed to extract cycle lengths and phase lengths. Temporal features such as hour, day of week, weekend/weekday, and so forth were also derived. The cycle start time and end time were also noted. The two datasets were merged based on overlapping timestamps. For instance, if the arrival time of a Wejo vehicle within the buffer area is 10:10:12, and the cycle start time and end time are 10:10:11 and 10:10:14, then this particular vehicle would be considered to use this cycle. The vehicle dynamic features such as speed, acceleration, direction, and so forth were also used as features for this cycle. Finally, the individual journeys were aggregated per cycle to provide information on the number of vehicles that used the cycle, the direction of travel, and the vehicle dynamics such as aggregated acceleration and speed. This resulted in 18,790 journeys that used the intersections within the study period. The different features estimated are shown in Table 2. Cycle lengths greater than 1,500 s were discarded since these usually occur under unsaturated conditions at nighttime or from detector failure; 95% of the cycle lengths were within 350 s. The journeyID_count is the number of vehicles that have used a cycle. The maximum is shown in the table as 23, which is lower than the expected maximum volume. The penetration rate of CVs was around 3% during the study period.
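The timestamp-based merge and the per-cycle aggregation can be sketched as follows. This is an illustrative example only: it assumes cycles are non-overlapping, sorted by start time, and treated as half-open intervals [start, end), and it aggregates a single feature. The dictionary keys reuse feature names from Table 2, but the function names and the toy data are hypothetical.

```python
from bisect import bisect_right
from statistics import mean


def assign_cycle(arrival, cycle_starts, cycle_ends):
    """Index of the cycle whose [start, end) interval contains arrival, else None."""
    i = bisect_right(cycle_starts, arrival) - 1
    if i >= 0 and arrival < cycle_ends[i]:
        return i
    return None


def aggregate_per_cycle(journeys, cycle_starts, cycle_ends):
    """Group journeys by cycle and summarise approach speed per cycle."""
    grouped = {}
    for j in journeys:
        idx = assign_cycle(j["arrival"], cycle_starts, cycle_ends)
        if idx is not None:  # journeys outside every cycle are dropped
            grouped.setdefault(idx, []).append(j["speed_approach"])
    return {
        idx: {
            "journeyID_count": len(speeds),
            "speed_approach_min": min(speeds),
            "speed_approach_max": max(speeds),
            "speed_approach_mean": mean(speeds),
        }
        for idx, speeds in grouped.items()
    }


# Toy data: three cycles (times in seconds) and four journeys.
starts, ends = [0, 120, 250], [120, 250, 400]
journeys = [
    {"arrival": 10, "speed_approach": 30.0},
    {"arrival": 100, "speed_approach": 40.0},
    {"arrival": 130, "speed_approach": 25.0},
    {"arrival": 500, "speed_approach": 50.0},  # after the last cycle
]
print(aggregate_per_cycle(journeys, starts, ends))
```

The same grouping would be repeated for acceleration, waiting time, and direction to obtain the min, max, and mean columns listed in Table 2.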

Table 2.

Variable Statistics

Feature	Description	Mean	Std dev.	Minimum	75%	Maximum
journeyID_count	Number of unique journeys per cycle	3.38	2.5	1	5	23
isStopped_min	Statistics of vehicles stopped per cycle	0.14	0.34	0	0	1
isStopped_max		0.61	0.49	0	1	1
isStopped_mean		0.35	0.35	0	0.56	1
acc_approach_min	Statistics of the approaching acceleration aggregated per cycle (m/s²)	−3.84	3.1	−21.12	−0.77	8.83
acc_approach_max		0.07	2.78	−21.12	1.54	10.37
acc_approach_mean		−1.79	2.2	−21.12	−0.19	8.83
acc_depart_min	Statistics of the departing acceleration aggregated per cycle (m/s²)	0.45	2.04	−9.6	1.15	13.83
acc_depart_max		3.36	2.46	−9.6	4.99	18.43
acc_depart_mean		1.82	1.71	−9.6	2.92	13.83
speed_approach_min	Statistics of the approaching speed aggregated per cycle (mph)	24.74	14.04	0	35.78	70.15
speed_approach_max		40.05	10.67	0	47.96	91.62
speed_approach_mean		32.78	10.23	0	40.80	70.15
speed_depart_min	Statistics of the departing speed aggregated per cycle (mph)	29.45	11.91	0	37.22	81.60
speed_depart_max		41.81	8.85	0	47.96	90.19
speed_depart_mean		36.08	8.14	0	41.75	81.60
waitingTime_min	Statistics of the wait time at an intersection (s)	33.66	217.28	0	12	6,495
waitingTime_max		259.99	754.79	2	126	10,953
waitingTime_mean		94.88	291.92	2	60.5	6,495
direction_min	Direction of travel of the vehicles; both approach and departing directions are considered; each whole number represents a direction	3.11	4.14	0	4	14
direction_max		9.43	4.53	0	12	14
direction_mean		6.22	3.61	0	8.6	14
direction_sum		21.38	19.08	0	29	144
direction_count		3.38	2.5	1	5	23
dayofweek	Day of week	2.9	1.93	0	5	6
hour	Hour	12.96	5.22	0	17	23
weekday	Weekday or weekend	0.74	0.44	0	1	1
timeofday	Time of day	5.43	2.48	1	7	10
cycle_length	Length of a cycle (s)	174.09	118.11	32	191	1,487

Model

Several models were evaluated alongside LSTM: random forest, support vector machine, gradient boosting and extreme gradient boosting. Brief descriptions of the models are provided in this section followed by comparative results.

Random Forest

Random forest (RF) is a tree-based ensemble method that employs two distinct machine learning techniques: random feature selection and bagging ( 71 ). Bagging trains each decision tree separately on a bootstrap sample of the training data, whereas random feature selection restricts the trees to random subsets of the features rather than employing all of them. A decision tree model starts at the root node and splits the data on the features that result in the largest information gain. This partition process is repeated iteratively until the child nodes contain values that belong to the same class ( 72 ). For forecasting the output of a new sample, RF uses the mean value of the outputs of the independently trained trees.

Support Vector Machine

The support vector machine (SVM) method is a widely used supervised learning algorithm. Given a data set D in the form of $\{x_{i}, y_{i}\}_{i = 1}^{N}$ , where $x_{i} \in R^{d}$ are the samples and $y_{i}$ is the label, SVM maps each feature vector $x_{i}$ to a d-dimensional space, with d as the number of features of the samples. For a multiclass classification problem, SVM breaks the problem down into multiple binary classification problems and finds the hyperplane (decision boundary) that separates the classes of samples. The distance between two classes is regarded as the margin distance. SVM uses a loss function to maximize the margin distance, which is to solve Equation 1:

J = \min \frac{1}{2} w^{T} w + C \sum_{i = 1}^{N} ε_{i}

(1)

subject to:

y_{i} (w^{T} K_{i} (x_{i}) + b) \geq 1 - ε_{i}

where $w$ is the weight vector, $b$ is the bias, $C$ is the cost coefficient, $ε_{i}$ is a slack variable for the non-separable data, and $K_{i}$ is the kernel function that transforms the data to the feature space. Different kernel functions can be used, such as linear functions, radial basis functions, and so forth.

Gradient Boosting

Gradient boosting model (GBM) is a machine learning method that utilizes many decision trees (weak learners) to generate results. For GBM, at each iteration a new tree is added. The subsequent trees will give extra weights to the samples that are incorrectly classified by the prior tree. Residuals are added to generate the final classification result based on all the trees ( 73 ).

Extreme Gradient Boosting

XGBoost (eXtreme Gradient Boosting) is an extension of the popular tree gradient boosting algorithms ( 74 ). Boosting is the mechanism of adding models sequentially until optimal performance metrics are achieved. In gradient boosting, each new model predicts the residuals of the prior models, and the model outputs are added together for the final prediction. XGBoost has been shown to be an efficient and scalable version of gradient boosting trees that can make full use of the available memory and hardware resources for data-intensive models. It also integrates sparsity-aware data handling as well as a weighted quantile sketch for approximate learning, as shown by Chen and Guestrin ( 75 ). The authors were also able to build a scalable package by drawing on insights into cache access patterns, data compression, and sharding, which made it possible to train on billions of samples with modest resources.

Mathematically XGBoost is summarized with Equations 2 to 4.

f_{t} (x) = ω_{q (x)}

(2)

L^{(t)} = \sum_{i = 1}^{n} l (y_{i}, {\hat{y}}_{i}^{(t - 1)} + f_{t} (x_{i})) + Ω (f_{t})

(3)

Ω (f_{t}) = γ T + 0.5 λ \sum_{j = 1}^{T} ω_{j}^{2}

(4)

where $x$ is the input data, $y_{i}$ is the target variable while ${\hat{y}}_{i}^{(t - 1)}$ is the predicted value for the i-th sample after iteration $t - 1$, $f_{t}$ is the tree added at iteration $t$, $q$ is the function that maps a sample to a leaf of the tree, $ω$ is the leaf weight, $T$ is the number of leaf nodes, $L^{(t)}$ is the objective at iteration $t$, $l$ is the loss function, and $γ$ and $λ$ are the regularization parameters.
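As a small worked illustration of the regularized objective, the sketch below evaluates the penalty term of Equation 4 and the objective of Equation 3 for a single tree. It assumes a squared-error loss, which the text does not specify, and the function and argument names (`regularization`, `objective`, `gamma`, `lam`) are hypothetical stand-ins for the two regularization parameters.

```python
def regularization(leaf_weights, gamma, lam):
    """Penalty of Equation 4: gamma * T + 0.5 * lam * sum of squared leaf weights."""
    n_leaves = len(leaf_weights)  # T in Equation 4
    return gamma * n_leaves + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(y_true, y_prev, f_t, leaf_weights, gamma, lam):
    """Objective of Equation 3 with an assumed squared-error loss.

    y_prev holds the predictions after the previous iteration and f_t holds
    the new tree's output (a leaf weight) for each sample.
    """
    loss = sum((y - (p + f)) ** 2 for y, p, f in zip(y_true, y_prev, f_t))
    return loss + regularization(leaf_weights, gamma, lam)


# A tree with three leaves; two samples fall into the first two leaves.
weights = [1.0, -2.0, 0.5]
print(regularization(weights, gamma=1.0, lam=2.0))  # 3 + 0.5 * 2 * 5.25 = 8.25
print(objective([10.0, 12.0], [8.0, 13.0], [1.0, -2.0], weights, 1.0, 2.0))
```

The first term of the penalty grows with the number of leaves and the second with the size of the leaf weights, so both discourage overly complex trees.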

LSTM

LSTM networks ( 76 ) belong to the RNN family and are adept at overcoming the shortcoming of conventional RNN: the vanishing gradient problem. This limitation means that RNNs can remember only recent information. LSTM can make use of both short-term and long-term information to make a prediction. This is especially important in the prediction of signal timing because generally optimizations are done by considering both the short-term turbulent traffic flow and the long-term mean traffic parameters. LSTMs have an input layer, a hidden layer, and an output layer. While the input and output layers are traditional neurons, the hidden layers are specialized memory cells that can store information.

For the input of sequence $x_{t}$ and $i, o, f, c$ as the input gate, output gate, forget gate, and cell input of LSTM, the hidden layer function ( 77 ) can be formulated as in Equations 5 to 9.

i_{t} = σ (W_{xi} x_{t} + W_{hi} h_{t - 1} + W_{ci} c_{t - 1} + b_{i})

(5)

f_{t} = σ (W_{xf} x_{t} + W_{hf} h_{t - 1} + W_{cf} c_{t - 1} + b_{f})

(6)

c_{t} = f_{t} c_{t - 1} + i_{t} \tanh (W_{xc} x_{t} + W_{hc} h_{t - 1} + b_{c})

(7)

o_{t} = σ (W_{xo} x_{t} + W_{ho} h_{t - 1} + W_{co} c_{t - 1} + b_{o})

(8)

h_{t} = o_{t} \tanh (c_{t})

(9)

where $σ$ is the logistic sigmoid function, $h$ is the hidden vector, and $b$ denotes the bias terms. The weight matrices are represented with $W$ : $W_{xo}$ is the weight matrix between the input and the output gate, $W_{xf}$ is that between the input and the forget gate, and so forth. A typical structure of an LSTM cell is shown in Figure 3 and the proposed network architecture is shown in Figure 4. The model has four stacked LSTM layers followed by one dense layer. Several hyperparameters, such as the number of LSTM and dense cells, learning rate, batch size, and so forth, were tuned, and the best combination was chosen based on the evaluation metrics.

Figure 3.

Long short-term memory (LSTM) cell structure.

Figure 4.

Long short-term memory (LSTM) architecture.
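To make the gate equations concrete, the following sketch performs one forward step of a single LSTM cell following Equations 5 to 9, reading Equation 7 as the update of the cell state $c_{t}$. It is a plain-Python illustration rather than the trained model: the parameter dictionary `p`, its key names, and the treatment of the peephole weights $W_{ci}$, $W_{cf}$, and $W_{co}$ as element-wise (diagonal) vectors are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following Equations 5 to 9.

    p maps names to parameters: "W_x*" and "W_h*" are matrices, "W_c*" are
    peephole weight vectors applied element-wise (assumed diagonal), and
    "b_*" are bias vectors, where * is one of i, f, c, o.
    """
    def pre(gate, peephole=None):
        xs = matvec(p["W_x" + gate], x_t)
        hs = matvec(p["W_h" + gate], h_prev)
        out = [a + b + c for a, b, c in zip(xs, hs, p["b_" + gate])]
        if peephole is not None:
            out = [o + w * c for o, w, c in zip(out, p["W_c" + gate], peephole)]
        return out

    i_t = [sigmoid(z) for z in pre("i", c_prev)]   # Eq. 5, input gate
    f_t = [sigmoid(z) for z in pre("f", c_prev)]   # Eq. 6, forget gate
    g_t = [math.tanh(z) for z in pre("c")]         # candidate term in Eq. 7
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]  # Eq. 7, cell state
    o_t = [sigmoid(z) for z in pre("o", c_prev)]   # Eq. 8, output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]    # Eq. 9, hidden state
    return h_t, c_t
```

Running `lstm_step` repeatedly over the cycles in an input window, feeding each returned `h_t` and `c_t` into the next call, gives the behavior of one LSTM layer; stacking layers means passing the sequence of hidden states of one layer as the inputs of the next.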

Results

Performance metrics were evaluated for individual intersection models as well as combined intersection models. MAE, MAPE, and RMSE were taken as the metrics and calculated with Equations 10 to 12, where ${\hat{y}}_{i}$ are the predicted values and $y_{i}$ are the actual values.

MAE = \frac{\sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |}{n}

(10)

MAPE = \frac{1}{n} \sum_{i = 1}^{n} \frac{| y_{i} - {\hat{y}}_{i} |}{| y_{i} |}

(11)

RMSE = \sqrt{\frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{n}}

(12)
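The three metrics can be computed directly from paired lists of actual and predicted values, as in the sketch below. It assumes that the denominator of the MAPE in Equation 11 is the actual value $y_{i}$ and that MAPE is reported as a fraction (0.03 meaning 3%); the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error (Equation 10)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (Equation 11).

    Undefined when an actual value is zero, which is relevant for phases
    that are skipped and therefore have a duration of 0 s.
    """
    return sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error (Equation 12)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Two hypothetical cycle lengths (s) and their predictions.
actual, predicted = [100.0, 200.0], [110.0, 180.0]
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

Because the RMSE squares each error before averaging, it is always at least as large as the MAE and is more sensitive to occasional large misses.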

The performance of all the models for the cycle length prediction is shown in Table 3. It can be noted that LSTM is the best performing model in MAE, MAPE and RMSE. The tree-based models such as RF, GBM, and XGBT models returned similar results, with the MAE about 40 s, MAPE at 23%, and RMSE around 85 s. Since the average cycle length is 174 s in the processed data set, these models will have average predictions between 134 and 214 s. The results from LSTM are much better, with a MAE of 5.79 s. It is important to note here that all these model results are compared based on the current cycle length prediction. The impact of window lengths is explored in the next part of the results.

Table 3.

Modeling Results

Model	Mean absolute error (s)	Mean absolute percentage error (fraction)	Root mean square error (s)
Support vector machine	290.33	2.15	326.52
Random forest	40.18	0.23	86.05
Gradient booster model	40.24	0.23	89.15
eXtreme gradient booster	40.96	0.23	88.38
Long short-term memory	5.79	0.03	7.22

The impact of input and output window sizes is also explored with the best performing model, LSTM. Two input window sizes are taken, 5 and 6, while the output window size is varied from 1 to 6. For instance, in Table 4, “5,4” means that the model takes the last five cycles’ information and predicts the cycle length and phase length of the next four cycles. In Table 4, both cycle length prediction and phase duration prediction results are shown. For the phase prediction, the model predicts all the phases as individual outputs. If the prediction window is four cycles, there will be eight phase predictions per cycle for a total of 32 phases for the entire output. The results have acceptable performance with the minimum MAE for “6,2” in cycle length prediction (7.16 s) and “6,6” for phase length prediction (8.71 s). It should also be noted that for the phase length prediction, the MAPE is always 40% or more even though the MAE is always below 10 s. The reasoning is that sometimes a particular phase will ideally remain closed when there is no demand and the phase length is then 0 s. If the model outputs a prediction of 0.4 s, that will lead to a 40% error. The true values and predicted values were plotted on a graph to identify this abnormal error even though the MAE was good. A way to mitigate this error could be to round unreasonably small phase lengths to zero.

Table 4.

Cycle Length and Phase Length Prediction Performance

	Window	5,1	5,2	5,3	5,4	5,5	6,1	6,2	6,3	6,4	6,5	6,6
Cycle length	MAE	12.34	11.39	17.82	18.73	6.51	9.22	7.16	21.6	21.68	9.25	19.7
	MAPE	0.04	0.39	0.05	0.06	0.06	0.01	0.05	0.07	0.13	0.04	0.10
	RMSE	14.43	14.56	19.88	20.61	8.17	11.37	9.12	23.88	25.92	11.34	23.02
Phase length	MAE	9.32	9.08	9.02	8.83	8.95	9.14	8.77	9.00	8.79	8.81	8.71
	MAPE	0.78	0.49	0.40	1.43	0.41	1.19	0.49	0.49	0.57	0.57	0.41
	RMSE	19.14	19.43	18.80	18.19	18.87	19.66	17.95	18.23	18.73	18.65	17.76

Note: MAE = mean absolute error; MAPE = mean absolute percentage error; RMSE = root mean square error.
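The input and output windows in Table 4 can be illustrated with a small sliding-window helper. This is a sketch under simplifying assumptions: it slices a single per-cycle sequence into both inputs and targets, whereas the model described here uses a feature vector per cycle as input and cycle or phase lengths as targets. The function name `make_windows` is hypothetical.

```python
def make_windows(series, n_in, n_out):
    """Split a per-cycle sequence into (input, target) pairs.

    Each input holds n_in consecutive cycles and each target holds the
    n_out cycles that immediately follow it. Sequences shorter than
    n_in + n_out yield no pairs.
    """
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


# The "5,4" setting: five past cycles in, four future cycles out.
cycle_ids = list(range(10))
for x, y in make_windows(cycle_ids, n_in=5, n_out=4):
    print(x, "->", y)
```

Larger output windows reduce the number of training pairs that can be cut from the same week of data, which is one practical trade-off when comparing the window settings.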

To understand the importance of the variables, we computed the effect on the prediction when the input features are perturbed. The inputs are perturbed with random noise drawn from a normal distribution centered at zero with a standard deviation of 0.2. The RMSE between the predictions and the ground truth is then computed. The larger the RMSE after perturbing a feature, the more sensitive the predictions are to that feature and thus the more important it is for the model. The results are shown in Figure 5. It can be noted that the intersection identity of each signal is the most significant factor. Since each intersection experiences different traffic flow and each also has different numbers of lanes and other geometric features, it is likely that the model interprets this identity to be an important feature. Therefore, transferability of the model would be dependent on these factors. The other significant features include various speed and acceleration metrics, the number of vehicles (journeyID_count), and the waiting time of each vehicle.

Figure 5.

Feature importance using perturbation.
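The perturbation procedure can be sketched with a generic prediction function. The example assumes that features are perturbed one at a time, that they are on a comparable (for example, standardized) scale so that a standard deviation of 0.2 is meaningful, and that `predict` is any callable mapping feature rows to predictions; these details and the toy model are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def perturbation_importance(predict, X, y, sigma=0.2, seed=0):
    """RMSE after adding zero-mean normal noise to one feature at a time.

    predict: callable mapping a list of feature rows (lists) to predictions.
    X: list of feature rows, y: list of ground-truth values.
    Returns one RMSE per feature; a larger value means the predictions are
    more sensitive to that feature.
    """
    rng = random.Random(seed)
    scores = []
    for j in range(len(X[0])):
        X_noisy = [row[:j] + [row[j] + rng.gauss(0.0, sigma)] + row[j + 1:]
                   for row in X]
        preds = predict(X_noisy)
        mse = sum((t - p) ** 2 for t, p in zip(y, preds)) / len(y)
        scores.append(math.sqrt(mse))
    return scores


# Toy model that depends only on the first of two features.
def toy_predict(rows):
    return [3.0 * r[0] for r in rows]


X = [[float(i), float(2 * i)] for i in range(50)]
y = toy_predict(X)
print(perturbation_importance(toy_predict, X, y))
```

In this toy case the second score is exactly zero because the model ignores the second feature, while the first score is positive; ranking features by these scores gives an ordering of the kind plotted in Figure 5.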

Conclusion

This work proposed the use of CV data to predict SPaT information. SPaT information from several intersections was collected for a week in November 2019 alongside the GPS coordinates of individual vehicles. A buffer of 330 ft around each intersection was selected to identify the approaching and departing vehicles at an intersection, and several features were calculated. The individual journeys were then aggregated per cycle along with cycle length and phase duration information from ATSPM data. The prediction model based on LSTM was then used to predict up to six cycles in the future. Various input and output window lengths were experimented with, and MAE, MAPE, and RMSE were taken as metrics for evaluation. The LSTM model outperformed the other models such as SVM, RF, GBM, and XGBT. The MAE, MAPE, and RMSE for the LSTM model were 5.79 s, 3%, and 7.22 s respectively for cycle length prediction and 8.83 s, 143%, and 18.19 s for phase length prediction.

Overall, SPaT prediction can help to deploy real-time traffic safety and mobility features. Often in complicated urban settings, even high-speed internet facilities cannot deliver SPaT messages in real time. A study by Goodall et al. ( 24 ) proposes an algorithm that can optimize delay by using CV data but also notes that it cannot be applied in real time because of the computational requirements. The future forecast of the SPaT timings can be relied on in such cases. It can aid in overcoming the system delay in processing and broadcasting SPaT messages. Moreover, traffic flow prediction or speed prediction can also be improved, for the case of arterials, if future signal states are known. In such cases, improving the prediction by even a few seconds can prove to be useful. Furthermore, a new intersection with similar traffic flow parameters may benefit from a model trained on a totally different intersection. While the proposed methodology cannot rule out signal optimization altogether, it can help relevant authorities replicate a well-performing signal timing at other intersections.

There are many potential applications of predicting the SPaT timing of the upcoming cycles. It can make route planning and trajectory estimation ( 78 ) more efficient in a connected environment since future states of the signals can be predicted. This would aid in relevant studies where vehicle velocity is optimized to traverse intersections at green times, thereby reducing the carbon footprint ( 17 ), or to find optimal velocity ( 18 ). The predicted signal timings can also be used to aid vehicles in the dilemma zone if the predictions can be transferred to the vehicles using onboard units. Most prominently, the signal retiming effort can be reduced to a great extent. Recent studies related to safety that try to predict pedestrian and vehicle conflicts using signal timing information ( 79 , 80 ) can also benefit from having extended SPaT information up to six cycles in the future.

Footnotes

Acknowledgements

The authors acknowledge Wejo for providing the connected vehicle data.

Author Contributions

The authors confirm contribution to the paper as follows: model development: Z. Islam, J. Ugan; data preparation: Z. Islam, J. Ugan, M. Abdel-Aty; analysis and result interpretation: Z. Islam, J. Ugan, M. Abdel-Aty; draft manuscript: Z. Islam, J. Ugan, M. Abdel-Aty. All authors have reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Zubayer Islam

Mohamed Abdel-Aty

Jorge Ugan

This paper and its contents, including conclusions and results, are solely those of the authors; they do not represent opinions or policies of Wejo Limited.

References

1. Wang F.-Y. Traffic Signal Timing via Deep Reinforcement Learning. IEEE/CAA Journal of Automatica Sinica, Vol. 3, No. 3, 2016, pp. 247–254.

2. Mathew J. K., Kim, Saldivar-Carranza E. D., Sturdevant, Smith W. B., Bullock D. M. Connected Vehicle Corridor Deployment and Performance Measures for Assessment. Purdue University, West Lafayette, IN, 2019.

3. Mahler, Vahidi. An Optimal Velocity-Planning Scheme for Vehicle Energy Efficiency Through Probabilistic Prediction of Traffic-Signal Timing. IEEE Transactions on Intelligent Transportation Systems, Vol. 15, No. 6, 2014, pp. 2516–2523.

4. Porche, Lafortune. Adaptive Look-Ahead Optimization of Traffic Signals. Journal of Intelligent Transportation System, Vol. 4, No. 3–4, 1999, pp. 209–254.

5. Shen, Wang, Zhu F, eds. Agent-Based Traffic Simulation and Traffic Signal Timing Optimization with GPU. 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, D.C., October 5–7, 2011, IEEE, New York, pp. 145–150.

6. Sun, Benekohal R. F., Waller S. T. Bi-Level Programming Formulation and Heuristic Solution Approach for Dynamic Traffic Signal Optimization. Computer-Aided Civil Infrastructure Engineering, Vol. 21, No. 5, 2006, pp. 321–333.

7. Putha, Quadrifoglio, Zechman. Comparing Ant Colony Optimization and Genetic Algorithm Approaches for Solving Traffic Signal Coordination Under Oversaturation Conditions. Computer-Aided Civil Infrastructure Engineering, Vol. 27, No. 1, 2012, pp. 14–28.

8. Liu, Han, Gayah, Friesz, Yao. Data-Driven Linear Decision Rule Approach for Distributionally Robust Optimization of On-Line Signal Control. Transportation Research Procedia, Vol. 7, 2015, pp. 536–55.

9. Islam S. M. A. B. A., Hajbabaie. Distributed Coordinated Signal Timing Optimization in Connected Transportation Networks. Transportation Research Part C: Emerging Technologies, Vol. 80, 2017, pp. 272–285.

10. Zhao, Wang, Ban. Dynamic Traffic Signal Timing Optimization Strategy Incorporating Various Vehicle Fuel Consumption Characteristics. IEEE Transactions on Vehicular Technology, Vol. 65, No. 6, 2016, pp. 3874–3887.

11. Chang T.-H., Sun G.-Y. Modeling and Optimization of an Oversaturated Signalized Network. Transportation Research Part B: Methodological, Vol. 38, No. 8, 2004, pp. 687–707.

12. Yang, Jayakrishnan. Real-Time Network-Wide Traffic Signal Optimization Considering Long-Term Green Ratios Based on Expected Route Flows. Transportation Research Part C: Emerging Technologies, Vol. 60, 2015, pp. 241–257.

13. Zhang, Yin, Chen. Robust Signal Timing Optimization with Environmental Concerns. Transportation Research Part C: Emerging Technologies, Vol. 29, 2013, pp. 55–71.

14. Elefteriadou, Ranka. Signal Control Optimization for Automated Vehicles at Isolated Signalized Intersections. Transportation Research Part C: Emerging Technologies, Vol. 49, 2014, pp. 1–18.

15. Liang, Guler S. I., Gayah V. V. Signal Timing Optimization with Connected Vehicle Technology: Platooning to Improve Computational Efficiency. Transportation Research Record: Journal of the Transportation Research Board, 2018. 2672(18): 81–92.

16. Beard, Ziliaskopoulos. System Optimal Signal Optimization Formulation. Transportation Research Record: Journal of the Transportation Research Board, 2006. 1978(1): 102–112.

17. Asadi, Vahidi. Predictive Cruise Control: Utilizing Upcoming Traffic Signal Information for Improving Fuel Economy and Reducing Trip Time. IEEE Transactions on Control Systems Technology, Vol. 19, No. 3, 2010, pp. 707–714.

18. Koukoumidis, Peh L.-S., Martonosi M. R., eds. Signalguru: Leveraging Mobile Phones for Collaborative Traffic Signal Schedule Advisory. In Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services, Bethesda, MD, 2011.

19. Fayazi S. A., Vahidi, Mahler, Winckler. Traffic Signal Phase and Timing Estimation from Low-Frequency Transit Bus Data. IEEE Transactions on Intelligent Transportation Systems, Vol. 16, No. 1, 2014, pp. 19–28.

20. Ban, Herring, Hao, Bayen A. M. Delay Pattern Estimation for Signalized Intersections Using Sampled Travel Times. Transportation Research Record: Journal of the Transportation Research Board, 2009. 2130(1): 109–119.

21. Fayazi S. A., Vahidi. Crowdsourcing Phase and Timing of Pre-Timed Traffic Signals in the Presence of Queues: Algorithms and Back-End System Architecture. IEEE Transactions on Intelligent Transportation Systems, Vol. 17, No. 3, 2015, pp. 870–881.

22. Wang, Jiang, eds. Traffic Signal Phases’ Estimation by Floating Car Data. In 2012 12th International Conference on ITS Telecommunications, Taipei, Taiwan, November 5–8, 2012, IEEE, New York, pp. 568–573.

23. Ibrahim, Kalathil, Sanchez R. O., Varaiya. Estimating Phase Duration for SPaT Messages. IEEE Transactions on Intelligent Transportation Systems, Vol. 20, No. 7, 2018, pp. 2668–2676.

24. Goodall N. J., Smith B. L., Park. Traffic Signal Control With Connected Vehicles. Transportation Research Record: Journal of the Transportation Research Board, Vol. 2381, No. 1, 2013, pp. 65–72.

25. Yao, Shen, Liu, Jiang, Yang. A Dynamic Predictive Traffic Signal Control Framework in a Cross-Sectional Vehicle Infrastructure Integration Environment. IEEE Transactions on Intelligent Transportation Systems, Vol. 21, No. 4, 2019, pp. 1455–1466.

26. Islam, Abdel-Aty, Mahmoud. Using CNN-LSTM to Predict Signal Phasing and Timing Aided by High-Resolution Detector Data. Transportation Research Part C: Emerging Technologies, Vol. 141, 2022, p. 103742.

27. Vlahogianni E. I., Karlaftis M. G., Golias J. C. Short-Term Traffic Forecasting: Where We Are and Where We’re Going. Transportation Research Part C: Emerging Technologies, Vol. 43, 2014, pp. 3–19.

28. Mena-Oreja, Gozalvez. A Comprehensive Evaluation of Deep Learning-Based Techniques for Traffic Prediction. IEEE Access, Vol. 8, 2020, pp. 91188–91212.

29. Qiao, Haghani, Hamedi. Short-Term Travel Time Prediction Considering the Effects of Weather. Transportation Research Record: Journal of the Transportation Research Board, 2012. 2308(1): 61–72.

30. Duan, Kang, Wang F.-Y. Traffic Flow Prediction with Big Data: A Deep Learning Approach. IEEE Transactions on Intelligent Transportation Systems, Vol. 16, No. 2, 2014, pp. 865–873.

31. Zhang, Yao, Zhao. Deep Autoencoder Neural Networks for Short-Term Traffic Congestion Prediction of Transportation Networks. Sensors, Vol. 19, No. 10, 2019, p. 2229.

32. Van Lint, Hoogendoorn, van Zuylen H. J. Freeway Travel Time Prediction with State-Space Neural Networks: Modeling State-Space Dynamics with Recurrent Neural Networks. Transportation Research Record: Journal of the Transportation Research Board, 2002. 1811(1): 30–39.

33. Tao, Wang. Long Short-Term Memory Neural Network for Traffic Speed Prediction Using Remote Microwave Sensor Data. Transportation Research Part C: Emerging Technologies, Vol. 54, 2015, pp. 187–197.

34. Zhao, Chen P. C., Liu. LSTM Network: A Deep Learning Approach for Short-Term Traffic Forecast. IET Intelligent Transport Systems, Vol. 11, No. 2, 2017, pp. 68–75.

35. Tian, Pan, eds. Predicting Short-Term Traffic Flow by Long Short-Term Memory Recurrent Neural Network. 2015 IEEE International Conference on Smart city/SocialCom/SustainCom (SmartCity), Chengdu, China, December 19–21, 2015, IEEE, New York, pp. 153–158.

36. Cui, Wang. Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-Wide Traffic Speed Prediction. arXiv preprint arXiv:1801.02143, 2018.

37. Dai, Wang. Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed Prediction. Sensors, Vol. 17, No. 4, 2017, p. 818.

38. Zang, Ling, Cheng, Tang, eds. Using Convolutional Neural Network with Asymmetrical Kernels to Predict Speed of Elevated Highway. In: International Conference on Intelligence Science (Shi, Goertzel, Feng, eds.), Springer, Cham, 2017, pp. 212–221.

39. Kim, Wang, Zhu, Mihaylova, eds. A Capsule Network for Traffic Speed Prediction in Complex Road Networks. In: 2018 Sensor Data Fusion: Trends, Solutions, Applications (SDF), Bonn, Germany, October 9–11, 2018, IEEE, New York, pp. 1–6.

40. Wang, Liu, Xiong, eds. Traffic Speed Prediction and Congestion Source Exploration: A Deep Learning Method. In: 2016 IEEE 16th International Conference on Data Mining (ICDM), Barcelona, Spain, December 12–15, 2016, IEEE, New York, pp. 499–508.

41. Shahabi, Liu. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. arXiv preprint arXiv:1707.01926, 2017.

42. Yin, Zhu. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. arXiv preprint arXiv:1709.04875, 2017.

43. Luo, Yang, Zhang. Spatiotemporal Traffic Flow Prediction with KNN and LSTM. Journal of Advanced Transportation, Vol. 2019, 2019, p. 4145353.

44. Lin, Yang. A Spatial-Temporal Hybrid Model for Short-Term Traffic Prediction. Mathematical Problems in Engineering, Vol. 2019, 2019, p. 4858546.

45. Wang. Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation Networks. Sensors, Vol. 17, No. 7, 2017, p. 1501.

46. Tan. Short-Term Traffic Flow Forecasting with Spatial-Temporal Correlation in a Hybrid Deep Learning Framework. arXiv preprint arXiv:1612.01022, 2016.

47. Kwon, Coifman, Bickel. Day-to-Day Travel-Time Trends and Travel-Time Prediction from Loop-Detector Data. Transportation Research Record: Journal of the Transportation Research Board, 2000. 1717(1): 120–129.

48. Xia, Chen, Huang. A Multistep Corridor Travel-Time Prediction Method Using Presence-Type Vehicle Detector Data. Journal of Intelligent Transportation Systems, Vol. 15, No. 2, 2011, pp. 104–113.

49. Zhao, Gao, Yin, Liu, Sun. Travel Time Prediction: Based on Gated Recurrent Unit Method and Data Fusion. IEEE Access, Vol. 6, 2018, pp. 70463–70472.

50. Matsui, Fujita. Travel Time Prediction for Freeway Traffic Information by Neural Network Driven Fuzzy Reasoning. In: Neural Networks in Transport Applications (V. Himanen, Nijkamp, Reggiani, eds.), Routledge, London, 2019, pp. 355–364.

51. Wang, eds. A Piecewise Hybrid of ARIMA and SVMs for Short-Term Traffic Flow Prediction. In: International Conference on Neural Information Processing (D. Liu, Xie, Zhao, El-Alfy E. S., eds.), Springer, Cham, 2017, pp. 493–502.

52. Castro-Neto, Jeong Y.-S., Jeong M.-K., Han L. D. Online-SVR for Short-Term Traffic Flow Prediction Under Typical and Atypical Traffic Conditions. Expert Systems with Applications, Vol. 36, No. 3, 2009, pp. 6164–6173.

53. Zhang, Shu, Wang. Short-Term Traffic Flow Prediction Based on Spatio-Temporal Analysis and CNN Deep Learning. Transportmetrica A: Transport Science, Vol. 15, No. 2, 2019, pp. 1688–1711.

54. Wang. Daily Long-Term Traffic Flow Forecasting Based on a Deep Neural Network. Expert Systems with Applications, Vol. 121, 2019, pp. 304–312.

55. Hou. Repeatability and Similarity of Freeway Traffic Flow and Long-Term Prediction Under Big Data. IEEE Transactions on Intelligent Transportation Systems, Vol. 17, No. 6, 2016, pp. 1786–1796.

56. Zang, Fang, Wang, Wei, Tang, eds. Long Term Traffic Flow Prediction Using Residual Net and Deconvolutional Neural Network. In Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Springer, Cham, 2018, pp. 62–74.

57. Mahmoud, Abdel-Aty, Cai, Yuan. Predicting Cycle-Level Traffic Movements at Signalized Intersections Using Machine Learning Models. Transportation Research Part C: Emerging Technologies, Vol. 124, 2021, p. 102930.

58. Mahmoud, Abdel-Aty, Cai, Yuan. Estimating Cycle-Level Real-Time Traffic Movements at Signalized Intersections. Journal of Intelligent Transportation Systems, Vol. 26, No. 4, 2021, pp. 400–419.

59. Zang, Ling, Wei, Tang, Cheng. Long-Term Traffic Speed Prediction Based on Multiscale Spatio-Temporal Feature Learning Network. IEEE Transactions on Intelligent Transportation Systems, Vol. 20, No. 10, 2018, pp. 3700–3709.

60. Vanajakshi, Rilett L. R., eds. A Comparison of the Performance of Artificial Neural Networks and Support Vector Machines for the Prediction of Traffic Speed. IEEE Intelligent Vehicles Symposium, Parma, Italy, June 14–17, 2004, IEEE, New York, pp. 194–199.

61. Abdel-Aty, Cai, Islam. A Deep Learning Approach to Detect Real-Time Vehicle Maneuvers Based on Smartphone Sensors. IEEE Transactions on Intelligent Transportation Systems, Vol. 23, No. 4, 2020, pp. 3148–3157.

62. Islam, Abdel-Aty. Sensor-Based Transportation Mode Recognition Using Variational Autoencoder. Journal of Big Data Analytics in Transportation, Vol. 3, No. 1, 2021, pp. 15–26.

63. Islam, Abdel-Aty, Cai, Yuan. Crash Data Augmentation Using Variational Autoencoder. Accident Analysis & Prevention, Vol. 151, 2020, p. 105950.

64. Cai, Abdel-Aty, Yuan, Lee. Real-Time Crash Prediction on Expressways Using Deep Generative Models. Transportation Research Part C: Emerging Technologies, Vol. 117, 2020, p. 102697.

65. Abdel-Aty, Yuan. Real-Time Crash Risk Prediction on Arterials Based on LSTM-CNN. Accident Analysis & Prevention, Vol. 135, 2020, p. 105371.

66. Abdelraouf, Abdel-Aty, Mahmoud. Sequence-to-Sequence Recurrent Graph Convolutional Networks for Traffic Estimation and Prediction Using Connected Probe Vehicle Data. IEEE Transactions on Intelligent Transportation Systems, Vol. 24, No. 1, 2022, pp. 1395–1405.

67. Ahsani, Amin-Naseri, Knickerbocker, Sharma. Quantitative Analysis of Probe Data Characteristics: Coverage, Speed Bias and Congestion Detection Precision. Journal of Intelligent Transportation Systems, Vol. 23, No. 2, 2019, pp. 103–119.

68. Adu-Gyamfi Y. O., Sharma, Knickerbocker, Hawkins, Jackson. Framework for Evaluating the Reliability of Wide-Area Probe Data. Transportation Research Record: Journal of the Transportation Research Board, 2017. 2643(1): 93–104.

69. Nanthawichit, Nakatsuji, Suzuki. Application of Probe-Vehicle Data for Real-Time Traffic-State Estimation and Short-Term Travel-Time Prediction on a Freeway. Transportation Research Record: Journal of the Transportation Research Board, 2003. 1855(1): 49–59.

70. Aljamal M. A., Abdelghaffar H. M., Rakha H. A. Developing a Neural-Kalman Filtering Approach for Estimating Traffic Stream Density Using Probe Vehicle Data. Sensors, Vol. 19, No. 19, 2019, p. 4325.

71. Breiman. Random Forests. Machine Learning, Vol. 45, No. 1, 2001, pp. 5–32.

72. Breiman, Friedman, Stone C. J., Olshen. Classification and Regression Tree Analysis. CRC Press, Boca Raton, FL, 1984.

73. Friedman J. H. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, Vol. 29, No. 5, 2001, pp. 1189–1232.

74. Friedman, Hastie, Tibshirani. Additive Logistic Regression: A Statistical View of Boosting (with Discussion and a Rejoinder by the Authors). Annals of Statistics, Vol. 28, No. 2, 2000, pp. 337–407.

75. Chen, Guestrin, eds. XGBoost: A Scalable Tree Boosting System. In Proc., 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, August 13–17, 2016.

76. Hochreiter, Schmidhuber. LSTM Can Solve Hard Long Time Lag Problems. Advances in Neural Information Processing Systems, Vol. 9, 1997, pp. 473–479.

77. Graves. Generating Sequences with Recurrent Neural Networks. arXiv preprint arXiv:1308.0850, 2013.

78. Islam, Abdel-Aty. Real-Time Vehicle Trajectory Estimation Based on Lane Change Detection Using Smartphone Sensors. Transportation Research Record: Journal of the Transportation Research Board, Vol. 2675, No. 6, 2021, pp. 137–150.

79. Zhang, Abdel-Aty, Cai, Ugan. Prediction of Pedestrian-Vehicle Conflicts at Signalized Intersections Based on Long Short-Term Memory Neural Network. Accident Analysis & Prevention, Vol. 148, 2020, p. 105799.

80. Zhang, Abdel-Aty, Yuan. Prediction of Pedestrian Crossing Intentions at Intersections Based on Long Short-Term Memory Recurrent Neural Network. Transportation Research Record: Journal of the Transportation Research Board, Vol. 2674, No. 4, 2020, pp. 57–65.

Signal Phasing and Timing Prediction Using Connected Vehicle Data
