Abstract
The study of expressway rear-end conflicts is of great significance for analyzing driving behaviors and improving traffic safety. However, research on the classification and modeling of conflict patterns is still lacking. This study aimed to explore conflict patterns and their relationship with influencing factors. The conflict data were extracted from a trajectory data set covering 3 h of the morning peak period on an expressway in Shanghai, China. An improved k-means algorithm, which automatically obtains the optimal number of clusters, was used to classify the conflict events into six conflict patterns. The conflict patterns were interpreted from five aspects: risk level, speed of risk-changing, risk-avoidance response, risk-avoidance attitude, and risk-avoidance action. Furthermore, a multivariate Poisson-lognormal (MVPLN) model considering spatial–temporal correlation was applied to obtain the relationship between the independent variables and the number of each conflict pattern within each spatial–temporal unit. The root mean square error of the MVPLN model was 0.81. Compared with the univariate Poisson, univariate negative binomial, and univariate Poisson-lognormal models, the MVPLN model improved accuracy by 73.8%, 81.3%, and 29.6%, respectively. The results make it possible to classify expressway rear-end conflict patterns and to estimate the number of each conflict pattern within spatial–temporal units using available traffic data.
Keywords
Expressways are of great importance to a transportation system because they support the majority of long-trip traffic in cities ( 1 ). In China, an expressway refers to a two-way carriageway with a divider in the center. To provide urban long-distance and rapid transportation services, the design speed of expressways is between 60 km/h and 100 km/h. Because of the high traffic volume and speed, traffic crashes occur frequently on expressways. The traffic flow characteristics at merging, diverging, and weaving sections are completely different from those of basic segments. Traffic streams heading for different routes frequently change lanes and intertwine with each other, leading to high crash risks. Because speeds on expressways are higher than on ordinary roads, crashes also tend to be more severe ( 2 ). Reducing traffic crashes on expressways therefore plays an important role in traffic safety and efficiency.
However, because of the long observation period and difficult acquisition of crash data, the application of traffic safety methods is limited. Currently, traffic conflict is widely used as a surrogate safety measure (SSM) for traffic crashes ( 3 , 4 ). Traffic conflict data is relatively easy to access and can be used for safety evaluation without relying on a huge amount of crash data ( 5 ). Traffic conflicts are influenced by traffic flow, road design, and other factors. Building a traffic conflict model can obtain the quantitative relationship between the number of traffic conflicts and the influencing factors. It provides the theoretical and technical basis for traffic safety design.
Meanwhile, previous research has shown a strong correlation between traffic safety and driving behaviors ( 6 , 7 ). Because of external circumstances and individual differences, drivers’ risk-avoidance behaviors vary in many aspects, such as the attitude, speed, and intensity of reaction. This leads to conflicts taking on different patterns. There is considerable potential for increasing road safety using behavior modification ( 8 ). Considering pattern classification in the conflict model can help provide more targeted driving assistance and safer driving experiences.
Research on Pattern Classification
The classification of data into one of several categories is the problem addressed by pattern classification theory ( 9 ). In traffic safety research, distinct driving behaviors can be identified by exploring driving patterns. Commonly used methods can be divided into model-based and learning-based. Model-based methods classify driving patterns based on establishing driving models, whereas learning-based methods directly analyze driving data. Different terms and methods have been used to classify driving style based on individual, social, and technological factors ( 8 , 10 , 11 ). According to driving and driver characteristics, longitudinal driving behavior types have been determined and distinguished ( 12 ). The efficient k-means clustering algorithm was used to classify longitudinal driving behavior into driving patterns ( 13 ). The impacts of driving tendency on the dilemma zone distribution and the probability of rear-end crashes were explored. Three types of driving tendencies, which were conservative, normal, and aggressive, were classified based on driving variables ( 14 ). More researchers have studied specific driving styles such as longitudinal behavior, lateral control, gap acceptance, and others ( 15 – 17 ). According to the above discussions, studies of driving patterns focus on the irregular behaviors of the driver in some specific driving scenarios such as car following behaviors and intersection dilemma selection behaviors. The lack of research on conflict patterns has led to a lack of precision in traffic safety measures.
Feature parameter selection is a key issue in driving pattern classification. It is difficult to select a pair of feature parameters that can fully represent or define all aggressive (or normal) drivers, though the rule-based strategies can classify most drivers into different categories ( 17 ). Average speed, speed standard deviation, and the percentage of time over speed are commonly used feature parameters in driving style classification ( 18 – 20 ). Other microscopic features such as average acceleration/deceleration, acceleration/deceleration standard deviation, headway time distance, and other parameters are also used in some studies ( 21 , 22 ). The impact of sociodemographic and trip generation parameters on real-time traffic safety has been explored to predict real-time collision risk on highways ( 23 ). Based on the above analysis, the driving parameters were chosen with little consideration of conflict risk-changing characteristics. The lack of risk-changing parameters made driving pattern classification not detailed enough. In addition, the road and traffic environment are also important factors that affect driving patterns. Neglecting these external factors can affect the performance of the driving pattern model.
Research on Conflict Models
Traffic conflict models are designed to model the number of conflicts within a specific period at a certain location or segment. The widely used methods are statistical models and machine learning methods. Nested logit models, multinomial logit models, and random parameters logistic models were calibrated to estimate the probabilities of rear-end crashes and determine the relevant variables ( 24 – 26 ). Generalized linear regression models such as the Poisson model and the negative binomial model are widely used in conflict and crash prediction ( 27 , 28 ). The Poisson-lognormal model is another generalized Poisson model that has been used to estimate crash counts by severity ( 29 , 30 ). Poisson-lognormal models have advantages in handling outliers ( 31 , 32 ). The multivariate Poisson-lognormal model provided a superior fit to the independent univariate models ( 33 ). Other models such as the Bayesian Tobit model, the hierarchical Bayesian model, the multilevel Bayesian logistic regression model, and extreme value theory were used to estimate conflict or crash frequency ( 34 – 37 ). Machine learning methods such as neural networks and hidden Markov models are also receiving increasing attention and application in traffic conflict models (3, 38–41). Other methods, such as combining simulated trajectories with real-time safety models, were also applied to predict conflicts ( 42 ). Current traffic conflict models mainly focus on predicting the number of conflicts. However, traffic conflicts are usually caused by irregular behaviors of road users, so the role of road users in traffic conflict models cannot be ignored. Driver behavior and characteristics have been studied to help reduce the probability of rear-end collisions, thereby improving vehicle safety ( 26 ). Because drivers differ in their driving behaviors and reactions to risks, conflicts can have different patterns. Driving patterns should therefore be incorporated into conflict models.
Spatial–temporal correlation has received increasing attention in traffic conflict models. Because of the limited conflict observation time, traffic conflicts are usually studied in small time intervals. Because the observed targets and other traffic objects are correlated across spatial–temporal units, the conflicts and the corresponding characteristics in different units also have spatial–temporal correlations. A negative binomial model was chosen to study the spatial relationship between conflicts and crashes ( 5 ). A temporal and spatial analysis of rear-end collision data at signalized intersections revealed a high correlation between longitudinally or spatially correlated rear-end collisions ( 43 ). Considering the possible temporal correlation of conflict-frequency data from the same road entity in short time intervals, random effects models were used to predict the number of conflicts ( 44 ). Some research has used Bayesian hierarchical models to incorporate spatial and temporal correlations in traffic conflicts ( 45 – 47 ). Taking exogenous variables such as climate, seasonal events, and road properties into consideration, a log-Gaussian Cox model was also used to describe the spatial–temporal stochastic process ( 48 ). Machine learning models such as convolutional neural networks and long short-term memory networks were also widely used to simulate the correlation ( 49 , 50 ). Based on the above research, the numbers of conflict patterns within spatial–temporal units are not completely independent of each other. Thus, the dependence between conflict patterns at different times and road segments needs more attention. Further research is needed on the spatial–temporal correlations between patterns.
This study focuses on the following two components:
(1) Classify conflict patterns based on risk formation characteristics and risk-avoidance behavior characteristics.
(2) Explore the relationship between conflict pattern frequency and influencing factors, considering spatial–temporal correlation.
This paper is organized into five parts. The second part presents the data preparation. The third part presents the methodologies of the improved k-means algorithm and multivariate Poisson-lognormal model considering spatial–temporal correlation. The fourth part provides conflict pattern clustering results and regression model results. The fifth part summarizes the conclusions and limitations of this study.
Data Preparation
Data Source
The data used in this study is the MAGIC data set. The MAGIC data set is a trajectory data set extracted from an aerial video by a group of six unmanned aerial vehicles (UAV). The experiment was conducted from 7:40 to 10:40 a.m. on a section of the Shanghai Inner Ring, Shanghai, China, with a total length of 4000 m in both directions including a large radius curve and six ramps. This section contains merge segments, diverge segments, weaving segments, and basic segments. The widest part of the main road contains three lanes in one direction, and the narrowest part contains two lanes in one direction ( 51 ). From the MAGIC data set, vehicle characteristics, such as vehicle id, vehicle type, speed, acceleration, lane, and location, can be extracted. Traffic conflicts can be measured using this data set, and no crash data is observed in this data set. According to the division of the aerial camera area, the road section is divided into six sections with the corresponding numbers as shown in Figure 1. More details can be obtained at: https://magic.tongji.edu.cn/english/ACHIEVEMENTS/MAGIC_Dataset.htm.

Road section numbers.
Data Preprocessing
Time to collision (TTC) was used as the SSM for traffic safety performance evaluation in this study. As a time-based SSM, TTC is easy to measure and is the most commonly used measure in traffic conflict studies ( 52 ). In many traffic conflict studies, traffic events with TTC less than 3 s were considered conflicts ( 53 – 56 ). Therefore, the interval from TTC falling below 3 s until it rose above 3 s was treated as one conflict event in this paper. Based on previous studies ( 57 – 59 ), a TTC of 2 s was used as the threshold for high-risk conflicts.
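The event definition above can be sketched in a few lines. This is a minimal illustration only: the 3 s and 2 s thresholds come from the text, while the gap definition, the function names, and the treatment of non-closing pairs as `None` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

TTC_CONFLICT = 3.0   # s, conflict threshold stated in the text
TTC_HIGH_RISK = 2.0  # s, high-risk threshold stated in the text


def ttc(gap_m: float, v_follow: float, v_lead: float) -> Optional[float]:
    """TTC of a car-following pair; None when the follower is not closing in."""
    closing_speed = v_follow - v_lead
    if closing_speed <= 0 or gap_m <= 0:
        return None
    return gap_m / closing_speed


def conflict_events(series: List[Optional[float]]) -> List[Tuple[int, int, float]]:
    """Return (start_index, end_index, min_ttc) for each run of TTC below 3 s."""
    events, start = [], None
    for i, value in enumerate(series):
        below = value is not None and value < TTC_CONFLICT
        if below and start is None:
            start = i                      # TTC has just fallen below the threshold
        elif not below and start is not None:
            events.append((start, i - 1, min(series[start:i])))
            start = None                   # TTC has risen above the threshold again
    if start is not None:                  # run still open at the end of the record
        events.append((start, len(series) - 1, min(series[start:])))
    return events


def is_high_risk(min_ttc: float) -> bool:
    """High-risk conflict: minimum TTC below the 2 s threshold."""
    return min_ttc < TTC_HIGH_RISK
```

Each returned tuple corresponds to one conflict event, and its minimum TTC can then be compared with the 2 s threshold.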
TTCs of rear-end conflicts were extracted from the MAGIC data set. When the TTC was less than 3 s, it was recorded as a traffic conflict event. After the data extraction and denoising process, the number of rear-end conflict events in this data set was 7847. The numbers of minimal points of TTC curves were calculated by polynomial fitting. An eighth-degree polynomial was used to fit each TTC curve. By finding the extreme point of the curve, the fluctuation of the TTC curve could be simulated. As presented in Figure 2, 7262 curves with one minimal point, 549 curves with two minimal points, and 39 curves with more than two minimal points were obtained.

Classification of the number of minimal points.
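Counting the minimal points of a fitted TTC curve might look like the sketch below. It assumes an ordinary least-squares fit on time rescaled to [-1, 1] and counts minima on a dense evaluation grid; the paper does not state how its fit or extremum search was implemented, so the helper names and the grid size are hypothetical.

```python
from typing import List


def polyfit(xs: List[float], ys: List[float], degree: int) -> List[float]:
    """Least-squares polynomial coefficients (lowest power first) via normal equations."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [[sum(x ** (i + j) for x in xs) for j in range(n)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs


def polyval(coeffs: List[float], x: float) -> float:
    """Evaluate the polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def count_minima(times: List[float], ttc_values: List[float],
                 degree: int = 8, grid: int = 400) -> int:
    """Fit a polynomial to one TTC curve and count its interior local minima."""
    t0, t1 = times[0], times[-1]
    scaled = [2.0 * (t - t0) / (t1 - t0) - 1.0 for t in times]  # rescale for conditioning
    coeffs = polyfit(scaled, ttc_values, degree)
    fitted = [polyval(coeffs, -1.0 + 2.0 * k / grid) for k in range(grid + 1)]
    return sum(1 for k in range(1, grid) if fitted[k - 1] > fitted[k] < fitted[k + 1])
```

Curves for which `count_minima` returns 1 would correspond to the single-minimum group described above.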
Conflict event curves differed in the minimum TTC, the slope of decline and rise, and other characteristics. Therefore, it was necessary to explore the differences between conflict patterns. In this paper, only curves with exactly one minimal point were analyzed.
Conflict Characteristics
To explore conflict patterns in depth, the conflict characteristics were defined in more detail as shown in Figure 3. The definitions and symbols of characteristics are presented in Table 1.

Conflict characteristics.
Conflict Characteristics
Note: TTC = time to collision.
The risk characteristics of a conflict event were described by conflict risk level, duration of conflict, risk deterioration rate, and risk disengagement rate. The minimum value of TTC was an important characteristic to measure the risk severity of a conflict event. The duration of conflict indicated how long the high risk lasted. The risk deterioration rate and risk disengagement rate indicated how quickly the risk level changed over a conflict event. Risk characteristics could be influenced by external road and traffic environmental factors, as well as by internal driver factors.
The selected variables were mainly related to deceleration. By defining these four characteristics, a driver’s response action to the risk during a conflict event was depicted. Risk-avoidance reaction speed reflected the time for a driver to take braking action after perceiving a high risk. The short-term risk-avoidance intensity was reflected by the maximum value of deceleration during a conflict event. The long-term risk-avoidance intensity was captured by the average deceleration during a conflict event. In this study, the risk-avoidance characteristics were considered to be related to the internal driver factors.
Independent Variable Definition and Test for Multicollinearity
Defining and testing the independent variables was the basis for building a generalized linear regression model. The spatial–temporal unit was defined as 100 m and 10 min in this study. Variables such as distance to the ramp, traffic volume, and average speed were calculated for each spatial–temporal unit. As shown in Table 2, seven independent variables reflecting road characteristics and traffic environment were selected. These variables could be obtained from the velocity, acceleration, vehicle type, and other information in the trajectory data. Correlation between independent variables may negatively affect the model results. Therefore, it was necessary to test the independent variables for multicollinearity before modeling. The variance inflation factor (VIF) was used for this test. All VIFs were smaller than 10, as shown in Table 2, so the multicollinearity between independent variables was acceptable.
Independent Variables
Note: VIF = variance inflation factor. Means and standard deviations are not given for nominal and interval variables; dashes are used instead.
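As an illustration of the multicollinearity test, the VIF of each variable equals the corresponding diagonal element of the inverse correlation matrix, which is the same as 1 / (1 - R²) from regressing that variable on the others. The sketch below uses that identity; the function names and the column layout are assumptions for the example, not the authors' implementation.

```python
from statistics import mean
from typing import List


def correlation_matrix(columns: List[List[float]]) -> List[List[float]]:
    """Pearson correlation matrix of the given variable columns."""
    centered = [[v - mean(col) for v in col] for col in columns]
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def invert(matrix: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns: List[List[float]]) -> List[float]:
    """VIF of each variable: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]
```

A variable would be flagged when its VIF reaches the threshold of 10 used in the text.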
Methodology
This study consists of conflict pattern classification and modeling. In the first part, conflict patterns are classified based on risk formation characteristics and risk-avoidance behavior characteristics extracted from trajectory data. In the second part, a conflict pattern model is built based on the count data of conflict patterns obtained in the first part. The complete algorithm flowchart is shown in Figure 4. The improved k-means algorithm and the multivariate Poisson-lognormal model in the two red outline boxes are introduced in detail in this section of the paper. The processing of other parts is introduced in the sections on data processing and results.

Flowchart of the algorithm.
In the first part, the clustering is based on TTC curves of conflict events. Each curve represents the process of risk changing over a conflict event, which means the curve starts with TTC below the threshold, declines to the lowest point of TTC, and ends with TTC rising above the threshold. Based on the observation of the TTC curves, the relevant characteristic parameters are extracted for clustering.
In the second part, a multivariate Poisson-lognormal model considering spatial–temporal correlation is applied to the conflict pattern count data. Spatial–temporal correlation is represented by basis functions derived from a spatial–temporal prediction method called fixed rank kriging (FRK). The relationship between influencing factors and the number of each conflict pattern in spatial–temporal units is described. Accuracy and goodness of fit are compared between the MVPLN model and three other generalized linear regression models.
Improved K-Means Algorithm
Original K-Means Algorithm
Clustering is the process of dividing a data set into several categories such that objects in the same category are more similar to each other than to objects in other categories. The k-means algorithm is a commonly used unsupervised learning algorithm for clustering. Its basic idea is to update the cluster centers until the objective function reaches its minimum. The objective function is the sum of squared distances from each point to its nearest cluster center, as shown by
$$J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2$$
where $k$ is the number of clusters, $C_j$ is the set of points assigned to cluster $j$, $\mu_j$ is the center of cluster $j$, and $x_i$ is a data point.
Improved K-Means Algorithm
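In the original k-means algorithm, k must be specified in advance and cannot be determined by the algorithm itself. Therefore, the silhouette score, which is the mean silhouette coefficient of all samples, is used to decide the optimal number of clusters. The silhouette score is calculated as
$$S = \frac{1}{n} \sum_{i=1}^{n} \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $n$ is the number of samples, $a_i$ is the mean distance from sample $i$ to the other samples in its own cluster, and $b_i$ is the mean distance from sample $i$ to the samples in the nearest other cluster.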

Flowchart of improved k-means algorithm.
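One way to realize the idea of letting the silhouette score choose k is sketched below. It is not the authors' flowchart: the candidate range of k, the farthest-point initialization (chosen here only to keep the example deterministic), and all function names are assumptions.

```python
from math import dist
from statistics import mean


def kmeans(points, k, iterations=100):
    """Plain k-means on tuples of coordinates; returns one label per point."""
    # Farthest-point initialization: an assumption made here for determinism.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:          # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                   # keep the old center if a cluster empties
                centers[j] = tuple(mean(axis) for axis in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:                       # singleton cluster: coefficient set to 0
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean(dist(p, q) for q, lab in zip(points, labels) if lab == other)
                for other in set(labels) if other != labels[i])
        scores.append((b - a) / max(a, b))
    return mean(scores)


def best_k(points, k_max=6):
    """Run k-means for k = 2..k_max and keep the k with the highest silhouette score."""
    results = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(results, key=results.get), results
```

In the study's setting, each point would be the vector of conflict characteristics of one event, and the high-risk and low-risk events would be passed to `best_k` separately.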
Multivariate Poisson-Lognormal Model Considering Spatial–Temporal Correlation
Different road characteristics and traffic states in different spatial–temporal units result in variability in the number of different conflict patterns that occur. The number of different conflict patterns within a spatial–temporal unit is not completely independent from each other. Therefore, in the generalized linear regression model, it is necessary to consider the influence of different spatial–temporal factors. The multivariate Poisson-lognormal (MVPLN) model considering spatial–temporal correlation proposed in this paper can solve this problem.
FRK
In this model, spatial–temporal correlation is incorporated into the model by basis functions. To measure the effect of spatial–temporal correlation on the dependent variable, FRK spatial–temporal prediction method is applied. FRK is a modification of a spatial prediction method called kriging. Kriging allows for unbiased optimal estimation of regionalized variables in a finite region.
FRK reduces the complexity of the kriging algorithm for large-scale data by dividing the fitted variables into two parts: fixed effects and random effects. FRK hinges on the use of a spatial random effects (SRE) model, in which a spatially correlated mean-zero random process is decomposed using a linear combination of spatial basis functions with random coefficients plus a term that captures the random process’s fine-scale variation ( 60 ). The SRE model has a spatial covariance function that is always nonnegative-definite and, because any basis functions can be used, it can be constructed to approximate standard families of covariance functions ( 61 ). The FRK prediction formula is
where
The establishment of the spatial covariance function is the core of the kriging method. The traditional kriging method fits the covariance function of the whole region by selecting a suitable model based on the covariance values at different lag distances. Many types of basis function can be chosen, such as exponential, wavelet, and harmonic functions. Based on the idea of basis functions in FRK, several spatial–temporal basis functions are constructed and incorporated into the model as independent variables in this study. The number of automatically generated basis functions is 36. They are denoted B1–B36 among the independent variables and form a single resolution of the default Gaussian radial function.
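To make the role of B1–B36 concrete, the sketch below evaluates Gaussian radial basis functions on a space-time grid and returns them as 36 regressors for a unit. The 6 × 6 layout, the scale choice, and the 2000 m by 180 min extent are assumptions chosen only so that 36 functions result; the paper states that its basis functions were generated automatically and does not give their centers.

```python
from math import exp


def grid_centers(s_range, t_range, n_s=6, n_t=6):
    """Centers on a regular n_s x n_t grid over space (m) and time (min)."""
    s_step = (s_range[1] - s_range[0]) / n_s
    t_step = (t_range[1] - t_range[0]) / n_t
    return [(s_range[0] + (i + 0.5) * s_step, t_range[0] + (j + 0.5) * t_step)
            for i in range(n_s) for j in range(n_t)]


def gaussian_basis(s, t, center, scale):
    """Gaussian radial function of the scaled space-time distance to a center."""
    ds = (s - center[0]) / scale[0]
    dt = (t - center[1]) / scale[1]
    return exp(-0.5 * (ds * ds + dt * dt))


def basis_features(s, t, centers, scale):
    """Values of all basis functions at one space-time location."""
    return [gaussian_basis(s, t, c, scale) for c in centers]
```

For a unit centered at position `s` and time `t`, `basis_features(s, t, centers, scale)` would supply the 36 extra columns of the design matrix.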
MVPLN Model
There are two approaches for modeling count data with multiple categorical variables. One is univariate generalized linear regression models for each type of variable separately. The other is a multivariate generalized linear regression model. Not considering the correlation of different conflict patterns may cause bias in the fitting results or result in wrong statistical inference. The MVPLN model is a generalized Poisson model for multi-categorical count data and can be described by
where
Let
where
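The data-generating process behind a multivariate Poisson-lognormal model can be illustrated by simulation. The sketch assumes the common formulation in which the count of pattern k in unit i is Poisson with log-rate equal to a linear predictor plus an error term, and the error terms of one unit are jointly normal with covariance matrix Σ. The symbols, function names, and coefficient values are hypothetical and are not taken from the paper's equations or estimates.

```python
import random
from math import exp, sqrt


def cholesky(sigma):
    """Lower-triangular factor L with L L^T = sigma (sigma must be positive definite)."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][m] * lower[j][m] for m in range(j))
            if i == j:
                lower[i][j] = sqrt(sigma[i][i] - s)
            else:
                lower[i][j] = (sigma[i][j] - s) / lower[j][j]
    return lower


def poisson(rate, rng):
    """Knuth's multiplication method; adequate for the moderate rates used here."""
    threshold, k, product = exp(-rate), 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_counts(X, betas, sigma, seed=0):
    """Simulate one row of correlated pattern counts for each unit in X."""
    rng = random.Random(seed)
    lower = cholesky(sigma)
    n_patterns = len(betas)
    counts = []
    for x in X:
        z = [rng.gauss(0.0, 1.0) for _ in range(n_patterns)]
        # Correlated errors: eps = L z has covariance sigma.
        eps = [sum(lower[k][m] * z[m] for m in range(k + 1)) for k in range(n_patterns)]
        rates = [exp(sum(b * v for b, v in zip(betas[k], x)) + eps[k])
                 for k in range(n_patterns)]
        counts.append([poisson(r, rng) for r in rates])
    return counts
```

The off-diagonal entries of `sigma` are what let the counts of different conflict patterns in the same unit move together, which a set of separate univariate models cannot capture.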
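Bayesian estimation, one of the most common methods in conflict-frequency models ( 62 ), is used for parameter estimation. The deviance information criterion (DIC) is used to compare the goodness of fit of the Poisson-lognormal models. The DIC is considered the Bayesian equivalent of the Akaike information criterion. The smaller the DIC, the better the model fit ( 63 ). DIC is defined as
$$\mathrm{DIC} = \bar{D} + p_D, \qquad p_D = \bar{D} - D(\bar{\theta})$$
where $\bar{D}$ is the posterior mean of the deviance, $D(\bar{\theta})$ is the deviance evaluated at the posterior mean of the parameters, and $p_D$ is the effective number of parameters.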
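As a small worked example, DIC can be computed from posterior deviance samples using the usual definition, in which the effective number of parameters is the posterior mean deviance minus the deviance at the posterior mean. The function name and the input format are assumptions for the illustration.

```python
from statistics import mean


def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from MCMC deviance samples."""
    d_bar = mean(deviance_samples)               # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean     # effective number of parameters
    return d_bar + p_d, p_d
```

With deviance samples 100, 102, and 104 and a deviance of 99 at the posterior mean, this gives p_D = 3 and DIC = 105.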
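For comparison, a univariate Poisson regression model, a univariate negative binomial regression model, and a univariate Poisson-lognormal regression model are also considered in this study. The root mean square error (RMSE) is used to compare the accuracy of fit between models, which is expressed as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
where $n$ is the number of spatial–temporal units, $y_i$ is the observed number of conflicts in unit $i$, and $\hat{y}_i$ is the number fitted by the model.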
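The accuracy comparison can be sketched as follows. Reading "improvement in accuracy" as the relative reduction in RMSE is an assumption made here; the paper does not spell out the formula, and the function names are illustrative.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and fitted counts."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def improvement(rmse_baseline, rmse_model):
    """Relative RMSE reduction of a model against a baseline (assumed definition)."""
    return (rmse_baseline - rmse_model) / rmse_baseline
```

For example, a model with an RMSE of 0.5 against a baseline RMSE of 2.0 would show a 75% improvement under this definition.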
Results
Conflict Pattern Clustering Results
First, a distinction was made between low-risk conflict events and high-risk conflict events. The minimal points of all curves were divided into two categories as shown in Figure 6.

Conflict event severity classification results.
The high-risk and low-risk conflict events were clustered separately using the improved k-means algorithm introduced previously. The feature parameters used to measure similarity were the seven conflict characteristics other than Risk_level: Risk_duration, Risk_deteriorate, Risk_disengage, Avoid_response, Avoid_stability, Avoid_short_intensity, and Avoid_long_intensity.
For the low-risk conflict events, the silhouette score corresponding to

Clustering results.
The inter-cluster distances between the six clusters were calculated by Euclidean distance as shown in Table 3.
Inter-Cluster Distances
After min-max normalization, the radar map of characteristics in each cluster is shown in Figure 8.

Radar map of characteristics.
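The min-max normalization used before drawing the radar map rescales each characteristic across the cluster centers to the range [0, 1]. A minimal sketch follows; mapping a constant series to 0 is an assumption of the example.

```python
def min_max(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                 # constant series: assumed to map to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Applying `min_max` to one characteristic at a time keeps characteristics with different units comparable on the same radar axis scale.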
After obtaining the clusters, the traffic meanings of the six clusters needed to be interpreted. To obtain a more intuitive explanation, fuzzy C-means algorithm was used to classify characteristic levels. The silhouette score was also used to determine the optimal number of clusters. The characteristic values and the corresponding levels are shown in Table 4.
Characteristic Levels of Conflict Patterns
A correlation can be seen between different conflict characteristics. The higher the level of conflict risk, the longer the duration of the conflict event. The risk deterioration rate and the risk disengagement rate roughly correspond to each other. The short-term risk-avoidance intensity and long-term risk-avoidance intensity also remain mostly consistent.
Based on the above clustering of patterns and grading of characteristics, the conflict patterns were finally named in five aspects: risk level, speed of risk-changing, risk-avoidance response, risk-avoidance attitude, and risk-avoidance action, as shown in Figure 9.

Naming of conflict patterns.
From the perspective of risk formation characteristics, the above six conflict patterns can be summarized into four risk formation patterns. Taking the first risk formation pattern (Low risk - Slow change) as an example, its heat map on the road section is shown in Figure 10a. From the perspective of risk-avoidance behavior characteristics, there are five different risk-avoidance behavior patterns. Taking the fourth risk-avoidance behavior pattern (Neutral response - Panic attitude - Tough action) as an example, its heat map on the road section is shown in Figure 10b. The distribution of conflict patterns varies, but each conflict pattern appears most often in section 6. Road section 6 was therefore examined further, and the distribution of conflict pattern 1 there is shown in Figure 10c. Frequencies differ between flow directions and between lanes in the same direction.

Heat map of conflict patterns: (a) first risk formation pattern, (b) fourth risk-avoidance behavior pattern, and (c) road section 6 distribution.
To summarize the above discussion, the frequency of conflict patterns is related to road and traffic environment characteristics as introduced earlier. In the next part, the regression model is built to fit the influencing factors and the frequency of conflict patterns.
Multivariate Poisson-Lognormal Model Considering Spatial–Temporal Correlation
Model Parameters
Each spatial–temporal unit in this study covers 100 m and 10 min. The upward direction (from east to west) comprises Sections 1–20, and the downward direction (from west to east) comprises Sections 21–40. Sections 17–19 and 36–39 are weaving segments, Sections 1–5 are merge segments, and Sections 21–26 are diverge segments.
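Counting conflict patterns per spatial–temporal unit amounts to binning each event by position and time. The sketch below uses the 100 m and 10 min unit size from the text; the zero-based indices, the event tuple layout, and the function names are assumptions, and the mapping to the paper's Section 1–40 numbering is not reproduced.

```python
from collections import Counter

UNIT_LENGTH_M = 100.0     # spatial unit from the text
UNIT_DURATION_S = 600.0   # temporal unit from the text (10 min)


def unit_index(position_m, time_s):
    """Zero-based (section, interval) index of a space-time location."""
    return int(position_m // UNIT_LENGTH_M), int(time_s // UNIT_DURATION_S)


def count_patterns(events):
    """events: iterable of (position_m, time_s, pattern_id) tuples."""
    counts = Counter()
    for position_m, time_s, pattern in events:
        counts[unit_index(position_m, time_s) + (pattern,)] += 1
    return counts
```

The resulting counts, keyed by (section, interval, pattern), would form the dependent variables of the regression model.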
The categorical variable seg_type was incorporated into the model as dummy variables D1–D3, and the spatial–temporal basis functions B1–B36 were incorporated into the model as independent variables. The parameters of the MVPLN model are shown in Table 5. Numbers in the table without brackets are coefficients of independent variables, while those in brackets are standard errors.
Coefficients and Standard Errors of Independent Variables
Note: Numbers in the table without brackets are coefficients of independent variables and those in brackets are standard errors; conflict patterns:
1: low risk - slow change - neutral response - calm attitude - tough action;
2: low risk - slow change - sensitive response - calm attitude - tender action;
3: low risk - intense change - neutral response - panic attitude - tough action;
4: high risk - intense change - neutral response - panic attitude - tough action;
5: high risk - moderate change - insensitive response - calm attitude - tough action;
6: high risk - intense change - insensitive response - calm attitude - tender action.
*variable significant at 90% interval; **variable significant at 95% interval.
The road characteristic variables mainly affect the three low-risk conflict patterns. Acceleration, deceleration, and lane-changing behaviors are frequent in the weaving and merge segments of expressways, which may lead to traffic disorder and traffic conflicts. Conflict patterns 1 and 2, the two patterns that account for the largest share of events, are more likely to occur in the weaving segments. When there are fewer lanes, conflicts tend toward the Low risk - Slow change pattern, and the risk-avoidance attitude is calmer. The reason may be that with fewer lanes the driver is exposed to fewer distracting factors. The number of lanes is not significant in the high-risk patterns. A possible explanation is that in high-risk situations traffic volume and speed directly affect the driver’s short-term reactions, whereas in low-risk situations the indirect factors of road characteristics also have an influence. For the same reason, risk-avoidance behaviors are more moderate and conflict patterns are milder on sections farther from the ramp.
The traffic environment variables have a significant effect on almost all conflict patterns, especially the average speed of road segments. In conflict patterns 1, 2, 3, 4, and 6, the effects of traffic volume are similar: rear-end conflicts are more likely to occur when the traffic volume is higher. When the speed is lower, there is more congestion and traffic bottlenecks are more frequent, which also increases the number of conflicts. Conflict pattern 4 increases the most, by about 10.6%, when the speed is reduced by 1%. A larger speed standard deviation means that speed on the road segment is less stable and the probability of conflict increases. Relatively speaking, drivers’ risk-avoidance attitude is calmer. Finally, a decrease in the proportion of heavy vehicles has a large effect on conflict patterns 1, 3, and 4. When the proportion of heavy vehicles is small, lane changes may be more frequent and quicker, so conflicts with intense risk change are more likely to happen. Drivers also tend to show medium response speed, a more panicked attitude, and stronger braking in such scenarios.
Model Comparison
Three other generalized linear regression models were applied to compare fitting accuracy; these were univariate Poisson model, univariate negative binomial (NB) model, and univariate Poisson-lognormal (PLN) model. RMSE was used to evaluate accuracy. Model RMSE results are shown in Table 6.
Root Mean Square Error for Four Models
Comparing the four models, the MVPLN model had a significant advantage in fitting accuracy. Its total weighted average accuracy was 73.8%, 81.3%, and 29.6% better than that of the univariate Poisson, NB, and PLN models, respectively. The MVPLN model had the most obvious advantage in conflict patterns 1 and 2. For conflict pattern 1, the fitting accuracy of MVPLN was 77.9% and 85.2% better than that of the first two models. For conflict pattern 2, the fitting accuracy was 74.7% and 79.5% better than that of the first two models. For conflict pattern 3, the improvements were also all above 50%.
In addition, the DIC values for the Poisson-lognormal models were calculated to measure the goodness of fit. The DIC value was 6853.39 under the MVPLN model. The DIC values of the univariate PLN models for the six conflict patterns were 2241.21, 1844.48, 952.60, 477.88, 619.18, and 810.03, respectively, which sum to 6945.38. Thus, the MVPLN model had an advantage in goodness of fit over the six univariate PLN models, since its DIC was lower than the sum of the DICs of the univariate models. Therefore, among the models applied in this study, MVPLN was the most suitable model.
Conclusions
This study explored rear-end conflict pattern classification and modeling on an expressway in China. In the first part, a parameter system of conflict pattern characteristics was constructed from both risk formation characteristics and risk-avoidance behavior characteristics. An improved k-means algorithm was used to classify the conflict events, and six conflict patterns were obtained. These six patterns differed in five aspects and were named based on their risk formation and risk-avoidance behavior characteristics. The number of each conflict pattern within the spatial–temporal units was counted, and heat maps were drawn. In the second part, seven independent variables were selected from both road characteristics and traffic environment perspectives. Spatial–temporal correlation was added to the model through basis functions derived from FRK. An MVPLN model was constructed for quantitative analysis. Model parameters were obtained and their traffic meanings were interpreted. The results showed that the MVPLN model performed better than the three univariate models.
The differences between conflict patterns were described by risk level, speed of risk-changing, risk-avoidance response, risk-avoidance attitude, and risk-avoidance action. Among the six conflict patterns, conflict patterns 1 and 2 accounted for a much larger proportion than the other four. In most conflict events, the vehicle speed was moderate and the deceleration process was stable. When the risk changed drastically, drivers may have been more prone to panic. The distribution of the six conflict patterns varied along the expressway, which can be attributed to differences in roadway type, number of lanes, traffic volume, average speed, and so on. Through qualitative analysis, the possible influencing factors were grouped into road characteristics and traffic environment.
Road user behavioral variability was considered by modeling conflict pattern count data. The effects of seven independent variables on the number of each of the six conflict patterns within spatial–temporal units were quantified using the MVPLN model. The results showed that road characteristics had a more significant effect on low-risk conflict patterns, whereas the traffic environment variables had a significant effect on most of the conflict patterns. The significant independent variables differed across the conflict pattern models, and every independent variable was significant for at least one conflict pattern. The most influential significant variables were traffic volume, average speed, and the standard deviation of speed.
Three other generalized linear regression models were also fitted to the conflict pattern count data. These three models were univariate, whereas the MVPLN model was multivariate and therefore accounted for the correlation between the numbers of different conflict patterns. In terms of RMSE, the MVPLN model fitted much better than the other three: its total fitting accuracy was 73.8%, 81.3%, and 29.6% better than that of the univariate Poisson, negative binomial, and Poisson-lognormal models, respectively. In terms of DIC, the MVPLN model also had an advantage over the univariate PLN models. Therefore, the MVPLN model was the best among the models applied in this study.
The conclusions and extensions of this study can provide a basis for developing functions in advanced driver assistance systems. Because the independent variables selected in this study, such as volume, average speed, the standard deviation of speed, and the percentage of heavy vehicles, can be obtained from roadside detectors, real-time detector data can be used to predict the number of specific conflict patterns in practical applications. In future research, more effort is needed to classify the driver's risk-avoidance style over the whole conflict process. By recording the historical style of a specific driver and designing corresponding driving assistance functions, drivers can be provided with more accurate driving guidance in a connected environment. This provides new ideas for the application of traffic conflict studies.
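As a rough illustration of how detector measurements could feed such a prediction, the sketch below evaluates the log-linear mean of a Poisson-lognormal count model for one spatial–temporal unit. The function `expected_conflicts`, the variable names, and all coefficient and feature values are hypothetical placeholders, not estimates from this study. The random-effect term defaults to zero, so the result is a conditional mean only.

```python
import math

def expected_conflicts(features, coefficients, intercept, random_effect=0.0):
    """Conditional mean count under a log link: exp(intercept + sum(beta * x) + epsilon)."""
    missing = set(coefficients) - set(features)
    if missing:
        raise KeyError(f"missing detector variables: {sorted(missing)}")
    linear_predictor = intercept + random_effect + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return math.exp(linear_predictor)

# Hypothetical standardized detector readings for one unit (not real data).
detector_features = {
    "volume": 0.8,
    "average_speed": -0.3,
    "speed_std": 0.5,
    "heavy_vehicle_share": 0.1,
}

# Hypothetical coefficients for one conflict pattern (not the study's estimates).
pattern_coefficients = {
    "volume": 0.40,
    "average_speed": -0.25,
    "speed_std": 0.30,
    "heavy_vehicle_share": 0.05,
}

mean_count = expected_conflicts(detector_features, pattern_coefficients, intercept=0.2)
print(f"Expected number of conflicts in this unit: {mean_count:.2f}")
```

In practice, one coefficient set per conflict pattern would be needed, and the correlated random effects of the multivariate model would have to be handled by the fitted model itself rather than set to zero.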
This study had some limitations. It analyzed only conflict events whose TTC curves had a single minimum point in the high-risk range. More research is needed to explore conflict events with two or more minimum points. In addition, although this study was based on trajectory data, the conflict style of individual drivers was not examined. Future studies should pay more attention to the conflict patterns of the same vehicle across multiple road segments.
Footnotes
Author Contributions
The authors confirm their contribution to the paper as follows: study conception and design: Ling Wang, Yunting Miao, and Wanjing Ma; analysis and interpretation of results: Yunting Miao and Ling Wang; draft manuscript preparation: Yunting Miao; revision: Ziliang He, Ling Wang, Wanjing Ma, and Mohamed Abdel-Aty. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Natural Science Foundation of China (52131204, 52102415) and the Fundamental Research Funds for the Central Universities (22120220137).
