Abstract
A hybrid framework that combines extreme value theory (EVT) models with machine-learning-based anomaly detection methods has emerged as an effective approach for traffic-conflict-based crash estimation. This study investigates how well the hybrid framework generalizes across two prevalent safety hierarchy models: the pyramid-shaped and the diamond-shaped hierarchy. Two anomaly detection methods, isolation forest (IForest) and minimum covariance determinant (MCD), were integrated into the EVT modeling framework and compared with the conventional peak over threshold (POT) method. The methods were applied to two distinct datasets: the highD dataset, which corresponds to a pyramid-shaped safety hierarchy, and an intersection dataset, which corresponds to a diamond-shaped hierarchy. The results show that IForest and MCD performed comparably to POT on pyramid-shaped data with large sample sizes. However, both methods underperformed on diamond-shaped data because of its double-peak distribution, and their performance improved when the data were reshaped to a pyramid shape. Furthermore, for smaller datasets, increasing the contamination rate for IForest and MCD improved performance, surpassing POT in certain cases. Additionally, IForest was sensitive to the order of the data, whereas MCD yielded consistent outcomes regardless of the data arrangement.
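The contrast between POT and the detector-based selection of extremes can be illustrated with a minimal sketch. It is not the study's implementation: the synthetic exponential sample, the helper names (`pot_exceedances`, `detector_exceedances`, `fit_gpd`), the rule that keeps only anomalies above the median, and the use of scikit-learn's `EllipticEnvelope` as the MCD-based detector are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import genpareto
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest


def pot_exceedances(x, quantile=0.95):
    """Conventional POT: exceedances over a fixed empirical quantile."""
    u = np.quantile(x, quantile)
    return u, x[x > u] - u


def detector_exceedances(x, detector):
    """Let an anomaly detector pick the extremes (illustrative rule)."""
    labels = detector.fit_predict(x.reshape(-1, 1))  # -1 marks anomalies
    # Assumption: only anomalies on the high (risky) side count as extremes.
    extremes = x[(labels == -1) & (x > np.median(x))]
    u = extremes.min()  # implied threshold: the smallest flagged extreme
    return u, extremes - u


def fit_gpd(exceedances):
    """Fit a generalized Pareto distribution with location fixed at 0."""
    shape, _, scale = genpareto.fit(exceedances, floc=0)
    return shape, scale


# Synthetic stand-in for a conflict indicator (e.g. a negated
# time-to-collision), NOT data from the study.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=2000)

contamination = 0.05  # share of observations the detectors flag as anomalous
detectors = {
    "IForest": IsolationForest(contamination=contamination, random_state=0),
    "MCD": EllipticEnvelope(contamination=contamination, random_state=0),
}

u, exc = pot_exceedances(x, quantile=1 - contamination)
print("POT", round(u, 3), len(exc), fit_gpd(exc))
for name, det in detectors.items():
    u, exc = detector_exceedances(x, det)
    print(name, round(u, 3), len(exc), fit_gpd(exc))
```

In this sketch the contamination rate plays the role that the threshold quantile plays in POT: raising it flags more observations as extremes, which gives the GPD fit more data at the cost of admitting less extreme points.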
