Abstract
Surrogate safety measures (SSM) have been used extensively in traffic safety studies for crash risk estimation. Most SSM-based studies employing extreme value theory (EVT) use the peak over threshold (POT) approach to detect anomalies or extreme events during safety-critical situations. This study investigated the efficacy of unsupervised machine learning (ML)-based anomaly detection methods as an extreme event sampling approach compared with the conventional POT sampling approach by developing bivariate EVT models for rear-end crash risk estimation on a freeway segment. Three widely used SSMs, namely time-to-collision (TTC), modified time-to-collision (MTTC), and deceleration rate to avoid crash (DRAC), were considered for the bivariate EVT modeling. Video data were collected from the selected segment of the I-40 expressway in Memphis, Tennessee. Among three SSMs, the combination of MTTC and DRAC bivariate EVT models provided the most accurate crash risk estimation (within the 99% confidence interval of the observed crashes), applying the traditional POT sampling approach, and ML-based isolation forest (iForest) and one-class support vector machine (OCSVM) sampling approaches. ML-based OCSVM sampling method provided a 21% crash estimation accuracy improvement over the POT and iForest sampling methods. Based on these findings, it can be concluded that unsupervised ML anomaly detection can be an effective sampling approach, reducing subjectivity in the threshold selection encountered in the POT sampling method. Safety improvement programs aim to maximize outcomes with limited resources, and an accurate estimation of the expected number of crashes helps engineers prioritize high-impact improvement locations.
Keywords
Get full access to this article
View all access options for this article.
