Abstract
Traffic crashes, especially rear-end collisions, fixed-object crashes, and rollovers, are common and severe, highlighting the need for comprehensive classification and management. However, research on classifying and managing imbalanced traffic crash data is limited. Therefore, this study proposes an analytic framework for imbalanced traffic crash type classification and management using a hybrid tabular transformer approach. The framework comprises four steps: feature selection, multi-crash-type classification, single-crash-type clustering, and targeted safety recommendations. First, a multinomial logit model is used for feature selection, removing features with low correlation. Second, the Feature Tokenizer Transformer (FT-Transformer) is combined with the Synthetic Minority Over-Sampling Technique (SMOTE) at varying ratios to perform multiclass crash classification. Third, mini-batch K-means is used to cluster each crash type based on key features. Finally, targeted safety recommendations are developed for each cluster. This study used 5,515 real-world traffic crash records from Anhui Province, China. The results show that (1) the FT-Transformer model with SMOTE at a 1:2:4 ratio outperformed other machine learning models; and (2) rear-end, fixed-object, and rollover crashes were clustered into three, three, and five categories, respectively. Safety recommendations focus on traffic management (e.g., real-time traffic updates), driver behavior (e.g., driving education), and road infrastructure (e.g., reinforced road markings).
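To illustrate what oversampling "at a ratio" can mean, the following is a minimal pure-Python sketch of SMOTE-style interpolation toward a target class ratio. It is not the study's implementation. It assumes that the ratio gives the desired relative class sizes after oversampling and that no class is ever shrunk; the abstract does not state which crash type maps to which term of 1:2:4, so the mapping and the class counts below are invented for illustration. The function names (`target_counts`, `smote_like`, `oversample_to_ratio`) are hypothetical, and the sketch handles numeric features only.

```python
import random
from collections import Counter


def target_counts(counts, ratio):
    """Return the target size per class so that sizes follow `ratio`.

    Assumption: classes are only ever grown, so the ratio is scaled up
    until every class is at least as large as it is now.
    """
    scale = max(counts[c] / ratio[c] for c in counts)
    return {c: max(counts[c], round(scale * ratio[c])) for c in counts}


def smote_like(samples, n_new, k=5, rng=None):
    """Create `n_new` synthetic points by interpolating between a sample
    and one of its k nearest same-class neighbours (the core SMOTE idea)."""
    rng = rng or random.Random(0)
    if n_new > 0 and len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other class members by squared Euclidean distance to `base`.
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        other = rng.choice(neighbours)
        u = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + u * (b - a) for a, b in zip(base, other)])
    return synthetic


def oversample_to_ratio(X, y, ratio, k=5, seed=0):
    """Append synthetic samples until class sizes follow `ratio`."""
    rng = random.Random(seed)
    counts = Counter(y)
    targets = target_counts(counts, ratio)
    X_out, y_out = list(X), list(y)
    for label, target in targets.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        new_points = smote_like(members, target - counts[label], k, rng)
        X_out.extend(new_points)
        y_out.extend([label] * len(new_points))
    return X_out, y_out


if __name__ == "__main__":
    # Invented toy data: 40 / 10 / 5 records with two numeric features each.
    data_rng = random.Random(1)
    y = ["rear_end"] * 40 + ["fixed_object"] * 10 + ["rollover"] * 5
    X = [[data_rng.random(), data_rng.random()] for _ in y]
    # Hypothetical mapping of the ratio terms to crash types.
    ratio = {"rollover": 1, "fixed_object": 2, "rear_end": 4}
    X_res, y_res = oversample_to_ratio(X, y, ratio)
    print(Counter(y_res))
```

In practice, a library implementation such as the SMOTE class in imbalanced-learn would normally be used instead, and categorical crash attributes would need a variant designed for mixed data rather than plain interpolation.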
