Abstract
Timely detection of traffic accidents is crucial for enabling rapid emergency response and minimizing road disruptions. Existing surveillance systems often struggle with accurate classification in complex environments due to limitations in processing static and dynamic features. This study presents a Traffic Vision-based Fusion Network (TVFN), enhanced by a Temporal Attentive Inception Network (TAIN), to improve accident detection accuracy and reliability. The model fuses RGB and optical flow features using a dual-feature strategy and leverages temporal attention to capture subtle motion anomalies. Evaluation was conducted on two benchmark datasets: the HWID12 dataset, comprising 12 accident categories, and the Accident Detection from CCTV Footage dataset, with two classes (accident and non-accident). On the HWID12 dataset, the proposed model achieved an accuracy of 99.61%, precision of 99.62%, recall of 99.61%, and F1-score of 99.60%. Similarly, on the CCTV Footage dataset, it attained 96.9% accuracy, 96% precision, 99% recall, and a 96% F1-score, outperforming existing CNN, R-CNN, and SVM-based methods. These results highlight the robustness and generalization capability of the proposed framework, offering a real-time, efficient, and reliable solution for intelligent traffic surveillance and road safety enhancement.
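The abstract's dual-feature strategy (per-frame RGB and optical-flow features fused, then weighted over time by temporal attention) can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the TVFN/TAIN implementation: the feature extractors, the scoring vector `w_score`, and the simple concatenate-then-attend scheme are all hypothetical stand-ins for the paper's learned components.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention_fusion(rgb_feats, flow_feats, w_score):
    """Fuse per-frame RGB and optical-flow features, then pool over time
    with attention weights. Illustrative sketch only, not TVFN itself.

    rgb_feats, flow_feats: (T, D) arrays of per-frame features.
    w_score: (2*D,) hypothetical scoring vector, one logit per frame.
    Returns a (2*D,) attention-pooled clip descriptor.
    """
    fused = np.concatenate([rgb_feats, flow_feats], axis=1)  # (T, 2D)
    logits = fused @ w_score                                 # (T,) frame scores
    attn = softmax(logits)                                   # temporal weights, sum to 1
    return attn @ fused                                      # weighted sum over frames

# Toy usage with random "features" standing in for CNN outputs.
T, D = 16, 8
rng = np.random.default_rng(0)
clip_vec = temporal_attention_fusion(rng.normal(size=(T, D)),
                                     rng.normal(size=(T, D)),
                                     rng.normal(size=2 * D))
print(clip_vec.shape)  # one fused descriptor per clip, fed to a classifier
```

The attention weights let frames with anomalous motion (large flow-feature activations) dominate the pooled descriptor, which is the intuition behind using temporal attention to surface subtle motion anomalies.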
