Abstract
The digitization of traffic control infrastructure (TCI) is essential for modern traffic management and autonomous mobility, yet it is hindered by the fragmentation of data across commercial, governmental, and open-source platforms. This study proposes and validates a systematic framework to integrate and evaluate TCI data from these disparate sources. Focusing on speed limit, stop, and yield signs across four corridors in the Dallas-Fort Worth area, we developed a ground truth dataset for robust comparison. Our methodology involves a multi-stage process of data pre-processing and harmonization, followed by spatial consolidation using the density-based spatial clustering of applications with noise (DBSCAN) algorithm to reduce redundancy. The quality of each individual dataset and the final integrated dataset was quantitatively assessed using accuracy, completeness, and reliability metrics. Our results reveal significant reliability disparities among the sources, with the commercial dataset (Mobileye) offering the most balanced individual performance. Critically, the spatially integrated dataset achieved consistently high reliability scores (98%–100%), significantly outperforming any single source. The primary contribution of this work is a practical, replicable framework that empowers transportation agencies to systematically evaluate and integrate heterogeneous data, thereby enabling the creation of more comprehensive and trustworthy digital TCI inventories.
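The spatial consolidation step can be illustrated with a minimal sketch, which is not the study's implementation. It assumes each record carries a source label, a sign type, and a latitude/longitude pair, and that detections of the same sign type lying within a small radius refer to one physical sign. The field names, the source labels `source_a` and `source_b`, the coordinates, the 10 m radius, and `min_samples=1` are all hypothetical choices made for illustration. A small pure-Python DBSCAN is used so the example has no dependencies.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def dbscan(points, eps_m, min_samples):
    """Plain O(n^2) DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    # Each neighbourhood includes the point itself (distance 0).
    neighbors = [[j for j in range(n) if haversine_m(points[i], points[j]) <= eps_m]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


def consolidate(records, eps_m=10.0, min_samples=1):
    """Merge records of the same sign type that DBSCAN places in one cluster."""
    by_type = defaultdict(list)
    for rec in records:
        by_type[rec["type"]].append(rec)
    merged = []
    for sign_type in sorted(by_type):
        group = by_type[sign_type]
        labels = dbscan([(r["lat"], r["lon"]) for r in group], eps_m, min_samples)
        clusters = defaultdict(list)
        for rec, label in zip(group, labels):
            if label != -1:  # noise is dropped; none occurs when min_samples=1
                clusters[label].append(rec)
        for members in clusters.values():
            merged.append({
                "type": sign_type,
                "lat": sum(m["lat"] for m in members) / len(members),
                "lon": sum(m["lon"] for m in members) / len(members),
                "sources": sorted({m["source"] for m in members}),
            })
    return merged


# Hypothetical records: two sources report the same stop sign a few metres apart.
records = [
    {"source": "source_a", "type": "stop", "lat": 32.77670, "lon": -96.79700},
    {"source": "source_b", "type": "stop", "lat": 32.77672, "lon": -96.79702},
    {"source": "source_a", "type": "stop", "lat": 32.78000, "lon": -96.79700},
    {"source": "source_b", "type": "yield", "lat": 32.77670, "lon": -96.79700},
]
merged = consolidate(records)
```

Clustering each sign type separately keeps a stop sign and a yield sign at the same location from being merged. Setting `min_samples=1` keeps signs reported by only one source as their own clusters; a larger value would treat such isolated detections as noise and drop them.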
