Abstract
The development of artificial intelligence (AI) provides an opportunity for rapid and accurate assessment of earthquake-induced infrastructure damage using social media images. Nevertheless, data collection and labeling remain challenging because annotators often have limited expertise. This study introduces a novel four-class Earthquake Infrastructure Damage (EID) assessment data set, compiled from several existing social media image databases with added emphasis on data quality. Unlike previous data sets such as the Damage Assessment Dataset (DAD) and Crisis Benchmark, the EID includes comprehensive labeling guidelines and a multiclass classification system aligned with established damage scales, such as HAZUS and EMS-98, to enhance the accuracy and utility of social media imagery for disaster response. By integrating detailed descriptions and clear labeling criteria, the EID labeling approach reduces the subjectivity of image labeling and the inconsistencies found in existing data sets. The findings demonstrate a significant improvement in annotator agreement, with disagreement falling from 39.7% to 10.4%, thereby validating the efficacy of the refined labeling strategy. The EID, containing 13,513 high-quality images from five significant earthquakes, is designed to support community-level assessments and advanced computational research, paving the way for enhanced disaster response strategies through improved data utilization and analysis. The data set is available at DesignSafe: https://doi.org/10.17603/ds2-yj8p-hs62.
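As an illustration of the annotator disagreement figures quoted above, the following minimal sketch computes a simple pairwise disagreement rate, that is, the fraction of images on which two annotators assign different classes. This is an assumption about how such a percentage might be obtained, not the study's documented procedure. The function name `disagreement_rate` and the four class names are hypothetical, and the labels are toy values rather than EID data.

```python
from typing import Sequence


def disagreement_rate(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Return the fraction of images on which two annotators disagree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same set of images")
    if not labels_a:
        raise ValueError("at least one labeled image is required")
    # Count positions where the two annotators chose different classes.
    mismatches = sum(a != b for a, b in zip(labels_a, labels_b))
    return mismatches / len(labels_a)


# Hypothetical four-class labels (illustrative names, not the EID classes).
annotator_1 = ["none", "slight", "severe", "collapse", "slight"]
annotator_2 = ["none", "severe", "severe", "collapse", "slight"]

rate = disagreement_rate(annotator_1, annotator_2)
print(f"Disagreement: {rate:.1%}")
```

A raw disagreement rate does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa could be reported alongside it.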
