Abstract
Accurate and timely assessment of structural damage is critical when responding to severe earthquakes. To this end, this study proposes a framework that integrates ambient-vibration tests, multivariate features, and machine-learning (ML) models. The focus is on examining the capability of various ML models, including decision tree, random forest, eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Category Boosting (CatBoost), to classify the seismic performance levels of buildings. To reduce biases due to imbalanced class distributions, a simulated dataset is adopted to train the ML models. Specifically, this dataset is generated from nonlinear time-history analyses of surrogate structural models whose dynamic properties are calibrated from prior on-site testing. The analyses show that the XGBoost model outperforms the other models in most cases and achieves an average F1-score of 0.859 across all performance levels in the test sets. Moreover, SHapley Additive exPlanations (SHAP) analyses are performed to determine the dominant features for the classification task, and six critical features are identified. The reduced-dimension XGBoost model attains average F1-scores similar to those of the model using all examined features. The study also investigates cost-sensitive models that account for the asymmetric consequences of misclassifying performance levels. Lastly, the proposed method is validated using publicly available data from real-world structures with seismic monitoring and is demonstrated in regional assessments of both real earthquakes and hypothetical seismic risk scenarios. The predictions from the XGBoost models for the real-earthquake assessments generally agree with actual observations.
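For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score averaged across performance levels could be computed. It assumes an unweighted (macro) average, which the abstract does not specify, and the function name `macro_f1` and the labels `"IO"`, `"LS"`, and `"CP"` are hypothetical placeholders rather than the study's actual classes or code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1-scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical performance-level labels, for illustration only
y_true = ["IO", "IO", "LS", "CP"]
y_pred = ["IO", "LS", "LS", "CP"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to a macro average, a rarely observed performance level affects the score as much as a common one, which is one reason this kind of average is often preferred for imbalanced classes.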
