Named entity recognition model for bridge inspection report with multi-features fusion and correction method

Abstract

Bridge inspection reports are valuable sources for assessing bridge condition and guiding maintenance, yet accurately extracting structured information from them remains a challenge. Although advanced general-domain models and large language models (LLMs) have shown promise, they often fail to fully exploit features specific to inspection report text, such as specialized Chinese character properties and distinctive entity distribution patterns, and they do not account for missed or misclassified entities. These shortcomings can lead to inaccuracies in critical entity extraction. To address these challenges, this paper proposes a named entity recognition (NER) model, MF-Attention-BiLSTM-CRF, that integrates Multi-Features (MF) fusion and a domain-specific correction method to accurately extract key information from bridge inspection reports. Unlike generic pre-trained architectures, the model introduces a relative position feature to capture the standardized reporting structure and a correction method to guard against the misclassification of vital structural defects. Furthermore, based on the characteristics of report texts, an improved Easy Data Augmentation (EDA) method is proposed to construct a bridge inspection report dataset with 23,192 characters and 2,752 entities. Experimental results show that the proposed model outperforms prior models, general-domain models, and large language models, achieving a best F1 score of 89.3%. This work advances intelligent infrastructure management systems by supporting intelligent early warning, predictive maintenance, and decision-making applications.
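For readers unfamiliar with how an F1 score is usually computed for NER, the following is a minimal sketch of entity-level F1 over BIO-tagged character sequences. It is an illustration under stated assumptions, not the paper's evaluation code: the abstract does not specify the tagging scheme or matching criterion, so the BIO scheme, the exact-match rule, the label names `COMP` and `DEF`, and the function names are all hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of entity spans.

    Assumption: strict BIO. An "I-X" tag that does not continue an open
    span of type X is treated as outside any entity.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes a span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a prediction counts only if boundaries and type match."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: COMP = bridge component, DEF = defect.
gold_tags = ["B-COMP", "I-COMP", "O", "B-DEF", "I-DEF", "O"]
pred_tags = ["B-COMP", "I-COMP", "O", "B-DEF", "O", "O"]
score = entity_f1(gold_tags, pred_tags)
```

In this toy case the predicted defect span is one character too short, so only one of the two gold entities is matched exactly. Under exact matching, a truncated defect mention counts as both a miss and a false alarm.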

Keywords

named entity recognition; bridge inspection reports; data augmentation; multi-features fusion; knowledge graph
